Can't manipulate dataframe in pandas

I don't understand why I can't do even the simplest data manipulation on this data I've scraped. I've tried all sorts of methods to manipulate the data, but they all produce the same kind of error. Is my data even in a DataFrame yet? I can't tell.

import pandas as pd
from urllib.request import Request, urlopen

req = Request('https://smallcaps.com.au/director-transactions/'
              , headers={'User-Agent': 'Mozilla/5.0'})
trades = urlopen(req).read()
df = pd.read_html(trades)
print(df) #<-- This line prints the df and works fine

df.drop([0, 1]) #<-- This line raises the error below
print(df)

Error:

Traceback (most recent call last):
  File "C:\Users\User\PycharmProjects\Scraper\DirectorTrades.py", line 10, in <module>
    df.drop([0, 1])
AttributeError: 'list' object has no attribute 'drop'

Solution 1:^[1]

The main issue, as mentioned, is that pandas.read_html() returns a list of DataFrames, so you have to select the one you want by its index in that list.

Is my data even in a data frame yet?

df = pd.read_html(trades) No, it is not, because this gives you a list of DataFrames.
df = pd.read_html(trades)[0] Yes, this gives you the first DataFrame from that list.
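
If you are not sure which index to pick, it can help to look at what came back first. The following is only a minimal sketch: summarize_tables is a hypothetical helper, and the tables list is hand-built stand-in data so the snippet runs without network access. In the real script, tables would be the return value of pd.read_html(trades).

```python
import pandas as pd


def summarize_tables(tables):
    """Return (index, shape, column names) for each DataFrame in a list."""
    return [(i, t.shape, list(t.columns)) for i, t in enumerate(tables)]


# Stand-in for the list that pd.read_html() returns (hypothetical data).
tables = [
    pd.DataFrame({"Date": ["27/4/2022", "26/4/2022"], "Code": ["ESR", "FGX"]}),
    pd.DataFrame({"Menu": ["Home"]}),
]

# A plain list has no DataFrame methods such as .drop.
print(type(tables))

# Show the position, shape and columns of every table that was found.
for index, shape, columns in summarize_tables(tables):
    print(index, shape, columns)

# Pick the table you want by its position in the list.
df = tables[0]
```

Once df is a single DataFrame, methods such as df.drop are available on it.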

Example

import pandas as pd
from urllib.request import Request, urlopen

req = Request('https://smallcaps.com.au/director-transactions/'
              , headers={'User-Agent': 'Mozilla/5.0'})
trades = urlopen(req).read()
df = pd.read_html(trades)[0]
df.drop([0, 1])  # returns a new DataFrame; df itself is unchanged unless you assign the result
df

Output

	Date	Code	Company	Director	Value
0	27/4/2022	ESR	Estrella Resources	L. Pereira	?$1,075
1	27/4/2022	LNY	Laneway Resources	S. Bizzell	?126,750
2	26/4/2022	FGX	Future Generation Investment Company	G. Wilson	?$13,363
3	26/4/2022	CDM	Cadence Capital	J. Webster	?$25,110
4	26/4/2022	TEK	Thorney Technologies	A. Waislitz	?$35,384
5	26/4/2022	FGX	Future Generation Investment Company	K. Thorley	?$7,980

...

Solution 2:^[2]

read_html returns a list of dataframes.

Try:

dfs = pd.read_html(trades)
dfs = [df.drop([0,1]) for df in dfs]
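
One thing to watch for: drop() returns a new DataFrame rather than changing the original, and it raises a KeyError if a table does not contain the given labels. Here is a small sketch of the same idea under those assumptions. The dfs list is hypothetical stand-in data in place of pd.read_html(trades), and errors="ignore" and reset_index(drop=True) are optional additions, not part of the answer above.

```python
import pandas as pd

# Hypothetical stand-in for the list returned by pd.read_html().
dfs = [
    pd.DataFrame({"Code": ["ESR", "LNY", "FGX", "CDM"]}),
    pd.DataFrame({"Menu": ["Home"]}),  # only one row, so label 1 is missing
]

# drop() returns a new DataFrame, so the results are collected in a new list.
# errors="ignore" skips labels a table does not have instead of raising KeyError.
# reset_index(drop=True) renumbers the remaining rows from 0.
trimmed = [t.drop([0, 1], errors="ignore").reset_index(drop=True) for t in dfs]

# The original table keeps all of its rows; the trimmed copy has two fewer.
print(len(dfs[0]), len(trimmed[0]))
print(trimmed[0])
```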

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	HedgeHog
Solution 2	Learning is a mess
