Category "pandas"

How to check if a value in a DataFrame is of type Decimal

I am writing a data test for some api calls that return a DataFrame with a date type and a type Decimal. I can't find a way to verify the Decimal the DataFrame
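
A minimal sketch of one way to test this, assuming the column holds Python `decimal.Decimal` objects (so pandas stores it with `object` dtype). The column names `price` and `qty` are made up for illustration and are not from the question.

```python
from decimal import Decimal

import pandas as pd

# Hypothetical data: "price" holds Decimal objects, "qty" holds plain integers.
df = pd.DataFrame({"price": [Decimal("1.50"), Decimal("2.25")], "qty": [1, 2]})

# Check each value individually; a dtype check alone only reports "object".
price_is_decimal = df["price"].map(lambda v: isinstance(v, Decimal))
qty_is_decimal = df["qty"].map(lambda v: isinstance(v, Decimal))

all_price_decimal = bool(price_is_decimal.all())
any_qty_decimal = bool(qty_is_decimal.any())
```

In a data test, `assert all_price_decimal` would then fail as soon as any value in the column is not a `Decimal`.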

Get index and column with multiple headers and index_col in Pandas DataFrame

I have a dataframe with multiple headers and column indexes, and would like to retrieve the list of entries that are non-zero. The dataframe is constructed from

How to edit/sort a non-column column in Python?

I wrote the script below, and I'm 98% content with the output. However, the disorder of the 'Approved' field bugs me. As you can see, I trie

Geopandas not plotting correct colors

My Geopandas DataFrame has 3 polygons and 9 points with color_rgba column computed with matplotlib.colors.to_rgba function: import contextily as ctx import geop

NumPy where function in Python

I have a data frame like this: pd.DataFrame({'Material': ['Steel (16MnCr5)', 'X', 'X', 'X', 'Carbon black', 'Sulfur', 'Copper'], 'Weight': [4, 8, 0, 8, 6, 9, 3

How to count a particular value in a given column for each value of another column

To count a particular value in a given column

How to fix ParserError: year 0 is out of range: 0000-00-00 with Python Pandas to_datetime method

I am trying to convert a column "travel_start" to a datetime object. Dashboard["travel_start"] = pd.to_datetime(Dashboard["travel_start"]) But I get the fol
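
One common way around this kind of error, sketched under the assumption that placeholder strings such as `0000-00-00` should be treated as missing: pass `errors="coerce"` so unparseable values become `NaT`. The sample values below are invented; only the column name `travel_start` comes from the question.

```python
import pandas as pd

# Hypothetical sample: one valid date and one "zero" placeholder date.
Dashboard = pd.DataFrame({"travel_start": ["2021-03-04", "0000-00-00"]})

# errors="coerce" turns values that cannot be parsed into NaT instead of raising.
Dashboard["travel_start"] = pd.to_datetime(Dashboard["travel_start"], errors="coerce")

missing_mask = Dashboard["travel_start"].isna().tolist()
```

Whether silently dropping those placeholders to `NaT` is acceptable depends on the data; counting them first with `isna().sum()` is a cheap sanity check.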

Row-wise correlation between two DataFrames with unequal numbers of columns

I have two DataFrames (Dataset1: 200 rows, 34 columns; Dataset2: 200 rows, 22 columns). I want the row-wise correlation between both datasets. How can I perform this? I

Plot multiple columns side by side

I have the dataframe below. 111_a 111_b 222_a 222_b 333_a 333_b row_1 1.0 2.0 1.5 2.5 1.0 2.5 row_2 1.0 2.0 1.5 2.5 1.0

Get the index of a search item in a DataFrame

I have a DataFrame which contains a column combine 0 (43,FR,html5 full skinz html5) 1 (43,FR,mobile m-skinz2) 2 (43,FR,mobile m-skinz2 plus) 3

Transform a dataframe using pivot

I am trying to transform a dataframe using pivot. Since the column contains duplicate entries, I tried to add a count column following what's suggested here (Qu

Text file rows into CSV columns in Python

I have a question. I have a text file containing data like this: A 34 45 7789 3475768 443 67 8999 3343 656 8876 802 383358 873 36789 2374859 485994 86960 32838459 348

How do I print out the Phone number from a csv with padded 0 using pandas?

So I have a CSV file with the following content: Person,Phone One,08001111111 Two,08002222222 Three,08003333333 When I used the following code: import pandas a
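
A minimal sketch of the usual fix, assuming the leading zeros are lost because pandas infers the `Phone` column as an integer: read that column as a string with `dtype`. The CSV text is the content quoted in the question, fed through `io.StringIO` instead of a real file.

```python
import io

import pandas as pd

# The CSV content from the question, held in memory rather than on disk.
csv_text = "Person,Phone\nOne,08001111111\nTwo,08002222222\nThree,08003333333\n"

# dtype={"Phone": str} stops pandas from parsing the column as a number,
# so the leading 0 is preserved.
df = pd.read_csv(io.StringIO(csv_text), dtype={"Phone": str})

phones = df["Phone"].tolist()
```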

Pandas' read_html not reading html tables

I am trying to see if I can use, and only use, Pandas' read_html function to scrape HTML tables from the following website: https://www.baseball-reference.com/t

Fill missing date and time in Python (pandas)

I have a large data set, a sample is given below. The data is recorded for 1 day with 5-min interval for 24 hours for 3214 unique ids. The time and date informa

How to delete older files and keep last day files for each month in Python

I need to retain the backup file started on 31st April and ended the next day, May 1. Backup timings differ for each folder, but the backup files are identical

Realise accumulated DataFrame from a column of Boolean values

Given the following Python pandas DataFrame: ID Holidays visit_1 visit_2 visit_3 other 0 True 1 2 0 red 0 False 3 2 0 red 0 True 4 4 1 blue 1 False 2 0 0 red 1 Tr

How to calculate monthly changes in a time series using pandas dataframe

As I am new to Python I am probably asking for something basic for most of you. However, I have a df where 'Date' is the index, another column that is returning

Invalid decimal literal when importing CSV via pandas

Not sure if something has changed within pandas, but all of a sudden I am unable to import my .csv file using pd.read_csv due to the following error: PS C:\Users

Python pandas nlargest() not working properly with keep = 'all'

When I try to use the function below top3 = df1.nlargest(3, 'perChange', keep='all') Even if keep = 'all', the output is 92 3.828120 255 -0.673854 256
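
For reference, a small sketch of how `keep='all'` behaves as documented: rows tied with the smallest selected value are all kept, so the result can contain more than `n` rows. The values below are invented and are not the asker's data; only the column name `perChange` comes from the question.

```python
import pandas as pd

# Hypothetical data with a tie at the second-largest value.
df1 = pd.DataFrame({"perChange": [3.0, 2.0, 2.0, 1.0]})

# n=2, but both rows with 2.0 are kept because they tie for the last slot.
top = df1.nlargest(2, "perChange", keep="all")

top_index = top.index.tolist()
```

With no ties at the cutoff, `keep='all'` returns exactly `n` rows, the same as the default.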