Category "pandas"

How to transform columns with method chaining?

What's the most fluent (or easy to read) method chaining solution for transforming columns in Pandas? (“method chaining” or “fluent” is
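
One common chaining pattern for this kind of question is `DataFrame.assign` with lambdas, wrapped in parentheses so that each step sits on its own line. The sketch below is only an illustration: the column names `name` and `score` and the sample rows are hypothetical and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions for illustration.
df = pd.DataFrame({"name": [" alice ", "BOB"], "score": ["1", "2"]})

# Each keyword passed to assign() replaces (or adds) one column.
# The lambdas receive the DataFrame as it exists at that point in the chain.
result = (
    df
    .assign(
        name=lambda d: d["name"].str.strip().str.title(),
        score=lambda d: d["score"].astype(int),
    )
)
```

Because `assign` returns a new DataFrame, the original `df` is left unchanged and further steps can be appended to the chain.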

Python/Pandas Add string to rows in a column that contain a character a specific number of times

I have a Pandas DataFrame (data) with a ['Duration'] column of 'object' type that holds time durations in the format 'H:%M:%S', such as '1:47:54' with 7 characters, bu

Adding new dataframe columns using information extracted from the url in the url column, where the url may be missing information

Given: A pandas dataframe that contains a user_url column among other columns. Expectation: New columns added to the original dataframe where the columns are co

How do I remove hours and seconds from my DataFrame column in python? [duplicate]

I have a DataFrame with columns Age, Gender, Address, Date and the row: 15, M, 172 ST, 2022-02-07 00:00:00. I want to remove hh:mm:ss. I tried: import datetime
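
A minimal sketch of one way to drop the time part, assuming the `Date` column holds strings (or datetimes) like the one shown in the excerpt. It uses `pd.to_datetime` followed by the `.dt.date` accessor. The single sample row is rebuilt from the excerpt.

```python
import pandas as pd

# Sample row reconstructed from the excerpt above.
df = pd.DataFrame({
    "Age": [15],
    "Gender": ["M"],
    "Address": ["172 ST"],
    "Date": ["2022-02-07 00:00:00"],
})

# Parse the column as datetimes, then keep only the calendar date.
# The result is a column of Python date objects (object dtype).
df["Date"] = pd.to_datetime(df["Date"]).dt.date

date_only = df["Date"].iloc[0]
```

If the column should stay a `datetime64` dtype, `.dt.normalize()` is an alternative that sets the time to midnight instead of removing it.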

Getting `A value is trying to be set on a copy of a slice from a DataFrame.` when setting a column

I know a value should not be set on a view of a pandas dataframe and I'm not doing that but I'm getting this error. I have a function like this: def do_somethin

Pandas groupby feature question for output CSV

I have the following code, df.groupby('AccountNumber')[['TotalStake','TotalPayout']].sum(), which displays as I would like it to in pandas. The issue is when I ou

Alternative way to append a dataframe to itself N times and populate new column

Is there an alternative way to append a dataframe to itself N times where N is based on a list length, and the list contents are added as a new column to the da

Create multiple DataFrames using data from an api

I'm using the world bank API to analyze data and I want to create multiple data frames with the same indicators for different countries. import wbgapi as wb imp

Is there a way to control which vertices connect in a plotly.express.line_geo map?

I'm trying to make a connection map that has the option to use an animation_frame to show different months/years. Plotly.express has this option, but the plotly

Is there a way to validate data type lengths in Pandas when using the read_csv function?

I'm trying to put some sort of length validation for columns using Pandas. For example, let's say I have a csv named test.csv that has the following data within

Why am I getting NaNs when concatenating a DataFrame with a Series?

I have a Pandas Dataframe ('a') and a Series ('b') both with timeseries index (weekends excluded). I am trying to concatenate them. Both of them start with the

Apply loc to the entire dataframe but one column (keep the one column as it was and not remove it)

I am trying to divide the entire dataframe by a fixed number but I want to keep the 'Year' column as is. I tried dividing the entire df by 100 and then multiply
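
One possible approach, sketched under assumptions: select every column except `Year` with `Index.difference` and divide only those. The value columns `A` and `B` and the sample numbers are hypothetical. Only `Year` and the divisor 100 come from the excerpt.

```python
import pandas as pd

# Hypothetical data; only the 'Year' column name comes from the question.
df = pd.DataFrame({
    "Year": [2020, 2021],
    "A": [100.0, 200.0],
    "B": [50.0, 150.0],
})

# Every column except 'Year'.
value_cols = df.columns.difference(["Year"])

# Divide only the selected columns; 'Year' is never touched.
df[value_cols] = df[value_cols] / 100
```

This avoids the divide-then-multiply round trip on `Year`, which can introduce floating-point noise or change the column's dtype.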

Pandas - Cross referencing with DatetimeIndex - Groupby

I have data of many companies by month (End of Month). I want to create a new column with groupby for each company where: new_col from Jul of this year to Jun

Compare 2 csv files and remove the common lines from 1st file | python

I want to compare 2 csv files, master.csv and exclude.csv, remove all the matching lines based on column1, and write the final output to the master.csv file. master
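
A sketch of one way to do this with pandas, assuming both files have a header row containing `column1`. In-memory strings stand in for `master.csv` and `exclude.csv` so the example is self-contained. The sample rows and the `column2` name are made up.

```python
import io

import pandas as pd

# Stand-ins for master.csv and exclude.csv; the contents are hypothetical.
master_csv = io.StringIO("column1,column2\na,1\nb,2\nc,3\n")
exclude_csv = io.StringIO("column1,column2\nb,9\n")

master = pd.read_csv(master_csv)
exclude = pd.read_csv(exclude_csv)

# Keep only the master rows whose column1 value does not appear in exclude.
filtered = master[~master["column1"].isin(exclude["column1"])]

# With real files, the result could then be written back, for example:
# filtered.to_csv("master.csv", index=False)
```

Matching is done on `column1` only, so rows that share a key but differ in other columns are still removed.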

Apply a weighted decay that changes over time in Python

I have a dataframe in Python that looks like the one below: I want to calculate the dnf_rate_weighted so that there's a 0.95 decay for each stage going back th

How to create a Dataframe from multiple dictionaries

I have a little issue with the data I have (multiple dictionaries), which I need to process to create a DataFrame. This is what the data looks like: print(data) 0

Easiest way to ignore or drop one header row from first page, when parsing table spanning several pages

I am parsing a PDF with tabula-py, and I need to ignore the first two tables, but then parse the rest of the tables as one, and export to a CSV. On the first re

Pandas DataFrame replace() method for row numbers

I need to replace some values in a column with a specific value, using a list of row numbers for the required values given as an array like the following. Can I use dat

Saving a DataFrame to CSV: records are not accumulating, only the last DataFrame group's records are saved

A DataFrame question about grouped web-scraping data. Example: the first loop yields e.g. 5 records, the second loop e.g. 3 records. When I ran my code below, the csv file was saved the

How to concatenate the values of a dataframe along column axis and fill missing values?

I have been really stuck on this problem for a long time. I have a data frame, and I want to group the data based on the ids and then join the values for each id together. H