Category "pandas"

Pandas groupby feature question for output CSV

I have the following code, df.groupby('AccountNumber')[['TotalStake','TotalPayout']].sum(), which displays as I would like it to in pandas. The issue is when I ou

Alternative way to append a dataframe to itself N times and populate new column

Is there an alternative way to append a dataframe to itself N times where N is based on a list length, and the list contents are added as a new column to the da

Create multiple DataFrames using data from an API

I'm using the World Bank API to analyze data and I want to create multiple data frames with the same indicators for different countries. import wbgapi as wb imp

Is there a way to control which vertices connect in a plotly.express.line_geo map?

I'm trying to make a connection map that has the option to use an animation_frame to show different months/years. Plotly.express has this option, but the plotly

Is there a way to validate data type lengths in Pandas when using the read_csv function?

I'm trying to put some sort of length validation for columns using Pandas. For example, let's say I have a csv named test.csv that has the following data within

Why am I getting NaNs when concatenating a DataFrame with a Series?

I have a pandas DataFrame ('a') and a Series ('b'), both with a time series index (weekends excluded). I am trying to concatenate them. Both of them start with the

Apply loc to the entire dataframe but one column (keep the one column as it was and not remove it)

I am trying to divide the entire dataframe by a fixed number but I want to keep the 'Year' column as is. I tried dividing the entire df by 100 and then multiply
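
One possible approach to the question above is to select every column except 'Year' and divide only those. This is a minimal sketch, not the asker's code: the sample values and the column names `a` and `b` are hypothetical, and only the 'Year' column name comes from the excerpt.

```python
import pandas as pd

# Hypothetical sample data; only the 'Year' column name comes from the question.
df = pd.DataFrame({"Year": [2019, 2020], "a": [150.0, 250.0], "b": [50.0, 75.0]})

# Select every column except 'Year' and divide only those by 100.
value_cols = df.columns.difference(["Year"])
df[value_cols] = df[value_cols] / 100
```

This leaves 'Year' untouched, so there is no need to divide everything and then multiply one column back.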

Pandas - Cross referencing with DatetimeIndex - Groupby

I have data of many companies by month (End of Month). I want to create a new column with groupby for each company where: new_col from Jul of this year to Jun

Compare 2 csv files and remove the common lines from 1st file | python

I want to compare 2 csv files, master.csv and exclude.csv, and remove all the matching lines based on column1 and write the final output in the master.csv file. master
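
A hedged sketch of one way to do this with pandas is to filter the master rows with `isin`. The file contents and the `column2` name below are hypothetical, and in-memory buffers stand in for master.csv and exclude.csv so the example is self-contained; only `column1` comes from the excerpt.

```python
import io
import pandas as pd

# Hypothetical contents; in practice these would be master.csv and exclude.csv.
master_csv = io.StringIO("column1,column2\na,1\nb,2\nc,3\n")
exclude_csv = io.StringIO("column1,column2\nb,9\n")

master = pd.read_csv(master_csv)
exclude = pd.read_csv(exclude_csv)

# Keep only the master rows whose column1 value does not appear in exclude.
filtered = master[~master["column1"].isin(exclude["column1"])]

# With no path argument, to_csv returns the CSV text; index=False skips the row index.
result_csv = filtered.to_csv(index=False)
```

Passing a file path to `to_csv` instead would write the filtered rows back to disk.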

Apply a weighted decay that changes over time in Python

I have a dataframe in Python that looks like the one below: I want to calculate the dnf_rate_weighted so that there's a 0.95 decay for each stage going back th

How to create a DataFrame from multiple dictionaries

I have a little issue with the data I have (multiple dictionaries) to process and create a DataFrame from them. This is what the data looks like: print(data) 0

Easiest way to ignore or drop one header row from first page, when parsing table spanning several pages

I am parsing a PDF with tabula-py, and I need to ignore the first two tables, but then parse the rest of the tables as one, and export to a CSV. On the first re

Pandas DataFrame replace() method for row numbers

I need to replace some values in a column with a specific value, using a list of row numbers of the required values as an array like the following array. Can I use dat

Saving a DataFrame to CSV: records do not accumulate, only the last group's records are saved

A DataFrame question about scraped data collected in groups. For example, the first loop yields 5 records and the second loop yields 3 records. When I ran my code below, the csv file was saved the

How to concatenate the values of a dataframe along column axis and fill missing values?

I have been really stuck on this problem for a long time. I have a data frame, I want to group the data based on the ids and then stick the values for each id together. H

How to deal with SettingWithCopyWarning in Pandas

Background: I just upgraded my pandas from 0.11 to 0.13.0rc1. Now, the application is popping out many new warnings. One of them is like this: E:\FinReporter\FM_EXT

Round to 2 Decimal Places Even with Zeros

I am currently using the following logic to round to 2 decimal places: billables_all["Parts Charged"] = billables_all["Parts Charged"].fillna(0).round(
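
As background to the question above: `round(2)` returns floats, which drop trailing zeros when displayed, so keeping them generally means formatting the values as strings. The following is a sketch under that assumption; the sample values are hypothetical, and only `billables_all` and "Parts Charged" come from the excerpt.

```python
import pandas as pd

# Hypothetical sample data for the column named in the question.
billables_all = pd.DataFrame({"Parts Charged": [1.5, None, 2.346, 10.0]})

# Rounding keeps the values numeric, so trailing zeros are not shown.
rounded = billables_all["Parts Charged"].fillna(0).round(2)

# Formatting as text keeps exactly two decimal places, zeros included.
formatted = rounded.map("{:.2f}".format)
```

The trade-off is that `formatted` holds strings, so it suits display or export rather than further arithmetic.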

Save multiple dataframes to the same file, one after the other

Let's say I have three dfs x,y,z 0,1,1,1 1,2,2,2 2,3,3,3 a,b,c 0,4,4,4 1,5,5,5 2,6,6,6 d,e,f 0,7,7,7 1,8,8,8 2,9,9,9 How can I stick them all together so that

How can I fit a Keras model with a DataFrame of NumPy arrays?

I want to train a model with self-generated matrices (word vectors). My data have the following datatypes: print(type(X)) print(type(X[0])) print(type(X[0][0]))