Category "dataframe"

How to filter out a row if there are two consecutive instances of the same value?

I have a data frame with multiple similar sequences in which column Z has a string pattern containing "VALUE1" and "VALUE2" (only these two patterns matter) and

Pandas+Uncertainties producing AttributeError: type object 'dtype' has no attribute 'kind'

I want to use Pandas + Uncertainties. I am getting a strange error; an MWE is below: from uncertainties import ufloat import pandas number_with_uncertainty = ufloa

New column with week on week spending by store

I have a dataset in which I need to track customers' spending week by week based on the store. store <- c(1,2,3,4,5,6,1,2,3,4,5,6) week <- c(1,1,1,1,1,1,2,2,2,

Multiple aggregations of the same column using pandas GroupBy.agg()

Is there a pandas built-in way to apply two different aggregating functions f1, f2 to the same column df["returns"], without having to call agg() multiple times
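One possible approach, as a minimal sketch: pandas named aggregation lets a single `agg()` call apply several functions to the same column. The grouping column `group`, the sample values, and the choice of `"mean"` and `"sum"` as stand-ins for `f1` and `f2` are assumptions for illustration; only `df["returns"]` comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the "returns" column name is from the question.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "returns": [0.1, 0.3, -0.2, 0.4],
})

# Named aggregation: each keyword becomes an output column,
# and each value is a (source column, function) pair.
# "mean" and "sum" stand in for f1 and f2; callables work as well.
result = df.groupby("group").agg(
    returns_mean=("returns", "mean"),
    returns_sum=("returns", "sum"),
)
print(result)
```

Passing a list, as in `df.groupby("group")["returns"].agg(["mean", "sum"])`, is a shorter alternative when custom output names are not needed.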

Add a column to a dataframe based on an existing value in another dataframe

I have a dataframe DF3 : zone_id combine 0 ABD 10 BCD 20 ABC 30 ABE and a second dataframe :combinaison_df: zone_id combine 0

How can I create a cross-tab of two columns in a dataframe in Python and generate a total row and column in the output?

I have created a dataframe from a CSV file and now I'm trying to create a cross-tab of two columns ("Personal_Status" and "Gender"). The output should look like
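A minimal sketch of one way to get totals, assuming the data is already in a pandas DataFrame: `pd.crosstab` accepts `margins=True`, which appends a total row and a total column. The sample rows below are made up; only the column names `Personal_Status` and `Gender` come from the question, and the desired output layout is cut off in the excerpt, so this may not match it exactly.

```python
import pandas as pd

# Made-up sample rows; the real data would come from the CSV file.
df = pd.DataFrame({
    "Personal_Status": ["Single", "Married", "Single", "Married", "Single"],
    "Gender": ["F", "M", "M", "F", "F"],
})

# margins=True adds a total row and column; margins_name sets their label.
table = pd.crosstab(
    df["Personal_Status"],
    df["Gender"],
    margins=True,
    margins_name="Total",
)
print(table)
```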

Pandas approximating/rounding large numbers from csv

I am reading numbers from a csv file into a pandas dataframe. When the numbers I am reading are approximately >1E12, pandas will approximate the number to 3

Read .csv file in R

I am a beginner in R and have a file like the one below. state population Alabama 4779736 Alaska 710231 Arizona 6392017

How to create ratios using value counts and separate fields in Python?

Using the data frame shown below, I'd like to create manager-to-assistant and manager-to-associate percentages/ratios per location. I'm looking for the

R replace string in df with partial match in a list

I have a dataframe (df) in R and I want to create a new column (city1_n) that contains a line stored in the list key whenever there is a partial match between c

Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I want to filter my dataframe with an or condition to keep rows with a particular column's values that are outside the range [-0.25, 0.25]. I tried: df = df[(df
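This error commonly appears when Python's `or`/`and` is applied to whole Series instead of the element-wise operators `|` and `&`. The attempted code is cut off in the excerpt, so the following is only a sketch under that assumption; the column name `col` and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data; "col" stands in for the column being filtered.
df = pd.DataFrame({"col": [-0.5, -0.1, 0.0, 0.2, 0.3]})

# Use | (element-wise OR) rather than `or`, and wrap each comparison
# in parentheses because | binds more tightly than < and >.
outside = df[(df["col"] < -0.25) | (df["col"] > 0.25)]
print(outside)
```

An equivalent form is `df[~df["col"].between(-0.25, 0.25)]`, since `between` includes both endpoints by default.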

Compare two excel files for the difference using pandas with multiple tabs

I found this nice script online which does a great job comparing the differences between 2 excel sheets but there's an issue - it doesn't work if the excel file

How to create a dummy variable corresponding to a change in a value in R

I have the following data: week <- c(1,2,3,4,1,2,3,4,1,2,3,4) product <- c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C") price <- c(5,5,6

I have a dataframe with a JSON substring in one of the columns. I want to extract variables and make columns for them

import json df = pd.read_json("C:/xampp/htdocs/PHP code/APItest.json", orient='records') print(df) I would like to create three extra columns: ['name','l

How to "transpose" data from one date to another in Python

Sorry, I had a lot of trouble explaining my problem in the title, but I hope it will be more understandable with this example: I have a data source that tells me

How to select all the rows with 0

I have a dataset that contains some 0 values. I want to print all the rows having 0. I was able to print a single column, but can't find a way to print al
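The excerpt does not say which library is in use. Assuming a pandas DataFrame, one possible sketch builds a boolean mask over every cell and keeps the rows where any cell equals 0. The column names and values are made up.

```python
import pandas as pd

# Made-up sample data with zeros in different columns.
df = pd.DataFrame({"a": [1, 0, 3], "b": [4, 5, 0], "c": [7, 8, 9]})

# (df == 0) is a boolean frame; any(axis=1) flags rows with at least one zero.
rows_with_zero = df[(df == 0).any(axis=1)]
print(rows_with_zero)
```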

How do I select values from one array based on a boolean array?

Let's say I have 2 numpy arrays, with the same 1200x1200 shape. The first one contains boolean values. The second one is an image, that was converted to boolean

Summarize two dataframes in r

I have two dataframes df1 # var1 var2 # 1 X01 Red # 2 X02 Green # 3 X03 Red # 4 X04 Yellow # 5 X05 Red # 6 X06 Green df2 # X01 X02

Groupby and create a dummy =1 if column values do not contain 0, =0 otherwise

My df id var1 A 9 A 0 A 2 A 1 B 2 B 5 B 2 B 1 C 1 C 9 D 7 D 2 D 0 .. desired output will ha
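A minimal sketch of one way to build such a dummy in pandas, using the rows visible in the excerpt. The desired output is cut off, so it is an assumption here that the flag should be repeated on every row of its group; the column name `dummy` is hypothetical.

```python
import pandas as pd

# Rows as shown in the excerpt (the original data continues past them).
df = pd.DataFrame({
    "id": list("AAAABBBBCCDDD"),
    "var1": [9, 0, 2, 1, 2, 5, 2, 1, 1, 9, 7, 2, 0],
})

# True where var1 is non-zero; a group gets 1 only if all its rows are non-zero.
df["dummy"] = (
    df["var1"].ne(0).groupby(df["id"]).transform("all").astype(int)
)
print(df)
```

If one row per `id` is wanted instead, replacing `transform("all")` with `all()` returns a Series indexed by `id`.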

Pandas Lookup to be deprecated - elegant and efficient alternative

The Pandas lookup function is to be deprecated in a future version. As suggested by the warning, it is recommended to use .melt and .loc as an alternative. df =
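The question's own data is cut off, so the sketch below uses a made-up frame. Note that it swaps in a `factorize`-plus-NumPy-indexing approach rather than the `.melt`/`.loc` route named in the warning. The column names `col`, `A`, `B`, and `value` are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up data: "col" names which column to read from on each row.
df = pd.DataFrame({
    "col": ["A", "A", "B", "B"],
    "A": [80, 23, np.nan, 22],
    "B": [80, 55, 76, 67],
})

# factorize gives an integer code per row plus the unique labels.
idx, cols = pd.factorize(df["col"])

# Reorder columns to match the labels, then pick one cell per row.
df["value"] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
print(df)
```

This stays vectorized, which matters for large frames; a row-wise `apply` is simpler to read but usually slower.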