Category "dataframe"

Calculate row means on subset of columns

Given a sample data frame: C1<-c(3,2,4,4,5) C2<-c(3,7,3,4,5) C3<-c(5,4,3,6,3) DF<-data.frame(ID=c("A","B","C","D","E"),C1=C1,C2=C2,C3=C3) DF I

how to retrieve sector and industry for a list of tickers with python?

I have a list of tickers (below: tick1) that comes from the Earnings Report. I would like to add the "shortname", "sector" and the "industry" next to the ticker

pandas | list in column to binary column

I have the following dataframe: +------------+------------------+ | item | categories | +------------+------------------+ | blue_shirt | ['red', 'wh

Pandas Dataframe Categorical data transformation

I am having pandas dataframe as follows: import pandas as pd # dictionary with list object in values # Item=[Item1, Item2, Item3] details = { 'Date' : [

Change values of column in df using conditional in two columns

I'm having the following problem: I'm working with a dataset that can be found at https://www.kaggle.com/datasets/ricardomattos05/jogos-do-campeonato-brasileiro

not able to install dask on google colab

I use pip method to install on google lab. But I am not sure why it is not working. Here is what I got code pip install "dask[dataframe]" --upgrade error Requi

Efficiency of multiple chained str transformation and alternatives

I'm wanting to change a dataframe column so the values are lower case and also have their whitespace stripped. For this I used chained str transformations. df.l

Python Pandas - Lookup a variable column depending on another column's value

I'm trying to use the value of one cell to find the value of a cell in another column. The first cell value ('source') dictates which column to lookup. import p

Repeat rows in a pandas DataFrame based on column value

I have the following df: code . role . persons 123 . Janitor . 3 123 . Analyst . 2 321 . Vallet . 2 321 . Auditor . 5 The first line means that I hav

How to filter out a row if there are two consecutive instances of the same value?

I have a data frame with multiple similar sequences in which column Z has a string pattern containing "VALUE1" and "VALUE2" (only these two patterns matter) and

Pandas+Uncertainties producing AttributeError: type object 'dtype' has no attribute 'kind'

I want to use Pandas + Uncertainties. I am getting a strange error, below a MWE: from uncertainties import ufloat import pandas number_with_uncertainty = ufloa

New column with week on week spending by store

I have a dataset that I need to track customers spending week by week based on the store. store <- c(1,2,3,4,5,6,1,2,3,4,5,6) week <- c(1,1,1,1,1,1,2,2,2,

Multiple aggregations of the same column using pandas GroupBy.agg()

Is there a pandas built-in way to apply two different aggregating functions f1, f2 to the same column df["returns"], without having to call agg() multiple times

add a column in dataframe based on existing value in another dataframe

I have a dataframe DF3 : zone_id combine 0 ABD 10 BCD 20 ABC 30 ABE and a second dataframe :combinaison_df: zone_id combine 0

How can I create a cross-tab of two columns in a dataframe in Python and generate a total row and column in the output?

I have created a dataframe from a CSV file and now I'm trying to create a cross-tab of two columns ("Personal_Status" and "Gender"). The output should look like

Pandas approximating/rounding large numbers from csv

I am reading numbers from a csv file into a pandas dataframe. When the numbers I am reading are approximately >1E12, pandas will approximate the number to 3

Read .csv file in R

I am a beginner to R, I have a file like below. state population Alabama 4779736 Alaska 710231 Arizona 6392017

How to create ratios using value counts and separate fields in Python?

Using the data frame shown below I'd like to create manager to assistant and manager to associate percentages/ ratios based/ per location. I'm looking for the

R replace string in df with partial match in a list

I have a dataframe (df) in R and I want to create a new column (city1_n) that contains a line stored in the list key whenever there is a partial match between c

Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I want to filter my dataframe with an or condition to keep rows with a particular column's values that are outside the range [-0.25, 0.25]. I tried: df = df[(df