Category "dataframe"

randomly split dataframe into groups with even distribution of values

I have a dataframe of two groups (A and B) and within those groups, 6 subgroups (a, b, c, d, e, and f). Example data below: index group subgroup value 0

How do I reorder a long string of concatenated date and timestamps separated by commas using Python?

I have a string type column called 'datetimes' that contains multiple dates with their timestamps, and I'm trying to extract the earliest and last dates (withou

How to create variables based on column names in dataframe?

I wanted to create variables in Python based on the column names of my dataframe. Not sure if this is possible as I am quite new to Python. Let's say my df looks

AttributeError: Can't get attribute '_unpickle_block'

While using: with open("data_file.pickle", "rb") as pfile: raw_data = pickle.load(pfile) I get the error: AttributeError: Can't get attribute '_unpickle

Change order of categorical bars in Plotly parallel categories

I am trying to visualize changes in gene expression as categorical variables (up, down, no change) over various timepoints. I have a dataframe describing differ

Python: pandas merge multiple dataframes

I have different dataframes and need to merge them together based on the date column. If I only had two dataframes, I could use df1.merge(df2, on='date'), to do
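One common way to extend the two-frame df1.merge(df2, on='date') call to any number of frames is to fold a list of them with functools.reduce. The sketch below is only an illustration: the frames df1, df2, df3 and their columns a, b, c are hypothetical sample data, and only the shared 'date' column comes from the question. It assumes the default inner join is what is wanted.

```python
from functools import reduce

import pandas as pd

# Hypothetical sample frames; only the shared 'date' column comes from the question.
df1 = pd.DataFrame({"date": ["2024-01-01", "2024-01-02"], "a": [1, 2]})
df2 = pd.DataFrame({"date": ["2024-01-01", "2024-01-02"], "b": [3, 4]})
df3 = pd.DataFrame({"date": ["2024-01-01", "2024-01-02"], "c": [5, 6]})

frames = [df1, df2, df3]

# Fold the list pairwise: (df1 merge df2) merge df3, joining on 'date' each time.
merged = reduce(lambda left, right: left.merge(right, on="date"), frames)

print(merged)
```

If rows with dates missing from some frames should be kept, passing how='outer' to merge inside the lambda would be the usual adjustment.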

how to remove milliseconds or decimals in a specific dataframe column

I have 2 columns containing date and time (hr,min,seconds:milliseconds). How do I remove the milliseconds from only one of the columns? Name MinTime

Changing values in columns based on their previous marker

I have the following dataframe: df = {'id': [1,2,3,4], '1': ['Green', 'Green', 'Green', 'Green'], '2': ['34','67', 'Blue', '77'], '3': ['Blue', '45', '9

Removing NAs from two columns in a data frame and shifting values up

I have this data frame atac.v1.pbmc.5k.possorted.bam.bam possorted.bam.bam chr1.9941.10736 NA

Converting column values to rows [duplicate]

I have a dataset where all values in column B are the same. It looks like this: A B 0 Marble Hill Pizza Place 1 Ch

'Series' object has no attribute 'values_counts'

When I try to apply the values_counts() method to a series within a function, I am told that 'Series' object has no attribute 'values_counts'. def replace_1_occ_f

How to get all Sundays on dates in pandas and extract the corresponding values with it then save as new dataframe and do subtraction

I have a dataframe with 3 columns: file = glob.glob('InputFile.csv') for i in file: df = pd.read_csv(i) df['Date'] = pd.to_datetime(df['Date']) pri

Is there an R function to pick only certain row value combinations?

I have a data frame that looks something like this: my_data <- data.frame( letter = c("x","x","x","x","x","y","y","y","y","z","z","z","z"), number = c

dataframe Spark scala explode json array

Let's say I have a dataframe which looks like this: +--------------------+--------------------+--------------------------------------------------------------+

How to create tertile in R

I have a column in my dataframe called Score, for example DF$Score <- c(1.2,2,2,3.2,4.4,4.5,2.5,6.7,8.9,4.8) I want to make a new column containing tertiles of

How to convert the values of an attribute having categorical values to integer type?

I have a dataset in which one of its columns is Ex-Showroom_Price, and I'm trying to convert its values to integers but I'm getting an error. import pandas as p

Overwrite columns in DataFrames of different sizes pandas

I have the following two DataFrames: df1 = pd.DataFrame({'ids':[1,2,3,4,5],'cost':[0,0,1,1,0]}) df2 = pd.DataFrame({'ids':[1,5],'cost':[1,4]}) And I want to upd
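The excerpt is cut off, so the exact goal is not visible. Assuming it is to overwrite df1's cost with df2's cost wherever the ids match, and to leave the other rows alone, a minimal sketch could map the ids through df2. The variable name replacement is introduced here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({"ids": [1, 2, 3, 4, 5], "cost": [0, 0, 1, 1, 0]})
df2 = pd.DataFrame({"ids": [1, 5], "cost": [1, 4]})

# Assumption: df2's cost should overwrite df1's cost where the ids match.
# Look up each id's replacement cost in df2; ids absent from df2 give NaN.
replacement = df1["ids"].map(df2.set_index("ids")["cost"])

# Keep the original cost where there is no replacement, then restore the integer dtype.
df1["cost"] = replacement.fillna(df1["cost"]).astype(int)

print(df1)
```

This relies on ids being unique in df2; with duplicate ids the lookup would be ambiguous and map would raise an error.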