Category "pandas"

Extract nested values from data frame using python

I've extracted the data from the API response and created a dictionary function: def data_from_api(a): dictionary = dict( data = a['number'] ,created_b

Replace null values by the mean of each group

I have a dataset similar to the one below, with several columns that contain NaN values. I would like to group the dataset by location and fill the NaN in Iso code and
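
A minimal sketch of the group-wise mean fill named in the title, assuming a numeric column. The excerpt is cut off before the dataset appears, so the column name `value` and all data here are hypothetical; only `location` comes from the question.

```python
import pandas as pd

# Hypothetical data: "location" is from the question, "value" is an assumed numeric column.
df = pd.DataFrame(
    {
        "location": ["A", "A", "A", "B", "B"],
        "value": [1.0, None, 3.0, 10.0, None],
    }
)

# Mean of each location group, broadcast back to the original row order.
group_mean = df.groupby("location")["value"].transform("mean")

# Replace only the missing entries with their own group's mean.
df["value_filled"] = df["value"].fillna(group_mean)
```

Using `transform` keeps the result aligned with the original index, so no merge back onto the frame is needed.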

Pandas store a json into a pandas column

I have a pandas dataframe like this data = {"Name": ["Tom", "nick", "kish", "jack"], "Age": [20, 21, 19, 18]}

Remove zeros from Dataframe of lists

I have a DataFrame like this: index B 0 [0,1,2,0,4] 1 [1,0,2,0,0,1,7] I want to count the non-zero values of each list for each row. Result: index B 0 3 1 4
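
One possible way to get that count, sketched with the two rows shown in the excerpt. The column name `B` is from the question; the name `nonzero_count` is made up for illustration.

```python
import pandas as pd

# The two example rows from the question.
df = pd.DataFrame({"B": [[0, 1, 2, 0, 4], [1, 0, 2, 0, 0, 1, 7]]})

# Count the entries of each list that are not zero.
df["nonzero_count"] = df["B"].apply(lambda values: sum(v != 0 for v in values))
```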

LabelEncoding a permutation of combination of columns

I'd like to create class labels for a permutation of two columns using sklearn's LabelEncoder(). How do I achieve the following behavior? import pandas as pd im

(Pandas, Python) Selecting indices of a parent DF based on shared column values with a child DF

(I recently asked this question on r/learnpython (here), but didn't get any feedback, so am re-posting it verbatim here. Hope that is okay!) Suppose I have a D

Python rank: give negative rank to negative numbers

I have a basic set of data like: ID Value A 0.1 B 0.2 C -0.1 D -0.01 E 0.15 If we use data.rank() we get the result: ID Value A 3 B 5 C 1 D 2 E 4 Bu

Filter column list based on another column in Python

In Python, I have a dataset like this below, where column1 and column2 are objects and not strings: data = {'id': ['first_value', 'first_value', 'second_value'

Xarray: grouping by contiguous identical values

In Pandas, it is simple to slice a series (or array) such as [1,1,1,1,2,2,1,1,1,1] to return groups of [1,1,1,1], [2,2], [1,1,1,1]. To do this, I use the syntax:
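
The asker's own syntax is cut off in this excerpt. As an illustration only, a common pandas idiom for splitting a series into runs of identical values compares each element with its predecessor and takes a cumulative sum. This is an assumption about what such code typically looks like, not a reconstruction of the original.

```python
import pandas as pd

s = pd.Series([1, 1, 1, 1, 2, 2, 1, 1, 1, 1])

# A new run starts wherever a value differs from the one before it;
# the cumulative sum of those change points gives each run its own id.
run_id = (s != s.shift()).cumsum()

# Collect each contiguous run as a plain list.
runs = [group.tolist() for _, group in s.groupby(run_id)]
```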

What could be wrong with a Pandas DataFrame?

I couldn't make head or tail of this: I have a function that reads a bunch of CSV files from an S3 bucket, concats them and returns the DataFrame: def create_df(

I want to make URLs

1. The link is "https://www.xyz.{country}/dp/{asin}". 2. I have to pick two things from the CSV file: country and asin. The CSV file contains: Asin Country 0
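
A rough sketch of filling that template from two columns. The CSV contents are truncated in the excerpt, so the rows below are placeholder values, and the data is built in memory rather than read from a file; only the column names `Asin` and `Country` and the URL template come from the question.

```python
import pandas as pd

# Placeholder rows standing in for the CSV; real values are not shown in the excerpt.
df = pd.DataFrame(
    {
        "Asin": ["ASIN0001", "ASIN0002"],
        "Country": ["com", "de"],
    }
)

template = "https://www.xyz.{country}/dp/{asin}"

# Build one URL per row from the Country and Asin columns.
df["url"] = [
    template.format(country=country, asin=asin)
    for country, asin in zip(df["Country"], df["Asin"])
]
```

With a real file, `pd.read_csv` would replace the in-memory frame.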

Most efficient way to transform this data using Pandas?

I currently have several hundred .csv files in the format shown on the left below, and I need to transform them all into the format shown on the right. I tried

Retrieving values based on other values (dataframe) - how to make my code more efficient?

So after much trying I've managed to get something a bit closer to what I intend to do. Scenario is as follows, a dataframe with many columns of which one conta

How can I plot specific Excel data from two columns with conditions?

I have a huge spreadsheet of data that looks something like this: Date IDNumber Item 2021-05-10 1 Apple 2021-05-10 1 Orange 2021-05-10 2 Apple 2021-05-10 2 Gra

Sum of list values in a df, new column, values are objects

I have a df made of values from a dictionary. I can get rid of [], ',' and split it all in different cols (one col per number). But can't make the transfer to f

Make a mean of several yearly dataframes, hour by hour

I have several dataframes of some value taken every hour, over several years, like this: df1 Out[6]: time P G(i) H_sun T2m WS10m Int

How to convert mean value of each column variable and fill this mean value to corresponding variable in dataframe? [duplicate]

I have a mining dataset which has the following features: Rock_type, Gold in grams (AU). Rock type has 8 different rock types and Gold (AU) has pr

Iterating through XMLs, making dataframes from nodes and merging them with a master dataframe. How should I optimize this code?

I'm trying to iterate through a lot of xml files that have ~1000 individual nodes that I want to iterate through to extract specific attributes (each node has 1

Split second level MultiIndex column to create three level column in Pandas

Given a multiindex df X E1_ex0 E1_ex2 E2_ex0 E4_ex0 0 3 4 1 1 1 4 3 2 0 I would like to s

Pandas Merging 101

How can I perform a (INNER| (LEFT|RIGHT|FULL) OUTER) JOIN with pandas? How do I add NaNs for missing rows after a merge? How do I get rid of NaNs after merging?
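
As a small illustration of the join types these questions list, here is a sketch using `DataFrame.merge` with its `how` parameter. The two frames and their column names are invented for the example.

```python
import pandas as pd

# Two invented frames sharing a "key" column; B and C appear in both.
left = pd.DataFrame({"key": ["A", "B", "C"], "left_value": [1, 2, 3]})
right = pd.DataFrame({"key": ["B", "C", "D"], "right_value": [20, 30, 40]})

inner = left.merge(right, on="key", how="inner")      # keys in both frames
left_join = left.merge(right, on="key", how="left")   # all keys from left
right_join = left.merge(right, on="key", how="right") # all keys from right
outer = left.merge(right, on="key", how="outer")      # keys from either frame

# Rows without a match get NaN in the other frame's columns.
# They can be dropped, or filled with a chosen default, afterwards.
outer_dropped = outer.dropna()
outer_filled = outer.fillna(0)
```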