Category "pandas"

Unable to identify cause of: ValueError: Must have equal len keys and value when setting with an iterable

Background: I have a script that makes a daily API call for financial data, returns the data as a JSON object, saves it into a pandas df before doing some manipu

Python/Pandas Calculate the mean time (hour) of a Datetime column

I have a Pandas DataFrame (data) with a column ['Date'] in DateTime (date and time) which represents the time of arrival. How to calculate the mean of only the

Plotting the frequency of occurrences per date

I'm new to pandas and plotly, and I have a large CSV file with two columns: a date column and a column that contains a string of text (event). Each event is a n

creating a list from a column with multiple lines

I have a Pandas data frame in which one column, called SourceDocument, has multiple lines of data in each cell (separated by \n). SourceDocuments PRDS-002039\nP

Can't copy tuple from one dataframe into another (Length of values does not match length of index)

I want to create columns in a dataframe (df_joined) that contain, as values, tuples from a second df (df_tupels). The tuples are (10,50) and (20,60). I tried var

I am using pandas to check user input in multiple columns; I want the output to be the entire row that matches the input

Below is my code: for r in cols: full_row_of_matched = cols[cols.isin([input_ip]).any(axis=1)] exact_column = list(cols.columns[cols.eq(input_ip).any(0)

Find matching name in another table, return value associated w/ column in pandas

I have 2 tables. I want to take DF1 and adjust the values in the tables given the values in DF2. DF2 is simply a groupby of a column in DF1. In domain terms, I

Smart for loop in python for a portfolio performance

This is my first question here, so go easy on me. I've computed a certain portfolio in python, for which I've gotten a dataframe (or list for that matter) of ar

Groupby id and change values for all rows for the earliest date to NaN

I have the following id. I would like to groupby id and then replace value X with NaN. My current df: ID Date X other variables.. 1 1/1/18

Calculate cosine similarity and output without duplicates?

I have the following vectors in my toy example: data = pd.DataFrame({ 'id': [1, 2, 3, 4, 5], 'a': [55, 2123, -19.3, 9, -8],

Python complex iterating through excel files to concatenate colnames that are not named equal

I have multiple xls files in a directory. Each file's dataframe headers are different, but the data type is the same. 1.xls Location StreetAddress America Pvtld 80

Pandas: Values to columns and then group and merge by same Id [duplicate]

I have a dataframe like this df = DataFrame({'Id':[1,2,3,3,4,5,6,6,6], 'Type': ['T1','T1','T2','T3','T2','T1','T1','T2','T3'],

creating dictionaries from values in pandas columns with repeating values

Considering this sample dataframe: location emp 0 fac_1 emp1 1 fac_2 emp2 2 fac_2 emp3 3 fac_3 emp4 4 fac_4 emp5 It can be recreated by

Latex expressions in pandas dataframes not rendering in vscode

I am trying to set some labels and the caption of my dataframe using mathjax, but it doesn't render in vscode. For example, when I do import pandas as pd test =

TypeError: '<=' not supported between instances of 'str' and 'float'

I want to find the number of rows of the clin dataframe where the OS_MONTHS value is <= 12.0. The values in OS_MONTHS are floats. This seems like a trivial qu

how to use list comprehension to subset the dataframe with the valuecounts

make year honda 2011 honda 2011 honda n/a toyota 2011 toyota 2022 I'm trying to get a list of the makes that have value counts of more than 2. Below is

How to find minimum of some variable with repeating row indexes and preserve all other variables in Python Pandas

Basically, I have multiple repeating dates and the indices (1/2/1990 many times followed by 1/3/1990 many more times, etc.) I want to find the minimum of a give

searching in range between columns using sqlite3 in pandas

I have found a solution to my problem in one question, "Merge pandas dataframes where one value is between two others". I tried to modify it for my situation but it d

What's the equivalent of `pandas.Series.map(json.loads)` in polars?

Based on the polars documentation, one can use json_path_match to extract JSON fields into a string series. But can we do something like pandas.Series.map(json.load

Python Pandas SUMIF excel equivalent

I can't figure out how to achieve a certain task in my python script. I have a dataframe that contains media coverage for a specific topic. One of my columns na