Category "pandas"

Collapse pandas DataFrame based on daily column value

I have a pandas DataFrame with multiple measurements per day (for example hourly measurements, but that is not necessarily the case), but I want to keep only th
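A minimal sketch of one way to collapse to one row per day. The excerpt is cut off before it says which row to keep, so this assumes the row with the largest `value` on each day is wanted; the column names `timestamp` and `value` and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: several measurements per day
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2022-01-01 01:00", "2022-01-01 13:00",
        "2022-01-02 05:00", "2022-01-02 09:00",
    ]),
    "value": [3.0, 7.0, 5.0, 2.0],
})

# Assumption: keep the row with the maximum "value" for each calendar day.
# idxmax returns the index label of that row within each daily group.
idx = df.groupby(df["timestamp"].dt.date)["value"].idxmax()
daily = df.loc[idx].reset_index(drop=True)
```

Swapping `idxmax` for `idxmin` keeps the smallest value per day instead.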

KeyError when processing pandas DataFrame

For a pathway $p_i$, the CNA data of associated genes were extracted from the CNV matrix (C), producing an intermediate matrix $B \in \mathbb{R}^{n \times r_i}$, where $r_i$

Pandas Data Frame - Remove Overlapping Intervals

Suppose that you have a Pandas data frame that can be created using code below: test_df = pd.DataFrame( {'start_date': ['2021-07-01', '2021-07-02', '2021-07

Pandas Pivot is not producing desired output

My data looks like below. I am trying to pivot the dataframe such that SCHEMA NAME and TABLE NAME are in columns and Row Count, Table Type, date created and D

How to create and assign indexes for each group in a dataframe

[This is DataFrame loaded with data from an Excel file] STUDY Teacher UPDATE_DATE 0 math A 2022-02-25 1 math

Python Pandas. How to extract single column from downloaded yahoo_fin option chain data?

What is the proper way to extract a single column from a downloaded option_chain from yahoo_fin? My code for Exxon Mobil option chains: from yahoo_fin import opt

How to Merge two datasets with different indexes but one common ID factor?

I am working with two distinct datasets: one regarding COVID-19 statistics and one with demographic characteristics of a city. The covid19 one, namely covid.df

How to append row value to the other column

I have a table that includes two columns: P_SEG SEG_ID 1 [2, 4] 2 [4, 3,5] I want to create a new column that includes the 1st column value in the second colu
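One possible sketch, assuming the goal is a new column holding each `SEG_ID` list with that row's `P_SEG` value appended at the end (the excerpt is truncated, so the exact target layout is an assumption, and the column name `SEG_ALL` is hypothetical):

```python
import pandas as pd

# Data as given in the question excerpt
df = pd.DataFrame({"P_SEG": [1, 2], "SEG_ID": [[2, 4], [4, 3, 5]]})

# Build a new list per row so the original SEG_ID lists are left unchanged.
# "SEG_ALL" is a hypothetical name for the new column.
df["SEG_ALL"] = [ids + [p] for p, ids in zip(df["P_SEG"], df["SEG_ID"])]
```

A list comprehension is used here because list-valued cells are plain Python objects, so there is no vectorized pandas operation to lean on.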

How to perform loc with one condition that includes two columns

I have a df with two columns, A and B, both of which hold string values. Example: df_1 = pd.DataFrame(data={ "A":['a','b','c'], "B":['a x d','z y

Python - How to combine 2 pandas.core.frame.DataFrame objects with the same column name [duplicate]

So I got 2 pandas.core.frame.DataFrame like this: anomalies: Sales outlet Date 2006-07-01 700 2 a

Trying to get the minimum date and getting TypeError: '<' not supported between instances of 'datetime.datetime' and 'int'

I'm reading from an Excel file: GA = pd.read_excel("file.xlsx", sheet_name=0, engine= "openpyxl") The data type is: Email object Date datetime64[ns] Name object

Adding a new column with the first non-NaN value in each row closest to a chosen column (Python)

Hello, I want to create a new column from a given dataset (that I call here "df") with the first non-NaN value in each row that is closest to a given column. For example

How to draw a continuous contour plot with discrete coordinate data (DataFrame form)?

The row data has 3 columns and cannot form a uniform grid based on 'x' & 'z', so I am not able to plot the contour as in the existing question: Create Contour Pl

How to drop columns and rows with missing values?

I've been trying to take a pandas.DataFrame and drop its rows and columns with missing values simultaneously. While trying to use dropna and applying on both ax

How to write an Excel formula (SUM()) dynamically in a range of columns using openpyxl?

In my Excel spreadsheet I need to enter an Excel formula at the bottom that will sum the values. The number of rows can vary, but the number of columns cannot. So in cell B10 s

Normalise data where every second row is a column header

I have a system where you manually input data, for example data about people. Some fields are mandatory but the majority are optional. When the data is outputted it

Convert date + time strings to epoch milliseconds in dataframe column (when present)

I have a dataframe with a column called "snapshot_timestamp" where the time is in this format: 2022-05-01 23:45:47.428 (year, month, day, hour, minutes, seconds
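A hedged sketch of one way to do this conversion. It assumes the strings represent UTC times and that missing entries should stay missing; the output column name `epoch_ms` and the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample: one timestamp in the format from the question, one missing
df = pd.DataFrame({"snapshot_timestamp": ["2022-05-01 23:45:47.428", None]})

# Parse the strings; missing or unparseable values become NaT
ts = pd.to_datetime(
    df["snapshot_timestamp"],
    format="%Y-%m-%d %H:%M:%S.%f",
    errors="coerce",
)

# Assumption: the timestamps are UTC, so subtract the Unix epoch directly.
epoch = pd.Timestamp("1970-01-01")

# Whole milliseconds since the epoch; the nullable Int64 dtype keeps
# missing rows as <NA> instead of forcing a float column.
df["epoch_ms"] = ((ts - epoch) // pd.Timedelta(milliseconds=1)).astype("Int64")
```

If the strings are in a local time zone instead, they would need to be localized and converted to UTC before subtracting the epoch.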

Send csv file in pytest test using restframework ApiClient

I'm trying to test my view which requires csv_file and one parameter in request, but getting error "details": {"csv_file": ["The submitted data was not a file.

Turning incomplete lines into new columns on Pandas

Folks, I have converted a PDF using tabula-py and, due to the formatting (there are two lines with names in each name cell) I get this: col1 name doc col

Nested Data inside output needing to be expanded

I have the following code import requests import json import pandas as pd import csv import numpy from pandas.io.json import json_normalize url = 'http://URL/a