Category "pandas"

How to append row value to the other column

I have a table that includes two columns: P_SEG SEG_ID 1 [2, 4] 2 [4, 3,5] I want to create a new column that includes the 1st column value in the second colu
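
A minimal sketch of one way this could be approached, using the two sample rows from the excerpt. The output column name `COMBINED` is hypothetical, and placing the P_SEG value at the end of the list is an assumption, since the excerpt is cut off before the desired result is shown.

```python
import pandas as pd

# Sample data taken from the excerpt above.
df = pd.DataFrame({"P_SEG": [1, 2], "SEG_ID": [[2, 4], [4, 3, 5]]})

# Build a new list per row: the existing SEG_ID values followed by the
# row's P_SEG value. "COMBINED" is a hypothetical column name.
df["COMBINED"] = [
    seg_ids + [p_seg] for p_seg, seg_ids in zip(df["P_SEG"], df["SEG_ID"])
]

print(df)
```

The list comprehension creates new lists, so the original SEG_ID column is left unchanged.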

How to perform loc with one condition that includes two columns

I have a df with two columns, A and B, both holding string values. Example: df_1 = pd.DataFrame(data={ "A":['a','b','c'], "B":['a x d','z y

Python: How to combine 2 pandas.core.frame.DataFrame objects with the same column name [duplicate]

So I have 2 pandas.core.frame.DataFrame objects like this: anomalies: Sales outlet Date 2006-07-01 700 2 a

Trying to get the minimum date and getting TypeError: '<' not supported between instances of 'datetime.datetime' and 'int'

I'm reading from an Excel file: GA = pd.read_excel("file.xlsx", sheet_name=0, engine= "openpyxl") The data types are: Email object, Date datetime64[ns], Name object

Adding a new column with the first non-NaN value in each row closest to a chosen column (Python)

Hello, I want to create a new column from a given dataset (which I call "df" here) holding, for each row, the first non-NaN value closest to a given column. For example

How to draw a continuous contour plot with discrete coordinate data (DataFrame form)?

The raw data has 3 columns and cannot form a uniform grid based on 'x' and 'z', so I am not able to plot the contour as in the existing question: Create Contour Pl

How to drop columns and rows with missing values?

I've been trying to take a pandas.DataFrame and drop its rows and columns with missing values simultaneously. While trying to use dropna and applying on both ax
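
A minimal sketch of one reading of "simultaneously": both the row mask and the column mask are computed from the original frame, so dropping along one axis does not change what is dropped along the other. The sample data and variable names are hypothetical, and the excerpt is cut off before the asker's own attempt, so this may not match their exact goal.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with missing values in columns "a" and "b".
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, np.nan],
        "b": [4.0, np.nan, 6.0],
        "c": [7.0, 8.0, 9.0],
    }
)

# Compute both masks from the ORIGINAL frame before dropping anything.
not_missing = df.notna()
rows_ok = not_missing.all(axis=1)  # rows with no missing values
cols_ok = not_missing.all(axis=0)  # columns with no missing values

# Keep only the rows and columns that were complete in the original.
result = df.loc[rows_ok, cols_ok]
print(result)
```

Chaining `df.dropna(axis=0).dropna(axis=1)` would behave differently: the second call only sees what survived the first, so no further columns are removed once the incomplete rows are gone.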

How to write an Excel formula (SUM()) dynamically across a range of columns using openpyxl?

In my Excel spreadsheet I need to enter an Excel formula at the bottom that sums the values. The number of rows can vary, but the number of columns cannot. So in cell B10 s

Normalise data where every second row is a column header

I have a system where you manually input data, for example data about people. Some fields are mandatory but the majority are optional. When the data is outputted it

Convert date + time strings to epoch milliseconds in dataframe column (when present)

I have a dataframe with a column called "snapshot_timestamp" where the time is in this format: 2022-05-01 23:45:47.428 (year, month, day, hour, minutes, seconds
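
A minimal sketch of one possible conversion, using the timestamp format quoted in the excerpt. It assumes the naive timestamps are in UTC and that "when present" means some rows have no value. The sample frame and the `epoch_ms` column name are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; None stands in for a row with no timestamp.
df = pd.DataFrame({"snapshot_timestamp": ["2022-05-01 23:45:47.428", None]})

# Parse the strings; missing or unparseable entries become NaT.
parsed = pd.to_datetime(df["snapshot_timestamp"], errors="coerce")
present = parsed.notna()

# Assumption: the timestamps are UTC, so the offset from the Unix epoch
# can be taken directly against 1970-01-01.
elapsed = parsed[present] - pd.Timestamp("1970-01-01")

# Nullable integer column so rows without a timestamp stay <NA>.
df["epoch_ms"] = pd.Series(pd.NA, index=df.index, dtype="Int64")
df.loc[present, "epoch_ms"] = elapsed // pd.Timedelta(milliseconds=1)

print(df)
```

Dividing by a one-millisecond `Timedelta` avoids depending on the internal resolution of the datetime column.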

Send csv file in pytest test using restframework ApiClient

I'm trying to test my view which requires csv_file and one parameter in request, but getting error "details": {"csv_file": ["The submitted data was not a file.

Turning incomplete lines into new columns on Pandas

Folks, I have converted a PDF using tabula-py and, due to the formatting (there are two lines with names in each name cell) I get this: col1 name doc col

Nested Data inside output needing to be expanded

I have the following code import requests import json import pandas as pd import csv import numpy from pandas.io.json import json_normalize url = 'http://URL/a

What column should I assign to parse_dates while working with Google Finance?

I wrote code to show a graph from Google Finance but I got this error: ValueError: Missing column provided to 'parse_dates': 'Date' This was my code: from bok

How to combine two columns in pandas dataframe and set values to them?

I have two columns in a pandas dataframe, Latitude and Longitude. I am trying to combine them into a single column, LOCATION. If we look at the data there are only two loc

Assigning each Excel sheet to a variable while looping (using openpyxl) and creating a dataframe for each sheet

I have an Excel document with multiple sheets containing different data sets. For instance, the first sheet has 2 columns of data whereas the second sheet (sheet 2) ha

How to center-align content in every cell?

I'm trying to center the table content using df.style.set_properties(**{'text-align': 'center'}). But I couldn't do it. Is there any other way? Here is the full

How do I match variations of a pandas string based on a list?

I have a pandas dataframe with one column containing country names and I'd like to flag them if they appear in a list of countries I have. However, some of the

Why is there an extra row of zeros in the histogram of images in a folder?

I have a folder comprising 20 images (.jpg format). I am trying to obtain the histogram of each of the images and store it as a Pandas data frame. My code is sh

Python pandas - series to dataframe

How do I print out only the country names that exist in the dataframe, given a series with country names as its index?