Category "dataframe"

Binning 2D data with circles instead of rectangles - from pandas df

I have a dataframe of x, y data and need to bin it into circles, i.e. a grid of circles of a certain size and spacing centered on some point. So for example some da

How Do I Upload External Data in Explainerdashboard

I am trying to upload external data into the dashboard using explainer.set_x_row_func() and explainer.set_y_func(). Does anyone know how to do this? Below is ho

Pandas merge returns NaN values

Please consider two pandas dataframes, df1 and df2: df1 = pd.read_csv('df1.csv', sep=';') df2 = pd.read_csv('df2.csv', sep=';') We convert to date fields: df1['

Add a new record for each missing second in a DataFrame with TimeStamp [duplicate]

Consider the following Pandas DataFrame: | date | counter | |-------------------------------------|--------------

Replacing negative values in specific columns of a dataframe

This is driving me crazy! I want to replace all negative values in columns containing string "_p" with the value multiplied by -0.5. Here is the code, where Tdf

Comparing 2 columns with different rows in different csv files, and output status to another csv file

I have 2 csv files as shown below. They contain different numbers of rows and the columns are not aligned/sorted along a common index. I need to compare the col

%>% .$column_name equivalent for R base pipe |>

I frequently use the dplyr piping to get a column from a tibble into a vector as below iris %>% .$Sepal.Length iris %>% .$Sepal.Length %>% cut(5) How

Is there any difference between a Python script run in Airflow and the same script run in Python?

I wrote the code below, but it runs endlessly in Airflow, whereas on my system it takes 5 min to run gc=pygsheets.authorize(service_account_file='file.json'

Get statistics for each group (such as count, mean, etc) using pandas GroupBy?

I have a data frame df and I use several columns from it to groupby: df[['col1','col2','col3','col4']].groupby(['col1','col2']).mean() In the above way I almos

dtale show in jupyter notebook

I am exploring this new Python package named dtale. It is very convenient for pandas data frames visualization. https://pypi.org/project/dtale/ It worked onc

Pyspark Window function on entire data frame

Consider a pyspark data frame. I would like to summarize the entire data frame, per column, and append the result for every row. +-----+----------+-----------+

How to check if a value in a DataFrame is of type Decimal

I am writing a data test for some API calls that return a DataFrame with a date type and a Decimal type. I can't find a way to verify the Decimal the DataFrame

Get index and column with multiple headers and index_col in Pandas DataFrame

I have a dataframe with multiple headers and column indexes, and would like to retrieve the list of entries that are non-zero. The dataframe is constructed from

How to edit/sort a non-column column in Python?

I wrote the script below, and I'm 98% content with the output. However, the unorganized manner/ disorder of the 'Approved' field bugs me. As you can see, I trie

How to fix ParserError: year 0 is out of range: 0000-00-00 with Python Pandas to_datetime method

I am trying to convert a column "travel_start" to a datetime object. Dashboard["travel_start"] = pd.to_datetime(Dashboard["travel_start"]) But I get the fol

How to speed up an ordinary dataframe loop in Python? Vectorisation? Multiprocessing?

I have a simple piece of code. Essentially, I want to speed up my loop that creates a dataframe using dataframes. I haven't found an example and would appreciat

Row-wise correlation between two DataFrames with unequal numbers of columns

I have two DataFrames (Dataset1 = 200 rows, 34 columns; Dataset2 = 200 rows, 22 columns). I want the row-wise correlation between both datasets. How can I perform this? I

Get the index of a search item in a dataframe

I have a dataframe which contains a column combine 0 (43,FR,html5 full skinz html5) 1 (43,FR,mobile m-skinz2) 2 (43,FR,mobile m-skinz2 plus) 3

Transform a dataframe using pivot

I am trying to transform a dataframe using pivot. Since the column contains duplicate entries, I tried to add a count column following what's suggested here (Qu

How to create columns from other columns?

I want to build a dataframe like df2 from df1, always looking for the name of the column where the value is closest to 0: Where clossets_1 - closest value to 0 of