Category "pandas"

Pandas Lookup to be deprecated - elegant and efficient alternative

The Pandas lookup function is to be deprecated in a future version. As suggested by the warning, it is recommended to use .melt and .loc as an alternative. df =
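
A minimal sketch of one possible replacement for `DataFrame.lookup`. It swaps in `pd.factorize` plus NumPy integer indexing, not the `.melt`/`.loc` route the excerpt mentions. The frame and the column names `col`, `A`, `B`, and `value` are hypothetical, since the original data is cut off.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "col" names the column to read from in each row.
df = pd.DataFrame({"col": ["A", "B", "A"], "A": [1, 2, 3], "B": [4, 5, 6]})

# Encode the labels as integer positions, plus the unique labels themselves.
idx, cols = pd.factorize(df["col"])

# Reorder the columns to match the codes, then pick one cell per row.
df["value"] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
```

Each row picks the cell from the column named in `col`, so the result stays vectorized rather than looping over rows.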

Use a list with function names to iteratively apply over a dataframe column

Context: I'm allowing a user to add specific methods for a cleaning process pipeline (appended to a main list with all the methods chosen). Each element from th

AttributeError: 'numpy.ndarray' object has no attribute 'columns' even after using pandas dataframe

import pandas as pd import matplotlib.pyplot as plt from sklearn.model_selection import train_test_split,cross_val_score from sklearn.tree import DecisionTreeCl

Append new data into an existing frame and upload to sheets Python

I connected to my API client, sent the credentials, made the request, and loaded the returned data into a DataFrame. Then I have to upload this data to a sh

Python - Hours between two dates, excluding weekends

I'm taking my first steps in the Python programming language. I want to create a script that opens an Excel file and adds an extra column that will be the hourl
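
One way to count weekday hours between two timestamps, as a rough sketch. It assumes both timestamps fall on whole hours and treats Saturday and Sunday as the only excluded days. The `start` and `end` values are made up for illustration, and the Excel step from the question is left out.

```python
import pandas as pd

# Hypothetical timestamps: a Friday at midnight to the following Monday at noon.
start = pd.Timestamp("2022-01-07 00:00")
end = pd.Timestamp("2022-01-10 12:00")

# One entry per hour boundary; dropping the last leaves the start of each hour.
hour_starts = pd.date_range(start, end, freq=pd.Timedelta(hours=1))[:-1]

# dayofweek is 0 for Monday through 6 for Sunday, so values below 5 are weekdays.
weekday_hours = int((hour_starts.dayofweek < 5).sum())
```

Here the 24 Friday hours and 12 Monday hours are counted, and the weekend is skipped.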

Pandas: Calculate Difference between a row and all other rows and create column with the name

We have data as below Name value1 Value2 finallist 0 cosmos 10 20 [10,20] 1 network 30 40 [30,40] 2 unab 20 40 [20,40]

Updating a Value of a Pandas DataFrame with a Function

I have a function which updates a dataframe that I have passed in: def update_df(df, x, i): for i in range(x): list = ['name' + str(i), i + 2, i - 1

ModuleNotFoundError: pandas 1.3.5 with pyinstaller 4.10

I'm trying to compile a python script using pyinstaller and pyinstaller says " 10230 INFO: Building EXE from EXE-00.toc completed successfully" but when I execu

Dataframe Operation Splicing

I have a single column dataframe without headers and I want to split it into multiple columns as follows The current dataframe - 1 2 3 4 5 . . 100 I want to re

python how to use string value for custom sort?

I have a dataframe like this time_posted 0 5 days ago 1 an hour ago 2 a day ago 3 6 hours ago 4 4 hours ago I tried this df.sort_values(by='time_p
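
A minimal sketch of sorting such strings by the age they describe, using the `key` argument of `sort_values`. The helper `age_in_seconds` and the `UNIT_SECONDS` table are hypothetical and only cover minutes, hours, and days in the "N units ago" shape shown in the excerpt.

```python
import pandas as pd

# Hypothetical unit table; extend it if other units appear in the data.
UNIT_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}

def age_in_seconds(text):
    """Convert strings like '5 days ago' or 'an hour ago' to seconds."""
    count, unit = text.split()[:2]
    n = 1 if count in ("a", "an") else int(count)
    return n * UNIT_SECONDS[unit.rstrip("s")]

df = pd.DataFrame(
    {"time_posted": ["5 days ago", "an hour ago", "a day ago", "6 hours ago", "4 hours ago"]}
)

# Sort by the parsed age (most recent first) rather than alphabetically.
ordered = df.sort_values(by="time_posted", key=lambda s: s.map(age_in_seconds))
```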

How to correctly generate training data based on percentages?

I have a question. I am currently generating training data for my bayesian network as follows: (also as code down below) -> infected stands for people who a

Improve performance of LineString creation, that currently is created by a lambda function

I have a dataframe like this (this example has only four rows, but in practice it has O(10^6) rows): DF: nodeid lon lat wayid 0 1 1.70

When using read_sql_query in pandas, how to write the SQL across multiple lines?

My question is pretty much what it sounds like: Is it possible to write my SQL across multiple lines for ease of reading when using the read_sql_query method pl
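
As an illustration, a triple-quoted Python string lets the SQL span several lines before it is passed to `pd.read_sql_query`. This sketch assumes an in-memory SQLite database; the table `t` and its columns are invented for the example.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database with a tiny example table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])

# A triple-quoted string keeps the query readable across several lines.
query = """
    SELECT id, name
    FROM t
    WHERE id > 1
"""

result = pd.read_sql_query(query, conn)
conn.close()
```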

Unable to read a column of an excel by Column Name using Pandas

Excel Sheet I want to read values of the column 'Site Name' but in this sheet, the location of this tab is not fixed. I tried, df = pd.read_excel('TestFile.xlsx

Python Rolling Window with Timestamp

I'm having a hard time looping through my pandas data frame while trying to apply a) a specific window size (6hours) and b) specific step size (1hour). I have t

Replace entire pandas dataframe after scaling without warning

I have tried this according to this answer x = df[feature_collums] y = df[[label_column]][label_column] from sklearn.preprocessing import MinMaxScaler scaler =

Showing different size circles in heatmap with legend using Matplotlib

I am asking a question stemming from this original post Heatmap with circles indicating size of population I am trying to replicate this using my dataframe, how

dataframe-image 0.1.1 does not export as image

The library dataframe_image is being used to convert a dataframe to png in Spyder. However, it raises an error when the example code is executed. The error is: T

Dividing each element of a python dataframe by a Series

I have a dataframe like below: import pandas as pd data1 = {"a":[1.,3.,5.,2.], "b":[4.,8.,3.,7.], "c":[5.,45.,67.,34]} data2 = {"a":[4., 6, 8] }

Pandas - No Null values but pd.to_datetime gives "Reindexing only valid with uniquely valued Index values"?

Sample Data +---------+------------------------+ | | date | +---------+------------------------+ | 0 | 2020-12-31 00:00:00 |