Category "pandas"

Scikit-learn pipeline: Non-finite test scores error / Inconsistent number of samples

I have a dataframe with two columns of texts and only the POS tags (of the same texts), which I want to use for language classification. I am trying to use both

Why does column + column concatenation create arrays for some Windows accounts?

When running the Python code below, I get different results depending on the user account and admin privileges used. The code is saved as test.py on a Windo

Select all in pandas where column equals value and all other columns are blank

I have a DataFrame containing permissions for roles of each user, e.g. function/role role1_permissions role2_permissions role3_permissions role4_permissions ca
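
One way to approach this kind of selection, as a minimal sketch: the helper name `rows_with_only`, the `ignore` parameter, and the sample data are hypothetical, and "blank" is assumed to mean NaN or an empty string, since the excerpt is cut off before the actual data.

```python
import numpy as np
import pandas as pd

def rows_with_only(df, column, value, ignore=()):
    """Return rows where `column` equals `value` and every other column is blank.

    "Blank" is assumed to mean NaN or an empty string; columns listed in
    `ignore` (for example an identifier column) are not checked.
    """
    others = df.drop(columns=[column, *ignore])
    others_blank = (others.isna() | others.eq("")).all(axis=1)
    return df[df[column].eq(value) & others_blank]

# Made-up sample data using the column names from the excerpt.
permissions = pd.DataFrame({
    "role1_permissions": ["read", "read", "write"],
    "role2_permissions": [np.nan, "write", np.nan],
    "role3_permissions": ["", np.nan, np.nan],
})
only_role1_read = rows_with_only(permissions, "role1_permissions", "read")
```

Only the first row is kept here: the second row has a value in another column, and the third has a different value in the target column.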

Pandas: How to apply a function with multiple column inputs and a where condition

I have a pandas dataframe. I want to generate a new variable (column) based on multiple column inputs where the year index is greater than a certain value. The

SettingWithCopyWarning Python3

I am finding the max of df2 by row, and assigning the max value to a new column on df1. df1['max'] = df2[df2.keys().tolist()].max(axis=1) This line is throwing a Sett
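
A common cause of this warning is that df1 was itself created by slicing another DataFrame. The excerpt does not show how df1 was built, so the sketch below assumes that situation; `source` and the sample values are made up for illustration. Taking an explicit `.copy()` before adding the column makes the intent unambiguous.

```python
import pandas as pd

# Hypothetical data: df1 is assumed to be a filtered slice of another frame.
source = pd.DataFrame({"id": [10, 20, 30]})
df2 = pd.DataFrame({"a": [1, 5], "b": [3, 2]})

# An explicit copy makes df1 an independent DataFrame, so assigning a new
# column no longer targets a possible view of `source`.
df1 = source[source["id"] < 30].copy()

# Row-wise maximum of df2, aligned to df1 by index.
df1["max"] = df2.max(axis=1)
```

Note that `df2[df2.keys().tolist()]` selects every column, so `df2.max(axis=1)` gives the same result when all columns are numeric.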

How to read a CSV file edited in Excel in Python?

Hi, I tried loading a CSV file that has been edited and saved as comma-delimited CSV. However, it is not loading correctly. I used the usual pd.read_csv like so. df

How to replace nan with a certain value across rows but only between values

I have the following dataframe, and I want to replace nan with a certain value, let's say, 0.0001, only if there is a value to the right of the missing value. ID 2021_
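
A minimal sketch of the rule as stated in the excerpt (fill a NaN only when some value exists further right in the same row). The helper name and the column names are hypothetical, since the real columns are cut off. Note that this also fills leading NaNs that precede the first value in a row; restricting the fill strictly to gaps between two values would need an additional `ffill`-based check.

```python
import numpy as np
import pandas as pd

def fill_gaps_before_values(df, value):
    """Replace NaN with `value` only where a non-NaN value exists further right."""
    # Back-filling along each row leaves NaN only where nothing follows.
    has_value_to_right = df.bfill(axis=1).notna()
    return df.mask(df.isna() & has_value_to_right, value)

# Made-up sample data; the period columns are placeholders.
data = pd.DataFrame(
    {"p1": [1.0, np.nan, 4.0], "p2": [np.nan, 2.0, np.nan], "p3": [3.0, np.nan, np.nan]}
)
filled = fill_gaps_before_values(data, 0.0001)
```

In this sample, the gap in the first row and the leading NaN in the second row are filled, while trailing NaNs are left untouched.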

pandas scraping html tables

There is an HTML file of tables. There are about 100 of them, and they often share the same values. The values in the first and second columns of all tables a

Python Rolling sum for 32 bit vs 64 bit

I am getting strange results when doing rollingSum for 64 bit vs 32 bit precision. Please see the code for display 1 vs 2. Display 1 shows the right rolling sum

Why are all my variables objects instead of numerical values (int,float) when uploaded?

I just started, so this might be a stupid question, but I have the following problem: I created a .csv file for some basic data description. However, although they are all nume

Utilizing multiple processors to load large number of files for pandas (python)

I need to load hundreds or thousands of JSON files into a big pandas dataframe. My current solution, which uses a for loop to iterate over the directory, is slow and is not

Groupby hours +/- some integer of additional hours

I have a data frame consisting of some columns, where the index is datetime, i.e. it looks something like: df = col1 col2

Make a dataframe available until it's updated [duplicate]

I have a Flask application which reads a dataframe and provides it in a service. The problem is that I need to update it (only a reading from s

Getting error 'NoneType' object has no attribute 'read' in python for image processing image_dataframe['image']

I am working on image classification using a CNN. I am using the source code below for that task. I am stuck on this error: AttributeError: 'NoneType' object has

Split dataframe column at specific words

One column in my dataframe is a long string. I want to split out portions of the string into its own column based on a few different words. What would be the be

How to reassign values in column by condition in dataframe?

df = pd.DataFrame([["A", "AA", "AAA", "found"], ["A", "AB", "ABA", "not found"], ["A", "AB", "ABB", "not found"],

KeyError while reading a CSV file in Python

I am trying to plot the fall of an object (an optical fork to be precise) as a function of time in order to verify that the law of gravity is indeed 9.81. The d

Python: Reading a Windows generated csv with carriage return in column

I'm working on a Python program that needs to read csv files that are produced on a Windows 2012 server machine. The aim of the Python code is to give a min/max

How to exclude weekends and holidays from finding the difference between two dates in python

I need to find the difference between two dates where certain end dates are blank. I need to exclude weekends, as well as holidays, when calculating the
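
One possible approach, sketched with NumPy's `busday_count`: the helper name, the sample dates, and the holiday list are assumptions for illustration, and rows with a blank end date are assumed to yield NaN. `busday_count` counts working days in the half-open interval from start (inclusive) to end (exclusive).

```python
import numpy as np
import pandas as pd

def business_days_between(start, end, holidays=()):
    """Count weekdays from start (inclusive) to end (exclusive), skipping holidays.

    Rows where either date is missing are returned as NaN.
    """
    start = pd.to_datetime(start)
    end = pd.to_datetime(end)
    valid = start.notna() & end.notna()
    result = pd.Series(np.nan, index=start.index)
    result[valid] = np.busday_count(
        start[valid].values.astype("datetime64[D]"),
        end[valid].values.astype("datetime64[D]"),
        holidays=np.array(list(holidays), dtype="datetime64[D]"),
    )
    return result

# Made-up sample: the second row has a blank end date.
starts = pd.Series(["2024-01-01", "2024-01-02"])
ends = pd.Series(["2024-01-08", None])
diff = business_days_between(starts, ends, holidays=["2024-01-01"])
```

Here the first row spans Monday to the following Monday with one holiday, and the second row stays NaN because its end date is blank.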