Category "pandas"

How to create a dummy variable for specific values in a column?

I want to create a dummy variable for a specific value in a column. Let's say my database looks like this: I want a dummy variable just for the museums. pd.ge
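A minimal sketch of one way to do this, assuming the values sit in a single text column. The column name `place_type`, the new column `is_museum`, and the sample values are hypothetical and do not come from the question:

```python
import pandas as pd

# Hypothetical data: the column name and its values are assumptions
df = pd.DataFrame({"place_type": ["museum", "park", "museum", "library"]})

# Compare against the single value of interest, then cast the boolean mask to 0/1
df["is_museum"] = (df["place_type"] == "museum").astype(int)

print(df)
```

Comparing against one value avoids creating a dummy column for every category when only one is needed.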

Pandas combining slices and list to select columns

Let us assume that a DataFrame df has the following columns: ['c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7'] We can use a slice or a list to select some columns: Wit
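A possible sketch of mixing a slice with a list, using the column names from the excerpt. Which columns the asker wants is not visible here, so the selection (`c1` to `c3` plus `c5` and `c7`) is an assumption made for illustration:

```python
import numpy as np
import pandas as pd

# One-row frame with the column names from the excerpt; the values are placeholders
df = pd.DataFrame([list(range(7))],
                  columns=["c1", "c2", "c3", "c4", "c5", "c6", "c7"])

# Label-based: expand the slice to a list of names, then append the explicit list
sliced = df.loc[:, "c1":"c3"].columns.tolist()
selected = df[sliced + ["c5", "c7"]]

# Position-based: np.r_ concatenates slices and integers into one index array
selected_pos = df.iloc[:, np.r_[0:3, 4, 6]]
```

Both variants select the same five columns. Note that label slices with `.loc` include the end label, while positional slices exclude the end position.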

Perform a merge by date field without creating an auxiliary column in the DataFrame

Given the following DataFrames in Python pandas: | date | counter | |-----------------------------|------------------| | 2022-01-0

Iterating over arrays of different lengths and replacing values

I have a dataframe that looks like this: df = pd.DataFrame({'col1': [[[1,5,3],[0,0,0]], [[1,2,3],[0,0,0], [1,2,3]]]}) # which looks like this: col1 0 [[1

How to plot distribution of missing values in a dataframe

I have a data frame with hundreds of columns and would like to investigate the proportion of missing values by plotting a graph. I'm able to get the proportion using

removing columns with pandas from csv - not found in axis

I'm trying to remove one column from a .csv file but I'm receiving an error. import pandas as pd df.drop("First Invoice #", axis = 1, inplace= True) KeyError: "['First

Concat null columns data with actual data in pandas?

I have a set of columns that need to be merged into a single column, where some columns have data and some don't, and where it should be joined with the data into a single co

pandas, creating dataframes based on tuple

I have a tuple that has data for several categories. Now I want to extract small dataframes from this tuple for each category based on a list I created. I want

How to plot correlation matrix/heatmap with categorical and numerical variables

I have 4 variables of which 2 variables are nominal (dtype=object) and 2 are numeric(dtypes=int and float). df.head(1) OUT: OS_type|Week_day|clicks|avg_app_s

Sum of different slices rows and column

I have a pandas DataFrame df and three arrays, columns_list, lower_boarder and upper_boarder, all of which have the same shape. I want to find array with shape as input arra

FastAPI - DataFrame updates lost between routes

I'm trying to make a simple FastAPI api. Let's suppose these routes: The POST Route @api.post('/user', name='Get list of users') def get_user(user: User):

Pandas cannot open an Excel (.xlsx) file

Please see my code below: import pandas df = pandas.read_excel('cat.xlsx') After running that, it gives me the following error: Traceback (most recent call las

Count occurrences within a specific range

I have a data frame that looks like this: Tag 0 skip_1 1 run 2 skip_1 3 run 4 skip_1 5

TypeError: This COM object can not automate the makepy process - please run makepy manually for this object

What kind of error is this? Traceback error C:\Users\DELL\PycharmProjects\MyNew\venv\Scripts\python.exe C:/Users/DELL/PycharmProjects/MyNew/agaaaaain.py Traceba

Unable to resolve pandas encoding error by changing encoding

I'm having trouble resolving an encoding error when reading a csv file using the pandas library. import pandas as pd filepath = "D:\Datasets\2019HighwayBridgeIn

Splitting and grouping pandas into intervals and calculating mean based on different column

I have a well-known Titanic dataset and I am trying to find the survival probability of a person, based on their age and sex. The input I am given is the number

convert numpy array of float64 values to datetime64 with python and pandas

I have a NumPy array of float64 values. I know these values represent dates in datetime64[ns] format. I try to convert them with pandas, but I get a Valu

Parse txt file in Pandas

I have a table in a file and it looks like this: +-------------+-----------------+---------------+---------------+--------------+ |number |name

Pandas ValueError: Cannot set item on a Categorical with a new category, set the categories first

I've been looking for other similar issues on this ValueError, but none of them has the same code as I have. So here it is. As I am still very new at this, I am

How to compare just the date or just date time ignoring seconds in a Python Pandas dataframe column of mixed data types?

In a pandas dataframe, I have a column of mixed data types, such as text, integers and datetimes. I need to find columns where datetimes match: (1) exact values