Category "pandas"

Pagination not working in Python Session.put()

I am trying to upload a file to a website (that has an inbuilt API) using the following code. The code reads a list of medical codes/diagnoses codes etc. (1 col

Read Excel file in Python using pandas

I am trying to read an Excel file in PyCharm using pandas. I installed the package successfully. My issue is that I am trying to use file location in addition to i

Join two pyarrow tables

I have ORC data as follows. Table A: Name age school address phone tony 12 havard UUU 666 tommy 13 abc

replace the empty value in the dataframe with a list of python values

There is a list of shops |Shop ID| |-------| | Shop1 | | Shop2 | | Shop3 | There is a list of events that took place in the store |Shop ID| Event | Start_date

Error: pandas hashtable keyerror

I have successfully read a CSV file using pandas. When I try to print a particular column from the data frame, I get a KeyError. Hereby I am shari

How to apply Target Encoding in test dataset?

I am working on a project, where I had to apply target encoding for 3 categorical variables: merged_data['SpeciesEncoded'] = merged_data.groupby('Species')['Wnv

Make Seaborn Distplot and Barplot the same color [duplicate]

I have been unable to figure out how to set the colors between distplot and barplot to be the same. Despite setting the color argument in both

AWS Athena table from python output with dates - dates get wrongly converted

I have a pandas DataFrame containing a date column ("2022-02-02"). I write this table to parquet using pyarrow. df[col] = df[col].astype(str) df.to_parquet(loc)

Binning 2D data with circles instead of rectangles - from pandas df

I have a dataframe of x, y data and need to bin it into circles, i.e. a grid of circles of a certain size and spacing centered on some point. So for example some da

How Do I Upload Data Externally in Explainerdashboard

I am trying to upload external data into the dashboard using explainer.set_x_row_func() and explainer.set_y_func(). Does anyone know how to do this? Below is ho

Pandas merge returns NaN values

Please consider two pandas DataFrames, df1 and df2: df1 = pd.read_csv('df1.csv', sep=';') df2 = pd.read_csv('df2.csv', sep=';') We convert to date fields: df1['

Add a new record for each missing second in a DataFrame with TimeStamp [duplicate]

Given the following pandas DataFrame: | date | counter | |-------------------------------------|--------------
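
One possible approach to the question in the title is to index the frame by its timestamp and resample it to a one-second frequency with `DataFrame.asfreq`. The sketch below is only an illustration: the column names `date` and `counter` come from the excerpt, while the timestamps and counter values are invented, and leaving the new rows as NaN is an assumption about what the asker wants.

```python
import pandas as pd

# Hypothetical data; only the column names come from the excerpt above.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2022-01-01 00:00:00",
        "2022-01-01 00:00:01",
        "2022-01-01 00:00:04",
    ]),
    "counter": [1, 2, 3],
})

# Use the timestamps as the index, then conform it to a 1-second frequency.
# Seconds that were missing appear as new rows with NaN in "counter".
filled = df.set_index("date").asfreq("s").reset_index()
print(filled)
```

If the new rows need a value rather than NaN, `asfreq` also accepts a `fill_value` argument.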

Comparing 2 columns with different rows in different csv files, and output status to another csv file

I have 2 csv files as shown below. They contain different numbers of rows and the columns are not aligned/sorted along a common index. I need to compare the col

Error with delimiters on dataframe when trying to upload it to MSSQL

So I've been trying to upload a dataframe to a specific table in MSSQL, and I've been trying to use the BCPANDAS library to upload the data to it. However th

Get statistics for each group (such as count, mean, etc) using pandas GroupBy?

I have a data frame df and I use several columns from it to groupby: df[['col1','col2','col3','col4']].groupby(['col1','col2']).mean() In the above way I almos
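
One common way to get several statistics per group in a single call is named aggregation with `GroupBy.agg`. This is a minimal sketch, not the asker's code: the column names follow the excerpt, but the values and the output names `rows`, `col3_mean`, and `col4_mean` are made up for illustration.

```python
import pandas as pd

# Hypothetical data; only the column names come from the excerpt above.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b", "b"],
    "col2": ["x", "x", "y", "y", "z"],
    "col3": [1.0, 3.0, 5.0, 7.0, 9.0],
    "col4": [10, 20, 30, 40, 50],
})

# Named aggregation: each keyword becomes an output column built from
# a (source column, statistic) pair.
stats = (
    df.groupby(["col1", "col2"])
      .agg(
          rows=("col3", "size"),        # number of rows in the group
          col3_mean=("col3", "mean"),
          col4_mean=("col4", "mean"),
      )
      .reset_index()
)
print(stats)
```

`GroupBy.describe()` is an alternative when the full set of summary statistics is wanted for every numeric column.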

Poor accuracy score for Semi-Supervised Support Vector Machine

I am using a Semi-Supervised approach for Support Vector Machine in Python for the image classification from PASCAL VOC 2007 data. I have tried with the default

dtale show in jupyter notebook

I am exploring this new Python package named dtale. It is very convenient for pandas data frames visualization. https://pypi.org/project/dtale/ It worked onc

Inconsistent indexing of subplots returned by `pandas.DataFrame.plot` when changing plot kind

I know that this issue is known and has already been discussed, but I am encountering strange behaviour; maybe someone has an idea why. When I run this: plot = df.p

how to check if value in a DataFrame is a type Decimal

I am writing a data test for some api calls that return a DataFrame with a date type and a type Decimal. I can't find a way to verify the Decimal the DataFrame
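
pandas stores `decimal.Decimal` values in an `object` column, so the dtype alone does not confirm the type. A check along the following lines may help. It is a sketch under assumptions: the frame, the column names `day` and `price`, and the helper `column_is_decimal` are hypothetical and stand in for the API response mentioned in the excerpt.

```python
from decimal import Decimal

import pandas as pd

# Hypothetical frame standing in for the API response in the excerpt.
df = pd.DataFrame({
    "day": pd.to_datetime(["2022-01-01", "2022-01-02"]),
    "price": [Decimal("1.10"), Decimal("2.25")],
})


def column_is_decimal(series: pd.Series) -> bool:
    """Return True when the column has at least one non-null value
    and every non-null value is a Decimal instance."""
    values = series.dropna()
    return len(values) > 0 and all(isinstance(v, Decimal) for v in values)


print(df["price"].dtype)               # Decimal values live in an object column
print(column_is_decimal(df["price"]))
print(column_is_decimal(df["day"]))
```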

Get index and column with multiple headers and index_col in Pandas DataFrame

I have a dataframe with multiple headers and column indexes, and would like to retrieve the list of entries that are non-zero. The dataframe is constructed from