Category "pandas"

Finding percentage of rejection in pandas dataframe

I have a pandas data frame like the one given below:

Id1  YEAR     CLAIM_STATUS  no_of_claims
1    2019-01  4             1
1    2019-01  5             1

Visualization random sample with displaCy

How can I visualize using displaCy in a dataframe? I have a dataset called taks_output and want to visualize a sample of the column msg_lower. What I did: import p

Reshape wide to long for many columns with a common prefix

My frame has many pairs of identically named columns, with the only difference being the prefix. For example, player1.player.id and player2.player.id. Here's an

Create a dataframe of weeks and their weekly sums from a dictionary of datetime and int

I have a dictionary of datetime keys and int values like below. details = { datetime.datetime.strptime("04-01-2021", "%d-%m-%Y") : 15, datetime.datetime.strptime(
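
One possible approach, sketched under assumptions: only the first entry of `details` comes from the excerpt, the other two entries are made up for illustration, and weeks are assumed to end on Sunday (the default for pandas' `"W"` frequency).

```python
import datetime

import pandas as pd

details = {
    datetime.datetime.strptime("04-01-2021", "%d-%m-%Y"): 15,
    # The two entries below are hypothetical; the original dictionary is truncated.
    datetime.datetime.strptime("05-01-2021", "%d-%m-%Y"): 10,
    datetime.datetime.strptime("12-01-2021", "%d-%m-%Y"): 7,
}

# Build a Series indexed by date so it can be resampled.
series = pd.Series(list(details.values()), index=pd.to_datetime(list(details.keys())))
series = series.sort_index()

# "W" groups by weeks ending on Sunday; sum the values inside each week.
weekly = series.resample("W").sum().reset_index()
weekly.columns = ["week_ending", "weekly_sum"]

print(weekly)
```

If weeks should be anchored on a different day, an anchored frequency such as `"W-MON"` can be passed to `resample` instead.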

Trying to convert pandas df to np array, dtaidistance computes list instead

I am attempting to compute the distance matrix for an ndarray that I have converted from pandas. I tried to convert the pandas df currently in this format: move

Import local ".py" library in powerbi python scripting

I have written a python library (ak_sql.py) to query my sql server and written another ".py" script (query_sql.py) to invoke this library and return data in dat

How to extract a specific range out of a dataframe and store it in another dataframe and then delete the range out of the original dataframe | pandas

I have some timeseries of energy consumption and I can eyeball when someone is on holidays if the consumption is under a certain range. I have this piece of cod

Nested JSON to Multiple Dataframe in Pandas

I am trying to build a tool which can take any JSON data and convert that into multiple data frames based on data types. I am trying to add each data frame with

Why is my function parameter for Pandas to_datetime() being ignored?

I have a Pandas data frame with a column containing months and years. Unfortunately, the values are currently string objects, not datetime objects. This means th

changing frequency in a pandas SeriesGroupBy

I'm struggling to find a simple way to change a frequency of a pd.Series that is grouped on some level of a pd.MultiIndex (so it's a pd.core.groupby.generic.Ser

Error in running Pandas Profile report Python

I am trying to do an exploratory data analysis with the Python package pandas ProfileReport, but I get the following error: Summarize dataset: 40%|██

Dataframe returning empty after assignment of values?

Essentially, I would like to add values to certain columns in an empty DataFrame with defined columns, but when I run the code, I get: Empty DataFrame Columns:

How to create a dependent dropdown list in Python and Streamlit?

Based on the answer to this post, I was able to display the dataframe after applying the required filter. I have Streamlit code that displays multiple dropdown l

Getting SettingWithCopyWarning with iloc or loc when some filtering is done on the dataframe with regex [duplicate]

I have the following statement to compute the mean of three quiz scores and create a new column based on the computed mean: scores.loc[:, 'Ave

How to json_normalize nested json arrays

I have a complex JSON structure as below. I am able to json_normalize only the first level of the array (MatchingReleases.MatchingRelease), whereas I have one more

How can I substring to specific character in pandas?

For example, I have 2 columns (1, 2), and in table 2 I want to fetch everything up to the " character. I wanted to do something like this: df.columns = ['1','2'] a =
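
A minimal sketch of one way to do this, assuming "table 2" means the column named `'2'` and that rows without a `"` should be kept whole. The sample values and the `2_prefix` column name are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column names '1' and '2' come from the question.
df = pd.DataFrame({"1": ["a", "b"], "2": ['abc"def', "no quote"]})

# Split once on the first double quote and keep the part before it.
# Rows without a double quote are returned unchanged.
df["2_prefix"] = df["2"].str.split('"', n=1).str[0]

print(df)
```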

fastapi using ORM not able to convert to pandas

I've been developing a fastapi way to query my database, instead of directly using SQL with pg. For some reason, I'm having issues converting the ORM query retu

`pd.read_sql(sql, engine)` raises NotImplementedError: This method is not implemented for SQLAlchemy 2.0

I tried to create a pandas DataFrame directly from my sqlserver database using an sqlalchemy engine: engine = create_engine(URL_string, echo=False, future=True)

pandas | list in column to binary column

I have the following dataframe:

+------------+------------------+
| item       | categories       |
+------------+------------------+
| blue_shirt | ['red', 'wh
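
One way this could be approached, assuming the `categories` column holds real Python lists (not their string representation). The rows below are invented for illustration because the original table is truncated.

```python
import pandas as pd

# Hypothetical rows; only the column names come from the question.
df = pd.DataFrame(
    {
        "item": ["blue_shirt", "red_skirt"],
        "categories": [["red", "white"], ["blue"]],
    }
)

# explode() gives one row per list element while repeating the original index;
# get_dummies() one-hot encodes each element, and the groupby collapses the
# repeated index back to one row per item.
dummies = (
    pd.get_dummies(df["categories"].explode())
    .groupby(level=0)
    .max()
    .astype(int)
)

result = df[["item"]].join(dummies)
print(result)
```

If the lists are stored as strings, they would need to be parsed first, for example with `ast.literal_eval`.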

How to find current upper Bollinger band in pandas-ta

I have a CSV file with the columns Instrument, Date, Time, Open, High, Low, Close. I want the rows where the current close is greater than the current upper Bollinger band(2