I have two xlsx files that have multiple tabs. I need to compare values in each tab based on the tab name. (e.g. sheet1 in file1 needs to be compared with sheet
I have a pandas series with string indices and integer values: (My actual series has >1000 entries) Count apple 1 bear 2 cat 3 Apple 10 pig 20 Cat 30 ApPl
I was trying to calculate the week number starting from the first Monday of October. Are there any functions in pandas or datetime to do the calculation efficiently?
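One possible reading of the question above can be sketched with the standard library alone. The question does not say how dates before the first Monday of October should be treated, so this sketch assumes they belong to the cycle that started the previous October, and that numbering is 1-based. The helper names `first_monday_of_october` and `week_number` are hypothetical, chosen for illustration.

```python
from datetime import date, timedelta


def first_monday_of_october(year: int) -> date:
    """Return the first Monday on or after 1 October of the given year."""
    oct_first = date(year, 10, 1)
    # weekday() is 0 for Monday; the modulo gives days to the next Monday.
    days_until_monday = (0 - oct_first.weekday()) % 7
    return oct_first + timedelta(days=days_until_monday)


def week_number(d: date) -> int:
    """1-based week count from the most recent first Monday of October.

    Assumption: dates before this year's first October Monday are counted
    from the previous year's first October Monday.
    """
    start = first_monday_of_october(d.year)
    if d < start:
        start = first_monday_of_october(d.year - 1)
    return (d - start).days // 7 + 1
```

For a pandas column of datetimes, the same function could be applied element-wise, for example with `df["date"].dt.date.map(week_number)`, where `df["date"]` is an assumed column name; a vectorized version may be preferable for very large frames.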
I am trying to create a dataframe for Sankey chart in Power BI which needs source and destination like this. id Source Destination 1 Starting a next point b 1
I've been researching this topic for a few days now and have yet to come up with a working solution. Apologies if this question is repetitive (although I have c
I want to create two columns from an existing column which contains a nested list of lists as values. Rows of record consisting of 3 companies participant and their
I have a data frame object in pandas with a column (let's say) "group". There are 20 groups. I want to apply a function (sum) to multiple rows of the same groups
I have a dataframe as shown below: Col A Time Col B Col C 123 2018-01-06 03:45:23 B 1 141 2018-01-08 12:45:55 C 0 123 2018-01-08 11:45:29 A 0 123 2018-01-08 01
I ran into a strange observation where the same code works with np.float64 but not with np.float32 or np.float16. Here's code to reproduce the results: >>
I am trying to expand a dataframe containing a number of columns by creating rows based on the interval between two date columns. For this I am currently using
I'm working with a very long dataframe, so I'm looking for the fastest way to fill several columns at once given certain conditions. So let's say you have this
I have the following function: def create_col4(df): df['col4'] = df['col1'] + df['col2'] If I apply this function within my jupyter notebook as in create_c
I have a Pandas dataframe with ~100,000,000 rows and 3 columns (Names str, Time int, and Values float), which I compiled from ~500 CSV files using glob.glob(pat
I have a data frame with the date/time passed as "parse_dates" and then set as the index column for the data frame. Flow Enter Leave
I have an array: w = np.array([1, 2, 3]) and I need to create a Dataframe with a MultiIndex looking like this: df= 0 1 2 0 0 1 1 1 1 1 1 1 2 1
I am trying to convert a dataframe in which hourly data appears in distinct columns, like here: ... to a dataframe that only contains two columns ['datetime',
I am using this code to get the mode of a categorical column: df.groupby('user_id')['product'].agg(pd.Series.mode).reset_index().rename(columns = {'product': 'm
df.review: de la nada mi ya no se escucha I tried to set it up It is a good product The aim is to remove non-English rows. I tried this and
I am trying to plot both a scatterplot and a line plot, in the same figure. One is for objects and the other for lane markers. The outcome should be one figure
I have a dataframe with stock returns in one column, strategy values in another, and a third column called trades with boolean values (True, False). My de