Category "pandas"

Apply formula on a column conditional to another column

import requests from geopy.geocoders import Nominatim import streamlit as st import numpy as np import pandas as pd import random, string print('ville :') code

Generate multiple new pandas dataframes using lists and for loops

I have the following dataframe: import pandas as pd import numpy as np from numpy import rec, nan df1=pd.DataFrame.from_records(rec.array([(202001L, 2020L, 'app

RankWarning: Polyfit may be poorly conditioned

I am trying to find the rolling price slope of btc trading data (minute data) using pandas. When I run the script, the following error / warning pops up sys:1:

Update django model objects from editable dash data-table

I want to update my models through an editable Plotly Dash DataTable (populated by a dataframe, itself populated from the models via a SQL connection) in Django, but I don't kno

String Into Integer while sorting

Curious if there is a way to convert a string into an integer, only during the sort_values() process, or if it's easier to convert the variable to an integer pr
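One possible approach, as a minimal sketch: `sort_values()` accepts a `key` callable (pandas 1.1 and later), so the conversion can happen only for the sort while the column itself stays a string. The column name `value` and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical data: numbers stored as strings.
df = pd.DataFrame({"value": ["10", "9", "100", "2"]})

# key= receives the column as a Series and must return a Series of the
# same length; the stored values are not modified.
sorted_df = df.sort_values("value", key=lambda s: s.astype(int))

print(sorted_df["value"].tolist())  # ['2', '9', '10', '100']
```

If the integer values are needed again later, converting the column once up front is probably simpler than repeating the `key` on every sort.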

DataFrame is highly fragmented

I have the following code, but when I run it I receive the error: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling fra
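A hedged sketch of the usual remedy: the warning text itself suggests joining all columns at once with `pd.concat(axis=1)` rather than adding them one at a time. The column names and the way the new columns are computed below are assumptions for illustration, since the asker's code is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical starting frame.
df = pd.DataFrame({"base": np.arange(5)})

# Build every new column first (here: 200 made-up columns) ...
new_cols = {f"col_{i}": df["base"] * i for i in range(200)}

# ... then attach them in a single concat instead of 200 separate inserts.
df = pd.concat([df, pd.DataFrame(new_cols)], axis=1)

print(df.shape)  # (5, 201)
```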

Using multi character delimiter while exporting SQL table as text file using python

I am trying to export a SQL table as a multi-character-delimited text file in Python. I have tried using pandas, but it only supports a single character as delim
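One workaround, sketched under assumptions: build each line by hand with `str.join`, which accepts a separator of any length. The helper `to_delimited_text`, the delimiter `|~|`, and the sample frame (standing in for a table already loaded from SQL) are all hypothetical. Note that this sketch does no quoting or escaping of values.

```python
import pandas as pd

def to_delimited_text(df, delimiter):
    """Render a DataFrame as text, joining fields with an arbitrary string."""
    lines = [delimiter.join(map(str, df.columns))]
    # name=None yields plain tuples, one per row, without the index.
    for row in df.itertuples(index=False, name=None):
        lines.append(delimiter.join(map(str, row)))
    return "\n".join(lines) + "\n"

# Stand-in for a table read from the database.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

text = to_delimited_text(df, "|~|")
print(text)
```

The returned string can then be written out with an ordinary `open(path, "w")` call.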

How do I find the average of this array, but only for the values that are not equal to zero, in Python? [duplicate]

array([1500, 1520, 1540, 1590, 1590, 1600, 1600, 1560, 1560, 1560, 1580, 1520, 1460, 1510, 1520, 1320, 1320, 1300, 1300, 1320, 1320, 1320, 132
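A minimal sketch of one way to do this with a boolean mask. The array below is a short hypothetical stand-in, not the asker's data.

```python
import numpy as np

# Hypothetical array containing some zeros.
arr = np.array([1500, 0, 1520, 0, 1540])

# The mask arr != 0 keeps only the non-zero entries before averaging.
nonzero_mean = arr[arr != 0].mean()

print(nonzero_mean)  # 1520.0
```

If every value is zero, the masked array is empty and `mean()` returns `nan` with a warning, so that case may need a separate check.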

How to use pandas and numpy to compare two excel workbooks with multiple tabs?

I have two xlsx files that have multiple tabs. I need to compare values in each tab based on the tab name. (e.g. sheet1 in file1 needs to be compared with sheet

Reindex Pandas Series case insensitive (Combining matches)

I have a pandas series with string indices and integer values: (My actual series has >1000 entries) Count apple 1 bear 2 cat 3 Apple 10 pig 20 Cat 30 ApPl
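One possible approach, as a sketch: group the series by its lower-cased index and sum, which merges entries that differ only in case. The sample uses only the entries visible in the excerpt; summing as the way to combine matches is an assumption.

```python
import pandas as pd

s = pd.Series(
    [1, 2, 3, 10, 20, 30],
    index=["apple", "bear", "cat", "Apple", "pig", "Cat"],
    name="Count",
)

# Group by the lower-cased labels so "apple" and "Apple" fall together.
combined = s.groupby(s.index.str.lower()).sum()

print(combined.to_dict())  # {'apple': 11, 'bear': 2, 'cat': 33, 'pig': 20}
```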

How to get the week number starting from monday of given month in python?

I was trying to calculate the week number starting from first Monday of October. Are there any functions in pandas or datetime to do the calculation efficiently?

Iterating over a dataframe twice: which is the ideal way?

I am trying to create a dataframe for Sankey chart in Power BI which needs source and destination like this. id Source Destination 1 Starting a next point b 1

Read many parquet files from S3 to pandas dataframe

I've been researching this topic for a few days now and have yet to come up with a working solution. Apologies if this question is repetitive (although I have c

Error in creating dynamic columns from existing column having nested list of lists

I want to create two columns from an existing column that contains nested lists of lists as values. Each row is a record consisting of 3 participant companies and their

Apply function on pandas.DataFrame by groups of values in a column

I have a data frame object in pandas with a column (let's say) "group". There are 20 groups. I want to apply a function (sum) to multiple rows of the same groups
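A minimal sketch of the standard pattern, assuming a numeric column named `x` (hypothetical) alongside the `group` column: `groupby` followed by `sum` aggregates all rows that share a group value.

```python
import pandas as pd

# Hypothetical frame with two groups instead of twenty.
df = pd.DataFrame({"group": ["a", "b", "a", "b"], "x": [1, 2, 3, 4]})

# One sum per distinct value of "group".
sums = df.groupby("group")["x"].sum()

print(sums.to_dict())  # {'a': 4, 'b': 6}
```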

How to group by values in a column and find time difference using python?

I have a dataframe as shown below: Col A Time Col B Col C 123 2018-01-06 03:45:23 B 1 141 2018-01-08 12:45:55 C 0 123 2018-01-08 11:45:29 A 0 123 2018-01-08 01

Pandas column-wise rolling works with np.float64 but returns empty array with np.float32 and np.float16

I ran into a strange observation where the same code works with np.float64 but not with np.float32 or np.float16. Here's code to reproduce the results: >>

Pandas create rows based on interval between two dates

I am trying to expand a dataframe containing a number of columns by creating rows based on the interval between two date columns. For this I am currently using

Fastest way to fill multiple columns by a given condition on other columns pandas

I'm working with a very long dataframe, so I'm looking for the fastest way to fill several columns at once given certain conditions. So let's say you have this