How do I get a conditional total in a pandas DataFrame?
I have a DataFrame with 32,000 rows and 20 columns containing data on many securities. An example of the target columns is as follows:
The output that I want is like this:
Effectively, for each security (there are about 1,000 unique ones), the output should be 0 when the input column is 0. When the input column is 1, the output should be a running total that resets to 0 whenever the run of 1s is broken. A loop is my last preference because the dataset is big.
Solution 1:[1]
I think something like this should work
# Label each block: the running count of zeros in Input increases at every 0,
# so each 0 and the 1s that follow it share one label
df['groupedInput0'] = df['Input'].eq(0).cumsum()
# Within each block, the cumulative sum of Input counts the consecutive 1s
df['Output'] = df.groupby('groupedInput0')['Input'].cumsum()
# Drop the helper column once Output has been computed
df = df.drop(columns='groupedInput0')
df
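The question asks for the total to be tracked per security, and the snippet above groups only by the zero-based block label. If one security's rows end with a 1 and the next security's rows begin with a 1, the count would carry over between them. A minimal sketch of one way to handle this follows. It assumes the identifier column is called `Security` (a hypothetical name, since the question does not show the real column names) and that each security's rows are already in order; the sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; "Security" is an assumed column name.
df = pd.DataFrame({
    "Security": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "Input":    [1, 1, 0, 1, 1, 1, 0, 1],
})

# A new block starts at every 0 in Input.
block = df["Input"].eq(0).cumsum().rename("block")

# Grouping by both the security and the block label means a run of 1s
# cannot continue from one security into the next.
df["Output"] = df.groupby(["Security", block])["Input"].cumsum()

print(df)
```

Passing the block labels to `groupby` as a Series avoids creating and then dropping a helper column.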
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Daniel Weigel |