How to get the count of a value and the maximum age from a single DataFrame using pandas in Python
I have the DataFrame shown below:
Age Cardio
74 1
77 1
45 0
56 0
72 1
71 1
70 1
From this DataFrame, how can I find the maximum age in the 'Age' column and the total number of 1s in the 'Cardio' column using pandas?
The output should be 77 for the 'Age' column and 5 for the 'Cardio' column.
I tried this earlier:
df.['Age'].max()
but it's not showing anything.
Solution 1:[1]
Use DataFrame.agg to get a Series:
s = df.agg({'Age': 'max', 'Cardio': 'sum'})
print(s)
Age       77
Cardio     5
dtype: int64
If you need scalars:
age, no_1 = df['Age'].max(), df['Cardio'].sum()
Or:
age, no_1 = s['Age'], s['Cardio']
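For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the question's table is rebuilt by hand as a DataFrame named df (that construction is my assumption, it is not part of the original answer). Note that summing 'Cardio' only counts the 1s because the column holds nothing but 0 and 1.

```python
import pandas as pd

# Assumption: the question's data, rebuilt by hand for illustration
df = pd.DataFrame({
    'Age': [74, 77, 45, 56, 72, 71, 70],
    'Cardio': [1, 1, 0, 0, 1, 1, 1],
})

# One aggregation per column; the result is a Series indexed by column name
s = df.agg({'Age': 'max', 'Cardio': 'sum'})

# Pull the two values out as scalars
age, no_1 = s['Age'], s['Cardio']
print(age, no_1)
```

The question's attempt, df.['Age'].max(), fails because of the dot before the bracket; df['Age'].max() is the valid form.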
Solution 2:[2]
You can use agg with a dictionary mapping the column names to the aggregation functions:
df.agg({'Age': 'max', 'Cardio': 'sum'})
output:
Age       77
Cardio     5
dtype: int64
Update
Getting the number of 0s in Cardio:
# rsub(1) computes 1 - Cardio, so every 0 becomes 1 and every 1 becomes 0
(df.assign(Cardio_0=df['Cardio'].rsub(1))
   .agg({'Age': 'max', 'Cardio': 'sum', 'Cardio_0': 'sum'})
)
output:
Age         77
Cardio       5
Cardio_0     2
dtype: int64
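As a possible alternative for the counts alone, Series.value_counts returns the number of rows per distinct value, so both the 0s and the 1s come out of one call. This is only a sketch, and the df below is rebuilt by hand from the question's table (my assumption, not part of the original answer).

```python
import pandas as pd

# Assumption: the question's data, rebuilt by hand for illustration
df = pd.DataFrame({
    'Age': [74, 77, 45, 56, 72, 71, 70],
    'Cardio': [1, 1, 0, 0, 1, 1, 1],
})

# Number of rows for each distinct value in 'Cardio'
counts = df['Cardio'].value_counts()

# Look the counts up by label (the values 1 and 0 themselves)
n_ones = counts.loc[1]
n_zeros = counts.loc[0]
print(n_ones, n_zeros)
```

Unlike the sum-based trick, this does not rely on the column containing only 0 and 1.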
Second approach
def nzero(s):
    # zeros in a 0/1 column: total number of rows minus the number of 1s
    return len(s) - sum(s)
df.agg({'Age': 'max', 'Cardio': ['sum', nzero]})
output:
        Age  Cardio
max    77.0     NaN
sum     NaN     5.0
nzero   NaN     2.0
Or stacked into a Series:
df.agg({'Age': 'max', 'Cardio': ['sum', nzero]}).stack()
output:
max    Age       77.0
sum    Cardio     5.0
nzero  Cardio     2.0
dtype: float64
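The values come back as floats here because the intermediate frame contains NaN. If integer results matter, one option is to compute each scalar directly and collect them in a Series. This is a sketch only: the labels max_age, n_ones and n_zeros are hypothetical names chosen for illustration, and df is rebuilt by hand from the question's table.

```python
import pandas as pd

# Assumption: the question's data, rebuilt by hand for illustration
df = pd.DataFrame({
    'Age': [74, 77, 45, 56, 72, 71, 70],
    'Cardio': [1, 1, 0, 0, 1, 1, 1],
})

# Each entry is a plain scalar reduction, so no NaN is introduced
# and the integer dtype is preserved
result = pd.Series({
    'max_age': df['Age'].max(),             # hypothetical label
    'n_ones': (df['Cardio'] == 1).sum(),    # hypothetical label
    'n_zeros': (df['Cardio'] == 0).sum(),   # hypothetical label
})
print(result)
```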
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | jezrael |
Solution 2 |