Group by a column and select a specific value from the other columns in a pandas DataFrame
Input dataframe:
+----+----------+-----------+
| ID | Owns_car | owns_bike |
+----+----------+-----------+
|  1 |        1 |         0 |
|  5 |        1 |         0 |
|  7 |        0 |         1 |
|  1 |        1 |         0 |
|  4 |        1 |         0 |
|  5 |        0 |         1 |
|  7 |        0 |         1 |
+----+----------+-----------+
Expected Output:
+----+----------+-----------+
| ID | Owns_car | owns_bike |
+----+----------+-----------+
|  1 |        1 |         0 |
|  5 |        1 |         1 |
|  7 |        0 |         1 |
|  4 |        1 |         0 |
+----+----------+-----------+
I want to group by ID and, for each of the other columns, select the value 1 over 0. In other words, for a given ID, I am checking whether the person owns a car and whether they own a bike.
Solution 1:[1]
You can call 'max' after your groupby to take the maximum value of each column per ID, which will prefer 1 over 0:
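import pandas as pd

df = pd.DataFrame({'ID': [1, 5, 7, 1, 4, 5, 7],
                   'Owns_car': [1, 1, 0, 1, 1, 0, 0],
                   'owns_bike': [0, 0, 1, 0, 0, 1, 1]})
# Per-ID maximum of each column; reset_index turns ID back into a regular column
df.groupby('ID').max().reset_index()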
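One detail worth noting: groupby sorts the group keys by default, so the call above returns the IDs in the order 1, 4, 5, 7, while the expected output lists them as 1, 5, 7, 4. If the order of first appearance matters, a minimal sketch (the variable name result is only for illustration) is to pass sort=False:

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 5, 7, 1, 4, 5, 7],
                   'Owns_car': [1, 1, 0, 1, 1, 0, 0],
                   'owns_bike': [0, 0, 1, 0, 0, 1, 1]})

# sort=False keeps the groups in order of first appearance (1, 5, 7, 4)
# instead of sorting them by ID
result = df.groupby('ID', sort=False).max().reset_index()
print(result)
```

If the row order does not matter, the default sort=True is fine and the values per ID are the same.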
Solution 2:[2]
Use transform with max, and then remove duplicates by ID:
# Overwrite each row's flags with the maximum for its ID
df[['Owns_car', 'owns_bike']] = df.groupby('ID')[['Owns_car', 'owns_bike']].transform('max')
# Keep only the first row for each ID
df = df.drop_duplicates('ID')
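The transform approach overwrites the flag columns of df in place and keeps the original row labels (0, 1, 2, 4 for this data). As a hedged sketch, assuming you would rather leave the input untouched, the same steps can be run on a copy and compared with a plain groupby max; the names cols, by_transform and by_max are only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 5, 7, 1, 4, 5, 7],
                   'Owns_car': [1, 1, 0, 1, 1, 0, 0],
                   'owns_bike': [0, 0, 1, 0, 0, 1, 1]})
cols = ['Owns_car', 'owns_bike']

# Work on a copy so the original df is not modified
by_transform = df.copy()
by_transform[cols] = by_transform.groupby('ID')[cols].transform('max')
# drop_duplicates keeps the first row per ID; reset_index renumbers the rows
by_transform = by_transform.drop_duplicates('ID').reset_index(drop=True)

# Plain groupby max, with sort=False so IDs stay in order of first appearance
by_max = df.groupby('ID', sort=False).max().reset_index()

print(by_transform)
print(by_max)
```

Both frames should contain the same four rows, in the order 1, 5, 7, 4 shown in the expected output.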
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 |