How to get the groupby keys with a loop?
I need to do some fairly complicated processing on each group after grouping. In pandas, it can be written as follows:
for i, g in df.groupby(['id', 'sid']):
    pass
In polars, however, the groups function returns a DataFrame, which cannot conveniently be used in a for loop.
Solution 1:[1]
You could use partition_by. With as_dict=True, this yields a dictionary where the groupby keys map to the partitioned DataFrames.
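import polars as pl

df = pl.DataFrame({
    "groups": [1, 1, 2, 2, 2],
    "values": pl.arange(0, 5, eager=True),
})
# as_dict=True returns a dictionary keyed by group instead of a list of DataFrames
part_dfs = df.partition_by("groups", as_dict=True)
print(part_dfs)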
{1: shape: (2, 2)
???????????????????
? groups ? values ?
? --- ? --- ?
? i64 ? i64 ?
???????????????????
? 1 ? 0 ?
???????????????????
? 1 ? 1 ?
???????????????????,
2: shape: (3, 2)
???????????????????
? groups ? values ?
? --- ? --- ?
? i64 ? i64 ?
???????????????????
? 2 ? 2 ?
???????????????????
? 2 ? 3 ?
???????????????????
? 2 ? 4 ?
???????????????????}
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | ritchie46 |