How to get the groupby keys with a loop?

I need to do some fairly complicated processing on each group after grouping. In pandas, it can be written as follows:

for i,g in df.groupby(['id','sid']):
    pass

In polars, the groups function returns a DataFrame, which cannot be conveniently used in a for loop.



Solution 1:[1]

You could use partition_by with as_dict=True. This yields a dictionary in which each groupby key maps to the corresponding partitioned DataFrame.

import polars as pl

df = pl.DataFrame({
    "groups": [1, 1, 2, 2, 2],
    "values": pl.arange(0, 5, eager=True)
})

part_dfs = df.partition_by("groups", as_dict=True)

print(part_dfs)
{1: shape: (2, 2)
???????????????????
? groups ? values ?
? ---    ? ---    ?
? i64    ? i64    ?
???????????????????
? 1      ? 0      ?
???????????????????
? 1      ? 1      ?
???????????????????,
 2: shape: (3, 2)
???????????????????
? groups ? values ?
? ---    ? ---    ?
? i64    ? i64    ?
???????????????????
? 2      ? 2      ?
???????????????????
? 2      ? 3      ?
???????????????????
? 2      ? 4      ?
???????????????????}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ritchie46