'How create a column with list of jsons if duplicated rows on other column?
I have a Pandas dataframe looking like this:
buyer_id car color year
john ferrari yellow 2022
eric ferrari red 2022
john mercedes black 1990
victoria audi yellow 2017
I would like to create a new column (list of jsons in each row.
Create a column 'identical' with a list in each row:
One element in the list if only one buyer is found in 'buyer_id':
[{'car':..., 'color':..., 'year': ...}]
If same buyer on several rows in 'buyer_id'
[ {'car':'ferrari', 'color': 'yellow ', 'year': 2022}, {'car':'mercedes', 'color': 'black', 'year': 1990} ]
Expected output:
buyer_id car color year identical
john ferrari yellow 2022 [{'car':'ferrari', 'color': 'yellow ', 'year': 2022},{'car':'mercedes', 'color': 'black', 'year': 1990}]
eric ferrari red 2022 [{'car':'ferrari', 'color': 'red', 'year': 2022}]
john mercedes black 1990 [[{'car':'ferrari', 'color': 'yellow ', 'year': 2022},{'car':'mercedes', 'color': 'black', 'year': 1990}]
victoria audi yellow 2017 [{'car':'audi', 'color': 'yellow', 'year': 2017}]
I don't know how to do this with Pandas and if it is possible.
Solution 1:[1]
You could use GroupBy.apply
and to_json
with the orient="records"
parameter:
s = (df.groupby('buyer_id')
.apply(lambda g: g.drop('buyer_id', axis=1)
.to_json(orient='records'))
)
df2 = df.merge(s.rename('identical'), left_on='buyer_id', right_index=True)
or in place:
s = (df.set_index('buyer_id')
.groupby(level='buyer_id')
.apply(lambda g: g.to_json(orient='records'))
)
df['identical'] = df['buyer_id'].map(s)
output:
buyer_id car color year identical
0 john ferrari yellow 2022 [{"car":"ferrari","color":"yellow","year":2022},{"car":"mercedes","color":"black","year":1990}]
1 eric ferrari red 2022 [{"car":"ferrari","color":"red","year":2022}]
2 john mercedes black 1990 [{"car":"ferrari","color":"yellow","year":2022},{"car":"mercedes","color":"black","year":1990}]
3 victoria audi yellow 2017 [{"car":"audi","color":"yellow","year":2017}]
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |