'Aggregate rows into complex json - Athena
I've an Athena query which gives me the below table for a given IDs:
ID | ID_2 | description | state |
---|---|---|---|
First | row | abc | [MN, SD] |
Second | row | xyz | [AL, CA ] |
I'm using the array_agg
function to merge states into an array. Within the query itself I want convert the output into the format below:
ID | ID_2 | custom_object |
---|---|---|
First | row | {'description': 'abc', 'state': ['MN', 'SD']} |
I'm looking at the Athena docs but haven't found function that does just this. I'm experimenting with multimap_agg
and map_agg
but this seems to be too complex to achieve. How can I do this, please help!
Solution 1:[1]
You can do it after aggregation by combining casts to json and creating map:
-- sample data
WITH dataset (ID, ID_2, description, state) AS (
VALUES ('First', 'row', 'abc', array['MN', 'SD']),
('Second', 'row', 'xyz', array['AL', 'CA' ])
)
-- query
select ID,
ID_2,
cast(
map(
array [ 'description',
'state' ],
array [ cast(description as json),
cast(state as json) ]
) as json
) custom_object
from dataset
Output:
ID | ID_2 | custom_object |
---|---|---|
First | row | {"description":"abc","state":["MN","SD"]} |
Second | row | {"description":"xyz","state":["AL","CA"]} |
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Guru Stron |