'Aggregate rows into complex json - Athena

I've an Athena query which gives me the below table for a given IDs:

ID ID_2 description state
First row abc [MN, SD]
Second row xyz [AL, CA ]

I'm using the array_agg function to merge states into an array. Within the query itself I want convert the output into the format below:

ID ID_2 custom_object
First row {'description': 'abc', 'state': ['MN', 'SD']}

I'm looking at the Athena docs but haven't found function that does just this. I'm experimenting with multimap_agg and map_agg but this seems to be too complex to achieve. How can I do this, please help!



Solution 1:[1]

You can do it after aggregation by combining casts to json and creating map:

-- sample data
WITH dataset (ID, ID_2, description, state) AS (
    VALUES ('First',    'row',  'abc',  array['MN', 'SD']),
    ('Second',  'row',  'xyz',  array['AL', 'CA' ])
) 

-- query
select ID,
    ID_2,
    cast(
        map(
            array [ 'description',
            'state' ],
            array [ cast(description as json),
            cast(state as json) ]
        ) as json
    ) custom_object
from dataset

Output:

ID ID_2 custom_object
First row {"description":"abc","state":["MN","SD"]}
Second row {"description":"xyz","state":["AL","CA"]}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Guru Stron