'Aggregate rows into complex json - Athena
I've an Athena query which gives me the below table for a given IDs:
| ID | ID_2 | description | state | 
|---|---|---|---|
| First | row | abc | [MN, SD] | 
| Second | row | xyz | [AL, CA ] | 
I'm using the array_agg function to merge states into an array. Within the query itself I want convert the output into the format below:
| ID | ID_2 | custom_object | 
|---|---|---|
| First | row | {'description': 'abc', 'state': ['MN', 'SD']} | 
I'm looking at the Athena docs but haven't found function that does just this. I'm experimenting with multimap_agg and map_agg but this seems to be too complex to achieve. How can I do this, please help!
Solution 1:[1]
You can do it after aggregation by combining casts to json and creating map:
-- sample data
WITH dataset (ID, ID_2, description, state) AS (
    VALUES ('First',    'row',  'abc',  array['MN', 'SD']),
    ('Second',  'row',  'xyz',  array['AL', 'CA' ])
) 
-- query
select ID,
    ID_2,
    cast(
        map(
            array [ 'description',
            'state' ],
            array [ cast(description as json),
            cast(state as json) ]
        ) as json
    ) custom_object
from dataset
Output:
| ID | ID_2 | custom_object | 
|---|---|---|
| First | row | {"description":"abc","state":["MN","SD"]} | 
| Second | row | {"description":"xyz","state":["AL","CA"]} | 
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source | 
|---|---|
| Solution 1 | Guru Stron | 

 amazon-web-services
amazon-web-services