'How to merge two json rows with duplicate ids and unique values in Snowflake?
I have to merge two(or more than two) JSON rows with the same ids but both the rows have unique values. Eg.
create or replace table source(v variant);
INSERT INTO source SELECT parse_json('{
"pd": {
"extraction_date": "1644471240",
"product_id": "357946",
"retailerName": "retailer",
"productName":"product"
"unique1":"value"
}
}');
INSERT INTO source SELECT parse_json('{
"pd": {
"extraction_date": "1644471242",
"product_id": "357946",
"retailerName": "retailer2",
"productName":"product2",
"unique2":"value"
}
}');
My desired output is:
"pd": {
"extraction_date": "1644471240",
"product_id": "357946",
"retailerName": "retailer",
"productName":"product",
"unique1":"value",
"unique2":"value"
}
}');
I looked through the documentations and StackOverflow but no solution works for this case. What can be a solution.
Solution 1:[1]
A JavaScript UDF like this one will merge objects like the one in the question:
create or replace function merge_objects(X array)
returns variant
language javascript
as $$
merged = {}
X.forEach((item) => {
keys = Object.keys(item['pd']);
for (let i = 0; i < keys.length; i++) {
merged[keys[i]] = item['pd'][keys[i]];
}
});
return {'pd': merged};
$$
Then you can use that UDF in SQL to perform the merge:
select count(*)
, merge_objects(array_agg(v)
within group (order by v:pd.extraction_date desc))
from source
group by v:pd.product_id;
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Felipe Hoffa |