'How to merge two json rows with duplicate ids and unique values in Snowflake?

I have to merge two(or more than two) JSON rows with the same ids but both the rows have unique values. Eg.

create or replace table source(v variant); 
     INSERT INTO source SELECT parse_json('{
      "pd": {
        "extraction_date": "1644471240",
        "product_id": "357946",
       "retailerName": "retailer",
        "productName":"product"
       "unique1":"value"
      }
    }');
     INSERT INTO source SELECT parse_json('{
      "pd": {
        "extraction_date": "1644471242",
        "product_id": "357946",
       "retailerName": "retailer2",
        "productName":"product2",
        "unique2":"value"
      }
    }');

My desired output is:

"pd": {
            "extraction_date": "1644471240",
            "product_id": "357946",
            "retailerName": "retailer",
            "productName":"product",
            "unique1":"value",
            "unique2":"value"
          }
        }');

I looked through the documentations and StackOverflow but no solution works for this case. What can be a solution.



Solution 1:[1]

A JavaScript UDF like this one will merge objects like the one in the question:

create or replace function merge_objects(X array)
returns variant
language javascript
as $$

merged = {}
X.forEach((item) => {
    keys = Object.keys(item['pd']);
    for (let i = 0; i < keys.length; i++) {
        merged[keys[i]] = item['pd'][keys[i]];
    }
});

return {'pd': merged};
$$

Then you can use that UDF in SQL to perform the merge:

select count(*)
    , merge_objects(array_agg(v) 
        within group (order by v:pd.extraction_date desc))
from source
group by v:pd.product_id;

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Felipe Hoffa