'How to store dynamically generated JSON object in Big Query Table?

I have a use case to store dynamic JSON objects in a column in Big Query. The schema of the object is dynamically generated by the source and not known beforehand. The number of key value pairs in the object can differ as well, as shown below.

Example JSON objects:
{"Fruit":"Apple","Price":"10","Sale":"No"}
{"Movie":"Avatar","Genre":"Fiction"}

I could achieve the same in Hive by defining the column as map<string, string> object and I could query the data in the column like col_name["Fruit"] or col_name["Movie"] for that corresponding row.

Is there an equivalent way of above usage in Big Query? I came across 'RECORD' data type but the schema needs to be same for all the objects in the column.

Note: Storing the column as string datatype is not an option as the users need to query the data on the keys directly without parsing after retrieving the data.



Solution 1:[1]

Storing the data as a JSON string seems to be the only way to implement your requirement, at the moment. As a workaround, you can create a JavaScript UDF that parses the JSON string and extracts the necessary information. Below is a sample UDF.

CREATE TEMP FUNCTION extract_from_json(json STRING, key STRING)
RETURNS STRING
LANGUAGE js AS """
  const obj = JSON.parse(json);
  return obj[key];
"""; 
WITH json_table AS (
  SELECT '{"Fruit":"Apple","Price":"10","Sale":"No"}' json_data UNION ALL
  SELECT '{"Movie":"Avatar","Genre":"Fiction"}' json_data
)
SELECT extract_from_json(json_data, 'Movie') AS photos
FROM json_table

You can also check out the newly introduced JSON data type in BigQuery. The data type offers more flexibility when handling JSON data but please note that the data type is still in preview and is not recommended for production. You will have to enroll in this preview. For more information on working with JSON data, refer to this documentation.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1