'Extract all records from a JSON column, using JSON type

I have a couple tables (see reproducible code at the bottom):

tbl1_have

id  json_col                             
1   {"a_i":"a","a_j":1}
1   {"a_i":"b","a_j":2}
2   {"a_i":"c","a_j":3}
2   {"a_i":"d","a_j":4}

tbl2_have

id  json_col                          
1   [{"a_i":"a","a_j":1},{"a_i":"b","a_j":2}]
2   [{"a_i":"c","a_j":3},{"a_i":"d","a_j":4}]

I wish to extract all json columns without providing explicit data type conversion for each columns since in my use case the names and amounts of nested attributes vary.

The expected output is the same for both cases:

tbl_want

id  a_i a_j                             
1   a   1
1   b   2
2   c   3
2   d   4

with a_i and a_j correctly stored as a character and numeric column, which mean I'd like to map json types to SQL types (say INT and VARCHAR() here) automatically.

The following gets me half way for both tables:

SELECT id, a_i, a_j FROM tbl2_have CROSS APPLY OPENJSON(json_col) 
WITH(a_i VARCHAR(100), a_j INT)

  id a_i a_j
1  1   a   1
2  1   b   2
3  2   c   3
4  2   d   4

How can I work around mentioning the types explicitly in with() ?


reproducible code :

CREATE TABLE tbl1_have (id INT, json_col VARCHAR(100))
INSERT INTO tbl1_have VALUES 
(1,   '{"a_i":"a","a_j":1}'),
(1,   '{"a_i":"b","a_j":2}'),
(2,   '{"a_i":"c","a_j":3}'),
(2,   '{"a_i":"d","a_j":4}')

CREATE TABLE tbl2_have (id INT, json_col VARCHAR(100))
INSERT INTO tbl2_have VALUES 
(1,   '[{"a_i":"a","a_j":1},{"a_i":"b","a_j":2}]'),
(2,   '[{"a_i":"c","a_j":3},{"a_i":"d","a_j":4}]')

SELECT id, a_i, a_j FROM tbl1_have CROSS APPLY OPENJSON(json_col) 
WITH(a_i VARCHAR(100), a_j INT)

SELECT id, a_i, a_j FROM tbl2_have CROSS APPLY OPENJSON(json_col) 
WITH(a_i VARCHAR(100), a_j INT)


Solution 1:[1]

I am assuming that you don't know the name and type of keys in advance. You need to use dynamic SQL.

You first need to use OPENJSON without the WITH clause on the {objects} like so:

select string_agg(quotename(k) + case t
    when 0 then ' nchar(1)'       -- javascript null
    when 1 then ' nvarchar(max)'  -- javascript string
    when 2 then ' float'          -- javascript number
    when 3 then ' bit'            -- javascript boolean
    else ' nvarchar(max) as json' -- javascript array or object
end, ', ') within group (order by k)
from (
    select j2.[key], max(j2.[type])
    from test
    cross apply openjson(case when json_col like '{%}' then '[' + json_col + ']' else json_col end) as j1
    cross apply openjson(j1.value) as j2
    group by j2.[key]
) as kt(k, t)

The inner query gives you the name and type of all the keys across all json values in the table. The outer query builds the WITH clause for dynamic SQL.

The rest is relatively straight forward, use the generated clause in your dynamic SQL. Here is the complete example:

declare @table_name nvarchar(100) = 'test';
declare @with_clause nvarchar(100);

declare @query1 nvarchar(999) = N'select @with_clause_temp = string_agg(quotename(k) + case t
    when 0 then '' nchar(1)''
    when 1 then '' nvarchar(max)''
    when 2 then '' float''
    when 3 then '' bit''
    else '' nvarchar(max) as json''
end, '', '') within group (order by k)
from (
    select j2.[key], max(j2.[type])
    from ' + quotename(@table_name) + '
    cross apply openjson(case when json_col like ''{%}'' then ''['' + json_col + '']'' else json_col end) as j1
    cross apply openjson(j1.value) as j2
    group by j2.[key]
) as kt(k, t)';
exec sp_executesql @query1, N'@with_clause_temp nvarchar(100) out', @with_clause out;

declare @query2 nvarchar(999) = N'select id, j.*
from ' + quotename(@table_name) + '
cross apply openjson(json_col)
with (' + @with_clause + ') as j';
exec sp_executesql @query2;

Demo on db<>fiddle

Solution 2:[2]

I have found a solution that maybe works for your use case. I am no SQL-expert by any means, and i did not manage to automatically detect the datatypes of the dynamic columns. But i found a solution for your two examples.

First i tried to get all column names dynamically from the json_col. I found an answer on stackoverflow and got this piece of code:

STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname) FROM #tmpTbl FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');

This will output all column names as a string separated by commas, in your example: ' [a_i], [a_j]'. This can then be used to dynamically SELECT columns.

As already mentioned above, i was not able to write a datatype detection algorithm. I just hardcoded the columns to have nvarchar(100) as datatype.

To dynamically get the column-names with the corresponding datatype (hardcoded as nvarchar(100)) i used a slightly modified version of above query:

STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname)+' nvarchar(100)' FROM #tmpTbl FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');

Then i just used them in the WITH-CLAUSE.


Full version for the table tbl1_have

DECLARE @cols NVARCHAR(MAX), @colsWithType NVARCHAR(MAX), @query NVARCHAR(MAX);
DROP TABLE IF EXISTS  #tmpTbl

SELECT outerTable.[id] AS columnid, innerTable.[key] AS columnname, innerTable.[value] AS columnvalue
    INTO #tmpTbl
    FROM tbl1_have outerTable CROSS APPLY OPENJSON(json_col) AS innerTable


SELECT * FROM #tmpTbl
SET @cols = STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname) FROM #tmpTbl FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');
SET @colsWithType = STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname)+' nvarchar(100)' FROM #tmpTbl FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');

SET @query = N'SELECT id, '+@cols+' FROM tbl1_have CROSS APPLY OPENJSON(json_col) 
WITH('+@colsWithType+')';

exec sp_executesql @query

Full Version for the table tbl2_have:

DECLARE @cols NVARCHAR(MAX), @colsWithType NVARCHAR(MAX), @query NVARCHAR(MAX);
DROP TABLE IF EXISTS  #tmpTbl
DROP TABLE IF EXISTS  #tmpTbl2
SELECT *
    INTO #tmpTbl
    FROM tbl2_have CROSS APPLY OPENJSON(json_col)

SELECT outerTable.[id] AS columnid, innerTable.[key] AS columnname, innerTable.[value] AS columnvalue
    INTO #tmpTbl2
    FROM #tmpTbl outerTable CROSS APPLY OPENJSON([value]) AS innerTable

SELECT * FROM #tmpTbl
SELECT * FROM #tmpTbl2
SET @cols = STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname) FROM #tmpTbl2 FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');
SET @colsWithType = STUFF(
    (
        SELECT DISTINCT ', '+QUOTENAME(columnname)+' nvarchar(100)' FROM #tmpTbl2 FOR XML PATH(''), TYPE
    ).value('.', 'nvarchar(max)'), 1, 1, '');

SET @query = N'SELECT id, '+@cols+' FROM tbl2_have CROSS APPLY OPENJSON(json_col) 
WITH('+@colsWithType+')';

exec sp_executesql @query

Solution 3:[3]

Would using the Value returned from OPENJSON work? It probably maps to a string data type, however, you do not have to know the type upfront. The official doc of the OPENJSON rowset function indicates that it returns a Key:Value pair as well as a Type for each parse. The Type value may be useful, however, it determines the datatype while parsing. I bet that Value is always a string type, as it would have to be.

;WITH X AS
(
    SELECT id, a_i=J.[Key], a_j=J.[Value]  FROM #tbl2_have CROSS APPLY OPENJSON(json_col) J
)

SELECT 
    id, 
    a_i=MAX(CASE WHEN J.[Key]='a_i' THEN J.[Value] ELSE NULL END), 
    a_j=MAX(CASE WHEN J.[Key]='a_j' THEN J.[Value] ELSE NULL END) 
FROM X CROSS APPLY OPENJSON(X.a_j) J
GROUP BY
    id,a_i,a_j

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Salman A
Solution 2 JKOERN
Solution 3