It is often necessary to process JSON data in databases. For example, some property fields in the JSON data need to be expanded and converted into a relational form of rows and columns. JSON data can be expanded flexibly by using various methods. This topic provides detailed examples of SQL statements to describe how to expand JSON data into a relational form of rows and columns.
General roadmap
Use JSON functions to analyze JSON strings and extract data.
Convert the extracted data to an array or map structure, or use lambda expressions to convert the data.
Use UNNEST to expand the converted data into a relational form of rows and columns.
Use UNNEST to expand a map into a relational form of rows and columns
SQL statement example:
SELECT t.m, t.n
FROM (
SELECT MAP(ARRAY['foo', 'bar'], ARRAY[1, 2]) as map_data
)
CROSS JOIN unnest(map_data) AS t(m, n);
Result:
+------+------+
| m | n |
+------+------+
| foo | 1 |
| bar | 2 |
+------+------+
Use UNNEST to expand JSON data into a relational form of rows and columns
SQL statement example 1:
SELECT json_extract(t.a, '$.a') AS a,
json_extract(t.a, '$.b') AS b
FROM (
SELECT cast(json_extract('{"x":[{"a":1,"b":2},{"a":3,"b":4}]}', '$.x')
AS array<JSON>) AS package_array
)
CROSS JOIN UNNEST(package_array) AS t(a);
Result:
+------+------+
| a | b |
+------+------+
| 1 | 2 |
| 3 | 4 |
+------+------+
SQL statement example 2:
SELECT t.m AS _col1, t.n AS _col2
FROM (
SELECT cast(json_extract('{"x":[{"a":1,"b":2},{"a":3,"b":4}]}', '$.x')
AS array<JSON>) AS array_1,
cast(json_extract('{"x":[{"a":5,"b":6}, {"a":7,"b":8}, {"a":9,"b":10}, {"a":11,"b":12}]}', '$.x')
AS array<JSON>) AS array_2
)
CROSS JOIN UNNEST(array_1, array_2) AS t(m, n);
Result:
+---------------+-----------------+
| _col1 | _col2 |
+---------------+-----------------+
| {"a":1,"b":2} | {"a":5,"b":6} |
| {"a":3,"b":4} | {"a":7,"b":8} |
| NULL | {"a":9,"b":10} |
| NULL | {"a":11,"b":12} |
+---------------+-----------------+
SQL statement example 3:
SELECT json_extract(t.m, '$.a') AS _col1,
json_extract(t.m, '$.b') AS _col2,
json_extract(t.n, '$.a') AS _col3,
json_extract(t.n, '$.b') AS _col4
FROM (
SELECT cast(json_extract('{"x":[{"a":1,"b":2},{"a":3,"b":4}]}', '$.x')
AS array<JSON>) AS array_1,
cast(json_extract('{"x":[{"a":5,"b":6}, {"a":7,"b":8}, {"a":9,"b":10}, {"a":11,"b":12}]}', '$.x')
AS array<JSON>) AS array_2
)
CROSS JOIN UNNEST(array_1, array_2) AS t(m, n);
Result:
+-------+-------+-------+-------+
| _col1 | _col2 | _col3 | _col4 |
+-------+-------+-------+-------+
| 1 | 2 | 5 | 6 |
| 3 | 4 | 7 | 8 |
| NULL | NULL | 9 | 10 |
| NULL | NULL | 11 | 12 |
+-------+-------+-------+-------+
Use UNNEST with lambda expressions to expand JSON data into a relational form of rows and columns
SQL statement example:
SELECT count(*) AS cnt,
package_name
FROM (
SELECT t.a AS package_name
FROM (
SELECT transform(packages_map_array, x -> Element_at(x, 'packageName'))
AS package_array
FROM (
SELECT cast(Json_extract(data_json, '$.packages')
AS array<map<VARCHAR, VARCHAR>>) AS packages_map_array
FROM (
SELECT json_parse(data) AS data_json
FROM (
SELECT '{
"packages": [
{
"appName": "China Railway 12306",
"packageName": "com.MobileTicket",
"versionName": "4.1.9",
"versionCode": "194"
},
{
"appName": "QQ Speed",
"packageName": "com.tencent.tmgp.speedmobile",
"versionName": "1.11.0.13274",
"versionCode": "1110013274"
},
{
"appName": "iReader",
"packageName": "com.chaozh.iReaderFree",
"versionName": "7.11.0",
"versionCode": "71101"
}
]
}'
AS data
)
)
)
) AS x (package_array)
CROSS JOIN UNNEST(package_array) AS t (a)
)
GROUP BY package_name
ORDER BY cnt DESC;
Result:
+------+------------------------------+
| cnt | package_name |
+------+------------------------------+
| 1 | com.MobileTicket |
| 1 | com.tencent.tmgp.speedmobile |
| 1 | com.chaozh.iReaderFree |
+------+------------------------------+