GET_JSON_OBJECT extracts a value from JSON data at a specified JSON path.
Syntax
STRING GET_JSON_OBJECT(JSON|STRING <json>, STRING <json_path>)Example: Returns Alice.
SELECT GET_JSON_OBJECT(JSON '{"name": "Alice", "age": 30}', '$.name');Parameters
json (required)
The JSON data to process. Accepts two input types:
JSON type: A value of the JSON data type, such as
JSON '{"name": "Alice", "age": 30}'.STRING type: A JSON-formatted string, such as
'{"name": "Alice", "age": 30}'. String inputs must meet these requirements:Format:
'{"Key":"Value", "Key":"Value",...}'Escape double quotation marks (
") with two backslashes (\\)Escape single quotation marks (
') with one backslash (\)
json_path (required)
A STRING that specifies the JSON path expression. The path must start with $. The following characters are supported:
| Character | Description | Example |
|---|---|---|
$ | Root node | $ |
. | Child node (for JSON objects) | $.store.book |
[''] | Child node (alternative syntax; required when a JSON key contains a period) | $['store']['book'] |
[number] | Array subscript, starting from 0 | $.items[0] |
[*] | Wildcard — returns all elements in an array; cannot be escaped | $.items[*] |
The['']notation requiresSET odps.sql.udf.getjsonobj.new=true;.
Unsupported JSON path syntax
GET_JSON_OBJECT does not support regular expression syntax in JSON paths.
The JSON path syntax for the JSON data type differs from the STRING-based specification and may cause compatibility issues.
Return value
Returns a STRING. The return value depends on the input:
| Condition | Return value |
|---|---|
json is valid and json_path exists | The extracted value as a string |
json is empty or has an invalid format | NULL |
json_path is invalid | NULL |
json_path contains [*] | A non-array string by default |
To force [*] results into a unified array format, run SET odps.sql.force.getjsonobj.array.format=true;.
Return behavior
GET_JSON_OBJECT has two return behaviors controlled by the odps.sql.udf.getjsonobj.new flag. Set it at the session or project level.
| Scenario | new=true (recommended) | new=false |
|---|---|---|
| String output | Returns the original string without modification | Escapes JSON reserved characters (e.g., \n → '\\n', " → '\\"') |
| Duplicate JSON keys | Parses successfully, returns first match | Returns NULL |
| Key output order | Preserves original JSON order | Alphabetical order |
We recommend using SET odps.sql.udf.getjsonobj.new=true; for more standard behavior, simplified data processing, and improved performance. If your MaxCompute project has existing jobs that rely on the behavior of escaping JSON reserved characters, continue using the original behavior until you have verified that switching does not cause errors or correctness issues.
Example: Check your project's current behavior
SELECT GET_JSON_OBJECT('{"a":"[\\"1\\"]"}', '$.a');
-- Behavior: escape JSON reserved characters → returns [\"1\"]
-- Behavior: preserve original string → returns ["1"]Default behavior by project creation date
Projects created on or after January 21, 2021: preserves original strings (
new=truebehavior)Projects created before January 21, 2021: escapes JSON reserved characters (
new=falsebehavior)
To switch your project's default to preserving original strings without setting the flag in every session, submit a ticket.
Enabling Hive compatibility mode (SET odps.sql.hive.compatible=true;) also preserves original strings.
Usage notes
Performance: avoid calling GET_JSON_OBJECT multiple times on the same JSON
Each call parses the JSON string independently. Calling it multiple times on the same row is inefficient. To avoid repeated parsing, use GET_JSON_OBJECT with a user-defined table-valued function (UDTF) to transform JSON log data. For more information, see Convert JSON log data using MaxCompute built-in functions and UDTFs.
Examples
JSON input
Get a value by key
-- Returns 1.
SELECT GET_JSON_OBJECT(JSON '{"a":1, "b":2}', '$.a');
-- Returns NULL (key does not exist).
SELECT GET_JSON_OBJECT(JSON '{"a":1, "b":2}', '$.c');Invalid json_path returns NULL
-- Returns NULL.
SELECT GET_JSON_OBJECT(JSON '{"a":1, "b":2}', '$invalid_json_path');STRING input
Extract from a nested JSON object
-- Prepare sample data.
CREATE TABLE IF NOT EXISTS src_json (json STRING);
INSERT OVERWRITE TABLE src_json VALUES
('{"store":
{"fruit":[{"weight":8,"type":"apple"},{"weight":9,"type":"pear"}],
"bicycle":{"price":19.95,"color":"red"}},
"email":"amy@only_for_json_udf_test.net",
"owner":"amy"}');
-- Returns amy.
SELECT GET_JSON_OBJECT(src_json.json, '$.owner') FROM src_json;
-- Returns {"weight":8,"type":"apple"} (preserves original string).
SET odps.sql.udf.getjsonobj.new=true;
SELECT GET_JSON_OBJECT(src_json.json, '$.store.fruit[0]') FROM src_json;
-- Returns NULL (field does not exist).
SELECT GET_JSON_OBJECT(src_json.json, '$.non_exist_key') FROM src_json;Extract from a JSON array
-- Returns 2222.
SELECT GET_JSON_OBJECT('{"array":[["aaaa",1111],["bbbb",2222],["cccc",3333]]}', '$.array[1][1]');
-- Returns ["h0","h1","h2"] (wildcard, preserves original string).
SET odps.sql.udf.getjsonobj.new=true;
SELECT GET_JSON_OBJECT('{"aaa":"bbb","ccc":{"ddd":"eee","fff":"ggg","hhh":["h0","h1","h2"]},"iii":"jjj"}', '$.ccc.hhh[*]');
-- Returns h1.
SELECT GET_JSON_OBJECT('{"aaa":"bbb","ccc":{"ddd":"eee","fff":"ggg","hhh":["h0","h1","h2"]},"iii":"jjj"}', '$.ccc.hhh[1]');Keys containing a period (`.`)
Use [''] to access JSON keys that contain a period. This requires SET odps.sql.udf.getjsonobj.new=true;.
-- Prepare sample data.
CREATE TABLE json_test (id STRING, json STRING);
-- Key contains a period.
INSERT INTO TABLE json_test (id, json) VALUES
("1", "{\"China.beijing\":{\"school\":{\"id\":0,\"book\":[{\"title\":\"A\",\"price\":8.95},{\"title\":\"B\",\"price\":10.2}]}}}");
-- Key does not contain a period.
INSERT INTO TABLE json_test (id, json) VALUES
("2", "{\"China_beijing\":{\"school\":{\"id\":0,\"book\":[{\"title\":\"A\",\"price\":8.95},{\"title\":\"B\",\"price\":10.2}]}}}");
-- Use [''] to extract from a key with a period. Returns 0.
SELECT GET_JSON_OBJECT(json, "$['China.beijing'].school['id']") FROM json_test WHERE id = 1;
-- Both . and [''] work for keys without special characters. Both return 0.
SELECT GET_JSON_OBJECT(json, "$['China_beijing'].school['id']") FROM json_test WHERE id = 2;
SELECT GET_JSON_OBJECT(json, "$.China_beijing.school['id']") FROM json_test WHERE id = 2;`['']` with the new behavior flag
SET odps.sql.udf.getjsonobj.new=true;
-- Returns 1.
SELECT GET_JSON_OBJECT('{"a.1":"1","a":"2"}', '$[\'a.1\']');Empty or invalid JSON input
-- Returns NULL (empty input).
SELECT GET_JSON_OBJECT('', '$.array[1][1]');
-- Returns NULL (missing outer braces).
SELECT GET_JSON_OBJECT('"array":["aaaa",1111],"bbbb":["cccc",3333]', '$.array[1][1]');Escaped characters in string values
SET odps.sql.udf.getjsonobj.new=true;
-- Returns "1".
SELECT GET_JSON_OBJECT('{"a":"\\"1\\"","b":"2"}', '$.a');
-- Returns '1'.
SELECT GET_JSON_OBJECT('{"a":"\'1\'","b":"2"}', '$.a');Emoji characters
-- Returns the emoji symbol.
SELECT GET_JSON_OBJECT('{"a":"<Emoji symbol>"}', '$.a');DataWorks does not support entering emoji characters directly. Use a tool such as Data Integration to write encoded emoji strings to MaxCompute, then process them with GET_JSON_OBJECT.What's next
JSON functions — related built-in functions for JSON processing
Migrate JSON data from OSS to MaxCompute — best practices for JSON data migration
LanguageManual UDF — JSON path reference from Apache Hive