AnalyticDB for MySQL provides JSON indexes as a semi-structured data retrieval feature to simplify semi-structured data processing.
Background information
In the era of big data, a variety of solutions are available to retrieve structured data. However, most big data is semi-structured, and the amount of semi-structured data is rapidly increasing. Semi-structured data is much more difficult to understand and analyze than structured data. Mature solutions are required to process semi-structured data. AnalyticDB for MySQL provides JSON indexes as a semi-structured data retrieval feature to simplify semi-structured data processing.
Precautions
- If you set the data type of a column to JSON when you create a table, a JSON index is automatically created in AnalyticDB for MySQL clusters that run a minor engine version earlier than 3.1.5.10.
After the JSON index is created, you can add or remove an index by executing the ALTER TABLE statement. For more information, see the "ALTER TABLE" of this topic.
- If you set the data type of a column to JSON when you create a table, no JSON index is created in AnalyticDB for MySQL clusters that run the minor engine version 3.1.5.10 or later. You must execute DDL statements to create a JSON index.
ALTER table <table_name> ADD key|index <index_name>(json_column);
- AnalyticDB for MySQL supports the standard JSON format. When you write JSON strings to a table, you must comply with the standard JSON format.
- Default values cannot be set for JSON data columns.
Create a table
In the following sample statement, the data in the vj
column is of the JSON type. After a table is created, AnalyticDB for MySQL automatically creates a JSON index for the vj
column.
CREATE TABLE json_test(
id int,
vj json COMMENT 'The vj column is of the JSON type. AnalyticDB for MySQL automatically creates a JSON index for this column.'
)
DISTRIBUTED BY HASH(id);
- Remove an index
ALTER TABLE db_name.table_name DROP KEY index_name;
Note You can execute the SHOW INDEXES statement to view information about all indexes in the table, including index names.key_name
indicates the name of an index. For more information, see SHOW. - Add an index
ALTER TABLE db_name.table_name ADD KEY index_name(column_name);
JSON format
keys
and values
:Keys
You must enclose a
key
with double quotation marks (""
). Example:"addr"
in{"addr":"xyz"}
.Values
- Data types available for a
value
include BOOLEAN, NUMBER, VARCHAR, ARRAY, OBJECT, and NULL.Note- A value of the NUMBER type must be within the allowed value range of the DOUBLE type. Otherwise, an error is returned.
- AnalyticDB for MySQL allows you to write JSON arrays, including plain arrays and nested arrays. Examples:
{"hobby":["basketball", "football"]}
and{"addr":[{"city":"beijing", "no":0}, {"city":"shenzhen", "no":0}]}
.
- If a
value
is of the STRING type, you must enclose thevalue
with double quotation marks (""
).Note If avalue
is of the STRING type and contains double quotation marks (""), you must add escape characters before the double quotation marks to indicate that the double quotation marks are part of thevalue
. For example, thevalue
in{"addr":"xyz"ab"c"}
is"xyz"ab"c"
. The value must be escaped as"xyz\"ab\"c"
. Backslashes (\
) are escaped when data is written, so you must write{"addr":"xyz\\"ab\\"c"}
. - If a
value
is of the NUMBER type, you must write thevalue
to the table instead of enclosing the value with double quotation marks (""
). - If a
value
is of theBOOLEAN
type, you must writetrue
orfalse
instead of1
or0
to the table.true
andfalse
must be in lowercase. - If a
value
isNULL
, you must writeNull
to the table. - A
key
can havevalues
of different data types. The value of a specified data type is returned during queries.For example, if the
INSERT INTO test_tb1 VALUES ({"id":1})
statement is executed, the value ofid
is1
. If theINSERT INTO test_tb1 VALUES ({"id":"1"})
statement is executed, the value ofid
is"1"
.If the
SELECT id FROM test_tb1 WHERE json_extract(col, '$.id')= 1;
statement is executed,"id": 1
is returned. If theSELECT id FROM test_tb1 WHERE json_extract(col, '$.id')= '1';
statement is executed,"id": "1"
is returned.
- Data types available for a
Write data
Fields of the JSON type are written to tables in single quotation marks (''), in the same manner as fields of the VARCHAR type.
The following SQL statements provide examples of different JSON data types:
INSERT INTO json_test VALUES(0, '{"id":0, "name":"abc", "age":0}');
INSERT INTO json_test VALUES(1, '{"id":1, "name":"abc", "age":10, "gender":"f"}');
INSERT INTO json_test VALUES(2, '{"id":3, "name":"xyz", "age":30, "company":{"name":"alibaba", "place":"hangzhou"}}');
INSERT INTO json_test VALUES(3, '{"id":5, "name":"a\\"b\\"c", "age":50, "company":{"name":"alibaba", "place":"america"}}');
INSERT INTO json_test VALUES(4, '{"a":1, "b":"abc-char", "c":true}');
INSERT INTO json_test VALUES(5, '{"uname":{"first":"lily", "last":"chen"}, "addr":[{"city":"beijing", "no":1}, {"city":"shenzhen", "no":0}], "age":10, "male":true, "like":"fish", "hobby":["basketball", "football"]}');
Query data
AnalyticDB for MySQL allows you to use the JSON_EXTRACT()
function for data query.
- Syntax
json_extract(json, jsonpath)
- Description
This function extracts one or more values specified by
jsonpath
from a valid JSON string. - Parameters
json
: the name of the column of the JSON type.jsonpath
: the path of thekey
that is separated by periods (.
).$
indicates the outermost path.
For more information about JSON functions, see JSON functions.
- Example
- Basic query
SELECT json_extract(vj,'$.name') FROM json_test WHERE id=1;
- Equivalent query
SELECT id, vj FROM json_test WHERE json_extract(vj, '$.name') = 'abc'; SELECT id, vj FROM json_test WHERE json_extract(vj, '$.c') = true; SELECT id, vj FROM json_test WHERE json_extract(vj, '$.age') = 30; SELECT id, vj FROM json_test WHERE json_extract(vj, '$.company.name') = 'alibaba';
- Range query
SELECT id, vj FROM json_test WHERE json_extract(vj, '$.age') > 0; SELECT id, vj FROM json_test WHERE json_extract(vj, '$.age') < 100; SELECT id, vj FROM json_test WHERE json_extract(vj, '$.name') > 'a' and json_extract(vj, '$.name') < 'z';
- IS NULL or IS NOT NULL query
SELECT id, vj FROM json_test WHERE json_extract(vj, '$.remark') is null; SELECT id, vj FROM json_test WHERE json_extract(vj, '$.name') is not null;
- IN query
SELECT id, vj FROM json_test WHERE json_extract(vj, '$.name') in ('abc','xyz'); SELECT id, vj FROM json_test WHERE json_extract(vj, '$.age') in (10,20);
- LIKE query
SELECT id, vj FROM json_test WHERE json_extract(vj, '$.name') like 'ab%'; SELECT id, vj FROM json_test WHERE json_extract(vj, '$.name') like '%bc%'; SELECT id, vj FROM json_test WHERE json_extract(vj, '$.name') like '%bc';
- ARRAY query
SELECT id, vj FROM json_test WHERE json_extract(vj, '$.addr[0].city') = 'beijing' and json_extract(vj, '$.addr[1].no') = 0;
Note You can use subscripts when you query arrays. However, you cannot iterate through an entire array.
- Basic query