Use search index query functions in SQL WHERE clauses to perform full-text search, array, nested, vector, and JSON queries on search index mapping tables.
Supported operations
Before you use search index SQL queries, create a mapping table for the search index. For more information, see DDL operations.
Function | Query type | Description |
TEXT_MATCH | Full-text search | Matches rows that contain at least one token from the query text. |
TEXT_MATCH_PHRASE | Full-text search | Matches rows in which the tokens appear consecutively in the specified order. |
ARRAY_EXTRACT | Array query | Expands an array column for filtering with operators. |
NESTED_QUERY | Nested type query | Requires all conditions to be met by a single JSON element. |
VECTOR_QUERY_FLOAT32 | Vector search | Performs approximate nearest neighbor (ANN) queries. |
SCORE() | Vector search | Returns the relevance score of a vector search result. |
->> | JSON function | Extracts a value at the specified path and converts it to a string. |
JSON_UNQUOTE | JSON function | Removes the outer quotation marks from a JSON value. |
JSON_EXTRACT | JSON function | Extracts the sub-document at the specified path. |
Full-text search
Match data in Text-type fields with TEXT_MATCH for token matching or TEXT_MATCH_PHRASE for phrase matching.
Before you use full-text search, configure the target column as the Text type in the search index and set a Tokenization. For columns with fuzzy tokenization, use TEXT_MATCH_PHRASE for high-performance fuzzy queries.
TEXT_MATCH (match query)
Tokenizes the query text and matches rows that contain at least one token. Returns a Boolean value: true for a match, false for no match.
TEXT_MATCH(fieldName, text [, options])Parameter | Type | Description |
fieldName | STRING | The name of the column to match. The column must be the Text type in the search index. |
text | STRING | The query text. The text is tokenized and then matched against the row data. A row matches if it contains any token. The search index tokenizer determines how the text is split into tokens. If no tokenizer is specified, single-character tokenization is used by default. |
options | STRING | Optional match parameters, including operator (the logical operator, which can be OR or AND, default OR) and minimum_should_match (the minimum number of matching tokens, default 1). If operator is OR, a row matches when it contains at least minimum_should_match tokens. If operator is AND, all tokens must be present in the row. |
TEXT_MATCH_PHRASE (phrase match query)
Similar to TEXT_MATCH, but requires the tokens to appear consecutively in the same order in the row data. Returns a Boolean value.
TEXT_MATCH_PHRASE(fieldName, text)The parameters are the same as TEXT_MATCH, but TEXT_MATCH_PHRASE requires the tokens to match consecutively in order. For example, the query text "this is" matches "this is tablestore" but does not match "this table is" or "is this".
Examples
Query data in the content column that contains the "tablestore" token:
SELECT * FROM search_exampletable WHERE TEXT_MATCH(content, 'tablestore') LIMIT 10;Query data in the content column that contains "sql query" as a consecutive phrase:
SELECT * FROM search_exampletable WHERE TEXT_MATCH_PHRASE(content, 'sql query') LIMIT 10;Use the options parameter to match data that contains at least 2 tokens:
SELECT * FROM search_exampletable WHERE TEXT_MATCH(content, 'tablestore is cool', 'or', '2') LIMIT 10;Use the AND operator to require all tokens to be present:
SELECT * FROM search_exampletable WHERE TEXT_MATCH(content, 'tablestore is cool', 'and') LIMIT 10;Array queries
Use the ARRAY_EXTRACT function to query data in array-type columns. Configure the column as an array type in the search index by enabling the array option in the console or setting IsArray to true in the SDK. When you write data, array values must be in JSON array format, such as ["a","b","c"].
Data type mapping
Data table type | Search index type | SQL type |
String | The actual type of array elements (Long, Double, Boolean, Keyword, or Text), with the array property enabled for the column | VARCHAR (primary key) or MEDIUMTEXT (predefined column) |
ARRAY_EXTRACT(col_name)ARRAY_EXTRACT expands an array column and combines with operators as a WHERE clause condition. Supported operators include equality (=), range (>, <), and LIKE.
You cannot use an array column directly with operators as a query condition. Use the ARRAY_EXTRACT function.
Limitations
ARRAY_EXTRACT can be used only on search index mapping tables, and only one array column parameter is allowed per call. The function can be used only as a WHERE clause condition. It cannot be used as a SELECT expression or for aggregation and sorting.
An array column without ARRAY_EXTRACT can be used as a SELECT column name or expression but cannot be used for aggregation and sorting.
When ARRAY_EXTRACT is combined with operators as a query condition, data type conversion is not supported. The query value must match the data type of the array column. For example, a Long-type array column supports
ARRAY_EXTRACT(col) = 1but does not supportARRAY_EXTRACT(col) = '1'.Text-type array elements must be used with the TEXT_MATCH or TEXT_MATCH_PHRASE function, such as
TEXT_MATCH(ARRAY_EXTRACT(col_text), 'keyword').
Examples
-- Query rows that contain the value 'apple' in the array
SELECT * FROM search_exampletable WHERE ARRAY_EXTRACT(col_array) = 'apple';
-- Query rows that contain elements starting with 'd' in the array
SELECT * FROM search_exampletable WHERE ARRAY_EXTRACT(col_array) LIKE 'd%';Nested type queries
Nested type columns store JSON arrays in which each element contains multiple subcolumns. The column data type in the data table must be String. Configure the column as the Nested type and specify the subcolumn data types when you create the search index.
When you create a mapping table, define nested type columns as MEDIUMTEXT. Internal subcolumns are created automatically and can be viewed with DESCRIBE, such as col_nested.name and col_nested.age. In queries, subcolumn names use the format nested_column.sub_column, with periods (.) separating multiple nesting levels, such as col1.col2.col3.
Data type mapping
Data table type | Search index type | SQL type |
String | Nested type. The subcolumn data types match the actual data types of the written data. | VARCHAR (primary key) or MEDIUMTEXT (predefined column) |
Query methods
Direct subcolumn query
Use nested subcolumns directly with operators. A row matches if any JSON element in the row has a subcolumn that meets the condition.
SELECT * FROM search_exampletable WHERE `col_nested.age` > 30;NESTED_QUERY function
Requires all conditions to be met by a single JSON element.
NESTED_QUERY(subcol_column_condition)subcol_column_condition specifies query conditions on subcolumns at the same nesting level. Combine multiple conditions with AND or OR.
Difference between the two methods
Assume the nested column tags contains the following row data: [{"tagName":"tag1", "score":0.8}, {"tagName":"tag2", "score":0.2}]:
tags.tagName— Matches, because the first element meets the tagName condition and the second element meets the score condition.NESTED_QUERY(— Does not match, because no single element meets both conditions.
Limitations
NESTED_QUERY can be used only on search index mapping tables and only as a WHERE clause. It cannot be used as a SELECT expression or for aggregation, grouping, or sorting.
Nested subcolumns cannot be used as SELECT expressions or for aggregation, grouping, or sorting.
ALTER TABLE cannot directly add or delete nested subcolumns. You can only add or delete the entire nested column, and the subcolumns are automatically added or deleted with it.
Nested subcolumns do not support data type conversion or function computations that cannot be pushed down to the search index. Make sure the data types of nested subcolumns are correct.
Examples
-- Direct subcolumn query: query rows where age > 30 in the nested column
SELECT * FROM search_exampletable WHERE `col_nested.age` > 30;
-- NESTED_QUERY: query rows where a single element has name starting with 'I' and age < 20
SELECT * FROM search_exampletable WHERE NESTED_QUERY(`col_nested.name` LIKE 'I%' AND `col_nested.age` < 20);
-- Multi-level nesting
SELECT * FROM search_exampletable WHERE NESTED_QUERY(`col1.col2` = 1 AND NESTED_QUERY(`col1.col3.col4` = 2));Virtual column queries
Search index virtual columns let you query new fields and types by modifying the search index schema, without changing the data table storage structure. Virtual columns are defined in the mapping table with their actual SQL data types.
Data type mapping
Search index virtual column type | SQL type | Description |
Keyword | MEDIUMTEXT | Virtual columns have no corresponding column in the data table. Only their source columns have corresponding columns. |
Text | MEDIUMTEXT | |
Long | BIGINT | |
Double | DOUBLE |
Supported usage
Filter data in WHERE clauses. The data type of the virtual column in the condition must match the query parameter type.
Use in aggregation and grouping. The source data type of the virtual column must be compatible with the operation. For example, only Long and Double types support SUM. Keyword-type virtual columns cannot be summed, and Text-type virtual columns do not support grouping.
TopN queries and sorting are supported. Sorting requires LIMIT.
Limitations
Virtual columns can be used only in search index mapping tables.
Virtual columns can be used only in query conditions. They cannot be used in SELECT to return column values. To return values, specify the source column of the virtual column.
SELECT *is not affected and automatically excludes virtual columns from the results.Virtual columns cannot be used for column comparisons, computations, or JOINs.
Virtual columns do not support data type conversion or function computations that cannot be pushed down to the search index. Currently, only aggregate functions can be pushed down in SQL queries.
Examples
Create a search index mapping table that includes virtual columns:
CREATE TABLE search_exampletable(
col_keyword MEDIUMTEXT,
col_keyword_virtual_long BIGINT
)
ENGINE='searchindex',
ENGINE_ATTRIBUTE='{"index_name":"exampletable_index","table_name":"exampletable"}';Query with virtual columns:
SELECT * FROM search_exampletable WHERE col_keyword_virtual_long > 100 LIMIT 10;Vector search
Use the VECTOR_QUERY_FLOAT32 function for approximate nearest neighbor (ANN) queries. Vector fields are stored as strings in the data table. Configure them as the vector type in the search index, and specify the dimensions, data type, and distance metric. The SQL data type of vector columns in the mapping table is MEDIUMTEXT.
VECTOR_QUERY_FLOAT32
VECTOR_QUERY_FLOAT32(fieldName, float32QueryVector, topK, filter)Parameter | Required | Description |
fieldName | Yes | The name of the vector column. The column must be the vector type in the search index. |
float32QueryVector | Yes | The query vector. The dimensions must match those of the vector field in the search index. |
topK | Yes | The number of nearest results to return. A larger K value improves recall but increases query latency and cost. If topK is less than the LIMIT value, the server automatically increases topK to match LIMIT. For the maximum topK value, see Search index limits. |
filter | No | A query filter that supports any combination of non-vector query conditions. The filter conditions are applied before vector search to narrow down the candidate set for more accurate results. You can also add filter conditions in the WHERE clause with AND, but those conditions filter the topK results after vector search. |
SCORE() function
Use SCORE() with VECTOR_QUERY_FLOAT32 as a SELECT expression to return the relevance score of each result. A higher score indicates greater similarity.
SCORE()Limitations
VECTOR_QUERY_FLOAT32 can be used only on search index mapping tables and must be used with LIMIT. HAVING clauses are not supported.
VECTOR_QUERY_FLOAT32 can be used only as a WHERE clause. It cannot be used as a SELECT expression or for aggregation, grouping, or sorting.
SCORE() can be used only with VECTOR_QUERY_FLOAT32 and only as a SELECT expression. It cannot be used in WHERE clauses, aggregation, or sorting.
Other conditions in the WHERE clause must support pushdown to the search index. Otherwise, the query fails. For supported push-down operators, see Query optimization.
Examples
Query the 10 most similar results to the specified vector in the col_vector column:
SELECT *, SCORE() FROM exampletable WHERE VECTOR_QUERY_FLOAT32(col_vector, '[1.5, -1.5, 2.5, -2.5]', 10) LIMIT 10;Use filter to narrow the candidate set before vector search for more accurate results:
SELECT *, SCORE() FROM exampletable WHERE VECTOR_QUERY_FLOAT32(col_vector, '[1.5, -1.5, 2.5, -2.5]', 100, col_keyword='cat_a' AND year_num=2024) LIMIT 10;Use AND in the WHERE clause for post-vector-search filtering. The topK results may not include all matching rows:
SELECT *, SCORE() FROM exampletable WHERE col_keyword='cat_a' AND VECTOR_QUERY_FLOAT32(col_vector, '[1.5, -1.5, 2.5, -2.5]', 500) LIMIT 10;JSON functions
Tablestore SQL JSON functions follow the MySQL 5.7 syntax and extract data from JSON-format columns.
Function | Syntax | Description |
->> |
| Extracts the value at the specified path and converts it to a string. Equivalent to |
JSON_UNQUOTE |
| Removes the outer quotation marks from a JSON value and returns a string. |
JSON_EXTRACT |
| Extracts the sub-document at the specified path. The return value retains the JSON format. |
->> (JSON path extraction)
Extracts the value at the specified path from a JSON column and unquotes it to a string. Equivalent to JSON_UNQUOTE(JSON_EXTRACT()).
column->>'$.path'Parameter | Type | Description |
column | STRING | The column name. |
path | STRING | A JSON path expression that must start with |
Example
SELECT col_json->>'$.city' AS city FROM exampletable LIMIT 10;JSON_UNQUOTE
Removes the outer quotation marks from a JSON value and returns a string. Returns NULL if the argument is NULL.
JSON_UNQUOTE(json_val)Parameter | Type | Description |
json_val | STRING | A JSON value, typically the return value of JSON_EXTRACT. An error is returned if the value starts and ends with double quotation marks but is not a valid JSON string literal. |
Example
SELECT JSON_UNQUOTE(JSON_EXTRACT(col_json, '$.city')) AS city FROM exampletable LIMIT 10;JSON_EXTRACT
Extracts the sub-document at the specified path from a JSON column. The return value retains the JSON format, with string values wrapped in quotation marks. Multiple paths can be specified, and the results are returned in array format.
Tablestore does not support native JSON types. JSON_EXTRACT cannot be used alone and returns an invalid column type: json error. Use JSON_EXTRACT with JSON_UNQUOTE.
JSON_EXTRACT(json_doc, path[, path] ...)Parameter | Type | Description |
json_doc | STRING | A JSON document. An error is returned if the value is not a valid JSON document. |
path | STRING | A JSON path expression that must start with |
Examples
Extract a single path:
SELECT JSON_UNQUOTE(JSON_EXTRACT(col_json, '$.city')) AS city FROM exampletable WHERE pk = 1;Extract multiple paths. The results are returned in array format:
-- Assume col_json contains {"a": 1, "b": 2, "c": {"d": 4}}
SELECT JSON_UNQUOTE(JSON_EXTRACT(col_json, '$.a', '$.b', '$.c.d')) AS subdoc FROM exampletable WHERE pk = 1;
-- Result: [1, 2, 4]JSON Path syntax
Paths must start with $, which represents the entire JSON document. Append path selectors after $. Selectors can be combined.
Selector | Example | Description |
$.key | $.a, $.c.d | Accesses an object member. Enclose keys that contain spaces in double quotation marks, such as |
[N] | $[0], $.f[1] | Accesses an array element. Indexes start from 0. |
.* | $.* | Object wildcard. Returns the values of all members. |
[*] | $.arr[*] | Array wildcard. Returns the values of all elements. |
prefix**suffix | $**.d | Path wildcard. Matches all paths that start with prefix and end with suffix. |
JSON object query examples
Assume the JSON column contains {"a": 1, "f": [1, 2, 3], "c": {"d": 4}}:
Path | Return value | Description |
$ | {"a": 1, "c": {"d": 4}, "f": [1, 2, 3]} | The entire document |
$.a | 1 | A direct member |
$.c | {"d": 4} | A nested object |
$.c.d | 4 | A nested object member |
$.f[1] | 2 | An array element |
JSON array query examples
Assume the JSON column contains [3, {"a": [5, 6], "b": 10}, [99, 100]]. Non-scalar return values support nested queries.
Path | Return value | Description |
$[0] | 3 | A scalar element |
$[1] | {"a": [5, 6], "b": 10} | A non-scalar value. Nested queries can continue. |
$[1].a | [5, 6] | A nested object member |
$[1].a[1] | 6 | A nested array element |
$[1].b | 10 | A nested object member |
$[2][0] | 99 | A nested array element |
$[3] | NULL | Out of bounds. Returns NULL. |