Search indexes support the following primitive data types, such as Long, Double, Boolean, Keyword, Text, Date, Geopoint, and Vector. Search indexes also supports the Array and Nested data types. The Array data type is suitable for storing a collection of the same type of data. The Nested data type is similar to the JSON data type and is suitable for storing data that has a hierarchical structure.
Array data type
The Array data type can be used only in search indexes. Data tables do not support the Array data type.
You can use the Array data type in the same manner you use a non-Array data type. When you use an Array field to query data, a row of data is returned if at least one of the values in the array meets the condition.
The vector data type cannot be used in arrays.
The Array data type is a composite data type and can be combined with primitive data types, such as Long, Double, Boolean, Keyword, Text, Date, and Geopoint, to construct complex data structures. For example, the combination of the Long data type and the Array data type is used to construct long arrays. A long array can contain multiple long integers. The Array data type is suitable for storing a collection of the same type of data.
Array formats
The following table describes the combinations of the Array data type and primitive data types in search indexes.
Combination | Description |
Long Array | An array of long integers. Example: |
Double Array | An array of floating-point numbers. Example: |
Boolean Array | An array of Boolean values. Example: |
Keyword Array | An array of strings. A keyword array is a JSON array. Example: |
Text Array | An array of text. A text array is a JSON array. Example: Text arrays are not commonly used. |
Date Array | An array of date data. Example of date data of the Integer type: |
Geopoint Array | An array of latitude and longitude coordinate pairs. Example: |
Usage notes
If the data type of a field in a search index is a combination of the Array data type and a primitive data type, such as Long or Double, the field in the data table for which the search index is created must be of the String type and the field in the search index must be of the corresponding primitive data type.
For example, the price
field is of the Double Array type. The value of the price
field in the data table must be of the String type, the value of the price field in the search index must be of the Double type, and isArray=true
must be specified.
Examples
The following table describes the sample data in the data table named array_search_table
.
The data table consists of thepk
primary key column of the String type, thecol_keyword_array
attribute column of the String type, and thecol_long_array
attribute column of the String type.
pk | col_keyword_array | col_long_array |
03#server#07 | ["Development environment", "Test environment", "Physical server", "Linux" ] | [2020, 2023] |
4c#server#ae | ["Production environment", "Cloud server", "Linux" ] | [2021, 2024] |
Create a search index.
In this example, a search index named
array_query_table_index
is created. The search index contains the following fields of the Array data type: thecol_keyword_array
field that stores elements of the Keyword data type and thecol_long_array
field that stores elements of the Long data type.Use the search index to query data of the Array data type.
The following sample Java code provides an example on how to use a search index to query data of the Array data type. In this example, the rows in which the value of the
col_keyword_array
column contains elements that exactly match "Cloud server" and the value of thecol_long_array
column contains elements that are equal to 2024 meet the query conditions.NoteYou can execute SQL statements to query data of the Array data type in a search index. For more information, see Execute SQL statements to query data by using the search index.
private static void query(SyncClient client) { // Condition 1: The value of the col_keyword_array column contains elements that exactly match "Cloud server". TermQuery keywordTermQuery = new TermQuery(); // Set the query type to TermQuery. keywordTermQuery.setFieldName("col_keyword_array"); // Specify the name of the column that you want to match. keywordTermQuery.setTerm(ColumnValue.fromString("Cloud server")); // Specify the value that you want to use to match the field. // Condition 2: The value of the col_long_array column contains elements that are equal to 2024. TermQuery longTermQuery = new TermQuery(); // Set the query type to TermQuery. longTermQuery.setFieldName("col_long_array"); // Specify the name of the column that you want to match. longTermQuery.setTerm(ColumnValue.fromLong(2024l)); // Specify the value that you want to use to match the field. SearchQuery searchQuery = new SearchQuery(); // Construct a Boolean query in which the query results meet Condition 1 and Condition 2 at the same time. BoolQuery boolQuery = new BoolQuery(); boolQuery.setMustQueries(Arrays.asList(keywordTermQuery, longTermQuery)); searchQuery.setQuery(boolQuery); //searchQuery.setGetTotalCount(true); // Specify that the total number of matched rows is returned. SearchRequest searchRequest = new SearchRequest("<TABLE_NAME>", "<SEARCH_INDEX_NAME>", searchQuery); // You can configure the columnsToGet parameter to specify the columns that you want to return or specify that all columns are returned. If you do not configure this parameter, only the primary key columns are returned. //SearchRequest.ColumnsToGet columnsToGet = new SearchRequest.ColumnsToGet(); //columnsToGet.setReturnAll(true); // Specify that all columns are returned. //columnsToGet.setColumns(Arrays.asList("ColName1","ColName2")); // Specify the columns that you want to return. //searchRequest.setColumnsToGet(columnsToGet); SearchResponse resp = client.search(searchRequest); //System.out.println("TotalCount: " + resp.getTotalCount()); // Specify that the total number of matched rows instead of the number of returned rows is displayed. System.out.println("Row: " + resp.getRows()); }
Nested data type
Data of the Nested type is nested documents. Nested documents are used when a row of data (document) contains multiple child rows (child documents). Multiple child rows are stored in a nested field. The Nested data type is suitable for storing data that has a hierarchical structure.
You must specify the schema of child rows in a nested field. The schema must include the fields of the child rows and the property of each field. The Nested data type can be used to store multiple values, which is similar to the JSON data type.
Nested formats
Nested fields are classified into single-level and multi-level nested fields. The following table describes the two types.
Type | Description |
Single-level nested field | A single-level nested field has a simple data structure that contains only one level. Single-level nested fields are suitable for scenarios in which data structures of multiple levels are not required but hierarchical structures are required. Example:
|
Multi-level nested field | A multi-level nested field has a complex data structure that contains multiple levels. Multi-level nested fields are suitable for scenarios in which complex data structures are required to store organized data of various levels in centralized modules. Example:
|
Usage notes
If a field in a search index is of the Nested type, the field in the data table for which the search index is created must be of the String type, and the field in the search index must be of the Nested type. You must perform nested queries to query fields of the Nested type.
When you write data to a field in a data table and the field corresponds to a Nested field in the search index created for the data table, make sure that the field in the data table is of the JSON Array type. Example: [{"tagName":"tag1", "score":0.8,"time": 1730690237000 }, {"tagName":"tag2", "score":0.2,"time": 1730691557000}]
.
You must write strings of the JSON Array type to a nested field regardless of whether the field contains only one child row.
Examples
Example of single-level Nested fields
You can create a single-level Nested field in the Tablestore console or by using a Tablestore SDK.
This section describes how to create a single-level Nested field by using Tablestore SDK for Java. In this example, a Nested field named tags is created. Each child row contains three fields. The following figure shows the details.
Field name: tagName. Field type: Keyword.
Field name: score. Field type: Double.
Field name: time. Field type: Date. Unit: milliseconds.
Sample data written to the data table: [{"tagName":"tag1", "score":0.8,"time": 1730690237000 }, {"tagName":"tag2", "score":0.2,"time": 1730691557000}]
.
// Create schemas for the fields in the child row.
List<FieldSchema> subFieldSchemas = new ArrayList<FieldSchema>();
subFieldSchemas.add(new FieldSchema("tagName", FieldType.KEYWORD)
.setIndex(true).setEnableSortAndAgg(true));
subFieldSchemas.add(new FieldSchema("score", FieldType.DOUBLE)
.setIndex(true).setEnableSortAndAgg(true));
subFieldSchemas.add(new FieldSchema("time", FieldType.DATE)
.setDateFormats(Arrays.asList("epoch_millis")));
// Use the schemas created for the child rows as the value of subfieldSchemas for the Nested field.
FieldSchema nestedFieldSchema = new FieldSchema("tags", FieldType.NESTED)
.setSubFieldSchemas(subFieldSchemas);
Example of multiple-level Nested fields
You can create a multi-level Nested field by using a Tablestore SDK.
This section describes how to create a multiple-level Nested field by using Tablestore SDK for Java. In this example, a Nested field named user is created. Each child row contains four fields of different primitive data types and one Nested field.
Field name: name. Field type: Keyword.
Field name: age. Field type: Long.
Field name: birth. Field type: Date. The value of the field is in the date format.
Field name: phone. Field type: Keyword.
Nested field name: address. Names of the fields in each child row: province, city, and street. Data type of all fields in each child row: Keyword.
Sample data written to the data table: [ {"name":"John","age":20,"brith":"2014-10-10 12:00:00.000","phone":"1390000****","address":[{"province":"Zhejiang","city":"Hangzhou","street":"1201 Xingfu Community, Sunshine Avenue"}]}]
.
// Create schemas for the three fields in the child rows of the address Nested field. The path specified by user.address can be used to query data of fields in a child row.
List<FieldSchema> addressSubFiledSchemas = new ArrayList<>();
addressSubFiledSchemas.add(new FieldSchema("province",FieldType.KEYWORD));
addressSubFiledSchemas.add(new FieldSchema("city",FieldType.KEYWORD));
addressSubFiledSchemas.add(new FieldSchema("street",FieldType.KEYWORD));
// Create a schema for each child row of the Nested field user. Each child row contains three fields of different primitive data types and one Nested field named address. The path specified by the Nested field user can be used to query data of fields in a child row.
List<FieldSchema> subFieldSchemas = new ArrayList<>();
subFieldSchemas.add(new FieldSchema("name",FieldType.KEYWORD));
subFieldSchemas.add(new FieldSchema("age",FieldType.LONG));
subFieldSchemas.add(new FieldSchema("birth",FieldType.DATE).setDateFormats(Arrays.asList("yyyy-MM-dd HH:mm:ss.SSS")));
subFieldSchemas.add(new FieldSchema("phone",FieldType.KEYWORD));
subFieldSchemas.add(new FieldSchema("address",FieldType.NESTED).setSubFieldSchemas(addressSubFiledSchemas));
// Use the schemas created for the child rows of the Nested field user as the value of subfieldSchemas for the Nested field.
List<FieldSchema> fieldSchemas = new ArrayList<>();
fieldSchemas.add(new FieldSchema("user",FieldType.NESTED).setSubFieldSchemas(subFieldSchemas));
Limits
Nested indexes do not support the IndexSort feature, which can be used to improve query performance in various scenarios.
If you use a search index that contains a nested field to query data and require pagination, you must specify the sorting method to return data in the query conditions. Otherwise, Tablestore does not return nextToken when only part of data that meets the query conditions is read.
Nested queries provide lower performance than other types of queries.
The Nested data type can be used in all queries, sorting, and aggregation.
References
When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, fuzzy query, Boolean query, geo query, nested query, KNN vector query, and exists query. You can select query methods based on your business requirements to query data from multiple dimensions.
You can sort or paginate rows that meet the query conditions by using the sorting and paging features. For more information, see Perform sorting and paging.
You can use the collapse (distinct) feature to collapse the result set based on a specific column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).
If you want to analyze data in a data table, you can use the aggregation feature of the Search operation or execute SQL statements. For example, you can obtain the minimum and maximum values, sum, and total number of rows. For more information, see Aggregation and SQL query.
If you want to obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.