Before you use a search index, you must understand the data types that it supports and the mapping between these types and the data types in your data table.
Data type descriptions
Search index provides primitive data types, such as Long, Double, Boolean, Keyword, FuzzyKeyword, Text, Date, IP, Geopoint, Vector, and JSON. It also provides special types, such as array types, nested types, and virtual columns.
The search index Date type is supported in Tablestore SDK for Java 5.13.9 and later. Before you use the Date type, make sure that your Java SDK version meets this requirement. For more information about Java SDK versions, see Java SDK release history.
Primitive data types
The following table describes the primitive data types that search index provides.
Primitive data type | Description |
Long | A 64-bit long integer. |
Double | A 64-bit double-precision floating-point number. |
Boolean | A Boolean value. |
Keyword | A non-tokenized string. |
FuzzyKeyword | A string that supports high-performance fuzzy queries. |
Text | A tokenized string or text. |
Date | A date and time type that supports custom date formats. |
IP | An IP data type that supports IP addresses in IPv4 and IPv6 formats. |
Geopoint | Coordinate information for a location point. The format is |
Vector | A vector type. The format is a string of a Float32 array. The length of the array is equal to the dimension of the field. For example, the vector string |
JSON | A JSON type that supports the OBJECT and NESTED index types. |
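As a rough illustration of the Vector format described in the table, the following Python sketch serializes a list of numbers into an array string and checks that the array length equals the field dimension. The helper name `encode_vector` and the JSON-style bracketed layout are assumptions made for illustration. They are not part of any Tablestore SDK.

```python
import json
import math


def encode_vector(values, dimension):
    """Serialize numbers into an array string for a Vector field.

    Hypothetical helper: the JSON-style "[a, b, c]" layout is an assumption.
    """
    # The array length must equal the dimension configured for the field.
    if len(values) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(values)}")
    floats = [float(v) for v in values]
    # Reject NaN and infinity, which have no standard JSON representation.
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("vector values must be finite numbers")
    return json.dumps(floats)


if __name__ == "__main__":
    print(encode_vector([1, 5.1, 4.7, 0.08], 4))
```

The resulting string is what would be written to the String column of the data table that backs the Vector field.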
Array and nested types
In addition to primitive data types, search index provides two special types: array and nested. The array type stores a series of values of the same type. The nested type stores data that has a hierarchical structure, similar to JSON. For more information, see Array and nested types.
Array type
Nested type
Virtual columns
To query new fields with new data types without changing the storage structure and data in Tablestore, use virtual columns in a search index. For more information, see Virtual columns.
The virtual column feature lets you map a column from a table to a virtual column in a search index when you create the index. The type of the new virtual column can be different from the original column type in the table. This lets you create a new column without modifying the table schema and data. The new column can be used to accelerate queries or to use a different tokenizer.
Different tokenization methods for Text fields
A single String column can be mapped to multiple Text columns of a search index. Different Text columns use different tokenization methods to meet various business requirements.
Query acceleration
You do not need to cleanse data or re-create the table schema. You only need to map the required columns of a table to columns in a search index. The column types in the table and the search index can differ. For example, map a numeric column to the Keyword type to improve the performance of term queries, or map a String column to a numeric type to improve the performance of range queries.
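The two uses of virtual columns can be pictured with a small data model. The following Python sketch is illustrative only: the `VirtualField` class, the column names, and the analyzer labels `single_word` and `max_word` are hypothetical and do not correspond to SDK classes or to documented tokenizer names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VirtualField:
    """One virtual column in a search index (illustrative model, not an SDK class)."""
    name: str                       # field name in the search index
    field_type: str                 # search index type, e.g. "Text" or "Keyword"
    source_column: str              # table column that supplies the value
    analyzer: Optional[str] = None  # tokenizer label; only meaningful for Text


# Hypothetical table columns: "title" (String), "status" (Integer), "price" (String).
VIRTUAL_FIELDS = [
    # One String column mapped to two Text fields with different tokenizers.
    VirtualField("title_single", "Text", "title", analyzer="single_word"),
    VirtualField("title_max", "Text", "title", analyzer="max_word"),
    # Numeric column exposed as Keyword to speed up term queries.
    VirtualField("status_kw", "Keyword", "status"),
    # String column exposed as a numeric type to speed up range queries.
    VirtualField("price_num", "Double", "price"),
]


def fields_for(source_column: str, fields: List[VirtualField]) -> List[str]:
    """Return the names of the virtual fields that read from one table column."""
    return [f.name for f in fields if f.source_column == source_column]


if __name__ == "__main__":
    print(fields_for("title", VIRTUAL_FIELDS))
```

In this model the table itself is never modified. Only the index definition gains extra fields that point back to existing columns.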
Data type mapping
The value of a field in a search index is derived from the value of the corresponding field in the data table. The data types of these fields must match. The following table describes the mappings between field data types in search indexes and data tables.
Data type of the search index field | Data type of the data table field |
Long | Integer |
Long Array | String |
Double | Double |
Double Array | String |
Boolean | Boolean |
Boolean Array | String |
Keyword | String |
Keyword Array | String |
Text | String |
Date | Integer, String |
Date Array | String |
IP | String |
IP Array | String |
Geopoint | String |
Geopoint Array | String |
Vector | String |
Nested | String |
JSON | String |
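The mapping table can also be kept as a lookup structure, for example to validate an index definition before it is submitted. The following Python sketch transcribes the table above into a dictionary. The names `TYPE_MAPPING` and `is_compatible` are hypothetical, and the check covers only the direct mappings in the table, not virtual columns.

```python
# Search index field type -> allowed data table column types,
# transcribed from the mapping table above.
TYPE_MAPPING = {
    "Long": {"Integer"},
    "Long Array": {"String"},
    "Double": {"Double"},
    "Double Array": {"String"},
    "Boolean": {"Boolean"},
    "Boolean Array": {"String"},
    "Keyword": {"String"},
    "Keyword Array": {"String"},
    "Text": {"String"},
    "Date": {"Integer", "String"},
    "Date Array": {"String"},
    "IP": {"String"},
    "IP Array": {"String"},
    "Geopoint": {"String"},
    "Geopoint Array": {"String"},
    "Vector": {"String"},
    "Nested": {"String"},
    "JSON": {"String"},
}


def is_compatible(index_type, table_type):
    """Return True if a table column type can back the given index field type."""
    allowed = TYPE_MAPPING.get(index_type)
    if allowed is None:
        raise KeyError(f"unknown search index type: {index_type}")
    return table_type in allowed


if __name__ == "__main__":
    print(is_compatible("Date", "Integer"))
    print(is_compatible("Long", "String"))
```

Note that every array type, as well as Nested, Vector, and JSON, is backed by a String column, so a check like this cannot confirm that the string content itself is well formed.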