Search index provides multiple efficient index schemas that help you solve complex query problems in big data scenarios.
Tables in Tablestore are typical distributed NoSQL data structures. Tables support the storage, reading, and writing of large-scale data such as monitoring data and log data. Before search index was introduced, Tablestore supported only queries based on primary keys, such as reading a single row or a specified range of rows. Other types of queries, such as queries based on non-primary key columns and bool queries, were not available.
To resolve this issue, Tablestore introduced the search index feature. Search index supports multiple types of queries based on inverted indexes and column-oriented storage, including but not limited to:
- Query based on non-primary key columns
- Bool query
- Full-text search
- Query by geographical location
- Query by prefix
- Fuzzy query
- Nested query
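The following minimal sketch illustrates why an inverted index makes queries on non-primary key columns and bool queries possible. It is a toy in-memory model, not the Tablestore implementation or SDK: the sample rows, the column names `city` and `level`, and the helper functions are all hypothetical.

```python
from collections import defaultdict

# Toy rows keyed by primary key; "city" and "level" are non-primary key columns.
rows = {
    "pk1": {"city": "hangzhou", "level": "error"},
    "pk2": {"city": "beijing", "level": "error"},
    "pk3": {"city": "hangzhou", "level": "info"},
}

def build_inverted_index(rows):
    """Map each (column, value) pair to the set of primary keys that contain it."""
    index = defaultdict(set)
    for pk, columns in rows.items():
        for column, value in columns.items():
            index[(column, value)].add(pk)
    return index

def term_query(index, column, value):
    """Find rows by a non-primary key column without scanning the table."""
    return set(index.get((column, value), set()))

def bool_must(index, *terms):
    """Combine several term queries with AND by intersecting their result sets."""
    results = [term_query(index, column, value) for column, value in terms]
    return set.intersection(*results) if results else set()

index = build_inverted_index(rows)
print(sorted(term_query(index, "city", "hangzhou")))  # ['pk1', 'pk3']
print(sorted(bool_must(index, ("city", "hangzhou"), ("level", "error"))))  # ['pk1']
```

A lookup touches only the entries for the requested values, which is why this kind of query does not depend on the primary key order of the table.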
Aside from queries based on primary keys in the base table, Tablestore provides two index schemas for accelerated queries: global secondary index and search index. The following table describes the differences among the table and the two types of indexes.
| Index type | Description | Scenario |
| --- | --- | --- |
| Table | A table is similar to a big map. Tables only support queries based on primary keys. | |
| Global secondary index | You can create one or more global secondary indexes and issue query requests against these indexes. This way, you can perform queries based on the primary key columns of these indexes. | |
| Search index | Search index uses inverted indexes, Bkd-trees, and column-oriented storage for various query scenarios. | All query and analysis scenarios that the table and the global secondary index do not support. |
- Common query API: Search
- Data export API: ParallelScan
| API | Description | Performance |
| --- | --- | --- |
| Search | An API that supports all functions of search index. | |
| ComputeSplits + ParallelScan | An API used to export data for multiple concurrent queries. This API supports the query function of search index but not the analysis function such as sorting and aggregation. Compared with the Search API, the ParallelScan API provides better performance. | The throughput of multiple queries in a single request is five times that of the Search API. |
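As a rough illustration of the export pattern described above, the sketch below divides a data set into splits and scans the splits concurrently, without any global sorting or aggregation. It is a conceptual model only: `compute_splits`, `scan_split`, and `parallel_scan` are hypothetical helpers written for this example and do not reflect the signatures of the Tablestore ComputeSplits or ParallelScan operations.

```python
from concurrent.futures import ThreadPoolExecutor

def compute_splits(total_rows, max_parallel):
    """Divide positions [0, total_rows) into contiguous, non-overlapping ranges."""
    if total_rows == 0:
        return []
    size = -(-total_rows // max_parallel)  # ceiling division
    return [(start, min(start + size, total_rows))
            for start in range(0, total_rows, size)]

def scan_split(data, split, predicate):
    """Scan one split and return the matching rows; no sorting or aggregation."""
    start, end = split
    return [row for row in data[start:end] if predicate(row)]

def parallel_scan(data, predicate, max_parallel=4):
    """Run one scan per split concurrently and concatenate the partial results."""
    splits = compute_splits(len(data), max_parallel)
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        parts = list(pool.map(lambda s: scan_split(data, s, predicate), splits))
    return [row for part in parts for row in part]

# Hypothetical sample data: ten rows, every third one has level "error".
data = [{"id": i, "level": "error" if i % 3 == 0 else "info"} for i in range(10)]
exported = parallel_scan(data, lambda row: row["level"] == "error", max_parallel=3)
print(sorted(row["id"] for row in exported))  # [0, 3, 6, 9]
```

Because each split is scanned independently, a consumer of this kind of export should not rely on the order in which rows arrive.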
- Index synchronization
If you have created a search index for a table, data is written to the table first. When the write succeeds, a success message is returned immediately. A separate thread then reads the newly written data from the table and writes it to the search index asynchronously.
Because data is synchronized from the table to the search index asynchronously, synchronization does not affect the write performance of Tablestore. The indexing latency is a matter of seconds and is within 10 seconds in most cases. You can view the indexing latency in the Tablestore console in real time.
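The write path described above can be sketched with a queue and a background thread. This is a simplified model under stated assumptions, not how Tablestore is implemented: the class `TableWithAsyncIndex` and its methods are hypothetical names used only for illustration.

```python
import queue
import threading

class TableWithAsyncIndex:
    """Toy model: writes return at once, the index is updated in the background."""

    def __init__(self):
        self.table = {}
        self.index = {}
        self._pending = queue.Queue()
        worker = threading.Thread(target=self._sync_loop, daemon=True)
        worker.start()

    def put_row(self, pk, columns):
        """Write to the table and return success without waiting for the index."""
        self.table[pk] = columns
        self._pending.put((pk, columns))
        return "OK"

    def _sync_loop(self):
        """Background thread: copy newly written rows into the index."""
        while True:
            pk, columns = self._pending.get()
            self.index[pk] = dict(columns)
            self._pending.task_done()

    def wait_for_sync(self):
        """Block until every pending write has reached the index."""
        self._pending.join()

store = TableWithAsyncIndex()
print(store.put_row("pk1", {"level": "error"}))  # OK
store.wait_for_sync()
print(store.index["pk1"])  # {'level': 'error'}
```

Between `put_row` returning and the background thread catching up, a row can exist in the table but not yet in the index. That gap corresponds to the indexing latency.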
- TTL
You cannot create a search index for a table that has the time to live (TTL) parameter set.
- max versions
You cannot create a search index in a table where you have specified the max versions parameter.
You can customize the timestamp whenever you write data to an attribute column that only allows a single version. If you first write a major version number and then a minor version number, the index of the major version number may be overwritten by the index of the minor version number.
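The overwrite risk can be illustrated with a toy model. It rests on two assumptions that the text above does not state explicitly: that "major" and "minor" version numbers correspond to larger and smaller custom timestamps, and that the index simply keeps whichever write it processed last. The helper functions and the column name `status` are hypothetical.

```python
def write_to_table(table, pk, column, value, timestamp):
    """Single-version column: keep only the cell with the largest timestamp."""
    current = table.get((pk, column))
    if current is None or timestamp >= current[1]:
        table[(pk, column)] = (value, timestamp)

def apply_to_index(index, pk, column, value):
    """Assumed index behavior: the most recently processed write wins."""
    index[(pk, column)] = value

table, index = {}, {}
# Write the larger version (timestamp 200) first, then the smaller one (100).
for value, timestamp in [("new", 200), ("old", 100)]:
    write_to_table(table, "pk1", "status", value, timestamp)
    apply_to_index(index, "pk1", "status", value)

print(table[("pk1", "status")][0])  # new
print(index[("pk1", "status")])     # old
```

In this model the table and the index end up disagreeing, which is the situation the warning above describes. Writing versions in increasing order avoids it.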
Tablestore provides the features of both databases and search engines, except for join operations, transactions, and relevance of search results. It offers the high data reliability of a database and supports the advanced queries of a search engine. Therefore, Tablestore can replace the common database + search engine architecture. If you do not need join operations, transactions, or relevance of search results, we recommend that you use the search index feature of Tablestore.