TairSearch is a full-text search data structure developed in-house for Tair. Its query syntax is similar to that of Elasticsearch (ES-LIKE).
Overview
TairSearch offers the following features:
Low latency and high performance: delivers millisecond-level latency for writes and full-text searches. For more information, see TairSearch Performance Whitepaper.
Incremental and partial updates: supports partial updates of indexes and incremental updates of documents, including adding, updating, removing, and auto-incrementing fields.
Flexible syntax: provides custom sorting order and supports JSON syntax that allows bool, match, term, and paging queries. This syntax is similar to Elasticsearch.
Aggregate query: supports terms, metrics, and filter aggregations. For more information, see Aggregations.
Auto-complete suggestion: supports fuzzy match with prefixes and automatic completion for search operations.
Custom analyzers: provides a rich variety of powerful analyzers, including built-in analyzers for major world languages such as English (Standard and Stop) and Chinese (Jieba and IK). It also supports custom analyzers with user-defined dictionaries and stop words. For more information, see Search analyzers.
Shard index query: allows you to use the TFT.MSEARCH command to search multiple shard indexes and return an aggregated result set.
Document compression: supports the storage of compressed documents to reduce memory usage. This feature is disabled by default.
Query caching: stores the latest query results in cache to improve the query efficiency of hot data.
Release notes
Redis 5.0-compatible DRAM-based instances
On March 11, 2022, TairSearch was released with Tair V1.7.27.
On May 24, 2022, Tair V1.8.5 was released to add support for the TairSearch aggregation feature.
On September 6, 2022, Tair V5.0.15 was released to add support for the TFT.MSEARCH command.
On January 13, 2023, Tair V5.0.25 was released to add support for analyzers.
On March 15, 2023, Tair V5.0.28 was released to add support for query caching, document compression, and the TFT.ANALYZER command.
On June 12, 2023, Tair V5.0.35 was released to add support for documents of the ARRAY data type and the similarity ranking algorithm Okapi BM25.
Redis 6.0-compatible DRAM-based instances
On February 7, 2023, Tair V6.2.4.1 was released to add support for TairSearch.
Tair V6.2.4.1 has all the features provided by Redis 5.0-compatible DRAM-based instances of Tair V5.0.25.
On March 14, 2023, Tair V6.2.5.0 was released to add support for query caching, document compression, and the TFT.ANALYZER command.
Tair V6.2.5.0 has all the features provided by Redis 5.0-compatible DRAM-based instances of Tair V5.0.28.
On June 12, 2023, Tair V6.2.7.3 was released to add support for documents of the ARRAY data type and the similarity ranking algorithm Okapi BM25.
Tair V6.2.7.3 has all the features provided by Redis 5.0-compatible DRAM-based instances of Tair V5.0.35.
On December 21, 2023, Tair V23.12.1.2 was released with support for the TFT.EXPLAINSCORE command.
Redis 7.0-compatible DRAM-based instances
On July 22, 2024, Tair V24.7.0.0 was released to add support for TairSearch.
Best practices
Prerequisites
The instance is a Tair memory-optimized instance, and the supported versions are as follows:
For a DRAM-based instance that is compatible with Redis 5.0, the minor version must be 1.7.27 or later.
For a DRAM-based instance that is compatible with Redis 6.0, the minor version must be 6.2.4.1 or later.
For a DRAM-based instance that is compatible with Redis 7.0, all minor versions are supported.
The latest minor version provides more features and higher stability. We recommend that you update the instance to the latest minor version. For more information, see Update the minor version of an instance. If your instance is a cluster instance or read/write splitting instance, we recommend that you update the proxy nodes in the instance to the latest minor version to ensure that all commands can be run as expected.
Precautions
The TairSearch data you want to manage is stored on a Tair instance.
To reduce memory usage, we recommend that you perform the following operations:
When you create indexes, set index to true for document fields to specify these fields as indexed fields. For other document fields, set index to false.
Specify an object containing arrays of includes and excludes patterns in the _source parameter to filter out document fields that you do not need and save fields that you need.
If you want to split a document into tokens, choose an appropriate analyzer to prevent unnecessary splitting and increased memory usage.
If a document is excessively large, use the document compression feature to automatically compress and decompress the document.
Do not add many documents to a single index. Add the documents to multiple indexes. We recommend that you keep the number of documents per index within 5 million to prevent data skew in cluster instances, balance read and write requests, and reduce the number of large keys and hotkeys.
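The memory-saving settings above can be combined when a mapping is assembled. The following Python sketch builds a mapping body in which only searchable fields are indexed and _source keeps a subset of fields. The build_mapping helper, the field names, and the placement of _source directly under mappings are assumptions made for illustration; check the exact layout against the TFT.CREATEINDEX reference before you use it.

```python
import json


def build_mapping(indexed, unindexed, keep, drop):
    """Return a JSON mapping string (hypothetical helper).

    indexed / unindexed: dicts that map a field name to a field type.
    keep / drop: field-name patterns for _source includes / excludes.
    """
    properties = {}
    for name, field_type in indexed.items():
        properties[name] = {"type": field_type, "index": True}
    for name, field_type in unindexed.items():
        properties[name] = {"type": field_type, "index": False}
    body = {
        "mappings": {
            # Assumed location of the _source filter.
            "_source": {"includes": keep, "excludes": drop},
            "properties": properties,
        }
    }
    return json.dumps(body, separators=(",", ":"))


# Illustrative field names, not taken from a real schema.
mapping = build_mapping(
    indexed={"title": "text"},
    unindexed={"raw_payload": "keyword"},
    keep=["title"],
    drop=["raw_payload"],
)
print(mapping)
```

The resulting string is intended to be passed as the mapping argument when the index is created.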
Command list
Table 1. Full-text search commands
Command | Syntax | Description |
TFT.CREATEINDEX | | Creates an index and a mapping for the index. The syntax used to create a mapping is similar to that used to create an explicit mapping in Elasticsearch. You must create an index before you can add documents to the index. |
TFT.UPDATEINDEX | | Adds properties fields to the specified index or modifies the index settings. |
TFT.GETINDEX | | Obtains the mapping content of an index. |
TFT.ADDDOC | | Adds a document to an index. You can specify a unique ID for the document in the index by using WITH_ID. If the document ID already exists, the existing document is overwritten. If you do not specify WITH_ID (default), a document ID is automatically generated. |
TFT.MADDDOC | | Adds multiple documents to an index. Each document must have a document ID that is specified by doc_id. If a document fails to be added due to an invalid format, none of the documents that the command involves are added to the index. |
TFT.UPDATEDOCFIELD | | Updates the document specified by doc_id in an index. If the document fields that you want to update are mapped and indexed fields, the fields must be of the same type as those used by the mapping. If the fields are not indexed fields, the fields to be updated can be of any data type. Note If the fields already exist, the document is updated. If the fields do not exist, the fields are added. If the document does not exist, the document is automatically created. In this case, the command is equivalent to TFT.ADDDOC. |
TFT.DELDOCFIELD | | Deletes the specified field from the document specified by doc_id in an index. If the field is an indexed field, the information of the field is also deleted from the index. Note If the specified field does not exist (for example, a field filtered out by _source), the operation fails. |
TFT.INCRLONGDOCFIELD | | Adds an increment to the specified field in the document specified by doc_id in an index. The increment can be a positive or negative integer. The data type of the field can only be LONG or INTEGER. Note If the document does not exist, the document is automatically created. In this case, the existing value of the field is 0, and the updated field value is obtained by adding the increment value to the existing field value. If the field does not exist (for example, it is filtered out by _source), the operation fails. |
TFT.INCRFLOATDOCFIELD | | Adds an increment to the specified field in the document specified by doc_id in an index. The increment can be a positive or negative floating-point number. The data type of the field can only be DOUBLE. Note If the document does not exist, the document is automatically created. In this case, the existing value of the field is 0, and the updated field value is obtained by adding the increment value to the existing field value. If the field does not exist (for example, it is filtered out by _source), the operation fails. |
TFT.GETDOC | | Obtains the content of the document specified by doc_id in an index. |
TFT.EXISTS | | Checks whether the document specified by doc_id exists in an index. |
TFT.DOCNUM | | Obtains the number of documents in an index. |
TFT.SCANDOCID | | Obtains the IDs of all documents in an index. |
TFT.DELDOC | | Deletes the documents specified by doc_id from an index. Multiple document IDs can be specified. |
TFT.DELALL | | Deletes all documents from an index but retains the index. |
TFT.ANALYZER | | Queries the tokenization effects of the specified analyzer. |
TFT.SEARCH | | Searches for documents in an index based on the query statement. The query syntax is similar to Elasticsearch syntax. |
TFT.MSEARCH | | Queries documents in multiple indexes that have mappings and settings set to the same values by using the query clause and gathers the results from these indexes. Then, the results are rated, sorted, aggregated, and returned. |
TFT.EXPLAINCOST | | Queries the execution duration of a query statement. The output includes the number of documents that are involved in the query and the amount of time consumed by each operation in the query. |
TFT.EXPLAINSCORE | | Queries the detailed score information of documents resulting from the execution of a query statement. You can use this command to gain insights into the process of how document scores are calculated. Then, you can optimize search queries to enhance the effectiveness of document retrieval. |
DEL | | Deletes one or more TairSearch keys. This is the native Redis DEL command. |
Table 2. Auto-complete commands
Command | Syntax | Description |
TFT.ADDSUG | | Adds one or more auto-complete texts and their weights to the specified index. |
TFT.DELSUG | | Deletes one or more auto-complete text entries from the specified index. |
TFT.SUGNUM | | Obtains the number of auto-complete texts in the specified index. |
TFT.GETSUG | | Obtains the auto-complete texts that can be matched based on the specified prefix. Texts are returned in descending order of weights. |
TFT.GETALLSUGS | | Obtains all auto-complete text entries in the specified index. |
The following list describes the conventions for the command syntax used in this topic:
Uppercase keyword: indicates the command keyword.
Italic text: indicates variables.
[options]: indicates that the enclosed parameters are optional. Parameters that are not enclosed by brackets must be specified.
A|B: indicates that the parameters separated by the vertical bars (|) are mutually exclusive. Only one of the parameters can be specified.
...: indicates that the parameter preceding this symbol can be repeatedly specified.
TFT.CREATEINDEX
Item | Description |
Syntax |
|
Command description | Creates an index and a mapping for the index. The syntax used to create a mapping is similar to that used to create an explicit mapping in Elasticsearch. You must create an index before you can add documents to the index. Note To prevent large keys from being generated, you can split large indexes into small indexes and devise load distribution rules to write data to different indexes. When you create indexes, make sure that these indexes have the mappings and settings parameters set to the same values. After these indexes are created, you can query them by using TFT.MSEARCH. |
Parameters |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
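The note in the command description recommends splitting a large index into several small indexes and devising a load distribution rule for writes. One possible rule, sketched below in Python, hashes the document ID to choose an index name. The pick_index helper, the name suffix scheme, and the shard count are hypothetical and are not part of TairSearch.

```python
import zlib


def pick_index(base_name, doc_id, shard_count):
    """Map a document ID to one of shard_count index names (hypothetical rule)."""
    # crc32 is stable across processes, unlike Python's built-in hash() for strings.
    shard = zlib.crc32(doc_id.encode("utf-8")) % shard_count
    return f"{base_name}_{shard}"


# Route 1,000 illustrative document IDs across 4 small indexes.
names = {pick_index("today_shares", f"doc{i}", 4) for i in range(1000)}
print(sorted(names))
```

Because the rule is deterministic, the same document ID always maps to the same index, so later reads and updates of that document can be routed the same way. All indexes produced by such a rule would need identical mappings and settings to remain queryable together with TFT.MSEARCH.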
TFT.UPDATEINDEX
Category | Description |
Syntax |
|
Command description | Adds properties fields to the specified index or modifies the index settings. |
Parameters |
Note For the syntax of mappings and settings, see TFT.CREATEINDEX. |
The response parameters |
|
Example | Sample command:
Sample output:
|
TFT.GETINDEX
Category | Description |
Syntax |
|
Command description | Obtains the mapping content of an index. |
Parameters |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.ADDDOC
Item | Description |
Syntax |
|
Command description | Adds a document to an index. You can specify a unique ID for the document in the index by using WITH_ID. If the document ID already exists, the existing document is overwritten. If you do not specify WITH_ID (default), a document ID is automatically generated. |
Parameters |
|
The response parameters |
|
Example | Sample command:
Sample output:
Sample arrays to be added:
|
TFT.MADDDOC
Category | Description |
Syntax |
|
Command description | Adds multiple documents to an index. Each document must have a document ID that is specified by doc_id. If a document fails to be added due to an invalid format, all documents that the command involves are not added to the index. |
Parameters |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
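Because one malformed document causes the whole TFT.MADDDOC batch to be rejected, it can help to check each document on the client side before the batch is sent. The Python sketch below only verifies that each body parses as a JSON object; the find_invalid_documents helper and the sample documents are illustrative, and the server may still reject documents for reasons this check does not cover, such as a type that conflicts with the mapping.

```python
import json


def find_invalid_documents(docs):
    """Return the doc_ids whose body is not a well-formed JSON object.

    docs: list of (doc_id, document_string) pairs.
    """
    invalid = []
    for doc_id, body in docs:
        try:
            parsed = json.loads(body)
        except json.JSONDecodeError:
            invalid.append(doc_id)
            continue
        if not isinstance(parsed, dict):
            invalid.append(doc_id)
    return invalid


batch = [
    ("id1", '{"shares_name":"XAX","purchase_price":101.1}'),
    ("id2", '{"shares_name":"YBY","purchase_price":}'),  # malformed on purpose
    ("id3", '["not", "an", "object"]'),  # valid JSON, but not an object
]
bad = find_invalid_documents(batch)
print(bad)  # ['id2', 'id3']
```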
TFT.UPDATEDOCFIELD
Category | Description |
Syntax |
|
Command description | Updates the document specified by doc_id in an index. If the document fields that you want to update are mapped and indexed fields, the fields must be of the same type as those used by the mapping. If the fields are not indexed fields, the fields to be updated can be of any data type. Note If the fields already exist, the document is updated. If the fields do not exist, the fields are added. If the document does not exist, the document is automatically created. In this case, the command is equivalent to TFT.ADDDOC. |
Options |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.DELDOCFIELD
Category | Description |
Syntax |
|
Command description | Deletes the specified field from the document specified by doc_id in an index. If the field is an indexed field, the information of the field is also deleted from the index. Note If the specified field does not exist (for example, a field filtered out by _source), the operation fails. |
Options |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.INCRLONGDOCFIELD
Category | Description |
Syntax |
|
Command description | Adds an increment to the specified field in the document specified by doc_id in an index. The increment can be a positive or negative integer. The data type of the field can only be LONG or INTEGER. Note If the document does not exist, the document is automatically created. In this case, the existing value of the field is 0, and the updated field value is obtained by adding the increment value to the existing field value. If the field does not exist (for example, it is filtered out by _source), the operation fails. |
Options |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
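The increment rules in the command description can be summarized with a small in-memory model. The Python sketch below is not the TairSearch implementation and does not talk to a server. It mirrors only the documented behavior, using a plain dict in place of an index. The incr_long_field function, the FieldMissingError exception, and the choice to return the new value are assumptions for illustration.

```python
class FieldMissingError(Exception):
    """Raised when the target field is absent from an existing document."""


def incr_long_field(index, doc_id, field, increment):
    """Toy model of the documented increment rules.

    index: dict that maps doc_id to a dict of fields (stands in for a real index).
    """
    if isinstance(increment, bool) or not isinstance(increment, int):
        raise TypeError("increment must be a positive or negative integer")
    doc = index.get(doc_id)
    if doc is None:
        # Missing document: it is created and the field starts from 0.
        index[doc_id] = {field: 0 + increment}
        return index[doc_id][field]
    if field not in doc:
        # Existing document without the field: the operation fails.
        raise FieldMissingError(field)
    doc[field] += increment
    return doc[field]


index = {}
first = incr_long_field(index, "doc1", "purchase_count", 100)   # document is created
second = incr_long_field(index, "doc1", "purchase_count", -30)  # negative increment
print(first, second)  # 100 70
```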
TFT.INCRFLOATDOCFIELD
Category | Description |
Syntax |
|
Command description | Adds an increment to the specified field in the document specified by doc_id in an index. The increment can be a positive or negative floating-point number. The data type of the field can only be DOUBLE. Note If the document does not exist, the document is automatically created. In this case, the existing value of the field is 0, and the updated field value is obtained by adding the increment value to the existing field value. If the field does not exist (for example, it is filtered out by _source), the operation fails. |
Options |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.GETDOC
Category | Description |
Syntax |
|
Command description | Obtains the content of the document specified by doc_id in an index. |
Options |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.EXISTS
Category | Description |
Syntax |
|
Command description | Checks whether the document specified by doc_id exists in an index. |
Parameters |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.DOCNUM
Category | Description |
Syntax |
|
Command description | Obtains the number of documents in an index. |
Parameters |
|
Return value |
|
Example | Sample command:
Sample output:
|
TFT.SCANDOCID
Category | Description |
Syntax |
|
Command description | Obtains the IDs of all documents in an index. |
Parameters |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.DELDOC
Item | Description |
Syntax |
|
Command description | Deletes the documents specified by doc_id from an index. Multiple document IDs can be specified. |
Parameters |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.DELALL
Category | Description |
Syntax |
|
Command description | Deletes all documents from an index but retains the index. |
Parameters |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.ANALYZER
Category | Description |
Syntax |
|
Command description | Queries the tokenization effects of the specified analyzer. |
Options |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.SEARCH
Category | Description |
Syntax |
|
Command description | Searches for documents in an index based on the query statement. The query syntax is similar to Elasticsearch syntax. |
Options | index is the name of the index that you want to manage by running this command; query is the Query DSL statement that is similar to the syntax used in Elasticsearch. The following fields are supported:
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.MSEARCH
Item | Description |
Syntax |
|
Command description | Queries documents in multiple indexes that have mappings and settings set to the same values by using the query clause and gathers the results from these indexes. Then, the results are rated, sorted, aggregated, and returned. Note The output of the TFT.MSEARCH command is a result of rating, sorting, and aggregating query results from these indexes. The output is different from results generated by directly rating, sorting, and aggregating datasets in multiple indexes. TFT.MSEARCH policy:
|
Parameters |
Note Unlike the query statement of the TFT.SEARCH command, the query statement of the TFT.MSEARCH command does not support the from parameter, but supports paged query by using the size, reply_with_keys_cursor, and keys_cursor parameters. For more information about the syntax of other parameters, see TFT.SEARCH. |
The response parameters. |
|
Example | Run the following commands in advance:
Sample command:
Sample response:
Sample command for querying the second page:
Sample response:
|
TFT.EXPLAINCOST
Item | Description |
Syntax |
|
Command description | Queries the execution duration of a query statement. The output includes the number of documents that are involved in the query and the amount of time consumed by each operation in the query. |
Options |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.EXPLAINSCORE
Category | Description |
Syntax |
|
Command description | Queries the detailed score information of documents resulting from the execution of a query statement. You can use this command to gain insights into the process of how document scores are calculated. Then, you can optimize search queries to enhance the effectiveness of document retrieval. This command is available only for DRAM-based instances that are compatible with Redis 6.0. |
Parameter |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.ADDSUG
Category | Description |
Syntax |
|
Command description | Adds one or more auto-complete texts and their weights to the specified index. |
Parameters |
|
The response parameters |
|
Example | Sample command:
Sample output:
|
TFT.DELSUG
Category | Description |
Syntax |
|
Command description | Deletes one or more auto-complete text entries from the specified index. |
Parameters |
|
Output |
|
Example | Sample command:
Sample output:
|
TFT.SUGNUM
Category | Description |
Syntax |
|
Command description | Obtains the number of auto-complete texts in the specified index. |
Parameters |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.GETSUG
Category | Description |
Syntax |
|
Command description | Obtains the auto-complete texts that can be matched based on the specified prefix. Texts are returned in descending order of weights. |
Parameters |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
TFT.GETALLSUGS
Category | Description |
Syntax |
|
Command description | Obtains all auto-complete text entries in the specified index. |
Parameters |
|
The response parameters. |
|
Example | Sample command:
Sample output:
|
Aggregations
You can add an aggs (or aggregations) clause to a TFT.SEARCH request to aggregate the results of the query clause.
Usage
In most cases, in the aggs clause, you must specify a custom aggregation name, an aggregation type, and an aggregation field (field). Only fields of numeric and keyword types are supported. For example:
TFT.SEARCH shares '{"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Sum":{"sum":{"field":"purchase_price"}}}}'
# Specify Jay_Sum as the aggregation name, sum as the aggregation type, and purchase_price as the aggregation field.
The returned result contains the query results from query and the aggregation results from aggs:
{"hits":{"hits":[{"_id":"16581351808123930","_index":"today_shares0718","_score":1.0,"_source":{"shares_name":"XAX","logictime":14300210,"purchase_type":1,"purchase_price":101.1,"purchase_count":100,"investor":"Jay"}},{"_id":"16581351809626430","_index":"today_shares0718","_score":1.0,"_source":{"shares_name":"XAX","logictime":14300310,"purchase_type":1,"purchase_price":111.1,"purchase_count":100,"investor":"Jay"}}],"max_score":1.0,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Sum":{"value":212.2}}}
You can add "size":0 to the query statement so that only the aggs results are returned.
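When queries are generated by an application, the aggs clause can be assembled from its three parts instead of being written by hand. The following Python sketch builds the query body for a single aggregation. The build_agg_query helper is hypothetical, and the sketch stops at producing the argument list, which you could then hand to a client's generic command method (for example, execute_command in redis-py).

```python
import json


def build_agg_query(query, agg_name, agg_type, field, hits=0):
    """Build a query body with one aggregation (hypothetical helper)."""
    body = {
        "size": hits,  # 0 asks for the aggregation results only
        "query": query,
        "aggs": {agg_name: {agg_type: {"field": field}}},
    }
    return json.dumps(body, separators=(",", ":"))


body = build_agg_query({"term": {"investor": "Jay"}}, "Jay_Sum", "sum", "purchase_price")
args = ["TFT.SEARCH", "shares", body]
print(args)
```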
Aggregation types for aggs
The aggs parameter supports metrics, terms, and filter aggregations.
Category | Description |
Metrics aggregation | Performs numerical calculations or statistics on numeric fields, such as integer and double fields. Nested sub-aggregations are not supported. The following metrics are supported:
Note Except for value_count, all metrics support only numeric fields. Output: DOUBLE-type values calculated from the specified fields. |
Terms Aggregation | Counts the documents for each distinct value of a field. Only fields of the keyword type are supported. Nested aggregations are supported. Parameter description:
Example:
Output: a JSON object keyed by the aggregation name. The object contains a buckets array. Each bucket contains a key (a value of the aggregation field) and a doc_count (the number of documents that have that value). Example:
|
Filter Aggregation | You can input a query statement in filter to further filter the query results. Nested sub-aggregations are supported. Output: the number of documents (doc_count) that match the filter conditions. |
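The bucket structure described for terms aggregations is straightforward to consume on the client side. The Python sketch below parses a response and turns the buckets into a dictionary. The bucket_counts helper is hypothetical, and the sample response is the expected output of the terms example in this topic.

```python
import json


def bucket_counts(response_text, agg_name):
    """Return {key: doc_count} for one terms aggregation in a response."""
    response = json.loads(response_text)
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


sample = (
    '{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":3}},'
    '"aggregations":{"Per_Investor_Freq":{"buckets":[{"key":"Jay","doc_count":2}]}}}'
)
counts = bucket_counts(sample, "Per_Investor_Freq")
print(counts)  # {'Jay': 2}
```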
Aggregation examples
Create an index.
TFT.CREATEINDEX today_shares '{"mappings":{"properties":{"shares_name":{"type":"keyword"},"logictime":{"type":"long"},"purchase_type":{"type":"integer"},"purchase_price":{"type":"double"},"purchase_count":{"type":"long"},"investor":{"type":"keyword"}}}}'
# Create an index that represents the stock trading volume of today.
# shares_name: the name of each stock.
# logictime: the time when the deal is complete.
# purchase_type: the purchase type.
# purchase_price: the purchase price.
# purchase_count: the number of purchased stock shares.
# investor: the investor ID.
Expected output:
OK
Add document data.
Run the following commands:
TFT.ADDDOC today_shares '{"shares_name":"XAX","logictime":14300210, "purchase_type":1,"purchase_price":101.1, "purchase_count":100,"investor":"Jay"}'
TFT.ADDDOC today_shares '{"shares_name":"XAX","logictime":14300310, "purchase_type":1,"purchase_price":111.1, "purchase_count":100,"investor":"Jay"}'
TFT.ADDDOC today_shares '{"shares_name":"YBY","logictime":14300410, "purchase_type":1,"purchase_price":11.1, "purchase_count":100,"investor":"Mila"}'
Expected output:
OK
Perform a query.
Sample queries:
sum
# Query the total amount that Jay spent to purchase the stocks.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Sum":{"sum":{"field":"purchase_price"}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Sum":{"value":212.2}}}
max
# Query the largest amount that Jay spent to purchase a stock.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Max":{"max":{"field":"purchase_price"}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Max":{"value":111.1}}}
avg
# Query the average amount that Jay spent to purchase different stocks.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Avg":{"avg":{"field":"purchase_price"}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Avg":{"value":106.1}}}
std_deviation
# Query the standard deviation of the amount that Jay spent to purchase stocks.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Std_Deviation":{"std_deviation":{"field":"purchase_price"}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Std_Deviation":{"value":5.0}}}
extended_stats
# Query the statistics of the amount that Jay spent to purchase stocks.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"investor":"Jay"}},"aggs":{"Jay_Extended_Stats":{"extended_stats":{"field":"purchase_price"}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":2}},"aggregations":{"Jay_Extended_Stats":{"count":2,"sum":212.2,"max":111.1,"min":101.1,"avg":106.1,"sum_of_squares":10221.21,"variance":25.0,"std_deviation":5.0}}}
terms
# Query the investors that have completed at least two transactions.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"purchase_type":1}},"aggs":{"Per_Investor_Freq":{"terms":{"field":"investor","min_doc_count":2,"order": {"_key":"desc"}}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":3}},"aggregations":{"Per_Investor_Freq":{"buckets":[{"key":"Jay","doc_count":2}]}}}
Nested terms aggregation
# Query the number of transactions and the average purchase price for each stock, excluding the XAX stock.
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"purchase_type":1}},"aggs":{"Per_Investor_Freq":{"terms":{"field":"shares_name","include":"[A-Z]+","exclude":["XAX"]},"aggs":{"Price_Avg":{"avg":{"field":"purchase_price"}}}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":3}},"aggregations":{"Per_Investor_Freq":{"buckets":[{"key":"YBY","doc_count":1,"Price_Avg":{"value":11.1}}]}}}
Nested filter aggregation
# Query the number of stocks purchased by Jay and the overall statistics (metric values).
TFT.SEARCH today_shares '{"size":0,"query":{"term":{"purchase_type":1}}, "aggs":{"Jay_BuyIn_Filter": {"filter": {"term":{"investor": "Jay"}},"aggs":{"Jay_BuyIn_Quatation":{"extended_stats":{"field":"purchase_price"}}}}}}'
# Expected output:
{"hits":{"hits":[],"max_score":null,"total":{"relation":"eq","value":3}},"aggregations":{"Jay_BuyIn_Filter":{"doc_count":2,"Jay_BuyIn_Quatation":{"count":2,"sum":212.2,"max":111.1,"min":101.1,"avg":106.1,"sum_of_squares":10221.21,"variance":25.0,"std_deviation":5.0}}}}
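Several of the metric values in the sample outputs can be cross-checked by hand from Jay's two purchase prices, 101.1 and 111.1. The Python sketch below recomputes a subset of the extended_stats fields. It assumes that variance and std_deviation are the population forms (dividing by the number of values), which is consistent with the 25.0 and 5.0 shown in the sample outputs. The basic_stats helper is hypothetical.

```python
import math


def basic_stats(values):
    """Recompute a subset of the extended_stats fields for a list of numbers."""
    count = len(values)
    total = sum(values)
    avg = total / count
    # Population variance (assumed): divide by count, not count - 1.
    variance = sum((v - avg) ** 2 for v in values) / count
    return {
        "count": count,
        "sum": total,
        "max": max(values),
        "min": min(values),
        "avg": avg,
        "variance": variance,
        "std_deviation": math.sqrt(variance),
    }


stats = basic_stats([101.1, 111.1])
# Round for display only, because binary floating point is not exact.
print({name: round(value, 6) for name, value in stats.items()})
```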