Search indexes use inverted indexes and column stores to support complex queries over large amounts of data. After you create a search index for a data table, you can use the search index to query the data in the table.
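
To make the idea of an inverted index concrete, the following is a minimal in-memory sketch. It is not Tablestore code: the whitespace tokenizer, the `build_inverted_index` helper, and the sample rows are assumptions made for illustration only.

```python
from collections import defaultdict


def build_inverted_index(rows):
    """Map each token to the set of row IDs whose text contains that token."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        # Assumed tokenizer: lowercase, then split on whitespace.
        for token in text.lower().split():
            index[token].add(row_id)
    return index


# Hypothetical sample data: row ID -> text column value.
rows = {
    1: "fast search index",
    2: "column store for analytics",
    3: "search over column data",
}
index = build_inverted_index(rows)

# A lookup goes from token to rows, so no full scan of the rows is needed.
search_hits = sorted(index["search"])
column_hits = sorted(index["column"])
both_hits = sorted(index["search"] & index["column"])
print(search_hits, column_hits, both_hits)
```

The set intersection in the last lookup shows why combining conditions is cheap once each token already points to its rows.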

Prerequisites

  • A data table is created, and the max versions parameter of the data table is set to 1.
  • A Tablestore client is initialized. For more information, see Initialization.

API operations

| Category | Operation | Description |
| --- | --- | --- |
| Management operations | CreateSearchIndex | Creates a search index. |
| Management operations | DescribeSearchIndex | Queries details about a search index. |
| Management operations | ListSearchIndex | Queries the list of search indexes. |
| Management operations | DeleteSearchIndex | Deletes a search index. |
| Query operations | Search | Implements all query features and analysis features such as sorting and aggregation. The results are returned in a specific order. |
| Query operations | ParallelScan | Implements all query features. You cannot call this operation to sort or aggregate data. Data that meets the query conditions is returned quickly. When you call this operation, you must call the ComputeSplits operation to query the maximum number of parallel scan tasks for a single ParallelScan request. |
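
The relationship between ComputeSplits and ParallelScan can be pictured with a local model. The sketch below does not call the Tablestore SDK: `compute_splits`, `scan_split`, the split size of 4, and the sample rows are hypothetical stand-ins that only mimic the division of work among parallel tasks.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample rows; every third row is marked "failed".
ROWS = [{"id": i, "status": "ok" if i % 3 else "failed"} for i in range(12)]


def compute_splits(rows, target_split_size=4):
    """Stand-in for ComputeSplits: report how many parallel tasks a scan may use."""
    return max(1, -(-len(rows) // target_split_size))  # ceiling division


def scan_split(rows, parallel_id, max_parallel, predicate):
    """Stand-in for one parallel scan task: filter one slice, with no sorting."""
    return [row for row in rows[parallel_id::max_parallel] if predicate(row)]


def is_failed(row):
    return row["status"] == "failed"


max_parallel = compute_splits(ROWS)
with ThreadPoolExecutor(max_workers=max_parallel) as pool:
    futures = [
        pool.submit(scan_split, ROWS, parallel_id, max_parallel, is_failed)
        for parallel_id in range(max_parallel)
    ]
    # Results are concatenated task by task; no global order is guaranteed.
    matched = [row for future in futures for row in future.result()]

matched_ids = sorted(row["id"] for row in matched)
print(max_parallel, matched_ids)
```

Each task sees a disjoint slice of the data, which is why this style of scan can return all matching rows quickly but cannot sort or aggregate across tasks.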

Procedure

  1. Create a search index. For more information, see Create search indexes.
  2. Call the Search or ParallelScan operation to query data. The following table describes the query methods that you can use with both operations.
    | Query method | Query type | Description |
    | --- | --- | --- |
    | Match all query | MatchAllQuery | This query is used to query the total number of rows in a table or randomly retrieve multiple rows from a table. |
    | Match query | MatchQuery | This query uses approximate matches to retrieve query results. The keyword that you use for a query and the column values are tokenized based on the analyzer that you specified. Then, a match query is performed based on the tokens. The OR logical operator is used to relate tokens. If the number of tokens in a row that match the tokens in the tokenized keyword reaches the minimum value that you specified, the row meets the query conditions. |
    | Match phrase query | MatchPhraseQuery | This query is similar to match query. A row meets the query conditions only when the order and position of the tokens in the row match the order and position of the tokens that are contained in the tokenized keyword. |
    | Term query | TermQuery | This query uses full and exact matches to retrieve query results, which is similar to string matching. If a TEXT column is queried and one of the tokens in a row exactly matches the keyword, the row meets the query conditions. |
    | Terms query | TermsQuery | This query is similar to term query, but you can specify multiple keywords at the same time. If one of the tokens in a row matches one of the keywords, the row meets the query conditions. |
    | Prefix query | PrefixQuery | This query retrieves data that contains the specified prefix. If a TEXT column is queried and one of the tokens in a row contains the specified prefix, the row meets the query conditions. |
    | Range query | RangeQuery | This query retrieves data within a specified range. If a TEXT column is queried and one of the tokens in a row is within the specified range, the row meets the query conditions. |
    | Wildcard query | WildcardQuery | This query retrieves data that matches a string that contains one or more wildcard characters. You can use the asterisk (*) and question mark (?) wildcard characters in a string. The asterisk (*) matches a string of any length at the beginning, in the middle, or at the end of a search term. The question mark (?) matches a single character in a specific position. |
    | Boolean query | BoolQuery | You can use Boolean query to query rows based on a subquery or a combination of subqueries. Tablestore returns the rows that match the subquery or the combination of subqueries. Subquery conditions can be combined by using logical operators, such as AND, NOT, and OR. |
    | Nested query | NestedQuery | You can use nested query to query the data of nested fields. |
    | Geo-distance query | GeoDistanceQuery | You can specify a circular geographical area that consists of a central point and radius as a query condition. Tablestore returns the rows in which the value of the specified column falls within the circular geographical area. |
    | Geo-bounding box query | GeoBoundingBoxQuery | You can specify a rectangular geographical area as a query condition. Tablestore returns the rows in which the value of the specified column falls within the rectangular geographical area. |
    | Geo-polygon query | GeoPolygonQuery | You can specify a polygonal geographical area as a query condition. Tablestore returns the rows in which the value of the specified column falls within the polygonal geographical area. |
    | Exists query | ExistQuery | Exists query is also called NULL query or NULL-value query. This query is used for sparse data to determine whether a column of a row exists. For example, you can query the rows in which the value of the address column is not empty. A column is considered not to exist in a row if the row does not contain the column or if the value of the column is an empty array ("[]"). |
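
The differences among match query, match phrase query, term query, prefix query, and wildcard query are easier to see side by side. The following sketch is a toy in-memory model, not the Tablestore SDK: the whitespace tokenizer, the function names, and the `minimum_should_match` parameter name are assumptions chosen for illustration, and the wildcard function is applied to the whole column value.

```python
import re


def tokenize(text):
    """Assumed analyzer: lowercase the text and split it on whitespace."""
    return text.lower().split()


def match_query(value, keyword, minimum_should_match=1):
    """Tokens are related by OR; enough keyword tokens must appear in the value."""
    value_tokens = set(tokenize(value))
    hits = sum(1 for token in set(tokenize(keyword)) if token in value_tokens)
    return hits >= minimum_should_match


def match_phrase_query(value, keyword):
    """Keyword tokens must appear consecutively and in the same order."""
    v, k = tokenize(value), tokenize(keyword)
    return any(v[i:i + len(k)] == k for i in range(len(v) - len(k) + 1))


def term_query(value, keyword):
    """One token of the value must equal the (lowercase) keyword exactly."""
    return keyword in tokenize(value)


def prefix_query(value, prefix):
    """One token of the value must start with the (lowercase) prefix."""
    return any(token.startswith(prefix) for token in tokenize(value))


def wildcard_query(value, pattern):
    """'*' matches any run of characters (possibly empty here), '?' exactly one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value) is not None


text = "Tablestore search index supports full text search"
print(match_query(text, "search engine"))                          # one token is enough
print(match_query(text, "search engine", minimum_should_match=2))  # both tokens required
print(match_phrase_query(text, "search index"))
print(match_phrase_query(text, "index search"))
print(term_query(text, "index"), term_query(text, "ind"))
print(prefix_query(text, "table"))
print(wildcard_query("table_store_01", "table*0?"))
```

In this model, a phrase query is stricter than a match query because token order and adjacency both count, while a term query never tokenizes the keyword itself.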

Related operations

  • If you want to analyze data in a table, you can call the Search operation and use the aggregation feature, for example, to query the maximum value, the sum of the values, or the number of rows. For more information, see Aggregation.
  • If you do not need to sort the rows that meet the query conditions and you want to quickly obtain all rows that meet the query conditions, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.
  • When you call the Search operation to query data, you can sort or paginate the rows that meet the query conditions. For more information, see Sorting and pagination.
  • When you call the Search operation to query data, you can use the collapse (distinct) feature to collapse the result set based on a specified column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).
  • If you want to perform full-text search, you must tokenize a field that supports tokenization and select a suitable query method to query data. For more information, see Tokenization.
  • If you want to store and query data in multiple logical relationships, you can store the data as a nested field and use nested query to query the data. For more information, see Nested and Nested query.
  • If you want the system to automatically delete data from a search index after the data has been retained for longer than a specified duration, you can use the time to live (TTL) feature of the search index. For more information, see TTL of search indexes.
  • If you want to add, update, or delete the indexed columns of a search index, you can dynamically modify the schema of the search index. For more information, see Dynamically modify schemas.
  • If you want to query new fields or data of new field types without modifying the storage schema and the data in data tables, you can use the virtual column feature. For more information, see Virtual columns.
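
The aggregation, sorting and pagination, and collapse (distinct) features listed above can be approximated locally to show what each one returns. The following sketch runs on an in-memory list and does not use the Tablestore SDK: the sample rows, the `collapse` helper, and the `offset` and `limit` names are assumptions for illustration.

```python
# Hypothetical rows that already meet the query conditions.
rows = [
    {"id": 1, "city": "Hangzhou", "price": 30},
    {"id": 2, "city": "Beijing", "price": 50},
    {"id": 3, "city": "Hangzhou", "price": 20},
    {"id": 4, "city": "Shanghai", "price": 40},
]

# Aggregation: maximum value, sum of the values, and number of rows.
max_price = max(row["price"] for row in rows)
total_price = sum(row["price"] for row in rows)
row_count = len(rows)

# Sorting and pagination: order by price (descending), then take one page.
ordered = sorted(rows, key=lambda row: row["price"], reverse=True)
offset, limit = 2, 2
page_ids = [row["id"] for row in ordered[offset:offset + limit]]


def collapse(sorted_rows, column):
    """Keep only the first row seen for each distinct value of the column."""
    seen, kept = set(), []
    for row in sorted_rows:
        if row[column] not in seen:
            seen.add(row[column])
            kept.append(row)
    return kept


collapsed_ids = [row["id"] for row in collapse(ordered, "city")]
print(max_price, total_price, row_count, page_ids, collapsed_ids)
```

Because collapse keeps the first row per value, its result depends on the sort order that is applied beforehand.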