This topic describes the core features of search indexes and their equivalent SQL statements.

Core features

  • Queries based on non-primary key columns

    Data tables support only queries that are based on primary key columns or prefixes of primary key columns. Therefore, queries by using data tables cannot meet requirements in some scenarios. If you want to query a row based on a column, you can include the column in a search index and perform queries based on non-primary key columns.

  • Boolean query

    Boolean query is suitable for order scenarios. In order scenarios, a table may contain dozens of fields. It may be difficult to determine how to combine the fields required for queries when you create a table. Even if you have determined how to combine the required fields, hundreds of combinations may be available. If you use a relational database service, you may have to create hundreds of indexes. In addition, if a combination is not specified in advance, you cannot query the required data.

    However, you can use Tablestore to create a search index that includes the required field names, which can be combined in a query. Search indexes also support logical operators such as AND, OR, and NOT.

  • Geo query

    As mobile devices gain popularity, geographical location data becomes increasingly important. This data is used in most apps such as social media apps, food delivery apps, sports apps, and Internet of Vehicles (IoV) apps. These apps must support query features for the geographical location data that these apps provide.

    Search indexes support queries based on the following geographical location data:
    • Near: queries points within a specified radius based on a central point, such as the People Nearby feature in WeChat.
    • Within: queries points within a specified rectangular or polygonal area.

    Tablestore allows you to use these features to query geographical location data without using other database services or search engines.

  • Full-text search

    Search indexes can tokenize data to perform full-text search. However, unlike search engines, Tablestore returns only BM25-based results. Tablestore does not return custom relevant results in response to a query. If you need to query relevant results, we recommend that you use search engines.

    Search indexes support the following tokenization methods: single-word tokenization, delimiter tokenization, minimum semantic unit tokenization, maximum semantic unit tokenization, and fuzzy tokenization. For more information, see Tokenization.

  • Fuzzy query

    Search indexes support queries based on wildcards. This feature is similar to the LIKE operator in relational databases. You can specify characters and wildcards such as question marks (?) or asterisks (*) to implement the similar feature to the LIKE operator.

  • Prefix query

    Search indexes support prefix query. This feature is applicable to any natural languages. For example, in a query based on prefix "apple", results such as "apple6s" and "applexr" may be returned.

  • Nested query

    In addition to a flat structure, online data such as labeled pictures has some complex and multilayered structures. For example, a database stores a large number of pictures, and each picture has multiple elements such as houses, cars, and people. Each element in a picture has a unique weight score. The score is evaluated based on the size and position of an element in a picture. Therefore, each picture has multiple labels. Each label has a name and a weight score. You can use nested queries to query data based on the data labels.

    The following sample code provides an example of JSON data with labels:
    {
     "tags": [
       {
          "name": "car",
          "score": 0.78
       },
       {
          "name": "tree",
          "score": 0.24
       }
     ]
    }

    You can use nested queries to store and query multilayered data. This feature makes it easy to model complex data.

  • Collapse (distinct)

    Search indexes support the collapse (distinct) feature. This feature allows you to specify the highest frequency of occurrence of an attribute value to achieve high cardinality. For example, when you search for a laptop on an e-commerce platform, the first page may display only products of a single brand, which is not user-friendly. You can use the collapse (distinct) feature of Tablestore to resolve this issue.

  • Sorting

    In Tablestore, table data is sorted in alphabetical order of their primary key. To sort data based on other fields, you must use the sorting feature of search indexes. Tablestore supports a variety of sorting methods, including ascending, descending, single-field, and multi-field sorting. By default, returned results are sorted by primary key. Data in a search index is globally sorted.

  • Total number of rows

    You can specify the number of rows that Tablestore returns for the current request when you use a search index for a query. If you do not specify query conditions for the search index, Tablestore returns the total number of rows for which you have created indexes. If you stop writing new data to a table and create indexes for all data in the table, Tablestore returns the total number of rows in the table. This feature applies to data verification and data-driven operations.

  • Aggregation

    Search indexes support the aggregation feature to allow you to obtain the minimum value, maximum value, sum, average, count and distinct count of rows, and percentile statistics. You can also perform aggregation operations to group results or perform queries by histogram. This helps meet the basic aggregation requirements in lightweight analysis scenarios.

SQL

The following table describes the mappings between SQL statements and the features of search indexes that correspond to the SQL statements.
SQL Feature of search indexes
Show API operation: DescribeSearchIndex
Select Parameter: ColumnsToGet
From Parameter: index name
Important Single-column indexes are supported. Multi-column indexes are not supported.
Where Query: a variety of query methods such as term query
Order by Parameter: sort
Limit Parameter: limit
Delete API operations: DeleteRow
Like Query: WildcardQuery
And Parameter: operator = and
Or Parameter: operator = or
Not Query: BoolQuery(mustNotQueries)
Between Query: RangeQuery
Null Query: ExistsQuery
In Query: TermsQuery
Min Aggregation: min
Max Aggregation: max
Avg Aggregation: avg
Count Aggregation: count
Count(distinct) Aggregation: distinctCount
Sum Aggregation: sum
Group By GroupBy