A prefix query (PrefixQuery) finds rows in a search index where the value of a field starts with a specified string.
Overview
A PrefixQuery matches column values that start with a specified prefix. The query behavior depends on the data type of the column:
- Keyword: A basic string data type. As the data volume increases, the query performance degrades. This type is suitable only for small datasets.
- FuzzyKeyword: A data type optimized for fuzzy searches such as prefix queries. The query performance remains stable regardless of the data volume. This type is recommended for most prefix query scenarios.
- Text: Column values are tokenized before being indexed. A row matches if at least one resulting token starts with the specified prefix. Due to the uncertainty of tokenization, query results can be unpredictable. This data type is supported for compatibility purposes only and should be used with caution.
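The difference between whole-value matching and token-based matching can be sketched in a few lines of Python. This is only an illustration of the rule described above, not Tablestore code: the whitespace tokenizer and the function names are assumptions, and the tokenizer actually applied to a Text column may split values differently.

```python
def tokenize(value):
    """Hypothetical tokenizer: lowercase the value and split on whitespace.
    The real tokenizer of a Text column may behave differently."""
    return value.lower().split()


def keyword_prefix_match(value, prefix):
    """Keyword-style matching: the whole column value must start with the prefix."""
    return value.startswith(prefix)


def text_prefix_match(value, prefix):
    """Text-style matching: at least one token must start with the prefix."""
    return any(token.startswith(prefix) for token in tokenize(value))


value = "West Lake Hangzhou"

print(keyword_prefix_match(value, "West Lake"))  # True: the whole value starts with it
print(keyword_prefix_match(value, "hang"))       # False: the value starts with "West"
print(text_prefix_match(value, "hang"))          # True: the token "hangzhou" starts with it
print(text_prefix_match(value, "west lake"))     # False: no single token contains a space
```

The last two calls show why results on Text columns can be surprising: a prefix may match in the middle of the original value, while a prefix that spans two tokens matches nothing.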
How to choose a data type
The following table summarizes the suitability of each data type for prefix queries.
| Type | Performance | Recommendation |
| --- | --- | --- |
| Keyword | Degrades as data volume increases | For small datasets only |
| FuzzyKeyword | Stable regardless of data volume | Recommended for most scenarios |
| Text | Unpredictable results due to tokenization | Not recommended |
Prefix matching examples
Assume a column contains the following values: hangzhou, beijing, shanghai, and harbin.
- A prefix of hang matches hangzhou but does not match beijing, shanghai, or harbin.
- A prefix of ha matches both hangzhou and harbin.
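The same two cases can be reproduced locally with plain Python string operations. This is a minimal sketch of the matching rule only; the function name prefix_match is an assumption, and the code does not call Tablestore.

```python
def prefix_match(values, prefix):
    """Return the values that start with the given prefix, in input order."""
    return [v for v in values if v.startswith(prefix)]


cities = ["hangzhou", "beijing", "shanghai", "harbin"]

print(prefix_match(cities, "hang"))  # ['hangzhou']
print(prefix_match(cities, "ha"))    # ['hangzhou', 'harbin']
```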
APIs
You can perform a prefix query using the Search or ParallelScan API. The query type is PrefixQuery.
Parameters
| Parameter | Description |
| --- | --- |
| query | The query type. Set this to PrefixQuery. |
| fieldName | The name of the target column. |
| prefix | The prefix string to match against column values. For Text columns, column values are tokenized before matching. A row matches if at least one token starts with the specified prefix. |
| getTotalCount | Specifies whether to return the total number of matching rows. The default value is Setting this parameter to |
| weight | The query weight must be a positive floating-point number. In full-text search scenarios, this parameter adjusts the column's contribution to the BM25 relevance score. A higher value increases the column's influence on the ranking. This parameter does not affect the returned result set, only the BM25 score of each row in the results. |
| tableName | The name of the data table. |
| indexName | The name of the search index. |
| columnsToGet | Specifies the columns to return for each matching row. Configure this using the By default, Set |
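To see how these parameters fit together, the sketch below collects them in a plain Python dataclass and checks the one constraint the table states explicitly, that weight must be positive. The class PrefixQueryRequest, its snake_case field names, and the sample table, index, and column names are all hypothetical; none of them is a Tablestore SDK type. get_total_count is left as a required field rather than assuming a default, and treating an unset weight as None is likewise an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PrefixQueryRequest:
    """Hypothetical container mirroring the parameter table; not an SDK class."""
    table_name: str           # tableName: name of the data table
    index_name: str           # indexName: name of the search index
    field_name: str           # fieldName: target column
    prefix: str               # prefix to match against column values
    get_total_count: bool     # getTotalCount: whether to return the total match count
    weight: Optional[float] = None              # optional; must be positive if set
    columns_to_get: Optional[List[str]] = None  # columnsToGet: columns to return
    query: str = "PrefixQuery"                  # the query type

    def __post_init__(self):
        # The parameter table requires a positive floating-point weight.
        if self.weight is not None and not self.weight > 0:
            raise ValueError("weight must be a positive floating-point number")


# Sample values below are placeholders, not real resources.
request = PrefixQueryRequest(
    table_name="sample_table",
    index_name="sample_index",
    field_name="city",
    prefix="hang",
    get_total_count=False,
    weight=2.0,
    columns_to_get=["city"],
)
print(request.query, request.field_name, request.prefix)
```

Validating the weight at construction time keeps an invalid value from reaching the point where a request would be sent.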
Usage
You can perform prefix queries using the Tablestore console, the command-line tool, or an SDK. PrefixQuery on FuzzyKeyword columns is currently supported only through the Tablestore SDK. The console and command-line tool support Keyword columns only.
Before you begin, complete the following prerequisites:
- Use an Alibaba Cloud account or a RAM user with the required permissions for Tablestore operations. To grant permissions to a RAM user, see Grant permissions to a RAM user by using a RAM policy.
- If you use an SDK or the command-line tool, create an AccessKey for your Alibaba Cloud account or RAM user if you do not have one.
- Create a data table.
- Create a search index for the data table.
- If you use an SDK, initialize the Tablestore client.
- If you use the command-line tool, download and start the tool, then configure the connection to your instance and select the target table. For more information, see Download the command-line tool, Start the tool and configure connection information, and Data table operations.
Billing
Querying data by using a Search Index consumes read throughput. For more information, see Search Index metering and billing.
Related documents
Search Index supports various query types for multi-dimensional data queries, including term query, terms query, match all query, match query, phrase match query, range query, prefix query, suffix query, wildcard query, token-based wildcard query, boolean query, geo query, nested query, vector search, and exists query.
When you query data, you can sort and paginate the result set or perform collapsing (deduplication).
For data analysis, such as finding the maximum or minimum value, calculating a sum, or counting rows, you can use the statistical aggregation or SQL query features.
To quickly export data regardless of the result set order, you can use the Parallel Scan feature.