Tablestore: Create a search index

Last Updated: Jul 26, 2024

You can call the CreateSearchIndex operation to create one or more search indexes for a data table. When you create a search index, you can add the fields that you want to query to the search index and configure advanced settings for the search index. For example, you can configure the routing key and presorting settings.

Prerequisites

  • An OTSClient instance is initialized. For more information, see Initialize an OTSClient instance.

  • A data table for which the MaxVersion parameter is set to 1 is created. For more information, see Create a data table. The TimeToAlive parameter of the data table must meet one of the following conditions:

    • The TimeToAlive parameter of the data table is set to -1, which specifies that data in the data table never expires.

    • The TimeToAlive parameter of the data table is set to a value other than -1, and update operations on the data table are prohibited.

Usage notes

  • The data types of the fields in a search index must match the data types of the fields in the data table for which the search index is created. For more information, see Data type mappings.

  • To specify a value other than -1 for the TimeToLive parameter of a search index, you must disable the UpdateRow operation on the data table for which the search index is created. The value of the TimeToLive parameter of the search index must be less than or equal to the value of the TimeToAlive parameter of the data table. For more information, see TTL of search indexes.

Parameters

When you create a search index, you must configure the TableName, IndexName, and IndexSchema parameters. In the IndexSchema parameter, you must configure the FieldSchemas parameter, and you can optionally configure the IndexSetting and IndexSort parameters. You can also configure the optional TimeToLive parameter. The following table describes the parameters.

Parameter

Description

TableName

The name of the data table.

IndexName

The name of the search index.

FieldSchemas

The list of field schemas. In each field schema, configure the following parameters:

  • FieldName: This parameter is required and specifies the name of the field in the search index. The value is used as a column name. Type: String.

    A column in a search index can be a primary key column or an attribute column of the data table.

  • FieldType: This parameter is required and specifies the type of the field. Specify the type in the tablestore.FieldType_XXX format. For more information, see Data type mappings.

  • Array: This parameter is optional and specifies whether the value is an array. Type: Boolean.

    If you set this parameter to true, the column stores data as an array. Data written to the column must be a JSON array. Example: ["a","b","c"].

    The values of nested fields are arrays. If you set the FieldType parameter to Nested, skip this parameter.

  • Index: This parameter is optional and specifies whether to enable indexing for the column. Type: Boolean.

    Default value: true. A value of true specifies that Tablestore indexes the column by using an inverted indexing or spatio-temporal indexing schema. A value of false specifies that indexing is disabled for the column.

  • Analyzer: This parameter is optional and specifies the type of analyzer that you want to use. You can configure this parameter only if you set the FieldType parameter to Text. If you do not configure this parameter, the default analyzer, single-word tokenization, is used. For more information, see Tokenization.

  • EnableSortAndAgg: This parameter is optional and specifies whether to enable sorting and aggregation. Type: Boolean.

    Sorting can be enabled only for fields for which the EnableSortAndAgg parameter is set to true. For more information, see Perform sorting and paging.

    Important

    Fields of the Nested type do not support sorting and aggregation. The subcolumns of fields of the Nested type support sorting and aggregation.

  • Store: This parameter is optional and specifies whether to store the value of the field in the search index. Type: Boolean.

    If you set the Store parameter to true, you can read the value of the field from the search index without the need to query the data table. This improves query performance.

  • DateFormats: This parameter is optional and specifies the format of dates. Type: String. If you set the FieldType parameter to Date, you must configure this parameter. For more information, see Types of date data.

  • VectorOptions: This parameter is optional and specifies the properties of vector fields. If you set the FieldType parameter to Vector, you must configure this parameter. You can use the following parameters to specify the properties of vector fields:

    • DataType: the type of vector data. Only float32 is supported. If you want to use other types of vector data, submit a ticket.

    • Dimension: the dimension of the vector. For information about the limits on the dimension of a vector, see Search index limits.

    • MetricType: the algorithm that you want to use to measure the distance between vectors. Valid values: euclidean, cosine, and dot_product.

      • euclidean: the Euclidean distance algorithm that measures the shortest path between two vectors in a multi-dimensional space. The Euclidean distance algorithm in Tablestore does not perform the final square root calculation to achieve better performance. A greater value that is obtained by using the Euclidean distance algorithm indicates a higher similarity between two vectors.

      • cosine: the cosine similarity algorithm that calculates the cosine of the angle between two vectors in a vector space. A greater value that is obtained by using the cosine similarity algorithm indicates a higher similarity between two vectors. In most cases, the algorithm is used to calculate the similarity between text data.

      • dot_product: the dot product algorithm that multiplies the corresponding coordinates of two vectors of the same dimension and adds the products. A greater value that is obtained by using the dot product algorithm indicates a higher similarity between two vectors.

      For more information, see Appendix: distance measurement algorithms for vectors.

IndexSetting

The settings of the search index, including RoutingFields.

RoutingFields: This parameter is optional and specifies custom routing fields. You can specify specific primary key columns as routing fields. Tablestore distributes data that is written to a search index across different partitions based on the specified routing fields. Data whose routing field values are the same is distributed to the same partition.
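The routing behavior described above can be pictured as hashing the routing field value and mapping the hash to a partition. The following is a conceptual sketch only: the partitionFor function, the FNV-1a hash, and the partition count are assumptions made for illustration and do not describe how Tablestore actually distributes data.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a routing field value to one of n partitions.
// FNV-1a is an illustrative choice, not Tablestore's algorithm.
// n must be greater than 0.
func partitionFor(routingValue string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(routingValue)) // Write on a hash.Hash never returns an error.
	return h.Sum32() % n
}

func main() {
	const partitions = 4 // Hypothetical partition count.
	for _, userID := range []string{"user_a", "user_b", "user_a"} {
		fmt.Printf("%s -> partition %d\n", userID, partitionFor(userID, partitions))
	}
}
```

Because the partition depends only on the routing value, both occurrences of user_a land in the same partition. A query that specifies the routing value then only needs to read that partition.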

IndexSort

The presorting settings of the search index, including Sorters. If the IndexSort parameter is left empty, field values are sorted by primary key.

Note

You can skip the presorting settings for search indexes that contain fields of the Nested type.

Sorters: This parameter is required and specifies the presorting method for the search index. PrimaryKeySort and FieldSort are supported. For more information, see Perform sorting and paging.

  • PrimaryKeySort: sorts data by primary key. You can specify the following parameter for the PrimaryKeySort parameter:

    Order: the sort order. Data can be sorted in ascending or descending order. By default, data is sorted in ascending order.

  • FieldSort: sorts data by field value. Only fields for which an index is created and the EnableSortAndAgg parameter is set to true can be presorted. You can specify the following parameters for the FieldSort parameter:

    • FieldName: the name of the field that is used to sort data.

    • Order: the sort order. Data can be sorted in ascending or descending order. By default, data is sorted in ascending order.

    • Mode: the sorting method that you want to use when the field contains multiple values.

TimeToLive

This parameter is optional and specifies the retention period of data in the search index. Unit: seconds. Default value: -1.

If the retention period of data exceeds the time to live (TTL), the data expires. Tablestore automatically deletes expired data.

The value of this parameter must be greater than or equal to 86400. A value of 86400 specifies one day. You can also set this parameter to -1, which specifies that data never expires.

For more information about how to use the TTL feature for search indexes, see TTL of search indexes.
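The TTL rules stated on this page can be checked on the client before you send the request. The following is a minimal sketch under stated assumptions: validateIndexTTL and its parameters are hypothetical helpers, not part of the Tablestore SDK, and the server remains the authority on what is accepted. The case in which the search index never expires but the data table has a finite TTL is not spelled out above, so the sketch leaves it unchecked.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	neverExpire = -1    // Data never expires.
	minIndexTTL = 86400 // One day, in seconds.
)

// validateIndexTTL checks a search index TTL against the rules on this page.
// tableTTL is the TimeToAlive value of the data table, and updatesDisabled
// reports whether update operations on the data table are disabled.
func validateIndexTTL(indexTTL, tableTTL int32, updatesDisabled bool) error {
	if indexTTL == neverExpire {
		return nil // Default value: no further index-side checks.
	}
	if indexTTL < minIndexTTL {
		return fmt.Errorf("index TTL %d is below the minimum of %d seconds", indexTTL, minIndexTTL)
	}
	if !updatesDisabled {
		return errors.New("a finite index TTL requires update operations on the data table to be disabled")
	}
	if tableTTL != neverExpire && indexTTL > tableTTL {
		return fmt.Errorf("index TTL %d exceeds data table TTL %d", indexTTL, tableTTL)
	}
	return nil
}

func main() {
	week := int32(7 * 86400)
	fmt.Println(validateIndexTTL(week, neverExpire, true))  // Accepted.
	fmt.Println(validateIndexTTL(3600, neverExpire, true))  // Rejected: less than one day.
	fmt.Println(validateIndexTTL(week, neverExpire, false)) // Rejected: updates are allowed.
	fmt.Println(validateIndexTTL(week, 86400, true))        // Rejected: exceeds the table TTL.
}
```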

Examples

Create a search index by using the default configurations

The following sample code provides an example on how to create a search index by using the default configurations. In this example, the search index consists of the following columns: the col_keyword column of the Keyword type, the col_long column of the Long type, and the col_vector column of the Vector type.

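import (
    "fmt"

    "github.com/aliyun/aliyun-tablestore-go-sdk/tablestore"
    "github.com/golang/protobuf/proto"
)

func createSearchIndex(client *tablestore.TableStoreClient) {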
    request := &tablestore.CreateSearchIndexRequest{}
    request.TableName = "<TABLE_NAME>"
    request.IndexName = "<SEARCH_INDEX_NAME>"
    request.IndexSchema = &tablestore.IndexSchema{
        FieldSchemas: []*tablestore.FieldSchema{
            {
                FieldName:        proto.String("col_keyword"),
                FieldType:        tablestore.FieldType_KEYWORD, // The Keyword type.
                Index:            proto.Bool(true),
                EnableSortAndAgg: proto.Bool(true),
            },
            {
                FieldName:        proto.String("col_long"),
                FieldType:        tablestore.FieldType_LONG, // The Long type.
                Index:            proto.Bool(true),
                EnableSortAndAgg: proto.Bool(true),
            },
            {
                FieldName: proto.String("col_vector"),
                FieldType: tablestore.FieldType_VECTOR, // The Vector type.
                Index:     proto.Bool(true),
                VectorOptions: &tablestore.VectorOptions{
                    VectorDataType:   tablestore.VectorDataType_FLOAT_32.Enum(),
                    Dimension:        proto.Int32(4), // Set the number of dimensions of the vector to 4.
                    VectorMetricType: tablestore.VectorMetricType_DOT_PRODUCT.Enum(), // Use the dot product algorithm to measure the distance between vectors.
                },
            },
        },
    }
    _, err := client.CreateSearchIndex(request)
    if err != nil {
        fmt.Println("Failed to create searchIndex with error:", err)
        return
    }
}

Create a search index with the IndexSort parameter specified

The following sample code provides an example on how to create a search index with the IndexSort parameter specified. In this example, the search index consists of the col1 column of the Keyword type and the col2 column of the Long type.

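// In addition to fmt, tablestore, and proto, this function requires the
// github.com/aliyun/aliyun-tablestore-go-sdk/tablestore/search package.
func createSearchIndex_withIndexSort(client *tablestore.TableStoreClient) {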
    request := &tablestore.CreateSearchIndexRequest{}
    request.TableName = "<TABLE_NAME>" // Specify the name of the data table. 
    request.IndexName = "<SEARCH_INDEX_NAME>" // Specify the name of the search index. 

    schemas := []*tablestore.FieldSchema{}
    field1 := &tablestore.FieldSchema{
        FieldName: proto.String("col1"), // Specify the name of the field by calling the proto.String method. This method is used to request a string pointer. 
        FieldType: tablestore.FieldType_KEYWORD, // Specify the type of the field. 
        Index:     proto.Bool(true), // Enable indexing for the field. 
        EnableSortAndAgg: proto.Bool(true), // Enable sorting and aggregation. 
    }
    field2 := &tablestore.FieldSchema{
        FieldName: proto.String("col2"),
        FieldType: tablestore.FieldType_LONG,
        Index:     proto.Bool(true),
        EnableSortAndAgg: proto.Bool(true),
    }

    schemas = append(schemas, field1, field2)
    request.IndexSchema = &tablestore.IndexSchema{
        FieldSchemas: schemas, // Specify the fields that are included in the search index. 
        IndexSort: &search.Sort{ // Specify the index presorting settings. Data is sorted based on the value of the col2 column in ascending order and then sorted based on the value of the col1 column in descending order. 
            Sorters: []search.Sorter{
                &search.FieldSort{
                    FieldName: "col2",
                    Order:     search.SortOrder_ASC.Enum(),
                },
                &search.FieldSort{
                    FieldName: "col1",
                    Order:     search.SortOrder_DESC.Enum(),
                },
            },
        },
    }
    resp, err := client.CreateSearchIndex(request) // Call a client to create the search index. 
    if err != nil {
        fmt.Println("error :", err)
        return
    }
    fmt.Println("CreateSearchIndex finished, requestId:", resp.ResponseInfo.RequestId)
}
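The presorting order configured in the preceding sample (col2 in ascending order, then col1 in descending order) can be reproduced in memory to see what it means for rows. The following is a minimal sketch that uses only the Go standard library. The row type and the presort function are hypothetical and only illustrate the ordering; they do not show how Tablestore stores or sorts index data.

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical in-memory stand-in for an indexed row.
type row struct {
	Col1 string
	Col2 int64
}

// presort orders rows by Col2 ascending, then by Col1 descending.
func presort(rows []row) {
	sort.SliceStable(rows, func(i, j int) bool {
		if rows[i].Col2 != rows[j].Col2 {
			return rows[i].Col2 < rows[j].Col2 // First sorter: col2 ascending.
		}
		return rows[i].Col1 > rows[j].Col1 // Second sorter: col1 descending.
	})
}

func main() {
	rows := []row{{"a", 2}, {"b", 1}, {"c", 2}, {"d", 1}}
	presort(rows)
	fmt.Println(rows) // [{d 1} {b 1} {c 2} {a 2}]
}
```

The second sorter only breaks ties left by the first one, so the order of the entries in Sorters matters.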

Create a search index with the TTL specified

Important

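Make sure that update operations on the data table are disabled.

The following sample code provides an example on how to create a search index with the TTL specified. In this example, the search index consists of the col1 column of the Keyword type and the col2 column of the Long type, and the TTL of the search index is set to seven days.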

func createIndexWithTTL(client *tablestore.TableStoreClient) {
    request := &tablestore.CreateSearchIndexRequest{}
    request.TableName = "<TABLE_NAME>"
    request.IndexName = "<SEARCH_INDEX_NAME>"
    schemas := []*tablestore.FieldSchema{}
    field1 := &tablestore.FieldSchema{
        FieldName:        proto.String("col1"), // Specify the name of the field by calling the proto.String method. This method is used to request a string pointer. 
        FieldType:        tablestore.FieldType_KEYWORD, // Specify the type of the field. 
        Index:            proto.Bool(true),             // Enable indexing for the field. 
        EnableSortAndAgg: proto.Bool(true),             // Enable sorting and aggregation. 
    }
    field2 := &tablestore.FieldSchema{
        FieldName:        proto.String("col2"),
        FieldType:        tablestore.FieldType_LONG,
        Index:            proto.Bool(true),
        EnableSortAndAgg: proto.Bool(true),
    }
    schemas = append(schemas, field1, field2)
    request.IndexSchema = &tablestore.IndexSchema{
        FieldSchemas: schemas, // Specify the fields that are included in the search index. 
    }
    request.TimeToLive = proto.Int32(3600 * 24 * 7) // Set the TTL of the search index to 7 days. 
    resp, err := client.CreateSearchIndex(request)
    if err != nil {
        fmt.Println("error :", err)
        return
    }
    fmt.Println("createIndexWithTTL finished, requestId:", resp.ResponseInfo.RequestId)
}

References

  • After you create a search index, you can use the query methods provided by the search index to query data from multiple dimensions based on your business requirements. When you use a search index to query data, you can use the following query methods: term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, geo query, Boolean query, KNN vector query, nested query, and exists query.

    If you call the Search operation to query data, you can sort or paginate rows that meet the query conditions by using the sorting and paging features. For more information, see Perform sorting and paging.

  • If you call the Search operation to query data, you can use the collapse (distinct) feature to collapse the result set based on a specific column. This way, data of the specified type appears only once in the query results. For more information, see Collapse (distinct).

  • You can specify the TTL for a search index to delete historical data in the search index or extend the retention period of data in the search index. For more information, see TTL of search indexes.

  • If you want to analyze data in a data table, you can use the aggregation feature of the Search operation or execute SQL statements. For example, you can obtain the minimum and maximum values, sum, and total number of rows. For more information, see Aggregation and SQL query.

  • If you want to obtain all rows that meet the query conditions without the need to sort the rows, you can call the ParallelScan and ComputeSplits operations to use the parallel scan feature. For more information, see Parallel scan.

  • You can dynamically modify the schema of a search index to add, update, or remove index columns in the search index. For more information, see Dynamically modify the schema of a search index.

  • You can call the ListSearchIndex operation to query all search indexes that are created for a data table. For more information, see List search indexes.

  • You can call the DescribeSearchIndex operation to query the description of a search index. For example, you can query the field information and search index configurations. For more information, see Query the description of a search index.

  • You can delete a search index that you no longer require. For more information, see Delete search indexes.