Tablestore: Create a search index

Last Updated: Apr 14, 2026

Use the CreateSearchIndex method to create a search index for a data table. A data table can have multiple search indexes. When you create a search index, add the fields that you want to query to the index. You can also configure advanced options, such as custom routing keys and pre-sorting.

Prerequisites

  • The Tablestore client is initialized. For more information, see Initialize Tablestore Client.

  • A data table is created and meets the following conditions:

    • The maximum number of versions is set to 1.

    • The time to live (TTL) is set to -1, or the table has updates disabled.

Usage notes

  • When you create a search index, the data type of each field in the search index must match the data type of the corresponding field in the data table.

  • To set a specific TTL for a search index (a value other than -1), you must disable the UpdateRow write feature for the data table. The TTL of the search index must be less than or equal to the TTL of the data table. For more information, see Lifecycle management.
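The prerequisites and TTL rules above can be expressed as a small pre-flight check. The following is a minimal sketch in plain Python, not part of the Tablestore SDK. The TableOptionsSnapshot class and the check_search_index_ttl function are hypothetical names introduced for illustration, and the sketch assumes you already know the options of the data table.

```python
from dataclasses import dataclass

NEVER_EXPIRES = -1  # TTL value meaning that data never expires


@dataclass
class TableOptionsSnapshot:
    """Hypothetical summary of the data table options that matter here."""
    max_versions: int
    ttl_seconds: int
    allow_update: bool


def check_search_index_ttl(table, index_ttl=NEVER_EXPIRES):
    """Return a list of rule violations; an empty list means the settings pass."""
    problems = []
    if table.max_versions != 1:
        problems.append("max versions of the data table must be 1")
    # Prerequisite: the table TTL is -1, or updates are disabled on the table.
    if table.ttl_seconds != NEVER_EXPIRES and table.allow_update:
        problems.append("data table TTL must be -1 unless updates are disabled")
    if index_ttl != NEVER_EXPIRES:
        # A specific index TTL requires UpdateRow writes to be disabled.
        if table.allow_update:
            problems.append("disable UpdateRow on the data table before setting an index TTL")
        # The index TTL may not exceed the table TTL (-1 on the table means no limit).
        if table.ttl_seconds != NEVER_EXPIRES and index_ttl > table.ttl_seconds:
            problems.append("index TTL must be <= data table TTL")
    return problems
```

For example, a table with a TTL of -1 and updates enabled passes the check for a default index, but fails it as soon as a specific index TTL such as 86400 is requested.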

Parameters

When you create a search index, you must specify the table name (table_name), the search index name (index_name), and the index schema (schema). The schema includes field schemas (field_schemas), index settings (index_setting), and index pre-sorting settings (index_sort). The following table lists these parameters.

Parameter

Description

table_name

The name of the data table.

index_name

The name of the search index.

field_schemas

A list of field_schema objects. Each field_schema object contains the following parameters:

  • field_name (Required): The name of the field to include in the search index. The value is a column name. Data type: String.

    A field can be a primary key column or an attribute column.

  • field_type (Required): The data type of the field. Data type: FieldType.XXX.

  • is_array (Optional): Specifies whether the field is an array. Data type: Boolean.

    If you set this parameter to True, the column stores an array. Data written to the column must be in JSON array format, such as ["a","b","c"].

    Because the Nested type is inherently an array, you do not need to set this parameter when field_type is Nested.

  • index (Optional): Specifies whether to enable indexing for the field. Data type: Boolean.

    Default value: True. A value of True indicates that an inverted index or a spatial index is created for the column. A value of False indicates that no index is created for the column.

  • analyzer (Optional): The tokenizer type. This parameter is valid only when the field type is Text. Default value: single-word tokenization.

  • enable_sort_and_agg (Optional): Specifies whether to enable sorting and aggregation. Data type: Boolean.

    Only fields with enable_sort_and_agg set to True can be used for sorting and aggregation.

    Important

    Nested fields do not support sorting and aggregation. However, sub-columns within a Nested field support this feature.

  • sub_field_schemas (Optional): The index types for sub-columns in a nested document. This parameter is required when the field type is Nested. Data type: a list of field_schema objects.

  • is_virtual_field (Optional): Specifies whether the field is a virtual column. Data type: Boolean. Default value: False. To use a virtual column, set this parameter to True.

  • source_field_name (Optional): The name of the source field in the data table. Data type: String.

    Important

    This parameter is required when is_virtual_field is set to True.

  • date_formats (Optional): The date format. Data type: String. For more information, see Date and time types.

    Important

    This parameter is required when the field type is Date.

  • enable_highlighting (Optional): Specifies whether to enable the summary and highlighting feature. Data type: Boolean. Default value: False. To enable summary and highlighting, set this parameter to True. This feature is supported only for Text fields.

    Important

    This feature is available in Tablestore Python SDK 6.0.0 and later.

  • vector_options (Optional): The property parameters for a vector field. This parameter is required when the field type is Vector. This parameter contains the following settings:

    • data_type: The data type of the vector. Only float32 is supported. To use other data types, submit a ticket to contact us.

    • dimension: The number of dimensions in the vector. Maximum value: 4096.

    • metric_type: The algorithm used to measure the distance between vectors. Valid values: euclidean (Euclidean distance), cosine (cosine similarity), and dot_product (dot product).

      • Euclidean distance (euclidean): Measures the straight-line distance between two vectors in a multidimensional space. For performance reasons, the Euclidean distance algorithm in Tablestore does not perform the final square root calculation. A larger score indicates greater similarity between two vectors.

      • Cosine similarity (cosine): Measures the cosine of the angle between two vectors. A higher score indicates greater similarity. This algorithm is commonly used for text similarity.

      • Dot product (dot_product): Multiplies corresponding coordinates of two equal-dimension vectors and sums the results. A higher score indicates greater similarity.

      For more information about how to select a distance measure algorithm, see Distance measure algorithms.

  • json_type (Optional): The index type for JSON data. Valid values: OBJECT and NESTED. This parameter is required when the field type is JSON.
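To make the three metric_type options described above concrete, the following sketch computes the underlying quantities in plain Python. It is an illustration only: the function names are hypothetical, and the values returned are raw quantities, not the relevance scores that Tablestore reports. How the service maps them to scores is not covered here. The Euclidean variant omits the square root, as described above.

```python
import math


def _check_same_length(a, b):
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")


def squared_euclidean(a, b):
    """Sum of squared coordinate differences (no final square root)."""
    _check_same_length(a, b)
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dot_product(a, b):
    """Sum of the products of corresponding coordinates."""
    _check_same_length(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / norm
```

Note that cosine similarity ignores vector length, so [1, 2] and [2, 4] are treated as identical in direction, whereas the dot product and the squared Euclidean distance both depend on magnitude.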

index_setting

The index settings, which include the routing_fields setting.

routing_fields (Optional): Custom routing fields. You can select one or more primary key columns as routing fields. In most cases, one routing field is sufficient. If you set multiple routing fields, the system concatenates their values into a single value.

When index data is written, the system determines the data partition based on the values of the routing fields. Records with the same routing field values are distributed to the same data partition.
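The effect of routing fields can be pictured with a toy model. The sketch below is not Tablestore's actual partitioning scheme, which this topic does not describe. The CRC32 hash, the separator, the partition count, and both function names are assumptions chosen only to show that rows with equal routing field values always land in the same partition.

```python
import zlib


def routing_key(row, routing_fields):
    """Concatenate the routing field values in the declared order (toy model)."""
    return "\x00".join(str(row[name]) for name in routing_fields)


def partition_for(row, routing_fields, partition_count):
    """Map a row to a partition number from its routing key (toy model)."""
    key = routing_key(row, routing_fields).encode("utf-8")
    return zlib.crc32(key) % partition_count


rows = [
    {"PK1": "user_1", "PK2": 1},
    {"PK1": "user_1", "PK2": 2},
    {"PK1": "user_2", "PK2": 1},
]
# Rows that share PK1 receive the same partition number.
partitions = [partition_for(row, ["PK1"], 8) for row in rows]
```

In this model, a query that specifies the routing field values needs to visit only one partition, which is the usual motivation for custom routing.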

index_sort

The index pre-sorting settings, which include the sorters setting. If you do not configure this parameter, data is sorted by primary key by default.

Note

Search indexes that contain Nested fields do not support index pre-sorting (index_sort).

sorters (Required): The pre-sorting method for the index. Supported methods: sorting by primary key and sorting by field value. For more information, see Sorting and pagination.

  • PrimaryKeySort: Sorts data by primary key. This method includes the following setting:

    sort_order: The sort order. Valid values: SortOrder.ASC (ascending, default) and SortOrder.DESC (descending).

  • FieldSort: Sorts data by field value. Only fields that are indexed and have sorting and aggregation enabled can be used for pre-sorting. This method includes the following settings:

    • field_name: The name of the field used for sorting.

    • sort_order: The sort order. Valid values: SortOrder.ASC (ascending, default) and SortOrder.DESC (descending).

    • sort_mode: The sorting mode when a field contains multiple values.
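The pre-sorting constraints above can be checked before you build the index schema. The following sketch uses plain dictionaries in place of SDK objects. The presort_problems function and the dictionary layout are hypothetical. The sketch treats a missing enable_sort_and_agg key as not enabled, which is a conservative assumption and not a statement about the service default.

```python
def presort_problems(fields, sort_field_name):
    """Return reasons why FieldSort pre-sorting on sort_field_name would be invalid."""
    problems = []
    # A search index that contains a Nested field cannot use index_sort at all.
    if any(f["field_type"] == "NESTED" for f in fields):
        problems.append("index contains a Nested field, so index_sort is not supported")
    by_name = {f["field_name"]: f for f in fields}
    field = by_name.get(sort_field_name)
    if field is None:
        problems.append(f"{sort_field_name!r} is not a field of the search index")
        return problems
    # index defaults to True, as described for the index parameter.
    if not field.get("index", True):
        problems.append(f"{sort_field_name!r} is not indexed")
    if not field.get("enable_sort_and_agg", False):
        problems.append(f"{sort_field_name!r} does not have sorting and aggregation enabled")
    return problems
```

An empty result means that the field satisfies the documented conditions for FieldSort. Any other result lists the conditions that are not met.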

Examples

Specify analyzers when creating a search index

The following sample code creates a search index with analyzers configured. The search index contains six fields: k (Keyword), t (Text), g (Geopoint), ka (Keyword array), la (Long array), and n (Nested). The n field has three sub-fields: nk (Keyword), nl (Long), and nt (Text).

from tablestore import AnalyzerType, FieldSchema, FieldType, IndexSetting, SearchIndexMeta


def create_search_index(client):
    # A Keyword field. Create an index and enable sorting and aggregation.
    field_a = FieldSchema('k', FieldType.KEYWORD, index=True, enable_sort_and_agg=True)
    # A Text field. Create an index and use single-word tokenization.
    field_b = FieldSchema('t', FieldType.TEXT, index=True, analyzer=AnalyzerType.SINGLEWORD)
    # Alternative: a Text field that uses fuzzy tokenization (also import FuzzyAnalyzerParameter).
    # field_b = FieldSchema('t', FieldType.TEXT, index=True, analyzer=AnalyzerType.FUZZY,
    #                       analyzer_parameter=FuzzyAnalyzerParameter(1, 6))
    # Alternative: a Text field that is tokenized on a custom separator, here a comma (also import SplitAnalyzerParameter).
    # field_b = FieldSchema('t', FieldType.TEXT, index=True, analyzer=AnalyzerType.SPLIT,
    #                       analyzer_parameter=SplitAnalyzerParameter(","))
    # A Geopoint field. Create an index.
    field_c = FieldSchema('g', FieldType.GEOPOINT, index=True)
    # A Keyword array field. Create an index.
    field_d = FieldSchema('ka', FieldType.KEYWORD, index=True, is_array=True)
    # A Long array field. Create an index.
    field_e = FieldSchema('la', FieldType.LONG, index=True, is_array=True)

    # A Nested field that includes three sub-fields: nk (Keyword), nl (Long), and nt (Text).
    field_n = FieldSchema('n', FieldType.NESTED, sub_field_schemas=[
        FieldSchema('nk', FieldType.KEYWORD, index=True),
        FieldSchema('nl', FieldType.LONG, index=True),
        FieldSchema('nt', FieldType.TEXT, index=True),
    ])

    fields = [field_a, field_b, field_c, field_d, field_e, field_n]

    index_setting = IndexSetting(routing_fields=['PK1'])
    index_sort = None  # When a search index contains a Nested field, you cannot set index pre-sorting.
    # index_sort = Sort(sorters=[PrimaryKeySort(SortOrder.ASC)])
    index_meta = SearchIndexMeta(fields, index_setting=index_setting, index_sort=index_sort)
    client.create_search_index('<TABLE_NAME>', '<SEARCH_INDEX_NAME>', index_meta)
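The ka and la fields in the preceding example are declared with is_array=True, so the values written to those columns must be in JSON array format. The following sketch prepares such values with the standard json module. The to_array_value helper is a hypothetical name, and the sketch assumes that the serialized text is what you write to the data table column. The Nested example likewise assumes that a nested value is a JSON array of objects keyed by sub-field name.

```python
import json


def to_array_value(items):
    """Serialize a Python list or tuple as JSON array text."""
    if not isinstance(items, (list, tuple)):
        raise TypeError("an array field expects a list or tuple of values")
    return json.dumps(list(items))


keyword_array = to_array_value(["a", "b", "c"])  # for the ka field
long_array = to_array_value([1, 2, 3])           # for the la field
# Assumed shape for the n field: one object per nested row.
nested_value = to_array_value([{"nk": "x", "nl": 1, "nt": "some text"}])
```

Serializing with json.dumps rather than building the string by hand avoids quoting mistakes when values contain quotation marks or commas.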

Create a search index and configure vector fields

The following sample code creates a search index with vector fields configured. The search index contains three fields: col_keyword (Keyword), col_long (Long), and col_vector (Vector). The distance measure algorithm for the vector field is dot product.

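from tablestore import (FieldSchema, FieldType, SearchIndexMeta,
                        VectorDataType, VectorMetricType, VectorOptions)


def create_search_index(client):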
    index_meta = SearchIndexMeta([
        FieldSchema('col_keyword', FieldType.KEYWORD, index=True, enable_sort_and_agg=True),  # String type
        FieldSchema('col_long', FieldType.LONG, index=True),  # Numeric type
        FieldSchema("col_vector", FieldType.VECTOR,  # Vector type
                    vector_options=VectorOptions(
                        data_type=VectorDataType.VD_FLOAT_32,
                        dimension=4,  # The vector dimension is 4, and the similarity algorithm is dot product.
                        metric_type=VectorMetricType.VM_DOT_PRODUCT
                    )),
    ])
    # Replace the placeholders with the names of your data table and search index.
    client.create_search_index('<TABLE_NAME>', '<SEARCH_INDEX_NAME>', index_meta)
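Before you write data for a vector field such as col_vector, it can help to confirm that each vector matches the dimension declared in vector_options. The sketch below is a plain Python helper, not an SDK call. The to_vector_value name is hypothetical, and the sketch assumes that the vector is written as JSON array text of floating-point numbers. Confirm the expected format against the vector search documentation.

```python
import json

MAX_DIMENSION = 4096  # maximum value of dimension, as listed in the parameter description


def to_vector_value(values, dimension):
    """Check a vector against the declared dimension and serialize it."""
    if not 1 <= dimension <= MAX_DIMENSION:
        raise ValueError(f"dimension must be between 1 and {MAX_DIMENSION}")
    if len(values) != dimension:
        raise ValueError(f"expected {dimension} values, got {len(values)}")
    # Convert every element to float so that integers serialize as 1.0, 2.0, and so on.
    return json.dumps([float(v) for v in values])


vector_text = to_vector_value([1, 2, 3, 4], dimension=4)
```

Rejecting mismatched vectors on the client side surfaces the problem at write time instead of leaving rows that the index cannot use.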

Enable summary and highlighting when creating a search index

The following sample code creates a search index with summary and highlighting enabled. The search index contains three fields: k (Keyword), t (Text), and n (Nested). The n field has three sub-fields: nk (Keyword), nl (Long), and nt (Text). Summary and highlighting is enabled for the t field and the nt sub-field of the n field.

from tablestore import AnalyzerType, FieldSchema, FieldType, IndexSetting, SearchIndexMeta


def create_search_index0905(client):
    # A Keyword field. Create an index and enable sorting and aggregation.
    field_a = FieldSchema('k', FieldType.KEYWORD, index=True, enable_sort_and_agg=True)
    # A Text field. Create an index, use single-word tokenization, and enable summary and highlighting for the field.
    field_b = FieldSchema('t', FieldType.TEXT, index=True, analyzer=AnalyzerType.SINGLEWORD,
                        enable_highlighting=True)

    # A Nested field that includes three sub-fields: nk (Keyword), nl (Long), and nt (Text). The summary and highlighting feature is enabled for the nt sub-column.
    field_n = FieldSchema('n', FieldType.NESTED, sub_field_schemas=[
        FieldSchema('nk', FieldType.KEYWORD, index=True),
        FieldSchema('nl', FieldType.LONG, index=True),
        FieldSchema('nt', FieldType.TEXT, index=True, enable_highlighting=True),
    ])

    fields = [field_a, field_b, field_n]

    index_setting = IndexSetting(routing_fields=['id'])
    index_sort = None  # When a search index contains a Nested field, you cannot set index pre-sorting.
    # index_sort = Sort(sorters=[PrimaryKeySort(SortOrder.ASC)])
    index_meta = SearchIndexMeta(fields, index_setting=index_setting, index_sort=index_sort)
    client.create_search_index('pythontest', 'pythontest_0905', index_meta)
