Create a search index using Tablestore SDK for Node.js

You can use the CreateSearchIndex method to create one or more search indexes for a data table. When you create a search index, you must add the fields that you want to query to the index. You can also configure advanced options, such as routing fields and pre-sorting.

Prerequisites

Initialize a Tablestore client. For more information, see Initialize a Tablestore client.
Create a data table that meets the following conditions. For more information, see Create a data table.
- The max versions must be 1.
- The time to live (TTL) must be -1 or updates on the data table must be disabled.
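
The table conditions above map to the options set when the data table is created. The following is a minimal sketch, assuming the createTable request shape of Tablestore SDK for Node.js (tableMeta, reservedThroughput, and tableOptions). The helper name buildDataTableParams and the primary key column id are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: builds createTable parameters that satisfy the
// search index prerequisites (max versions = 1, TTL = -1).
function buildDataTableParams(tableName) {
    return {
        tableMeta: {
            tableName: tableName,
            // Assumed primary key layout, for illustration only.
            primaryKey: [{ name: "id", type: "STRING" }]
        },
        reservedThroughput: {
            capacityUnit: { read: 0, write: 0 }
        },
        tableOptions: {
            timeToLive: -1, // Data never expires.
            maxVersions: 1  // Keep a single version per column.
        }
    };
}

const tableParams = buildDataTableParams("<TABLE_NAME>");
console.log(JSON.stringify(tableParams.tableOptions));
// With an initialized client, the request would be sent as:
// client.createTable(tableParams, function (err, data) { /* ... */ });
```

Building the parameters in a separate function keeps the prerequisite values in one place, so they can be checked before any request is sent.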

Usage notes

When you create a search index, the data types of the fields in the search index must match the data types of the fields in the data table.
To set the TTL of a search index to a value other than -1, you must disable the UpdateRow operation for the data table. The TTL of the search index must be less than or equal to the TTL of the data table. For more information, see Lifecycle management.
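
The TTL rules above can be checked before a request is sent. The following is a rough sketch, not part of the SDK: the helper name checkIndexTtl is hypothetical, the 86,400-second minimum is taken from the timeToLive parameter description, and treating a data table TTL of -1 as "no upper bound" is an assumption.

```javascript
const MIN_INDEX_TTL_SECONDS = 86400; // One day, the minimum TTL other than -1.

// Hypothetical helper: returns a list of problems with a proposed search index TTL.
// indexTtl and tableTtl are in seconds; -1 means the data never expires.
function checkIndexTtl(indexTtl, tableTtl, updateRowDisabled) {
    const problems = [];
    if (indexTtl === -1) {
        return problems; // No expiration, so no further rules apply.
    }
    if (indexTtl < MIN_INDEX_TTL_SECONDS) {
        problems.push("index TTL must be -1 or at least 86400 seconds");
    }
    if (!updateRowDisabled) {
        problems.push("UpdateRow must be disabled on the data table");
    }
    // Assumption: a table TTL of -1 places no upper bound on the index TTL.
    if (tableTtl !== -1 && indexTtl > tableTtl) {
        problems.push("index TTL must not exceed the data table TTL");
    }
    return problems;
}

console.log(checkIndexTtl(1000000, -1, true));  // No problems reported.
console.log(checkIndexTtl(3600, 86400, false)); // Two problems reported.
```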

Parameters

When you create a search index, you must specify the table name (tableName), index name (indexName), and index schema (schema). The schema includes field schemas (fieldSchemas), index settings (indexSetting), and index pre-sorting settings (indexSort). The following table describes these parameters.

Parameter	Description
tableName	The name of the data table.
indexName	The name of the search index.
fieldSchemas	The list of field schemas. Each field schema contains the following parameters:
- fieldName (required): The name of the field to be indexed, which is the column name. Type: String. A field in a search index can be a primary key column or an attribute column.
- fieldType (required): The data type of the field. Set this parameter to TableStore.FieldType.XXX.
- index (optional): Specifies whether to enable indexing. Type: Boolean. The default value is true, which indicates that an inverted index or spatial index is created for the column. If you set this parameter to false, no index is created for the column.
- analyzer (optional): The tokenizer type. You can set this parameter when the field type is Text. If you do not set this parameter, the default single-word tokenization is used.
- analyzerParameter (optional): The tokenizer parameter settings. Set the parameters based on the tokenizer type. You must set this parameter if you set the analyzer parameter for a field.
- enableSortAndAgg (optional): Specifies whether to enable sorting and statistical aggregation. Type: Boolean. You can sort only the fields for which enableSortAndAgg is set to true. Important: Nested fields do not support sorting and statistical aggregation. However, the sub-columns within a Nested field support sorting and statistical aggregation.
- isAnArray (optional): Specifies whether the field is an array. Type: Boolean. If you set this parameter to true, the column is an array. When you write data, it must be in a JSON array format, such as ["a","b","c"]. Because the Nested type is an array, you do not need to set this parameter when fieldType is Nested.
- fieldSchemas (optional): When the field type is Nested, use this parameter to set the index types of sub-columns in the nested document. The type is a list of field schemas.
- isVirtualField (optional): Specifies whether the field is a virtual column. Type: Boolean. The default value is false. To use a virtual column, set this parameter to true.
- sourceFieldName (optional): The name of the field in the data table. Type: String. You must set this parameter if you set isVirtualField to true.
- dateFormats (optional): The format of the date. Type: String. You must set this parameter if the field type is Date. For more information, see Date and time data types.
- enableHighlighting (optional): Specifies whether to enable summary and highlighting. Type: Boolean. The default value is false, which disables summary and highlighting. To use summary and highlighting, set this parameter to true. Only Text fields support this feature. Important: This feature is supported in Tablestore SDK for Node.js 5.5.0 and later.
- vectorOptions (optional): The properties of the Vector field. If you set the fieldType parameter to Vector, you must configure this parameter. It contains the following parameters:
  - dataType: The type of vector data. Only float32 is supported. If you want to use other types of vector data, submit a ticket.
  - dimension: The number of dimensions of the vector. The maximum number of dimensions for a Vector field is 4,096.
  - metricType: The algorithm used to measure the distance between vectors. Valid values: euclidean, cosine, and dot_product.
    - euclidean: The Euclidean distance algorithm, which measures the shortest path between two vectors in a multi-dimensional space. For better performance, the Euclidean distance algorithm in Tablestore does not perform the final square root calculation. A greater value that is obtained by using the Euclidean distance algorithm indicates a higher similarity between two vectors.
    - cosine: The cosine similarity algorithm, which calculates the cosine of the angle between two vectors in a vector space. A greater value that is obtained by using the cosine similarity algorithm indicates a higher similarity between two vectors. In most cases, the algorithm is used to calculate the similarity between text data.
    - dot_product: The dot product algorithm, which multiplies the corresponding coordinates of two vectors of the same dimension and adds the products. A greater value that is obtained by using the dot product algorithm indicates a higher similarity between two vectors.
  For more information, see Appendix: distance measurement algorithms for vectors.
indexSetting	The index settings, which include the routingFields setting.
- routingFields (optional): The custom routing fields. You can select some primary key columns as routing fields. In most cases, you only need to set one. If you set multiple routing keys, the system concatenates the values of the routing keys into a single value. When you write data to the index, the system calculates the distribution location of the index data based on the values of the routing fields. Records with the same routing field values are indexed to the same data partition.
indexSort	The index pre-sorting settings, which include the sorters setting. If you do not set this parameter, the data is sorted by primary key by default. Note: indexSort is not supported for indexes that contain Nested fields. No pre-sorting is performed for such indexes.
- sorters (required): The pre-sorting method for the index. You can sort by primary key or by field value. For more information about sorting, see Sorting and paging.
  - PrimaryKeySort sorts by primary key. It includes the following setting:
    - order: The sort order. You can sort in ascending or descending order. The default is ascending (TableStore.SortOrder.SORT_ORDER_ASC).
  - FieldSort sorts by field value. You can pre-sort only the fields for which an index is created and sorting and statistical aggregation are enabled. It includes the following settings:
    - fieldName: The name of the field to sort by.
    - order: The sort order. You can sort in ascending or descending order. The default is ascending (TableStore.SortOrder.SORT_ORDER_ASC).
    - mode: The sorting method to use when a field has multiple values.
timeToLive	(Optional) The time to live (TTL), which is the data retention period. Unit: seconds. The default value is -1, which indicates that the data never expires. The minimum value for TTL is 86,400 seconds (one day). You can also set it to -1. When the data retention period exceeds the specified TTL, the system automatically deletes the expired data.
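
To make the three metricType values described for vectorOptions more concrete, the following sketch computes the underlying quantities for two sample vectors in plain JavaScript. It is an illustration only: the function names and sample vectors are hypothetical, and it shows the raw mathematical values, not how Tablestore converts them into the similarity scores mentioned above.

```javascript
// Dot product: multiply corresponding coordinates and add the products.
function dotProduct(a, b) {
    return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Euclidean distance without the final square root (squared distance).
function squaredEuclidean(a, b) {
    return a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0);
}

// Cosine of the angle between the two vectors.
function cosineSimilarity(a, b) {
    const normA = Math.sqrt(dotProduct(a, a));
    const normB = Math.sqrt(dotProduct(b, b));
    return dotProduct(a, b) / (normA * normB);
}

// Two sample 4-dimensional vectors (the same dimension as in the example schema).
const v1 = [1, 0, 2, 2];
const v2 = [1, 2, 0, 2];

console.log("squared euclidean:", squaredEuclidean(v1, v2)); // 8
console.log("dot product:", dotProduct(v1, v2));             // 5
console.log("cosine:", cosineSimilarity(v1, v2));            // approximately 0.5556
```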

Examples

Create a search index and set a tokenizer

The following example shows how to create a search index. The index includes the following columns: pic_id (Keyword), count (Long), time_stamp (Long), pic_description (Text), col_vector (Vector), pos (Geo-point), pic_tag (Nested), date (Date), analyzer_single_word (Text), analyzer_split (Text), and analyzer_fuzzy (Text). The pic_tag column includes two sub-fields: sub_tag_name (Keyword) and tag_name (Keyword). The analyzer_single_word column uses single-word tokenization. The analyzer_split column uses delimiter tokenization. The analyzer_fuzzy column uses fuzzy tokenization.

client.createSearchIndex({
    tableName: "<TABLE_NAME>", // Set the table name.
    indexName: "<INDEX_NAME>", // Set the search index name.
    schema: {
        fieldSchemas: [
            {
                fieldName: "pic_id",
                fieldType: TableStore.FieldType.KEYWORD, // Set the field name and field type.
                index: true, // Enable indexing.
                enableSortAndAgg: true, // Enable sorting and aggregation.
                isAnArray: false
            },
            {
                fieldName: "count",
                fieldType: TableStore.FieldType.LONG,
                index: true,
                enableSortAndAgg: true,
                isAnArray: false
            },
            {
                fieldName: "time_stamp",
                fieldType: TableStore.FieldType.LONG,
                index: true,
                enableSortAndAgg: false,
                isAnArray: false,
            },
            {
                fieldName: "pic_description",
                fieldType: TableStore.FieldType.TEXT,
                index: true,
                enableSortAndAgg: false,
                isAnArray: false,
            },
            {
                fieldName: "col_vector",
                fieldType: TableStore.FieldType.VECTOR,
                index: true,
                isAnArray: false,
                vectorOptions: {
                    dataType: TableStore.VectorDataType.VD_FLOAT_32,
                    dimension: 4,
                    metricType: TableStore.VectorMetricType.VM_COSINE,
                }
            },
            {
                fieldName: "pos",
                fieldType: TableStore.FieldType.GEO_POINT,
                index: true,
                enableSortAndAgg: true,
                isAnArray: false,
            },
            {
                fieldName: "pic_tag",
                fieldType: TableStore.FieldType.NESTED,
                index: false,
                enableSortAndAgg: false,
                fieldSchemas: [
                    {
                        fieldName: "sub_tag_name",
                        fieldType: TableStore.FieldType.KEYWORD,
                        index: true,
                        enableSortAndAgg: true,
                    },
                    {
                        fieldName: "tag_name",
                        fieldType: TableStore.FieldType.KEYWORD,
                        index: true,
                        enableSortAndAgg: true,
                    }
                ]
            },
            {
                fieldName: "date",
                fieldType: TableStore.FieldType.DATE,
                index: true,
                enableSortAndAgg: true,
                isAnArray: false,
                dateFormats: ["yyyy-MM-dd'T'HH:mm:ss.SSSSSS"],
            },
            {
                fieldName: "analyzer_single_word",
                fieldType: TableStore.FieldType.TEXT,
                analyzer: "single_word",
                index: true,
                enableSortAndAgg: false,
                isAnArray: false,
                analyzerParameter: {
                    caseSensitive: true,
                    delimitWord: false,
                }
            },
            {
                fieldName: "analyzer_split",
                fieldType: TableStore.FieldType.TEXT,
                analyzer: "split",
                index: true,
                enableSortAndAgg: false,
                isAnArray: false,
                analyzerParameter: {
                    delimiter: ",",
                }
            },
            {
                fieldName: "analyzer_fuzzy",
                fieldType: TableStore.FieldType.TEXT,
                analyzer: "fuzzy",
                index: true,
                enableSortAndAgg: false,
                isAnArray: false,
                analyzerParameter: {
                    minChars: 1,
                    maxChars: 5,
                }
            },
        ],
        indexSetting: { // The configuration options of the index.
            "routingFields": ["count", "pic_id"], // Only primary key columns can be set as routing fields.
            "routingPartitionSize": null
        },
        // indexSort is not supported for indexes that contain Nested fields. No pre-sorting is performed.
        // If you do not set indexSort, the data is sorted by primary key in ascending order by default.
        // indexSort: {
        //     sorters: [
        //         {
        //             primaryKeySort: {
        //                 order: TableStore.SortOrder.SORT_ORDER_ASC
        //             }
        //         },
        //         {
        //             fieldSort: {
        //                 fieldName: "Col_Keyword",
        //                 order: TableStore.SortOrder.SORT_ORDER_DESC // Set the sorting order for indexSort.
        //             }
        //         }
        //     ]
        // },
        timeToLive: 1000000, // Unit: seconds.
    }
}, function (err, data) {
    if (err) {
        console.log('error:', err);
        return;
    }
    console.log('success:',data);
});
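
If you prefer Promises over callbacks, a callback-style call such as the one above can be wrapped with Node.js util.promisify. The following is a minimal sketch: the helper name createSearchIndexAsync is hypothetical, and stubClient is a stand-in used only so that the snippet runs without a Tablestore instance. In real code, pass the initialized client instead.

```javascript
const util = require("util");

// Hypothetical helper: wraps the callback-style createSearchIndex in a Promise.
// bind(client) keeps `this` pointing at the client inside the SDK method.
function createSearchIndexAsync(client, params) {
    return util.promisify(client.createSearchIndex.bind(client))(params);
}

// Stand-in client for illustration only. It echoes the index name back
// through a Node.js-style (err, data) callback.
const stubClient = {
    createSearchIndex(params, callback) {
        callback(null, { indexName: params.indexName });
    }
};

createSearchIndexAsync(stubClient, {
    tableName: "<TABLE_NAME>",
    indexName: "<INDEX_NAME>",
    schema: { fieldSchemas: [] }
})
    .then((data) => console.log("success:", data))
    .catch((err) => console.error("error:", err));
```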

Create a search index and enable highlighting

The following example shows how to enable highlighting when you create a search index. The index includes three fields: k (Keyword), t (Text), and n (Nested). The n field includes three sub-fields: nk (Keyword), nl (Long), and nt (Text). The highlighting feature is enabled for the t field and the nt sub-field.

client.createSearchIndex({
    tableName: "<TABLE_NAME>", // Set the table name.
    indexName: "<SEARCH_INDEX_NAME>", // Set the search index name.
    schema: {
        fieldSchemas: [
            {
                fieldName: "k",
                fieldType: TableStore.FieldType.KEYWORD, // Set the field name and field type.
                index: true, // Enable indexing.
                enableSortAndAgg: true, // Enable sorting and aggregation.
                isAnArray: false
            },
            {
                fieldName: "t",
                fieldType: TableStore.FieldType.TEXT,
                index: true,
                enableSortAndAgg: false,
                enableHighlighting: true, // Enable highlighting for the field.
                isAnArray: false,
            },
            {
                fieldName: "n",
                fieldType: TableStore.FieldType.NESTED,
                index: false,
                enableSortAndAgg: false,
                fieldSchemas: [
                    {
                        fieldName: "nk",
                        fieldType: TableStore.FieldType.KEYWORD,
                        index: true,
                        enableSortAndAgg: true,
                    },
                    {
                        fieldName: "nl",
                        fieldType: TableStore.FieldType.LONG,
                        index: true,
                        enableSortAndAgg: true,
                    },
                    {
                        fieldName: "nt",
                        fieldType: TableStore.FieldType.TEXT,
                        index: true,
                        enableSortAndAgg: false,
                        enableHighlighting: true, // Enable highlighting for the field.
                    },
                ]
            },
        ],
        indexSetting: { // The configuration options of the index.
            "routingFields": ["id"], // Only primary key columns can be set as routing fields.
            "routingPartitionSize": null
        },
        // indexSort is not supported for indexes that contain Nested fields. No pre-sorting is performed.
        // If you do not set indexSort, the data is sorted by primary key in ascending order by default.
        // indexSort: {
        //     sorters: [
        //         {
        //             primaryKeySort: {
        //                 order: TableStore.SortOrder.SORT_ORDER_ASC
        //             }
        //         },
        //         {
        //             fieldSort: {
        //                 fieldName: "Col_Keyword",
        //                 order: TableStore.SortOrder.SORT_ORDER_DESC // Set the sorting order for indexSort.
        //             }
        //         }
        //     ]
        // },
        timeToLive: 1000000, // Unit: seconds.
    }
}, function (err, data) {
    if (err) {
        console.log('error:', err);
        return;
    }
    console.log('success:',data);
});
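
Because only Text fields support highlighting, it can help to check a schema before sending the request. The following sketch is not part of the SDK: the helper name findInvalidHighlightFields is hypothetical, and the string type labels are placeholders so that the snippet runs on its own. With the SDK, pass TableStore.FieldType.TEXT as the textType argument.

```javascript
// Hypothetical helper: returns the names of fields that enable highlighting
// but are not Text fields. Sub-fields of Nested fields are checked recursively
// and reported as "parent.child".
function findInvalidHighlightFields(fieldSchemas, textType, prefix = "") {
    const invalid = [];
    for (const field of fieldSchemas) {
        const name = prefix + field.fieldName;
        if (field.enableHighlighting && field.fieldType !== textType) {
            invalid.push(name);
        }
        if (Array.isArray(field.fieldSchemas)) {
            invalid.push(...findInvalidHighlightFields(field.fieldSchemas, textType, name + "."));
        }
    }
    return invalid;
}

// Placeholder type labels, used here instead of the SDK enum.
const FieldType = { KEYWORD: "KEYWORD", TEXT: "TEXT", NESTED: "NESTED" };

const sampleSchemas = [
    { fieldName: "k", fieldType: FieldType.KEYWORD, enableHighlighting: true }, // Invalid.
    { fieldName: "t", fieldType: FieldType.TEXT, enableHighlighting: true },    // Valid.
    {
        fieldName: "n",
        fieldType: FieldType.NESTED,
        fieldSchemas: [
            { fieldName: "nk", fieldType: FieldType.KEYWORD, enableHighlighting: true }, // Invalid.
            { fieldName: "nt", fieldType: FieldType.TEXT, enableHighlighting: true }     // Valid.
        ]
    }
];

console.log(findInvalidHighlightFields(sampleSchemas, FieldType.TEXT)); // prints [ 'k', 'n.nk' ]
```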

References

After you create a search index, you can select a query type to perform multi-dimensional data queries. Search index query types include term query, terms query, match all query, match query, match phrase query, prefix query, range query, wildcard query, geo query, Boolean query, vector search, nested query, and exists query.
When you query data, you can perform sorting and paging, highlighting, or collapse (deduplication) operations on the result set.
After you create a search index, you can manage it as needed. Operations include dynamically modifying the schema, updating the search index configuration, listing search indexes, querying search index descriptions, and deleting a search index.
To perform data analytics, such as finding the maximum or minimum value, calculating a sum, or counting rows, you can use the statistical aggregation feature or the SQL query feature.
To quickly export data regardless of the order of the entire result set, you can use the parallel scan feature.