CREATE INDEX builds an index on one or more columns of a LindormTable wide table. Use it to enable non-primary-key lookups, full-text search, and analytical queries without full-table scans.
Applicability
CREATE INDEX applies only to LindormTable. No minimum version is required for secondary indexes.
To create a search index or a columnstore index using CREATE INDEX, use Lindorm SQL V2.6.1 or later. To check your Lindorm SQL version, see SQL version guide.
Before you begin
Index building for secondary indexes, columnstore indexes, and search indexes requires data lookups, which generate read operations. If you enabled hot and cold data separation for your instance, monitor throttling on cold storage. Throttled reads on cold storage slow down index building and can cause backpressure on write operations.
Syntax
CREATE INDEX [IF NOT EXISTS] [index_identifier]
[USING {KV | SEARCH | COLUMNAR}]
ON table_identifier '(' index_key_expression ')'
[INCLUDE '(' column_identifier (, column_identifier)* ')']
[PARTITION BY partition_definition]
[{ASYNC | SYNC}]
[WITH '(' index_options ')']Sub-productions:
index_key_expression ::= index_key_definition (, index_key_definition)*
| wildcard_string_literal
index_key_definition ::= column_identifier [DESC]
| column_identifier '(' option_definition (, option_definition)* ')'
| function_expression
function_expression ::= Z-ORDER '(' column_identifier (, column_identifier)* ')'
| S2 '(' column_identifier, level ')'
| CAST '(' column_identifier AS type ')'
| MD5 '(' column_identifier ')'
| SHA256 '(' column_identifier ')'
partition_definition ::= {RANGE TIME} '(' column_identifier ')' [PARTITIONS number_literal]
| HASH '(' column_identifier (, column_identifier)* ')' [PARTITIONS number_literal]
| ENUMERABLE '(' column_identifier (, column_identifier)* ')'
index_options ::= option_definition (, option_definition)*
option_definition ::= option_identifier = string_literalDifferences
The supported syntax elements vary by index type.
Syntax element | Secondary index | Search index |
✔ | 0 | |
✔ | 0 | |
0 | ✖ | |
✖ | 0 | |
Index building mode (ASYNC|SYNC) Important Only LindormTable 2.6.3 and later support the | 0 | 0 |
0 | 0 |
| Syntax element | Secondary index | Search index |
|---|---|---|
USING keyword | Optional (KV is the default) | Optional |
| Index key expression | Required | Required |
INCLUDE | Optional | Not supported |
PARTITION BY | Not supported | Optional |
ASYNC / SYNC | Optional | Optional |
WITH index properties | Optional | Optional |
Usage notes
You can create a maximum of 3 secondary indexes and 1 search index per wide table.
Index type (USING)
Use the USING keyword to specify the index type. If omitted, a secondary index (KV) is created by default.
USING value | Index type | Use when |
|---|---|---|
KV (or omit) | Secondary index | Non-primary-key equality and range lookups. An instance supports at most 8 concurrent secondary index build tasks; a ninth request fails immediately |
SEARCH | Search index | Full-text search, fuzzy queries, multi-dimensional filtering, aggregation, and sorting with pagination. No limit on concurrent build tasks. Requires enabling the search index feature first — search nodes and Lindorm Tunnel Service (LTS) nodes incur fees. See Enable search indexes. The index key must include at least one non-primary-key column. Supported data types: all basic types except DATE, TIME, and DECIMAL. See Basic data types |
Index key expression (index_key_expression)
Define one or more columns as index keys. An index with multiple index keys is a composite index.
Index key definition (index_key_definition)
For secondary indexes, each key definition can be:
A column name, optionally with
DESCfor descending orderA function expression (Z-ORDER, S2, CAST, MD5, or SHA256)
For search indexes, each key definition can be:
A column name with optional index key properties
The wildcard constant
*to index all existing columns
Search index key properties (option_definition)
Specify properties per column using the syntax column(property=value, ...). For example: c3(type=text, analyzer=ik). You can also specify these properties when adding an index column using the ALTER INDEX statement.
| Property | Type | Description |
|---|---|---|
indexed | STRING | Specifies whether to build an index for this column. An inverted index is built for string columns; a BKD-Tree numeric index for numeric columns. Valid values: true (default) or false |
rowStored | STRING | Specifies whether to store the raw column value. Valid values: true or false (default) |
columnStored | STRING | Specifies whether to use column store to accelerate sorting and analysis. Valid values: true (default) or false |
type | STRING | Set to text for tokenized full-text fields. Must be used with analyzer |
analyzer | STRING | Tokenizer for type=text columns. Valid values: standard, english, ik, whitespace, comma. Must be used with type |
mapping | STRING | Custom index key properties as a JSON string, compatible with Elasticsearch mapping syntax. Applies only to Elasticsearch-compatible versions. Overrides all other properties for this key |
Secondary index function expressions (function_expression)
When you create a secondary index, you can specify a function expression as the index key.
MD5 and SHA256 function expressions require LindormTable 2.6.7.5 or later. If you cannot upgrade, contact Lindorm technical support on DingTalk at s0s3eg3.
| Function | Syntax | Use when |
|---|---|---|
| Z-ORDER | Z-ORDER(col1, col2, ...) | Spatio-temporal secondary index; columns must be a spatio-temporal data type. See Spatio-temporal indexes |
| S2 | S2(col, level) | S2 grid index for POLYGON or MULTIPOLYGON columns; level ranges from 1 to 30. See S2 index function |
| CAST | CAST(col AS type) | Index on the result of a data type conversion. See Basic data types |
| MD5 | MD5(col) | Index on the MD5-encoded value of a VARCHAR column. Requires LindormTable 2.6.7.5+. See MD5 function |
| SHA256 | SHA256(col) | Index on the SHA256-encoded value of a VARCHAR column. Requires LindormTable 2.6.7.5+. See SHA256 function |
Wildcard constant (wildcard_string_literal)
Only search indexes support the wildcard constant *. Use * to build an index on all existing columns at the time the statement runs.
CREATE INDEX IF NOT EXISTS idx5 USING SEARCH ON test(*);Columns added after the statement runs are not automatically indexed. Add them manually using
ALTER INDEX.Dynamic columns are not included. See Dynamic columns.
Included columns (INCLUDE)
The INCLUDE clause adds non-key columns from the primary table into the index table, forming a covering index. Queries that hit the index can retrieve included column values without looking up the primary table, improving read performance.
For secondary indexes, use theWITHkeyword to include dynamic columns by settingINDEX_COVERED_TYPE. See Secondary indexes.
Index partition (PARTITION BY)
Only search indexes support index partitioning. The server splits and stores data automatically. At query time, the system applies partition pruning to skip irrelevant partitions.
Supported partition types: RANGE and HASH. See Partitioned indexes.
Index building mode (ASYNC | SYNC)
Specify ASYNC or SYNC to control when the statement returns.
| Mode | Behavior | Statement blocks? | Index usable immediately after the statement returns? | Use when |
|---|---|---|---|---|
ASYNC (default from LindormTable 2.6.1) | Returns immediately after the build task starts | No | No — the build continues in the background | Production; writes continue unaffected while the index builds |
SYNC | Returns only after the build task completes | Yes | Yes | Schema migration scripts; testing; cases where the index must be ready before the next step |
SYNC requires LindormTable 2.6.3 or later.
The following table shows which modes each index type supports.twosecondary indexes and search indexestwo
| Building mode | Secondary index | Search index |
|---|---|---|
ASYNC (default from LindormTable 2.6.1) | Supported | Supported |
SYNC (requires LindormTable 2.6.3+) | Supported | Supported |
Index building mode | Secondary index | Search index |
ASYNC Important Starting from LindormTable 2.6.1, the default index building mode for the | 0 | 0 |
SYNC Important Only LindormTable 2.6.3 and later support synchronous index building. | 0 | ✔ |
Index properties (WITH)
Secondary index properties
| Property | Type | Description |
|---|---|---|
COMPRESSION | STRING | Compression algorithm for the index table. Valid values: SNAPPY, ZSTD, LZ4 |
INDEX_COVERED_TYPE | STRING | Redundancy method for included columns. COVERED_ALL_COLUMNS_IN_SCHEMA: includes all predefined non-primary-key columns. COVERED_DYNAMIC_COLUMNS: includes all predefined non-primary-key columns and dynamic columns. When set, omit the INCLUDE clause. Before including dynamic columns, enable the dynamic column feature. See Dynamic columns |
STARTKEY | STRING | Start key for the index table. Cannot be set for timestamp or spatial data type columns |
ENDKEY | STRING | End key for the index table. Cannot be set for timestamp or spatial data type columns |
NUMREGIONS | INTEGER | Number of pre-partitions for the index table. Cannot be set for timestamp or spatial data type columns |
Search index properties
| Property | Type | Description |
|---|---|---|
indexState | STRING | Initial state of the index. Valid values: ACTIVE (available), INACTIVE (unavailable), DISABLED (disabled) |
numShards | INTEGER | Number of shards. Default: twice the number of search nodes. Keep each shard between 30 million–100 million rows and 30–50 GB. A shard exceeding 2 billion rows affects system stability. Plan the shard count before creating production indexes. For time-series data with significant volume growth (such as orders or logs), use a time-partitioned index instead |
RANGE_TIME_PARTITION_START | INTEGER | Days before index creation to start creating partitions, for historical data. If historical data has a timestamp earlier than the partition start time, an error occurs. Required when creating a time-partitioned index |
RANGE_TIME_PARTITION_INTERVAL | INTEGER | Interval in days between new partitions. For example, 7 creates a new partition every week. Required when creating a time-partitioned index |
RANGE_TIME_PARTITION_TTL | INTEGER | Data retention period in days. For example, 180 retains six months of data and automatically clears older partitions. If not set, data is never cleared. Required when creating a time-partitioned index |
RANGE_TIME_PARTITION_MAX_OVERLAP | INTEGER | Maximum allowed future timestamp offset in days. Default: 1 day |
RANGE_TIME_PARTITION_FIELD_TIMEUNIT | LONG | Unit of the time partition field. Default: milliseconds (ms). If the unit is seconds (s), the field value must be 10 digits. If the unit is milliseconds (ms), it must be 13 digits |
RANGE_TIME_PARTITION_CHS | INTEGER | Hot/cold data boundary in seconds. For example, 864000 archives data older than 10 days to cold storage. If not set, hot and cold data separation is not enabled for this index |
INDEX_SETTINGS | STRING | Custom index properties as a JSON string, compatible with Elasticsearch index settings syntax. Applies only to Elasticsearch-compatible versions |
SOURCE_SETTINGS | STRING | Raw data storage policy for index columns, as a JSON string compatible with Elasticsearch _source settings. By default, search indexes do not store raw column data. Configure this parameter only when you need direct data queries through the search engine visualization UI. See data queries. Supported parameters: enabled (Boolean; true stores all columns, false stores none; cannot be used with includes or excludes), includes (string array; columns whose raw data to store; supports the * wildcard), excludes (string array; columns whose raw data to exclude; supports the * wildcard) |
Examples
All examples use the following primary table:
CREATE TABLE test (
p1 VARCHAR NOT NULL,
p2 INTEGER NOT NULL,
c1 BIGINT,
c2 DOUBLE,
c3 VARCHAR,
c4 TIMESTAMP,
c5 GEOMETRY(POINT),
PRIMARY KEY(p1, p2)
) WITH (CONSISTENCY = 'strong', MUTABILITY='MUTABLE_LATEST');Secondary indexes
Build an index asynchronously
The default mode is ASYNC. The statement returns immediately.
CREATE INDEX idx1 ON test(c1 DESC) INCLUDE(c3, c4) WITH (COMPRESSION='ZSTD');Verify the result:
SHOW INDEX FROM test;Create a composite index synchronously
CREATE INDEX idx1 ON test(c1, c2, c3) INCLUDE(c4) SYNC WITH (COMPRESSION='ZSTD');Verify the result:
SHOW INDEX FROM test;Create a spatio-temporal secondary index
CREATE INDEX idx ON roads (Z-ORDER(g1));
CREATE INDEX idt ON roads (Z-ORDER(g1, t));Include all predefined columns
CREATE INDEX idx1 ON test(c4 DESC) WITH (INDEX_COVERED_TYPE='COVERED_ALL_COLUMNS_IN_SCHEMA');Verify the result:
SHOW INDEX FROM test;Include all dynamic columns
CREATE INDEX idx1 ON test(c4 DESC) WITH (INDEX_COVERED_TYPE='COVERED_DYNAMIC_COLUMNS');Verify the result:
SHOW INDEX FROM test;Set pre-partitions for the index table
Create an index with 32 pre-partitions.
CREATE INDEX idx1 ON test(c4 DESC) INCLUDE(c5, c6) WITH (NUMREGIONS='32');Verify the result:
SHOW INDEX FROM test;Set start and end keys with pre-partitions
Create an index table with 32 pre-partitions between 11111111 and 9999999.
CREATE INDEX idx1 ON test(c3 DESC) INCLUDE(c5, c6)
WITH (NUMREGIONS='32', STARTKEY='11111111', ENDKEY='9999999');Verify the result:
SHOW INDEX FROM test;Create a Z-ORDER secondary index
CREATE INDEX idx1 ON test(Z-ORDER(c5));Verify the result:
SHOW INDEX FROM test;Create an S2 grid secondary index
S2 indexes only support asynchronous building. Run BUILD INDEX to trigger the build after creating the index.
CREATE INDEX idx1 ON test(S2(c5, 10));
BUILD INDEX s2_idx ON test;Verify the result:
SHOW INDEX FROM test;Index a column after a data type conversion
Convert column c3 to INTEGER, then create a secondary index on the result.
CREATE INDEX idx1 ON test(CAST(c3 AS INTEGER));Verify the result:
SHOW INDEX FROM test;Search indexes
Build a search index asynchronously
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test(p1, p2, c1, c2, c3);Verify the result:
SHOW INDEX FROM test;To monitor build progress, see View the full build progress of a search index.
Index all columns
If no column properties are specified, default values apply.
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test('*');To monitor build progress, see View the full build progress of a search index.
Add index key properties
Index all columns and configure column c3 for IK tokenization:
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test('*', c3(type=text, analyzer=ik, indexed=true));To use Elasticsearch-compatible custom mapping for column c3:
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test('*', c3(mapping='{
"type": "text",
"analyzer": "ik_max_word"
}'));The mapping parameter applies only to Elasticsearch-compatible versions and overrides all other properties for the column.
To monitor build progress, see View the full build progress of a search index.
Set the index state and shard count
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test(c1, c3(type=text, analyzer=ik))
WITH (indexState=ACTIVE, numShards=4);To monitor build progress, see View the full build progress of a search index.
Set custom index settings
Create a search index with ZSTD compression and a 10-second refresh interval:
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test(c1, c3(type=text, analyzer=ik))
WITH (indexState=ACTIVE, INDEX_SETTINGS='{
"index": {
"codec": "zstd",
"refresh_interval": "10s"
}
}');To monitor build progress, see View the full build progress of a search index.
Create a time-partitioned search index
Partition by column c4, starting from 30 days ago, with a new partition every 7 days and a 90-day retention period.
CREATE INDEX IF NOT EXISTS idx2 USING SEARCH ON test(c1, c2, c3, c4)
PARTITION BY RANGE TIME(c4) PARTITIONS 16
WITH (
indexState=ACTIVE,
RANGE_TIME_PARTITION_START='30',
RANGE_TIME_PARTITION_INTERVAL='7',
RANGE_TIME_PARTITION_TTL='90',
RANGE_TIME_PARTITION_MAX_OVERLAP='90'
);To monitor build progress, see View the full build progress of a search index.
Store raw data in the search index
By default, search indexes filter data but do not store raw column values. Enable raw data storage to query directly through the search engine.
Store all index columns:
CREATE INDEX idx2 USING SEARCH ON test(c1, c2, c3, c4)
WITH (SOURCE_SETTINGS='{
"enabled": true
}');Verify the result:
SHOW INDEX FROM test;Store a subset of columns (c2, c3, c4, excluding c1):
CREATE INDEX idx2 USING SEARCH ON test(c1, c2, c3, c4)
WITH (SOURCE_SETTINGS='{
"includes": ["c*"],
"excludes": ["c1"]
}');Verify the result:
SHOW INDEX FROM test;To monitor build progress, see View the full build progress of a search index.