All Products
Search
Document Center

PolarDB:IMCI full-text index

Last Updated:Mar 28, 2026

PolarDB for MySQL's In-Memory Column Index (IMCI) supports full-text indexes built on inverted indexes. Once enabled, you can run millisecond-level text searches using MATCH...AGAINST syntax or accelerated LIKE queries — without a full table scan.

With IMCI full-text indexes, you can:

  • Search text columns directly using SQL — no separate search engine required

  • Choose from six tokenizers to match your language and query pattern (English, Chinese, N-gram, JSON, and more)

  • Accelerate existing LIKE '%keyword%' queries with no code changes

  • Verify index usage instantly by checking for the FtsTableScan operator in EXPLAIN output

For background on how IMCI full-text search works internally, see Analysis of Full-Text Retrieval Capabilities in Columnstore Indexes.

Version requirements

PolarDB for MySQL versionMinimum minor version
8.0.18.0.1.1.52
8.0.28.0.2.2.32

Prerequisites

Before you begin, ensure that you have:

  • A PolarDB for MySQL cluster that meets the version requirements above

  • A table with the column store enabled (COMMENT 'columnar=1')

  • The imci_enable_fts_query system variable set to ON for the session where you run queries

Choose a tokenizer

Select a tokenizer based on your data type and search pattern. Each tokenizer splits text differently and suits different query scenarios.

Tokenizertype valueBest for
token0 (default)English text or formatted strings — splits on spaces and punctuation
ngram1Fuzzy matching across any language — splits into fixed-length character chunks (requires len parameter)
jieba2Semantic Chinese search — dictionary-based tokenizer
ik3Chinese search — widely used open-source IK Analyzer
json4JSON columns — extracts content using a JSONPath expression (requires expr parameter)
whole5Exact-match queries (= or IN) — indexes the entire column as a single term

Not sure which to use? Preview how each tokenizer processes your text before creating the index:

CALL dbms_imci.fts_tokenize("I am PolarDB");
-- Result: ["i", "am", "polardb"]  (default: token, type=0)

CALL dbms_imci.fts_tokenize("I am PolarDB", "type=1");
-- Result: ["i ", " a", "am", "m ", " p", "po", "ol", "ar", "rd", "db"]  (ngram)

CALL dbms_imci.fts_tokenize("I am PolarDB", "type=2");
-- Result: ["PolarDB"]  (jieba, accurate mode)

CALL dbms_imci.fts_tokenize("I am PolarDB", "type=2,mode=1");
-- Result: ["polardb"]  (jieba, full mode)

Create a full-text index

Set the table-level COMMENT to columnar=1 and the column-level COMMENT to imci_fts(type=VALUE) to define an inverted index on that column.

Create the index at table creation time:

CREATE TABLE t1 (
    id    INT PRIMARY KEY,
    title VARCHAR(32) COMMENT "imci_fts(type=2)"
) CHARSET utf8mb4 COMMENT 'columnar=1';

Add or modify an index on an existing column:

ALTER TABLE t1 MODIFY title VARCHAR(32) COMMENT "imci_fts(type=2,mode=0)";
Important

Changing the column COMMENT may trigger a full rebuild of the inverted index. For large tables, run this operation during off-peak hours.

After the ALTER completes, IMCI builds the index in the background. Run queries only after the build finishes — check progress using the monitoring commands in Monitor index build status.

Remove an index:

ALTER TABLE t1 MODIFY title VARCHAR(255) COMMENT 'imci_fts(enable=0)';

Index configuration parameters

Use these parameters in the imci_fts(...) column COMMENT to control tokenizer behavior and build settings.

ParameterDefaultDescriptionApplicable tokenizers
type0Tokenizer type. See Choose a tokenizer.All
enable11: create the index. 0: delete the index.All
lenToken length for the N-gram tokenizer. Valid range: [2, 256).type=1 only
mode0Tokenizer mode. jieba (type=2): 0 = accurate (default), 1 = full, 2 = search engine. IK (type=3): 0 = smart (default), 1 = most fine-grained. JSON (type=4): 0 = array mode.type=2, 3, 4
score01: enable term frequency (TF) and document frequency (DF) calculation to support relevance-score sorting for MATCH...AGAINST. 0: disable (default).All
stop_word01: enable stop-word filtering. 0: disable (default).All
case_sensitive01: case-sensitive matching. 0: case-insensitive (default).All
seg_size0Inverted index segment size. 0: use the imci_fts_build_segment_size system variable. Other values: set the exact size.All
pack_cnt_min0Minimum column store data blocks per index build. 0: use imci_fts_build_packcnt_min.All
pack_cnt_max0Maximum column store data blocks per index build. 0: use imci_fts_build_packcnt_max.All

Run full-text queries

Enable full-text query support for the session before running queries:

SET imci_enable_fts_query = ON;

Use MATCH...AGAINST

MATCH...AGAINST is the primary syntax for full-text search. It delivers high performance and supports Boolean queries and relevance scoring.

-- Insert sample data
INSERT INTO t1 VALUES
    (16, 'polarDB full-text index feature title'),
    (17, 'database title performance optimization');

-- Find rows where the title column contains "title"
SELECT * FROM t1 WHERE MATCH(title) AGAINST("title");

Run EXPLAIN to confirm the query uses the full-text index. Look for the FtsTableScan operator:

EXPLAIN SELECT * FROM t1 WHERE MATCH(title) AGAINST("title") AND id > 10;
+----+------------------------+------+-----------------------------------------------------------------+
| ID | Operator               | Name | Extra Info                                                      |
+----+------------------------+------+-----------------------------------------------------------------+
|  1 | Select Statement       |      | IMCI Execution Plan (max_dop = 32, max_query_mem = 41230008320) |
|  2 | └─Compute Scalar       |      |                                                                 |
|  3 |   └─FILTER             |      | Cond: (t1.id > 10)                                              |
|  4 |     └─FtsTableScan     | t1   | Term: ("title") Fallback: (t1.title LIKE "%title%")             |
+----+------------------------+------+-----------------------------------------------------------------+

The Fallback field shows that for incremental data not yet indexed, the query automatically falls back to a LIKE scan to ensure complete results.

If FtsTableScan is absent from the output, the query is not using the full-text index. The most common cause is that imci_enable_fts_query is set to OFF. See FAQ for additional troubleshooting steps.

Accelerate LIKE queries

To accelerate LIKE '%keyword%' queries in existing application code without rewriting them, enable automatic conversion to MATCH...AGAINST:

SET imci_convert_like_to_match = ON;

After setting this, LIKE queries on indexed columns are converted automatically:

EXPLAIN SELECT * FROM t1 WHERE title LIKE "%title%";
+----+------------------------+------+-----------------------------------------------------------------+
| ID | Operator               | Name | Extra Info                                                      |
+----+------------------------+------+-----------------------------------------------------------------+
|  1 | Select Statement       |      | IMCI Execution Plan (max_dop = 32, max_query_mem = 41230008320) |
|  2 | └─Compute Scalar       |      |                                                                 |
|  3 |   └─FILTER             |      | Cond: (t1.title LIKE "%title%")                                 |
|  4 |     └─FtsTableScan     | t1   | Term: ("title") Fallback: (t1.title LIKE "%title%")             |
+----+------------------------+------+-----------------------------------------------------------------+

FtsTableScan in the output confirms the LIKE query was accelerated.

Fall back from MATCH to LIKE

If a full-text index does not exist on the column, enable automatic conversion of MATCH to LIKE so queries still return correct results:

SET imci_enable_query_fts_like = ON;
EXPLAIN SELECT * FROM t1 WHERE MATCH(title) AGAINST("title");
+----+----------------------+------+-----------------------------------------------------------------+
| ID | Operator             | Name | Extra Info                                                      |
+----+----------------------+------+-----------------------------------------------------------------+
|  1 | Select Statement     |      | IMCI Execution Plan (max_dop = 32, max_query_mem = 41230008320) |
|  2 | └─Compute Scalar     |      |                                                                 |
|  3 |   └─Table Scan       | t1   | Cond: (title LIKE "%title%")                                    |
+----+----------------------+------+-----------------------------------------------------------------+

The execution plan shows a full table scan. This guarantees correct results but does not use the index.

Monitor index build status

After creating or modifying a full-text index, track the build progress using these commands:

-- List all inverted indexes
SHOW imci indexes fulltext;
SELECT * FROM information_schema.imci_fts_indexes;

-- Filter to a specific table
SHOW imci indexes fulltext FOR [db_name].[table_name];
SELECT * FROM information_schema.imci_fts_indexes
    WHERE schema_name = '[db_name]' AND table_name = '[table_name]';

-- View index metadata
SELECT * FROM information_schema.imci_fts_index_metas
    WHERE schema_name = '[db_name]' AND table_name = '[table_name]' AND column_name = '[column_name]';

-- View segment data
SELECT * FROM information_schema.imci_fts_index_segs
    WHERE schema_name = '[db_name]' AND table_name = '[table_name]' AND column_name = '[column_name]';

-- View column store data
SELECT * FROM information_schema.imci_fts_index_packs
    WHERE schema_name = '[db_name]' AND table_name = '[table_name]' AND column_name = '[column_name]';

System variables

These system variables control global index build behavior and query conversion. Set global variables in the PolarDB console — they cannot be changed from the command line. Session-scoped variables can be set per connection.

VariableScopeDefaultDescription
imci_enable_ftsGlobalONAllow inverted indexes on column store nodes. Supports string and JSON data types.
imci_enable_fts_queryGlobal/SessionOFFEnable full-text index queries. Must be set to ON before running MATCH...AGAINST.
imci_fts_build_pack_cnt_minGlobal8Minimum column store data blocks per index build. Range: 0–8192. Set to 0 to pause the build.
imci_fts_build_pack_cnt_maxGlobal128Maximum column store data blocks per index build. Range: 0–8192.
imci_fts_build_segment_sizeGlobal536870912 (512 MB)Segment size for inverted index builds, in bytes. Range: 0–4294967295.
imci_fts_lru_cache_capacityGlobalDBNodeClassMemory × 10%LRU cache capacity for inverted index dictionaries. Range: [DBNodeClassMemory × 10%, DBNodeClassMemory × 50%].
imci_enable_fts_prunerGlobal/SessionONEnable pre-filtering optimization for inverted indexes.
imci_convert_like_to_matchGlobal/SessionOFFConvert LIKE queries to MATCH...AGAINST to use the inverted index. Values: ON, OFF, ONLY_MATCH (enable and ignore the original LIKE expression).
imci_enable_query_fts_likeGlobal/SessionOFFConvert MATCH...AGAINST to LIKE when no full-text index exists.
imci_enable_match_expr_fallbackGlobal/SessionONAllow queries to fall back to a slower execution path when the inverted index is unavailable or still building.

FAQ

My query isn't using the full-text index. What's wrong?

The most common cause is that imci_enable_fts_query is still set to OFF. Run SET imci_enable_fts_query = ON; and try again.

If the index is still not used, check the EXPLAIN output. If FtsTableScan is absent, the query pattern may not match the index, or the query optimizer chose a full table scan due to lower estimated cost.

Which tokenizer should I use?

Start with the default type=0 (token) for English or structured text. For Chinese content, use type=2 (jieba) in accurate mode for semantic search, or type=3 (IK Analyzer) as an alternative. For JSON columns, use type=4 (json) with the expr parameter to specify the JSONPath to index.

To confirm your choice before creating the index, preview tokenization results with dbms_imci.fts_tokenize. See Choose a tokenizer.

What is the difference between `LIKE` and `MATCH...AGAINST`?

LIKE '%keyword%' without a full-text index causes a full table scan. MATCH...AGAINST uses the inverted index for fast lookups and also supports Boolean queries and relevance scoring. For new applications, use MATCH...AGAINST directly. For existing code that uses LIKE, enable imci_convert_like_to_match to get index acceleration without rewriting queries.

What's next