In-memory column index (IMCI) accelerates complex, time-consuming SQL queries in hybrid transactional and analytical processing (HTAP) workloads. This guide walks you through the setup process and covers advanced options for fine-tuning IMCI to meet your requirements.
Prerequisites
Before you begin, ensure that you have:
A PolarDB for MySQL cluster
Permissions to add or remove nodes and modify cluster endpoint settings
Quick start
Complete the following steps to enable HTAP query acceleration on your cluster.
Step 1: Add a read-only IMCI node
Log on to the PolarDB console, select the region where your cluster is located, and find the target cluster in the Clusters list. In the Actions column, select Add/Remove Node to add a read-only IMCI node.

If you already added read-only IMCI nodes when purchasing the cluster, skip this step. Your cluster must also meet the version requirements for IMCI. For details, see Add a read-only column store node.
Step 2: Configure request routing
Choose how to route queries between row store nodes and IMCI nodes.
Automatic request distribution among row store and IMCI nodes
PolarDB automatically routes each read request to either a read-only IMCI node or a read-only row store node based on the estimated execution cost (number of rows scanned). Use this mode when a single application sends both OLAP and OLTP requests.
On the cluster details page, go to the Database Connections section. For the Cluster Endpoint, set Transactional/Analytical Processing Splitting to On.

This setting takes effect on new connections only. Reconnect to the cluster after enabling it.
DMS connects through the primary endpoint by default. To use IMCI with DMS, change the connection to use the cluster endpoint.
The default routing threshold is 50,000 rows scanned. Adjust this threshold based on your workload.
Manual request distribution among row store and IMCI nodes
If different applications handle OLAP and OLTP requests separately, create a custom endpoint that includes only Read-only IMCI Nodes as service nodes. Point your OLAP application to this endpoint.
On the cluster details page, go to the Database Connections section and create a custom endpoint. Set the service nodes to Read-only IMCI Nodes only.

Step 3: Add IMCIs to your tables
All columns referenced in a SQL statement must be covered by an IMCI to accelerate that query.
Manually add IMCIs
IMCI supports the following DDL statements to add or remove column indexes on your tables:
| DDL syntax | Example |
|---|---|
| Create IMCIs when creating a table | -- Add IMCI for a specific column<br>CREATE TABLE <table_name>(<br> <column_name_1> INT COMMENT 'COLUMNAR=1',<br> <column_name_2> VARCHAR(100)<br>) ENGINE InnoDB;<br><br>-- Add IMCI for the entire table<br>CREATE TABLE <table_name>(<br> <column_name_1> INT,<br> <column_name_2> VARCHAR(100)<br>) ENGINE InnoDB COMMENT 'COLUMNAR=1'; |
| Dynamically add or delete IMCIs | -- Add IMCI for a specific column<br>ALTER TABLE <table_name> MODIFY COLUMN <column_name_1> INT COMMENT 'COLUMNAR=1';<br><br>-- Add IMCI for the entire table<br>ALTER TABLE <table_name> COMMENT 'COLUMNAR=1'; |
| Manage table-level IMCIs | -- Add IMCI for an entire table<br>CREATE COLUMNAR INDEX ON <db_name>.<table_name>;<br><br>-- Add IMCI for an entire database<br>CREATE COLUMNAR INDEX FOR TABLES IN <db_name>; |
To preserve existing table comments, appendCOLUMNAR=1before them:ALTER TABLE <table_name> COMMENT 'COLUMNAR=1 <original comment>';
When adding IMCIs in batches at the database or table level, the table comment is automatically updated to 'COLUMNAR=1 <original comment>'.Automatically add IMCIs
The AutoIndex feature monitors slow queries and automatically creates IMCIs to accelerate them. As workloads change, AutoIndex continuously adjusts the IMCI strategy to maintain optimal performance — without requiring manual query analysis.
On the cluster details page, enable the AutoIndex feature:

Step 4 (Optional): Check IMCI build progress
After adding IMCIs manually, verify that the build is complete before expecting query acceleration. Run the following query:
SELECT * FROM INFORMATION_SCHEMA.IMCI_ASYNC_DDL_STATS;The output looks similar to this:
+-------------+------------+---------------------+---------------------+---------------------+--------------+------------------+-----------------+-------------+-------------+-------------+------------+--------------+-----------+-------------------+-----------------+
| SCHEMA_NAME | TABLE_NAME | CREATED_AT | STARTED_AT | FINISHED_AT | STATUS | APPROXIMATE_ROWS | SCANNED_ROWS | SCAN_SECOND | SORT_ROUNDS | SORT_SECOND | BUILD_ROWS | BUILD_SECOND | AVG_SPEED | SPEED_LAST_SECOND | ESTIMATE_SECOND |
+-------------+------------+---------------------+---------------------+---------------------+--------------+------------------+-----------------+-------------+-------------+-------------+------------+--------------+-----------+-------------------+-----------------+
| tpch | lineitem | 2024-10-21 13:44:02 | 2024-10-21 13:44:02 | 2024-10-21 13:50:11 | Safe to read | 590446240 | 600037902(100%) | 369 | 0 | 0 | 0(0%) | 0 | 1625058 | 0 | 0 |
+-------------+------------+---------------------+---------------------+---------------------+--------------+------------------+-----------------+-------------+-------------+-------------+------------+--------------+-----------+-------------------+-----------------+
1 row in set, 1 warning (0.00 sec)Check the STATUS column to determine build progress:
`Safe to read`: The IMCI is fully built and queries on this table can be accelerated.
Any other status: The build is still in progress. Wait and re-run the query.
For more information, see View DDL execution speed and build progress for IMCIs.
Step 5 (Optional): Verify that queries use IMCI
Run EXPLAIN on your query to confirm it uses the IMCI execution path:
EXPLAIN SELECT COUNT(*) FROM t1 WHERE t1.a > 1;IMCI execution plan (horizontal tree format)
If the query uses IMCI, the output shows a horizontal tree structure:
+----+----------------------------+----------+--------+--------+-----------------------------------------------------------------------------+
| ID | Operator | Name | E-Rows | E-Cost | Extra Info |
+----+----------------------------+----------+--------+--------+-----------------------------------------------------------------------------+
| 1 | Select Statement | | | | IMCI Execution Plan (max_dop = 4, max_query_mem = 858993459) |
| 2 | └─Sort | | | | Sort Key: revenue DESC,o_orderdate ASC |
| 3 | └─Hash Groupby | | | | Group Key: (lineitem.L_ORDERKEY, orders.O_ORDERDATE, orders.O_SHIPPRIORITY) |
| 4 | └─Hash Join | | | | Join Cond: orders.O_ORDERKEY = lineitem.L_ORDERKEY |
| 5 | ├─Hash Join | | | | Join Cond: customer.C_CUSTKEY = orders.O_CUSTKEY |
| 6 | │ ├─Table Scan | customer | | | Cond: (C_MKTSEGMENT = "BUILDING") |
| 7 | │ └─Table Scan | orders | | | Cond: (O_ORDERDATE < 03/24/1995) |
| 8 | └─Table Scan | lineitem | | | Cond: (L_SHIPDATE > 03/24/1995) |
+----+----------------------------+----------+--------+--------+-----------------------------------------------------------------------------+
8 rows in set (0.01 sec)The key signal is the Extra Info field on row 1: IMCI Execution Plan. This confirms the query is running on IMCI.
Row store execution plan
If the query uses the row store instead, the output follows the standard MySQL tabular format without the IMCI Execution Plan header:
+----+-------------+----------+------------+------+--------------------+------------+---------+-----------------------------+--------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+--------------------+------------+---------+-----------------------------+--------+----------+----------------------------------------------+
| 1 | SIMPLE | customer | NULL | ALL | PRIMARY | NULL | NULL | NULL | 147630 | 10.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | orders | NULL | ref | PRIMARY,ORDERS_FK1 | ORDERS_FK1 | 4 | tpch100g.customer.C_CUSTKEY | 14 | 33.33 | Using where |
| 1 | SIMPLE | lineitem | NULL | ref | PRIMARY | PRIMARY | 4 | tpch100g.orders.O_ORDERKEY | 4 | 33.33 | Using where |
+----+-------------+----------+------------+------+--------------------+------------+---------+-----------------------------+--------+----------+----------------------------------------------+
3 rows in set, 1 warning (0.00 sec)For more information, see Check whether a query can use IMCIs.
Auxiliary tools
When troubleshooting IMCI coverage for a specific query, use the following built-in stored procedures to check coverage and generate the DDL needed to add missing IMCIs.
Check IMCI coverage for a query:
dbms_imci.check_columnar_index('<query_string>');See Check whether an IMCI covers all columns in an SQL statement.Generate DDL to add a missing IMCI:
dbms_imci.columnar_advise('<query_string>');anddbms_imci.columnar_advise_by_columns('<query_string>');See Get the DDL statement to create an IMCI.Batch-generate DDL for multiple queries:
dbms_imci.columnar_advise_begin();,dbms_imci.columnar_advise_show();, anddbms_imci.columnar_advise_end();See Batch-get DDL statements to create IMCIs.
FAQ
Advanced usage
After completing the quick start, use the following features to optimize IMCI for your workload.
| Feature | Description |
|---|---|
| Configure sort keys for IMCIs | IMCI organizes data into row groups. In each row group, column data blocks are built in parallel based on the primary key order of the raw row data, which results in an overall unordered state. Adding a sort key reorders the column data blocks to improve query performance for specific access patterns. |
| Analyze JSON-formatted data | IMCI integrates column-oriented storage for JSON, virtual columns, Instant DDL, and extended maximum column counts for large-scale structured and semi-structured data analysis. |
| Accelerate ETL with IMCI | The ETL feature enables IMCI on read-write (RW) nodes. SELECT statements issued on RW nodes are routed to read-only IMCI nodes for acceleration, and the results are transferred back to the RW node through the internal network for writing to the target table. |
| Enable serverless on a read-only IMCI node | Serverless automatically scales the IMCI node up during peak hours and down during off-peak hours to reduce costs. |
| Accelerate wide table queries with hybrid plans | A hybrid plan uses both column indexes and row indexes in a single query. IMCI-eligible sub-plans run on IMCI to produce intermediate results containing only primary key information, then the InnoDB primary index retrieves the remaining columns. |
| Scale out with multi-node MPP | For complex queries on massive datasets where a single IMCI node is insufficient, multi-node parallel processing distributes the query across multiple IMCI nodes. |
Related information
For a deeper understanding of how IMCI works: