In-memory column index (IMCI) provides a simple solution to accelerate complex and time-consuming SQL statements in your business. Refer to Quick start to configure the HTAP load processing capability of your cluster. Additionally, refer to Advanced usage to customize IMCI to meet specific business requirements.
Quick start
1. Add a read-only IMCI node
Log on to the PolarDB console, select the region where your cluster is located, and find the target cluster in the Clusters list. In the Actions column, select Add/Remove Node to add a read-only IMCI node.
If you have already added read-only IMCI nodes when purchasing the cluster, skip this step.
Your cluster must meet the version requirements for IMCI. For more information, see Add a read-only column store node.
2. Configure request distribution among row store and IMCI nodes
You can choose automatic or manual request distribution to use IMCI.
Automatic request distribution among row store and IMCI nodes: If your application sends requests to access both OLAP and OLTP services, you can configure automatic distribution of read requests based on the estimated execution cost (number of rows scanned by a request) to route them to either read-only IMCI nodes or read-only row store nodes.
Manual request distribution among row store and IMCI nodes: If different applications are used to send requests for OLAP and OLTP services, you can configure different cluster endpoints for the applications, and then associate row store and IMCI nodes with different cluster endpoints to distribute the requests.
Automatic request distribution among row store and IMCI nodes
Go to the cluster details page, in the Database Connections section, set Transactional/Analytical Processing Splitting of the Cluster Endpoint to On.
After you turn on this feature, it takes effect on new connections. Reconnect to the cluster after you enable the feature.
DMS uses the primary endpoint by default. If you use DMS to connect to a cluster, you must change the connection method to using a cluster endpoint.
The default estimated execution cost (number of rows scanned) for automatic request distribution is 50,000. You can adjust this parameter based on your actual business requirements.
Manual request distribution among row store and IMCI nodes
Go to the cluster details page, in the Database Connections section, create a custom endpoint. The cluster can contain only Read-only IMCI Nodes as service nodes.
3. Add IMCIs
You can use manual or automatic methods to add IMCIs to your business tables.
All columns involved in a SQL statement must be covered by the IMCI to accelerate the query.
Manually add IMCIs
IMCI provides comprehensive DDL statements to support adding or removing IMCIs for your business tables:
DDL Syntax | Examples |
| |
Execute DDL statements to dynamically create and delete an IMCI | |
|
When adding IMCIs, you can keep the original comments. For example:
ALTER TABLE <table_name> COMMENT 'COLUMNAR=1 <original comment>';.By default, when IMCIs are added in batches at the database/table level, the table comment is modified to
'COLUMNAR=1 <original comment>'.
Automatically add IMCIs
The automatic acceleration (AutoIndex) feature provided by IMCI can automatically create IMCIs based on your slow SQL statements, significantly improving the execution speed of slow SQL statements without requiring you to deeply understand each slow SQL statement for optimization. As the application workload changes, the automatic IMCI acceleration feature continuously monitors and adjusts the IMCI strategy to ensure that the PolarDB cluster maintains optimal performance.
You only need to Enable the automatic IMCI-based query acceleration feature on the PolarDB cluster details page:

(Optional) 4. Check the IMCI build progress
After manually adding IMCI, you need to view DDL execution speed and build progress for IMCIs. After the IMCI build is complete, the queries can be accelerated.
SELECT * FROM INFORMATION_SCHEMA.IMCI_ASYNC_DDL_STATS;For example, you can confirm whether the IMCI has been built by checking if the STATUS is Safe to read.
+-------------+------------+---------------------+---------------------+---------------------+--------------+------------------+-----------------+-------------+-------------+-------------+------------+--------------+-----------+-------------------+-----------------+
| SCHEMA_NAME | TABLE_NAME | CREATED_AT | STARTED_AT | FINISHED_AT | STATUS | APPROXIMATE_ROWS | SCANNED_ROWS | SCAN_SECOND | SORT_ROUNDS | SORT_SECOND | BUILD_ROWS | BUILD_SECOND | AVG_SPEED | SPEED_LAST_SECOND | ESTIMATE_SECOND |
+-------------+------------+---------------------+---------------------+---------------------+--------------+------------------+-----------------+-------------+-------------+-------------+------------+--------------+-----------+-------------------+-----------------+
| tpch | lineitem | 2024-10-21 13:44:02 | 2024-10-21 13:44:02 | 2024-10-21 13:50:11 | Safe to read | 590446240 | 600037902(100%) | 369 | 0 | 0 | 0(0%) | 0 | 1625058 | 0 | 0 |
+-------------+------------+---------------------+---------------------+---------------------+--------------+------------------+-----------------+-------------+-------------+-------------+------------+--------------+-----------+-------------------+-----------------+
1 row in set, 1 warning (0.00 sec)(Optional) 5. Check if SQL statements use IMCI
The IMCI execution plan is output in a horizontal tree format, which is significantly different from the row store execution plan output format. You can use the EXPLAIN statement to determine whether to use IMCIs to accelerate SQL queries. See the following examples:
IMCI execution plan (horizontal tree format)
+----+----------------------------+----------+--------+--------+-----------------------------------------------------------------------------+
| ID | Operator | Name | E-Rows | E-Cost | Extra Info |
+----+----------------------------+----------+--------+--------+-----------------------------------------------------------------------------+
| 1 | Select Statement | | | | IMCI Execution Plan (max_dop = 4, max_query_mem = 858993459) |
| 2 | └─Sort | | | | Sort Key: revenue DESC,o_orderdate ASC |
| 3 | └─Hash Groupby | | | | Group Key: (lineitem.L_ORDERKEY, orders.O_ORDERDATE, orders.O_SHIPPRIORITY) |
| 4 | └─Hash Join | | | | Join Cond: orders.O_ORDERKEY = lineitem.L_ORDERKEY |
| 5 | ├─Hash Join | | | | Join Cond: customer.C_CUSTKEY = orders.O_CUSTKEY |
| 6 | │ ├─Table Scan | customer | | | Cond: (C_MKTSEGMENT = "BUILDING") |
| 7 | │ └─Table Scan | orders | | | Cond: (O_ORDERDATE < 03/24/1995) |
| 8 | └─Table Scan | lineitem | | | Cond: (L_SHIPDATE > 03/24/1995) |
+----+----------------------------+----------+--------+--------+-----------------------------------------------------------------------------+
8 rows in set (0.01 sec)Row store execution plan
+----+-------------+----------+------------+------+--------------------+------------+---------+-----------------------------+--------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+--------------------+------------+---------+-----------------------------+--------+----------+----------------------------------------------+
| 1 | SIMPLE | customer | NULL | ALL | PRIMARY | NULL | NULL | NULL | 147630 | 10.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | orders | NULL | ref | PRIMARY,ORDERS_FK1 | ORDERS_FK1 | 4 | tpch100g.customer.C_CUSTKEY | 14 | 33.33 | Using where |
| 1 | SIMPLE | lineitem | NULL | ref | PRIMARY | PRIMARY | 4 | tpch100g.orders.O_ORDERKEY | 4 | 33.33 | Using where |
+----+-------------+----------+------------+------+--------------------+------------+---------+-----------------------------+--------+----------+----------------------------------------------+
3 rows in set, 1 warning (0.00 sec)Auxiliary tools
When using IMCI to query complex SQL statements, you need to check whether all columns in the SQL statement are covered by IMCIs. If you find columns that are not covered by IMCI, you can obtain the DDL statement used to create an IMCI, or batch obtain the DDL statements used to create IMCIs. After executing the obtained DDL statements, ensure that all columns in the SQL statement are covered by IMCIs for query acceleration.
The built-in stored procedures in PolarDB clusters are as follows:
The stored procedure to check whether an IMCI is created for a table in an SQL statement:
dbms_imci.check_columnar_index('<query_string>');.The stored procedures to obtain the DDL statement used to create an IMCI:
dbms_imci.columnar_advise('<query_string>');anddbms_imci.columnar_advise_by_columns('<query_string>');.The stored procedures to batch obtain the DDL statements used to create IMCIs:
dbms_imci.columnar_advise_begin();,dbms_imci.columnar_advise_show();, anddbms_imci.columnar_advise_end();.
FAQ
For more information, see IMCI FAQ.
Advanced usage
Use the following features to optimize the use of IMCI.
Feature | Description |
The basic organizational unit of IMCI data is the row group. In each row group, each column is packed into column data blocks. The column data blocks are built in parallel based on the primary key order of the raw row data, which results in an overall unordered state. To improve query performance, you can configure a sort key to sort the data of the column data blocks. | |
IMCI integrates a series of related features such as Column-oriented storage for JSON-formatted data, Virtual columns, Instant DDL, and Extended maximum number of columns in a table to handle massive structured and semi-structured data analysis scenarios. | |
The ETL (Extract Transform Load) feature allows you to use IMCI on read-write (RW) nodes. The | |
Enable the serverless feature on a read-only column store node | The Serverless feature automatically scales the cluster based on business workload. A cluster is automatically scaled up during peak hours to handle traffic spikes. During off-peak hours, the cluster is automatically scaled down to reduce costs. |
Hybrid Plan refers to a query method that uses both column indexes and row indexes in the same query statement. Hybrid Plan significantly improves the speed of wide table queries. In the execution plan, statements suitable for IMCIs are executed through IMCIs to obtain intermediate results that only contain primary key information. Finally, the required column information is queried and output by combining the primary key with the InnoDB primary index. | |
For complex queries on massive data, a single read-only IMCI node may no longer meet performance requirements. You can use multi-node parallel processing for query acceleration. |
Related information
If you are interested in how IMCIs work, you can refer to the following documents for a deeper understanding: