All Products
Search
Document Center

PolarDB:FAQ for In-Memory Column Index

Last Updated:Dec 19, 2025

This topic summarizes frequently asked questions (FAQ) about the PolarDB for MySQL In-Memory Column Index (IMCI) feature.

How to use the PolarDB for MySQL In-Memory Column Index feature?

To use the IMCI feature to accelerate queries, follow these steps:

  1. Add a column store read-only node to your PolarDB for MySQL cluster and enable the IMCI feature on it. For more information about how to add a read-only node with IMCI, see Add a column store read-only node.

  2. Add columnstore indexes to the tables you want to accelerate using the CREATE TABLE or ALTER TABLE statement to include COLUMNAR=1 in the COMMENT field of the table. After the columnstore index is ready, the optimizer automatically determines whether to use it based on execution cost. For more information about the syntax for adding a columnstore index, see DDL syntax for creating a columnstore index when creating a table.

  3. Forward SQL statements to the column store node. If the query cost exceeds a specific threshold, the optimizer automatically uses the columnstore index. For more information about automatic and manual SQL statement forwarding, see Configure a cluster endpoint to distribute requests between row store and column store.

How to check the status of a columnstore index?

After you use the ALTER TABLE statement to dynamically add a columnstore index to an existing table, the index is built asynchronously on the column store read-only node. Connect to a cluster endpoint with request distribution enabled or connect directly to the column store node. Then, query the INFORMATION_SCHEMA.IMCI_INDEXES table to obtain the build status of the columnstore index. Only columnstore indexes in the COMMITTED state can be used for queries. For a columnstore index that is being built, query INFORMATION_SCHEMA.IMCI_ASYNC_DDL_STATS to get information about the index build progress. For more information, see View the index status.

Note

When you use DMS to log on to a database, you connect to the primary endpoint of the cluster by default. To connect to a cluster endpoint or directly to a column store node, follow these instructions:

  • Connect to a cluster endpoint.

    Log on to Data Management (DMS) 5.0. On the Add Instance page, set Entry Mode to Connection String Address and enter the cluster endpoint. For more information, see Add an ApsaraDB instance.

  • Connect directly to a column store node.

    First, add a custom cluster endpoint for the target column store node. This custom endpoint must contain only the target column store node. Then, log on to Data Management (DMS) 5.0. On the Add Instance page, set Entry Mode to Connection String Address and enter the custom cluster endpoint of the column store read-only node. For more information, see Add an ApsaraDB instance.

How to confirm that a query used a columnstore index?

Use the EXPLAIN statement to view the execution plan of an SQL statement. If the execution plan contains IMCI Execution Plan, the SQL statement used a columnstore index to accelerate the query. The following is an example:

*************************** 1. row ***************************
IMCI Execution Plan (max_dop = 8, max_query_mem = 3435134976):
Project | Exprs: temp_table3.lineitem.L_ORDERKEY, temp_table3.SUM(lineitem.L_EXTENDEDPRICE * 1.00 - lineitem.L_DISCOUNT), temp_table3.orders.O_ORDERDATE, temp_table3.orders.O_SHIPPRIORITY
  Sort | Exprs: temp_table3.SUM(lineitem.L_EXTENDEDPRICE * 1.00 - lineitem.L_DISCOUNT) DESC,temp_table3.orders.O_ORDERDATE ASC
    HashGroupby | OutputTable(3): temp_table3 | Grouping: lineitem.L_ORDERKEY orders.O_ORDERDATE orders.O_SHIPPRIORITY | Output Grouping: lineitem.L_ORDERKEY, orders.O_ORDERDATE, orders.O_SHIPPRIORITY | Aggrs: SUM(lineitem.L_EXTENDEDPRICE * 1.00 - lineitem.L_DISCOUNT)
      HashJoin | HashMode: DYNAMIC | JoinMode: INNER | JoinPred: orders.O_ORDERKEY = lineitem.L_ORDERKEY
        HashJoin | HashMode: DYNAMIC | JoinMode: INNER | JoinPred: orders.O_CUSTKEY = customer.C_CUSTKEY
          CTableScan | InputTable(0): orders | Pred: (orders.O_ORDERDATE < 03/24/1995 00:00:00.000000)
          CTableScan | InputTable(1): customer | Pred: (customer.C_MKTSEGMENT = "BUILDING")
        CTableScan | InputTable(2): lineitem | Pred: (lineitem.L_SHIPDATE > 03/24/1995 00:00:00.000000)
1 row in set (0.04 sec)

The execution plan for a query that uses a columnstore index is a tree structure. Each layer represents an operator. Operators usually correspond one-to-one with operations in the SQL statement. For example, the CTableScan operator represents a scan on a table, the HashJoin operator corresponds to the JOIN part of the SQL statement, and the HashGroupby operator corresponds to the GROUP BY part. However, some operators, such as Sequence, are generated during query optimization and do not correspond to any statement in the original SQL.

Troubleshooting: Why is my query not using a columnstore index?

After you add a column store node, you must add a columnstore index to the tables that your SQL statement queries. The query cost of the SQL statement must also exceed a specific threshold. Only then will the SQL statement use the columnstore index. In addition, the SQL statement must be forwarded to the column store node to use IMCI for query acceleration. If an SQL statement cannot use a columnstore index for a query, follow these steps to troubleshoot the issue:

  1. Confirm that the SQL statement was forwarded to the column store node.

    Use the SQL Explorer feature to confirm whether the SQL statement was forwarded to the column store node.

    If you use a cluster endpoint and enable automatic request distribution among row store and column store nodes, the database proxy automatically forwards SQL statements with an estimated query cost exceeding the imci_ap_threshold value to the column store node. You can also add /*FORCE_IMCI_NODES*/ before the SELECT keyword in the SQL statement to force the statement to be forwarded to the column store node. The following is an example:

    /*FORCE_IMCI_NODES*/EXPLAIN SELECT COUNT(*) FROM t1 WHERE t1.a > 1;

    For more information, see Configure automatic request distribution between row store and column store.

    Note

    Create a new endpoint that connects directly to the column store node. This ensures that SQL statements are always forwarded to the column store node for execution.

  2. Check if the query cost is higher than the threshold.

    On the column store node, the optimizer estimates the cost of the SQL statement. If the estimated cost is higher than the set threshold cost_threshold_for_imci, the query uses the columnstore index. Otherwise, it uses the original row-store index.

    After confirming that the SQL statement was forwarded to the column store node, if the execution plan from EXPLAIN still does not show the use of a columnstore index, compare the estimated execution cost with the preset threshold. This helps determine if the estimated cost was too low to use the columnstore index. You can query the Last_query_cost variable to get the estimated execution cost of the previous SQL statement:

    EXPLAIN SELECT * FROM t1;
    SHOW STATUS LIKE 'Last_query_cost';

    If the estimated execution cost of the SQL statement is less than the preset cost_threshold_for_imci, consider adjusting cost_threshold_for_imci . For example, use a hint to adjust the preset threshold for a single SQL statement:

    /*FORCE_IMCI_NODES*/EXPLAIN SELECT /*+ SET_VAR(cost_threshold_for_imci=0) */ COUNT(*) FROM t1 WHERE t1.a > 1;
  3. Check if all required columns are fully covered by the columnstore index.

    Use the built-in stored procedure dbms_imci.check_columnar_index() to check whether a columnstore index has been created for the tables in an SQL statement. The following is an example:

    CALL dbms_imci.check_columnar_index('SELECT COUNT(*) FROM t1 WHERE t1.a > 1');

    If the SQL statement is not fully covered by a columnstore index, calling this stored procedure returns the tables and columns that are not covered. If it is fully covered, the stored procedure returns an empty result set.

  4. Check for unsupported SQL features.

    Check the list of limits to confirm whether a specific SQL feature is supported by IMCI.

If all these checks pass, the SQL statement should generally use a columnstore index for the query.

Can column store nodes use row-store indexes?

Yes. A column store read-only node can be considered a standard read-only node with the added functionality of IMCI. Therefore, a column store node can use both row-store indexes, like a standard read-only node, and columnstore indexes. The optimizer chooses which index to use based on the cost_threshold_for_imci threshold.

You can use a hint to set the query threshold for a single SQL statement and force it to use a columnstore index:

SELECT /*+ SET_VAR(cost_threshold_for_imci=0) */ COUNT(*) FROM t1 WHERE t1.a > 1;

You can also force an SQL statement not to use a columnstore index:

SELECT /*+ SET_VAR(USE_IMCI_ENGINE=OFF) */ COUNT(*) FROM t1 WHERE t1.a > 1;

How to add the right columnstore indexes for an SQL statement?

All columns used in an SQL statement must be covered by a columnstore index for that statement to use IMCI for the query. If a column involved in the SQL statement is not covered by a columnstore index, you can add one using the CREATE TABLE or ALTER TABLE statement. PolarDB for MySQL provides a series of built-in stored procedures to assist with this operation.

Use the dbms_imci.columnar_advise() stored procedure to get the required DDL statement for a specific SQL statement. Building a columnstore index according to this DDL statement ensures that the SQL statement is fully covered by the columnstore index. For more information, see Obtain the DDL statement for creating a columnstore index.

dbms_imci.columnar_advise('<query_string>');
Important

If you are using a Multi-master Cluster (Limitless) Edition, this stored procedure must be executed on a global read-only node. You can add /*force_node='<node_ID>'*/ before the SQL statement to force it to be executed on a global read-only node. For example: /*force_node='pi-bpxxxxxxxx'*/ dbms_imci.columnar_advise('<query_string>').

Use the dbms_imci.columnar_advise_begin(), dbms_imci.columnar_advise_end(), and dbms_imci.columnar_advise() interfaces to get the required DDL statements for a batch of SQL statements. For more information, see Obtain DDL statements for creating columnstore indexes in batches.

Troubleshooting high CPU or memory usage on column store nodes

  • The default configuration for IMCI allows a single SQL statement to execute concurrently and use all available CPUs. When multiple SQL statements are executed at the same time, the internal scheduler in the database schedules the SQL statements and dynamically reduces the CPU and memory limits for each SQL statement. As a result, the average CPU and memory usage of column store nodes is relatively high compared to other nodes. You can adjust the imci_max_dop parameter to control the maximum degree of parallelism for a single SQL statement, which is the maximum number of CPU cores a single SQL statement can use.

  • We recommend setting the monitoring threshold for CPU usage to 70% and the monitoring threshold for memory usage to 90%.

  • PolarDB for MySQL supports different specifications for nodes within the same cluster. You can upgrade or downgrade the configuration of a column store node independently. We recommend that a column store node has at least 8 cores and 16 GB of memory.

Do PolarDB for MySQL 5.6 and 5.7 support the In-Memory Column Index feature?

PolarDB for MySQL versions 5.6 and 5.7 do not support the IMCI feature. PolarDB for MySQL 8.0 supports the IMCI feature.

What are the limits of IMCI?/Does IMCI support full-text indexes?/Are queries that use IMCI compatible with MySQL?

Using IMCI is fully compatible with MySQL. However, some uncommon query features, such as certain spatiotemporal expressions, full-text indexes, and some forms of correlated subqueries, are not yet fully supported. SQL statements that use these features cannot use a columnstore index and will default to using a row-store index for the query. For more information about the detailed limits of IMCI, see Limits.

Can `INSERT INTO SELECT` and `CREATE TABLE AS SELECT` statements use columnstore indexes?

IMCI can only be used for queries on read-only nodes. INSERT and CREATE statements can only be executed on the primary (RW) node. To use a columnstore index for INSERT INTO SELECT or CREATE TABLE AS SELECT queries, you must use the IMCI extract, transform, and load (ETL) feature. For more information, see Use a columnstore index to accelerate ETL.

Is there a fee for the IMCI feature? Do standard read-only nodes support IMCI?

The IMCI feature itself is free of charge. However, using IMCI requires adding a separate read-only node and enabling IMCI on it. You will be charged for this new read-only node and for the additional storage that the columnstore indexes consume.

Standard read-only nodes do not support IMCI.

How much extra storage space do columnstore indexes use?

Columnstore data is organized by column, which allows for a higher compression ratio. Compared to row store, the compression ratio is typically 3 to 10 times better. This results in an additional storage increase of about 10% to 30%.

How to check the storage space used by columnstore indexes?

  • For cluster versions PolarDB for MySQL 8.0.1.1.32 and earlier, you can use the imci_columns system table in information_schema to view the storage space and column compression ratio of a table with a columnstore index. For example, to view the storage space and compression ratio of the test table with a columnstore index in the test database, run the following SQL statement:

    SELECT
      SCHEMA_NAME, TABLE_NAME,
      SUM(EXTENT_SIZE * TOTAL_EXTENT_COUNT) AS TOTAL_SIZE,
      SUM(EXTENT_SIZE * USED_EXTENT_COUNT) AS USED_SIZE,
      SUM(EXTENT_SIZE * FREE_EXTENT_COUNT) AS FREE_SIZE,
      SUM(RAW_DATA_SIZE) / SUM(FILE_SIZE) AS COMPRESS
    FROM
      information_schema.imci_columns
    WHERE SCHEMA_NAME = 'test' AND TABLE_NAME = 'test';
  • For cluster versions PolarDB for MySQL 8.0.1.1.33 and later, you can use the imci_data_files system table in information_schema to view the storage space and the imci_columns system table to view the column compression ratio. For example, to view the storage space and column compression ratio of the test table with a columnstore index in the test database, run the following SQL statements:

    • To view the storage space used by the test table with a columnstore index, run the following SQL statement:

      SELECT
          SCHEMA_NAME, TABLE_NAME,
          SUM(EXTENT_SIZE * TOTAL_EXTENT_COUNT) AS TOTAL_SIZE,
          SUM(EXTENT_SIZE * USED_EXTENT_COUNT) AS USED_SIZE,
          SUM(EXTENT_SIZE * FREE_EXTENT_COUNT) AS FREE_SIZE
      FROM
          INFORMATION_SCHEMA.IMCI_DATA_FILES
      WHERE SCHEMA_NAME = 'test' AND TABLE_NAME = 'test';
    • To view the column compression ratio, run the following SQL statement:

      SELECT
          SCHEMA_NAME, TABLE_NAME,
           SUM(RAW_DATA_SIZE) / SUM(CMP_DATA_SIZE) AS COMPRESS_RATIO
      FROM
          INFORMATION_SCHEMA.IMCI_COLUMNS
      WHERE SCHEMA_NAME = 'test' AND TABLE_NAME = 'test';

The following table describes the parameters in the preceding SQL statements.

Parameter

Description

SCHEMA_NAME

The database name.

TABLE_NAME

The table name.

EXTENT_SIZE

The size of an EXTENT. Unit: bytes.

TOTAL_EXTENT_COUNT

The total number of EXTENTs.

USED_EXTENT_COUNT

The number of used EXTENTs.

FREE_EXTENT_COUNT

The number of free EXTENTs.

RAW_DATA_SIZE

The size of the column data before compression. Unit: bytes.

FILE_SIZE

The size of the column data after compression. Unit: bytes.

Note

This parameter applies to versions of PolarDB for MySQL earlier than 8.0.1.1.33.

CMP_DATA_SIZE

The size of the column data after compression. Unit: bytes.

Note

This parameter applies to PolarDB for MySQL 8.0.1.1.33 and later.

Why does INSTANT DDL not work after adding a columnstore index?

  • In versions of PolarDB for MySQL earlier than 8.0.1.1.42 and 8.0.2.2.23, adding a column to a table with a table-level columnstore index does not use the instant column addition logic. This is because the operation involves changing the columnstore index structure and rebuilding the index data.

  • In PolarDB for MySQL 8.0.1.1.42 and later, and 8.0.2.2.23 and later, INSTANT DDL is supported on tables with table-level columnstore indexes. This feature is not compatible with the old rebuild mode. You must set the imci_enable_add_column_instant_ddl parameter to OFF. The table must also have a primary key.

How to view or delete columnstore indexes added by Autoindex?

SELECT * FROM  information_schema.imci_autoindex_executed;

The method for deleting a columnstore index automatically added by Autoindex is the same as for deleting a manually added columnstore index:

ALTER TABLE t1 comment 'columnar=0';

Why does it take longer to add or drop columns with `ALTER TABLE` after adding a columnstore index?

When you add or drop columns, the table data is usually rebuilt. If the original table has a columnstore index, the columnstore index data must also be rebuilt. The process of rebuilding the columnstore index data requires writing to the Redo log. A columnstore index often covers many columns, so the amount of Redo log data generated during the rebuild is proportional to the size of the original table data. This increases the I/O volume compared to rebuilding a table without a columnstore index, which makes the operation take longer.

Does adding a columnstore index affect write performance?

The impact of adding a columnstore index on write performance is generally within 5%. Tests using the Sysbench oltp_insert workload test dataset show a write performance decrease of about 3% after adding a columnstore index.

Which transaction isolation levels does IMCI support?

IMCI supports the READ_COMMITTED and REPEATABLE_READ transaction isolation levels.

Note
  • For the REPEATABLE_READ transaction isolation level, you must use a custom connection endpoint that contains only column store read-only nodes when using IMCI.

  • For cluster versions PolarDB for MySQL 8.0.1.1.40 and later, and 8.0.2.2.21 and later, some ecosystem tools may implicitly set an unsupported transaction isolation level in the query session. For example, the Metabase BI tool sets it to READ_UNCOMMITTED. In such scenarios, you can run SET imci_ignore_unsupported_isolation_level=ON to force the use of READ_COMMITTED. Alternatively, you can add a session variable setting to the ODBC/JDBC connection string. For example, in Metabase, you can add session Variables=imci_ignore_unsupported_isolation_level='ON' to the connection string.

Does IMCI accelerate fuzzy search scenarios?

Yes. IMCI performs very well in accelerating fuzzy search scenarios. It supports features such as LIKE PRUNER, NGRAM LIKE, and SMID LIKE. The columnstore full-text index feature also provides acceleration for fuzzy search scenarios.