Complex analytical queries that join many tables are sensitive to join order: a suboptimal order can produce large intermediate result sets, causing every downstream operator to do more work. The IMCI optimizer addresses this by combining query transformation rules with per-column statistics to evaluate multiple equivalent execution plans and select the lowest-cost join order automatically. This page explains how the optimizer works, how to enable it, and what to do when the optimizer cannot find the optimal plan.
How it works
SQL is a declarative language: a single query can be executed in many ways, all producing the same correct result. The query optimizer explores this space of equivalent query plans and picks the one with the lowest estimated cost.
For example, the following query on four tables can be executed in at least two orders:
SELECT * FROM t0, t1, t2, t3
WHERE t0.a = t1.a AND t1.a = t2.a AND t2.a = t3.a AND t3.b = t1.b;
Plan A and Plan B produce the same result. The optimizer generates Plan B by applying a query transformation rule — for example, rewriting t1 INNER JOIN t2 as t2 INNER JOIN t1. It then estimates the cost of each plan and selects the one with the lower cost.
Optimizer workflow
Parse the SQL statement to produce an initial query plan.
Apply query transformation rules to generate equivalent query plans.
Use statistics and cost models to estimate the cost of each plan, then submit the lowest-cost plan to the execution engine.
Statistics
The IMCI optimizer uses three types of statistics for cardinality estimation and cost calculation:
Histograms: show the value distribution of each column, used to estimate value ranges and the selection rate of equality predicates.
Distinct value counts: used to estimate the number of groups in
GROUP BYand to assist with selection rate estimation.Column constraints: whether a column has a unique index or foreign key relationships.
Cost calculation
The cost of each operator in a query plan depends on:
The total number of rows the operator processes (estimated from statistics).
The algorithmic complexity of the operator.
The total execution cost equals the sum of all operator costs. For hash join, the formula is:
Cost(join) = Card(inner) + Card(outer)Applying this to Plan A and Plan B:
| Plan | Calculation | Total cost |
|---|---|---|
| Plan A | 10000+1+1000+100+10000+10 | 21,111 |
| Plan B | 10000+1+100+10+1000+10 | 11,121 |
Plan B has a lower cost, so the optimizer selects it.
Supported versions
Your PolarDB cluster must be running one of the following versions:
PolarDB for MySQL 8.0.1, revision 8.0.1.1.31 or later
PolarDB for MySQL 8.0.2, revision 8.0.2.2.12 or later
Enable IMCI query optimization
Step 1: Collect statistics
Statistics are required before the optimizer can estimate costs accurately. Choose one of the following approaches:
Periodic `ANALYZE TABLE`: run
ANALYZE TABLEon tables that use IMCI query optimization to keep statistics current.Initial collection + auto-update (recommended for tables under active load): after creating an In-Memory Column Index (IMCI) on a table, run
ANALYZE TABLEon the read-only node to build initial statistics. Then setloose_imci_auto_update_statisticto ASYNC so the optimizer refreshes statistics automatically as data changes, without manual intervention. ASYNC mode is suitable for active tables because re-collection happens in the background and does not block ongoing queries.
Step 2: Enable the feature
In the PolarDB console, set loose_imci_optimizer_switch to use_imci_card_est=ON,use_imci_join_reorder=ON.
Join reorder does not take effect unlessuse_imci_card_estis also set toONand the table has statistics.
For instructions on configuring parameters, see Specify cluster and node parameters.
Step 3: Run your queries
After the feature is enabled and statistics are collected, run your analytical queries. The optimizer automatically evaluates join orders and selects the lowest-cost plan.
Parameters
Configure the following parameters in the PolarDB console to control IMCI query optimization.
| Parameter | Description | Default |
|---|---|---|
loose_imci_optimizer_switch | Controls IMCI query optimization sub-features. Set use_imci_card_est to ON to enable cardinality estimation and cost calculation. Set use_imci_join_reorder to ON to enable join reorder. Both sub-options must be enabled together for join reorder to take effect. | OFF |
loose_imci_auto_update_statistic | Controls how the IMCI optimizer refreshes stale statistics. ASYNC: re-collects statistics asynchronously in the background — suitable for tables under active load. SYNC: re-collects statistics synchronously before the query runs. OFF: statistics are not automatically refreshed. | ASYNC |
loose_imci_max_enum_join_pairs | The maximum number of equivalent execution plans the optimizer evaluates when join reorder is enabled. Increase this value to explore more plans for queries with many joins; decrease it to reduce planning overhead. Valid values: 0–4,294,967,295. | 2000 |
Limitations and workarounds
The following conditions can cause large errors in cardinality estimation, leading the optimizer to pick a suboptimal plan. When you encounter one of these conditions, use a JOIN_ORDER hint to explicitly specify the join order.
| Condition | Example | Workaround |
|---|---|---|
| Predicates that compare two columns in the same table | t1.c1 > t1.c2 | Use JOIN_ORDER hint |
| Predicates with operators that cannot use statistics | t1.c1 MOD 2 = 1, t1.c2 LIKE '%ABC%' | Use JOIN_ORDER hint |
| Predicates with expressions the optimizer cannot evaluate | t1.c1 + t1.c3 > 100 | Use JOIN_ORDER hint |
| Columns without statistics for estimating predicate selection rates | SELECT a, SUM(b) FROM t1 HAVING SUM(b) > 10 | Use JOIN_ORDER hint |
Multiple predicates combined with AND | t1.c1 > 10 AND t1.c3 < 5 | Use JOIN_ORDER hint |
| Queries with many nested layers | — | Use JOIN_ORDER hint |
| Queries joining many tables | — | Adjust loose_imci_max_enum_join_pairs to expand the search space, or use JOIN_ORDER hint |
Applying a `JOIN_ORDER` hint: add the hint directly in the SELECT statement to override the optimizer's join order. For example:
SELECT /*+ JOIN_ORDER(t0, t1, t2, t3) */ *
FROM t0, t1, t2, t3
WHERE t0.a = t1.a AND t1.a = t2.a AND t2.a = t3.a AND t3.b = t1.b;For the full list of supported join order hints, see MySQL optimizer hints for join order.
Performance results
The following example uses TPC-H Q8, a multi-table query with aggregate functions representative of complex OLAP workloads.
SELECT
o_year,
SUM(
CASE
WHEN nation = 'BRAZIL' THEN volume
ELSE 0
END
) / SUM(volume) AS mkt_share
FROM
(
SELECT
EXTRACT(year FROM o_orderdate) AS o_year,
l_extendedprice * (1 - l_discount) AS volume,
n2.n_name AS nation
FROM
lineitem, orders, part, supplier, customer, nation n1, nation n2, region
WHERE
p_partkey = l_partkey
AND s_suppkey = l_suppkey
AND l_orderkey = o_orderkey
AND o_custkey = c_custkey
AND c_nationkey = n1.n_nationkey
AND n1.n_regionkey = r_regionkey
AND r_name = 'AMERICA'
AND s_nationkey = n2.n_nationkey
AND o_orderdate BETWEEN DATE '1995-01-01' AND DATE '1996-12-31'
AND p_type = 'ECONOMY ANODIZED STEEL'
) AS all_nations
GROUP BY o_year
ORDER BY o_year;Test environment: 32-core cluster, TPC-H SF100 dataset.
Without IMCI query optimization

The query plan produces large intermediate result sets across multiple joins, increasing the rows processed by each operator. Query duration: 7,017 ms.
With IMCI query optimization

The optimizer reorders joins to reduce intermediate result set sizes to the order of millions of rows, cutting the work for downstream operators. Query duration: 1,900 ms — a 73% reduction.