All Products
Search
Document Center

PolarDB:Analyze the IMCI-based query performance

Last Updated:Mar 28, 2026

PolarDB for MySQL collects per-operator execution statistics for queries processed by the In-Memory Column Index (IMCI) engine. The statistics — actual row counts, actual costs, and wall-clock time per operator — are stored alongside the EXPLAIN plan. Use these statistics to identify which operators consume the most time and to measure the impact of optimizations.

Prerequisites

Before you begin, ensure that you have:

  • A PolarDB for MySQL cluster running version 8.0.1.1.42 or later. To check your version, see Query an engine version.

Enable profiling

Set the imci_analyze_query session parameter to ON to enable profiling. After a query runs, the profiling data for that query is available in information_schema.imci_sql_profiling.

ParameterLevelDefaultDescription
imci_analyze_querySessionOFFCaptures per-operator execution statistics for the most recent IMCI query. Set to ON to enable.
imci_sql_profiling stores data for the most recent SQL statement executed by using IMCIs. Query the table immediately after the target query completes.

How it works

When imci_analyze_query is ON, PolarDB records the following metrics for each operator in the IMCI execution plan:

ColumnDescription
IDOperator ID, matching the EXPLAIN output
OperatorOperator type (for example, Hash Join, Table Scan)
NameTable name, if applicable
A-RowsActual number of rows the operator produced
A-CostActual cost
Execution Time(s)Wall-clock time spent in this operator
Extra InfoJoin conditions, group keys, and resource usage (CPU/memory)

Interpreting the profiling output

Use the following patterns to diagnose performance issues:

SignalLikely causeAction
An operator's Execution Time(s) accounts for a large share of total query timeBottleneck operatorFocus optimization on this operator
A-Rows is much larger than E-Rows from EXPLAINRow count estimation error; may indicate an exploding join (output rows far exceed input rows)Review join conditions; check for missing filters or missing statistics
Row 1 (Select Statement) Extra Info shows real_query_mem close to or exceeding max_query_memMemory pressureReduce parallelism or process data in smaller batches

To sort operators by execution time and find the bottleneck directly:

/*ROUTE_TO_LAST_USED*/SELECT ID, Operator, Name, `A-Rows`, `Execution Time(s)`
FROM information_schema.imci_sql_profiling
ORDER BY `Execution Time(s)` DESC;

Examples

Simple query example

This example profiles a join-aggregation query on a TPC-H (Transaction Processing Performance Council Benchmark H) dataset.

SELECT l_shipmode, COUNT(*) FROM lineitem, orders WHERE l_orderkey = o_orderkey GROUP BY l_shipmode;

Step 1: Review the execution plan.

EXPLAIN SELECT l_shipmode, COUNT(*) FROM lineitem, orders WHERE l_orderkey = o_orderkey GROUP BY l_shipmode;

Sample output:

+----+------------------------+----------+--------+----------+---------------------------------------------------------------+
| ID | Operator               | Name     | E-Rows | E-Cost   | Extra Info                                                    |
+----+------------------------+----------+--------+----------+---------------------------------------------------------------+
|  1 | Select Statement       |          |        |          | IMCI Execution Plan (max_dop = 1, max_query_mem = unlimited)  |
|  2 | └─Hash Groupby         |          | 6      | 51218.50 | Group Key: lineitem.l_shipmode                                |
|  3 |   └─Compute Scalar     |          | 1869   | 50191.00 |                                                               |
|  4 |     └─Hash Join        |          | 1869   | 31000.00 | Join Cond: lineitem.l_orderkey = orders.o_orderkey            |
|  5 |       ├─Table Scan     | lineitem | 2000   | 0.00     |                                                               |
|  6 |       └─Table Scan     | orders   | 2000   | 0.00     |                                                               |
+----+------------------------+----------+--------+----------+---------------------------------------------------------------+

Step 2: Enable profiling and run the query.

SET imci_analyze_query = ON;

SELECT l_shipmode, COUNT(*) FROM lineitem, orders WHERE l_orderkey = o_orderkey GROUP BY l_shipmode;

Sample output:

+------------+----------+
| l_shipmode | COUNT(*) |
+------------+----------+
| REG AIR    |      283 |
| SHIP       |      269 |
| FOB        |      299 |
| RAIL       |      289 |
| TRUCK      |      314 |
| MAIL       |      274 |
| AIR        |      272 |
+------------+----------+
7 rows in set (0.05 sec)

Step 3: Retrieve the profiling data.

Use the /*ROUTE_TO_LAST_USED*/ hint to route this query to the same node where the IMCI query ran. Without it, PolarProxy may route to a different node and return no results.

/*ROUTE_TO_LAST_USED*/SELECT * FROM information_schema.imci_sql_profiling;

Sample output:

+----+------------------------+----------+--------+----------+-------------------+--------------------------------------------------------------------------------------------------------+
| ID | Operator               | Name     | A-Rows | A-Cost   | Execution Time(s) | Extra Info                                                                                             |
+----+------------------------+----------+--------+----------+-------------------+--------------------------------------------------------------------------------------------------------+
|  1 | Select Statement       |          |      0 | 52609.51 |                 0 | IMCI Execution Plan (max_dop = 1, real_dop = 1, max_query_mem = unlimited, real_query_mem = unlimited) |
|  2 | └─Hash Groupby         |          |      7 |  52609.5 |             0.002 | Group Key: lineitem.l_shipmode                                                                         |
|  3 |   └─Compute Scalar     |          |   2000 |    51501 |                 0 |                                                                                                        |
|  4 |     └─Hash Join        |          |   2000 |    31000 |             0.007 | Join Cond: lineitem.l_orderkey = orders.o_orderkey                                                     |
|  5 |       ├─Table Scan     | lineitem |   2000 |        0 |             0.001 |                                                                                                        |
|  6 |       └─Table Scan     | orders   |   2000 |        0 |                 0 |                                                                                                        |
+----+------------------------+----------+--------+----------+-------------------+--------------------------------------------------------------------------------------------------------+

The Extra Info column displays the join condition for the Hash Join operator and the group key for the Hash Groupby operator. The Extra Info column in the first row displays actual CPU and memory usage (real_dop, real_query_mem) alongside the estimated values (max_dop, max_query_mem), which you can compare to detect resource pressure.

Step 4: (Optional) Aggregate profiling metrics.

Run aggregation queries directly on imci_sql_profiling. For example, to get the total execution time:

/*ROUTE_TO_LAST_USED*/SELECT SUM(`Execution Time(s)`) AS TOTAL_TIME FROM information_schema.imci_sql_profiling;

Sample output:

+----------------------+
| TOTAL_TIME           |
+----------------------+
| 0.010000000000000002 |
+----------------------+
1 row in set (0.00 sec)

The total execution duration of the SQL statement is 10 ms.

Complex query example

This example shows how to use profiling data to identify a bottleneck, apply an optimization, and verify the improvement. The query runs against a TPC-H SF100 dataset (scale factor 100).

SELECT
    c_name,
    sum(l_quantity)
FROM
    customer,
    orders,
    lineitem
WHERE
    o_orderkey IN (
        SELECT
            l_orderkey
        FROM
            lineitem
        WHERE
            l_partkey > 18000000
    )
    AND c_custkey = o_custkey
    AND o_orderkey = l_orderkey
GROUP BY
    c_name
ORDER BY
    c_name
LIMIT 10;

Diagnose the bottleneck

Step 1: Review the estimated execution plan.

EXPLAIN SELECT
    c_name,
    sum(l_quantity)
FROM
    customer,
    orders,
    lineitem
WHERE
    o_orderkey IN (
        SELECT
            l_orderkey
        FROM
            lineitem
        WHERE
            l_partkey > 18000000
    )
    AND c_custkey = o_custkey
    AND o_orderkey = l_orderkey
GROUP BY
    c_name
ORDER BY
    c_name
LIMIT 10;

Sample output:

+----+----------------------------------+----------+-----------+------------+---------------------------------------------------------------+
| ID | Operator                         | Name     | E-Rows    | E-Cost     | Extra Info                                                    |
+----+----------------------------------+----------+-----------+------------+---------------------------------------------------------------+
|  1 | Select Statement                 |          |           |            | IMCI Execution Plan (max_dop = 32, max_query_mem = unlimited) |
|  2 | └─Limit                          |          | 10        | 7935739.96 | Offset=0 Limit=10                                             |
|  3 |   └─Sort                         |          | 10        | 7935739.96 | Sort Key: c_name ASC                                          |
|  4 |     └─Hash Groupby               |          | 1503700   | 7933273.99 | Group Key: customer.C_NAME                                    |
|  5 |       └─Hash Right Semi Join     |          | 54010545  | 7865930.26 | Join Cond: lineitem.L_ORDERKEY = orders.O_ORDERKEY            |
|  6 |         ├─Table Scan             | lineitem | 59785766  | 24001.52   | Cond: (L_PARTKEY > 18000000)                                  |
|  7 |         └─Hash Join              |          | 538776190 | 5488090.33 | Join Cond: orders.O_ORDERKEY = lineitem.L_ORDERKEY            |
|  8 |           ├─Hash Join            |          | 181006430 | 668535.99  | Join Cond: customer.C_CUSTKEY = orders.O_CUSTKEY              |
|  9 |           │ ├─Table Scan         | customer | 15000000  | 600.00     |                                                               |
| 10 |           │ └─Table Scan         | orders   | 150000000 | 6000.00    |                                                               |
| 11 |           └─Table Scan           | lineitem | 600037902 | 24001.52   |                                                               |
+----+----------------------------------+----------+-----------+------------+---------------------------------------------------------------+

The Hash Join at ID 7 has an estimated cost of 5,488,090 — approximately 70% of the total query cost (7,935,739). It consumes a large amount of CPU resources, which signals a large intermediate result set. Enable profiling to confirm this is the actual bottleneck.

Step 2: Enable profiling and run the query.

SET imci_analyze_query = ON;

SELECT
    c_name,
    sum(l_quantity)
FROM
    customer,
    orders,
    lineitem
WHERE
    o_orderkey IN (
        SELECT
            l_orderkey
        FROM
            lineitem
        WHERE
            l_partkey > 18000000
    )
    AND c_custkey = o_custkey
    AND o_orderkey = l_orderkey
GROUP BY
    c_name
ORDER BY
    c_name
LIMIT 10;

Sample output:

+--------------------+-----------------+
| c_name             | sum(l_quantity) |
+--------------------+-----------------+
| Customer#000000001 |          172.00 |
| Customer#000000002 |          663.00 |
| Customer#000000004 |          174.00 |
| Customer#000000005 |          488.00 |
| Customer#000000007 |         1135.00 |
| Customer#000000008 |          440.00 |
| Customer#000000010 |          625.00 |
| Customer#000000011 |          143.00 |
| Customer#000000013 |         1032.00 |
| Customer#000000014 |          564.00 |
+--------------------+-----------------+
10 rows in set (21.37 sec)

Step 3: Retrieve the profiling data.

/*ROUTE_TO_LAST_USED*/SELECT * FROM information_schema.imci_sql_profiling;

Sample output:

+----+----------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+
| ID | Operator                         | Name     | A-Rows    | A-Cost     | Execution Time(s) | Extra Info                                         |
+----+----------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+
|  1 | Select Statement                 |          |         0 | 8336856.67 |                 0 |                                                    |
|  2 | └─Limit                          |          |        10 | 8336856.67 |             0.002 | Offset=0 Limit=10                                  |
|  3 |   └─Sort                         |          |         0 | 8336856.67 |             2.275 | Sort Key: c_name ASC                               |
|  4 |     └─Hash Groupby               |          |   9813586 | 8320763.22 |           160.083 | Group Key: customer.C_NAME                         |
|  5 |       └─Hash Right Semi Join     |          | 239598134 | 7994854.23 |            98.174 | Join Cond: lineitem.L_ORDERKEY = orders.O_ORDERKEY |
|  6 |         ├─Table Scan             | lineitem |  60013756 |   24001.52 |              3.28 | Cond: (L_PARTKEY > 18000000)                       |
|  7 |         └─Hash Join              |          | 600037902 | 5156677.35 |           301.503 | Join Cond: orders.O_ORDERKEY = lineitem.L_ORDERKEY |
|  8 |           ├─Hash Join            |          | 150000000 |  629777.96 |            97.201 | Join Cond: customer.C_CUSTKEY = orders.O_CUSTKEY   |
|  9 |           │ ├─Table Scan         | customer |  15000000 |        600 |             3.321 |                                                    |
| 10 |           │ └─Table Scan         | orders   | 150000000 |       6000 |             0.241 |                                                    |
| 11 |           └─Table Scan           | lineitem | 600037902 |   24001.52 |             0.661 |                                                    |
+----+----------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+

The Hash Join at ID 7 took 301.503 s — roughly half the total query time. Its inputs are the lineitem table (600M rows, ID 11) and the result of Hash Join ID 8 (150M rows). Joining two large sets forces the engine to build a very large hash table, which dominates execution time.

The execution plan also shows that the o_orderkey IN (...) filter is evaluated as a Hash Right Semi Join (ID 5) after the large join at ID 7 completes. If the optimizer converts this IN subquery into a semi-join and pushes it down before ID 7, it can filter orders rows early and shrink the intermediate result set fed into ID 7.

Apply the optimization

Step 4: Enable cost-based semi-join pushdown.

SET imci_optimizer_switch = 'semijoin_pushdown=on';

Step 5: Verify the new execution plan.

EXPLAIN SELECT
    c_name,
    sum(l_quantity)
FROM
    customer,
    orders,
    lineitem
WHERE
    o_orderkey IN (
        SELECT
            l_orderkey
        FROM
            lineitem
        WHERE
            l_partkey > 18000000
    )
    AND c_custkey = o_custkey
    AND o_orderkey = l_orderkey
GROUP BY
    c_name
ORDER BY
    c_name
LIMIT 10;

Sample output:

+----+----------------------------------------+----------+-----------+------------+---------------------------------------------------------------+
| ID | Operator                               | Name     | E-Rows    | E-Cost     | Extra Info                                                    |
+----+----------------------------------------+----------+-----------+------------+---------------------------------------------------------------+
|  1 | Select Statement                       |          |           |            | IMCI Execution Plan (max_dop = 32, max_query_mem = unlimited) |
|  2 | └─Limit                                |          | 10        | 2800433.74 | Offset=0 Limit=10                                             |
|  3 |   └─Sort                               |          | 10        | 2800433.74 | Sort Key: c_name ASC                                          |
|  4 |     └─Hash Groupby                     |          | 14567321  | 2776544.58 | Group Key: customer.C_NAME                                    |
|  5 |       └─Hash Join                      |          | 57918330  | 2631846.75 | Join Cond: orders.O_ORDERKEY = lineitem.L_ORDERKEY            |
|  6 |         ├─Hash Join                    |          | 14567321  | 1014041.92 | Join Cond: orders.O_CUSTKEY = customer.C_CUSTKEY              |
|  7 |         │ ├─Hash Right Semi Join       |          | 12071937  | 906226.67  | Join Cond: lineitem.L_ORDERKEY = orders.O_ORDERKEY            |
|  8 |         │ │ ├─Table Scan               | lineitem | 59785766  | 24001.52   | Cond: (L_PARTKEY > 18000000)                                  |
|  9 |         │ │ └─Table Scan               | orders   | 150000000 | 6000.00    |                                                               |
| 10 |         │ └─Table Scan                 | customer | 15000000  | 600.00     |                                                               |
| 11 |         └─Table Scan                   | lineitem | 600037902 | 24001.52   |                                                               |
+----+----------------------------------------+----------+-----------+------------+---------------------------------------------------------------+

The semi-join is now pushed down to ID 7, filtering orders rows before the expensive join. The estimated total cost drops from 7,935,739 to 2,800,433.

Verify the improvement

Step 6: Re-run the query and profile it.

SELECT
    c_name,
    sum(l_quantity)
FROM
    customer,
    orders,
    lineitem
WHERE
    o_orderkey IN (
        SELECT
            l_orderkey
        FROM
            lineitem
        WHERE
            l_partkey > 18000000
    )
    AND c_custkey = o_custkey
    AND o_orderkey = l_orderkey
GROUP BY
    c_name
ORDER BY
    c_name
LIMIT 10;

Sample output:

+--------------------+-----------------+
| c_name             | sum(l_quantity) |
+--------------------+-----------------+
| Customer#000000001 |          172.00 |
| Customer#000000002 |          663.00 |
| Customer#000000004 |          174.00 |
| Customer#000000005 |          488.00 |
| Customer#000000007 |         1135.00 |
| Customer#000000008 |          440.00 |
| Customer#000000010 |          625.00 |
| Customer#000000011 |          143.00 |
| Customer#000000013 |         1032.00 |
| Customer#000000014 |          564.00 |
+--------------------+-----------------+
10 rows in set (13.74 sec)

Execution time dropped from 21.37 s to 13.74 s — approximately a 40% reduction.

Step 7: Confirm with profiling data.

/*ROUTE_TO_LAST_USED*/SELECT * FROM information_schema.imci_sql_profiling;

Sample output:

+----+----------------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+
| ID | Operator                               | Name     | A-Rows    | A-Cost     | Execution Time(s) | Extra Info                                         |
+----+----------------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+
|  1 | Select Statement                       |          |         0 | 4318488.35 |                 0 |                                                    |
|  2 | └─Limit                                |          |        10 | 4318488.35 |             0.002 | Offset=0 Limit=10                                  |
|  3 |   └─Sort                               |          |         0 | 4318488.34 |             3.076 | Sort Key: c_name ASC                               |
|  4 |     └─Hash Groupby                     |          |   9813586 | 4302394.89 |           163.149 | Group Key: customer.C_NAME                         |
|  5 |       └─Hash Join                      |          | 239598134 | 3976485.91 |           151.253 | Join Cond: orders.O_ORDERKEY = lineitem.L_ORDERKEY |
|  6 |         ├─Hash Join                    |          |  49393149 | 1321335.54 |            55.392 | Join Cond: orders.O_CUSTKEY = customer.C_CUSTKEY   |
|  7 |         │ ├─Hash Right Semi Join       |          |  49393149 |  954805.16 |            52.552 | Join Cond: lineitem.L_ORDERKEY = orders.O_ORDERKEY |
|  8 |         │ │ ├─Table Scan               | lineitem |  60013756 |   24001.52 |             2.791 | Cond: (L_PARTKEY > 18000000)                       |
|  9 |         │ │ └─Table Scan               | orders   | 150000000 |       6000 |             0.152 |                                                    |
| 10 |         │ └─Table Scan                 | customer |  15000000 |        600 |             0.028 |                                                    |
| 11 |         └─Table Scan                   | lineitem | 600037902 |   24001.52 |             0.642 |                                                    |
+----+----------------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+

The previous bottleneck (Hash Join ID 7, 301.503 s) is eliminated. The semi-join pushdown at ID 7 now filters orders to ~49M rows before the join at ID 5, reducing the join input size and overall execution time.