All Products
Search
Document Center

PolarDB:Subquery decorrelation

Last Updated:Mar 30, 2026

Correlated subqueries execute once for every row the outer query returns. If the outer query generates a large amount of data and the subquery is not associated with indexes, the execution of the subquery will be highly time-consuming. Subquery unnesting rewrites correlated subqueries into equivalent JOIN statements, so the optimizer runs each subquery once instead of once per row, and can apply further optimizations such as join reordering.

PolarDB for MySQL supports two unnesting strategies: window functions and GROUP BY clauses.

Prerequisites

  • Cluster version: PolarDB for MySQL 8.0, revision 8.0.2.2.1 or later. To check your version, see Query the engine version.

Choose a strategy

Strategy Use when
Window function The subquery uses an aggregate function and the join column is a primary key or unique key
GROUP BY The subquery uses an aggregate function and the join column has many duplicate values

Enable subquery unnesting

Both strategies are controlled by sub-parameters of loose_polar_optimizer_switch. Configure this parameter at the global or session level. For details, see Configure cluster and node parameters.

Sub-parameter Default Description
unnest_use_window_function ON Unnest subqueries using window functions
unnest_use_group_by ON Unnest subqueries using GROUP BY clauses (cost-based)
derived_merge_cost_based OFF Apply derived merge based on cost-based optimization

Unnest subqueries using window functions

How it works

The following figure shows the structure of a query that contains a subquery.

Query structure

T1, T2, and T3 each represent a collection of one or more tables and views. T2 (inside the subquery) is nested with T3, as shown by the dotted line. T1 is in the outer query and is not nested with T2.

The window function strategy applies when all of the following conditions are met:

  • The scalar subquery contains no LIMIT or DISTINCT clause, and its output is an aggregate function.

  • The table in the subquery is a subset of the tables in the outer query.

  • The subquery is connected to the outer query by an equi join. The outer query contains join conditions with the same semantics and filter conditions for common tables in the subquery.

  • The join column in the subquery is a primary key or unique key column.

  • Neither the subquery nor the outer query contains custom functions or random functions.

After unnesting, the window function computes the aggregate in groups and attaches the result to each row, eliminating the need to re-execute the subquery.

Window function

Example

The following example uses TPC-H Q2 (Minimum Cost Supplier Query), which finds the supplier with the lowest price among all suppliers in a region who sell components of a specific type and size.

In MySQL Community Edition, the outer query first retrieves all matching suppliers. The subquery then runs on each row to find the minimum supply cost. For large datasets, this means the subquery executes once per supplier row.

Original query:

SELECT s_acctbal, s_name, n_name, p_partkey, p_mfgr,
 s_address, s_phone, s_comment
FROM part, supplier, partsupp, nation, region
WHERE p_partkey = ps_partkey
   AND s_suppkey = ps_suppkey
   AND p_size = 30
   AND p_type LIKE '%STEEL'
   AND s_nationkey = n_nationkey
   AND n_regionkey = r_regionkey
   AND r_name = 'ASIA'
   AND ps_supplycost = (
       SELECT MIN(ps_supplycost)
       FROM partsupp, supplier, nation, region
       WHERE p_partkey = ps_partkey
           AND s_suppkey = ps_suppkey
           AND s_nationkey = n_nationkey
           AND n_regionkey = r_regionkey
           AND r_name = 'ASIA'
   )
ORDER BY s_acctbal DESC, n_name, s_name, p_partkey
LIMIT 100;

PolarDB rewrites this using MIN() OVER(PARTITION BY ps_partkey) to compute the minimum cost per part in a single pass, then filters rows where the supply cost matches the group minimum.

Rewritten query:

SELECT s_acctbal, s_name, n_name, p_partkey, p_mfgr,
  s_address, s_phone, s_comment
FROM (
    SELECT MIN(ps_supplycost) OVER(PARTITION BY ps_partkey) as win_min,
      ps_partkey, ps_supplycost, s_acctbal, n_name, s_name, s_address,
      s_phone, s_comment
    FROM part, partsupp, supplier, nation, region
    WHERE p_partkey = ps_partkey
      AND s_suppkey = ps_suppkey
      AND s_nationkey = n_nationkey
      AND n_regionkey = r_regionkey
      AND p_size = 30
      AND p_type LIKE '%STEEL'
      AND r_name = 'ASIA') as derived
WHERE ps_supplycost = derived.win_min
ORDER BY s_acctbal DESC, n_name, s_name, p_partkey
LIMIT 100;

Performance improvement

At TPC-H scale factor 10:

  • Q2: 1.54x improvement

  • Q17: 4.91x improvement

Performance improvement

Unnest subqueries using GROUP BY clauses

How it works

The following figure shows the structure of a query that contains a subquery.

Query transformation

The GROUP BY strategy applies when all of the following conditions are met:

  • The scalar subquery contains no GROUP BY or LIMIT clause, and its output is an aggregate function.

  • The scalar subquery appears in a JOIN, WHERE, or SELECT condition.

  • The subquery is connected to the outer query by an equi join, with conditions joined by AND.

  • The scalar subquery contains no custom functions or random functions.

After unnesting, the subquery becomes a derived table that pre-aggregates results grouped by the join key. The outer query then joins this derived table once, replacing repeated subquery execution.

Group by aggregation

Example

The following example retrieves orders where the quantity exceeds 10% of the total purchase amount for the same product.

Original query:

SELECT *
FROM sale_lineitem sl
WHERE sl.sl_quantity >
    (SELECT 0.1 * SUM(pl.pl_quantity)
     FROM purchase_lineitem pl
     WHERE pl.pl_objectkey = sl.sl_objectkey);

Without unnesting, the database iterates every row in sale_lineitem, reads sl_objectkey, and re-executes the subquery for that value. The subquery runs as many times as there are rows in sale_lineitem. Because sl_objectkey typically contains many duplicate values, the same aggregation on purchase_lineitem repeats many times — even when an index exists on pl_objectkey.

PolarDB rewrites this as a LEFT JOIN against a pre-aggregated derived table. The purchase_lineitem table is scanned only once.

Rewritten query:

SELECT *
FROM sale_lineitem sl
LEFT JOIN
  (SELECT (0.1 * sum(pl.pl_quantity)) AS Name_exp_1,
          pl.pl_objectkey AS Name_exp_2
   FROM purchase_lineitem pl
   GROUP BY pl.pl_objectkey) derived ON derived.Name_exp_2 = sl.sl_objectkey
WHERE sl.sl_quantity > derived.name_exp_1;

The optimizer can further improve execution by reordering the join based on cost estimates.

Override unnesting with hints

Use the UNNEST and NO_UNNEST hints to control unnesting on a per-query basis, regardless of the loose_polar_optimizer_switch setting.

Hint syntax:

UNNEST([@query_block_name] [strategy [, strategy] ...])
NO_UNNEST([@query_block_name] [strategy [, strategy] ...])

strategy can be WINDOW_FUNCTION or GROUP_BY.

Examples:

-- Force window function unnesting
SELECT ... FROM ... WHERE ... = (SELECT /*+UNNEST(WINDOW_FUNCTION)*/ agg FROM ...)
SELECT /*+UNNEST(@`select#2` WINDOW_FUNCTION)*/ ... FROM ... WHERE ... = (SELECT agg FROM ...)

-- Disable window function unnesting
SELECT ... FROM ... WHERE ... = (SELECT /*+NO_UNNEST(WINDOW_FUNCTION)*/ agg FROM ...)
SELECT /*+NO_UNNEST(@`select#2` WINDOW_FUNCTION)*/ ... FROM ... WHERE ... = (SELECT agg FROM ...)

-- Force GROUP BY unnesting
SELECT ... FROM ... WHERE ... = (SELECT /*+UNNEST(GROUP_BY)*/ agg FROM ...)
SELECT /*+UNNEST(@`select#2` GROUP_BY)*/ ... FROM ... WHERE ... = (SELECT agg FROM ...)

-- Disable GROUP BY unnesting
SELECT ... FROM ... WHERE ... = (SELECT /*+NO_UNNEST(GROUP_BY)*/ agg FROM ...)
SELECT /*+NO_UNNEST(@`select#2` GROUP_BY)*/ ... FROM ... WHERE ... = (SELECT agg FROM ...)