All Products
Search
Document Center

PolarDB:Push down join conditions to materialized derived tables

Last Updated:Mar 28, 2026

When a query joins a derived table, the optimizer normally materializes the entire derived table first and then applies JOIN ON filters afterward. PolarDB can instead push eligible join conditions into the derived table before materialization, so only matching rows are included. This reduces the data volume that subsequent join operations process, improving query performance.

This feature differs from join condition pushdown (JPP). This feature derives single-table conditions by leveraging WHERE predicate equivalences and pushes them into derived tables—a heuristic rule that typically improves performance. JPP targets multi-table conditions in ON predicates and transforms them into subqueries based on cost estimation, with no guaranteed performance improvement.

How it works

Consider a query where t1.a = 1 appears in the WHERE clause and dt.x > t1.a appears in the JOIN ON clause:

SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) dt ON dt.x > t1.a WHERE t1.a = 1;

The optimizer knows t1.a = 1 at query time. It substitutes this value into the ON predicate, deriving dt.x > 1—a condition that depends only on the derived table dt. This derived condition is then pushed into dt before materialization, so the query behaves as if it were written as:

SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2 WHERE x > 1) dt ON dt.x > t1.a WHERE t1.a = 1;

Without this feature, all rows from t2 are materialized first, and the filter dt.x > 1 is applied afterward.

Prerequisites

Before you begin, ensure that your cluster uses one of the following engine versions. To check your version, see Query version number.

  • MySQL 8.0.1 with revision version 8.0.1.1.44 or later

  • MySQL 8.0.2 with revision version 8.0.2.2.25 or later

Limitations

Pushdown is not supported in the following cases:

  • The derived table has a LIMIT clause.

    SELECT * FROM t1 LEFT JOIN (SELECT c, MAX(d) FROM t2 GROUP BY c LIMIT 2) dt
    ON t1.a < dt.a AND t1.a = 1;
  • The condition references a subquery or a non-deterministic function (such as RAND()).

    SELECT * FROM t1 LEFT JOIN (SELECT c, MAX(d) FROM t2 GROUP BY c) dt
    ON t1.a < dt.a AND t1.a = RAND();
  • The condition references a stored procedure or function.

    SELECT * FROM t1 LEFT JOIN (SELECT c, MAX(d) FROM t2 GROUP BY c) dt
    ON t1.a < dt.a AND t1.a = f1();

Configure the feature

Set the loose_join_cond_push_into_derived_mode parameter based on your workload. For instructions, see Set cluster and node parameters.

ParameterLevelValid valuesDescription
loose_join_cond_push_into_derived_modeGlobalREPLICA_ON (default), ON, OFFREPLICA_ON: enables pushdown on read-only nodes only. ON: enables the condition pushdown feature. OFF: disables pushdown.

Verify the effect

Use EXPLAIN FORMAT=TREE to confirm that conditions are pushed into the derived table before materialization.

Pushdown applies when all columns in the condition expression (or their equivalent columns) originate from the materialized derived table.

Set up sample tables:

CREATE TABLE t1 (a INT, b INT, c INT, d INT);
CREATE TABLE t2 (e INT, f INT, g INT);

With pushdown enabled (loose_join_cond_push_into_derived_mode = ON):

EXPLAIN FORMAT=TREE SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) dt ON dt.x > t1.a WHERE t1.a = 1;
-> Left hash join (no condition)
    -> Filter: (t1.a = 1)  (cost=0.55 rows=1)
        -> Table scan on t1  (cost=0.55 rows=3)
    -> Hash
        -> Table scan on dt
            -> Materialize
                -> Filter: (t2.x > 1)  (cost=0.45 rows=1)
                    -> Table scan on t2  (cost=0.45 rows=2)

The condition t2.x > 1 appears inside Materialize, confirming that filtering happens before the derived table is materialized.

With pushdown disabled (loose_join_cond_push_into_derived_mode = OFF):

EXPLAIN FORMAT=TREE SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) dt ON dt.x > t1.a WHERE t1.a=1;
-> Left hash join (no condition)
    -> Filter: (t1.a = 1)  (cost=0.55 rows=1)
        -> Table scan on t1  (cost=0.55 rows=3)
    -> Hash
        -> Filter: (dt.x > 1)
            -> Table scan on dt
                -> Materialize
                    -> Table scan on t2  (cost=0.45 rows=2)

The Filter: (dt.x > 1) node appears outside Materialize, meaning all rows from t2 are materialized first and then filtered.