When a query joins a derived table, the optimizer normally materializes the entire derived table first and then applies JOIN ON filters afterward. PolarDB can instead push eligible join conditions into the derived table before materialization, so only matching rows are included. This reduces the data volume that subsequent join operations process, improving query performance.
This feature differs from join condition pushdown (JPP). This feature derives single-table conditions by leveragingWHEREpredicate equivalences and pushes them into derived tables—a heuristic rule that typically improves performance. JPP targets multi-table conditions inONpredicates and transforms them into subqueries based on cost estimation, with no guaranteed performance improvement.
How it works
Consider a query where t1.a = 1 appears in the WHERE clause and dt.x > t1.a appears in the JOIN ON clause:
SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) dt ON dt.x > t1.a WHERE t1.a = 1;The optimizer knows t1.a = 1 at query time. It substitutes this value into the ON predicate, deriving dt.x > 1—a condition that depends only on the derived table dt. This derived condition is then pushed into dt before materialization, so the query behaves as if it were written as:
SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2 WHERE x > 1) dt ON dt.x > t1.a WHERE t1.a = 1;Without this feature, all rows from t2 are materialized first, and the filter dt.x > 1 is applied afterward.
Prerequisites
Before you begin, ensure that your cluster uses one of the following engine versions. To check your version, see Query version number.
MySQL 8.0.1 with revision version 8.0.1.1.44 or later
MySQL 8.0.2 with revision version 8.0.2.2.25 or later
Limitations
Pushdown is not supported in the following cases:
The derived table has a
LIMITclause.SELECT * FROM t1 LEFT JOIN (SELECT c, MAX(d) FROM t2 GROUP BY c LIMIT 2) dt ON t1.a < dt.a AND t1.a = 1;The condition references a subquery or a non-deterministic function (such as
RAND()).SELECT * FROM t1 LEFT JOIN (SELECT c, MAX(d) FROM t2 GROUP BY c) dt ON t1.a < dt.a AND t1.a = RAND();The condition references a stored procedure or function.
SELECT * FROM t1 LEFT JOIN (SELECT c, MAX(d) FROM t2 GROUP BY c) dt ON t1.a < dt.a AND t1.a = f1();
Configure the feature
Set the loose_join_cond_push_into_derived_mode parameter based on your workload. For instructions, see Set cluster and node parameters.
| Parameter | Level | Valid values | Description |
|---|---|---|---|
loose_join_cond_push_into_derived_mode | Global | REPLICA_ON (default), ON, OFF | REPLICA_ON: enables pushdown on read-only nodes only. ON: enables the condition pushdown feature. OFF: disables pushdown. |
Verify the effect
Use EXPLAIN FORMAT=TREE to confirm that conditions are pushed into the derived table before materialization.
Pushdown applies when all columns in the condition expression (or their equivalent columns) originate from the materialized derived table.
Set up sample tables:
CREATE TABLE t1 (a INT, b INT, c INT, d INT);
CREATE TABLE t2 (e INT, f INT, g INT);With pushdown enabled (loose_join_cond_push_into_derived_mode = ON):
EXPLAIN FORMAT=TREE SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) dt ON dt.x > t1.a WHERE t1.a = 1;-> Left hash join (no condition)
-> Filter: (t1.a = 1) (cost=0.55 rows=1)
-> Table scan on t1 (cost=0.55 rows=3)
-> Hash
-> Table scan on dt
-> Materialize
-> Filter: (t2.x > 1) (cost=0.45 rows=1)
-> Table scan on t2 (cost=0.45 rows=2)The condition t2.x > 1 appears inside Materialize, confirming that filtering happens before the derived table is materialized.
With pushdown disabled (loose_join_cond_push_into_derived_mode = OFF):
EXPLAIN FORMAT=TREE SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) dt ON dt.x > t1.a WHERE t1.a=1;-> Left hash join (no condition)
-> Filter: (t1.a = 1) (cost=0.55 rows=1)
-> Table scan on t1 (cost=0.55 rows=3)
-> Hash
-> Filter: (dt.x > 1)
-> Table scan on dt
-> Materialize
-> Table scan on t2 (cost=0.45 rows=2)The Filter: (dt.x > 1) node appears outside Materialize, meaning all rows from t2 are materialized first and then filtered.