PolarDB can push down join conditions to materialized derived tables. This feature enables the transfer of condition expressions that adhere to pushdown rules from the JOIN statement's ON clause directly to the materialized derived tables. By advancing the data filtering process, it ensures that predicate conditions are applied to the data source as early as possible, thereby reducing the data volume for subsequent processing and enhancing query performance.
The condition pushdown feature to materialized derived tables discussed in this topic differs from the standard join condition pushdown (JPP) feature, with distinct applicable scenarios. The pushdown technique here relies on the equivalence of WHERE predicates, allowing the derivation of single-table conditions dependent solely on materialized derived tables and their subsequent pushdown. This heuristic rule typically boosts SQL execution performance. In contrast, JPP focuses on ON predicates involving multi-table conditions, which are transformed into related subqueries based on cost estimation during pushdown, without a guaranteed performance improvement. For more information, see join condition pushdown.
Prerequisites
This feature is supported in clusters that use the following database engines. For more information about how to query the database engine version, see Query Version Number.
MySQL 8.0.1 with revision version 8.0.1.1.44 or later.
MySQL 8.0.2 with revision version 8.0.2.2.25 or later.
Limits
Pushdown is not allowed if the materialized derived table has a
LIMITclause. For example:SELECT * FROM t1 LEFT JOIN (SELECT c, MAX(d) FROM t2 GROUP BY c LIMIT 2) dt ON t1.a < dt.a AND t1.a = 1;Pushdown is unsupported if the columns of the outer WHERE condition expression or the columns mapped to the corresponding columns of the materialized derived table meet any of the following conditions:
The column references a subquery or is non-deterministic. Example:
SELECT * FROM t1 LEFT JOIN (SELECT c, MAX(d) FROM t2 GROUP BY c) dt ON t1.a < dt.a AND t1.a = RAND();The column is part of a stored procedure or function. Example:
CREATE FUNCTION f1() RETURNS INT BEGIN ... END; SELECT * FROM t1 LEFT JOIN (SELECT c, MAX(d) FROM t2 GROUP BY c) dt ON t1.a < dt.a AND t1.a = f1();
Usage
Preparations
Before utilizing the condition pushdown feature, configure the loose_join_cond_push_into_derived_mode parameter according to your business needs. For detailed instructions, see Set Cluster and Node Parameters.
The following table describes the parameter.
Parameter name | Level | Description |
loose_join_cond_push_into_derived_mode | Global | The control switch for the condition pushdown feature from join conditions to Derived Tables. Valid values:
|
Examples
If all columns in the condition expression (or their equivalent columns) originate from the materialized derived table, the condition can be pushed down to it.
Execute the following code to create sample tables:
CREATE TABLE t1 (a INT, b INT, c INT, d INT);
CREATE TABLE t2 (e INT, f INT, g INT);Enable the feature by setting
loose_join_cond_push_into_derived_modetoONand execute the following code:EXPLAIN FORMAT=TREE SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) dt ON dt.x > t1.a WHERE t1.a = 1;Sample result:
-> Left hash join (no condition) -> Filter: (t1.a = 1) (cost=0.55 rows=1) -> Table scan on t1 (cost=0.55 rows=3) -> Hash -> Table scan on dt -> Materialize -> Filter: (t2.x > 1) (cost=0.45 rows=1) -> Table scan on t2 (cost=0.45 rows=2)NoteIn the query plan, the equivalence
t1.a = 1from theWHEREclause is successfully passed to theJOINpredicate, resulting indt.x > 1being effectively pushed down to the derived tabledt. Consequently, the data volume indtis reduced, leading to improved query performance.Set
loose_join_cond_push_into_derived_modetoOFFand execute the following code:EXPLAIN FORMAT=TREE SELECT * FROM t1 LEFT JOIN (SELECT * FROM t2) dt ON dt.x > t1.a WHERE t1.a=1;Sample result:
-> Left hash join (no condition) -> Filter: (t1.a = 1) (cost=0.55 rows=1) -> Table scan on t1 (cost=0.55 rows=3) -> Hash -> Filter: (dt.x > 1) -> Table scan on dt -> Materialize -> Table scan on t2 (cost=0.45 rows=2)NoteWith
loose_join_cond_push_into_derived_modeset toOFF, the conditiondt.x > 1is not pushed down to the Derived Tables, and the filtering is performed only after the derived table is materialized.