PolarDB for MySQL supports the join condition pushdown feature. After the outer join conditions of a derived table are pushed down, execution plans of the derived table can use indexes more efficiently, which can greatly improve the performance of complex queries.
Prerequisites
The cluster is of PolarDB for MySQL 8.0 and the revision version is 8.0.2.2.10 or later.
Background information
Derived tables (or inline views) are widely used in complex analytical queries. It can simplify SQL statement construction and describe query semantics in a straightforward manner. In the native MySQL, if a derived table cannot be expanded in outer queries (including operations such as GROUP BY and aggregate functions), it must be executed in materialized mode. If an SQL statement involves enormous data (for example, a large amount of table data needs to be scanned), the execution efficiency is very low. After the outer join conditions of the derived table are pushed down, execution plans of the derived table can use indexes more efficiently, which can greatly improve the performance of complex queries.
Scenarios
In complex queries, derived tables and outer tables that are joined in a nested loop manner. Because joined columns are at the inner layer of the derived table, indexes can be used to accelerate the materialization of the inner table. Accurate statistics are required to ensure that a large amount of data can be filtered out at the inner layer after join conditions are pushed down.
When join conditions are pushed down in inner queries and if indexes can be effectively used and a large amount of data is filtered out, a more efficient execution plan is generated. The join condition pushdown feature depends on the cost calculation capability of the optimizer to intelligently determine whether to push down outer join conditions.
Usage
You can use the loose_join_predicate_pushdown_opt_mode parameter to enable the join condition pushdown feature. For more information, see Specify cluster and node parameters.
Parameter | Level | Description |
loose_join_predicate_pushdown_opt_mode | Global | Specifies whether to enable the join condition pushdown feature. Default value: REPLICA_ON. Valid values:
|
Examples
Original query
In the original query, because the os
derived table needs to be fully materialized and no filter conditions can provide a high selection rate, it takes a long time to materialize the os
derived table. This query takes about 65 seconds.
SELECT *
FROM (
SELECT *
FROM sample_table.tb_order
WHERE create_date >= DATE_SUB(CAST('2022-12-05 15:12:05' AS datetime), INTERVAL 5 MINUTE)
AND product_type IN (2, 4)
) o
LEFT JOIN (
SELECT *
FROM sample_table.tb_order_detailed
WHERE update_time >= DATE_SUB('2022-12-05 15:12:05', INTERVAL 50 MINUTE)
) od
ON o.order_id = od.order_id
LEFT JOIN (
SELECT t.*, row_number() OVER (PARTITION BY detail_id ORDER BY update_date DESC) AS rn
FROM sample_table.tb_order_sku t
WHERE update_date >= DATE_SUB('2022-12-05 15:12:05', INTERVAL 50 MINUTE)
AND coalesce(product_type, '0') <> '5'
) os
ON od.id = os.detail_id;
Optimized query
After the join condition pushdown feature is enabled, the join condition od.id = os.detail_id
between the o
and os
tables is pushed down to the inner layer of the os
table. In this case, the os
table can more efficiently use the detail_id
index to filter out a large amount of data to improve execution efficiency. This query takes about 0.5 seconds.
SELECT *
FROM (
SELECT *
FROM db_order.tb_order
WHERE create_date >= DATE_SUB(CAST('2022-12-05 15:12:05' AS datetime), INTERVAL 5 MINUTE)
AND product_type IN (2, 4)
) o
LEFT JOIN (
SELECT *
FROM db_order.tb_order_detailed
WHERE update_time >= DATE_SUB('2022-12-05 15:12:05', INTERVAL 50 MINUTE)
) od
ON o.order_id = od.order_id
LEFT JOIN LATERAL((
SELECT t.*, row_number() OVER (PARTITION BY detail_id ORDER BY update_date DESC) AS rn
FROM db_order.tb_order_sku t
WHERE update_date >= DATE_SUB('2022-12-05 15:12:05', INTERVAL 50 MINUTE)
AND coalesce(product_type, '0') <> '5'
AND od.id = detail_id
)) os;