全部產品
Search
文件中心

PolarDB:條件下推(WHERE子句下推至IN子查詢)

更新時間:Jun 14, 2025

PolarDB支援條件下推(WHERE子句下推至IN子查詢)功能,可以最佳化複雜查詢,從而顯著提升了查詢效能(嵌套子查詢效果最好)。

版本限制

適用的資料庫引擎版本如下,如何查看核心版本,請參見查詢版本號碼

  • MySQL 8.0.1,且小版本需為8.0.1.1.42及以上版本。

  • MySQL 8.0.2,且小版本需為8.0.2.19及以上版本。

使用情境

在SQL查詢語句中,與GROUP BY結合的子查詢通常會被物化,例如子查詢是IN子查詢時:(FIELD[1],...) IN (SELECT col[1], ... FROM ... GROUP BY ...),並且該子查詢是主查詢WHERE條件中AND子句的一部分,例如:COND(FIELD[1])... AND (FIELD[1],...) IN (SELECT col[1], ... FROM ... GROUP BY ...) ,結果集的每一行都滿足FIELD[1] = col[1]都成立,所以條件COND(FIELD[1])可以被下推到子查詢中。

使用限制

  • 如果子查詢(IN SUBQUERY)有LIMIT限制,則不支援該條件下推,例如:

    SELECT * FROM t WHERE a > 1 AND (a, b) IN (SELECT c, MAX(d) FROM t2 GROUP BY c LIMIT 2);
  • 外層WHERE條件運算式的列和映射到物化表對應的列,存在以下情況則不支援該條件下推:

    • 該列引用了子查詢或者是非確定性(相同的輸入條件下,可能會產生不同的結果),例如:

      SELECT * FROM t WHERE a > 5*RAND() AND (a, b) IN (SELECT c, MAX(d) FROM t2 GROUP BY c);
    • 該列是預存程序或者儲存函數(一般儲存函數較為複雜,重複執行有可能產生負收益,因此不考慮這種情況的下推),例如:

      CREATE FUNCTION f1() RETURNS INT
      BEGIN
      ...
      END;
      
      SELECT * FROM t WHERE a > f1() AND (a, b) IN (SELECT c, MAX(d) FROM t2 GROUP BY c);

使用方法

準備工作

在使用該條件下推之前,需根據您的業務需求調整loose_subquery_cond_pushdown_mode參數。具體操作,請參見設定叢集參數和節點參數

參數說明如下:

參數名稱

層級

說明

loose_subquery_cond_pushdown_mode

Global

WHERE子句下推至IN子查詢的條件下推功能的控制開關。取值範圍如下:

  • REPLICA_ON:僅在唯讀節點開啟該條件下推功能。

  • ON:開啟該條件下推功能。

  • OFF(預設值):關閉該條件下推功能。

也可以通過OPTIMIZE HINT(SUBQUERY_CONDITION_PUSHDOWNNO_SUBQUERY_CONDITION_PUSHDOWN)直接控制是否下推到某個子查詢(無需配置loose_subquery_cond_pushdown_mode)。

樣本如下:

CREATE TABLE t1 (a INT, b INT, c INT, d INT);
CREATE TABLE t2 (e INT, f INT, g INT);
--該查詢語句分別表示當前query block 禁止將條件下推到子查詢@suq1中--
SELECT  /*+ NO_SUBQUERY_CONDITION_PUSHDOWN(@subq1) */ * FROM t1
WHERE t1.c<25 AND
(t1.a,t1.c) IN (SELECT /*+ QB_NAME(subq1) */ t2.e,MAX(t2.g) FROM t2 WHERE t2.e<5 GROUP BY t2.e);

--該查詢語句分別表示當前query block 會將符合規則的條件下推到子查詢@suq1中--
SELECT  /*+ SUBQUERY_CONDITION_PUSHDOWN(@subq1) */ * FROM t1
WHERE t1.c<25 AND
(t1.a,t1.c) IN (SELECT /*+ QB_NAME(subq1) */ t2.e,MAX(t2.g) FROM t2 WHERE t2.e<5 GROUP BY t2.e);

樣本1

開啟從WHERE子句下推至IN子查詢的條件下推功能前後,都執行如下代碼,查看執行計畫:

EXPLAIN FORMAT=TREE SELECT * FROM t1
WHERE t1.c<25 AND
(t1.a,t1.c) IN (SELECT t2.e,MAX(t2.g) FROM t2 WHERE t2.e<5 GROUP BY t2.e);

返回結果如下:

-- 開啟loose_subquery_cond_pushdown_mode前
-> Filter: ((t1.c < 25) and <in_optimizer>((t1.a,t1.c),(t1.a,t1.c) in (select #2)))  (cost=*** rows=***)
    -> Table scan on t1  (cost=*** rows=***)
    -> Select #2 (subquery in condition; run only once)
        -> Filter: ((t1.a = `<materialized_subquery>`.e) and (t1.c = `<materialized_subquery>`.`MAX(t2.g)`))
            -> Limit: 1 row(s)
                -> Index lookup on <materialized_subquery> using <auto_distinct_key> (e=t1.a, MAX(t2.g)=t1.c)
                    -> Materialize with deduplication
                        -> Table scan on <temporary>
                            -> Aggregate using temporary table
                                -> Filter: (t2.e < 5)  (cost=*** rows=***)
                                    -> Table scan on t2  (cost=*** rows=***)
                                
-- 開啟loose_subquery_cond_pushdown_mode後
-> Filter: ((t1.c < 25) and <in_optimizer>((t1.a,t1.c),(t1.a,t1.c) in (select #2)))  (cost=*** rows=***)
    -> Table scan on t1  (cost=*** rows=***)
    -> Select #2 (subquery in condition; run only once)
        -> Filter: ((t1.a = `<materialized_subquery>`.e) and (t1.c = `<materialized_subquery>`.`MAX(t2.g)`))
            -> Limit: 1 row(s)
                -> Index lookup on <materialized_subquery> using <auto_distinct_key> (e=t1.a, MAX(t2.g)=t1.c)
                    -> Materialize with deduplication
                        -> Filter: (max(t2.g) < 25)
                            -> Table scan on <temporary>
                                -> Aggregate using temporary table
                                    -> Filter: (t2.e < 5)  (cost=*** rows=***)
                                        -> Table scan on t2  (cost=*** rows=***)                        
說明

分析SQL語義可以得出,查詢結果表t1的結果集必然滿足t1.c與子查詢中的投影列MAX(t2.g)是等值關係,因此根據主查詢WHERE條件中的t1.c < 25,子查詢也必然滿足MAX(t2.g)<25,通過開啟WHERE條件下推子查詢的功能,最佳化器可以將條件t1.c<25下推至子查詢中,又因為GROUP BY的列不包含g列,因此只能下推到子查詢的HAVING子句上。

樣本2

開啟從WHERE子句下推至IN子查詢的條件下推功能前後,都執行如下代碼,查看執行計畫:

EXPLAIN FORMAT=TREE SELECT * FROM t1
WHERE ((t1.a<2 OR t1.a=5) AND t1.b>3) AND
(t1.a,t1.b,t1.c) IN
(
SELECT t2.e,t2.f,MAX(t2.g)
FROM t2
WHERE t2.e<5
GROUP BY t2.e,t2.f
);

返回結果如下:

-- 開啟loose_subquery_cond_pushdown_mode前
-> Filter: ((t1.b > 3) and ((t1.a < 2) or (t1.a = 5)) and <in_optimizer>((t1.a,t1.b,t1.c),(t1.a,t1.b,t1.c) in (select #2)))  (cost=*** rows=***)
    -> Table scan on t1  (cost=*** rows=***)
    -> Select #2 (subquery in condition; run only once)
        -> Filter: ((t1.a = `<materialized_subquery>`.e) and (t1.b = `<materialized_subquery>`.f) and (t1.c = `<materialized_subquery>`.`MAX(t2.g)`))
            -> Limit: 1 row(s)
                -> Index lookup on <materialized_subquery> using <auto_distinct_key> (e=t1.a, f=t1.b, MAX(t2.g)=t1.c)
                    -> Materialize with deduplication
                        -> Table scan on <temporary>
                            -> Aggregate using temporary table
                                -> Filter: (t2.e < 5)  (cost=*** rows=***)
                                    -> Table scan on t2  (cost=*** rows=***)

-- 開啟loose_subquery_cond_pushdown_mode後
-> Filter: ((t1.b > 3) and ((t1.a < 2) or (t1.a = 5)) and <in_optimizer>((t1.a,t1.b,t1.c),(t1.a,t1.b,t1.c) in (select #2)))  (cost=*** rows=***)
    -> Table scan on t1  (cost=*** rows=***)
    -> Select #2 (subquery in condition; run only once)
        -> Filter: ((t1.a = `<materialized_subquery>`.e) and (t1.b = `<materialized_subquery>`.f) and (t1.c = `<materialized_subquery>`.`MAX(t2.g)`))
            -> Limit: 1 row(s)
                -> Index lookup on <materialized_subquery> using <auto_distinct_key> (e=t1.a, f=t1.b, MAX(t2.g)=t1.c)
                    -> Materialize with deduplication
                        -> Table scan on <temporary>
                            -> Aggregate using temporary table
                                -> Filter: ((t2.e < 5) and (t2.f > 3) and ((t2.e < 2) or (t2.e = 5)))  (cost=*** rows=***)
                                    -> Table scan on t2  (cost=*** rows=***)
                                                                                                               
說明

SQL查詢的主查詢中WHERE條件稍微複雜一些,但通過WHERE條件下推子查詢的原理分析,可以發現主查詢表t1中的列t1.a,t1.b,t1.c分別等價於子查詢中的投影列t2.e,t2.f,MAX(t2.g),因此,對於AND串連的主查詢的WHERE條件((t1.a<2 OR t1.a=5) AND t1.b>3) ,可以推匯出子查詢中對應列應滿足條件((t2.e<2 OR t2.e=5) AND t2.f.b>3)。開啟WHERE條件下推子查詢的功能之後,最佳化器可以將WHERE條件((t1.a<2 OR t1.a=5) AND t1.b>3下推至子查詢中,分析子查詢GROUP BY的列包括t2.e,t2.f,因此下推至子查詢的WHERE子句上。