This topic describes how to use the EXPLAIN and EXPLAIN ANALYZE statements to analyze query execution plans.
Prerequisites
An AnalyticDB for MySQL cluster of version 3.1.3 or later is required.
EXPLAIN
You can use the EXPLAIN statement to evaluate how a query statement will be executed. The result is an estimate for reference only and does not represent the actual execution result.
Syntax
EXPLAIN (format text) <SELECT statement>;
Note: If the SQL query is complex, you can add (format text) to the EXPLAIN statement to make the hierarchy of the plan tree in the result easier to read.
Example
EXPLAIN (format text) SELECT count(*) FROM nation, region, customer WHERE c_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA';
The following result is returned:
Output[count(*)]
│ Outputs: [count:bigint]
│ Estimates: {rows: 1 (8B)}
│ count(*) := count
└─ Aggregate(FINAL)
│ Outputs: [count:bigint]
│ Estimates: {rows: 1 (8B)}
│ count := count(`count_1`)
└─ LocalExchange[SINGLE] ()
│ Outputs: [count_0_1:bigint]
│ Estimates: {rows: 1 (8B)}
└─ RemoteExchange[GATHER]
│ Outputs: [count_0_2:bigint]
│ Estimates: {rows: 1 (8B)}
└─ Aggregate(PARTIAL)
│ Outputs: [count_0_4:bigint]
│ Estimates: {rows: 1 (8B)}
│ count_4 := count(*)
└─ InnerJoin[(`c_nationkey` = `n_nationkey`)][$hashvalue, $hashvalue_0_6]
│ Outputs: []
│ Estimates: {rows: 302035 (4.61MB)}
│ Distribution: REPLICATED
├─ Project[]
│ │ Outputs: [c_nationkey:integer, $hashvalue:bigint]
│ │ Estimates: {rows: 1500000 (5.72MB)}
│ │ $hashvalue := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`c_nationkey`), 0))
│ └─ RuntimeFilter
│ │ Outputs: [c_nationkey:integer]
│ │ Estimates: {rows: 1500000 (5.72MB)}
│ ├─ TableScan[adb:AdbTableHandle{schema=tpch, tableName=customer, partitionColumnHandles=[c_custkey]}]
│ │ Outputs: [c_nationkey:integer]
│ │ Estimates: {rows: 1500000 (5.72MB)}
│ │ c_nationkey := AdbColumnHandle{columnName=c_nationkey, type=4, isIndexed=true}
│ └─ RuntimeCollect
│ │ Outputs: [n_nationkey:integer]
│ │ Estimates: {rows: 5 (60B)}
│ └─ LocalExchange[ROUND_ROBIN] ()
│ │ Outputs: [n_nationkey:integer]
│ │ Estimates: {rows: 5 (60B)}
│ └─ RuntimeScan
│ Outputs: [n_nationkey:integer]
│ Estimates: {rows: 5 (60B)}
└─ LocalExchange[HASH][$hashvalue_0_6] ("n_nationkey")
│ Outputs: [n_nationkey:integer, $hashvalue_0_6:bigint]
│ Estimates: {rows: 5 (60B)}
└─ Project[]
│ Outputs: [n_nationkey:integer, $hashvalue_0_10:bigint]
│ Estimates: {rows: 5 (60B)}
│ $hashvalue_10 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_nationkey`), 0))
└─ RemoteExchange[REPLICATE]
│ Outputs: [n_nationkey:integer]
│ Estimates: {rows: 5 (60B)}
└─ InnerJoin[(`n_regionkey` = `r_regionkey`)][$hashvalue_0_7, $hashvalue_0_8]
│ Outputs: [n_nationkey:integer]
│ Estimates: {rows: 5 (60B)}
│ Distribution: REPLICATED
├─ Project[]
│ │ Outputs: [n_nationkey:integer, n_regionkey:integer, $hashvalue_0_7:bigint]
│ │ Estimates: {rows: 25 (200B)}
│ │ $hashvalue_7 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_regionkey`), 0))
│ └─ RuntimeFilter
│ │ Outputs: [n_nationkey:integer, n_regionkey:integer]
│ │ Estimates: {rows: 25 (200B)}
│ ├─ TableScan[adb:AdbTableHandle{schema=tpch, tableName=nation, partitionColumnHandles=[]}]
│ │ Outputs: [n_nationkey:integer, n_regionkey:integer]
│ │ Estimates: {rows: 25 (200B)}
│ │ n_nationkey := AdbColumnHandle{columnName=n_nationkey, type=4, isIndexed=true}
│ │ n_regionkey := AdbColumnHandle{columnName=n_regionkey, type=4, isIndexed=true}
│ └─ RuntimeCollect
│ │ Outputs: [r_regionkey:integer]
│ │ Estimates: {rows: 1 (4B)}
│ └─ LocalExchange[ROUND_ROBIN] ()
│ │ Outputs: [r_regionkey:integer]
│ │ Estimates: {rows: 1 (4B)}
│ └─ RuntimeScan
│ Outputs: [r_regionkey:integer]
│ Estimates: {rows: 1 (4B)}
└─ LocalExchange[HASH][$hashvalue_0_8] ("r_regionkey")
│ Outputs: [r_regionkey:integer, $hashvalue_0_8:bigint]
│ Estimates: {rows: 1 (4B)}
└─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=region, partitionColumnHandles=[]}]
Outputs: [r_regionkey:integer, $hashvalue_0_9:bigint]
Estimates: {rows: 1 (4B)}/{rows: 1 (B)}
$hashvalue_9 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`r_regionkey`), 0))
r_regionkey := AdbColumnHandle{columnName=r_regionkey, type=4, isIndexed=true}
The main parameters in the result are described in the following table.
Parameter: Outputs: [symbol:type]
Description: The output columns and their data types of each operator.

Parameter: Estimates: {rows: %s (%sB)}
Description: The estimated number of rows and amount of data for each operator. The optimizer uses these estimates to determine the join order and data shuffle.
EXPLAIN ANALYZE
You can use the EXPLAIN ANALYZE statement to view the distributed execution plan of a query together with its actual execution cost, including execution time, memory usage, and the amount of input and output data.
Syntax
EXPLAIN ANALYZE <SELECT statement>;
Example
EXPLAIN ANALYZE SELECT count(*) FROM nation, region, customer WHERE c_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA';
The following result is returned:
Fragment 1 [SINGLE]
Output: 1 row (9B), PeakMemory: 178KB, WallTime: 1.00ns, Input: 32 rows (288B); per task: avg.: 32.00 std.dev.: 0.00
Output layout: [count]
Output partitioning: SINGLE []
Aggregate(FINAL)
│ Outputs: [count:bigint]
│ Estimates: {rows: 1 (8B)}
│ Output: 2 rows (18B), PeakMemory: 24B (0.00%), WallTime: 70.39us (0.03%)
│ count := count(`count_1`)
└─ LocalExchange[SINGLE] ()
│ Outputs: [count1:bigint]
│ Estimates: {rows: ? (?)}
│ Output: 64 rows (576B), PeakMemory: 8KB (0.07%), WallTime: 238.69us (0.10%)
└─ RemoteSource[2]
Outputs: [count2:bigint]
Estimates:
Output: 32 rows (288B), PeakMemory: 32KB (0.27%), WallTime: 182.82us (0.08%)
Input avg.: 4.00 rows, Input std.dev.: 264.58%

Fragment 2 [adb:AdbPartitioningHandle{schema=tpch, tableName=customer, dimTable=false, shards=32, tableEngineType=Cstore, partitionColumns=c_custkey, prunedBuckets= empty}]
Output: 32 rows (288B), PeakMemory: 6MB, WallTime: 164.00ns, Input: 1500015 rows (20.03MB); per task: avg.: 500005.00 std.dev.: 21941.36
Output layout: [count4]
Output partitioning: SINGLE []
Aggregate(PARTIAL)
│ Outputs: [count4:bigint]
│ Estimates: {rows: 1 (8B)}
│ Output: 64 rows (576B), PeakMemory: 336B (0.00%), WallTime: 1.01ms (0.42%)
│ count_4 := count(*)
└─ INNER Join[(`c_nationkey` = `n_nationkey`)][$hashvalue, $hashvalue6]
│ Outputs: []
│ Estimates: {rows: 302035 (4.61MB)}
│ Output: 300285 rows (210B), PeakMemory: 641KB (5.29%), WallTime: 99.08ms (41.45%)
│ Left (probe) Input avg.: 46875.00 rows, Input std.dev.: 311.24%
│ Right (build) Input avg.: 0.63 rows, Input std.dev.: 264.58%
│ Distribution: REPLICATED
├─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=customer, partitionColumnHandles=[c_custkey]}]
│ Outputs: [c_nationkey:integer, $hashvalue:bigint]
│ Estimates: {rows: 1500000 (5.72MB)}/{rows: 1500000 (5.72MB)}
│ Output: 1500000 rows (20.03MB), PeakMemory: 5MB (44.38%), WallTime: 68.29ms (28.57%)
│ $hashvalue := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`c_nationkey`), 0))
│ c_nationkey := AdbColumnHandle{columnName=c_nationkey, type=4, isIndexed=true}
│ Input: 1500000 rows (7.15MB), Filtered: 0.00%
└─ LocalExchange[HASH][$hashvalue6] ("n_nationkey")
│ Outputs: [n_nationkey:integer, $hashvalue6:bigint]
│ Estimates: {rows: 5 (60B)}
│ Output: 30 rows (420B), PeakMemory: 394KB (3.26%), WallTime: 455.03us (0.19%)
└─ Project[]
│ Outputs: [n_nationkey:integer, $hashvalue10:bigint]
│ Estimates: {rows: 5 (60B)}
│ Output: 15 rows (210B), PeakMemory: 24KB (0.20%), WallTime: 83.61us (0.03%)
│ Input avg.: 0.63 rows, Input std.dev.: 264.58%
│ $hashvalue_10 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_nationkey`), 0))
└─ RemoteSource[3]
Outputs: [n_nationkey:integer]
Estimates:
Output: 15 rows (75B), PeakMemory: 24KB (0.20%), WallTime: 45.97us (0.02%)
Input avg.: 0.63 rows, Input std.dev.: 264.58%

Fragment 3 [adb:AdbPartitioningHandle{schema=tpch, tableName=nation, dimTable=true, shards=32, tableEngineType=Cstore, partitionColumns=, prunedBuckets= empty}]
Output: 5 rows (25B), PeakMemory: 185KB, WallTime: 1.00ns, Input: 26 rows (489B); per task: avg.: 26.00 std.dev.: 0.00
Output layout: [n_nationkey]
Output partitioning: BROADCAST []
INNER Join[(`n_regionkey` = `r_regionkey`)][$hashvalue7, $hashvalue8]
│ Outputs: [n_nationkey:integer]
│ Estimates: {rows: 5 (60B)}
│ Output: 11 rows (64B), PeakMemory: 152KB (1.26%), WallTime: 255.86us (0.11%)
│ Left (probe) Input avg.: 25.00 rows, Input std.dev.: 0.00%
│ Right (build) Input avg.: 0.13 rows, Input std.dev.: 264.58%
│ Distribution: REPLICATED
├─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=nation, partitionColumnHandles=[]}]
│ Outputs: [n_nationkey:integer, n_regionkey:integer, $hashvalue7:bigint]
│ Estimates: {rows: 25 (200B)}/{rows: 25 (200B)}
│ Output: 25 rows (475B), PeakMemory: 16KB (0.13%), WallTime: 178.81us (0.07%)
│ $hashvalue_7 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_regionkey`), 0))
│ n_nationkey := AdbColumnHandle{columnName=n_nationkey, type=4, isIndexed=true}
│ n_regionkey := AdbColumnHandle{columnName=n_regionkey, type=4, isIndexed=true}
│ Input: 25 rows (250B), Filtered: 0.00%
└─ LocalExchange[HASH][$hashvalue8] ("r_regionkey")
│ Outputs: [r_regionkey:integer, $hashvalue8:bigint]
│ Estimates: {rows: 1 (4B)}
│ Output: 2 rows (28B), PeakMemory: 34KB (0.29%), WallTime: 57.41us (0.02%)
└─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=region, partitionColumnHandles=[]}]
Outputs: [r_regionkey:integer, $hashvalue9:bigint]
Estimates: {rows: 1 (4B)}/{rows: 1 (4B)}
Output: 1 row (14B), PeakMemory: 8KB (0.07%), WallTime: 308.99us (0.13%)
$hashvalue_9 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`r_regionkey`), 0))
r_regionkey := AdbColumnHandle{columnName=r_regionkey, type=4, isIndexed=true}
Input: 1 row (5B), Filtered: 0.00%
The main parameters in the result are described in the following table.
Parameter: Outputs: [symbol:type]
Description: The output columns and their data types of each operator.

Parameter: Estimates: {rows: %s (%sB)}
Description: The estimated number of rows and amount of data for each operator. The optimizer uses these estimates to determine the join order and data shuffle.

Parameter: PeakMemory: %s
Description: The total memory usage, which helps you locate memory bottlenecks.

Parameter: WallTime: %s
Description: The accumulated execution time of the operator, which helps you locate computation bottlenecks. Note: Because operators run in parallel, this value is not the actual elapsed time.

Parameter: Input: %s rows (%sB)
Description: The number of input rows and the amount of input data.

Parameter: per task: avg.: %s std.dev.: %s
Description: The average number of rows per task and the standard deviation, which help you detect data skew within a stage (see the sketch after this table).

Parameter: Output: %s row (%sB)
Description: The number of output rows and the amount of output data.
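For example, if the per task standard deviation is high for a stage that scans or joins the customer table on c_nationkey, you can check how rows are distributed across that key to confirm the skew. The following query is only a minimal sketch against the TPC-H tables used in the examples above; it is not part of the EXPLAIN ANALYZE output.
-- Hedged sketch: count rows per join-key value to see whether a few values
-- dominate and could explain a high "per task" std.dev. in the plan.
SELECT c_nationkey, count(*) AS rows_per_key
FROM customer
GROUP BY c_nationkey
ORDER BY rows_per_key DESC
LIMIT 10;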
Use cases
You can use EXPLAIN ANALYZE to analyze common plan issues.
Filter not pushed down
In the following two queries, SQL 2 contains the function length(string_test), which cannot be pushed down. Compared with SQL 1, SQL 2 therefore has to scan all data to evaluate the filter:
SQL 1
SELECT count(*) FROM test WHERE string_test = 'a';
SQL 2
SELECT count(*) FROM test WHERE length(string_test) = 1;
Use EXPLAIN ANALYZE to analyze the execution plans of the two queries and compare Fragment 2 in each plan:
SQL 1 uses the TableScan operator and its Input avg. is 0.00 rows, which indicates that the filter was pushed down successfully and 0 rows were scanned.
Fragment 2 [adb:AdbPartitioningHandle{schema=test4dmp, tableName=test, dimTable=false, shards=4, tableEngineType=Cstore, partitionColumns=id, prunedBuckets= empty}]
Output: 4 rows (36B), PeakMemory: 0B, WallTime: 6.00ns, Input: 0 rows (0B); per task: avg.: 0.00 std.dev.: 0.00
Output layout: [count_0_1]
Output partitioning: SINGLE []
Aggregate(PARTIAL)
│ Outputs: [count_0_1:bigint]
│ Estimates: {rows: 1 (8B)}
│ Output: 8 rows (72B), PeakMemory: 0B (0.00%), WallTime: 212.92us (3.99%)
│ count_0_1 := count(*)
└─ TableScan[adb:AdbTableHandle{schema=test4dmp, tableName=test, partitionColumnHandles=[id]}]
Outputs: []
Estimates: {rows: 4 (0B)}
Output: 0 rows (0B), PeakMemory: 0B (0.00%), WallTime: 4.76ms (89.12%)
Input avg.: 0.00 rows, Input std.dev.: ?%
SQL 2 uses the ScanFilterProject operator and its Input is 9999 rows. In addition, the filterPredicate property is not empty (filterPredicate = (`test4dmp`.`length`(`string_test`) = BIGINT '1')), which indicates that the filter was not pushed down and 9999 rows were scanned.
Fragment 2 [adb:AdbPartitioningHandle{schema=test4dmp, tableName=test, dimTable=false, shards=4, tableEngineType=Cstore, partitionColumns=id, prunedBuckets= empty}]
Output: 4 rows (36B), PeakMemory: 0B, WallTime: 102.00ns, Input: 0 rows (0B); per task: avg.: 0.00 std.dev.: 0.00
Output layout: [count_0_1]
Output partitioning: SINGLE []
Aggregate(PARTIAL)
│ Outputs: [count_0_1:bigint]
│ Estimates: {rows: 1 (8B)}
│ Output: 8 rows (72B), PeakMemory: 0B (0.00%), WallTime: 252.23us (0.12%)
│ count_0_1 := count(*)
└─ ScanFilterProject[table = adb:AdbTableHandle{schema=test4dmp, tableName=test, partitionColumnHandles=[id]}, filterPredicate = (`test4dmp`.`length`(`string_test`) = BIGINT '1')]
Outputs: []
Estimates: {rows: 9999 (312.47kB)}/{rows: 9999 (312.47kB)}/{rows: ? (?)}
Output: 0 rows (0B), PeakMemory: 0B (0.00%), WallTime: 101.31ms (49.84%)
string_test := AdbColumnHandle{columnName=string_test, type=13, isIndexed=true}
Input: 9999 rows (110.32kB), Filtered: 100.00%
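One possible remedy, sketched below under the assumption that the business logic allows it, is to materialize the result of the non-pushable function into its own column so that the filter becomes a plain column comparison that can be pushed down. The column name string_test_len is hypothetical and not part of the original example.
-- Hedged sketch: persist length(string_test) at write time (hypothetical
-- column name), then filter on the plain column so the predicate can be
-- pushed down to the storage layer.
ALTER TABLE test ADD COLUMN string_test_len INT;
UPDATE test SET string_test_len = length(string_test);
SELECT count(*) FROM test WHERE string_test_len = 1;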
Memory usage of bad SQL
You can directly check the PeakMemory of each fragment to locate resource consumption issues. Excluding cases of Disaster Broadcast in the plan, high PeakMemory is usually caused by data expansion in a join, an oversized left (probe) table in a join, or a TableScan operator that scans too much data. In these cases, add business-level conditions to limit the amount of data. You can also check the PeakMemory percentage of each operator to locate the operator that consumes the most resources and then analyze it further.
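For example, if EXPLAIN ANALYZE shows that the ScanProject on the customer table dominates PeakMemory, as in Fragment 2 of the earlier example, a business-level filter on the probe side reduces the join input and therefore the memory footprint. The c_mktsegment condition below is only an assumed example of such a restriction, not a documented recommendation.
-- Hedged sketch: restrict the large probe-side table before the join;
-- the predicate on c_mktsegment is an assumed business condition.
EXPLAIN ANALYZE
SELECT count(*)
FROM nation, region, customer
WHERE c_nationkey = n_nationkey
  AND n_regionkey = r_regionkey
  AND r_name = 'ASIA'
  AND c_mktsegment = 'BUILDING';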