This topic describes how to use the EXPLAIN and EXPLAIN ANALYZE statements to analyze query execution plans.
Prerequisites
An AnalyticDB for MySQL cluster of version 3.1.3 or later is required.
EXPLAIN
You can use the EXPLAIN statement to evaluate how a query statement will be executed. The result is an estimate for reference only and does not represent the actual execution result.
Syntax
EXPLAIN (format text) <SELECT statement>;
Note: If the SQL query is complex, you can add (format text) to the EXPLAIN statement to make the hierarchy of the plan tree in the result easier to read.
Example
EXPLAIN (format text) SELECT count(*) FROM nation, region, customer WHERE c_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA';
The following result is returned:
Output[count(*)]
│ Outputs: [count:bigint]
│ Estimates: {rows: 1 (8B)}
│ count(*) := count
└─ Aggregate(FINAL)
│ Outputs: [count:bigint]
│ Estimates: {rows: 1 (8B)}
│ count := count(`count_1`)
└─ LocalExchange[SINGLE] ()
│ Outputs: [count_0_1:bigint]
│ Estimates: {rows: 1 (8B)}
└─ RemoteExchange[GATHER]
│ Outputs: [count_0_2:bigint]
│ Estimates: {rows: 1 (8B)}
└─ Aggregate(PARTIAL)
│ Outputs: [count_0_4:bigint]
│ Estimates: {rows: 1 (8B)}
│ count_4 := count(*)
└─ InnerJoin[(`c_nationkey` = `n_nationkey`)][$hashvalue, $hashvalue_0_6]
│ Outputs: []
│ Estimates: {rows: 302035 (4.61MB)}
│ Distribution: REPLICATED
├─ Project[]
│ │ Outputs: [c_nationkey:integer, $hashvalue:bigint]
│ │ Estimates: {rows: 1500000 (5.72MB)}
│ │ $hashvalue := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`c_nationkey`), 0))
│ └─ RuntimeFilter
│ │ Outputs: [c_nationkey:integer]
│ │ Estimates: {rows: 1500000 (5.72MB)}
│ ├─ TableScan[adb:AdbTableHandle{schema=tpch, tableName=customer, partitionColumnHandles=[c_custkey]}]
│ │ Outputs: [c_nationkey:integer]
│ │ Estimates: {rows: 1500000 (5.72MB)}
│ │ c_nationkey := AdbColumnHandle{columnName=c_nationkey, type=4, isIndexed=true}
│ └─ RuntimeCollect
│ │ Outputs: [n_nationkey:integer]
│ │ Estimates: {rows: 5 (60B)}
│ └─ LocalExchange[ROUND_ROBIN] ()
│ │ Outputs: [n_nationkey:integer]
│ │ Estimates: {rows: 5 (60B)}
│ └─ RuntimeScan
│ Outputs: [n_nationkey:integer]
│ Estimates: {rows: 5 (60B)}
└─ LocalExchange[HASH][$hashvalue_0_6] ("n_nationkey")
│ Outputs: [n_nationkey:integer, $hashvalue_0_6:bigint]
│ Estimates: {rows: 5 (60B)}
└─ Project[]
│ Outputs: [n_nationkey:integer, $hashvalue_0_10:bigint]
│ Estimates: {rows: 5 (60B)}
│ $hashvalue_10 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_nationkey`), 0))
└─ RemoteExchange[REPLICATE]
│ Outputs: [n_nationkey:integer]
│ Estimates: {rows: 5 (60B)}
└─ InnerJoin[(`n_regionkey` = `r_regionkey`)][$hashvalue_0_7, $hashvalue_0_8]
│ Outputs: [n_nationkey:integer]
│ Estimates: {rows: 5 (60B)}
│ Distribution: REPLICATED
├─ Project[]
│ │ Outputs: [n_nationkey:integer, n_regionkey:integer, $hashvalue_0_7:bigint]
│ │ Estimates: {rows: 25 (200B)}
│ │ $hashvalue_7 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_regionkey`), 0))
│ └─ RuntimeFilter
│ │ Outputs: [n_nationkey:integer, n_regionkey:integer]
│ │ Estimates: {rows: 25 (200B)}
│ ├─ TableScan[adb:AdbTableHandle{schema=tpch, tableName=nation, partitionColumnHandles=[]}]
│ │ Outputs: [n_nationkey:integer, n_regionkey:integer]
│ │ Estimates: {rows: 25 (200B)}
│ │ n_nationkey := AdbColumnHandle{columnName=n_nationkey, type=4, isIndexed=true}
│ │ n_regionkey := AdbColumnHandle{columnName=n_regionkey, type=4, isIndexed=true}
│ └─ RuntimeCollect
│ │ Outputs: [r_regionkey:integer]
│ │ Estimates: {rows: 1 (4B)}
│ └─ LocalExchange[ROUND_ROBIN] ()
│ │ Outputs: [r_regionkey:integer]
│ │ Estimates: {rows: 1 (4B)}
│ └─ RuntimeScan
│ Outputs: [r_regionkey:integer]
│ Estimates: {rows: 1 (4B)}
└─ LocalExchange[HASH][$hashvalue_0_8] ("r_regionkey")
│ Outputs: [r_regionkey:integer, $hashvalue_0_8:bigint]
│ Estimates: {rows: 1 (4B)}
└─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=region, partitionColumnHandles=[]}]
Outputs: [r_regionkey:integer, $hashvalue_0_9:bigint]
Estimates: {rows: 1 (4B)}/{rows: 1 (B)}
$hashvalue_9 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`r_regionkey`), 0))
r_regionkey := AdbColumnHandle{columnName=r_regionkey, type=4, isIndexed=true}
The main parameters in the result are described in the following table.
Parameter: Outputs: [symbol:type]
Description: The output columns and their data types of each operator.

Parameter: Estimates: {rows: %s (%sB)}
Description: The estimated number of rows and amount of data for each operator. The optimizer uses these estimates to determine the join order and data shuffle.
EXPLAIN ANALYZE
You can use the EXPLAIN ANALYZE statement to view the distributed execution plan of a query together with its actual execution cost, including execution time, memory usage, and the amount of input and output data.
Syntax
EXPLAIN ANALYZE <SELECT statement>;
Example
EXPLAIN ANALYZE SELECT count(*) FROM nation, region, customer WHERE c_nationkey = n_nationkey AND n_regionkey = r_regionkey AND r_name = 'ASIA';
The following result is returned:
Fragment 1 [SINGLE]
Output: 1 row (9B), PeakMemory: 178KB, WallTime: 1.00ns, Input: 32 rows (288B); per task: avg.: 32.00 std.dev.: 0.00
Output layout: [count]
Output partitioning: SINGLE []
Aggregate(FINAL)
│ Outputs: [count:bigint]
│ Estimates: {rows: 1 (8B)}
│ Output: 2 rows (18B), PeakMemory: 24B (0.00%), WallTime: 70.39us (0.03%)
│ count := count(`count_1`)
└─ LocalExchange[SINGLE] ()
│ Outputs: [count1:bigint]
│ Estimates: {rows: ? (?)}
│ Output: 64 rows (576B), PeakMemory: 8KB (0.07%), WallTime: 238.69us (0.10%)
└─ RemoteSource[2]
Outputs: [count2:bigint]
Estimates:
Output: 32 rows (288B), PeakMemory: 32KB (0.27%), WallTime: 182.82us (0.08%)
Input avg.: 4.00 rows, Input std.dev.: 264.58%

Fragment 2 [adb:AdbPartitioningHandle{schema=tpch, tableName=customer, dimTable=false, shards=32, tableEngineType=Cstore, partitionColumns=c_custkey, prunedBuckets= empty}]
Output: 32 rows (288B), PeakMemory: 6MB, WallTime: 164.00ns, Input: 1500015 rows (20.03MB); per task: avg.: 500005.00 std.dev.: 21941.36
Output layout: [count4]
Output partitioning: SINGLE []
Aggregate(PARTIAL)
│ Outputs: [count4:bigint]
│ Estimates: {rows: 1 (8B)}
│ Output: 64 rows (576B), PeakMemory: 336B (0.00%), WallTime: 1.01ms (0.42%)
│ count_4 := count(*)
└─ INNER Join[(`c_nationkey` = `n_nationkey`)][$hashvalue, $hashvalue6]
│ Outputs: []
│ Estimates: {rows: 302035 (4.61MB)}
│ Output: 300285 rows (210B), PeakMemory: 641KB (5.29%), WallTime: 99.08ms (41.45%)
│ Left (probe) Input avg.: 46875.00 rows, Input std.dev.: 311.24%
│ Right (build) Input avg.: 0.63 rows, Input std.dev.: 264.58%
│ Distribution: REPLICATED
├─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=customer, partitionColumnHandles=[c_custkey]}]
│ Outputs: [c_nationkey:integer, $hashvalue:bigint]
│ Estimates: {rows: 1500000 (5.72MB)}/{rows: 1500000 (5.72MB)}
│ Output: 1500000 rows (20.03MB), PeakMemory: 5MB (44.38%), WallTime: 68.29ms (28.57%)
│ $hashvalue := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`c_nationkey`), 0))
│ c_nationkey := AdbColumnHandle{columnName=c_nationkey, type=4, isIndexed=true}
│ Input: 1500000 rows (7.15MB), Filtered: 0.00%
└─ LocalExchange[HASH][$hashvalue6] ("n_nationkey")
│ Outputs: [n_nationkey:integer, $hashvalue6:bigint]
│ Estimates: {rows: 5 (60B)}
│ Output: 30 rows (420B), PeakMemory: 394KB (3.26%), WallTime: 455.03us (0.19%)
└─ Project[]
│ Outputs: [n_nationkey:integer, $hashvalue10:bigint]
│ Estimates: {rows: 5 (60B)}
│ Output: 15 rows (210B), PeakMemory: 24KB (0.20%), WallTime: 83.61us (0.03%)
│ Input avg.: 0.63 rows, Input std.dev.: 264.58%
│ $hashvalue_10 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_nationkey`), 0))
└─ RemoteSource[3]
Outputs: [n_nationkey:integer]
Estimates:
Output: 15 rows (75B), PeakMemory: 24KB (0.20%), WallTime: 45.97us (0.02%)
Input avg.: 0.63 rows, Input std.dev.: 264.58%

Fragment 3 [adb:AdbPartitioningHandle{schema=tpch, tableName=nation, dimTable=true, shards=32, tableEngineType=Cstore, partitionColumns=, prunedBuckets= empty}]
Output: 5 rows (25B), PeakMemory: 185KB, WallTime: 1.00ns, Input: 26 rows (489B); per task: avg.: 26.00 std.dev.: 0.00
Output layout: [n_nationkey]
Output partitioning: BROADCAST []
INNER Join[(`n_regionkey` = `r_regionkey`)][$hashvalue7, $hashvalue8]
│ Outputs: [n_nationkey:integer]
│ Estimates: {rows: 5 (60B)}
│ Output: 11 rows (64B), PeakMemory: 152KB (1.26%), WallTime: 255.86us (0.11%)
│ Left (probe) Input avg.: 25.00 rows, Input std.dev.: 0.00%
│ Right (build) Input avg.: 0.13 rows, Input std.dev.: 264.58%
│ Distribution: REPLICATED
├─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=nation, partitionColumnHandles=[]}]
│ Outputs: [n_nationkey:integer, n_regionkey:integer, $hashvalue7:bigint]
│ Estimates: {rows: 25 (200B)}/{rows: 25 (200B)}
│ Output: 25 rows (475B), PeakMemory: 16KB (0.13%), WallTime: 178.81us (0.07%)
│ $hashvalue_7 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`n_regionkey`), 0))
│ n_nationkey := AdbColumnHandle{columnName=n_nationkey, type=4, isIndexed=true}
│ n_regionkey := AdbColumnHandle{columnName=n_regionkey, type=4, isIndexed=true}
│ Input: 25 rows (250B), Filtered: 0.00%
└─ LocalExchange[HASH][$hashvalue8] ("r_regionkey")
│ Outputs: [r_regionkey:integer, $hashvalue8:bigint]
│ Estimates: {rows: 1 (4B)}
│ Output: 2 rows (28B), PeakMemory: 34KB (0.29%), WallTime: 57.41us (0.02%)
└─ ScanProject[table = adb:AdbTableHandle{schema=tpch, tableName=region, partitionColumnHandles=[]}]
Outputs: [r_regionkey:integer, $hashvalue9:bigint]
Estimates: {rows: 1 (4B)}/{rows: 1 (4B)}
Output: 1 row (14B), PeakMemory: 8KB (0.07%), WallTime: 308.99us (0.13%)
$hashvalue_9 := `combine_hash`(BIGINT '0', COALESCE(`$operator$hash_code`(`r_regionkey`), 0))
r_regionkey := AdbColumnHandle{columnName=r_regionkey, type=4, isIndexed=true}
Input: 1 row (5B), Filtered: 0.00%
The main parameters in the result are described in the following table.
Parameter: Outputs: [symbol:type]
Description: The output columns and their data types of each operator.

Parameter: Estimates: {rows: %s (%sB)}
Description: The estimated number of rows and amount of data for each operator. The optimizer uses these estimates to determine the join order and data shuffle.

Parameter: PeakMemory: %s
Description: The total memory usage, which helps you locate memory bottlenecks.

Parameter: WallTime: %s
Description: The accumulated execution time of the operator, which helps you locate computation bottlenecks. Note: Because operators run in parallel, this value is not the actual elapsed time.

Parameter: Input: %s rows (%sB)
Description: The number of input rows and the amount of input data.

Parameter: per task: avg.: %s std.dev.: %s
Description: The average number of rows per task and the standard deviation, which help you detect data skew within a stage (see the sketch after this table).

Parameter: Output: %s row (%sB)
Description: The number of output rows and the amount of output data.
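For example, if the per task standard deviation is high for a stage that scans or joins the customer table on c_nationkey, you can check how rows are distributed across that key to confirm the skew. The following query is only a minimal sketch against the TPC-H tables used in the examples above; it is not part of the EXPLAIN ANALYZE output.
-- Hedged sketch: count rows per join-key value to see whether a few values
-- dominate and could explain a high "per task" std.dev. in the plan.
SELECT c_nationkey, count(*) AS rows_per_key
FROM customer
GROUP BY c_nationkey
ORDER BY rows_per_key DESC
LIMIT 10;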
Use cases
You can use EXPLAIN ANALYZE to analyze common plan issues.
Filter not pushed down
In the following two queries, SQL 2 contains the function length(string_test), which cannot be pushed down. Compared with SQL 1, SQL 2 therefore has to scan all data to evaluate the filter:
SQL 1
SELECT count(*) FROM test WHERE string_test = 'a';
SQL 2
SELECT count(*) FROM test WHERE length(string_test) = 1;
Use EXPLAIN ANALYZE to analyze the execution plans of the two queries and compare Fragment 2 in each plan:
SQL 1 uses the TableScan operator and its Input avg. is 0.00 rows, which indicates that the filter was pushed down successfully and 0 rows were scanned.
Fragment 2 [adb:AdbPartitioningHandle{schema=test4dmp, tableName=test, dimTable=false, shards=4, tableEngineType=Cstore, partitionColumns=id, prunedBuckets= empty}]
Output: 4 rows (36B), PeakMemory: 0B, WallTime: 6.00ns, Input: 0 rows (0B); per task: avg.: 0.00 std.dev.: 0.00
Output layout: [count_0_1]
Output partitioning: SINGLE []
Aggregate(PARTIAL)
│ Outputs: [count_0_1:bigint]
│ Estimates: {rows: 1 (8B)}
│ Output: 8 rows (72B), PeakMemory: 0B (0.00%), WallTime: 212.92us (3.99%)
│ count_0_1 := count(*)
└─ TableScan[adb:AdbTableHandle{schema=test4dmp, tableName=test, partitionColumnHandles=[id]}]
Outputs: []
Estimates: {rows: 4 (0B)}
Output: 0 rows (0B), PeakMemory: 0B (0.00%), WallTime: 4.76ms (89.12%)
Input avg.: 0.00 rows, Input std.dev.: ?%
SQL 2 uses the ScanFilterProject operator and its Input is 9999 rows. In addition, the filterPredicate property is not empty (filterPredicate = (`test4dmp`.`length`(`string_test`) = BIGINT '1')), which indicates that the filter was not pushed down and 9999 rows were scanned.
Fragment 2 [adb:AdbPartitioningHandle{schema=test4dmp, tableName=test, dimTable=false, shards=4, tableEngineType=Cstore, partitionColumns=id, prunedBuckets= empty}]
Output: 4 rows (36B), PeakMemory: 0B, WallTime: 102.00ns, Input: 0 rows (0B); per task: avg.: 0.00 std.dev.: 0.00
Output layout: [count_0_1]
Output partitioning: SINGLE []
Aggregate(PARTIAL)
│ Outputs: [count_0_1:bigint]
│ Estimates: {rows: 1 (8B)}
│ Output: 8 rows (72B), PeakMemory: 0B (0.00%), WallTime: 252.23us (0.12%)
│ count_0_1 := count(*)
└─ ScanFilterProject[table = adb:AdbTableHandle{schema=test4dmp, tableName=test, partitionColumnHandles=[id]}, filterPredicate = (`test4dmp`.`length`(`string_test`) = BIGINT '1')]
Outputs: []
Estimates: {rows: 9999 (312.47kB)}/{rows: 9999 (312.47kB)}/{rows: ? (?)}
Output: 0 rows (0B), PeakMemory: 0B (0.00%), WallTime: 101.31ms (49.84%)
string_test := AdbColumnHandle{columnName=string_test, type=13, isIndexed=true}
Input: 9999 rows (110.32kB), Filtered: 100.00%
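One possible remedy, sketched below under the assumption that the business logic allows it, is to materialize the result of the non-pushable function into its own column so that the filter becomes a plain column comparison that can be pushed down. The column name string_test_len is hypothetical and not part of the original example.
-- Hedged sketch: persist length(string_test) at write time (hypothetical
-- column name), then filter on the plain column so the predicate can be
-- pushed down to the storage layer.
ALTER TABLE test ADD COLUMN string_test_len INT;
UPDATE test SET string_test_len = length(string_test);
SELECT count(*) FROM test WHERE string_test_len = 1;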
Memory usage of bad SQL
You can directly check the PeakMemory of each fragment to locate resource consumption issues. Excluding cases of Disaster Broadcast in the plan, high PeakMemory is usually caused by data expansion in a join, an oversized left (probe) table in a join, or a TableScan operator that scans too much data. In these cases, add business-level conditions to limit the amount of data. You can also check the PeakMemory percentage of each operator to locate the operator that consumes the most resources and then analyze it further.
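For example, if EXPLAIN ANALYZE shows that the ScanProject on the customer table dominates PeakMemory, as in Fragment 2 of the earlier example, a business-level filter on the probe side reduces the join input and therefore the memory footprint. The c_mktsegment condition below is only an assumed example of such a restriction, not a documented recommendation.
-- Hedged sketch: restrict the large probe-side table before the join;
-- the predicate on c_mktsegment is an assumed business condition.
EXPLAIN ANALYZE
SELECT count(*)
FROM nation, region, customer
WHERE c_nationkey = n_nationkey
  AND n_regionkey = r_regionkey
  AND r_name = 'ASIA'
  AND c_mktsegment = 'BUILDING';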