PolarDB MySQL版新增列存索引查詢效能分析功能。該功能通過採集執行SQL語句時的耗時等資訊,並結合EXPLAIN獲得的查詢計劃一併返回給使用者。能夠協助使用者瞭解執行SQL語句時的耗時細節,並且能夠輔助分析慢SQL。
版本要求
叢集版本需為PolarDB MySQL版8.0.1.1.42及以上,您可以通過查詢版本號碼來確認叢集版本。
使用說明
您需要將參數imci_analyze_query的值設定為ON,來開啟列存索引查詢效能分析功能。
參數名稱 | 層級 | 說明 |
imci_analyze_query | Session | 列存索引查詢效能分析功能控制開關。取值範圍如下:
|
樣本
簡單查詢樣本
以在TPC-H Schema上執行的一個簡單查詢語句為例,SQL語句如下:
SELECT l_shipmode, COUNT(*) FROM lineitem, orders WHERE l_orderkey = o_orderkey GROUP BY l_shipmode;首先,您可以通過EXPLAIN得到該查詢最佳化工具估計的各個運算元的執行代價以及返回行數資訊。
EXPLAIN SELECT l_shipmode, COUNT(*) FROM lineitem, orders WHERE l_orderkey = o_orderkey GROUP BY l_shipmode;查詢結果如下:
+----+------------------------+----------+--------+----------+---------------------------------------------------------------+ | ID | Operator | Name | E-Rows | E-Cost | Extra Info | +----+------------------------+----------+--------+----------+---------------------------------------------------------------+ | 1 | Select Statement | | | | IMCI Execution Plan (max_dop = 1, max_query_mem = unlimited) | | 2 | └─Hash Groupby | | 6 | 51218.50 | Group Key: lineitem.l_shipmode | | 3 | └─Compute Scalar | | 1869 | 50191.00 | | | 4 | └─Hash Join | | 1869 | 31000.00 | Join Cond: lineitem.l_orderkey = orders.o_orderkey | | 5 | ├─Table Scan | lineitem | 2000 | 0.00 | | | 6 | └─Table Scan | orders | 2000 | 0.00 | | +----+------------------------+----------+--------+----------+---------------------------------------------------------------+開啟效能分析功能。
SET imci_analyze_query = ON;執行SQL語句。
SELECT l_shipmode, COUNT(*) FROM lineitem, orders WHERE l_orderkey = o_orderkey GROUP BY l_shipmode;執行結果如下:
+------------+----------+ | l_shipmode | COUNT(*) | +------------+----------+ | REG AIR | 283 | | SHIP | 269 | | FOB | 299 | | RAIL | 289 | | TRUCK | 314 | | MAIL | 274 | | AIR | 272 | +------------+----------+ 7 rows in set (0.05 sec)效能分析結果會存放在
information_schema.imci_sql_profiling表中。在將參數imci_analyze_query的值設定為ON時,該表將存放最近一條使用列存索引執行的查詢語句的效能分析資訊,通過查詢這張表,就可以獲得執行的查詢語句的效能資訊。查詢SQL語句的效能資訊。
/*ROUTE_TO_LAST_USED*/SELECT * FROM information_schema.imci_sql_profiling;說明/*ROUTE_TO_LAST_USED*/用於指示智能代理將查詢路由到上一條語句執行的節點上。否則可能會因為路由不到查詢執行的節點,導致查詢不到結果或結果不正確。查詢結果如下:
+----+------------------------+----------+--------+----------+-------------------+--------------------------------------------------------------------------------------------------------+ | ID | Operator | Name | A-Rows | A-Cost | Execution Time(s) | Extra Info | +----+------------------------+----------+--------+----------+-------------------+--------------------------------------------------------------------------------------------------------+ | 1 | Select Statement | | 0 | 52609.51 | 0 | IMCI Execution Plan (max_dop = 1, real_dop = 1, max_query_mem = unlimited, real_query_mem = unlimited) | | 2 | └─Hash Groupby | | 7 | 52609.5 | 0.002 | Group Key: lineitem.l_shipmode | | 3 | └─Compute Scalar | | 2000 | 51501 | 0 | | | 4 | └─Hash Join | | 2000 | 31000 | 0.007 | Join Cond: lineitem.l_orderkey = orders.o_orderkey | | 5 | ├─Table Scan | lineitem | 2000 | 0 | 0.001 | | | 6 | └─Table Scan | orders | 2000 | 0 | 0 | | +----+------------------------+----------+--------+----------+-------------------+--------------------------------------------------------------------------------------------------------+可以看到,
Hash Join與Hash Groupby對應的Extra Info資訊中標明了對應的串連條件以及分組列,而第1行的Extra Info則展示了執行查詢操作時使用的CPU和記憶體資訊。(可選)您可以在
information_schema.imci_sql_profiling表上執行彙總等操作。如查看該查詢語句的執行時間等。/*ROUTE_TO_LAST_USED*/SELECT SUM(`Execution Time(s)`) AS TOTAL_TIME FROM information_schema.imci_sql_profiling;執行結果如下:
+----------------------+ | TOTAL_TIME | +----------------------+ | 0.010000000000000002 | +----------------------+ 1 row in set (0.00 sec)可以看到,該查詢語句的執行耗時為10ms。
複雜查詢樣本
以在TPC-H SF100資料集上執行一個複雜查詢為例,通過EXPLAIN和查詢分析功能分析查詢耗時分布以及效能。SQL語句如下:
SELECT
c_name,
sum(l_quantity)
FROM
customer,
orders,
lineitem
WHERE
o_orderkey IN (
SELECT
l_orderkey
FROM
lineitem
WHERE
l_partkey > 18000000
)
AND c_custkey = o_custkey
AND o_orderkey = l_orderkey
GROUP BY
c_name
ORDER BY
c_name
LIMIT 10;通過EXPLAIN查看該查詢語句的執行計畫。
EXPLAIN SELECT c_name, sum(l_quantity) FROM customer, orders, lineitem WHERE o_orderkey IN ( SELECT l_orderkey FROM lineitem WHERE l_partkey > 18000000 ) AND c_custkey = o_custkey AND o_orderkey = l_orderkey GROUP BY c_name ORDER BY c_name LIMIT 10;執行結果如下:
+----+----------------------------------+----------+-----------+------------+---------------------------------------------------------------+ | ID | Operator | Name | E-Rows | E-Cost | Extra Info | +----+----------------------------------+----------+-----------+------------+---------------------------------------------------------------+ | 1 | Select Statement | | | | IMCI Execution Plan (max_dop = 32, max_query_mem = unlimited) | | 2 | └─Limit | | 10 | 7935739.96 | Offset=0 Limit=10 | | 3 | └─Sort | | 10 | 7935739.96 | Sort Key: c_name ASC | | 4 | └─Hash Groupby | | 1503700 | 7933273.99 | Group Key: customer.C_NAME | | 5 | └─Hash Right Semi Join | | 54010545 | 7865930.26 | Join Cond: lineitem.L_ORDERKEY = orders.O_ORDERKEY | | 6 | ├─Table Scan | lineitem | 59785766 | 24001.52 | Cond: (L_PARTKEY > 18000000) | | 7 | └─Hash Join | | 538776190 | 5488090.33 | Join Cond: orders.O_ORDERKEY = lineitem.L_ORDERKEY | | 8 | ├─Hash Join | | 181006430 | 668535.99 | Join Cond: customer.C_CUSTKEY = orders.O_CUSTKEY | | 9 | │ ├─Table Scan | customer | 15000000 | 600.00 | | | 10 | │ └─Table Scan | orders | 150000000 | 6000.00 | | | 11 | └─Table Scan | lineitem | 600037902 | 24001.52 | | +----+----------------------------------+----------+-----------+------------+---------------------------------------------------------------+該SQL語句的執行代價為7935739。其中,ID為7的Hash Join會消耗大量CPU(其代價為5488090,佔總代價約70%)。若要驗證這條查詢語句的效能問題,您可以開啟效能分析功能,開啟後再執行該SQL語句,並通過查看SQL語句的查詢效能資訊來分析具體原因。
開啟效能分析功能。
SET imci_analyze_query = ON;執行SQL語句。
SELECT c_name, sum(l_quantity) FROM customer, orders, lineitem WHERE o_orderkey IN ( SELECT l_orderkey FROM lineitem WHERE l_partkey > 18000000 ) AND c_custkey = o_custkey AND o_orderkey = l_orderkey GROUP BY c_name ORDER BY c_name LIMIT 10;執行結果如下:
+--------------------+-----------------+ | c_name | sum(l_quantity) | +--------------------+-----------------+ | Customer#000000001 | 172.00 | | Customer#000000002 | 663.00 | | Customer#000000004 | 174.00 | | Customer#000000005 | 488.00 | | Customer#000000007 | 1135.00 | | Customer#000000008 | 440.00 | | Customer#000000010 | 625.00 | | Customer#000000011 | 143.00 | | Customer#000000013 | 1032.00 | | Customer#000000014 | 564.00 | +--------------------+-----------------+ 10 rows in set (21.37 sec)查詢SQL語句的效能資訊。
/*ROUTE_TO_LAST_USED*/SELECT * FROM information_schema.imci_sql_profiling;說明/*ROUTE_TO_LAST_USED*/用於指示智能代理將查詢路由到上一條語句執行的節點上。否則可能會因為路由到錯誤節點而查詢不到結果。查詢結果如下:
+----+----------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+ | ID | Operator | Name | A-Rows | A-Cost | Execution Time(s) | Extra Info | +----+----------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+ | 1 | Select Statement | | 0 | 8336856.67 | 0 | | | 2 | └─Limit | | 10 | 8336856.67 | 0.002 | Offset=0 Limit=10 | | 3 | └─Sort | | 0 | 8336856.67 | 2.275 | Sort Key: c_name ASC | | 4 | └─Hash Groupby | | 9813586 | 8320763.22 | 160.083 | Group Key: customer.C_NAME | | 5 | └─Hash Right Semi Join | | 239598134 | 7994854.23 | 98.174 | Join Cond: lineitem.L_ORDERKEY = orders.O_ORDERKEY | | 6 | ├─Table Scan | lineitem | 60013756 | 24001.52 | 3.28 | Cond: (L_PARTKEY > 18000000) | | 7 | └─Hash Join | | 600037902 | 5156677.35 | 301.503 | Join Cond: orders.O_ORDERKEY = lineitem.L_ORDERKEY | | 8 | ├─Hash Join | | 150000000 | 629777.96 | 97.201 | Join Cond: customer.C_CUSTKEY = orders.O_CUSTKEY | | 9 | │ ├─Table Scan | customer | 15000000 | 600 | 3.321 | | | 10 | │ └─Table Scan | orders | 150000000 | 6000 | 0.241 | | | 11 | └─Table Scan | lineitem | 600037902 | 24001.52 | 0.661 | | +----+----------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+根據查詢的效能資訊可以看出:Hash Join(7)耗費了約一半的時間。這是因為
lineitem表與Hash Join(8)表的結果都很大,從而導致join的資料量很大,因此耗時很長。在Hash Join中,構建雜湊表的開銷通常遠遠大於雜湊尋找,因此只需要減小Join(8)的結果大小,就能有效減少查詢的耗時。查看SQL語句和執行計畫,若最佳化器將
o_orderkey in (...)轉換為半串連子查詢,且能夠將該條件下推,則能提高該SQL語句的查詢效能。開啟基於代價的半串連下推功能。
SET imci_optimizer_switch = 'semijoin_pushdown=on';通過EXPLAIN查看該查詢語句的執行計畫。
EXPLAIN SELECT c_name, sum(l_quantity) FROM customer, orders, lineitem WHERE o_orderkey IN ( SELECT l_orderkey FROM lineitem WHERE l_partkey > 18000000 ) AND c_custkey = o_custkey AND o_orderkey = l_orderkey GROUP BY c_name ORDER BY c_name LIMIT 10;執行結果如下:
+----+----------------------------------------+----------+-----------+------------+---------------------------------------------------------------+ | ID | Operator | Name | E-Rows | E-Cost | Extra Info | +----+----------------------------------------+----------+-----------+------------+---------------------------------------------------------------+ | 1 | Select Statement | | | | IMCI Execution Plan (max_dop = 32, max_query_mem = unlimited) | | 2 | └─Limit | | 10 | 2800433.74 | Offset=0 Limit=10 | | 3 | └─Sort | | 10 | 2800433.74 | Sort Key: c_name ASC | | 4 | └─Hash Groupby | | 14567321 | 2776544.58 | Group Key: customer.C_NAME | | 5 | └─Hash Join | | 57918330 | 2631846.75 | Join Cond: orders.O_ORDERKEY = lineitem.L_ORDERKEY | | 6 | ├─Hash Join | | 14567321 | 1014041.92 | Join Cond: orders.O_CUSTKEY = customer.C_CUSTKEY | | 7 | │ ├─Hash Right Semi Join | | 12071937 | 906226.67 | Join Cond: lineitem.L_ORDERKEY = orders.O_ORDERKEY | | 8 | │ │ ├─Table Scan | lineitem | 59785766 | 24001.52 | Cond: (L_PARTKEY > 18000000) | | 9 | │ │ └─Table Scan | orders | 150000000 | 6000.00 | | | 10 | │ └─Table Scan | customer | 15000000 | 600.00 | | | 11 | └─Table Scan | lineitem | 600037902 | 24001.52 | | +----+----------------------------------------+----------+-----------+------------+---------------------------------------------------------------+可以看出,開啟半串連下推功能後,查詢總代價從7935739下降至2800433。
重新執行該SQL語句。
SELECT c_name, sum(l_quantity) FROM customer, orders, lineitem WHERE o_orderkey IN ( SELECT l_orderkey FROM lineitem WHERE l_partkey > 18000000 ) AND c_custkey = o_custkey AND o_orderkey = l_orderkey GROUP BY c_name ORDER BY c_name LIMIT 10;執行結果如下:
+--------------------+-----------------+ | c_name | sum(l_quantity) | +--------------------+-----------------+ | Customer#000000001 | 172.00 | | Customer#000000002 | 663.00 | | Customer#000000004 | 174.00 | | Customer#000000005 | 488.00 | | Customer#000000007 | 1135.00 | | Customer#000000008 | 440.00 | | Customer#000000010 | 625.00 | | Customer#000000011 | 143.00 | | Customer#000000013 | 1032.00 | | Customer#000000014 | 564.00 | +--------------------+-----------------+ 10 rows in set (13.74 sec)可以看出,執行耗時降低了約40%。
查詢SQL語句的效能資訊。
/*ROUTE_TO_LAST_USED*/SELECT * FROM information_schema.imci_sql_profiling;執行結果如下:
+----+----------------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+ | ID | Operator | Name | A-Rows | A-Cost | Execution Time(s) | Extra Info | +----+----------------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+ | 1 | Select Statement | | 0 | 4318488.35 | 0 | | | 2 | └─Limit | | 10 | 4318488.35 | 0.002 | Offset=0 Limit=10 | | 3 | └─Sort | | 0 | 4318488.34 | 3.076 | Sort Key: c_name ASC | | 4 | └─Hash Groupby | | 9813586 | 4302394.89 | 163.149 | Group Key: customer.C_NAME | | 5 | └─Hash Join | | 239598134 | 3976485.91 | 151.253 | Join Cond: orders.O_ORDERKEY = lineitem.L_ORDERKEY | | 6 | ├─Hash Join | | 49393149 | 1321335.54 | 55.392 | Join Cond: orders.O_CUSTKEY = customer.C_CUSTKEY | | 7 | │ ├─Hash Right Semi Join | | 49393149 | 954805.16 | 52.552 | Join Cond: lineitem.L_ORDERKEY = orders.O_ORDERKEY | | 8 | │ │ ├─Table Scan | lineitem | 60013756 | 24001.52 | 2.791 | Cond: (L_PARTKEY > 18000000) | | 9 | │ │ └─Table Scan | orders | 150000000 | 6000 | 0.152 | | | 10 | │ └─Table Scan | customer | 15000000 | 600 | 0.028 | | | 11 | └─Table Scan | lineitem | 600037902 | 24001.52 | 0.642 | | +----+----------------------------------------+----------+-----------+------------+-------------------+----------------------------------------------------+從效能分析結果可以看出,半串連被下推,之前的大表Join被消除,查詢耗時顯著減少。