This topic describes how to manage execution plans and retain the execution plans of repeated or complex queries for a long time.

Background information

The optimizer generates an execution plan for each Structured Query Language (SQL) statement. However, in many cases, the SQL statements of requests from applications are repeated other than only parameters. The parameterized SQL statements are exactly the same. In this case, you can create a cache based on the parameterized SQL statements to cache various information (such as execution plans) other than parameters. This is called Plan Cache.

In addition, Plan Cache allows the execution plans of complex queries, such as queries that involve in joining multiple tables, to remain relatively stable. Plan Cache also prevents the execution plans from changing due to reasons, such as version upgrades. In Plan Management, a group of execution plans are recorded for each SQL statement, and are persistently retained even when the version is upgraded.

Overview of the workflow

When PolarDB-X 1.0 receives a query SQL statement, this statement goes through the following process:

  1. Parameterize the query SQL statement and replace all the parameters with the placeholders ?.
  2. Use the parameterized SQL statement as a key to check whether an execution plan is cached in Plan Cache. If no execution plan is cached, revoke the optimizer for optimization.
  3. If the SQL statement performs a simple query, directly execute the SQL statement and skip the steps that are relevant to Plan Management.
  4. If the SQL statement performs a complex query, use the execution plan that is persistently retained in the Baseline. If multiple execution plans are available, select the one that has the minimum cost.
Process of execution plans

Plan Cache

By default, the Plan Cache feature is enabled in PolarDB-X 1.0. HitCache in the EXPLAIN results specifies whether the current SQL statement hits Plan Cache. After Plan Cache is enabled, PolarDB-X 1.0 parameterizes the SQL statement by replacing the constants in the SQL statement with placeholders ?, and creating a corresponding parameter list. In the execution plan, you can also see that the SQL statement of the LogicalView operator contains the placeholders ?.

Plan Management

Complex SQL statements also go through the Plan Management process after they go through the Plan Cache process.

Both Plan Cache and Plan Management use the parameterized SQL statements as keys to execute plans. The execution plans of all SQL statements are cached in Plan Cache. However, in Plan Management, only complex query SQL statements are processed.

SQL templates do not always have one-to-one correspondence with optimal execution plans due to the impact of specific parameters.

In Plan Management, each SQL statement corresponds to one Baseline, and each Baseline contains one or more execution plans. In practice, the execution plan that has the minimum cost is selected for execution based on the used parameters.

Select a plan

When an execution plan in Plan Cache enters Plan Management, SQL Plan Management (SPM) can implement a process to check whether the execution plan is known. If the execution plan is known, SPM checks whether the cost of the execution plan is minimum. If the execution plan is unknown, SPM checks whether the execution plan needs to be executed to determine the optimization degree of the execution plan.

Operations and maintenance (O&M) instructions

PolarDB-X 1.0 provides a variety of instruction sets to manage execution plans. The following syntax is used:

BASELINE (LOAD|PERSIST|CLEAR|VALIDATE|LIST|DELETE) [Signed Integer,Signed Integer....]
BASELINE (ADD|FIX) SQL (HINT Select Statemtnt)
  • BASELINE (ADD|FIX) SQL <HINT> <Select Statement>: fixes the records of the SQL execution plans that are repaired by HINT.
  • BASELINE LOAD: updates the specified Baseline information in system tables to the memory and makes the Baseline information take effect.
  • BASELINE LOAD_PLAN: updates the specified execution plan information in system tables to the memory and makes the information take effect.
  • BASELINE LIST: lists all the current Baseline information. BASELINE PERSIST: flushes the specified Baseline to the disk.
  • BASELINE PERSIST_PLAN: flushes the specified execution plan to the disk.
  • BASELINE CLEAR: clears a Baseline from the memory.
  • BASELINE CLEAR_PLAN: clears an execution plan from the memory.
  • BASELINE DELETE: deletes a Baseline from the disk.
  • BASELINE DELETE_PLAN: deletes an execution plan from the disk.

Practices of tuning execution plans

After data changes or the optimizer engine of PolarDB-X 1.0 is upgraded, a better execution plan may become available for the same SQL statement. During automatic evolution, SPM adds the better execution plan that is automatically discovered by Cost-Based Optimizer (CBO) to the Baseline of the SQL statement. In addition, you can also proactively optimize execution plans by using SPM instructions.

The following SQL statement is used as an example:

SELECT *
FROM lineitem JOIN part ON l_partkey=p_partkey
WHERE p_name LIKE '%green%';

In normal cases, EXPLAIN finds that the execution plan generated by the SQL statement uses a hash join and the SQL statement has only this execution plan in Baseline List:

mysql> explain select * from lineitem join part on l_partkey=p_partkey where p_name like '%geen%';
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LOGICAL PLAN                                                                                                                                                                                                                                                                                                                                                   |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Gather(parallel=true)                                                                                                                                                                                                                                                                                                                                          |
|   ParallelHashJoin(condition="l_partkey = p_partkey", type="inner")                                                                                                                                                                                                                                                                                            |
|     LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true) |
|     LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)", parallel=true)                                                                                                                    |
| HitCache:true                                                                                                                                                                                                                                                                                                                                                  |
|                                                                                                                                                                                                                                                                                                                                                                |
|                                                                                                                                                                                                                                                                                                                                                                |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
7 rows in set (0.06 sec)
mysql> baseline list;
+-------------+--------------------------------------------------------------------------------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
| BASELINE_ID | PARAMETERIZED_SQL                                                              | PLAN_ID    | EXTERNALIZED_PLAN                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | FIXED | ACCEPTED |
+-------------+--------------------------------------------------------------------------------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
|  -399023558 | SELECT *
FROM lineitem
    JOIN part ON l_partkey = p_partkey
WHERE p_name LIKE ? | -935671684 |
Gather(parallel=true)
  ParallelHashJoin(condition="l_partkey = p_partkey", type="inner")
    LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true)
    LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)", parallel=true)
 |     0 |        1 |
+-------------+--------------------------------------------------------------------------------+------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
1 row in set (0.02 sec)

If the SQL statement has better performance after the Batched Key Access (BKA) join (lookup join) is used under some conditions, you must first find a way to use HINT to instruct PolarDB-X 1.0 to generate the expected execution plan. The following HINT format of the BKA join is used:

/*+TDDL:BKA_JOIN(lineitem, part)*/

Execute EXPLAIN [HINT] [SQL] to observe whether the output execution plan is as expected.

mysql> explain /*+TDDL:bka_join(lineitem, part)*/ select * from lineitem join part on l_partkey=p_partkey where p_name like '%geen%';
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LOGICAL PLAN                                                                                                                                                                                                                                                                                                                                                   |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Gather(parallel=true)                                                                                                                                                                                                                                                                                                                                          |
|   ParallelBKAJoin(condition="l_partkey = p_partkey", type="inner")                                                                                                                                                                                                                                                                                             |
|     LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true) |
|     Gather(concurrent=true)                                                                                                                                                                                                                                                                                                                                    |
|       LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)")                                                                                                                                 |
| HitCache:false                                                                                                                                                                                                                                                                                                                                                 |
|                                                                                                                                                                                                                                                                                                                                                                |
|                                                                                                                                                                                                                                                                                                                                                                |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.14 sec)

Pay attention to that the join algorithm has been changed to the BKA join due to the intervention of the Hint. However, this does not change the Baseline. If you need to use the preceding plan every time you execute this SQL statement, you must add this plan to the Baseline.

You can use the Baseline Add instruction in Plan Management to add an execution plan for the SQL statement. In this case, two sets of execution plans exist in the Baseline of the SQL statement at the same time. The CBO selects an execution plan for execution based on the cost.

mysql> baseline add sql /*+TDDL:bka_join(lineitem, part)*/ select * from lineitem join part on l_partkey=p_partkey where p_name like '%geen%';
+-------------+--------+
| BASELINE_ID | STATUS |
+-------------+--------+
|  -399023558 | OK     |
+-------------+--------+
1 row in set (0.09 sec)
mysql> baseline list;
+-------------+--------------------------------------------------------------------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
| BASELINE_ID | PARAMETERIZED_SQL                                                              | PLAN_ID     | EXTERNALIZED_PLAN                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | FIXED | ACCEPTED |
+-------------+--------------------------------------------------------------------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
|  -399023558 | SELECT *
FROM lineitem
    JOIN part ON l_partkey = p_partkey
WHERE p_name LIKE ? | -1024543942 |
Gather(parallel=true)
  ParallelBKAJoin(condition="l_partkey = p_partkey", type="inner")
    LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true)
    Gather(concurrent=true)
      LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)")
 |     0 |        1 |
|  -399023558 | SELECT *
FROM lineitem
    JOIN part ON l_partkey = p_partkey
WHERE p_name LIKE ? |  -935671684 |
Gather(parallel=true)
  ParallelHashJoin(condition="l_partkey = p_partkey", type="inner")
    LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true)
    LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)", parallel=true)
               |     0 |        1 |
+-------------+--------------------------------------------------------------------------------+-------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+----------+
2 rows in set (0.03 sec)

The preceding result of the Baseline List instruction shows that the execution plan based on BKA_JOIN has been added to the Baseline of the SQL statement. If you execute the EXPLAIN statement for this SQL statement, you can find that as the p_name LIKE ? condition changes in the SQL statement, PolarDB-X 1.0 selects different execution plans. If you want PolarDB-X 1.0 to use the preceding execution plan instead of choosing one of the two execution plans, you can use the Baseline Fix instruction to force PolarDB-X 1.0 to follow the specified execution plan.

mysql> baseline fix sql /*+TDDL:bka_join(lineitem, part)*/ select * from lineitem join part on l_partkey=p_partkey where p_name like '%geen%';
+-------------+--------+
| BASELINE_ID | STATUS |
+-------------+--------+
|  -399023558 | OK     |
+-------------+--------+
1 row in set (0.07 sec)
mysql> baseline list\G
*************************** 1. row ***************************
      BASELINE_ID: -399023558
PARAMETERIZED_SQL: SELECT *
FROM lineitem
    JOIN part ON l_partkey = p_partkey
WHERE p_name LIKE ?
          PLAN_ID: -1024543942
EXTERNALIZED_PLAN:
Gather(parallel=true)
  ParallelBKAJoin(condition="l_partkey = p_partkey", type="inner")
    LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true)
    Gather(concurrent=true)
      LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)")
            FIXED: 1
         ACCEPTED: 1
*************************** 2. row ***************************
      BASELINE_ID: -399023558
PARAMETERIZED_SQL: SELECT *
FROM lineitem
    JOIN part ON l_partkey = p_partkey
WHERE p_name LIKE ?
          PLAN_ID: -935671684
EXTERNALIZED_PLAN:
Gather(parallel=true)
  ParallelHashJoin(condition="l_partkey = p_partkey", type="inner")
    LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true)
    LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)", parallel=true)
            FIXED: 0
         ACCEPTED: 1
2 rows in set (0.01 sec)

After the Baseline Fix instruction is executed, you can see that the Fix status bit in the execution plan that contains the BKA join has been set to 1. In this case, this execution plan is definitely to be used even if no HINT is added and you execute the EXPLAIN statement for this SQL statement under a condition.

mysql> explain select * from lineitem join part on l_partkey=p_partkey where p_name like '%green%';
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LOGICAL PLAN                                                                                                                                                                                                                                                                                                                                                   |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Gather(parallel=true)                                                                                                                                                                                                                                                                                                                                          |
|   ParallelBKAJoin(condition="l_partkey = p_partkey", type="inner")                                                                                                                                                                                                                                                                                             |
|     LogicalView(tables="[00-03].lineitem", shardCount=4, sql="SELECT `l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`, `l_quantity`, `l_extendedprice`, `l_discount`, `l_tax`, `l_returnflag`, `l_linestatus`, `l_shipdate`, `l_commitdate`, `l_receiptdate`, `l_shipinstruct`, `l_shipmode`, `l_comment` FROM `lineitem` AS `lineitem`", parallel=true) |
|     Gather(concurrent=true)                                                                                                                                                                                                                                                                                                                                    |
|       LogicalView(tables="[00-03].part", shardCount=4, sql="SELECT `p_partkey`, `p_name`, `p_mfgr`, `p_brand`, `p_type`, `p_size`, `p_container`, `p_retailprice`, `p_comment` FROM `part` AS `part` WHERE (`p_name` LIKE ?)")                                                                                                                                 |
| HitCache:true                                                                                                                                                                                                                                                                                                                                                  |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
8 rows in set (0.01 sec)