Basic concepts of SQL tuning - PolarDB - Alibaba Cloud Documentation Center

When you use PolarDB-X 1.0, slow SQL queries whose performance is not as expected may occur. SQL tuning is performed to analyze information such as SQL execution plans and the runtime of each phase to identify the causes of slow SQL queries, and then solve the problems.

PolarDB-X 1.0 architecture

PolarDB-X 1.0 is a database service that supports an architecture in which computing is decoupled from storage. When a logical SQL query is sent to a node in PolarDB-X 1.0, PolarDB-X 1.0 classifies the subqueries in the query into two types based on whether the subqueries can be pushed down. The subqueries that can be pushed down are known as physical SQL queries.

PolarDB-X 1.0 executes SQL queries based on the following principles:

SQL statements sent by users are pushed down to and executed by the ApsaraDB RDS for MySQL instances as many as possible.
Operators that cannot be pushed down are executed by using the optimal method.

Pushdown and execution plans

The EXPLAIN command prints the execution plans of SQL statements. You need only to add EXPLAIN in front of an SQL statement. The following examples show how SQL statements are pushed down and executed in PolarDB-X 1.0:

Example 1:

> explain select c_custkey, c_name, c_address from customer where c_custkey = 42;

LogicalView(tables="customer_2", sql="SELECT `c_custkey`, `c_name`, `c_address` FROM `customer` AS `customer` WHERE (`c_custkey` = ?)")

If a primary key is specified in an SQL statement, PolarDB-X 1.0 directly pushes down the SQL statement to the shard that corresponds to the primary key. Therefore, the execution plan contains only the LogicalView operator. The physical SQL statement that is pushed down is similar to the logical SQL statement.

The LogicalView operator indicates the query that is pushed down to and executed by the ApsaraDB RDS for MySQL instances. For more information, see Execution plans and basic operators.

Example 2:

> explain select c_nationkey, count(*) from customer group by c_nationkey;

HashAgg(group="c_nationkey", count(*)="SUM(count(*))")
  Gather(concurrent=true)
    LogicalView(tables="customer_[0-7]", shardCount=8, sql="SELECT `c_nationkey`, COUNT(*) AS `count(*)` FROM `customer` AS `customer` GROUP BY `c_nationkey`")

The preceding query collects statistics on the number of customers in each country or region. The query is executed in the following two phases in sharded databases:

Run COUNT(*) on each table shard to collect statistics. This step can be pushed down to and executed by ApsaraDB RDS for MySQL instances.
Run SUM() to summarize the results of each COUNT(*) query to obtain the final results. This step must be executed by the PolarDB-X 1.0 node.

The execution plan of the statement also shows that the query executed in the preceding two steps. Take note of the following information:

The LogicalView operator indicates that statistics are collected by COUNT(*) after data is grouped by nation for the SQL subqueries that are pushed down to each table shard.
The Gather operator is used to gather the results of each table shard.
The HashAgg operator is a method to implement aggregation. This operator runs SUM() for the results of COUNT(*) by using c_nationkey as the grouping key.

For more examples, see Execution plans and basic operators.