When you use PolarDB-X 1.0, slow SQL queries whose performance is not as expected may occur. SQL tuning is performed to analyze information such as SQL execution plans and the runtime of each phase to identify the causes of slow SQL queries, and then solve the problems.
PolarDB-X 1.0 architecture
PolarDB-X 1.0 is a database service that supports an architecture in which computing is decoupled from storage. When a logical SQL query is sent to a node in PolarDB-X 1.0, PolarDB-X 1.0 classifies the subqueries in the query into two types based on whether the subqueries can be pushed down. The subqueries that can be pushed down are known as physical SQL queries.
PolarDB-X 1.0 executes SQL queries based on the following principles:
- SQL statements sent by users are pushed down to and executed by the ApsaraDB RDS for MySQL instances as many as possible.
- Operators that cannot be pushed down are executed by using the optimal method.
Pushdown and execution plans
The EXPLAIN command prints the execution plans of SQL statements. You need only to
add EXPLAIN
in front of an SQL statement. The following examples show how SQL statements are
pushed down and executed in PolarDB-X 1.0:
Example 1:
> explain select c_custkey, c_name, c_address from customer where c_custkey = 42;
LogicalView(tables="customer_2", sql="SELECT `c_custkey`, `c_name`, `c_address` FROM `customer` AS `customer` WHERE (`c_custkey` = ?)")
If a primary key is specified in an SQL statement, PolarDB-X 1.0 directly pushes down the SQL statement to the shard that corresponds to the primary
key. Therefore, the execution plan contains only the LogicalView
operator. The physical SQL statement that is pushed down is similar to the logical
SQL statement.
The LogicalView
operator indicates the query that is pushed down to and executed by the ApsaraDB
RDS for MySQL instances. For more information, see Execution plans and basic operators.
Example 2:
> explain select c_nationkey, count(*) from customer group by c_nationkey;
HashAgg(group="c_nationkey", count(*)="SUM(count(*))")
Gather(concurrent=true)
LogicalView(tables="customer_[0-7]", shardCount=8, sql="SELECT `c_nationkey`, COUNT(*) AS `count(*)` FROM `customer` AS `customer` GROUP BY `c_nationkey`")
The preceding query collects statistics on the number of customers in each country or region. The query is executed in the following two phases in sharded databases:
- Run
COUNT(*)
on each table shard to collect statistics. This step can be pushed down to and executed by ApsaraDB RDS for MySQL instances. - Run
SUM()
to summarize the results of eachCOUNT(*)
query to obtain the final results. This step must be executed by the PolarDB-X 1.0 node.
The execution plan of the statement also shows that the query executed in the preceding two steps. Take note of the following information:
- The
LogicalView
operator indicates that statistics are collected byCOUNT(*)
after data is grouped by nation for the SQL subqueries that are pushed down to each table shard. - The
Gather
operator is used to gather the results of each table shard. - The
HashAgg
operator is a method to implement aggregation. This operator runsSUM()
for the results ofCOUNT(*)
by usingc_nationkey
as the grouping key.
For more examples, see Execution plans and basic operators.