When you use PolarDB-X 1.0, slow SQL queries whose performance is not as expected may occur. SQL tuning is performed to analyze information such as SQL execution plans and the runtime of each phase to identify the causes of slow SQL queries, and then solve the problems.
PolarDB-X 1.0 architecture
PolarDB-X 1.0 is a database service that supports an architecture in which computing is decoupled from storage. When a logical SQL query is sent to a node in PolarDB-X 1.0, PolarDB-X 1.0 classifies the subqueries in the query into two types based on whether the subqueries can be pushed down. The subqueries that can be pushed down are known as physical SQL queries.
PolarDB-X 1.0 executes SQL queries based on the following principles:
- SQL statements sent by users are pushed down to and executed by the ApsaraDB RDS for MySQL instances as many as possible.
- Operators that cannot be pushed down are executed by using the optimal method.
Pushdown and execution plans
The EXPLAIN command prints the execution plans of SQL statements. You need only to
EXPLAIN in front of an SQL statement. The following examples show how SQL statements are
pushed down and executed in PolarDB-X 1.0:
> explain select c_custkey, c_name, c_address from customer where c_custkey = 42; LogicalView(tables="customer_2", sql="SELECT `c_custkey`, `c_name`, `c_address` FROM `customer` AS `customer` WHERE (`c_custkey` = ?)")
If a primary key is specified in an SQL statement, PolarDB-X 1.0 directly pushes down the SQL statement to the shard that corresponds to the primary
key. Therefore, the execution plan contains only the
LogicalView operator. The physical SQL statement that is pushed down is similar to the logical
LogicalView operator indicates the query that is pushed down to and executed by the ApsaraDB
RDS for MySQL instances. For more information, see Execution plans and basic operators.
> explain select c_nationkey, count(*) from customer group by c_nationkey; HashAgg(group="c_nationkey", count(*)="SUM(count(*))") Gather(concurrent=true) LogicalView(tables="customer_[0-7]", shardCount=8, sql="SELECT `c_nationkey`, COUNT(*) AS `count(*)` FROM `customer` AS `customer` GROUP BY `c_nationkey`")
The preceding query collects statistics on the number of customers in each country or region. The query is executed in the following two phases in sharded databases:
COUNT(*)on each table shard to collect statistics. This step can be pushed down to and executed by ApsaraDB RDS for MySQL instances.
SUM()to summarize the results of each
COUNT(*)query to obtain the final results. This step must be executed by the PolarDB-X 1.0 node.
The execution plan of the statement also shows that the query executed in the preceding two steps. Take note of the following information:
LogicalViewoperator indicates that statistics are collected by
COUNT(*)after data is grouped by nation for the SQL subqueries that are pushed down to each table shard.
Gatheroperator is used to gather the results of each table shard.
HashAggoperator is a method to implement aggregation. This operator runs
SUM()for the results of
c_nationkeyas the grouping key.
For more examples, see Execution plans and basic operators.