AnalyticDB for MySQL: Use stage and task details to analyze queries

Last Updated: Mar 01, 2024

After the frontend access node of an AnalyticDB for MySQL cluster receives a query request, the cluster divides the query into multiple stages, and data is read and computed in a distributed manner on worker nodes and executor nodes. Some stages can be executed in parallel, but stages that depend on each other can be executed only in series. As a result, complex SQL statements may cause slow queries. You can use stage and task details to analyze such issues in the AnalyticDB for MySQL console or by calling API operations. This topic describes how to use stage and task details to analyze queries.

Procedure

  1. Log on to the AnalyticDB for MySQL console.
  2. In the upper-left corner of the page, select a region.
  3. In the left-side navigation pane, click Clusters.
  4. On the Data Warehouse Edition (V3.0) tab, find the cluster that you want to manage and click the Cluster ID.
  5. In the left-side navigation pane, click Diagnostics and Optimization.

  6. On the SQL Queries tab, find the query that you want to diagnose and click Diagnose in the Actions column.

  7. Click the Stages & Tasks tab to view the details of a stage. For information about the stage query results, see the "Stage parameters" section of this topic.

  8. Click the stage ID to view the details of all tasks in the stage. For information about task query results, see the "Task parameters" section of this topic.

    Important

    You can view task details only for queries that take more than 1 second to execute.

Parameters

Stage parameters

Parameter

Description

Stage ID

The unique identifier of the stage, which corresponds to the stage ID in the execution plan tree.

Status

The execution status of the stage. Valid values:

  • Completed: All tasks in the stage are executed.

  • Failed: One or more tasks in the stage failed.

  • Running: Tasks in the stage are running.

  • Canceled: One or more tasks in the stage are canceled.

Rows of Input Data

The number of data rows that are input to the stage.

Amount of Input Data

The amount of data that is input to the stage.

Rows of Output Data

The number of data rows that are output from the stage.

Amount of Output Data

The amount of data that is output from the stage.

Peak Memory

The peak memory usage of the stage.

Cumulative Duration

The total amount of time that is consumed to execute all operators in the stage.

You can use this parameter to identify the stages that take a long time to execute and consume a large amount of CPU resources. When you compare the cumulative duration with the query duration, you must consider the concurrency of stages.
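
When concurrency is factored in, the cumulative duration can be turned into a rough estimate of how much wall-clock time a stage contributes to the query. The following Python sketch is a minimal illustration of that comparison. The sample values and the per-stage concurrency field are assumptions for the example (the console reports concurrency per task, not per stage), not data returned by AnalyticDB for MySQL.

```python
# Minimal sketch: approximate each stage's wall-clock contribution by dividing
# its cumulative duration by its effective concurrency, then compare the result
# with the overall query duration. All values below are illustrative samples.

stages = [
    {"stage_id": 0, "cumulative_duration_s": 3.0,  "concurrency": 1},
    {"stage_id": 1, "cumulative_duration_s": 40.0, "concurrency": 16},
    {"stage_id": 2, "cumulative_duration_s": 22.0, "concurrency": 4},
]

query_duration_s = 6.0  # overall query duration used for comparison

for stage in stages:
    # Approximate wall-clock time of the stage (per-thread average).
    approx_wall_s = stage["cumulative_duration_s"] / stage["concurrency"]
    share = approx_wall_s / query_duration_s
    print(f"Stage {stage['stage_id']}: ~{approx_wall_s:.1f}s of wall-clock time "
          f"(~{share:.0%} of the {query_duration_s:.0f}s query)")
```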

Task parameters

Parameter

Description

Task ID

The unique identifier of the task.

Example: 2.24. In this example, 2 indicates the ID of the stage to which the task belongs, and 24 indicates the ID of the task in the stage.

Status

The execution status of the task. Valid values:

  • Completed

  • Failed

  • Terminated

  • Canceled

Amount of Input Data

The number of data rows that are input to the task and the amount of input data.

You can sort all tasks by the amount of input data to check whether data skew occurs on the input data of the stage (see the sketch after this table). Data skew may be caused by inappropriate field choices in GROUP BY or JOIN clauses. To resolve this issue, trace back to the upstream stage of the stage to which the current task belongs.

Note

If the specified distribution fields are inappropriate, data may be unevenly distributed among worker nodes. This is called data skew.

Amount of Output Data

The number of data rows that are output from the task and the amount of output data.

You can check whether combined fields exist in the GROUP BY or JOIN clause of the SQL statement based on the attributes of the Aggregation or Join node in the operator plan tree of the current stage. For example, a.id = b.id contains only the id field, whereas a.id = b.id AND a.age = b.age contains both the id and age fields.

Peak Memory

The peak memory usage of the task.

The peak memory is proportional to the amount of input data. You can use this parameter to check whether query failures are caused by unbalanced distribution of input data.

Duration of Reading Table Data

The cumulative amount of time that is consumed by all TableScan operators of the stage to read table data.

This parameter is a cumulative value across multiple nodes and threads and cannot be directly compared with the query duration. By comparing it with the cumulative duration, you can determine what proportion of the computing resources of a stage is spent on table scans (see the sketch after this table).

Amount of Read Table Data

The number of data rows and the amount of data that are read by all TableScan operators of the stage.

You can sort all tasks by the amount of read table data to check whether data skew occurs on the source table data. If data skew occurs, you can use the AnalyticDB for MySQL console to check whether the skew is caused by the distribution fields. For more information, see Data modeling diagnostics.

Created At

The time when the task is created.

Queuing Duration

The amount of time that the task waits in a queue before it is executed.

Ended At

The time when the task ends.

Interval Between Start and End Time

The interval between the creation time and end time of the task. For example, if a task is created at 2022-12-12 12:00:00 and ends at 2022-12-12 12:00:04, the start and end time interval is 4 seconds.

If you compare this parameter with the query duration, you can identify the main cause of slow execution. For example, if the query duration is 6 seconds and the start and end time interval is 4 seconds, the current stage is the main reason why the query took a long time to execute. For more information, see the "Examples for calculating task duration and concurrency" section of this topic.

Cumulative Duration

The total amount of time that is consumed by all threads of all the tasks in the stage. For more information, see the "Examples for calculating task duration and concurrency" section of this topic.

Computing Time Ratio

The ratio of data computing duration to the subtask lifecycle.

This parameter can be calculated by using the following formula: Computing time ratio = (Cumulative duration/Subtask concurrency)/Interval between start and end time. In this formula, (Cumulative duration/Subtask concurrency) indicates the average amount of time that is consumed by each thread to compute data. The start and end time interval includes the actual data computing time, subtask queuing duration, and network latency. For more information, see the "Examples for calculating task duration and concurrency" section of this topic.

Note

A smaller computing time ratio indicates that computation accounts for only a small part of the start and end time interval. In this case, you need to focus on issues such as queuing and network latency. A larger computing time ratio indicates that most of the interval is spent on computation. In this case, you need to identify the operators that take a long time to execute.

Subtask Concurrency

The number of threads that concurrently execute the task on a node. For more information, see the "Examples for calculating task duration and concurrency" section of this topic.

Execution Node

The IP address of the node on which the task is executed. If long tail issues occur on the same node for multiple queries, you must check this node.

Note

A long tail issue in AnalyticDB for MySQL refers to a situation in which some tasks take much longer to execute than the other tasks.
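
The data skew check described for Amount of Input Data and the scan-time comparison described for Duration of Reading Table Data can be scripted once the task list of a stage has been exported. The following Python sketch is a minimal illustration that uses assumed field names and sample values; the actual column names in the console or API response may differ.

```python
# Minimal sketch with assumed field names: flag possible data skew by comparing
# each task's input rows against the average, and compute how much of the
# stage's cumulative duration is spent in TableScan operators.

tasks = [  # illustrative sample data for one stage, not real console output
    {"task_id": "2.0", "input_rows": 1_200_000, "scan_duration_s": 1.8, "cumulative_duration_s": 5.0},
    {"task_id": "2.1", "input_rows": 1_150_000, "scan_duration_s": 1.7, "cumulative_duration_s": 5.5},
    {"task_id": "2.2", "input_rows": 9_800_000, "scan_duration_s": 9.6, "cumulative_duration_s": 14.0},
    {"task_id": "2.3", "input_rows": 1_050_000, "scan_duration_s": 1.6, "cumulative_duration_s": 4.8},
]

avg_rows = sum(t["input_rows"] for t in tasks) / len(tasks)

# Sort tasks by input rows, largest first. A task far above the average suggests
# skewed input data, for example because of unsuitable GROUP BY or JOIN fields.
for t in sorted(tasks, key=lambda t: t["input_rows"], reverse=True):
    ratio = t["input_rows"] / avg_rows
    flag = "  <-- possible skew" if ratio > 2 else ""
    print(f"Task {t['task_id']}: {t['input_rows']:>10,} rows ({ratio:.1f}x average){flag}")

# Share of the stage's cumulative duration spent reading table data.
total_scan = sum(t["scan_duration_s"] for t in tasks)
total_cumulative = sum(t["cumulative_duration_s"] for t in tasks)
print(f"TableScan share of cumulative duration: {total_scan / total_cumulative:.0%}")
```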

Examples for calculating task duration and concurrency

In this example, the start and end time interval, cumulative duration, computing time ratio, and subtask concurrency of Task 2.1 are calculated.

Task 2.1 belongs to Stage 2. Assume that Stage 2 consists of four operators: StageOutput, Join, TableScan, and RemoteSource, which form the operator tree of the stage.

These operators are executed concurrently on multiple nodes. On the node whose IP address is 192.168.12.23, Task 2.1 is executed with four concurrent threads. The four threads consume 5, 5, 6, and 6 seconds, respectively, to compute data.

  • Cumulative duration of the task: 5s + 5s + 6s + 6s = 22s.

  • Start and end time interval: 10s.

  • Subtask concurrency: 4 (four concurrent threads on the node).

  • Computing time ratio: (22s/4)/10s = 0.55.
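
For reference, the same numbers can be plugged into the computing time ratio formula from the Task parameters table. The following Python sketch reproduces the Task 2.1 calculation; the function name is chosen for illustration only.

```python
# Reproduce the Task 2.1 calculation with the formula from the table above:
# Computing time ratio = (Cumulative duration / Subtask concurrency) / Interval between start and end time

def computing_time_ratio(cumulative_s: float, concurrency: int, interval_s: float) -> float:
    """Average per-thread computing time divided by the start and end time interval."""
    return (cumulative_s / concurrency) / interval_s

thread_durations = [5, 5, 6, 6]      # seconds of computing time per thread
cumulative = sum(thread_durations)   # cumulative duration: 22 s
concurrency = len(thread_durations)  # subtask concurrency: 4
interval = 10                        # start and end time interval: 10 s

print(computing_time_ratio(cumulative, concurrency, interval))  # 0.55
```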

Related operations

Operation

Description

DescribeDiagnosisSQLInfo

Queries the execution information about an SQL statement.

DescribeDiagnosisTasks

Queries the execution information about distributed tasks in a stage of a query.
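
Both operations can also be called programmatically. The following Python sketch is a hedged example that goes through the generic CommonRequest interface of the Alibaba Cloud Python SDK core. The endpoint, API version, and parameter names are assumptions for illustration and should be confirmed in the API reference for each operation; the cluster ID and process ID are placeholders.

```python
# Hedged sketch: call the diagnostic operations through the generic
# CommonRequest interface of the Alibaba Cloud Python SDK core
# (pip install aliyun-python-sdk-core). The endpoint, API version, and
# parameter names below are assumptions for illustration; confirm them in the
# API reference for DescribeDiagnosisSQLInfo and DescribeDiagnosisTasks.

from aliyunsdkcore.client import AcsClient
from aliyunsdkcore.request import CommonRequest

client = AcsClient('<access_key_id>', '<access_key_secret>', 'cn-hangzhou')

def call(action: str, **params: str) -> bytes:
    """Send one API request and return the raw JSON response."""
    request = CommonRequest()
    request.set_domain('adb.aliyuncs.com')  # assumed endpoint
    request.set_version('2019-03-15')       # assumed API version
    request.set_action_name(action)
    for key, value in params.items():
        request.add_query_param(key, value)
    return client.do_action_with_exception(request)

# Execution information about one SQL statement (parameter names assumed).
sql_info = call('DescribeDiagnosisSQLInfo',
                DBClusterId='am-bp1**************',
                ProcessId='<process_id_of_the_query>')

# Distributed tasks in one stage of that query (parameter names assumed).
stage_tasks = call('DescribeDiagnosisTasks',
                   DBClusterId='am-bp1**************',
                   ProcessId='<process_id_of_the_query>',
                   StageId='2')

print(sql_info)
print(stage_tasks)
```

The task list returned by DescribeDiagnosisTasks can then be fed into a script such as the skew check sketched after the Task parameters table.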