This topic describes how to perform diagnostics on partition field reasonability, distribution field skew, and replicated table reasonability in the console.

Diagnostics on partition field reasonability

AnalyticDB for MySQL allows you to specify the distribution key when you create a table to improve the data query performance. Unreasonable partition fields are possibly selected because data characteristics may be unknown or change as business evolves. In addition, partition reasonability is also related to the scale of clusters. Partition field reasonability diagnostics can help you monitor the changes of partitions and remind you to optimize the table schema when the partitions are unreasonable. This maintains or improves the query performance of clusters.

  • Criteria for unreasonable partition fields
    If more than 10% partition records in a table are in an unreasonable range, the associated partition fields are considered unreasonable. The following table describes the criteria.
    Number of nodes Number of partition records
    1 16 million to 160 million
    2 64 million to 640 million
    4 to 6 128 million to 1,280 million
    8 to 14 192 million to 1,920 million
    16 to 30 256 million to 2,560 million
    32 to 62 512 million to 5,120 million
    More than 64 1.024 billion to 10.24 billion

    If only a small number of partition records are collected, we recommend that you increase the partition granularity. For example, you can change the granularity from day to month. If a large number of partition records are collected, we recommend that you decrease the partition granularity. For example, you can change the granularity from month to day. For more information, see Schema design.

  • Procedure
    1. Log on to the AnalyticDB for MySQL console.
    2. In the upper-left corner of the page, select the region where the cluster is deployed. In the left-side navigation pane, click Clusters.
    3. On the Data Warehouse Edition (V3.0) tab, find the cluster and click its Cluster ID.
    4. In the left-side navigation pane, click Diagnostics and Optimization.
    5. Click the Data Modeling Diagnostics tab and view the information about partition field reasonability diagnostics in the Partition Field Reasonability Diagnosis section. 2021040801

      This section shows the tables that contain unreasonable partitions in the Table Name column and the specific partitions that are unreasonable in the Inappropriate Partitions column. Then, you can optimize the tables or unreasonable partitions. For more information, see Schema design.

Diagnostics on distribution field skew

AnalyticDB for MySQL allows you to specify DISTRIBUTED BY HASH in the CREATE TABLE statement to evenly distribute data among storage nodes. However, unreasonable distribution fields are possibly selected because data characteristics may be unknown or change as business evolves. This causes data skew issues such as uneven resources across nodes and long tails on subtasks, and ultimately reduces query performance. Distribution field reasonability diagnostics can help you detect unreasonable distribution fields and reduce data skew.

  • Criteria for unreasonable distribution fields
    Data skew:
    1. You can remove the maximum shard size and then calculate the average shard size.
    2. If the size of a shard is larger than the average shard size multiplied by the threshold or smaller than the average shard size divided by the threshold, the shard is considered skewed. The threshold can be modified and its default value is 3.
  • Procedure
    1. Log on to the AnalyticDB for MySQL console.
    2. In the upper-left corner of the page, select the region where the cluster is deployed. In the left-side navigation pane, click Clusters.
    3. On the Data Warehouse Edition (V3.0) tab, find the cluster and click its Cluster ID.
    4. In the left-side navigation pane, click Diagnostics and Optimization.
    5. Click the Data Modeling Diagnostics tab and view the information about distribution field skew diagnostics in the Distribution Field Skew Diagnosis section.

      This section shows the database tables that are skewed. Then, you can select distribution fields again. For more information, see Schema design.

Diagnostics on replicated table reasonability

AnalyticDB for MySQL allows you to specify DISTRIBUTED BY BROADCAST in the CREATE TABLE statement. Replicated tables can save a complete copy of table data on each storage node of the cluster. When you join other tables with replicated tables, you do not need to redistribute replicated tables. This improves query performance. However, write amplification occurs when data is written to replicated tables. It affects the overall write performance of AnalyticDB for MySQL. Replicated table reasonability diagnostics can help you detect unreasonable replicated tables.

  • Criteria for unreasonable replicated tables

    If a replicated table contains more than 20,000 records, the table is considered unreasonable.

  • Procedure
    1. Log on to the AnalyticDB for MySQL console.
    2. In the upper-left corner of the page, select the region where the cluster is deployed. In the left-side navigation pane, click Clusters.
    3. On the Data Warehouse Edition (V3.0) tab, find the cluster and click its Cluster ID.
    4. In the left-side navigation pane, click Diagnostics and Optimization.
    5. Click the Data Modeling Diagnostics tab and view the information about replicated table reasonability diagnostics in the Dimension Table Reasonability Diagnostics section.

      You can view the diagnostic result in this section or execute the COUNT function on a replicated table to check its records. Then, you can compare the result with the recommended value.