This topic describes how to diagnose partition field reasonability, distribution field skew, and replicated table reasonability in the console.

Diagnose partition field reasonability

AnalyticDB for MySQL allows you to specify the distribution key when you create a table to improve the data query performance. Unreasonable partition fields may be selected because data characteristics are unknown or change with business characteristics. In addition, partition reasonability is also related to the scale of clusters. Partition field reasonability diagnostics can help you monitor the changes of partitions and remind you of optimizing the table schema when the partitions are unreasonable. This maintains or improves the query performance of clusters.

  • Criteria for unreasonable partition fields
    If more than 10% partition records in a table are in an unreasonable range, partition fields are considered unreasonable. The following table describes the criteria.
    Number of nodes Number of level-2 partition records
    1 16 million to 160 million
    2 64 million to 640 million
    4 to 6 128 million to 1.28 billion
    8 to 14 192 million to 1.92 billion
    16 to 30 256 million to 2.56 billion
    32 to 62 512 million to 5.12 billion
    More than 64 1.024 billion to 10.24 billion

    If only a small number of partition records are collected, we recommend that you increase the partition granularity. For example, change the granularity from day to month. If a large number of partition records are collected, we recommend that you decrease the partition granularity. For example, change the granularity from month to day. For more information, see Schema design.

  • Procedure
    1. Log on to the AnalyticDB for MySQL console.
    2. In the upper-left corner of the page, select the region where the cluster you want to manage is deployed. In the left-side navigation pane, click Clusters.
    3. On the V3.0 Clusters tab, find the cluster and click the Cluster ID.
    4. In the left-side navigation pane, click Diagnostics and Optimization.
    5. Click the Data Modeling Optimization tab and view the partition field reasonability diagnostics information in the Partition Field Reasonability Diagnosis section. 2021040801

      You can use the information to determine which tables contain unreasonable partitions and which partitions in a table are unreasonable based on the information displayed in the Inappropriate Partitions column. Then, you can optimize the tables or unreasonable partitions. For more information, see Schema design.

Diagnose distribution field skew

AnalyticDB for MySQL allows you to specify DISTRIBUTED BY HASH in the CREATE TABLE statement to evenly distribute data among storage nodes. However, unreasonable distribution fields may be selected because data characteristics are unknown or change with business characteristics. This causes data skew issues. For example, uneven resources are used across nodes, and long tails occur on subtasks. This ultimately worsens query performance. Distribution field reasonability diagnostics can help detect unreasonable distribution fields and reduce data skew.

  • Criteria for unreasonable distribution fields
    Calculation formula for data skew:
    1. Remove the maximum shard size and calculate the average shard size.
    2. Skewed shards refer to those that have a size greater than the product of the average shard size and the threshold or smaller than the proportion of the shard partition to the threshold. The threshold can be modified and its default value is 3.
  • Procedure
    1. Log on to the AnalyticDB for MySQL console.
    2. In the upper-left corner of the page, select the region where the cluster you want to manage is deployed. In the left-side navigation pane, click Clusters.
    3. On the V3.0 Clusters tab, find the cluster and click the Cluster ID.
    4. In the left-side navigation pane, click Diagnostics and Optimization.
    5. Click the Data Modeling Optimization tab and view the distribution field skew diagnostics information in the Distribution Field Skew Diagnosis section.

      You can use the information to determine which tables are skewed in a database. Then, you can select distribution fields again. For more information, see Schema design.

Diagnose replicated table reasonability

AnalyticDB for MySQL allows you to specify DISTRIBUTED BY BROADCAST in the CREATE TABLE statement. Replicated tables can save a complete copy of table data on each storage node of the instance. When you join other tables with replicated tables, you do not need to redistribute replicated tables. This improves query performance. However, write amplification occurs when data is written to replicated tables. It affects the overall write performance of AnalyticDB for MySQL. Replicated table reasonability diagnostics can help you detect unreasonable replicated tables.

  • Criteria for unreasonable replicated tables

    If a replicated table contains more than 20,000 records, the table is considered unreasonable.

  • Procedure
    1. Log on to the AnalyticDB for MySQL console.
    2. In the upper-left corner of the page, select the region where the cluster you want to manage is deployed. In the left-side navigation pane, click Clusters.
    3. On the V3.0 Clusters tab, find the cluster and click the Cluster ID.
    4. In the left-side navigation pane, click Diagnostics and Optimization.
    5. Click the Data Modeling Optimization tab and view the replicated table reasonability diagnostics information in the Replicated Table Reasonability Diagnosis section.

      You can view the diagnosis results or execute the COUNT function on a replicated table to check its records. Then, you can compare the result with the recommended value.