This topic describes how to perform partition field reasonability diagnosis, distribution field skew diagnosis, and replicated table reasonability diagnosis in the console.

Partition field reasonability diagnosis

AnalyticDB for MySQL allows you to specify the distribution key when you create a table to improve the performance of data query. Unreasonable partition field may be selected because data characteristics are unknown or change with business characteristics. In addition, partition reasonability is also related to the scale of clusters. Partition field reasonability diagnosis can help monitor the changes of partitions and remind you of optimizing the table structure when the partitions are unreasonable. This serves the purpose of maintaining or improving the query performance of clusters.

Criteria for unreasonable partition fields

If more than 10% partition records in a table are not in a reasonable range, partition fields can be considered unreasonable. The following table describes the criteria.
Number of nodes Number of second-level partition records
1 16 million to 160 million
2 64 million to 640 million
4 to 6 128 million to 1.28 billion
8 to 14 192 million to 1.92 billion
16 to 30 256 million to 2.56 billion
32 to 62 512 million to 5.12 billion
More than 64 1.024 billion to 10.24 billion

If only a small number of partition records are collected, we recommend that you increase the partition granularity. For example, change the granularity from day to month. If a huge number of partition records are collected, we recommend that you decrease the partition granularity. For example, change the granularity from month to day. For more information, see Schema design.

Procedure

  1. Log on to the AnalyticDB for MySQL console by using your Alibaba Cloud account.
  2. In the upper-left corner of the page, select the region where the cluster is deployed. In the left-side navigation pane, click Clusters.
  3. On the 3.0 Clusters tab, find the cluster and click the cluster ID.
  4. In the left-side navigation pane, click Diagnostics and Optimization. On the page that appears, click the Data Modeling Optimization tab. View the partition field reasonability diagnosis data.

    The data helps you determine which tables contain unreasonable partitions and which partitions in a table are unreasonable based on the criteria for unreasonable partition fields. Then, you can optimize the unreasonable partitions. For more information, see Schema design.

Distribution field skew diagnosis

In AnalyticDB for MySQL, you can specify DISTRIBUTED BY HASH in the CREATE TABLE statement to evenly distribute data among storage nodes. However, unreasonable distribution fields may be selected because data characteristics are unknown or change with business characteristics. This eventually causes data skew problems: unbalanced resource usage when each node processes data and long tails of subtasks. These ultimately worsen query performance. Distribution field reasonability diagnosis can help detect unreasonable distribution fields and reduce data skew.

Criteria for unreasonable distribution fields

Calculation formula for data skew:
  1. Remove the maximum shard size and calculate the average shard size.
  2. Skewed shards refer to those that have a size greater than the product of the average shard size and threshold or smaller than the ratio of shard partition to threshold. Threshold can be modified and its default value is 3.

Procedure

  1. Log on to the AnalyticDB for MySQL console by using your Alibaba Cloud account.
  2. In the upper-left corner of the page, select the region where the cluster is deployed. In the left-side navigation pane, click Clusters.
  3. On the 3.0 Clusters tab, find the cluster and click the cluster ID.
  4. In the left-side navigation pane, click Diagnostics and Optimization. On the page that appears, click the Data Modeling Optimization tab. View the distribution field skew diagnosis data.

    The data helps you determine which tables are skewed in a database and then select distribution fields again. For more information, see Schema design.

Replicated table reasonability diagnosis

In AnalyticDB for MySQL, you can specify DISTRIBUTED BY HASH in the CREATE TABLE statement. Replicated tables can save a complete copy of table data on each storage node of the instance. When you join other tables and replicated tables, you do not need to redistribute replicated tables. This improves query performance. However, write amplification occurs when data is written to replicated tables. It affects the overall write performance of AnalyticDB for MySQL. Replicated table reasonability diagnosis can help detect unreasonable replicated tables.

Criteria for unreasonable replicated tables

If a replicated table contains more than 20,000 records, it is considered unreasonable.

Procedure

  1. Log on to the AnalyticDB for MySQL console by using your Alibaba Cloud account.
  2. In the upper-left corner of the page, select the region where the cluster is deployed. In the left-side navigation pane, click Clusters.
  3. On the 3.0 Clusters tab, find the cluster and click the cluster ID.
  4. In the left-side navigation pane, click Diagnostics and Optimization. On the page that appears, click the Data Modeling Optimization tab. View the replicated table reasonability diagnosis data.

    You can view the replicated table reasonability diagnosis data or execute the COUNT function on a replicated table to check its records. Compared the result with the recommended value.