Diagnose storage issues and get optimization suggestions - AnalyticDB for MySQL

Storage diagnostics helps you identify data skew, suboptimal partition keys, excessive indexes, and other table configuration issues in your AnalyticDB for MySQL cluster. On the Storage Diagnostics page, you can evaluate the suitability of partition keys, distribution keys, and replicated tables. You can also identify opportunities for hot and cold data separation and receive index optimization suggestions. By applying these suggestions, you can optimize your database schema, reduce cluster costs, and improve efficiency.

Important notes

Hot and cold table optimization and index diagnostics are supported only on clusters running Milvus version 3.1.4 or later.
The optimization suggestions for hot and cold table optimization and index diagnostics are based on a historical analysis of data and query patterns. These suggestions remain effective as long as the data and query patterns are stable. If these patterns change significantly, the effectiveness of the suggestions diminishes. Before you use this feature, you should evaluate whether to apply the suggestions based on your current business workload.

Table diagnostics

Data skew diagnosis

When you create a table, use DISTRIBUTED BY HASH to specify a distribution key. After you define the distribution key, AnalyticDB for MySQL computes a hash of the distribution key values and distributes rows across different shards. Uneven data distribution across storage nodes causes disk space skew, which can fill disks prematurely and block data writes.

Diagnosis criteria

AnalyticDB for MySQL diagnoses data skew for tables that have more than 10,000 rows. The calculation method is as follows:

Exclude the largest shard and compute the average shard size.
If any shard is larger than average shard size × threshold or smaller than average shard size / threshold, the table is considered skewed. The default threshold is 3, and the valid range is [0, 10000000000]. You can adjust the threshold by running the following command: SET ADB_CONFIG RC_DATA_SKEW_THRESHOLD=Value;.

Procedure

Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the navigation pane on the left, choose Storage Analysis > Storage Diagnostics.
Click the Table Diagnostics tab to view the details in the Table Skew Diagnostics section.
Storage Node Disk Usage
Use the chart to check disk usage across storage nodes and identify disk space skew. If skew exists, optimize the skewed tables listed under Top 10 Skewed Tables. Even if disk space appears balanced but skewed tables appear in Top 10 Skewed Tables, optimize them to avoid degrading cluster query performance.
Top 10 Skewed Tables
This section lists tables with data skew, sorted by total data volume in descending order. To view row counts per shard and assess skew severity, click View Skew Details in the Actions column.

Optimization methods

You can resolve this issue using one of the following methods:

Scale out storage capacity.
- Enterprise Edition clusters require scaling reserved resources. For more information, see Scale an Enterprise Edition cluster.
- Basic Edition clusters require scaling compute resources. For more information, see Scale a Basic Edition cluster.
- Data Lakehouse Edition clusters require scaling reserved storage resources. For more information, see Scale a Data Lakehouse Edition cluster.
- Data Warehouse Edition (elastic mode) clusters require scaling elastic I/O units. For more information, see Scale a Data Warehouse Edition (elastic mode) cluster.
- Data Warehouse Edition (reserved mode) clusters require scaling node groups. For more information, see Scale a Data Warehouse Edition (reserved mode) cluster.
Delete idle indexes or partitions to reduce storage usage. For more information, see Index diagnostics.
Recreate the table and migrate the data. For more information, see CREATE TABLE.

Hot and cold table optimization

AnalyticDB for MySQL analyzes table access frequency to identify infrequently accessed tables and provide optimization suggestions. You can apply these suggestions to adjust the hot and cold data separation policy for your tables. For more information about hot and cold data separation, see Hot and cold data separation.

Diagnosis criteria

AnalyticDB for MySQL provides optimization suggestions for tables that have not been accessed in the last 15 days and have an access rate of less than 1%.

Procedure

Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the navigation pane on the left, choose Storage Analysis > Storage Diagnostics.
On the Table Diagnostics tab, click Hot and Cold Table Optimization.
On the Available Optimization Suggestions tab, click Disable to enable the cold and hot table optimization feature. If this feature is already enabled for the current cluster, you can skip this step.

Click the Available Optimization Suggestions and Applied Optimization Suggestions tabs to view available and applied suggestions.

Parameter	Description
Suggestion ID	The ID of the optimization suggestion.
SQL	Table and definition changes required by the suggestion.
Optimization Type	Hot and cold table optimization.
Optimization Suggestion	Detailed recommendation for the optimization type.
Expected Optimization Benefits	Expected benefit after applying the suggestion. Note Expected benefits are estimates based on historical statistics, not real-time accurate values. Use them for reference only.
Actions	You can apply the current suggestion using Apply. Note After clicking Apply, AnalyticDB for MySQL changes the table's storage policy to COLD. To change it to MIXED or HOT, manually run an ALTER statement. For details, see Storage policies. Click Apply to adopt the optimization suggestion. After you click Apply, the corresponding cluster executes the SQL change, and the suggestion appears on the Applied Optimization Suggestions tab. Apply has the same effect as running the SQL in a client. This action cannot be undone. Use it carefully. After the SQL is issued, the table must complete a Build operation for the change to take effect. The database system triggers Build automatically based on internal rules. Until triggered, the suggestion status remains "running". After triggering, it changes to "completed".

Replicated table diagnostics

When you create a table in AnalyticDB for MySQL, you can specify broadcast distribution using DISTRIBUTED BY BROADCAST. The table is then created as a replicated table, which stores an identical copy of its data on each shard. If your query workload involves high-concurrency joins between large and small tables, such as joining a large fact table with a small dimension table, you can create the smaller table as a replicated table. This can reduce data transmission over the cluster's internal network and improve query concurrency. However, replicated tables have poor write performance and consume a large amount of storage space, which can negatively impact the overall write performance of the AnalyticDB for MySQL cluster.

Diagnosis criteria

A replicated table is considered inefficient if it contains more than 20,000 rows.

Procedure

Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the navigation pane on the left, choose Storage Analysis > Storage Diagnostics.
On the Table Diagnostics tab, click Replicated Table Diagnostics.

Optimization methods

Create a standard table and migrate the data. For more information, see CREATE TABLE.

Partition diagnostics

Partitioned table diagnostics

If you do not correctly set the partition field when you create a partitioned table, the following issues may occur:

Partitions that are too large, such as yearly partitions where each year's data resides in a single partition, result in a small number of partitions with massive data volumes. If a Build task runs on such a partition, it consumes excessive resources, such as storage node CPU and disk I/O, which affects cluster stability.
Partitions that are too small, such as hourly partitions where each hour's data resides in a single partition, result in many partitions that contain minimal data. The cluster must cache extensive partition metadata, which consumes a significant amount of memory. Queries also need to scan many partitions, which degrades query performance.

What is a reasonable partition size?

Partition size is measured by the number of rows¹ and scales proportionally with the number of shards². If a cluster has N shards, a reasonable partition size is a row count between [1 million × N] and [5 million × N].

For example, if a cluster has 64 shards, a reasonable partition row count ranges from 64 million to 320 million.

Note

¹To query partition row counts, run the following command: SELECT partition_id, row_count FROM information_schema.kepler_partitions WHERE schema_name = '$DB' AND table_name ='$TABLE' AND partition_id > 0;
²To query the shard count, run the following command: SELECT COUNT(1) FROM information_schema.kepler_meta_shards;

Diagnose whether the partition field is reasonable

Diagnosis criteria

A partition field is considered unreasonable if 10% or more of the partitions in a table have an unreasonable size.

For example, if a table has 100 partitions and 10 or more of them have an unreasonable size, the partition field is flagged as unreasonable.

Procedure

Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the navigation pane on the left, choose Storage Analysis > Storage Diagnostics.
Click the Partition Diagnostics tab to view the unreasonable partitioned tables and their specific partition names in the Partitioned Table Diagnostics section.

How to adjust partition size into the reasonable range

If the partition diagnostics tool identifies unreasonable partitions, you can use the following methods to adjust them.

If a partition's row count is below the lower bound of the reasonable range, the partition is too small. In this case, you can increase the partition granularity. For example, if a cluster has 64 shards, the reasonable range is from 64 million to 320 million rows. If a partition has fewer than 64 million rows, you can change the partitioning from daily to monthly.
If the number of rows in a partition exceeds the recommended upper limit, the partition is considered too large, and you should reduce its granularity. For example, if you have 64 shards, the recommended number of rows per partition is between 64 million and 320 million. If a partition contains more than 320 million rows, we recommend that you change the partitioning from monthly to daily.
For more information about how to change the partition granularity, see Change the partition function format.
If the total row count of a table is below the lower bound of the reasonable range and is not expected to grow into the range, you can create a non-partitioned table and migrate the data from the partitioned table.

Non-partitioned table diagnostics

If you omit the PARTITION BY clause when you create a table, the table is created as a non-partitioned table. DML operations, such as INSERT, UPDATE, and DELETE, on non-partitioned tables often trigger full-table Build tasks. If the table contains an excessive amount of data, these Build tasks consume a large amount of temporary disk space, which increases disk usage on the node and can cause disks to lock. Build tasks on large tables also consume substantial disk I/O and CPU resources, which degrades overall cluster performance.

Diagnosis criteria

A non-partitioned table is considered inefficient if it contains more than 1 billion rows.

Procedure

Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the navigation pane on the left, choose Storage Analysis > Storage Diagnostics.
Click the Partition Diagnostics tab to view Non-partitioned Table Diagnostics information.

Optimization methods

Create a partitioned table and migrate the data from the non-partitioned table. For more information, see CREATE TABLE.

Index diagnostics

AnalyticDB for MySQL analyzes index usage and automatically provides optimization suggestions for indexes that have not been used for a long time. You can apply these suggestions to delete idle indexes and reduce storage costs.

Idle index diagnostics

Diagnosis criteria

An index is considered idle if it has not been used in the last 15 days and its usage rate is less than 1%.

Procedure

Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the navigation pane on the left, choose Storage Analysis > Storage Diagnostics.
Click the Index Diagnostics tab to view details for Indexes to Remove.
On the Available Optimization Suggestions tab, click Enable to activate index diagnostics. You can skip this step if this feature is already enabled.

Click the Available Optimization Suggestions and Applied Optimization Suggestions tabs to view available and applied suggestions.

Parameter	Description
Suggestion ID	The ID of the optimization suggestion.
SQL	Table and definition changes required by the suggestion.
Optimization Type	Index optimization.
Optimization Suggestion	Detailed recommendation for the optimization type.
Expected Optimization Benefits	Expected benefit after applying the suggestion. Note Expected benefits are estimates based on historical statistics, not real-time accurate values. Use them for reference only.
Actions	You can apply the current suggestion using Apply. Note After deleting an index, queries filtering on that column will take longer. Apply indicates that you agree to adopt this optimization suggestion. After you click Apply, the corresponding cluster executes SQL changes, and this suggestion appears in the Applied Optimization Suggestions tab. Apply has the same effect as running the SQL in a client. This action cannot be undone. Use it carefully. After the SQL is issued, the table must complete a Build operation for the change to take effect. The database system triggers Build automatically based on internal rules. Until triggered, the suggestion status remains "running". After triggering, it changes to "completed".

New index diagnostics

Diagnosis criteria

This diagnostic identifies fields that were used as filters in queries over the last 15 days but are not defined as indexed fields.

Procedure

Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the navigation pane on the left, choose Storage Analysis > Storage Diagnostics.
Click the Index Diagnostics tab to view details for Indexes to Create.
On the Available Optimization Suggestions tab, click Enable to activate index diagnostics. You can skip this step if this feature is already enabled.

Click the Available Optimization Suggestions and Applied Optimization Suggestions tabs to view available and applied suggestions.

Parameter	Description
Suggestion ID	The ID of the optimization suggestion.
Index Fields	The field to add as an index.
SQL	The SQL statement to create the index.
Optimization Type	Create index suggestion.
Optimization Suggestion	Describes how often the field was used recently, indicating its potential as an indexed field. You can apply the current suggestion using Batch Create. Note After Indexes to Create, the table must complete a Build operation for the change to take effect. The database system triggers Build automatically based on internal rules. Until triggered, the suggestion status remains "running". After triggering, it changes to "completed".

Primary key index diagnostics

Diagnosis criteria

A table is considered to have an excessive number of primary keys if it has more than three primary key fields and these fields constitute at least half of all the fields in the table.

Procedure

Log on to the AnalyticDB for MySQL console. In the upper-left corner of the console, select a region. In the left-side navigation pane, click Clusters. Find the cluster that you want to manage and click the cluster ID.
In the navigation pane on the left, choose Storage Analysis > Storage Diagnostics.
Click the Index Diagnostics tab to view details for Primary Key Diagnostics.
On the Available Optimization Suggestions tab, click Enable to activate index diagnostics. You can skip this step if this feature is already enabled.
View tables that have an excessive number of primary keys. The table displays the following columns: Database, Table Name, Table Fields, and Primary Key Fields.

FAQ

Why does the optimization task status remain "running" after clicking Apply now for a hot and cold table optimization suggestion?

Cause: After you click Apply, AnalyticDB for MySQL changes the table's storage policy to COLD. This change requires a Build operation to take effect. The hot and cold optimization feature does not immediately trigger a Build task. The system automatically triggers this task at a later time.

Solution: You can wait for the system to trigger the Build task automatically, or you can copy the SQL statement that triggers the Build task from the console and execute it manually. After the Build task runs, you can check its status by running the following command: SELECT table_name, schema_name, status FROM INFORMATION_SCHEMA.KEPLER_META_BUILD_TASK ORDER BY create_time DESC LIMIT 10;.

Related APIs

API	Description
DescribeExcessivePrimaryKeys	View tables with excessive primary keys in Enterprise Edition, Basic Edition, and Data Lakehouse Edition clusters.
DescribeTablePartitionDiagnose	View partition diagnostics for Data Warehouse Edition clusters.