AnalyticDB for MySQL provides the schema optimization feature to offer optimization suggestions on how to reduce costs and improve efficiency for the use of clusters. These suggestions are provided based on algorithm analysis of continuously collected query performance metrics and relevant information such as tables and indexes. This feature greatly simplifies manual optimization.

Background information

As the number of data analysis services grows, the quantity and complexity of database queries to be processed also increase. Schema design and optimization can significantly reduce database costs and improve query performance. To perform schema design and optimization, you must take note of the following information:
  • Database engine architecture

    The database engine architecture of computing and storage resources can help you perform data modeling and design suitable table schemas based on data distribution and business scenarios.

  • Differences in SQL statement characteristics

    SQL statements suitable for ad hoc queries are quite different, such as the number of tables to be joined, the number of fields in the GROUP BY clause, and the join and filter conditions.

  • Differences in data characteristics

    Data distribution and query characteristics vary with business characteristics. In this case, if the initial data modeling methods and SQL statements remain unchanged, SQL engines cannot deliver the optimal performance. The performance of SQL statements may degrade based on the change of data characteristics or business models.

In such situations, AnalyticDB for MySQL provides the efficient and intelligent schema optimization feature to offer optimization suggestions on how to reduce costs and improve efficiency for the use of clusters.

Precautions

  • Schema optimization can be used for versions later than AnalyticDB for MySQL V3.1.4.
  • Schema optimization suggestions are obtained from analysis of historical user data and query characteristics. If user data and query characteristics are stable, the relevant suggestions can remain valid. If user data and query characteristics are changed significantly, the suggestions may also significantly become less valuable as a reference. We recommend that you determine whether to adopt the suggestions based on your business characteristics.

Description

Schema optimization provides suggestions of three optimization types, which are hot and cold data optimization, index optimization, and distribution key optimization.

Hot and cold data optimization

Suggestions on hot and cold data optimization are to migrate tables that are not in use for a long time to cold storage disks, which reduces table storage costs. Suggestions of this optimization type are applicable to tables. For more information about how to migrate tables to cold storage disks, see Separation of hot and cold data storage.
Note After tables are migrated to cold storage disks, it takes longer to query the tables.

Index optimization

Suggestions on index optimization are to delete data indexes that are not in use for a long time, which reduces index storage costs. For information about how to filter data by index, see Filter conditions without pushdown.
Note After a data index is deleted, it takes longer to filter a table by using the data referenced in the index.

Distribution key optimization

Suggestions on distribution key optimization are provided based on operator-level query analysis, intelligent optimization algorithms, and distribution key selection principles and precautions. In addition, estimated benefits and operation guides are given for your distribution key design and optimization. Suggestions of this optimization type are applicable to tables.

Distribution key optimization provides table creation statements based on suggested distribution keys. The data in suggested distribution keys must be evenly distributed. However, the data distribution statistics of suggested distribution keys are only estimated values. To prevent uneven data distribution caused by large amounts of duplicate values, you must query the data distribution of relevant columns. For more information about how to select a distribution key, see Select the distribution key.

AnalyticDB for MySQL does not allow distribution key changes and automatic data redistribution. If you want to use a different distribution key for your table, you must create another table and import data to the new table. For more information, see Change a partition or distribution key. If incremental data exists in your table, make sure that incremental data is also imported to the new table.

Procedure

  1. Log on to the AnalyticDB for MySQL console.
  2. In the upper-left corner of the page, select the region where the cluster is deployed.
  3. In the left-side navigation pane, click Clusters.
  4. On the V3.0 Clusters tab, find the cluster and click its ID in the Cluster ID/Cluster Description column.
  5. In the left-side navigation pane, click Diagnostics and Optimization.
  6. Click the Schema Optimization tab.
  7. On the Available Optimization Suggestions tab, turn on Enable to enable schema optimization. Skip this step if this feature is already enabled for the cluster.
  8. View available and applied optimization suggestions on the Available Optimization Suggestions and Applied Optimization Suggestions tabs.
    • On the Available Optimization Suggestions tab, view parameter information in the suggestion list and perform operations. 1
      Parameter Description
      Suggestion ID The ID of the optimization suggestion.
      SQL The SQL statement that specifies the detailed information of the table to modify.
      Optimization Type The type of optimization. Valid values: Hot and Cold Data Optimization, Index Optimization, and Distribution Key Optimization.
      Optimization Suggestion The specific optimization suggestion given for the optimization type.
      Expected Optimization Benefits The expected benefits obtained after the optimization suggestion is applied.
      Note The expected optimization benefits are estimated values measured based on historical data and for reference only.
      Actions
      • If the optimization type is Hot and Cold Data Optimization or Index Optimization, you can click Apply to apply the optimization suggestion.
        Note
        • If you agree to adopt the optimization suggestion, click Apply. Then, an ALTER statement is executed on the cluster and the suggestion is displayed on the Applied Optimization Suggestions tab.
        • The Apply operation has the same effect as executing the ALTER statement on the client. This operation cannot be revoked. Proceed with caution.
      • If the optimization type is Distribution Key Optimization, you can click Operation Guide and follow the guide to change the distribution key.
    • On the Applied Optimization Suggestions tab, select a time range and click Search to search for suggestions applied within the specified time range.
      Note The application of an optimization suggestion with the execution of its SQL statement can be complete only after BUILD operations automatically triggered on tables are complete. Before BUILD operations are triggered, the suggestion is in the Running state. After BUILD operations are triggered, the state of the suggestion changes to Completed.