This topic provides answers to frequently asked questions about statistics in AnalyticDB for MySQL.

ANALYZE statements misdiagnosed as slow queries

Q: Why are ANALYZE statements misdiagnosed as slow queries?

A: The ANALYZE statements that are automatically initiated during the maintenance window are executed with I/O throttling and low CPU priority. They may be diagnosed as slow queries because they are executed for extended periods of time. However, they do not affect your services. If CPU load is not high or if CPU overload is not closely associated with the maintenance window, you can ignore this issue. If the CPU is continuously overloaded, refer to the "Overloaded CPU" section of this topic to solve this issue.

Overloaded CPU

  • Q: Why CPU is overloaded when I use the statistics feature?

    A: A CPU may become overloaded for the following reasons:
    • During the default maintenance window from 04:00 to 05:00, the system performs a full scan on each table to collect column statistics. As a result, CPU is overloaded during this period.
    • In most cases, statistics are collected incrementally. However, when a cluster is updated to V3.1.6, it does not contain statistics. As a result, full statistics are collected the first days you use such a cluster. This may cause CPU overloads. CPU load will decrease after full data scan is completed.
    When CPU is overloaded, you must check whether the query response time is affected. If the average query response time does not change significantly, the query response time is not affected. The value of the CPU utilization metric may be high during statistics collection, but when queries are executed, resources will be preferentially allocated for the execution. This is because ANALYZE statements are executed with I/O throttling and low CPU priority.
  • Q: What do I do if the query response time is affected by CPU overload?

    A: Solve this issue by using the following solutions:
    • Change the maintenance window to off-peak hours.
      set adb_config O_CBO_MAINTENANCE_WINDOW_DURATION = [04:00-05:00];
    • If you cannot determine the appropriate off-peak hours, we recommend that you decrease the I/O limit for system queries, but not lower than 16 MB. The default value is 50 MB.
      set adb_config CSTORE_IO_LIMIT_SYSTEM_QUERY_BPS = 52428800;
    • Assign statistics collection to a low-priority resource group to isolate loads. For more information, see Automatic statistics collection.
      set adb_config O_CBO_AUTONOMOUS_STATS_ACCOUNT = [user_name];
    • Increase the expiration ratio for columns to reduce the data to be collected. The default value is 0.1. The value ranges between 0 and 1. We recommend that you do not set the expiration ratio to higher than 0.5.
      set adb_config O_CBO_STATS_EXPIRED_RATIO = 0.1;
    If none of the preceding solutions can solve this issue, you can disable automatic statistics collection by executing the set adb_config O_CBO_AUTONOMOUS_STATS_ENABLED=false; statement. However, this may cause performance compromise. If you need statistics in the future, you must manually collect statistics. For more information, see Manually collect statistics.

Statistics update issues

  • Q: The execution result of SELECT * FROM INFORMATION_SCHEMA.COLUMN_STATISTICS shows that statistics have not been updated for several days. Why?

    A: This issue is caused by the following reasons:
    • The statistics have not expired.

      Statistics expire when the amount of data that is updated, inserted, or replaced reaches the expiration ratio. The default expiration ratio is 0.1 (10%). If only a small amount of your data is changed, you can continue to use your cluster as normal while observing this issue for another week.

    • A large amount of data is contained in many columns and tables.

      By default, it takes only 1 hour per day to collect statistics that are not included in incremental updates. If a large number of columns and tables are involved (for example, more than 1,000 columns), the system may not be able to complete an update within one day. It may take a week to complete an update. In this case, the absence of statistics updates within several days is normal. You can continue to use the cluster as normal while observing this issue.

  • Q: Are statistics automatically updated after data is imported to a new table?

    A: If data is batch imported by using INSERT OVERWRITE INTO, basic statistics are automatically collected after the data import is complete. If data is imported in real time by using INSERT INTO or REPLACE INTO, statistics are collected during the next maintenance window or an incremental collection task is triggered during the incremental collection period after each BUILD task is complete. We recommend that you manually collect basic statistics after data is imported. For more information, see Manually collect statistics.