This topic provides answers to frequently asked questions about optimization in AnalyticDB for MySQL.
- Q: How do I ensure high write performance when I create tables in AnalyticDB for MySQL?
A: When you create tables, take note of the following items:
- Appropriate distribution fields. AnalyticDB for MySQL uses a distributed architecture. Appropriate distribution fields must be used to evenly distribute data across all nodes and ensure high resource utilization. Inappropriate distribution fields may cause data to be unevenly distributed and concentrate in hot spots, which degrades write performance.
- Appropriate partition fields. AnalyticDB for MySQL stores files and creates and queries indexes based on partitions. If each partition contains only a few rows of data, the scan performance may be reduced because large numbers of list partitions are scanned during queries. If each partition contains many rows of data, indexes may be frequently created within the partition. It is important to select appropriate list partitions to ensure system stability.
- Appropriate replicated tables. Replicated tables are stored on each node to facilitate JOIN operations without the need for network transmission and improve the concurrent processing capability of the system. However, create, modify, and delete operations must be repeated on data rows of replicated tables to ensure that the operations are performed on each copy of these tables. We recommend that you do not create large replicated tables or frequently create, modify, or delete replicated tables.
For more information, see Data modeling optimization.
- Q: Why does the CPU utilization remain high even when the maximum write throughput decreases?
A: AnalyticDB for MySQL creates indexes for written data in real time to speed up queries. Index creation consumes large amounts of system resources, especially when the maximum write throughput causes burst write requests.
- Q: Why is the query performance of AnalyticDB for MySQL much lower than that of ApsaraDB RDS?
A: AnalyticDB for MySQL is designed to implement concurrent processing for large amounts of data based on a distributed architecture. In some scenarios where a small number of queries are required, ApsaraDB RDS provides higher query performance than AnalyticDB for MySQL without the need to undertake the overhead of a distributed architecture. In some other scenarios, ApsaraDB RDS Basic Edition instances can better use the storage indexes to improve the query performance.
- Q: What do I do if the query memory usage exceeds the limit?
A: To ensure the stability of AnalyticDB for MySQL clusters and prevent cluster performance from being affected by slow queries, you must set a limit on query memory usage. For more information about the analysis of slow queries, see Slow query types. Pay attention to the maximum memory usage and the amount of scanned data.The following table describes the error codes returned when the query memory usage exceeds the limit and their corresponding causes and solutions.
ErrorCode Cause Solution CLUSTER_OUT_OF_MEMORY(32001) Large amounts of memory are consumed throughout the entire cluster. To ensure system stability and prevent effects on other query connections, the system closes query connections that consume large amounts of memory.
- We recommend that you set a limit on the number of concurrent queries.
- Navigate to the Diagnostics and Optimization page. Check slow queries to see whether they are consuming the maximum amount of memory or scanning large amounts of data. Analyze the cause of high memory usage.
EXCEEDED_MEMORY_LIMIT(32003) The memory usage of the current query exceeds the limit. We recommend that you check the operators in the SQL statement that consume large amounts of memory in the query. OUT_OF_PHYSICAL_MEMORY_ERROR(33015) The runtime of the current query exceeds the limit for internal memory pool computation. Check for queries that have the maximum memory usage and large amounts of scanned data during the time range when the problem occurs. Analyze the cause of high memory usage.
- Q: What do I do if the disk usage exceeds the limit during a query?
A: Elastic clusters may query data in batch mode. In these cases, intermediate results of queries are written to disks. If large amounts of data are contained in the intermediate results, the disk usage may exceed the limit.
ErrorCode Cause Solution OUT_OF_SPILL_SPACE(32007) The disk usage exceeds the limit, and space is insufficient to store data on disks. Analyze the cause for disk usage exceeding the limit when SQL statements are executed. Pay attention to the maximum memory usage and the amount of scanned data. In addition, change the value of the
batch_hash_partition_countparameter to a greater value. The default value of this parameter is 32. We recommend that you set the maximum value of this parameter to 64. Values greater than 64 may lead to other errors.
- Q: What do I do if my queries suddenly slow down?
A: Queries executed by using the same SQL statement template may slow down for the following reasons:
- Query conditions remain unchanged but data has been added to some tables. As a result, more data needs to be processed.
- Query conditions are changed. For example, more list partitions need to be scanned or the query range is enlarged.
- The system experiences high resource consumption. For example, the increase of data writes or slow queries may consume large amounts of system resources.
You can navigate to the Cluster Running Report page and click the Query Pattern Statistics tab to view resource consumption statistics for different SQL statement templates, such as the amount of scanned data, memory, and CPU time. You can also navigate to the Diagnostics and Optimization page and click the Slow SQL Query tab to check for slow queries that consume large amounts of resources.
- Q: How do I identify which queries consume large amounts of memory and CPU resources?
A: For information about how to resolve this problem, see Slow query types.
- Q: Which edition can provide better performance, Cluster Edition or elastic mode for Cluster Edition?
A: We recommend that you select elastic mode for Cluster Edition. Elastic mode for Cluster Edition can separate computing resources from storage resources. It allows you to flexibly allocate computing and storage resources and provides features such as on-demand scaling, resource group isolation, batch query, and tiered storage of hot and cold data. Cluster Edition uses the reserved mode in which computing and storage resources are coupled. It provides a low latency for queries that are performed within milliseconds. For more information about their differences, see Editions.