High CPU utilization on an ApsaraDB for Redis instance reduces the throughput and increases the response time of your application. In some cases, the application may stop responding. When the average CPU utilization is higher than 50% and the average peak CPU utilization within 5 minutes is higher than 90%, the stability of your application may be affected. Investigate the issue promptly and locate the cause.
Search for and disable commands that consume a large amount of CPU resources
Commands that consume a large amount of CPU resources have a time complexity of O(N) or higher. In most cases, a command with a higher time complexity consumes more resources. This increases CPU utilization. For more information about the time complexity of each command, see the Redis official website.
ApsaraDB for Redis processes commands on a single thread. While it runs a command that consumes a large amount of resources, pending requests pile up in the queue, which slows down the response of the application. In some cases, the ApsaraDB for Redis instance may be overwhelmed by pending requests, and the application may be disconnected because these requests time out. In addition, user requests may be directly forwarded to the backend database. As a result, the application may stop responding.
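As an illustration of keeping each command short, the sketch below walks the keyspace with the cursor-based SCAN command instead of a single KEYS call. SCAN still visits every key overall, but each call returns only a small batch, so other requests can be served between calls. The helper name `iter_keys` and the `batch_hint` parameter are hypothetical. The `client` argument is assumed to expose a `scan(cursor=..., match=..., count=...)` method that returns a `(next_cursor, keys)` pair, as redis-py clients do.

```python
from typing import Any, Iterator


def iter_keys(client: Any, pattern: str, batch_hint: int = 100) -> Iterator[Any]:
    """Yield keys matching `pattern` batch by batch instead of using KEYS.

    `client` is assumed to provide a redis-py style scan() method.
    `batch_hint` is passed as COUNT, which is only a hint to the server.
    """
    cursor = 0
    while True:
        # Each SCAN call does a bounded amount of work on the server.
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=batch_hint)
        yield from keys
        # A returned cursor of 0 means the iteration is complete.
        if cursor == 0:
            break
```

SCAN can return the same key more than once during an iteration, so callers that need unique keys should deduplicate the results.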
- Use the performance monitoring feature to identify the time period during which CPU utilization is high. For more information, see Query monitoring data.
- Use the following methods to identify the commands that cause high CPU utilization:
  - The audit log records modification and deletion operations that are performed on ApsaraDB for Redis instances. You can analyze the commands and trends within a specified time period. This allows you to identify the commands that cause high CPU utilization. For more information, see Query audit logs.
  - The slow log records commands whose execution time exceeds the specified threshold. You can identify commands that cause high CPU utilization based on the statements and durations that are recorded in the slow log. For more information, see Query slow logs.
    Note: The execution time recorded for each slow query statement is in microseconds.
- Assess and disable high-risk commands that consume a large amount of CPU resources, such as FLUSHALL, KEYS, and HGETALL. For more information, see Disable high-risk commands.
- Optimize your application. For example, do not frequently sort data.
- Optional: Use the following methods to modify the instance based on your business settings:
In most cases, this is caused by hotkeys. You can analyze the slow log and audit log, and then check the hotkeys on each node. This way, you can resolve the issue or slightly decrease CPU utilization. For more information, see View real-time hot key logs.
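The slow log analysis described above can be sketched as a small aggregation that groups entries by command name and converts the recorded durations from microseconds to milliseconds. The function name `summarize_slowlog` is hypothetical. The entry shape (a dict with `command` and `duration` keys) is an assumption modeled on what redis-py's `slowlog_get()` returns, so adapt it if your log export uses a different format.

```python
from collections import defaultdict


def summarize_slowlog(entries):
    """Group slow log entries by command name, sorted by total time spent.

    Each entry is assumed to be a dict with:
      "command":  the full command text (str or bytes)
      "duration": execution time in microseconds
    """
    totals = defaultdict(lambda: {"count": 0, "total_ms": 0.0, "max_ms": 0.0})
    for entry in entries:
        command = entry["command"]
        if isinstance(command, bytes):
            command = command.decode("utf-8", errors="replace")
        parts = command.split()
        name = parts[0].upper() if parts else "(EMPTY)"
        duration_ms = entry["duration"] / 1000.0  # microseconds to milliseconds
        stats = totals[name]
        stats["count"] += 1
        stats["total_ms"] += duration_ms
        stats["max_ms"] = max(stats["max_ms"], duration_ms)
    # Commands that consumed the most time overall come first.
    return sorted(totals.items(), key=lambda item: item[1]["total_ms"], reverse=True)
```

Sorting by total time rather than by single worst duration surfaces commands that are individually moderate but run often, which are also likely contributors to sustained CPU load.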
Optimize short-lived connections
When an application frequently establishes and closes connections, a large amount of the ApsaraDB for Redis instance's resources is consumed by connection handling. In this case, CPU utilization is high, the number of established connections is large, and the queries per second (QPS) does not reach the expected value.
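One common way to reduce short-lived connections is to reuse connections through a pool on the application side. The sketch below shows the idea using only the standard library. `SimplePool` and the `connect` factory are hypothetical names for illustration, and in practice the built-in pool of your client library (for example, `ConnectionPool` in redis-py) would usually serve this purpose.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Keep up to `max_size` idle connections so that callers can reuse them.

    `connect` is a caller-supplied factory that opens one new connection.
    This sketch does not cap the total number of open connections and does
    not check whether a returned connection is still healthy.
    """

    def __init__(self, connect, max_size=10):
        self._connect = connect
        self._idle = queue.LifoQueue(maxsize=max_size)

    @contextmanager
    def connection(self):
        try:
            conn = self._idle.get_nowait()  # reuse an idle connection
        except queue.Empty:
            conn = self._connect()  # none idle: open a new one
        try:
            yield conn
        finally:
            try:
                self._idle.put_nowait(conn)  # hand it back for the next caller
            except queue.Full:
                # The pool already holds enough idle connections.
                close = getattr(conn, "close", None)
                if callable(close):
                    close()
```

With a pool like this, a loop that issues many requests opens one connection and reuses it, instead of opening and closing a connection per request.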
Disable AOF persistence
By default, append-only file (AOF) persistence is enabled for ApsaraDB for Redis instances. When an ApsaraDB for Redis instance runs with heavy loads, frequent AOF operations may increase CPU utilization.
You can disable AOF persistence if this does not adversely affect your business. In addition, you can back up the Redis data during off-peak hours or during the maintenance window to minimize the impact.
Evaluate the service performance
The preceding methods are used to optimize the performance of your application. If the load of the instance remains high after you apply them, for example, if the average CPU utilization is still higher than 50%, the instance may have a performance bottleneck.
To resolve this issue, first check for commands and requests from application hosts that may degrade the instance performance. If such problems exist, you must optimize your service. If the preceding problems are not found but the load is still high, we recommend that you upgrade the instance specifications to ensure business stability. You can also upgrade the instance to a cluster instance or read/write splitting instance. For more information about how to upgrade an instance, see Change specifications.