High CPU utilization on an ApsaraDB for Redis instance reduces the throughput and increases the response time of applications that connect to the instance. In some cases, an application may stop responding. If the average CPU utilization is higher than 50% and the average peak CPU utilization within 5 minutes is higher than 90%, the stability of the application may be affected. Monitor this metric closely and locate the cause when it rises.
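
The two thresholds above can be checked against exported monitoring data. The following is a minimal sketch, not an official formula: it assumes one CPU sample per minute and interprets "average peak CPU utilization within 5 minutes" as the highest average over any consecutive 5-minute window. The function name `cpu_at_risk` and both interpretations are assumptions made for illustration.

```python
from statistics import mean

AVG_THRESHOLD = 50.0    # percent, overall average threshold from the text
PEAK_THRESHOLD = 90.0   # percent, 5-minute peak threshold from the text


def cpu_at_risk(samples, window=5):
    """Return True when both thresholds are exceeded.

    samples: CPU utilization percentages, assumed to be one sample per minute.
    window:  number of samples that make up a 5-minute window (assumption).
    """
    if len(samples) < window:
        return False  # not enough data to evaluate a full window
    overall_avg = mean(samples)
    # Highest average over any run of `window` consecutive samples.
    peak_window_avg = max(
        mean(samples[i:i + window])
        for i in range(len(samples) - window + 1)
    )
    return overall_avg > AVG_THRESHOLD and peak_window_avg > PEAK_THRESHOLD
```

If your monitoring data uses a different sampling interval, adjust `window` so that it still covers 5 minutes.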

Search for and disable commands that consume a large amount of CPU resources

Commands that consume a large amount of CPU resources have a time complexity of O(N) or higher. In most cases, a command with a higher time complexity consumes more resources. This increases CPU utilization. For more information about the time complexity of each command, see the Redis official website.

If an ApsaraDB for Redis instance runs commands that consume a large amount of resources, pending requests pile up in the queue because commands are processed on a single thread. This slows down application responses. In some cases, the instance may be overwhelmed by pending requests, and the application may be disconnected when these requests time out. User requests may then be forwarded directly to the backend database, which can cause a cache avalanche.
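
One common way to keep a single O(N) command from blocking the queue is to split the work into small batches. The sketch below walks the keyspace with the cursor-based SCAN command instead of KEYS, so that other requests can be served between batches. It is an illustration only: `iter_keys` is a hypothetical helper, and `client` is assumed to be any object whose `scan(cursor=..., match=..., count=...)` method returns a `(next_cursor, keys)` pair, as the redis-py client does.

```python
def iter_keys(client, pattern="*", batch=100):
    """Yield keys that match `pattern` by using SCAN in small batches.

    Each SCAN call inspects roughly `batch` keys, so no single command
    occupies the server for the whole keyspace the way KEYS does.
    """
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=batch)
        for key in keys:
            yield key
        if cursor == 0:  # the server returns cursor 0 when the scan is complete
            break
```

SCAN can return the same key more than once during a full iteration, so callers that need unique keys should deduplicate the results.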

  1. Use the performance monitoring feature to identify the time period during which CPU utilization is high. For more information, see Query monitoring data.
  2. Use the following methods to identify the commands that cause high CPU utilization:
    • Audit logs record modification and deletion operations that are performed on ApsaraDB for Redis instances. You can query audit logs to analyze the commands and trends within a specified time period. This allows you to identify the commands that cause high CPU utilization. For more information, see Query audit logs.
      Figure 1. Sample audit log query
    • Slow logs record commands that are run with a duration longer than the specified threshold. You can identify commands that cause high CPU utilization based on the statements and duration that are recorded in slow logs. For more information, see Query slow logs.
      Figure 2. Sample slow log query
      Note The amount of time that is taken to run a statement is measured in microseconds.
  3. Assess and disable high-risk commands and commands that consume a large amount of CPU resources, such as FLUSHALL, KEYS, and HGETALL. For more information, see Disable high-risk commands.
  4. Optimize your application. For example, avoid sorting data frequently.
  5. Optional: Use the following methods to modify the instance based on your business requirements:
    • Change the architecture of the instance to read/write splitting to distribute commands or applications that consume a large amount of CPU resources.
    • Change the instance to a performance-enhanced instance that lowers CPU utilization by using the multi-threading feature.
    Note For more information about how to change the architecture and instance type, see Change the specification of an ApsaraDB for Redis instance.
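
When slow logs contain many entries, grouping them by command name makes it easier to see which commands account for most of the execution time in step 2. The following is a minimal sketch under stated assumptions: `summarize_slow_log` is a hypothetical helper, and each entry is assumed to be a dict with a `command` field (text or bytes) and a `duration` field in microseconds, which is the shape that the redis-py `slowlog_get()` call returns.

```python
from collections import defaultdict


def summarize_slow_log(entries):
    """Group slow log entries by command name.

    Returns a list of (command_name, count, total_milliseconds) tuples,
    sorted so that the command with the largest total duration comes first.
    """
    totals = defaultdict(lambda: [0, 0])  # name -> [count, total microseconds]
    for entry in entries:
        command = entry["command"]
        if isinstance(command, bytes):
            command = command.decode("utf-8", errors="replace")
        parts = command.split()
        name = parts[0].upper() if parts else "(EMPTY)"
        totals[name][0] += 1
        totals[name][1] += entry["duration"]
    # Durations are recorded in microseconds; convert to milliseconds.
    rows = [(name, count, micros / 1000.0)
            for name, (count, micros) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

With a redis-py client, the input could come from `client.slowlog_get(128)`, assuming that your account is allowed to run SLOWLOG on the instance. Slow logs exported from the console would need to be converted to the same dict shape first.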

Optimize hotkeys

Issue description:

A cluster instance or a read/write splitting instance is used. The CPU utilization is high on some data nodes.

Solution:

  • Enable the proxy query cache feature. After you enable this feature, proxy servers cache the request and response data of hotkeys. If a proxy server receives a duplicate request within the validity period of cached data, the proxy server directly returns a response to the client without the need to interact with backend data shards. For more information, see Use proxy query cache to address issues caused by hotkeys.
    Note This feature is supported only by performance-enhanced instances of ApsaraDB for Redis Enhanced Edition (Tair) in the cluster architecture.
  • In most cases, this issue is caused by hotkeys. Analyze the slow logs and audit logs, and then check the hotkeys on each node. Handling these hotkeys can resolve the issue or at least slightly decrease CPU utilization. For more information, see View real-time hot key logs.
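
The idea behind a query cache, returning a stored response for a duplicate request within a validity period, can also be illustrated on the application side. The sketch below is not the proxy's implementation. `ShortTtlCache`, its parameters, and the `fetch` callable that reads a key from the instance are all hypothetical names introduced for illustration.

```python
import time


class ShortTtlCache:
    """Serve repeated reads of the same key from memory for a short period."""

    def __init__(self, fetch, ttl_seconds=1.0, clock=time.monotonic):
        self._fetch = fetch      # callable that reads the key from the instance
        self._ttl = ttl_seconds  # validity period of a cached response
        self._clock = clock      # injectable so the behavior can be tested
        self._entries = {}       # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still valid: no request reaches the data node
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value
```

Within the validity period, readers may see a stale value, so this approach suits only hotkeys whose data tolerates a short delay. The sketch also never evicts entries, which a real cache would need to do to bound memory usage.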

Optimize short-lived connections

Issue description:

Connections are established and closed frequently, which consumes a large amount of resources on the ApsaraDB for Redis instance. In this case, CPU utilization is high, the number of established connections is large, and the number of queries per second (QPS) does not reach the expected value.

Solution:

  • Change short-lived connections to persistent connections. For example, create a JedisPool connection pool. For more information, see Jedis client.
  • Change the instance to a performance-enhanced instance that optimizes the processing of short-lived connections.
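
The principle behind a connection pool is to open a fixed set of connections once and reuse them for every request. The following minimal sketch shows only that principle. `SimplePool` and the `connect` factory are hypothetical names, and production code should use the pool that ships with your client library, such as JedisPool, which also handles broken connections and idle timeouts.

```python
import queue
from contextlib import contextmanager


class SimplePool:
    """Keep a fixed set of connections open and hand them out on demand."""

    def __init__(self, connect, size=8):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # connections are opened once, up front

    @contextmanager
    def connection(self, timeout=None):
        # Blocks while all connections are in use; raises queue.Empty
        # if `timeout` seconds pass without one becoming available.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)       # return for reuse instead of closing
```

A request handler would wrap each command in `with pool.connection() as conn:`, so that the number of connections opened on the instance stays at `size` no matter how many requests are served.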

Disable AOF persistence

Issue description:

By default, append-only file (AOF) persistence is enabled for ApsaraDB for Redis instances. If an ApsaraDB for Redis instance runs with heavy loads, frequent AOF operations may increase CPU utilization.

Solution:

You can disable AOF persistence if this does not adversely affect your business. In addition, you can back up the Redis data during off-peak hours or during the maintenance window to minimize the impact.

Warning If you use a performance-enhanced instance of ApsaraDB for Redis Enhanced Edition (Tair), you cannot use AOF files to restore data after you disable AOF persistence, which means that the data flashback feature is unavailable. You can restore data only from backup sets. For more information, see Restore data from a backup set to a new instance. Proceed with caution if you disable AOF persistence.

Evaluate the service performance

The preceding methods are used to optimize the performance of your instance. If the average CPU utilization still exceeds 50% during normal business operations, the instance may have a performance bottleneck.

To resolve this issue, first check for commands and requests from application hosts that may degrade the instance performance. If such commands or requests exist, you must optimize your business system. If no such commands or requests are found but the CPU utilization is still high, we recommend that you upgrade the instance specifications to ensure business stability. You can also change the instance to a cluster instance or read/write splitting instance. For more information about how to upgrade an instance, see Change the specification of an ApsaraDB for Redis instance.

Note To ensure business stability, we recommend that you purchase a pay-as-you-go instance before you upgrade the instance. You can release this pay-as-you-go instance after you complete the stress and compatibility tests.