Connection timeouts caused by slow requests are a common issue that affects service performance. The slow log feature of ApsaraDB for Redis allows you to find the IP address of the client that sends these requests and to troubleshoot issues based on the details in the slow logs.

Prerequisites

  • The major version of the ApsaraDB for Redis instance is Redis 4.0 or later, and the latest minor version is used.
    Note If your instance uses a version of ApsaraDB for Redis earlier than Redis 4.0, you can upgrade the instance version after you assess the compatibility with your business. For more information, see Upgrade the minor version and Upgrade the major version.
  • The instance is an instance of ApsaraDB for Redis Community Edition or a performance-enhanced instance of ApsaraDB for Redis Enhanced Edition (Tair).

Background information

Slow logs record requests that take longer to execute than a specified threshold. Slow logs are classified into slow logs from data nodes and slow logs from proxy servers.
Note Only the slow logs from data nodes are collected for standard instances.
Each slow log type is described below, together with the parameters that control it.

Slow logs from data nodes

Description:
  • The command execution time collected in slow logs that are generated on a data node includes only the time required to actually run a command on the data node. The time required for the data node to communicate with a proxy or client and the latency of the command in the single-threaded queue are not included.
  • In most cases, the number of slow logs from data nodes is small due to the high-performance capabilities of ApsaraDB for Redis.
Parameters:
  • slowlog-log-slower-than: specifies the threshold of command execution time for slow logs from data nodes. If a command runs for a period of time that exceeds this threshold, the command is recorded in a slow log. Default value: 20000. Unit: μs (20000 μs is equal to 20 ms).
    Note In most cases, the end-to-end latency of a command is higher than the execution time that is compared with this parameter, because that execution time does not include the time required to transmit and process data among clients, proxies, and data nodes.
  • slowlog-max-len: specifies the maximum number of slow logs that can be stored. Default value: 1024.

For more information, see Modify the parameters of an ApsaraDB for Redis instance.
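
The interaction between the two data-node parameters can be sketched with a small simulation. This is a toy model, not the Redis implementation: the class and field names (`DataNodeSlowLog`, `SlowLogEntry`, `record`) are hypothetical, and the only assumptions are the ones stated above, namely that a command is logged when its execution time exceeds the threshold and that at most `slowlog-max-len` entries are kept. The sketch additionally assumes that the oldest entry is dropped when the log is full.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SlowLogEntry:
    command: str
    duration_us: int  # execution time on the data node, in microseconds


class DataNodeSlowLog:
    """Toy model of a bounded slow log (hypothetical, for illustration only)."""

    def __init__(self, slower_than_us: int = 20000, max_len: int = 1024):
        # Defaults mirror slowlog-log-slower-than and slowlog-max-len.
        self.slower_than_us = slower_than_us
        # A deque with maxlen silently discards the oldest entry when full
        # (assumed eviction policy for this sketch).
        self.entries = deque(maxlen=max_len)

    def record(self, command: str, duration_us: int) -> bool:
        """Store the command if it ran longer than the threshold."""
        if duration_us <= self.slower_than_us:
            return False
        self.entries.append(SlowLogEntry(command, duration_us))
        return True


log = DataNodeSlowLog(slower_than_us=20000, max_len=2)
log.record("GET user:1", 150)     # 0.15 ms: below the threshold, not stored
log.record("KEYS *", 35000)       # 35 ms: stored
log.record("HGETALL big", 28000)  # 28 ms: stored
log.record("SMEMBERS big", 41000) # 41 ms: stored, "KEYS *" is dropped
print([entry.command for entry in log.entries])
```

With a small `max_len`, older entries disappear quickly, which is one reason to check slow logs soon after a timeout occurs.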

Slow logs from proxy servers

Description:
  • The command execution time collected in slow logs from proxy servers starts from the time when a proxy server sends a request to a data node and ends at the time when the proxy server receives the response from the data node. This includes the command execution time on the data node, the data transmission time over the network, and the queuing latency of the command.
  • Slow logs from proxy servers are retained for 72 hours. The number of slow logs from proxy servers is unlimited.
  • In most cases, the latency value recorded in a slow log from proxy servers is similar to the actual latency of the application. However, we recommend that you check the accuracy of this value when you troubleshoot timeout issues.
Parameters:
  • rt_threshold_ms: specifies the threshold of command execution time for slow logs from proxy servers. Default value: 500. Unit: ms. We recommend that you set the threshold to a value close to the client timeout value, which is from 200 ms to 500 ms.

For more information, see Modify the parameters of an ApsaraDB for Redis instance.
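
Because the two log types measure different time spans, the same command can appear in one log and not in the other. The following sketch illustrates this under the definitions above: the data node compares only the execution time with `slowlog-log-slower-than`, while the proxy compares execution time plus queuing and network time with `rt_threshold_ms`. The function name `slow_log_hits` and its arguments are hypothetical, and the sample timings are made up for illustration.

```python
def slow_log_hits(
    exec_ms: int,
    queue_ms: int,
    network_ms: int,
    slowlog_log_slower_than_us: int = 20000,
    rt_threshold_ms: int = 500,
) -> dict:
    """Report which slow log would record a command (illustrative model)."""
    # The proxy measures from sending the request to receiving the response.
    proxy_elapsed_ms = exec_ms + queue_ms + network_ms
    return {
        # The data node sees only the execution time (converted ms -> us).
        "data_node": exec_ms * 1000 > slowlog_log_slower_than_us,
        "proxy": proxy_elapsed_ms > rt_threshold_ms,
    }


# A fast command stuck behind a slow one: logged by the proxy only.
print(slow_log_hits(exec_ms=5, queue_ms=600, network_ms=2))
# A command that is itself slow: logged by both.
print(slow_log_hits(exec_ms=800, queue_ms=0, network_ms=2))
# A moderately slow command: logged by the data node only.
print(slow_log_hits(exec_ms=30, queue_ms=0, network_ms=2))
```

The first case shows why a proxy slow log entry does not by itself prove that the recorded command is expensive.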

Methods used to query slow logs

Slow log type Method
Slow logs from data nodes
Slow logs from proxy servers Log on to the ApsaraDB for Redis console or call an API operation:

Procedure

Service timeouts are often caused by slow requests. We recommend that you perform the following steps to troubleshoot timeout issues:

  1. If a service timeout issue occurs, first check the slow logs generated on proxy servers. For more information, see Query slow logs.
    Note
    • For standard instances, go to Step 3 and analyze slow logs from data nodes.
    • If no slow logs from proxy servers exist, you can check the network between the client and the ApsaraDB for Redis instance.
  2. Find the command recorded by the earliest slow log from proxy servers.
    Note If a slow request on a data node causes subsequent commands to accumulate, the delayed commands are also recorded in slow logs from proxy servers. This is why the earliest slow log is the most likely to point to the cause.

    In this example, the earliest recorded slow log is caused by the KEYS command. The IP address on the right of the log entry is the IP address of the client that sends the command.

    Figure: The earliest slow log in the slow logs from proxy servers
  3. Check the slow logs from data nodes to confirm which command recorded in the slow logs from proxy servers causes the timeout issue.
    Note Typically, the command that generates the earliest slow log from proxy servers also generates a slow log on a data node. A data node usually has fewer slow logs than a proxy server because the two log types define execution time differently and use different thresholds.

    In this example, after you view slow logs from proxy servers, you can see that the slow log caused by the KEYS command also exists in slow logs from data nodes. No other slow logs that are displayed on the Proxy tab exist on the Data nodes tab. This shows that the KEYS command causes the timeout.

    Figure: Slow logs on the Data nodes tab
  4. In the slow logs from proxy servers, search for the command found in Step 2 to identify the IP address of the client that sent it.
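
Steps 2 to 4 can be sketched as a short script for slow log entries that have been exported from the console or an API response. The entry shape (`ProxySlowLog` with `time`, `command`, and `client_ip`), the function name, and the sample data are all hypothetical; the real log fields may differ. Matching on the full command string is a simplification made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class ProxySlowLog:
    """Hypothetical shape of one slow log entry from a proxy server."""
    time: datetime
    command: str
    client_ip: str


def find_timeout_cause(
    proxy_logs: Iterable[ProxySlowLog],
    data_node_commands: Set[str],
) -> Optional[Tuple[str, bool, List[str]]]:
    """Return (command, confirmed_on_data_node, client_ips), or None."""
    logs = list(proxy_logs)
    if not logs:
        # No proxy slow logs: check the network between client and instance.
        return None
    # Step 2: the earliest proxy slow log points to the likely cause.
    earliest = min(logs, key=lambda entry: entry.time)
    # Step 3: confirm that the same command appears in data-node slow logs.
    confirmed = earliest.command in data_node_commands
    # Step 4: collect the clients that sent this command.
    client_ips = sorted(
        {entry.client_ip for entry in logs if entry.command == earliest.command}
    )
    return earliest.command, confirmed, client_ips


sample = [
    ProxySlowLog(datetime(2024, 1, 1, 10, 0, 5), "GET user:1", "192.0.2.20"),
    ProxySlowLog(datetime(2024, 1, 1, 10, 0, 1), "KEYS *", "192.0.2.10"),
    ProxySlowLog(datetime(2024, 1, 1, 10, 0, 3), "SET user:2", "192.0.2.30"),
]
print(find_timeout_cause(sample, {"KEYS *"}))
# ('KEYS *', True, ['192.0.2.10'])
```

If the second element of the result is `False`, the earliest proxy entry was not found among the data-node slow logs, and the other proxy entries may need to be checked the same way.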