A common issue that affects service performance is connection timeouts caused by slow requests. The slow log feature of ApsaraDB for Redis allows you to find the IP address of the client that sends these requests and troubleshoot issues based on the details of slow logs.

Background information

Slow logs record requests that take longer than a specified threshold to execute in ApsaraDB for Redis. Slow logs are classified into slow logs from data nodes and slow logs from proxy nodes.

Slow logs from data nodes
  Description:
  • The command execution duration collected in slow logs that were generated on a data node includes only the amount of time required to actually run a command on the data node. The amount of time required for the data node to communicate with a proxy node or client and the execution latency of the command in the single-threaded queue are not included.
  • Slow logs from data nodes are retained for 72 hours. The number of slow logs that can be stored is unlimited.
  • In most cases, a small number of slow logs are generated on data nodes due to the high performance of ApsaraDB for Redis.
  Parameters:
  • slowlog-log-slower-than: the threshold of the command execution duration for slow logs from data nodes. If a command runs for a period of time that exceeds this threshold, the command is recorded in a slow log. Default value: 20000. Unit: microseconds. This value is equivalent to 20 milliseconds.
    Note In most cases, the actual latency is higher than the specified value of this parameter because this value does not include the amount of time required to transmit and process data among clients, proxy nodes, and data nodes.
  • slowlog-max-len: the maximum number of slow log entries that can be stored. Default value: 1024.
Slow logs from proxy nodes
  Description:
  • The command execution duration collected in slow logs from proxy nodes starts from the time when a proxy node sends a request to a data node and ends at the time when the proxy node receives the response from the data node. This includes the command execution duration on the data node, the data transmission duration over the network, and the queuing latency of the command.
  • Slow logs from proxy nodes are retained for 72 hours. The number of slow logs that can be stored is unlimited.
  • In most cases, the latency value recorded in a slow log from proxy nodes is closer to the actual latency of the application. Therefore, we recommend that you check this slow log type when you troubleshoot timeout issues.
  Parameter:
  • rt_threshold_ms: the threshold of the command execution duration for slow logs from proxy nodes. Default value: 500. Unit: milliseconds. We recommend that you set the threshold to a value close to the client timeout period, which is from 200 milliseconds to 500 milliseconds.
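
The difference between the two measurements can be illustrated with a small model. The following Python sketch is only an illustration under stated assumptions: the function name classify_request and its three timing inputs are hypothetical, and the thresholds are the default values listed above. It shows why a request that runs quickly on a data node can still be recorded in a slow log from a proxy node.

```python
# Default thresholds taken from the parameter descriptions above.
DATA_NODE_THRESHOLD_US = 20_000  # slowlog-log-slower-than, in microseconds
PROXY_THRESHOLD_MS = 500         # rt_threshold_ms, in milliseconds


def classify_request(exec_us, network_us, queue_us):
    """Return the slow log types that would record a request (hypothetical model).

    exec_us:    time spent running the command on the data node
    network_us: transmission time between the proxy node and the data node
    queue_us:   time the command waits in the single-threaded queue
    """
    logged = []
    # Data nodes compare only the pure execution time with their threshold.
    if exec_us > DATA_NODE_THRESHOLD_US:
        logged.append("data_node")
    # Proxy nodes measure from sending the request to receiving the response,
    # so execution, transmission, and queuing time all count.
    proxy_ms = (exec_us + network_us + queue_us) / 1000
    if proxy_ms > PROXY_THRESHOLD_MS:
        logged.append("proxy")
    return logged


if __name__ == "__main__":
    # A 5 ms command that waits 600 ms behind other requests in the queue.
    print(classify_request(exec_us=5_000, network_us=2_000, queue_us=600_000))
```

In this model, a command that executes in 5 milliseconds but queues for 600 milliseconds is recorded only by the proxy node, which matches the recommendation to check slow logs from proxy nodes first when you troubleshoot timeouts.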

Methods used to query slow logs

Slow logs from data nodes
  • Connect to the ApsaraDB for Redis instance from a client and run the SLOWLOG GET command. For more information, see SLOWLOG.
  • Log on to the ApsaraDB for Redis console or call an API operation.
Slow logs from proxy nodes
  • Log on to the ApsaraDB for Redis console or call an API operation.
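
As a minimal sketch of the client-based method, the following Python code runs SLOWLOG GET through the third-party redis-py library and formats each entry. It assumes that redis-py is installed and that the entries have the dictionary shape that its slowlog_get() method returns ('id', 'start_time', 'duration', 'command'). The helper names format_slowlog_entry and fetch_slowlog, and the host and password that you pass in, are placeholders for illustration.

```python
from datetime import datetime, timezone


def format_slowlog_entry(entry):
    """Render one SLOWLOG GET entry as a single line.

    Assumes the dict shape returned by redis-py's slowlog_get():
    'id', 'start_time' (Unix seconds), 'duration' (microseconds), 'command'.
    """
    started = datetime.fromtimestamp(entry["start_time"], tz=timezone.utc)
    command = entry["command"]
    if isinstance(command, bytes):
        # redis-py returns bytes unless decode_responses=True is set.
        command = command.decode("utf-8", errors="replace")
    duration_ms = entry["duration"] / 1000
    return f"#{entry['id']} {started:%Y-%m-%d %H:%M:%S} UTC {duration_ms:.1f} ms {command}"


def fetch_slowlog(host, port=6379, password=None, count=10):
    """Run SLOWLOG GET <count> against the instance (requires redis-py)."""
    import redis  # third-party; imported lazily so the formatter works without it

    client = redis.Redis(host=host, port=port, password=password)
    return [format_slowlog_entry(e) for e in client.slowlog_get(count)]
```

Calling fetch_slowlog with the endpoint of your instance returns the most recent entries as text lines. The duration reported by SLOWLOG GET is in microseconds, so the sketch converts it to milliseconds for easier comparison with client timeout values.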

Procedure

In most cases, service timeouts are caused by slow requests. We recommend that you perform the following steps to troubleshoot timeout issues:

  1. If a service timeout issue occurs, first check the slow logs generated on proxy nodes. For more information, see View slow logs.
    Note
    • For standard instances, go to Step 3 and analyze slow logs from data nodes.
    • If no slow logs from proxy nodes exist, you can check the network connection between the client and the ApsaraDB for Redis instance.
  2. Find the command that generated the earliest slow log on proxy nodes.
    Note If slow requests accumulate on data nodes, these requests are recorded in slow logs from proxy nodes.

    In this example, the earliest recorded slow log is generated by the KEYS command. The IP address on the right of the log entry is the IP address of the client that sends the command.

    [Figure: The earliest slow log among the slow logs from proxy nodes]
  3. Check the slow logs from data nodes to determine which of the slow logs found on proxy nodes caused the timeout issue.
    Note Typically, the command that first generates slow logs on proxy nodes also generates slow logs on data nodes. A data node usually has fewer slow logs than a proxy node because the two slow log types measure execution time differently and use different thresholds.

    In this example, the slow log generated by the KEYS command appears on both the Proxy tab and the Data nodes tab, whereas none of the other slow logs displayed on the Proxy tab appear on the Data nodes tab. This indicates that the KEYS command caused the timeout issue.

    [Figure: Slow logs on the Data nodes tab]
  4. In the slow logs from proxy nodes, search for the command found in Step 2 to identify the IP addresses of the clients that send it, and then optimize those clients.
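
The procedure above can be sketched in Python for slow logs that you have already exported from the console or an API operation. This is an illustration only: the function name find_culprit and the record fields 'time', 'command', and 'client_ip' are assumptions, not the format that ApsaraDB for Redis returns.

```python
def find_culprit(proxy_logs, data_node_logs):
    """Sketch of Steps 2 to 4 over hypothetical exported slow log records.

    Each record is a dict with 'time' (a sortable timestamp) and 'command'
    (the command name). Proxy records also carry 'client_ip'.
    These field names are assumptions made for this example.
    """
    if not proxy_logs:
        # No slow logs from proxy nodes: check the network connection instead.
        return None

    # Step 2: the earliest slow log on proxy nodes points at the likely trigger.
    earliest = min(proxy_logs, key=lambda record: record["time"])
    command = earliest["command"]

    # Step 3: confirm that the same command is also slow on the data nodes.
    confirmed = any(record["command"] == command for record in data_node_logs)

    # Step 4: collect the IP addresses of clients that sent this command.
    client_ips = sorted(
        {record["client_ip"] for record in proxy_logs if record["command"] == command}
    )
    return {
        "command": command,
        "confirmed_on_data_node": confirmed,
        "client_ips": client_ips,
    }
```

If the command is not confirmed on the data nodes, the sketch still returns the client IP addresses, because the delay may come from queuing or network transmission rather than from command execution.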