Tair: Query slow logs

Last Updated: Feb 28, 2024

To analyze the performance of your Tair instance, identify slow commands, or pinpoint potential performance bottlenecks, examine the slow logs of the instance. Slow logs provide clues for resolving performance issues and optimizing queries. They record commands whose execution time exceeds the threshold specified by slowlog-log-slower-than. By default, this threshold is 20 milliseconds (20000 microseconds). You can change this value to suit your needs.

Overview

Slow logs record requests that take longer than a specified threshold to execute in Tair. Slow logs are classified into slow logs of data nodes and slow logs of proxy nodes.

Slow logs of data nodes

  • The command execution duration recorded in slow logs of a data node includes only the time required to run the command on that data node. It does not include the time that the data node spends communicating with a proxy node or client, or the time that the command waits in the single-threaded queue.

  • Slow logs of data nodes are retained for 72 hours. The number of slow logs that can be stored is unlimited.

  • In most cases, few slow logs are generated on data nodes due to the high performance of Tair.

Parameters

| Parameter | Description |
| --- | --- |
| slowlog-log-slower-than | The threshold of the command execution duration for slow logs of data nodes. If a command runs for a period of time that exceeds this threshold, the command is recorded in a slow log. Default value: 20000. Unit: microseconds. 20000 microseconds is equal to 20 milliseconds. Note: In most cases, the actual latency is higher than the specified value of this parameter because this value does not include the amount of time required to transmit and process data among clients, proxy nodes, and data nodes. |
| slowlog-max-len | The maximum number of slow log entries that can be stored. Default value: 1024. |

For more information, see Modify the values of parameters for an instance.
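
As a minimal sketch of the threshold rule described above, the following Python snippet converts the slowlog-log-slower-than value from microseconds to milliseconds and checks whether a given execution time would be recorded. The function names are hypothetical and only illustrate the comparison; they are not part of Tair. The strict "greater than" comparison follows the wording "exceeds" in this topic, so the behavior at exactly the threshold is an assumption.

```python
# Default value of slowlog-log-slower-than, in microseconds (as documented above).
DEFAULT_SLOWLOG_THRESHOLD_US = 20000


def us_to_ms(microseconds: int) -> float:
    """Convert microseconds to milliseconds."""
    return microseconds / 1000


def is_recorded(execution_us: int, threshold_us: int = DEFAULT_SLOWLOG_THRESHOLD_US) -> bool:
    """Return True if the execution time exceeds the threshold.

    Only the time spent running the command on the data node counts;
    network and queuing time are not part of execution_us.
    """
    # Assumption: "exceeds" is modeled as a strict comparison.
    return execution_us > threshold_us


print(us_to_ms(DEFAULT_SLOWLOG_THRESHOLD_US))  # 20.0
print(is_recorded(25000))  # True
print(is_recorded(15000))  # False
```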

Slow logs of proxy nodes

  • The command execution duration collected in slow logs of proxy nodes starts from the time when a proxy node sends a request to a data node and ends at the time when the proxy node receives the response from the data node. This includes the command execution duration on the data node, the duration of data transmission over the network, and the queuing latency of the command.

  • Slow logs of proxy nodes are retained for 72 hours. The number of slow logs that can be stored is unlimited.

  • In most cases, the latency value recorded in a slow log of proxy nodes is closer to the actual latency of the application. Therefore, we recommend that you check this slow log type when you troubleshoot timeout issues of Tair.

Note

Standard instances of Tair do not involve slow logs of proxy nodes.

Parameters

| Parameter | Description |
| --- | --- |
| rt_threshold_ms | The threshold of the command execution duration for slow logs of proxy nodes. Default value: 500. Unit: milliseconds. We recommend that you set the threshold to a value close to the client timeout period, which is anywhere from 200 milliseconds to 500 milliseconds. |

For more information, see Modify the values of parameters for an instance.
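
The difference between the two slow log types can be illustrated with a small sketch. It assumes a hypothetical breakdown of one request into execution, network, and queuing time (the class and field names are invented for illustration) and applies the default values of slowlog-log-slower-than and rt_threshold_ms.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical timing breakdown of one request, in milliseconds."""
    execution_ms: float  # time spent running the command on the data node
    network_ms: float    # data transmission between proxy node and data node
    queuing_ms: float    # time the command waits in the queue

    def proxy_observed_ms(self) -> float:
        # A proxy node measures from sending the request to receiving the response.
        return self.execution_ms + self.network_ms + self.queuing_ms


def in_data_node_slow_log(t: RequestTiming, slowlog_log_slower_than_us: int = 20000) -> bool:
    # Data nodes compare only the execution time, expressed in microseconds.
    return t.execution_ms * 1000 > slowlog_log_slower_than_us


def in_proxy_slow_log(t: RequestTiming, rt_threshold_ms: int = 500) -> bool:
    # Proxy nodes compare the full round trip, expressed in milliseconds.
    return t.proxy_observed_ms() > rt_threshold_ms


# A fast command that waited a long time behind other commands.
queued = RequestTiming(execution_ms=2, network_ms=3, queuing_ms=600)
print(in_data_node_slow_log(queued))  # False
print(in_proxy_slow_log(queued))      # True
```

In this sketch the request appears only in the proxy node slow log, which matches the recommendation to check that log type when you troubleshoot timeouts.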

Procedure

  1. Log on to the Tair console and go to the Instances page. In the top navigation bar, select the region in which the instance that you want to manage resides. Then, find the instance and click the instance ID.

  2. In the left-side navigation pane, choose Logs > Slow Logs.

  3. On the Slow Logs page, filter slow logs by time range or keyword. For cluster and read/write splitting instances, you can also filter slow logs by node type and node ID.

    Note

    By default, the Host Address parameter for cluster and read/write splitting instances displays the IP addresses of proxy nodes. To obtain the IP address of a specific client, set the ptod_enabled parameter to 1 in the System Parameters section. For more information, see Modify the values of parameters for an instance.

Irrelevant slow SQL statements

Note

Specific slow SQL statements reflect the engine logic of an instance rather than the actual execution speed of your requests. You can ignore the following slow SQL statements.

| Slow SQL statement | Description |
| --- | --- |
| latency:eventloop | Tair uses an event-driven model during runtime. An event loop consists of reading, parsing, and running commands and returning outputs. The execution duration of a latency:eventloop statement indicates the overall amount of time taken for an event loop. |
| latency:pipeline | Tair allows the client to work in pipeline mode. In this mode, the client sends commands and receives outputs in batches. After all commands are executed, outputs begin to be returned. If your Tair instance uses the cluster architecture, proxy nodes use the pipeline mode to send requests in batches to the backend of Tair. The execution duration of a latency:pipeline statement indicates the amount of time consumed by a batch of client requests in pipeline mode. |
| latency:fork | The execution duration of a latency:fork statement indicates the amount of time required to fork a child process. The larger the amount of data, the longer the time required. |
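
If you process exported slow log records yourself, you can drop these entries before analysis. The following is a minimal sketch, assuming each record is a dictionary with hypothetical "command" and "duration_us" keys; the actual record layout depends on how you obtain the logs.

```python
# Entries that reflect engine internals and can be ignored (listed above).
IGNORABLE_PREFIXES = ("latency:eventloop", "latency:pipeline", "latency:fork")


def drop_ignorable(entries):
    """Return only entries whose command does not start with an ignorable prefix."""
    return [e for e in entries if not e["command"].startswith(IGNORABLE_PREFIXES)]


# Hypothetical records; the "command" and "duration_us" keys are assumptions.
sample = [
    {"command": "latency:eventloop", "duration_us": 31000},
    {"command": "KEYS *", "duration_us": 84000},
    {"command": "latency:fork", "duration_us": 52000},
    {"command": "HGETALL big_hash", "duration_us": 27000},
]

for entry in drop_ignorable(sample):
    print(entry["command"], entry["duration_us"])
# KEYS * 84000
# HGETALL big_hash 27000
```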

Related API operations

| API operation | Description |
| --- | --- |
| DescribeSlowLogRecords | Queries the slow logs of a Tair instance that were generated within a specified period of time. |
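
Because slow logs are retained for 72 hours, a query window longer than that cannot return older entries. The following sketch builds a time window for such a query and rejects windows longer than the retention period. It does not call the API: the helper name is hypothetical, and the request keys (InstanceId, StartTime, EndTime) and the timestamp format are assumptions that you should verify against the DescribeSlowLogRecords reference. For simplicity, the check limits only the window length and does not compare the window with the current time.

```python
from datetime import datetime, timedelta, timezone

RETENTION_HOURS = 72  # slow logs are retained for 72 hours


def build_slow_log_query(instance_id: str, end: datetime, hours: int) -> dict:
    """Build a request-parameter dict for a slow log query.

    The keys and the timestamp format are assumptions; check the
    DescribeSlowLogRecords reference before use.
    """
    if not 0 < hours <= RETENTION_HOURS:
        raise ValueError(f"hours must be between 1 and {RETENTION_HOURS}")
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(hours=hours)
    fmt = "%Y-%m-%dT%H:%MZ"  # assumed UTC timestamp format
    return {
        "InstanceId": instance_id,
        "StartTime": start_utc.strftime(fmt),
        "EndTime": end_utc.strftime(fmt),
    }


params = build_slow_log_query(
    "r-example",  # placeholder instance ID
    datetime(2024, 2, 28, 12, 0, tzinfo=timezone.utc),
    hours=24,
)
print(params)
```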