
How to troubleshoot high latency when accessing a website on a Linux instance

Last Updated: Dec 15, 2020

Overview

If a website is slow or unreachable, obvious causes have been ruled out, and the ping command shows significant packet loss, we recommend that you perform a link test. On Linux, use the mtr or traceroute command line tool to test the link. The link test procedure is generally as follows.

  1. Use a link test tool to detect network conditions and server status.
  2. Analyze and handle the problem based on the link test results.

 

Description

This article describes how to use the mtr command line tool and the traceroute command line tool, and how to analyze the link test results.

 

Mtr command line tool

mtr (My traceroute) is a network testing tool that is pre-installed in almost all Linux distributions. It combines the functions of the traceroute and ping commands in a single interface and is very powerful. ping and traceroute are commonly used to check network conditions and server status.

Command      Description
ping         Sends packets to the specified server. If the server responds, the reply is returned together with the round-trip time.
traceroute   Returns all the nodes (routes) that packets pass through from the user's computer to the specified server, and the response time of each node.

 

By default, mtr sends ICMP packets for link detection. Use the "-u" parameter to send UDP packets instead. traceroute performs a link trace only once, whereas mtr probes the nodes on the link continuously and provides statistics for each node. This reduces the impact of transient node fluctuations on the test results, so the results are more accurate. We recommend that you use mtr first.

 

Usage instructions

mtr [-hvrctglspni46] [--help] [--version] [--report]
[--report-cycles=COUNT] [--curses] [--gtk]
[--raw] [--split] [--no-dns] [--address interface]
[--psize=bytes/-s bytes]
[--interval=SECONDS] HOSTNAME [PACKETSIZE]

Sample output

[root@centos ~]# mtr 223.5.5.5
My traceroute [v0.75]
mycentos6.6 (0.0.0.0) Wed Jun 15 23:16:27 2016
Keys: Help Display mode Restart statistics Order of fields quit
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
1. ???
2. 192.X.X.20 0.0% 7 13.1 5.6 2.1 14.7 5.7
3. 111.X.X.41 0.0% 7 3.0 99.2 2.7 632.1 235.4
4. 111.X.X.197 0.0% 7 1.8 2.0 1.2 2.9 0.6
5. 211.X.X.25 0.0% 6 0.9 4.7 0.9 13.9 5.8
6. 211.X.X.70 0.0% 6 1.8 22.8 1.8 50.8 23.6
211.X.X.134
211.X.X.2
211.X.X.66
7. 42.X.X.186 0.0% 6 1.4 1.6 1.3 1.8 0.2
42.X.X.198
8. 42.X.X.246 0.0% 6 2.8 2.9 2.6 3.2 0.2
42.X.X.242
9. ???
10. 223.5.5.5 0.0% 6 2.7 2.7 2.5 3.2 0.3

 

Description of common optional parameters

  • -r or --report: displays the output in report mode.
  • -p or --split: lists the result of each trace separately, instead of summarizing the whole run as --report does.
  • -s or --psize: specifies the size of the ping packets.
  • -n or --no-dns: does not perform reverse DNS resolution on IP addresses.
  • -a or --address: sets the source IP address of the packets to be sent. Use this parameter when the host has multiple IP addresses.
  • -4: only IPv4 is used.
  • -6: only IPv6 is used.

While mtr is running, you can press a key to quickly switch modes. The keys have the following meanings.

  • ? or h: displays the help menu.
  • d: switches the display mode.
  • n: enables or disables DNS resolution.
  • u: toggles between ICMP and UDP packets for detection.
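
The command-line options described above can be combined into a non-interactive report run, which is convenient when the result has to be saved or sent to technical support. The following is a minimal sketch in Python that only assembles the argument list. The helper name build_mtr_command and its defaults are illustrative assumptions, not part of mtr.

```python
import shlex


def build_mtr_command(host, cycles=10, no_dns=True, udp=False, ipv4_only=False):
    """Return an argv list for a non-interactive mtr report run.

    Hypothetical helper: it only uses options listed in this article
    (--report, --report-cycles, --no-dns, -u, -4).
    """
    argv = ["mtr", "--report", f"--report-cycles={cycles}"]
    if no_dns:
        argv.append("--no-dns")  # skip reverse DNS lookups
    if udp:
        argv.append("-u")        # probe with UDP instead of ICMP
    if ipv4_only:
        argv.append("-4")        # use IPv4 only
    argv.append(host)
    return argv


cmd = build_mtr_command("223.5.5.5", cycles=20)
print(shlex.join(cmd))
# To actually run the test (mtr must be installed; it may need root):
#   import subprocess
#   result = subprocess.run(cmd, capture_output=True, text=True)
#   print(result.stdout)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if the host name comes from user input.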

 

Response description

The following list describes the columns in the returned results, based on the default configuration.

  • Column 1 (Host): IP address or domain name of the node. Press n to switch between the two.
  • Column 2 (Loss%): packet loss rate of the node.
  • Column 3 (Snt): number of packets sent so far. In report mode the default is 10, which can be changed with the "-c" parameter.
  • Column 4 (Last): latency of the most recent probe.
  • Columns 5, 6, and 7 (Avg, Best, and Wrst): average, minimum, and maximum probe latency, respectively.
  • Column 8 (StDev): standard deviation of the latency. The larger the value, the less stable the corresponding node.
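
To work with these columns in a script, each hop line can be split into named fields. The following sketch assumes the whitespace-separated layout shown in the sample output above (hop number, host, then the seven statistics). Other mtr versions may format lines differently, and the names HopStats and parse_hop_line are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HopStats:
    hop: int
    host: str
    loss_pct: float
    sent: int
    last_ms: float
    avg_ms: float
    best_ms: float
    worst_ms: float
    stdev_ms: float


def parse_hop_line(line):
    """Parse a line such as '2. 192.X.X.20 0.0% 7 13.1 5.6 2.1 14.7 5.7'.

    Returns None for lines that carry no statistics, for example
    '9. ???' or an additional host listed under a hop.
    """
    parts = line.split()
    if len(parts) != 9 or not parts[2].endswith("%"):
        return None
    try:
        return HopStats(
            hop=int(parts[0].rstrip(".")),
            host=parts[1],
            loss_pct=float(parts[2].rstrip("%")),
            sent=int(parts[3]),
            last_ms=float(parts[4]),
            avg_ms=float(parts[5]),
            best_ms=float(parts[6]),
            worst_ms=float(parts[7]),
            stdev_ms=float(parts[8]),
        )
    except ValueError:
        return None


stats = parse_hop_line("3. 111.X.X.41 0.0% 7 3.0 99.2 2.7 632.1 235.4")
print(stats.hop, stats.avg_ms, stats.worst_ms, stats.stdev_ms)
```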

 

traceroute command line tool

traceroute is a network testing tool pre-installed in almost all Linux distributions. It traces the path that Internet Protocol (IP) packets take to a target address. traceroute sends UDP probe packets with a small TTL value and then listens for ICMP TIME_EXCEEDED responses from the gateways along the link. Probing starts at TTL = 1, and the TTL is increased one hop at a time until an ICMP PORT_UNREACHABLE message is received or the maximum TTL (Max_ttl) of the command is reached. An ICMP PORT_UNREACHABLE message indicates that the target host has been located. By default, traceroute sends UDP packets for link detection. Use the "-I" parameter to send ICMP packets instead.
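
The TTL mechanism described above can be illustrated with a small simulation. The following Python sketch sends no packets. It models the path as a list of router names ending with the target and reports which node would answer each probe. The function name simulate_traceroute and the sample path are assumptions made for illustration only.

```python
def simulate_traceroute(path, max_ttl=30):
    """Simulate TTL-based path discovery without any network traffic.

    `path` lists the routers in order and ends with the target host.
    Returns a list of (ttl, responder, icmp_message) tuples.
    """
    if not path:
        return []
    results = []
    for ttl in range(1, max_ttl + 1):
        if ttl < len(path):
            # The probe's TTL runs out at router number `ttl`,
            # which answers with ICMP TIME_EXCEEDED.
            results.append((ttl, path[ttl - 1], "TIME_EXCEEDED"))
        else:
            # The probe reaches the target; the closed UDP port
            # makes it answer with ICMP PORT_UNREACHABLE.
            results.append((ttl, path[-1], "PORT_UNREACHABLE"))
            break
    return results


for ttl, node, message in simulate_traceroute(["gateway", "carrier-router", "target"]):
    print(ttl, node, message)
```

If max_ttl is smaller than the path length, the loop ends before the target is reached, which corresponds to a trace that stops at the maximum hop count.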

 

Usage instructions

traceroute [-I] [ -m Max_ttl ] [ -n ] [ -p Port ] [ -q Nqueries ] [ -r ] [ -s SRC_Addr ] [  -t TypeOfService ] [ -f flow ] [ -v ] [  -w WaitTime ] Host [ PacketSize ]

 

Sample output

[root@centos ~]# traceroute -I 223.5.5.5
traceroute to 223.5.5.5 (223.5.5.5), 30 hops max, 60 byte packets
1 * * *
2 192.X.X.20 (192.X.X.20) 3.965 ms 4.252 ms 4.531 ms
3 111.X.X.41 (111.X.X.41) 6.109 ms 6.574 ms 6.996 ms
4 111.X.X.197 (111.X.X.197) 2.407 ms 2.451 ms 2.533 ms
5 211.X.X.25 (211.X.X.25) 1.321 ms 1.285 ms 1.304 ms
6 211.X.X.70 (211.X.X.70) 2.417 ms 211.X.X.66 (211.X.X.66) 1.857 ms 211.X.X.70 (211.X.X.70) 2.002 ms
7 42.X.X.194 (42.X.X.194) 2.570 ms 2.536 ms 42.X.X.186 (42.X.X.186) 1.585 ms
8 42.X.X.246 (42.X.X.246) 2.706 ms 2.666 ms 2.437 ms
9 * * *
10 public1.alidns.com (223.5.5.5) 2.817 ms 2.676 ms 2.401 ms

Description of common optional parameters

  • -d: enables socket-level debugging.
  • -f: sets the TTL value of the first probe packet.
  • -F: sets the "don't fragment" flag, which disables fragmentation.
  • -g: sets source routing gateways. A maximum of eight gateways can be set.
  • -i: sends packets through the specified NIC. Use this parameter when the host has multiple NICs.
  • -I: uses ICMP packets instead of UDP packets for detection.
  • -m: specifies the maximum TTL of the probe packets.
  • -n: uses IP addresses rather than host names (disables reverse DNS lookup).
  • -p: sets the destination port used for UDP probes.
  • -r: bypasses the normal routing table and sends packets directly to the remote host.
  • -s: specifies the source IP address of the packets sent by the local host.
  • -t: sets the Type of Service (TOS) value of the probe packets.
  • -v: displays the command execution process in detail.
  • -w: sets the time to wait for a response from the remote host.
  • -x: enables or disables packet checksum verification.

 

Analyze link test results

The following steps describe how to analyze a link test result. The example they refer to divides the link into three regions: A, B, and C.

  1. Determine whether abnormalities exist in each region and handle them according to the region.
    • Region A is the local network of the client, that is, the local area network and the network of the local network provider. For exceptions on nodes in the client's local area network, troubleshoot and analyze the local network. For exceptions on the network of the local network provider, contact the local carrier.

    • Region B is the backbone network of the carrier. For exceptions in this region, look up the carrier that owns the abnormal node by its IP address, and then report the problem to the carrier directly or through Alibaba Cloud after-sales technical support.

    • Region C is the local network of the target server, that is, the network of the provider that hosts the target server. For exceptions in this region, report the problem to the network provider of the target host.

  2. Combine Avg (average) and StDev (standard deviation) to determine whether each node has an exception.

    • If StDev is high, also check the Best and Wrst values of the node to determine whether the node has an exception.
    • If StDev is not high, use Avg to determine whether the node has an exception.

      Note: There is no fixed threshold for whether StDev is high. Evaluate it relative to the latency values in the other columns of the same node. For example, if Avg is 30 ms and StDev is 25 ms, the deviation is high. However, if Avg is 325 ms and StDev is again 25 ms, the deviation is not high.

  3. Check the packet loss rate of each node. If "Loss%" is not 0, the network at this hop may be faulty. There are two possible causes of packet loss on a node.

    • The carrier limits the ICMP transmission rate of the node, which results in packet loss.
    • The node is abnormal, which results in packet loss.
  4. Determine the cause of packet loss on the abnormal node.

    • If no packet loss occurs on the subsequent nodes, the packet loss on the current node is caused by the carrier's rate limiting policy and can be ignored. See the 2nd hop in the link test result example.

    • If packet loss also occurs on the subsequent nodes, the current node has a network exception that causes packet loss. See the 5th hop in the link test result example.

      Note: The two situations may occur at the same time, that is, the node is subject to both policy rate limiting and a network exception. In this case, if packet loss occurs continuously on the current node and its subsequent nodes, and the packet loss rates of the nodes differ, the packet loss rate of the last few hops is generally used as the reference. In the link test result example, packet loss occurs at the 3rd, 6th, and 7th hops. Therefore, the 40% packet loss rate of the 7th hop is used as the reference for the final packet loss.

  5. Check for significant latency to determine whether a node has an exception. Analyze it from the following two aspects.

    • If the latency increases sharply after a certain hop, the node at that hop generally has a network exception. In the link test result example, the latency of the nodes after the 5th hop increases sharply, which indicates a network exception on the node at the 5th hop.

      Note: High latency does not necessarily mean that the corresponding node has an exception. High latency may also be caused by the return path of the data. We recommend that you analyze it together with a reverse link test.

    • ICMP rate limiting may also cause a sharp increase in latency on a node, but the subsequent nodes usually return to normal. In the link test result example, the packet loss rate at the 3rd hop is 100% and the latency increases sharply, but the latency of the following node immediately returns to normal. Therefore, the sharp increase in latency and the packet loss on this node are caused by policy rate limiting.
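
The rules in steps 2 to 5 can be sketched as a few small Python helpers. This is one possible reading of the rules, not an official algorithm: the function names, the 0.5 ratio threshold, the 50 ms jump threshold, and the sample values are all assumptions chosen for illustration. Loss at a hop is treated as a possible fault only if every later hop also shows loss.

```python
def stdev_is_high(avg_ms, stdev_ms, ratio_threshold=0.5):
    """Judge StDev relative to Avg (the threshold is an illustrative assumption)."""
    return avg_ms > 0 and stdev_ms / avg_ms > ratio_threshold


def classify_loss(loss_by_hop):
    """Label each lossy hop; `loss_by_hop` holds Loss% from hop 1 to the target."""
    labels = {}
    last = len(loss_by_hop) - 1
    for i, loss in enumerate(loss_by_hop):
        if loss <= 0:
            continue
        if i == last:
            labels[i + 1] = "end-to-end loss"
        elif all(x > 0 for x in loss_by_hop[i + 1:]):
            # Loss continues on every later hop: likely a real problem here.
            labels[i + 1] = "possible fault"
        else:
            # Later hops are clean: probably ICMP rate limiting on this node.
            labels[i + 1] = "likely rate limiting"
    return labels


def reference_loss(loss_by_hop):
    """Use the loss rate of the final hop as the overall reference."""
    return loss_by_hop[-1] if loss_by_hop else 0.0


def latency_jump_persists(avg_by_hop, hop, jump_ms=50.0):
    """Return True if a sharp increase at `hop` (1-based) carries over to all later hops."""
    before = avg_by_hop[hop - 2] if hop >= 2 else 0.0
    threshold = before + jump_ms
    if avg_by_hop[hop - 1] < threshold:
        return False  # no sharp increase at this hop
    later = avg_by_hop[hop:]
    return bool(later) and all(v >= threshold for v in later)


# Hypothetical measurements, not taken from a real test.
loss = [0, 10, 0, 0, 20, 30, 40]
print(classify_loss(loss), reference_loss(loss))
print(stdev_is_high(30, 25), stdev_is_high(325, 25))
print(latency_jump_persists([2, 3, 120, 4, 5], hop=3))
print(latency_jump_persists([2, 3, 4, 4, 90, 95, 110], hop=5))
```

A result from helpers like these is only a starting point. As noted in step 5, latency and loss can also come from the return path, so a reverse link test is still needed before drawing a conclusion.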

 

Proposed operations

  • If 100% packet loss occurs at the target address, we recommend that you check the security policy configuration of the target server.
  • If packets cannot reach the target server because of a routing loop, we recommend that you contact the owner of the corresponding node.
  • If packets receive no response after a certain hop, we recommend that you confirm the problem with a reverse link test and contact the carrier that owns the corresponding node.
  • Leased lines are available for communication between Alibaba Cloud data centers in mainland China and data centers in other countries or regions. To reduce the packet loss rate during such communication, we recommend that you use Express Connect.
  • If packets are dropped and the latency is very high on the host, we recommend that you perform a two-way mtr test, that is, a local-to-server test and a server-to-local test. If you cannot log on to the server remotely, log on through the Management Terminal.

 

Application scope

  • ECS