
Troubleshooting high latency or packet loss when accessing Windows instances

Last Updated: Dec 15, 2020

Overview

When a website is slow or cannot be accessed, obvious causes have been ruled out, and the ping command shows noticeable packet loss, we recommend that you perform a link test. In Windows, use the WinMTR tool or the tracert command-line tool to run the link test and determine the cause of the problem. A link test usually takes the following steps.

  1. Use a link test tool to detect network conditions and server status.
  2. Analyze and handle the problem based on the link test results.

 

Detail

Alibaba Cloud reminds you that:

  • When you perform operations that have risks, such as modifying instances or data, check the disaster recovery and fault tolerance capabilities of the instance to ensure data security.
  • If you modify the configurations and data of instances including but not limited to ECS and RDS instances, we recommend that you create snapshots or enable RDS log backup.
  • If you have authorized or submitted security information such as the logon account and password in the Alibaba Cloud Management console, we recommend that you modify such information in a timely manner.

 

WinMTR tool

mtr (My traceroute) is a network test tool that combines the functions of the tracert and ping commands. ping and tracert are commonly used to check network conditions and server status.

Command   Description
ping      Sends packets to the specified server. If the server responds, ping reports the reply and its round-trip time.
tracert   Lists the nodes (routers) on the path from the user's computer to the specified server, and the response time of each node.
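
When ping is run from a script, the loss figure can be read from its summary line. The following is a minimal Python sketch, assuming the English-locale Windows summary format ("Packets: Sent = ..., Received = ..., Lost = ... (...% loss)"). The function name parse_ping_summary and the sample text are illustrative and not part of any Alibaba Cloud tool.

```python
import re


def parse_ping_summary(output: str) -> dict:
    """Extract packet counts from the summary line of Windows ping output.

    Assumes the English-locale format; other locales print different text.
    """
    match = re.search(
        r"Sent = (\d+), Received = (\d+), Lost = (\d+) \((\d+)% loss\)", output
    )
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received, lost, loss_pct = (int(value) for value in match.groups())
    return {"sent": sent, "received": received, "lost": lost, "loss_pct": loss_pct}


# Illustrative sample text, not captured from a real run.
SAMPLE = """
Ping statistics for 223.5.5.5:
    Packets: Sent = 4, Received = 3, Lost = 1 (25% loss),
"""

if __name__ == "__main__":
    print(parse_ping_summary(SAMPLE))
```

A nonzero loss_pct here is the signal that a link test is worth running.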

 

WinMTR is the graphical implementation of mtr for Windows. It is suitable for route tracing and ping tests in Windows. WinMTR sends ICMP packets for probing and cannot be switched to another protocol.

 

Compared with the tracert command-line tool, WinMTR is less affected by fluctuations at individual nodes, so its test results are more accurate. We recommend that you use WinMTR for link testing in Windows. After you download the WinMTR tool, run a test as follows.

  1. Download and decompress the WinMTR tool. Run the program and enter the domain name or IP address of the target server in the Host field.
  2. Click Start to start the test. After the test starts, the button changes to Stop.
  3. After a period of time, click Stop to stop the test.

    Note: You can run the test for several minutes. After the test is completed, export the results.

 

Description of common optional parameters

  • Copy Text to clipboard: copies the test results to the clipboard in text format.
  • Copy HTML to clipboard: copies the test results to the clipboard in HTML format.
  • Export TEXT: exports the test results to a specified file in text format.
  • Export HTML: exports the test results to a specified file in HTML format.
  • Options: optional parameters, which include the following.
    • Interval(sec): the interval (expiration time) of each probe. The default value is 1 second.
    • Ping size(bytes): the size of the data packet used for ping probes. The default value is 64 bytes.
    • Max hosts in LRU list: the maximum number of hosts supported by the LRU list. The default value is 128.
    • Resolve names: displays nodes by domain name, based on reverse lookup of their IP addresses.

 

Description of the WinMTR test results

The following list describes the columns of the WinMTR test results in the default configuration.

  • Column 1 (Hostname): the IP address or domain name of each node on the path to the target server.
  • Column 2 (Nr): the node number.
  • Column 3 (Loss%): the packet loss rate of the node, that is, the percentage of ping packets that were not returned. Use it to determine which node (line) has failed, for example the data center of the server or the international routing backbone.
  • Column 4 (Sent): the number of packets sent.
  • Column 5 (Recv): the number of packets successfully received.
  • Columns 6, 7, 8, and 9 (Best, Avg, Worst, and Last): the minimum, average, and maximum response times, and the response time of the last packet.
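
How these columns relate to the individual probes can be illustrated with a short Python sketch. It is not WinMTR's implementation. It assumes each probe to a node is recorded as a round-trip time in milliseconds, or None when no reply came back, and that Last means the most recent reply received. The function name summarize_probes is hypothetical.

```python
from typing import Dict, List, Optional


def summarize_probes(rtts_ms: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Derive Loss%, Sent, Recv, Best, Avg, Worst, and Last for one node.

    rtts_ms holds one entry per probe: a round-trip time in ms, or None
    if the probe got no reply.
    """
    sent = len(rtts_ms)
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    recv = len(replies)
    return {
        "Loss%": 100.0 * (sent - recv) / sent if sent else 0.0,
        "Sent": sent,
        "Recv": recv,
        "Best": min(replies) if replies else None,
        "Avg": sum(replies) / recv if replies else None,
        "Worst": max(replies) if replies else None,
        # Assumption: Last is the most recent reply that was received.
        "Last": replies[-1] if replies else None,
    }


if __name__ == "__main__":
    # Five probes to one node; two of them were lost.
    print(summarize_probes([3.0, None, 5.0, 4.0, None]))
```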

 

tracert command line tool

tracert (Trace Route) is a network diagnostic command-line utility included with Windows. It traces the path that an Internet Protocol (IP) packet takes to the target address.

 

tracert sends ICMP packets with different IP Time to Live (TTL) values to determine the route to the target address. Because each router along the way must reduce the TTL by at least 1 before forwarding a packet, the TTL is effectively a hop counter. When the TTL of a packet reaches zero, the node that holds it sends an ICMP timeout (Time Exceeded) message back to the source computer. tracert first sends a packet with a TTL of 1 and increases the TTL by 1 in each subsequent transmission, until the target address responds or the maximum TTL value is reached. The ICMP timeout messages returned by intermediate routers contain information about the corresponding nodes.
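
The TTL mechanism can be illustrated with a small simulation. The Python sketch below sends no packets. It models the path as a hypothetical list of node addresses (the last entry is the target) and assumes every node decrements the TTL by exactly 1. The function name simulate_tracert and the addresses other than 223.5.5.5 are illustrative.

```python
from typing import List, Tuple


def simulate_tracert(path: List[str], max_hops: int = 30) -> List[Tuple[int, str, str]]:
    """Simulate TTL-based route discovery over a known path.

    path lists the nodes in order; the last entry is the target.
    Returns one (ttl, node, reply_type) tuple per probe.
    """
    if not path:
        raise ValueError("path must contain at least the target")
    hops = []
    for ttl in range(1, max_hops + 1):
        # A probe with TTL n expires at the n-th node, unless the target
        # is reached first.
        index = min(ttl, len(path)) - 1
        is_target = index == len(path) - 1
        reply = "echo reply" if is_target else "time exceeded"
        hops.append((ttl, path[index], reply))
        if is_target:
            break  # the target answered, so the trace is complete
    return hops


if __name__ == "__main__":
    # Hypothetical three-node path ending at the target.
    for hop in simulate_tracert(["192.0.2.1", "198.51.100.1", "223.5.5.5"]):
        print(hop)
```

If max_hops is smaller than the path length, the loop ends without an echo reply, which corresponds to reaching the maximum TTL before the target responds.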

  1. Click the Start menu at the bottom of the desktop and select Run.
  2. In the Run dialog box, enter cmd and click OK.
  3. In the Command Prompt window, enter tracert and press Enter to display the tracert usage instructions.
  4. Enter tracert followed by the target address to be traced, as shown in the following example.

    C:\> tracert -d 223.5.5.5
    Tracing route to 223.5.5.5 over a maximum of 30 hops
    1 * * * Request timed out.
    2 9 ms 3 ms 12 ms 192.X.X.20
    3 4 ms 9 ms 2 ms 111.X.X.41
    4 9 ms 2 ms 1 ms 111.X.X.197
    5 11 ms * * 211.X.X.57
    6 3 ms 2 ms 2 ms 211.X.X.62
    7 2 ms 2 ms 1 ms 42.X.X.190
    8 32 ms 4 ms 3 ms 42.X.X.238
    9 * * * Request timed out.
    10 3 ms 2 ms 2 ms 223.5.5.5
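
Output in this shape can be turned into per-hop records for further analysis. The following Python sketch assumes the "tracert -d" layout shown above: a hop number, three samples that are each "N ms", "<N ms", or "*", and then an address or a timeout message. The function name parse_tracert is hypothetical, and the masked addresses are copied from the example.

```python
import re
from typing import Dict, List

# Hop number, three samples ("N ms", "<N ms", or "*"), then the remainder.
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+ ms|\*)\s+){3})(.*)$")


def parse_tracert(output: str) -> List[Dict]:
    """Parse hop lines of 'tracert -d' output into dictionaries."""
    hops = []
    for line in output.splitlines():
        match = HOP_RE.match(line)
        if match is None:
            continue  # header or blank line
        samples = re.findall(r"<?(\d+) ms|\*", match.group(2))
        rtts = [int(value) if value else None for value in samples]
        all_lost = all(rtt is None for rtt in rtts)
        hops.append({
            "hop": int(match.group(1)),
            "rtts_ms": rtts,
            # With three lost samples the remainder is a timeout message.
            "address": None if all_lost else match.group(3).strip(),
        })
    return hops


SAMPLE = """
Tracing route to 223.5.5.5 over a maximum of 30 hops
1 * * * Request timed out.
2 9 ms 3 ms 12 ms 192.X.X.20
5 11 ms * * 211.X.X.57
10 3 ms 2 ms 2 ms 223.5.5.5
"""

if __name__ == "__main__":
    for hop in parse_tracert(SAMPLE):
        print(hop)
```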

 

Analyze link test results

The following analysis is based on a link test result example.

  1. Determine whether exceptions exist in each zone and handle them according to the zone.
    • Zone A: the local network of the client, that is, the local area network and the network of the local network provider. If an exception occurs at a node in the client's local area network, troubleshoot and analyze the local network. If it occurs at a node in the local provider's network, report it to the local operator.

    • Zone B: the backbone network of the service provider. If an exception occurs in this zone, look up the operator by the IP address of the abnormal node, then report the problem to that operator directly or through Alibaba Cloud after-sales technical support.

    • Zone C: the local network of the target server, that is, the network of the provider that hosts the target server. If an exception occurs in this zone, report the problem to the network provider of the target host.

  2. Use Avg (average) and StDev (standard deviation) together to determine whether each node has an exception.

    • If StDev is high, also check the Best and Worst values of the node to determine whether it has an exception.
    • If StDev is not high, use Avg to determine whether the node has an exception.

      Note: There is no fixed threshold for whether StDev is high. Evaluate it relative to the other latency values of the same node. For example, if Avg is 30 ms, a StDev of 25 ms is a high deviation. If Avg is 325 ms, the same StDev of 25 ms is not a high deviation.

  3. Check the packet loss rate of each node. If Loss% is not 0, the network at this hop may be faulty. Packet loss at a node has two possible causes.

    • The node limits its ICMP response rate, which results in packet loss.
    • The node does have an exception, which results in packet loss.
  4. Determine the cause of packet loss on the abnormal node.

    • If no packet loss occurs on the subsequent nodes, the packet loss on the current node is caused by the carrier's rate-limiting policy and can be ignored, as with the second hop in the link test result example.

    • If packet loss also occurs on the subsequent nodes, the current node has a network exception that causes the packet loss, as with the fifth hop in the link test result example.

      Note: Both situations may occur at the same time, that is, a node has both a rate-limiting policy and a network exception. In this case, if packet loss occurs continuously on the current node and the nodes after it, and the loss rates of these nodes differ, the loss rate of the last few hops is generally used as the reference. In the link test result example, packet loss occurs on hops 5, 6, and 7, so the 40% loss rate of hop 7 is used as the reference.

  5. Analyze latency from the following two aspects.

    • If latency increases sharply at a hop, the node generally has a network exception. In the link test result example, latency increases sharply on the nodes after hop 5, which indicates a network exception at hop 5.

      Note: High latency does not necessarily mean that the node has an exception. It may also be caused by the return path of the data. We recommend that you analyze it together with a reverse link test.

    • An ICMP rate-limiting policy may also cause a sharp latency increase at a node, but subsequent nodes usually return to normal. In the link test result example, hop 3 has a packet loss rate of 100% and a sharp latency increase, but latency returns to normal on the next node. The latency increase and packet loss at hop 3 are therefore attributed to rate limiting.
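
The StDev and packet loss rules above can be sketched in Python as a rough first-pass filter. This is an illustration under stated assumptions, not an Alibaba Cloud tool: the 0.5 ratio that marks a deviation as high is an arbitrary choice (the text gives no fixed threshold), "subsequent nodes" is simplified to the next hop only, and loss on the final hop is always treated as a possible fault. The function names are hypothetical.

```python
from typing import List


def deviation_is_high(avg_ms: float, stdev_ms: float, ratio: float = 0.5) -> bool:
    """Judge StDev relative to Avg; the 0.5 ratio is an assumed cutoff."""
    return avg_ms > 0 and stdev_ms / avg_ms >= ratio


def classify_loss(loss_pcts: List[float]) -> List[str]:
    """Label each hop by whether its loss continues on the next hop.

    Simplification: only the immediately following hop is checked.
    """
    labels = []
    for i, loss in enumerate(loss_pcts):
        if loss == 0:
            labels.append("ok")
        elif i + 1 < len(loss_pcts) and loss_pcts[i + 1] == 0:
            # Loss disappears on the next hop: likely ICMP rate limiting.
            labels.append("likely rate limit")
        else:
            # Loss continues, or this is the final hop.
            labels.append("possible fault")
    return labels


if __name__ == "__main__":
    print(deviation_is_high(30, 25))   # Avg 30 ms, StDev 25 ms
    print(deviation_is_high(325, 25))  # Avg 325 ms, StDev 25 ms
    # Hypothetical Loss% values for a seven-hop path.
    print(classify_loss([0, 100, 0, 0, 10, 20, 40]))
```

Hops labeled as a possible fault still need a reverse link test before the node is reported, because the loss or latency may come from the return path.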

 

Suggestions

  • If 100% packet loss occurs at the target address, we recommend that you check the security policy configuration of the target server.
  • If data packets cannot reach the target server because of a routing loop, we recommend that you contact the owner of the corresponding node.
  • If no responses are returned after a certain hop, we recommend that you confirm the result with a reverse link test and contact the operator of the corresponding node.
  • Alibaba Cloud provides leased lines for network communication between its data centers in mainland China and those in other countries or regions. To reduce packet loss during communication, we recommend that you use Express Connect.
  • If packets are dropped and the latency is very high on the host, we recommend that you perform a two-way WinMTR test, that is, a local-to-server test and a server-to-local test. If you cannot log on remotely, log on through the management terminal.

 

Application scope

  • ECS