
Link testing tool for ping packet loss or ping failure

Last Updated: Mar 05, 2018

When a client accesses a target server and ping packet loss occurs or the server cannot be pinged, you can use tools such as traceroute (or TRACERT) and MTR (or WinMTR) to test the link and locate the problem. This document introduces four link testing tools, describes the testing procedure, and explains how to analyze the testing results.

Traceroute command line tool

Traceroute is a network testing tool pre-installed on almost all versions of Linux. It traces the path that IP data packets take to a target IP address.

Traceroute first sends UDP testing packets with a small Time To Live (TTL) value, and then listens for the ICMP TIME_EXCEEDED responses returned by each node on the link, starting from the gateway. The testing starts at TTL=1 and continues with an increasing TTL value until traceroute receives an ICMP PORT_UNREACHABLE message, which indicates that the target host has been reached, or until the maximum TTL value (Max_TTL) for tracing the path is reached.

Traceroute sends UDP data packets for link testing by default. You can set the -I parameter so that traceroute sends an ICMP data packet for link testing.
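The TTL-based probing described above can be illustrated with a small simulation. This is only a sketch that sends no real packets: the function name `simulate_traceroute` and the `path` list (routers first, target last) are hypothetical and exist only for illustration.

```python
from typing import List, Tuple


def simulate_traceroute(path: List[str], max_ttl: int = 30) -> List[Tuple[int, str, str]]:
    """Simulate TTL-limited probing along `path` (routers first, target last).

    Returns one (ttl, responder, icmp_message) tuple per probing round.
    """
    results = []
    for ttl in range(1, max_ttl + 1):
        if ttl > len(path):
            break  # nothing left to probe on the hypothetical path
        responder = path[ttl - 1]
        if ttl < len(path):
            # An intermediate router sees the TTL reach zero and reports it.
            results.append((ttl, responder, "TIME_EXCEEDED"))
        else:
            # The probe reached the target, whose UDP port is not listening.
            results.append((ttl, responder, "PORT_UNREACHABLE"))
            break
    return results


hops = simulate_traceroute(["192.168.17.20", "111.1.20.41", "223.5.5.5"])
for ttl, responder, message in hops:
    print(ttl, responder, message)
```

If `max_ttl` is smaller than the path length, the simulation stops before the target answers, which mirrors the case where the maximum TTL value is reached first.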

Usage:

    traceroute [-I] [ -m Max_ttl ] [ -n ] [ -p Port ] [ -q Nqueries ] [ -r ] [ -s SRC_Addr ] [ -t TypeOfService ] [ -f First_ttl ] [ -v ] [ -w WaitTime ] Host [ PacketSize ]

Output example:

    [root@centos ~]# traceroute -I 223.5.5.5
    traceroute to 223.5.5.5 (223.5.5.5), 30 hops max, 60 byte packets
     1  * * *
     2  192.168.17.20 (192.168.17.20)  3.965 ms  4.252 ms  4.531 ms
     3  111.1.20.41 (111.1.20.41)  6.109 ms  6.574 ms  6.996 ms
     4  111.1.34.197 (111.1.34.197)  2.407 ms  2.451 ms  2.533 ms
     5  211.138.114.25 (211.138.114.25)  1.321 ms  1.285 ms  1.304 ms
     6  211.138.114.70 (211.138.114.70)  2.417 ms 211.138.114.66 (211.138.114.66)  1.857 ms 211.138.114.70 (211.138.114.70)  2.002 ms
     7  42.120.244.194 (42.120.244.194)  2.570 ms  2.536 ms 42.120.244.186 (42.120.244.186)  1.585 ms
     8  42.120.244.246 (42.120.244.246)  2.706 ms  2.666 ms  2.437 ms
     9  * * *
    10  public1.alidns.com (223.5.5.5)  2.817 ms  2.676 ms  2.401 ms

Common parameter description:

| Parameter | Description |
| --- | --- |
| -d | Enables socket-level debugging. |
| -f | Specifies the TTL value of the first testing packet. |
| -F | Sets the Don't Fragment bit on testing packets. |
| -g | Specifies the source routing gateway. You can specify up to eight source routing gateways. |
| -i | Specifies the network adapter used to send data packets when a host has multiple network adapters. |
| -I | Uses ICMP data packets instead of UDP data packets for link testing. |
| -m | Specifies the maximum TTL value for testing packets. |
| -n | Displays IP addresses rather than host names, with reverse DNS lookup disabled. |
| -p | Specifies the UDP communication port. |
| -r | Bypasses the normal routing table and sends data packets directly to a remote host. |
| -s | Specifies the source IP address of data packets sent from the local host. |
| -t | Specifies the TOS value of testing packets. |
| -v | Displays the command execution process in detail. |
| -w | Specifies how long to wait for a response from the remote host. |
| -x | Enables or disables packet correctness testing. |

MTR command line tool

My traceroute (MTR) is a network testing tool pre-installed on almost all Linux versions. It integrates the ping and traceroute functions in one more powerful tool.

MTR sends ICMP data packets for link testing by default. You can set the -u parameter to make MTR send UDP data packets for link testing.

Unlike traceroute, which tests the link only once, MTR continuously tests the nodes on the link and provides the corresponding statistics. MTR is therefore highly recommended, because it reduces the influence of node fluctuation and gives a more accurate testing result.

Usage:

    mtr [-hvrctglspni46] [--help] [--version] [--report]
        [--report-cycles=COUNT] [--curses] [--gtk]
        [--raw] [--split] [--no-dns] [--address interface]
        [--psize=bytes/-s bytes]
        [--interval=SECONDS] HOSTNAME [PACKETSIZE]

Output example:

    [root@centos ~]# mtr 223.5.5.5
                                My traceroute [v0.75]
    mycentos6.6 (0.0.0.0)                          Wed Jun 15 23:16:27 2016
    Keys: Help  Display mode  Restart statistics  Order of fields  quit
                                  Packets               Pings
         Host                    Loss%  Snt   Last    Avg   Best   Wrst  StDev
      1. ???
      2. 192.168.17.20            0.0%    7   13.1    5.6    2.1   14.7    5.7
      3. 111.1.20.41              0.0%    7    3.0   99.2    2.7  632.1  235.4
      4. 111.1.34.197             0.0%    7    1.8    2.0    1.2    2.9    0.6
      5. 211.138.114.25           0.0%    6    0.9    4.7    0.9   13.9    5.8
      6. 211.138.114.70           0.0%    6    1.8   22.8    1.8   50.8   23.6
         211.138.128.134
         211.138.114.2
         211.138.114.66
      7. 42.120.244.186           0.0%    6    1.4    1.6    1.3    1.8    0.2
         42.120.244.198
      8. 42.120.244.246           0.0%    6    2.8    2.9    2.6    3.2    0.2
         42.120.244.242
      9. ???
     10. 223.5.5.5                0.0%    6    2.7    2.7    2.5    3.2    0.3

Common parameter description:

| Parameter | Description |
| --- | --- |
| -r, --report | Displays the output as a report. |
| -p, --split | Lists the result of each tracking round separately, unlike --report, which summarizes all results. |
| -s, --psize | Specifies the size of the ping data packets. |
| -n, --no-dns | Disables reverse domain name resolution of IP addresses. |
| -a, --address | Specifies the IP address for sending data packets when a host has multiple IP addresses. |
| -4 | Specifies to use only IPv4. |
| -6 | Specifies to use only IPv6. |

While MTR is running, you can press a key to switch to another mode. See the following table for reference:

| Key | Description |
| --- | --- |
| ? or h | Displays the help menu. |
| d | Switches between display modes. |
| n | Enables or disables domain name resolution. |
| u | Switches between ICMP and UDP data packets for link testing. |

Returned result description:

The following table describes each data column in the returned result for the default configuration:

| Column | Description |
| --- | --- |
| Col 1: Host | IP address or domain name of the node. Press n to switch between the IP address and the domain name, as described in the preceding table. |
| Col 2: Loss% | Packet loss rate on the node. |
| Col 3: Snt | Number of data packets sent to the node. In report mode, the default value is 10. You can specify it with the -c parameter. |
| Col 4: Last | The latest testing latency value. |
| Col 5: Avg | The average value of testing latency. |
| Col 6: Best | The minimum value of testing latency. |
| Col 7: Wrst | The maximum value of testing latency. |
| Col 8: StDev | Standard deviation. A higher standard deviation indicates a more unstable node. |
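To make the relationship between these columns concrete, the following sketch derives them from a list of round-trip times for a single node. It is an illustration only: the function name `hop_statistics` is hypothetical, `None` is used here to mark a lost packet, and the population standard deviation is an assumption (MTR's exact formula may differ).

```python
import math
from typing import Dict, List, Optional


def hop_statistics(samples: List[Optional[float]]) -> Dict[str, float]:
    """Summarize the probes sent to one node; None marks a lost packet."""
    received = [s for s in samples if s is not None]
    sent = len(samples)
    stats = {"Loss%": 0.0, "Snt": sent, "Last": 0.0, "Avg": 0.0,
             "Best": 0.0, "Wrst": 0.0, "StDev": 0.0}
    if sent:
        stats["Loss%"] = 100.0 * (sent - len(received)) / sent
    if received:
        avg = sum(received) / len(received)
        stats["Last"] = received[-1]
        stats["Avg"] = avg
        stats["Best"] = min(received)
        stats["Wrst"] = max(received)
        # Population standard deviation of the received latencies (assumption).
        variance = sum((s - avg) ** 2 for s in received) / len(received)
        stats["StDev"] = math.sqrt(variance)
    return stats


print(hop_statistics([2.0, 4.0, None, 6.0]))
```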

TRACERT command line tool

TRACERT (or Trace Route) is a Windows command line utility for network diagnosis. It tracks the path of an IP data packet sent to the target IP address.

TRACERT sends ICMP data packets to determine the route to the target address. It sends these packets with different IP TTL values. Because each router along the forwarding path must reduce the TTL by at least 1 before forwarding a data packet, the TTL is effectively a hop counter. When the TTL of a packet reaches zero, the corresponding node sends an ICMP “timeout” message to the source computer.

TRACERT first sends a packet whose TTL value is 1, and then increases the TTL value by 1 in each subsequent transmission until the destination responds or the maximum TTL value is reached. The ICMP “timeout” messages sent back by intermediate routers contain information about the corresponding nodes.

Usage:

    tracert [-d] [-h maximum_hops] [-j host-list] [-w timeout] [-R] [-S srcaddr] [-4] [-6] target_name

Output example:

    C:\> tracert -d 223.5.5.5
    Tracing route to 223.5.5.5 over a maximum of 30 hops
      1     *        *        *     Request timed out.
      2     9 ms     3 ms    12 ms  192.168.17.20
      3     4 ms     9 ms     2 ms  111.1.20.41
      4     9 ms     2 ms     1 ms  111.1.34.197
      5    11 ms     *        *     211.140.0.57
      6     3 ms     2 ms     2 ms  211.138.114.62
      7     2 ms     2 ms     1 ms  42.120.244.190
      8    32 ms     4 ms     3 ms  42.120.244.238
      9     *        *        *     Request timed out.
     10     3 ms     2 ms     2 ms  223.5.5.5
    Trace complete.

Common parameter description:

| Parameter | Description |
| --- | --- |
| -d | Specifies not to resolve IP addresses to host names, with reverse domain name resolution disabled. |
| -h maximum_hops | Specifies the maximum number of hops to search for the target IP address. |
| -j host-list | Specifies the loose source route along the host list. |
| -w timeout | Specifies the time to wait for each response. Unit: milliseconds. |
| -R | Traces the round-trip route, only for IPv6. |
| -S srcaddr | Specifies the source IP address to be used, only for IPv6. |
| -4 | Specifies to use only IPv4. |
| -6 | Specifies to use only IPv6. |
| target_name | The target host domain name or IP address. |

WinMTR tool

WinMTR is the graphical MTR tool for Windows. However, it only supports some MTR parameters. By default, WinMTR sends ICMP data packets for link testing. Download WinMTR from its official website.

Compared with tracert, WinMTR (like MTR) reduces the influence of node fluctuation, so the testing result is more accurate. Therefore, we recommend that you use WinMTR to test the link whenever it is available.

Usage:

  1. Decompress and start WinMTR without installation.

  2. Enter the domain name or IP address of the target server in the Host field after the program starts.

    Note: Do not enter a space before the domain name or IP address.

  3. Click Start to start the testing. (When the testing starts, the Start button changes to Stop.)

  4. Click Stop to stop the testing after a while.

  5. Description of other options:

    • Copy Text to Clipboard: Copy the testing result to the Clipboard in text format.
    • Copy HTML to Clipboard: Copy the testing result to the Clipboard in HTML format.
    • Export TEXT: Export the testing result to a specified file in text format.
    • Export HTML: Export the testing result to a specified file in HTML format.
    • Options: Optional parameters, including:
      • Interval (sec): Interval (expiration time) between tests. The default value is one second.
      • Ping size (bytes): Size of the data packet used in the ping testing. The default value is 64 bytes.
      • Max hosts in LRU list: Maximum number of hosts supported in the LRU list. The default value is 128.
      • Resolve names: Display nodes by domain name through reverse IP address lookup.

Returned result description:

The following table describes each data column in the returned result for the default configuration:

| Column | Description |
| --- | --- |
| Col 1: Hostname | IP address or domain name of the node. |
| Col 2: Nr | Node number. |
| Col 3: Loss% | Packet loss rate on the node. |
| Col 4: Sent | Number of data packets sent. |
| Col 5: Recv | Number of data packets received successfully. |
| Col 6: Avg | Average value of testing latency to the corresponding node. |
| Col 7: Best | Minimum value of testing latency to the corresponding node. |
| Col 8: Wrst | Maximum value of testing latency to the corresponding node. |
| Col 9: Last | Latest value of testing latency to the corresponding node. |
| Col 10: StDev | Standard deviation. A higher standard deviation indicates a more unstable node. |

Perform link testing in the following steps:

  1. Obtain the public IP address of the local network
  2. Test the link
  3. Test the link reversely
  4. Analyze the testing result

1. Obtain the public IP address of the local network

Access a website (such as http://www.howtofindmyipaddress.com/) from a client on the local network, to obtain the public network IP address of the local network.

2. Test the link

Ping the target server or use MTR to test the link from a client:

  1. Ping the domain name or IP address of the target server from the client continuously, sending at least 100 data packets.
  2. Record the testing result.
  3. Set the testing destination address in WinMTR or MTR to the domain name or IP address of the target server.
  4. Perform the link testing.
  5. Record the testing result.
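The commands for these steps can be assembled programmatically so that the same test is repeated consistently in both directions. The sketch below only builds the argument lists and does not run them. The helper names `ping_command` and `mtr_report_command` are hypothetical; the sketch assumes that `ping` takes the packet count with `-c` on Linux and `-n` on Windows, and that MTR accepts the `--report`, `--report-cycles`, and `--no-dns` options listed in the usage above.

```python
import platform
from typing import List


def ping_command(target: str, count: int = 100, system: str = "") -> List[str]:
    """Build a ping command that sends `count` packets to `target`."""
    system = system or platform.system()
    # Windows ping uses -n for the packet count; Linux ping uses -c.
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), target]


def mtr_report_command(target: str, cycles: int = 100) -> List[str]:
    """Build an MTR command that prints a report after `cycles` rounds."""
    return ["mtr", "--report", f"--report-cycles={cycles}", "--no-dns", target]


print(" ".join(ping_command("223.5.5.5", system="Linux")))
print(" ".join(mtr_report_command("223.5.5.5")))
```

To record the result, each list can be passed to `subprocess.run(..., capture_output=True, text=True)` and the captured output saved to a file.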

3. Test the link reversely

Access the target server system to ping the client or use MTR to test the link reversely:

  1. Ping the client's domain name or the IP address obtained in step 1 from the target server, sending at least 100 data packets.
  2. Record the testing result.
  3. Set the testing destination address on the WinMTR or MTR to the client IP address obtained from step 1.
  4. Perform the link testing.
  5. Record the testing result.

4. Analyze the testing result

See the following descriptions to analyze the testing result. After locating the exception node, query and find out the responsible carrier and network of the node.

If exceptions occur on the local network node of the client, you must analyze and troubleshoot the local network. If exceptions occur on the carrier’s node, directly report the problem to the responsible carrier or contact Alibaba Cloud Technical Support.

Link testing result analysis

Take link testing by MTR (WinMTR) as an example. The analysis is based on the following Sample of the link testing result:

[Figure: Sample of the link testing result in WinMTR]

Focus on the following items when you analyze the link testing result:

Network areas

Under normal conditions, the whole link connecting a client to a target server consists mainly of the following areas:

  • Local network of the client: Area A in the Sample of the link testing result. It refers to the LAN and the network of the local network provider. If exceptions occur on a node of the client's own local network, you must analyze and troubleshoot the local network. If exceptions occur on a node of the local carrier's network, you must report the problem to the local carrier.
  • Carrier’s backbone network: Area B in the Sample of the link testing result. If exceptions occur in this area, you can identify the responsible carrier based on the IP address of the exception node, and report the problem to that carrier.
  • Local network of the target server: Area C in the Sample of the link testing result. It refers to the carrier’s network of the target host. If exceptions occur in this area, report the problem to the carrier of the target host network.

Link load balancing

Area D in the Sample of the link testing result. If link load balancing is enabled on some parts of the intermediate link, MTR numbers, tests, and collects statistics only for the first and last nodes of the load-balanced section. For the nodes in between, MTR displays only the corresponding IP addresses or domain names.

Evaluation based on both Avg and StDev

  • The Best and Wrst values of a node may vary greatly because of link jitter or other factors.

  • The Avg value is the average latency over all the tests, so it reflects the network quality on the corresponding node better.

  • The StDev value indicates how widely the packet latency values on the corresponding node are spread. It tells you whether the Avg value actually reflects the network quality on that node. For example, if the standard deviation is large, the data packet latency is uncertain. The latency of some data packets may be small (for example, 25 ms), while the latency of other data packets may be large (for example, 350 ms), yet the average latency may still look normal. In that case, the Avg value does not reflect the actual network quality well.

  • We recommend the following analysis criteria:

    • If the StDev value is large, check whether the Best and Wrst values on the corresponding node are also large, and evaluate whether exceptions occur on the node.
    • If the StDev value is not large, use the Avg value to evaluate whether exceptions occur on the node.

      Note: Whether the StDev value is large is judged relative to the latency values listed in the other columns for the same node. For example, if the Avg value is 30 ms, a StDev value of 25 ms is a large standard deviation. However, if the Avg value is 325 ms, the same StDev value of 25 ms is not a large standard deviation.
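The criteria above can be expressed as a small helper. This is a sketch rather than a fixed rule: the function names are hypothetical, and the ratio threshold of 0.5 is an assumption chosen only so that the two examples in the note fall on opposite sides of it.

```python
def stdev_is_large(avg_ms: float, stdev_ms: float, ratio_threshold: float = 0.5) -> bool:
    """Judge StDev relative to Avg for the same node (threshold is an assumption)."""
    if avg_ms <= 0:
        return stdev_ms > 0
    return stdev_ms / avg_ms >= ratio_threshold


def columns_to_check(avg_ms: float, stdev_ms: float) -> str:
    """Return which columns to rely on when evaluating a node."""
    if stdev_is_large(avg_ms, stdev_ms):
        return "Best and Wrst"
    return "Avg"


print(columns_to_check(30.0, 25.0))   # StDev is large relative to Avg
print(columns_to_check(325.0, 25.0))  # StDev is small relative to Avg
```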

Evaluation based on packet loss rate

If Loss% (packet loss rate) on a node is not zero, there may be exceptions in this hop on the network. Packet loss on a node usually has one of the following causes:

  • The carrier deliberately limits the ICMP transmission rate on the node for security or performance reasons, resulting in packet loss.
  • Exceptions occur on the node, resulting in packet loss.

You can determine the cause of packet loss by checking whether the loss also appears on the nodes that follow the exception node:

  • If packet loss does not occur on the subsequent nodes, the packet loss on the exception node is caused by the carrier’s speed limit policy and can be ignored, as shown in the Sample of the link testing result.
  • If packet loss also occurs on the subsequent nodes, network exceptions occur on the exception node, as shown by the fifth hop in the Sample of the link testing result.

Note: Both causes may be present at the same time. That is, both the speed limit policy and network exceptions may cause packet loss on the corresponding node. In this case, if packet loss occurs consecutively on the exception node and the subsequent nodes, and the packet loss rate differs from node to node, the packet loss rate in the last few hops is usually taken as the reference. In the Sample of the link testing result, packet loss occurs in the fifth, sixth, and seventh hops, so the packet loss rate of 40% in the seventh hop is taken as the reference.
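As a rough illustration of this reasoning, the sketch below labels each lossy hop according to whether loss continues on later hops, and takes the last hop's loss rate as the reference value. The function names are hypothetical, and the loss values of 60% and 50% in the second example are invented for illustration (only the 40% at the seventh hop comes from the text above).

```python
from typing import Dict, List


def classify_loss(loss_by_hop: List[float]) -> Dict[int, str]:
    """Label each hop (1-based) that shows packet loss."""
    findings = {}
    for index, loss in enumerate(loss_by_hop):
        if loss <= 0:
            continue
        later = loss_by_hop[index + 1:]
        if not later:
            findings[index + 1] = "last hop, no later node to compare"
        elif all(value <= 0 for value in later):
            findings[index + 1] = "likely ICMP speed limit"
        else:
            findings[index + 1] = "possible network exception"
    return findings


def reference_loss(loss_by_hop: List[float]) -> float:
    """Use the loss rate of the final hop as the reference value."""
    return loss_by_hop[-1] if loss_by_hop else 0.0


print(classify_loss([0, 0, 0, 0, 60.0, 0, 0, 0]))
print(classify_loss([0, 0, 0, 0, 60.0, 50.0, 40.0]), reference_loss([0, 0, 0, 0, 60.0, 50.0, 40.0]))
```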

Latency

Latency jump

If the latency increases sharply after a certain hop, it usually indicates network exceptions on that node. In the Sample of the link testing result, the latency increases sharply after the fifth hop, which indicates that network exceptions occur on the node of the fifth hop.

However, a high latency does not necessarily mean exceptions on the corresponding node. In the Sample of the link testing result, although the latency increases sharply after the fifth hop, the testing packets still reach the target host normally in the end. The high latency may therefore be introduced on the return path, when the response is sent back, so you must also test the link reversely to analyze the problem.

ICMP speed limit

An ICMP speed limit policy may also cause the latency to increase sharply, but the latency usually returns to normal on the subsequent nodes. In the Sample of the link testing result, the packet loss rate is 100% in the third hop and the latency increases sharply at the same time. However, the latency returns to normal on the subsequent nodes. Therefore, we conclude that the speed limit policy is the cause of the sharp increase in latency and the packet loss on that node.
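The difference between a sustained latency jump and a spike on a single node can be sketched as follows. The function name is hypothetical, the 100 ms threshold is an assumption, and the latency lists are invented values for illustration. The sketch also assumes every hop reported an Avg value.

```python
from typing import List, Optional


def find_sustained_latency_jump(avg_by_hop: List[float], jump_ms: float = 100.0) -> Optional[int]:
    """Return the 1-based hop where latency rises by at least `jump_ms`
    and stays that high on every later hop, or None if there is no such hop."""
    for index in range(1, len(avg_by_hop)):
        baseline = avg_by_hop[index - 1]
        if avg_by_hop[index] - baseline < jump_ms:
            continue
        # A spike that disappears on later hops points to ICMP speed limiting.
        if all(value - baseline >= jump_ms for value in avg_by_hop[index:]):
            return index + 1
    return None


print(find_sustained_latency_jump([2.0, 3.0, 4.0, 5.0, 180.0, 185.0, 190.0]))
print(find_sustained_latency_jump([2.0, 3.0, 250.0, 4.0, 5.0]))
```

A returned hop number marks where further analysis (including a reverse test) should start; `None` means any spike was confined to individual nodes.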

The following are common link exception scenarios and their analysis:

Improper configuration of the target host network

Data sample:

    [root@mycentos6 ~]# mtr --no-dns www.google.com
                                My traceroute [v0.75]
    mycentos6.6 (0.0.0.0)                          Wed Jun 15 19:06:29 2016
    Keys: Help  Display mode  Restart statistics  Order of fields  quit
                                  Packets               Pings
         Host                    Loss%  Snt   Last    Avg   Best   Wrst  StDev
      1. ???
      2. ???
      3. 111.1.20.41              0.0%   10  521.3   90.1    2.7  521.3  211.3
      4. 111.1.34.209             0.0%   10    2.9    4.7    1.6   10.6    3.9
      5. 211.138.126.29          80.0%   10    3.0    3.0    3.0    3.0    0.0
      6. 221.183.14.85            0.0%   10    1.7    7.2    1.6   34.9   13.6
      7. 221.183.10.5             0.0%   10    5.2    5.2    5.1    5.2    0.0
         221.183.11.5
      8. 221.183.23.26            0.0%   10    5.3    5.2    5.1    5.3    0.1
      9. 173.194.200.105        100.0%   10    0.0    0.0    0.0    0.0    0.0

In this example, 100% packet loss occurs at the destination IP address. At first glance it looks as if the packets fail to arrive, but in fact ICMP is blocked by a firewall or by iptables rules on the target server, so the destination host cannot send any response.

Therefore, you must check the security policy configured on the target server.
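A pattern like this can be checked automatically once the MTR output has been parsed. The following sketch assumes hop lines in the form "number. host loss% sent last avg best wrst stdev"; the `Hop` type, the regular expression, and the function names are hypothetical, and lines without statistics (such as "???") are simply skipped.

```python
import re
from typing import List, NamedTuple


class Hop(NamedTuple):
    number: int
    host: str
    loss: float
    sent: int
    avg: float


HOP_LINE = re.compile(
    r"^\s*(\d+)\.\s+(\S+)\s+([\d.]+)%\s+(\d+)"
    r"\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s*$"
)


def parse_hops(report: str) -> List[Hop]:
    """Parse the hop lines that carry statistics; other lines are ignored."""
    hops = []
    for line in report.splitlines():
        match = HOP_LINE.match(line)
        if match:
            number, host, loss, sent, _last, avg = match.groups()[:6]
            hops.append(Hop(int(number), host, float(loss), int(sent), float(avg)))
    return hops


def only_target_unreachable(hops: List[Hop]) -> bool:
    """True if the final hop loses everything while the hop before it responds."""
    if len(hops) < 2:
        return False
    return hops[-1].loss == 100.0 and hops[-2].loss < 100.0


SAMPLE = """\
  7. 221.183.10.5             0.0%   10    5.2    5.2    5.1    5.2    0.0
     221.183.11.5
  8. 221.183.23.26            0.0%   10    5.3    5.2    5.1    5.3    0.1
  9. 173.194.200.105        100.0%   10    0.0    0.0    0.0    0.0    0.0
"""
sample_hops = parse_hops(SAMPLE)
print(only_target_unreachable(sample_hops))
```

A positive result only suggests that the target's security policy should be checked; it cannot by itself distinguish a firewall from a host that is down.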

ICMP speed limit

Data sample:

    [root@mycentos6 ~]# mtr --no-dns www.google.com
                                My traceroute [v0.75]
    mycentos6.6 (0.0.0.0)                          Wed Jun 15 19:06:29 2016
    Keys: Help  Display mode  Restart statistics  Order of fields  quit
                                  Packets               Pings
         Host                    Loss%  Snt   Last    Avg   Best   Wrst  StDev
      1. 63.247.74.43             0.0%   10    0.3    0.6    0.3    1.2    0.3
      2. 63.247.64.157            0.0%   10    0.4    1.0    0.4    6.1    1.8
      3. 209.51.130.213           0.0%   10    0.8    2.7    0.8   19.0    5.7
      4. aix.pr1.atl.google.com   0.0%   10    6.7    6.8    6.7    6.9    0.1
      5. 72.14.233.56            60.0%   10   27.2   25.3   23.1   26.4    2.9
      6. 209.85.254.247           0.0%   10   39.1   39.4   39.1   39.7    0.2
      7. 64.233.174.46            0.0%   10   39.6   40.4   39.4   46.9    2.3
      8. gw-in-f147.1e100.net     0.0%   10   39.6   40.5   39.5   46.7    2.2

In this example, obvious packet loss occurs in the fifth hop, but no exceptions occur on the subsequent nodes. Therefore, an ICMP speed limit on the node is the likely cause of the packet loss.

Data transmission from the client to the target is not affected, so you do not need to analyze it further.

Loop

Data sample:

    [root@mycentos6 ~]# mtr --no-dns www.google.com
                                My traceroute [v0.75]
    mycentos6.6 (0.0.0.0)                          Wed Jun 15 19:06:29 2016
    Keys: Help  Display mode  Restart statistics  Order of fields  quit
                                  Packets               Pings
         Host                    Loss%  Snt   Last    Avg   Best   Wrst  StDev
      1. 63.247.74.43             0.0%   10    0.3    0.6    0.3    1.2    0.3
      2. 63.247.64.157            0.0%   10    0.4    1.0    0.4    6.1    1.8
      3. 209.51.130.213           0.0%   10    0.8    2.7    0.8   19.0    5.7
      4. aix.pr1.atl.google.com   0.0%   10    6.7    6.8    6.7    6.9    0.1
      5. 72.14.233.56             0.0%   10    0.0    0.0    0.0    0.0    0.0
      6. 72.14.233.57             0.0%   10    0.0    0.0    0.0    0.0    0.0
      7. 72.14.233.56             0.0%   10    0.0    0.0    0.0    0.0    0.0
      8. 72.14.233.57             0.0%   10    0.0    0.0    0.0    0.0    0.0
      9. ???                      0.0%   10    0.0    0.0    0.0    0.0    0.0

In this example, a loop occurs after the fifth hop, so that data packets cannot arrive at the target server. Abnormal routing configurations on the carrier’s node may cause this exception.

Therefore, you must contact the carrier responsible for the node.
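A loop of this kind shows up as the same address appearing at more than one hop. The sketch below looks for the first repeated address in a list of hop hosts; the function name is hypothetical, and "???" entries are ignored because they identify no node.

```python
from typing import Dict, List, Optional, Tuple


def find_routing_loop(hosts: List[str]) -> Optional[Tuple[int, int]]:
    """Return 1-based (first_hop, repeat_hop) for the first address seen twice."""
    first_seen: Dict[str, int] = {}
    for number, host in enumerate(hosts, start=1):
        if host == "???":
            continue  # unanswered hops carry no address to compare
        if host in first_seen:
            return first_seen[host], number
        first_seen[host] = number
    return None


loop_hosts = [
    "63.247.74.43", "63.247.64.157", "209.51.130.213", "aix.pr1.atl.google.com",
    "72.14.233.56", "72.14.233.57", "72.14.233.56", "72.14.233.57", "???",
]
print(find_routing_loop(loop_hosts))
```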

Link interruption

Data sample:

    [root@mycentos6 ~]# mtr --no-dns www.google.com
                                My traceroute [v0.75]
    mycentos6.6 (0.0.0.0)                          Wed Jun 15 19:06:29 2016
    Keys: Help  Display mode  Restart statistics  Order of fields  quit
                                  Packets               Pings
         Host                    Loss%  Snt   Last    Avg   Best   Wrst  StDev
      1. 63.247.74.43             0.0%   10    0.3    0.6    0.3    1.2    0.3
      2. 63.247.64.157            0.0%   10    0.4    1.0    0.4    6.1    1.8
      3. 209.51.130.213           0.0%   10    0.8    2.7    0.8   19.0    5.7
      4. aix.pr1.atl.google.com   0.0%   10    6.7    6.8    6.7    6.9    0.1
      5. ???                      0.0%   10    0.0    0.0    0.0    0.0    0.0
      6. ???                      0.0%   10    0.0    0.0    0.0    0.0    0.0
      7. ???                      0.0%   10    0.0    0.0    0.0    0.0    0.0
      8. ???                      0.0%   10    0.0    0.0    0.0    0.0    0.0
      9. ???                      0.0%   10    0.0    0.0    0.0    0.0    0.0

In this example, no responses are returned after the fourth hop. This is usually due to a link interruption on the corresponding node. We recommend that you also test the link reversely to troubleshoot the problem.

You must contact the carrier responsible for the node.
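To spot this pattern in a list of hop hosts, it is enough to find the last hop that answered and count the silent hops after it. The sketch below is illustrative only: the function names are hypothetical, and the minimum of three silent hops is an assumption that avoids flagging a single unanswered hop in the middle of a healthy link.

```python
from typing import List, Optional


def last_responding_hop(hosts: List[str]) -> Optional[int]:
    """Return the 1-based number of the final hop that answered, or None."""
    for number in range(len(hosts), 0, -1):
        if hosts[number - 1] != "???":
            return number
    return None


def link_interrupted_after(hosts: List[str], min_silent_hops: int = 3) -> Optional[int]:
    """Return the last responding hop if at least `min_silent_hops` silent hops follow it."""
    last = last_responding_hop(hosts)
    if last is None:
        return None
    silent = len(hosts) - last
    return last if silent >= min_silent_hops else None


interrupted = ["63.247.74.43", "63.247.64.157", "209.51.130.213",
               "aix.pr1.atl.google.com", "???", "???", "???", "???", "???"]
print(link_interrupted_after(interrupted))
```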
