When you encounter packet loss or connectivity issues while using the ping command to test network connectivity between a client and a server or a Server Load Balancer (SLB) instance, network testing tools can help diagnose the problem. This topic describes how to use these tools to test a network path and offers insights for analyzing the results.
Test procedure
The flowchart below illustrates the process for testing a network path.
- You can visit websites such as IP address query to find the public IP address of the local network.
- The 'target server' denotes the domain name or public IP address of the target service.
Introduction to tools
MTR is a network diagnostic tool that combines the capabilities of ping and traceroute. Unlike traceroute, which conducts a single path trace, mtr continuously tests the path's nodes and provides statistical information. This continuous probing helps mtr deliver more accurate results by mitigating the effects of node fluctuations.
On a Linux system, you can use mtr by installing the mtr package. On a Windows system, you can use WinMTR. For details on installation and usage, see the sections below.
mtr (Linux)
Installation instructions
CentOS 6/7/8
Ubuntu/Debian
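On both distribution families the package is normally named mtr. The following is a minimal sketch that wraps the usual package manager commands. The function name install_mtr and its family argument are hypothetical helpers for this example and are not part of mtr or of either package manager.

```shell
# install_mtr FAMILY: install the mtr package with that family's package manager.
# install_mtr is a hypothetical helper name used only for this example.
install_mtr() {
    case "$1" in
        centos)        sudo yum install -y mtr ;;      # CentOS 6/7/8
        ubuntu|debian) sudo apt-get install -y mtr ;;  # Ubuntu/Debian
        *)
            echo "usage: install_mtr centos|ubuntu|debian" >&2
            return 2
            ;;
    esac
}
```

For example, run install_mtr centos on a CentOS host, and then run mtr --version to confirm that the tool is available.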
Usage introduction
Command format
The mtr command format is as follows, where hostname represents the service's domain name and ip denotes the service's public IP address.
mtr [options] hostname/ip
Parameter description
Below is a description of common optional parameters for mtr. For a more extensive list, run the man mtr command.
Optional parameters | Parameter description |
-r or --report | Put mtr into report mode. |
-p or --split | Set mtr to display output in a format that is suitable for a split-user interface. |
-s or --psize | Specify the size of each ping packet. |
-n or --no-dns | Configure mtr to display numeric IP addresses and not resolve host names. |
-a or --address | Configure the source IP address for packets. Note: This parameter is used in scenarios in which a host has multiple IP addresses. |
-4 | Use IPv4 only. |
-6 | Use IPv6 only. |
By default, the mtr command starts in interactive mode. Press the ? or h key to display the help menu, and then control mtr or change the display view as described there.
Example
Diagnosing network issues with IPv4.
sudo mtr -4 www.aliyun.com
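When the results need to be saved or shared, report mode is often more convenient than the interactive view. The sketch below assumes the standard -r (report mode), -n (numeric addresses), and -c (number of probes per hop) options. The wrapper name mtr_report and the default of 10 probes are choices made for this example only.

```shell
# mtr_report TARGET [COUNT]: probe TARGET over IPv4 and print a one-off report.
# mtr_report is a hypothetical wrapper name used only for this example.
mtr_report() {
    target="$1"
    count="${2:-10}"   # assumed default: 10 probes per hop
    # -4 IPv4 only, -r report mode, -n numeric addresses, -c probes per hop
    sudo mtr -4 -r -n -c "$count" "$target"
}
```

For example, mtr_report www.aliyun.com 20 sends 20 probes to each hop and prints the summary once the run completes.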
Sample test results returned by mtr
For instance, running the mtr command against a target IP address yields the following results:
The table below explains the command output parameters based on the default settings.
Parameter | Parameter description |
Host | The IP address and domain name of the node. You can press the n key to switch the display between the two. |
Loss% | The packet loss rate of the node. |
Snt | The number of packets that are sent. Default value: 10. You can specify the value by using the -c parameter. |
Last | The latency of the last probe. |
Avg | The average latency of all probes. |
Best | The lowest latency of all probes. |
Wrst | The highest latency of all probes. |
StDev | The standard deviation. A larger value indicates a larger difference between the response times for data packets on the node. |
WinMTR (Windows)
Installation instructions
WinMTR does not require installation. Simply decompress the downloaded package and launch WinMTR by following these steps:
- Download WinMTR from the official website.
- Extract the WinMTR package and double-click to run WinMTR.
Usage introduction
- In the Host field, enter the domain name or IP address of the target server.
  Important: Ensure there are no spaces in the IP address or domain name entered.
- Configure additional features or parameters as needed. The table below describes these features and parameters.
Feature or parameter | Parameter description |
Copy Text to clipboard | Copy the test results to the clipboard in the text format. |
Copy HTML to clipboard | Copy the test results to the clipboard in the HTML format. |
Export TEXT | Export the test results to a specified file in the text format. |
Export HTML | Export the test results to a specified file in the HTML format. |
Options | The options, including the following: Interval (sec): the interval (expiration) time of each probe, 1 second by default. Ping Size (bytes): the size of the packet used for the ping probe, 64 bytes by default. Max. hosts in LRU list: the maximum number of hosts supported by the LRU list, 128 by default. Resolve names: display the relevant nodes by domain name through reverse lookup of their IP addresses. |
- Click Start to initiate the test.
  Once the test begins, Start changes to Stop, and WinMTR automatically presents the test results.
- To end the test after a sufficient duration, click Stop.
Sample test results returned by WinMTR
For example, using WinMTR to test a destination server's domain name displays the following results.
The table below explains the test result parameters based on the default settings.
Parameter | Parameter description |
Hostname | The IP address or domain name of the node. |
Nr | The number of the node. |
Loss% | The packet loss rate of the node. |
Sent | The number of packets that are sent. |
Recv | The number of packets that are received. |
Best | The lowest latency of all probes. |
Avg | The average latency of all probes. |
Worst | The highest latency of all probes. |
Last | The latency of the previous probe. |
StDev | The standard deviation. A higher value indicates a larger difference between the response times for data packets on the node. |
Result analysis guide
Because mtr probes the path continuously, its results are more accurate than a single trace. This section therefore analyzes test results produced by the mtr command. Below is a sample output from an mtr command.
Network area description
Typically, the network path from a client to a destination server traverses the following networks. For details on these networks and troubleshooting tips, see the section below.
- Client's local network
  The client's local network includes a LAN and the networks of local carriers, as depicted in Region A in the sample output. Issues in this region can be categorized as follows:
  - For LAN node issues, inspect and address problems within the LAN.
  - For issues on a local carrier's node, report the problem to the carrier.
- Carrier networks
  The path crosses multiple carriers' backbone networks, as shown in Region B. If an issue arises on a carrier's node, identify the carrier by querying the node's IP address. Then, contact the carrier or Alibaba Cloud technical support for assistance.
- Destination server's local network
  The destination server is located within a carrier's network, as indicated by Region C. Report any node issues in this region to the respective carrier.
If load balancing is active on certain segments of the path, the mtr command only numbers and tests the first and last nodes. For intermediate nodes, it only shows the IP address or domain name.
Metric-based analysis
To assess network path connectivity or performance, consider a comprehensive analysis using metrics such as Loss% (packet loss rate), Avg (average), StDev (standard deviation), and latency. The following section provides a basic approach to analyzing path connectivity using these metrics.
Loss% (packet loss rate)
A non-zero Loss% at a node suggests a potential network issue at that hop. Possible reasons for packet loss include the following:
- The node's ICMP traffic may be limited by the carrier for security or performance reasons.
- An issue on the node is causing packet loss.
To pinpoint the cause, check whether subsequent nodes also experienced packet loss:
- If subsequent nodes show no packet loss, the loss is likely due to the carrier's ICMP rate limiting. This type of packet loss, as seen in the second hop of the sample output, can generally be disregarded.
- If all subsequent nodes also show packet loss, the issue likely originates from the node in question, as indicated by the sixth hop in the sample output.
- If only some subsequent nodes show packet loss, it suggests a combination of ICMP rate limiting and a network issue at the node. If several consecutive nodes show packet loss at different rates, use the loss rate of the last of these hops as the reference. For instance, in the sample output, hops six through nine exhibit packet loss, and the 30.3% loss rate at the ninth hop should be the reference point.
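The decision rules above can be sketched as a small filter over report-mode output. This is a rough illustration, not a diagnostic tool. It assumes the input comes from mtr -r -n with the default columns, where each hop line starts with a token such as 2.|-- followed by the host and the Loss% value. The function name classify_loss and the verdict strings are hypothetical and are not printed by mtr itself.

```shell
# classify_loss: read `mtr -r -n` report text on stdin and label each lossy hop.
# classify_loss is a hypothetical helper name used only for this example.
classify_loss() {
    awk '
        # Hop lines look like: "  2.|-- 10.0.0.2  20.0%  10  1.2 ..."
        $1 ~ /^[0-9]+\.\|--$/ {
            n++
            hop[n]  = $1 + 0    # "2.|--" converts to 2
            host[n] = $2
            loss[n] = $3 + 0    # "20.0%" converts to 20
        }
        END {
            for (i = 1; i <= n; i++) {
                if (loss[i] == 0) continue
                later = 0; lossy = 0
                for (j = i + 1; j <= n; j++) { later++; if (loss[j] > 0) lossy++ }
                if (later == 0) {
                    verdict = "final hop, use this rate as the reference"
                } else if (lossy == 0) {
                    verdict = "likely ICMP rate limiting"
                } else if (lossy == later) {
                    verdict = "likely a real issue starting at this hop"
                } else {
                    verdict = "mixed, rate limiting plus a possible issue"
                }
                printf "hop %d (%s) loss %.1f%%: %s\n", hop[i], host[i], loss[i], verdict
            }
        }
    '
}
```

A typical invocation would be sudo mtr -r -n -c 10 www.aliyun.com | classify_loss. The verdicts are only a starting point and still need to be checked against the latency columns.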
Avg (average) and StDev (standard deviation)
Node latency values can fluctuate due to jitter or other factors, causing significant variation between the Best and Wrst values. The Avg metric reflects the average latency since the start of the MTR test and indicates the node's network quality. A high StDev value suggests greater latency variability, meaning that the latency values of the data packets on the node are more dispersed. Thus, StDev can help determine if the Avg value accurately represents the node's network quality. For example, a high StDev value may indicate that while some packets transmit at low latency, others may experience high latency, resulting in an Avg value that does not truly represent the network condition.
Consider the following when analyzing Avg and StDev values:
- A high StDev value warrants a review of the node's Best and Wrst values to assess potential network issues.
- If the StDev value is low, evaluate the node's network status based on its Avg value.
Note: Whether a StDev value is large is judged relative to the latency values in the other columns for the same node. For instance, a StDev of 25 ms is large for an Avg of 30 ms, but not for an Avg of 325 ms.
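The relative comparison described in the note can be approximated by dividing StDev by Avg for each hop. The sketch below assumes mtr -r -n output with the default column order (Loss%, Snt, Last, Avg, Best, Wrst, StDev). The function name flag_jitter and the 0.5 cutoff are assumptions made for illustration, so adjust the ratio to suit your own path.

```shell
# flag_jitter [RATIO]: read `mtr -r -n` report text on stdin and print hops
# whose StDev is large relative to Avg. RATIO defaults to an assumed 0.5.
flag_jitter() {
    awk -v ratio="${1:-0.5}" '
        # Skip hops that never answered (Avg of 0) to avoid dividing by zero.
        $1 ~ /^[0-9]+\.\|--$/ && ($6 + 0) > 0 {
            avg = $6 + 0; best = $7 + 0; wrst = $8 + 0; sd = $9 + 0
            if (sd / avg > ratio) {
                printf "hop %d: StDev %.1f vs Avg %.1f is large; check Best %.1f and Wrst %.1f\n", $1 + 0, sd, avg, best, wrst
            }
        }
    '
}
```

With the figures from the note, a hop with an Avg of 30 ms and a StDev of 25 ms has a ratio of about 0.83 and is flagged, whereas a hop with an Avg of 325 ms and the same StDev has a ratio of about 0.08 and is not.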
Latency
- Latency spike
  A sudden increase in latency after a hop may indicate an issue at that node. The sample output shows a spike after the sixth hop, suggesting a problem at that point. Note that high latency alone does not necessarily mean there is an issue with a node. As seen in the sample output, despite the spike after the sixth hop, the data reached the destination host. High latency could also occur on the return path, so a reverse path test is recommended for further investigation.
- ICMP throttling increases latency
  ICMP throttling on a node can cause a latency spike without affecting subsequent nodes. In the sample output, the ninth hop shows a 30% packet loss and a latency spike, but the latency immediately drops to normal levels on the following nodes. This indicates that the spike and packet loss are due to throttling.
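The two latency patterns above differ in whether the increase carries over to the next hop. A minimal sketch of that check follows, again assuming mtr -r -n output with the default column order. The function name find_latency_jumps and the 50 ms threshold are assumptions for this example and do not come from mtr.

```shell
# find_latency_jumps [MS]: read `mtr -r -n` report text on stdin and report hops
# whose Avg rises by more than MS (assumed default: 50) over the previous hop.
find_latency_jumps() {
    awk -v jump="${1:-50}" '
        # Keep only hops that answered at least once (Avg above 0).
        $1 ~ /^[0-9]+\.\|--$/ && ($6 + 0) > 0 { n++; hop[n] = $1 + 0; avg[n] = $6 + 0 }
        END {
            for (i = 2; i <= n; i++) {
                rise = avg[i] - avg[i - 1]
                if (rise <= jump) continue
                if (i < n && avg[i + 1] - avg[i - 1] <= jump) {
                    printf "hop %d: +%.1f ms, back to normal at hop %d, likely ICMP throttling\n", hop[i], rise, hop[i + 1]
                } else {
                    printf "hop %d: +%.1f ms and not back to normal afterwards, investigate this hop\n", hop[i], rise
                }
            }
        }
    '
}
```

Because the return path can also add latency, treat a flagged hop as a prompt for a reverse path test rather than as proof of a fault.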
Example analysis and conclusion
Drawing from the path test results and metric-based analysis described above, the following conclusions can be made:
- In the client's local network, packet loss is observed at the second, sixth, seventh, eighth, and ninth hops, but not significantly at the third, fourth, fifth, tenth, eleventh, and fifteenth hops. If service requests in the network are unaffected, the packet loss at the earlier hops may be attributed to ICMP throttling.
- The Wrst value at the fourth hop is high, but the Avg value is moderate, suggesting a temporary network fluctuation due to jitter or device performance during the probe.
- The average latency across all nodes in the path ranges from 1.8 ms to 17.6 ms, indicating low network latency overall.
These findings suggest that the network path is generally clear of issues. If network jitter is experienced in actual service, consider the reverse path test results for a more detailed analysis.
Network path test results should be analyzed flexibly. The provided analysis is a general method for evaluating metrics. For accurate conclusions, assess the results in the context of your specific business scenario. If a one-way path test is inconclusive, conduct a reverse path test for a more thorough investigation.
Common scenarios in which exceptions occur on network paths
This section describes typical scenarios where network path exceptions occur, with examples based on running the mtr command on a Linux system. Results may vary depending on the operating system and testing tool used.
Improper network configuration of the destination host
In this scenario, packet loss occurs at the end of the transmission, as shown below. The destination server's security policies, such as firewall or iptables rules, may block ICMP, preventing responses and the completion of packet delivery. Review the destination server's security settings to resolve the issue.
ICMP throttling
Here, packet loss occurs at the end of the transmission, as depicted below. The destination server's security policies or the carrier's throttling policies may be restricting ICMP, preventing responses and packet delivery. Check the destination server's security settings or perform a reverse path test to further analyze the issue.
Routing loop in the path
In this example, a routing loop occurs after the fifth hop, preventing packets from reaching the destination server, as shown below. Routing loops are typically caused by misconfigurations in a carrier's routing setup. Contact the relevant carrier for resolution.
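A loop usually shows up as the same addresses repeating at later hops. The sketch below looks for that pattern in mtr -r -n output. It assumes that non-responding hops are printed as ???, and the function name find_routing_loop is hypothetical. A repeated address is only a hint, so confirm it against the full output before you contact the carrier.

```shell
# find_routing_loop: read `mtr -r -n` report text on stdin and print every hop
# whose address already appeared at an earlier hop.
find_routing_loop() {
    awk '
        # Ignore hops that did not answer; they carry no address to compare.
        $1 ~ /^[0-9]+\.\|--$/ && $2 != "???" {
            if ($2 in seen) {
                printf "hop %d repeats %s, first seen at hop %d: possible routing loop\n", $1 + 0, $2, seen[$2]
            } else {
                seen[$2] = $1 + 0
            }
        }
    '
}
```

For example, sudo mtr -r -n -c 10 www.aliyun.com | find_routing_loop prints nothing when every responding hop has a distinct address.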
Path interruption
Here, packets receive no feedback after the fourth hop, as shown below. The metrics such as Loss%, Last, Avg, and Best show no data, indicating a likely interruption at the node. A reverse path test is recommended for further troubleshooting. Contact the carrier associated with the node for assistance.
References
Network connectivity issues can also be diagnosed through the console. For more information, see the referenced document.