Problem description
Service access issues: You experience significantly longer service response times, request timeouts, or inaccessible services.
High monitoring metrics: In the Elastic Compute Service (ECS) console or CloudMonitor, you find that your instance's outbound Internet bandwidth usage consistently exceeds 80%, approaching or reaching the bandwidth limit. You may also see a sudden increase in the number of network connections.
Monitoring alerts: You receive an SMS or email alert that your network bandwidth usage has exceeded the preset threshold.
Causes
Malicious programs or processes: The instance is infected with a mining program or a Trojan, or is being used as part of a DDoS botnet. These malicious programs generate a large amount of abnormal network traffic.
Network attacks: Malicious attacks, such as a DDoS attack or brute-force attack, target the instance's public-facing ports and saturate the inbound bandwidth with invalid requests.
Insufficient instance network specifications: As your business grows, the instance's bandwidth specification can no longer handle the normal service traffic, creating a performance bottleneck.
Solutions
First, use the sar tool to locate the high-traffic network interface card (NIC). Then, use the iftop tool to identify the peer IP addresses consuming the bandwidth or the nethogs tool to identify high-traffic processes. Finally, take appropriate action based on the identified process and IP address.
Step 1: Locate the high-traffic NIC
Use the sar tool to identify the high-traffic NIC and narrow the scope of your investigation.
Log on to an ECS instance using a VNC connection.
Go to ECS console - Instances. In the top navigation bar, select the target region and resource group.
Go to the details page of the target instance. Click Connect and select VNC. Enter the username and password to log on to the ECS instance.
Gather network interface statistics.
# -n DEV: Reports network device statistics
# 1 5: Samples every 1 second, 5 times in total
sudo sar -n DEV 1 5
Identify the high-traffic NIC.
Focus on the txkB/s value in the Average section. Compare the values to find the IFACE (NIC name) with the highest value. txkB/s represents the average outbound rate in kilobytes per second.
In the example, the eth0 NIC has the highest txkB/s value, which identifies it as the high-traffic NIC.
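The comparison described above can also be scripted. The following is a minimal sketch, not part of the original procedure: top_tx_iface is a hypothetical helper name, and it assumes the usual sar -n DEV layout in which the Average: header row lists IFACE followed by the metric names.

```shell
# Hypothetical helper: reads `sar -n DEV` output on stdin and prints the
# interface with the highest average txkB/s, followed by that rate.
top_tx_iface() {
    awk '
        $1 == "Average:" {
            # Header row: remember which column holds txkB/s.
            if ($2 == "IFACE") {
                for (i = 3; i <= NF; i++) if ($i == "txkB/s") col = i
                next
            }
            # Data row: keep the interface with the largest value so far.
            if (col && (best == "" || $col + 0 > max)) {
                max = $col + 0
                best = $2
                rate = $col
            }
        }
        END { if (best != "") print best, rate }
    '
}

# Usage on the instance (LC_ALL=C keeps the "Average:" label in English):
#   LC_ALL=C sar -n DEV 1 5 | top_tx_iface
```

Looking the column up by its header name, rather than by a fixed position, is meant to tolerate sysstat versions that add or drop columns.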
Step 2: Analyze and resolve the high-traffic issue
Analyze the NIC traffic.
iftop: Monitors NIC traffic from a connection perspective. This tool helps you identify the IP addresses and ports with the highest traffic to and from your instance. For web services, use iftop to locate high-traffic IP addresses, then use tools like logwatch to analyze web logs and determine if the traffic is legitimate.
nethogs: Monitors NIC traffic from a process perspective. This tool helps you identify the processes that are consuming the most network bandwidth.
iftop tool
Install the iftop tool.
For Alibaba Cloud Linux and CentOS:
sudo yum install -y iftop
For Ubuntu and Debian:
sudo apt update
sudo apt install -y iftop
Monitor the high-traffic NIC.
Replace <IFACE> with the high-traffic NIC name from Step 1.
# -i <IFACE>: Specifies the NIC to monitor
# -P: Displays port numbers
sudo iftop -i <IFACE> -P
For example, if the high-traffic NIC is eth0, run sudo iftop -i eth0 -P.
Analyze the NIC traffic to find the peer IP address that consumes the most bandwidth.

The real-time traffic information is sorted in descending order. The => symbol indicates the outbound data rate from your instance to a peer IP address. In the example, the average outbound traffic from the local instance to the IP address 140.205.11.x over the last 2 seconds is 4.32 Mb/s.
Enter q to exit the iftop tool.
View the process associated with the port.
Replace <HIGH_TRAFFIC_PEER_IP> with the peer IP address you found in the previous step.
sudo netstat -antp | grep <HIGH_TRAFFIC_PEER_IP>
Example output:

In the example, the local IP address is 172.16.0.x, and the peer IP address is 140.205.11.x. The corresponding process is nginx: worker, with a process ID (PID) of 2282.
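A plain grep treats the dots in an IP address as wildcards and matches substrings, so a pattern for one address can also match a longer address that contains it. The following is a minimal sketch of a stricter filter. procs_for_peer is a hypothetical helper name and 203.0.113.10 is a placeholder address. The sketch assumes IPv4 peers and the standard netstat -antp column order, in which the foreign address is the fifth field.

```shell
# Hypothetical helper: reads `netstat -antp` output on stdin and prints the
# distinct PID/program entries whose foreign address is exactly the given
# IPv4 peer (assumption: field 5 is "IP:port", field 7 onward is the program).
procs_for_peer() {
    peer="$1"
    awk -v peer="$peer" '
        # Match the IP part of the foreign address exactly, up to the colon.
        index($5, peer ":") == 1 {
            proc = $7
            # Program names such as "nginx: worker" contain spaces; rejoin them.
            for (i = 8; i <= NF; i++) proc = proc " " $i
            if (!(proc in seen)) { seen[proc] = 1; print proc }
        }
    '
}

# Usage on the instance (203.0.113.10 is a placeholder peer address):
#   sudo netstat -antp | procs_for_peer 203.0.113.10
```

IPv4-mapped IPv6 peers, which netstat prints with a ::ffff: prefix, are not matched by this sketch.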
nethogs tool
Install the nethogs tool.
For Alibaba Cloud Linux and CentOS:
sudo yum install -y nethogs
For Ubuntu and Debian:
sudo apt update
sudo apt install -y nethogs
Monitor the high-traffic NIC.
Replace <IFACE> with the high-traffic NIC name from Step 1.
# The default refresh interval is 1 second. Use the -d parameter to change it.
sudo nethogs <IFACE>
For example, if the high-traffic NIC is eth0, run sudo nethogs eth0.
Analyze the NIC traffic.

The SENT column shows the rate at which your instance is sending data. In this example, the process consuming the most traffic is nginx: worker process, with an outbound traffic rate of about 696 KB/s and a process ID (PID) of 2282. Enter q to exit the tool.
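If you prefer a non-interactive capture, for example to save the result in a ticket, nethogs also has a trace mode (-t) that prints each refresh as plain text, and a -c option that limits the number of refreshes. The following is a minimal sketch. top_sender is a hypothetical helper name, and the parsing assumes that each trace line has the form program/PID/UID, sent rate, received rate, separated by tabs. Check this against the output of your nethogs version.

```shell
# Hypothetical helper: reads nethogs trace-mode output on stdin and prints
# the entry with the highest sent rate seen in any refresh.
# Assumption: data lines look like "<program>/<pid>/<uid>\t<sent>\t<received>".
top_sender() {
    awk -F '\t' '
        # Only three-field lines are data; "Refreshing:" markers are skipped.
        NF == 3 && $2 + 0 > max { max = $2 + 0; top = $1; rate = $2 }
        END { if (top != "") print top, rate }
    '
}

# Usage on the instance (5 refreshes on eth0, then exit):
#   sudo nethogs -t -c 5 eth0 | top_sender
```

The helper prints nothing when every sampled sent rate is zero.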
Choose a solution based on the process or peer IP address.
If the identified process (such as a download tool like wget or curl, or an unknown program) exhibits suspicious behavior, or if it communicates with a suspicious peer IP address:
Terminate the abnormal process: Use sudo kill -15 <PID> to terminate the abnormal process. Replace <PID> with the PID of the high-traffic process you identified.
Important: Before you terminate a process, ensure that it is not a critical business process to avoid service disruption.
Block the malicious IP address: Manage security group rules to block access from the malicious IP address.
Scan and remove malicious programs: Use the virus detection and removal feature (a paid feature) in Security Center to scan the instance and remove any detected viruses.
If the high traffic is from a legitimate business process, it is likely due to normal operations:
Upgrade bandwidth: If the instance's current bandwidth is a bottleneck, upgrade it.
Optimize the application: Check your application code for potential optimizations, such as reducing unnecessary data transfers, adding caching, or compressing data.
Rate-limit traffic: If your business allows, use tools like iptables to limit the traffic for a specific IP address or port. This can prevent a single user or service from consuming all available bandwidth.
If you cannot find any abnormal processes that are consuming bandwidth but the overall bandwidth usage remains high, this indicates that the total traffic volume has exceeded the instance's network capacity. You should upgrade the instance bandwidth.
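For the rate-limiting option mentioned above, one possible approach is the iptables hashlimit match, which accepts byte-based rates in reasonably recent iptables releases. The following is a minimal sketch, not a recommended production policy. print_rate_limit_rule and the peer_limit bucket name are hypothetical, and 203.0.113.10 and 512kb/s are placeholder values. The helper only prints the rule so that you can review it before running it with sudo.

```shell
# Hypothetical helper: prints (does not run) an iptables rule that drops
# outbound packets to one peer while traffic to it exceeds the given byte rate.
print_rate_limit_rule() {
    peer="$1"   # peer IP address to limit (placeholder in the usage below)
    rate="$2"   # hashlimit byte rate, for example 512kb/s
    printf 'iptables -A OUTPUT -d %s -m hashlimit --hashlimit-name peer_limit --hashlimit-mode dstip --hashlimit-above %s -j DROP\n' "$peer" "$rate"
}

# Usage: review the printed rule, then run it with sudo if it fits your case.
#   print_rate_limit_rule 203.0.113.10 512kb/s
```

Dropping packets above a threshold is a blunt form of throttling: TCP senders slow down in response, but other protocols simply lose data, so test the rule on non-critical traffic first.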
Recommendations
Periodically collect system metrics for ongoing analysis. For more information, see Use the atop tool to monitor Linux system metrics.
To enhance security, purchase and configure Anti-DDoS Origin or Anti-DDoS Pro and Anti-DDoS Premium with appropriate protected objects and mitigation policies.
To receive notifications about future risks and anomalies, configure instance monitoring and alerting.