
Resolve packet loss caused by Linux kernel softnet backlog queue overflow

Last Updated: Jan 20, 2026

Problem description

  • Application-level symptoms: The application experiences network timeouts, connection interruptions, or data transfer failures.

  • System-level packet loss: When you repeatedly inspect the /proc/net/softnet_stat file (for example, with the cat command), the counter in the second column (dropped) or the third column (squeezed) increases rapidly. Note that the counters are hexadecimal.

    # Each row corresponds to a CPU core.
    # Column 1: Total number of packets processed.
    # Column 2: Number of packets dropped because the backlog queue was full (dropped).
    # Column 3: Number of times processing stopped early because the softirq budget was exhausted (squeezed).
    $ cat /proc/net/softnet_stat
    000bb344 00000471 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
    000bc76f 00000305 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
  • High CPU usage for software interrupts: When you use tools such as top or mpstat, you observe abnormally high CPU usage for si (softirq).
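The counters in /proc/net/softnet_stat are hexadecimal. The following sketch converts the first three columns of one row to decimal using bash arithmetic; the sample row is taken from the output above, and on a live system you would read the line from the file instead:

```shell
# Convert the hex counters of one softnet_stat row to decimal.
# Sample row from the output above; on a live system use:
#   read -r line < /proc/net/softnet_stat
line="000bb344 00000471 00000000"
set -- $line
echo "processed=$((16#$1)) dropped=$((16#$2)) squeezed=$((16#$3))"
```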

Causes

When the network interface controller (NIC) driver receives a data packet, it uses a software interrupt (NET_RX_SOFTIRQ) to notify the kernel protocol stack to process the packet. To buffer traffic bursts, the kernel maintains a softnet backlog queue for each CPU core. Packet loss is primarily caused by two factors:

  • The softnet backlog queue is too small: In high network throughput scenarios, if the rate at which packets enter the queue is consistently higher than the CPU processing rate, the queue fills up quickly. New incoming packets are then dropped directly. This causes the dropped count in the second column of the /proc/net/softnet_stat file to increase.

  • CPU processing power is insufficient: Even if the queue size is adequate, packet processing can be cut short if a CPU core cannot drain the softnet backlog queue within its allocated budget (the packet count in net.core.netdev_budget or the time limit in net.core.netdev_budget_usecs). This situation is called a time squeeze and causes the squeezed count in the third column of the /proc/net/softnet_stat file to increase. It indicates that the bottleneck is CPU computing power, not the queue size.
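Both factors map to kernel tunables that you can inspect before changing anything. Reading procfs directly is equivalent to querying sysctl (a quick check, assuming a Linux system):

```shell
# Per-CPU backlog queue depth (first factor) and the per-softirq-round
# packet budget (second factor). Equivalent to:
#   sysctl net.core.netdev_max_backlog net.core.netdev_budget
cat /proc/sys/net/core/netdev_max_backlog
cat /proc/sys/net/core/netdev_budget
```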

Solutions

First, monitor the dynamic changes in /proc/net/softnet_stat to determine if packet loss is due to an insufficient queue size (dropped increases) or a CPU processing bottleneck (squeezed increases). Then, apply targeted optimizations.

Step 1: Diagnose the root cause of packet loss

Monitor the softnet status in real time to distinguish between a dropped issue and a squeezed issue.

  1. Log on to the ECS Linux instance.

  2. Run the following command to refresh the softnet statistics every second. Focus on the incremental changes in the dropped and squeezed columns.

    Note: The counters in /proc/net/softnet_stat are cumulative since system startup. A problem is indicated only when the values increase continuously and rapidly.
    watch -d 'awk "{print \"CPU\"(NR-1)\": dropped=\"\$2\", squeezed=\"\$3}" /proc/net/softnet_stat'
  3. Based on the command output, determine the problem type:

    • The dropped column increases continuously, while the squeezed column remains mostly unchanged: This indicates that the softnet backlog queue is too small. Proceed to Step 2: Adjust the backlog queue size.

    • The squeezed column increases continuously (regardless of whether dropped increases): This indicates that CPU processing power is the bottleneck. Proceed to Step 3: Optimize CPU processing capability.
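As an alternative to watch, you can compute the per-second growth directly. The following sketch sums the hexadecimal dropped column (column 2) across all CPU rows twice, one second apart (assumes bash on Linux):

```shell
# Sum the hex "dropped" counters (column 2) across all CPU rows of a
# softnet_stat-format file (defaults to the live /proc/net/softnet_stat).
sum_dropped() {
    local total=0 c1 c2 rest
    while read -r c1 c2 rest; do
        total=$((total + 16#$c2))
    done < "${1:-/proc/net/softnet_stat}"
    echo "$total"
}

a=$(sum_dropped); sleep 1; b=$(sum_dropped)
echo "dropped in the last second: $((b - a))"
```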

Step 2: Adjust the backlog queue size

Increase the value of the net.core.netdev_max_backlog parameter to expand the queue capacity. This helps mitigate packet loss caused by traffic bursts.

  1. Run the following command to view the current netdev_max_backlog value. The default value is typically 1000.

    sysctl net.core.netdev_max_backlog
  2. Based on the instance's network bandwidth, refer to the following table to set a reasonable netdev_max_backlog value.

    Important: An unreasonably large value increases memory consumption and may introduce network latency. Set this value with caution. Memory usage estimation formula: Memory usage (Bytes) ≈ netdev_max_backlog × Average packet size × Number of CPU cores.

    The size you set depends mainly on your network bandwidth and business scenario.

    Business scenario                | Bandwidth          | Recommended value | Description
    ---------------------------------|--------------------|-------------------|------------
    Default/low configuration        | 1 Gbps or lower    | 1000 (default)    | The default value is typically 1000, which is sufficient for normal traffic.
    Medium load                      | 1 Gbps to 10 Gbps  | 5000 to 10000     | Suitable for most standard web servers and application servers.
    High concurrency/high throughput | 10 Gbps to 40 Gbps | 30000             | Suitable for Nginx gateways, Redis, and high-frequency API services.
    Extremely high performance       | 40 Gbps or higher  | 60000 to 100000   | Suitable for core switch nodes, DDoS traffic scrubbing, and ultra-high-frequency trading systems.

  3. Temporarily modify the parameter value to apply it immediately. Replace NETDEV_MAX_BACKLOG_NUMBER with the desired value.

    sysctl -w net.core.netdev_max_backlog=NETDEV_MAX_BACKLOG_NUMBER
  4. To ensure the configuration persists after a server restart, make the setting permanent.

    • Method 1 (recommended): Create or modify the /etc/sysctl.d/99-network-tuning.conf file and add the following content:

      # Increase kernel softnet backlog queue size
      net.core.netdev_max_backlog = NETDEV_MAX_BACKLOG_NUMBER

      Then, run sysctl --system to apply the configuration.
    • Method 2: Add the following content to the end of the /etc/sysctl.conf file:

      net.core.netdev_max_backlog = NETDEV_MAX_BACKLOG_NUMBER

      Then, run sysctl -p to apply the configuration.

  5. Return to Step 1 and monitor the dropped count again to verify that it has stopped increasing. This confirms the optimization was effective.
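The memory estimation formula above can be made concrete with a worked example. The values below are hypothetical: a queue depth of 30000, an average packet size of 1500 bytes, and 8 vCPUs:

```shell
# Worst case: every per-CPU backlog queue full of average-size packets.
backlog=30000   # net.core.netdev_max_backlog (example value from the table)
pkt=1500        # average packet size in bytes (assumption)
cores=8         # number of vCPUs (assumption)
bytes=$((backlog * pkt * cores))
echo "~${bytes} bytes ($((bytes / 1024 / 1024)) MiB)"
```

With these inputs, the estimate is 360000000 bytes, roughly 343 MiB, which is why oversizing the queue on many-core instances should be done with care.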

Step 3: Optimize CPU processing capability

Use multiple CPU cores to process network software interrupts in parallel by enabling Receive Packet Steering (RPS).

  1. Confirm the number of vCPUs for the instance. RPS has no effect on single-core instances.

  2. Find the name of the NIC to optimize, for example, by running ip addr or ls /sys/class/net. The name is usually eth0 or eth1.

  3. Enable RPS to distribute network software interrupts across all CPU cores for processing.

    rps_cpus is a CPU bitmask. For example, the mask for an 8-core CPU is ff (binary 11111111), and for a 16-core CPU is ffff. You can calculate the mask based on the number of CPU cores or use a sufficiently large value (such as ffffffff) to cover all cores.
    # Replace <interface> with the NIC name, for example, eth0
    # Replace <cpu_mask> with the calculated CPU mask, for example, ff (8-core)
    echo <cpu_mask> > /sys/class/net/<interface>/queues/rx-0/rps_cpus
    
    # Example: Enable RPS for the eth0 NIC on an 8-core CPU
    echo ff > /sys/class/net/eth0/queues/rx-0/rps_cpus
  4. To make the configuration persist across restarts, add the preceding command to a startup script, such as /etc/rc.local. Make sure that the script has execute permission.

  5. Return to Step 1 and monitor the squeezed count again to verify that it has stopped increasing. This confirms the optimization was effective.
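Instead of hard-coding the bitmask, you can derive it from the number of online cores. A minimal sketch (assumes the coreutils nproc command; the shift overflows for 63 or more cores, in which case a fixed large mask such as ffffffff is simpler):

```shell
# Build an rps_cpus mask with one bit set per online CPU.
# For example, 8 cores yield "ff" and 16 cores yield "ffff".
cores=$(nproc)
mask=$(printf '%x' $(( (1 << cores) - 1 )))
echo "rps_cpus mask for $cores cores: $mask"
```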

Recommendations

  • Set up monitoring and alerts: In Alibaba Cloud CloudMonitor, configure custom monitoring for your ECS instance. Monitor the growth rate of dropped and squeezed in /proc/net/softnet_stat and set alert rules. This ensures you are promptly notified if a problem occurs.

  • Select a suitable instance type: For network-intensive applications, select a network-enhanced instance family (such as c7ne or g7ne). These instance families provide higher network packets per second (PPS) and bandwidth performance.