Problem description
In environments with multiple network interface controllers (NICs) or complex network routing, such as those that use LVS load balancing, overlay networks, or asymmetric routing, the following issues may occur:
Service access issues: When you access a backend server through Server Load Balancer (SLB), health checks may fail or service access may be blocked. However, you can access the service port normally when you log on directly to the backend server.
Network connection timeouts: On an ECS instance with multiple NICs or policy-based routing configured, network connections in some directions intermittently time out or fail completely.
System packet loss: When you capture packets using tools such as tcpdump, you may observe that datagrams enter the NIC but the application layer does not receive the data.
Causes
This issue usually occurs when the rp_filter kernel parameter is set to strict mode (value 1) in an environment with asymmetric routing. Asymmetric routing means that the inbound and outbound paths of a datagram are different. When strict mode for rp_filter is enabled, the kernel performs reverse path validation on each received packet: it looks up the route back to the packet's source address. If the NIC on which the packet arrived is not the interface that the route table selects as the best path back to that source address, the packet is dropped.
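To see which mode actually applies to each NIC, the configured values can be read back from /proc. According to the Linux kernel documentation, the higher of conf/all/rp_filter and the NIC's own rp_filter value is used for source validation on that NIC. The following is a minimal sketch based on that rule. The function names effective_rp_filter and show_rp_filter are illustrative helpers, not standard tools.

```shell
#!/bin/sh
# Hypothetical helper: return the mode the kernel applies to a NIC,
# which is the higher of the "all" value and the NIC's own value.
effective_rp_filter() {
    all_value=$1
    nic_value=$2
    if [ "$all_value" -gt "$nic_value" ]; then
        echo "$all_value"
    else
        echo "$nic_value"
    fi
}

# Hypothetical helper: print the configured and effective mode for every NIC.
# The optional argument overrides the directory that is read (default: /proc).
show_rp_filter() {
    conf_dir=${1:-/proc/sys/net/ipv4/conf}
    all_value=$(cat "$conf_dir/all/rp_filter")
    for path in "$conf_dir"/*/rp_filter; do
        nic=$(basename "$(dirname "$path")")
        # "all" and "default" are pseudo-entries, not real NICs.
        case "$nic" in all|default) continue ;; esac
        nic_value=$(cat "$path")
        echo "$nic configured=$nic_value effective=$(effective_rp_filter "$all_value" "$nic_value")"
    done
}
```

Calling show_rp_filter without arguments prints one line per NIC. An effective value of 1 on the NIC that receives asymmetric traffic is consistent with the cause described above.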
Cause 1: Asymmetric routing due to multiple NICs or policy-based routing: On an ECS instance with multiple NICs, if policy-based routing is configured, a packet might enter through the eth0 NIC, but its response packet is sent out through the eth1 NIC based on the routing policy. In this case, the strict mode of rp_filter considers the packet path invalid and drops the packet.
Cause 2: Asymmetric routing due to load balancing (LVS-DR mode): When you use a Layer 4 (TCP/UDP) listener for an Alibaba Cloud load balancer, such as Classic Load Balancer (CLB), the backend servers use Direct Routing (DR) mode by default. In this mode, client request packets travel through LVS to the backend ECS instance, but the response packets from the ECS instance bypass LVS and return directly to the client. This is a typical asymmetric routing scenario for the ECS instance. If the ECS instance has rp_filter strict mode enabled, it will drop request packets from LVS.
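One way to confirm that a path is asymmetric is to ask the route table which NIC it would use to reach the source address of a captured packet, and compare that NIC with the one the packet arrived on. The sketch below does this with ip route get. It only approximates the lookup the kernel performs, and the function names route_dev and check_reverse_path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the device name from `ip route get` output
# read on standard input.
route_dev() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Hypothetical helper: compare the NIC a packet arrived on with the NIC
# the route table would use to reach the packet's source address.
check_reverse_path() {
    ingress_nic=$1   # NIC on which the packet was captured, for example eth0
    source_ip=$2     # source address of the captured packet
    local_ip=$3      # destination (local) address of the captured packet
    reply_nic=$(ip route get "$source_ip" from "$local_ip" | route_dev)
    if [ "$reply_nic" = "$ingress_nic" ]; then
        echo "symmetric: replies to $source_ip leave through $ingress_nic"
    else
        echo "asymmetric: packet arrived on $ingress_nic, replies leave through $reply_nic"
    fi
}
```

For example, check_reverse_path eth0 203.0.113.10 192.168.0.5 (placeholder addresses) reports whether replies to 203.0.113.10 would leave through eth0. An "asymmetric" result on a NIC in strict mode matches the causes listed above.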
Solutions
Method 1: Set to loose mode (recommended)
Set rp_filter to loose mode (value 2). In this mode, the kernel checks only whether the source IP address is reachable through any NIC according to the route table. It does not require the ingress NIC to be the best egress interface toward that address. This method is suitable for all asymmetric routing scenarios.
Temporarily modify the kernel parameter to apply the change immediately.
    # Set the rp_filter mode for all NICs to loose mode
    echo 2 > /proc/sys/net/ipv4/conf/all/rp_filter
    echo 2 > /proc/sys/net/ipv4/conf/default/rp_filter
Test if the service has recovered. If the issue is resolved, proceed to the next step to make the change permanent.
Edit the /etc/sysctl.conf file. Add or modify the following configuration to ensure it persists after a system restart.
    # Edit the configuration file
    vi /etc/sysctl.conf
    # Add or modify the following two lines
    net.ipv4.conf.all.rp_filter = 2
    net.ipv4.conf.default.rp_filter = 2
Run the sysctl -p command to apply the permanent configuration.
    sudo sysctl -p
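If you prefer to script the "add or modify" step instead of editing the file by hand, a small helper can replace an existing entry or append a new one. This is a minimal sketch, not part of any standard tool. The function name set_sysctl_key is hypothetical, and it assumes entries are written one per line in key = value form.

```shell
#!/bin/sh
# Hypothetical helper: set "key = value" in a sysctl configuration file,
# replacing an existing entry for the key or appending a new one.
set_sysctl_key() {
    file=$1
    key=$2
    value=$3
    # Escape dots so the key is matched literally in the regular expressions.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$file" 2>/dev/null; then
        # Rewrite the existing line in a temporary copy, then write it back
        # so that the original file keeps its owner and permissions.
        tmp=$(mktemp)
        sed -E "s|^[[:space:]]*${pattern}[[:space:]]*=.*|${key} = ${value}|" "$file" > "$tmp"
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        echo "${key} = ${value}" >> "$file"
    fi
}
```

Run as root, set_sysctl_key /etc/sysctl.conf net.ipv4.conf.all.rp_filter 2 followed by the same call for net.ipv4.conf.default.rp_filter produces the two lines shown above. Commented-out entries are left untouched, and sysctl -p is still required to load the file.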
Method 2: Disable validation for a specific NIC
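If only a specific NIC, such as eth0, receives traffic from LVS, disable reverse path validation only for that NIC. The kernel applies the higher of the conf/all and per-NIC rp_filter values, so this setting takes effect only when net.ipv4.conf.all.rp_filter is 0. If the global value is 1 or 2, that mode still applies to eth0.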
Temporarily disable rp_filter for the eth0 NIC.
    echo 0 > /proc/sys/net/ipv4/conf/eth0/rp_filter
Test if the service has recovered. If the issue is resolved, proceed to the next step to make the change permanent.
Edit the /etc/sysctl.conf file. Add or modify the following configuration.
    # Edit the configuration file
    vi /etc/sysctl.conf
    # Add or modify the following line
    net.ipv4.conf.eth0.rp_filter = 0
Run the sysctl -p command to apply the permanent configuration.
    sudo sysctl -p
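When you test whether the service has recovered, it can help to check whether the kernel is still dropping packets in reverse path validation. Linux counts these drops in the IPReversePathFilter field of the TcpExt section in /proc/net/netstat. The sketch below reads that field. The function name rp_filter_drops is hypothetical, and the sketch assumes the kernel in use exposes this counter.

```shell
#!/bin/sh
# Hypothetical helper: print the number of packets dropped by reverse path
# validation. /proc/net/netstat stores each section as a line of field names
# followed by a line of values, so the column is located by name first.
rp_filter_drops() {
    file=${1:-/proc/net/netstat}
    awk '
        $1 == "TcpExt:" && !have_names {
            for (i = 2; i <= NF; i++) if ($i == "IPReversePathFilter") col = i
            have_names = 1
            next
        }
        $1 == "TcpExt:" && col { print $col; exit }
    ' "$file"
}
```

Run rp_filter_drops before and after sending test traffic. If the number keeps increasing, reverse path validation is still dropping packets on some NIC. If it stays constant while the problem persists, the cause is likely elsewhere.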
Method 3: Globally disable validation
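This method completely disables reverse path validation and exposes the system to the risk of IP spoofing attacks. Because the kernel applies the higher of the conf/all and per-NIC rp_filter values, a NIC whose own value is still 1 or 2 continues to validate packets until that value is also set to 0.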
Temporarily modify the kernel parameter.
    # Warning: This operation reduces system security. Use with caution.
    echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter
    echo 0 > /proc/sys/net/ipv4/conf/default/rp_filter
Test if the service has recovered. If the issue is resolved, proceed to the next step to make the change permanent.
Edit the /etc/sysctl.conf file. Add or modify the following configuration.
    # Edit the configuration file
    vi /etc/sysctl.conf
    # Warning: The following configuration exposes the server to the risk of IP spoofing attacks.
    net.ipv4.conf.all.rp_filter = 0
    net.ipv4.conf.default.rp_filter = 0
Run the sysctl -p command to apply the permanent configuration.
    sudo sysctl -p
Recommendations
Follow the principle of least privilege: When you adjust the rp_filter parameter, prioritize the solution with the minimum scope of impact. For example, using loose mode (rp_filter=2) is better than completely disabling the feature (rp_filter=0). Modifying the configuration for a single NIC is also better than modifying the global configuration.