Applications on an ECS instance occasionally lose packets and the kernel log (dmesg) contains the error message "kernel: nf_conntrack: table full, dropping packet"

Last Updated: Apr 29, 2020

Problem Description

Packet loss occurs occasionally when connecting to applications on ECS instances. Troubleshooting shows that the network around the ECS instance is normal, but the kernel log (dmesg) contains the error message "kernel: nf_conntrack: table full, dropping packet". The affected ECS instances meet the following conditions:

  • Image: aliyun-2.1903-x64-20G-alibase-20190327.vhd and later versions.
  • Kernel: kernel-4.19.24-9.al7 and later kernel versions.
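
To confirm the symptom on an instance, the kernel log can be searched for the error message. The following is a minimal sketch, not an official diagnostic tool: the function name has_table_full_error is a hypothetical helper introduced here for illustration, and it only checks whether the text it receives contains the message quoted above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the log text on stdin contains the
# conntrack overflow message quoted in the problem description.
has_table_full_error() {
    grep -q 'nf_conntrack: table full, dropping packet'
}

# Usage on a live instance (reading the kernel log may require root):
#   sudo dmesg | has_table_full_error && echo "conntrack table overflow detected"
#   uname -r    # compare the running kernel with kernel-4.19.24-9.al7
```

The helper reads from standard input so that it can be applied to dmesg output or to a saved log file alike.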

Cause of problem

nf_conntrack is a module in the Linux kernel that tracks connection entries for NAT. The nf_conntrack module uses a hash table to record the established connections of the TCP protocol. When the hash table is full, packets of new connections are dropped and the kernel reports the "nf_conntrack: table full, dropping packet" error. The important parameters of the nf_conntrack module are as follows.

  • nf_conntrack_buckets: the size of the hash table. It can be specified when the module is loaded or modified with the sysctl command. The default value is 65536 when the system memory is greater than or equal to 4 GB.
  • nf_conntrack_max: the maximum number of nodes in the hash table, that is, the maximum number of connections supported by the nf_conntrack module. The default value is 262144 when the system memory is greater than or equal to 4 GB. This default is relatively small for servers that handle a large number of connections.
  • nf_conntrack_tcp_timeout_time_wait: how long the nf_conntrack module keeps a TCP connection that is in the TIME_WAIT state. The default value is 120 seconds.
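
To judge how close the table is to the limit described above, the current number of tracked connections can be compared with nf_conntrack_max. The sketch below assumes the nf_conntrack module is loaded, in which case both values are readable under /proc/sys/net/netfilter; the function name conntrack_usage_pct is a hypothetical helper introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print table usage as an integer percentage.
# $1: current number of tracked connections; $2: nf_conntrack_max
conntrack_usage_pct() {
    echo $(( $1 * 100 / $2 ))
}

# When the nf_conntrack module is loaded, both values are exposed under /proc.
proc_dir=/proc/sys/net/netfilter
if [ -r "$proc_dir/nf_conntrack_count" ] && [ -r "$proc_dir/nf_conntrack_max" ]; then
    count=$(cat "$proc_dir/nf_conntrack_count")
    max=$(cat "$proc_dir/nf_conntrack_max")
    if [ "$max" -gt 0 ]; then
        echo "conntrack usage: $count / $max ($(conntrack_usage_pct "$count" "$max")%)"
    fi
fi
```

A usage figure that regularly approaches 100% suggests that the limit is too small for the workload.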

Solution

Alibaba Cloud reminds you that:

  • If you perform risky operations on an instance or its data, pay attention to the disaster tolerance and fault tolerance capabilities of the instance to ensure data security.
  • If you modify the configuration and data of an instance (including but not limited to ECS and RDS), we recommend that you create snapshots or enable RDS log backup.
  • If you have granted permissions on the Alibaba Cloud platform or submitted security information such as the logon account and password, we recommend that you modify the information as soon as possible.

Select whichever of the following solutions best suits your business needs.

Solution 1: Use the sysctl interface to adjust the parameter values in the nf_conntrack module

Confirm in advance the maximum number of nf_conntrack connections that your applications may use, then run the following commands to set the parameter values of the nf_conntrack module through the sysctl interface.

Note: If your business is characterized by high concurrency and mainly short-lived connections, we recommend that you increase nf_conntrack_max and nf_conntrack_buckets so that the nf_conntrack hash table does not fill up because of the large number of connections. As a general recommendation, set nf_conntrack_max to four times the value of nf_conntrack_buckets.

sudo sysctl -w net.netfilter.nf_conntrack_max=1503232
sudo sysctl -w net.netfilter.nf_conntrack_buckets=375808  # On kernels other than 4.19, this option may not be modifiable at runtime.
sudo sysctl -w net.netfilter.nf_conntrack_tcp_timeout_time_wait=60
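
The example values above follow the recommended 4:1 ratio (1503232 / 4 = 375808). When choosing a different nf_conntrack_max, the matching nf_conntrack_buckets value can be derived the same way. This is a minimal sketch; the function name buckets_for_max is a hypothetical helper, and the script only prints the settings without applying them.

```shell
#!/bin/sh
# Hypothetical helper: derive nf_conntrack_buckets from nf_conntrack_max
# using the recommended 4:1 ratio.
buckets_for_max() {
    echo $(( $1 / 4 ))
}

max=1503232
buckets=$(buckets_for_max "$max")

# Print the resulting settings for review; nothing is changed on the system.
echo "net.netfilter.nf_conntrack_max=$max"
echo "net.netfilter.nf_conntrack_buckets=$buckets"
```

The default values listed earlier follow the same ratio: 262144 / 4 = 65536.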

Note:

  • The values of these parameters are for reference only. Modify the values according to your actual situation. Before making a change, we recommend that you create a snapshot or back up important files to ensure data security.
  • We recommend that you adjust the nf_conntrack_buckets and nf_conntrack_max parameters together. If you change only nf_conntrack_max, the linked lists in the hash table may become excessively long, which lowers query efficiency. If you change only nf_conntrack_buckets, the issue is not resolved and the error is still reported.
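
Values set with sysctl -w do not survive a reboot. One common way to keep them is a sysctl configuration file, sketched below under several assumptions: the file name 90-nf-conntrack.conf and the function name write_conntrack_conf are hypothetical, the nf_conntrack module must already be loaded when the settings are applied, and nf_conntrack_buckets must be writable through sysctl on the running kernel. The values are the reference values from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: write the three reference settings to a sysctl
# configuration file.
# $1: output path (for example /etc/sysctl.d/90-nf-conntrack.conf)
write_conntrack_conf() {
    cat > "$1" <<'EOF'
net.netfilter.nf_conntrack_max = 1503232
net.netfilter.nf_conntrack_buckets = 375808
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 60
EOF
}

# Usage on a live instance (run as root):
#   write_conntrack_conf /etc/sysctl.d/90-nf-conntrack.conf
#   sysctl --system    # reload all sysctl configuration files
```

Writing the file through a function that takes the path as an argument makes it possible to generate and review the file in a temporary location before installing it.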

Solution 2: Use iptables to filter connections that do not need to be tracked

Add the "-j NOTRACK" action to iptables rules to exclude connections that do not need to be tracked, as shown in the following commands. The advantage of this method is that it addresses the root cause: connections marked with NOTRACK are not tracked, so they do not occupy space in the hash table and do not trigger the error.

sudo iptables -t raw -A PREROUTING -p udp -j NOTRACK
sudo iptables -t raw -A PREROUTING -p tcp --dport 22 -j NOTRACK

Note: The commands above disable tracking for UDP traffic and for TCP connections to port 22. They are for reference only. Adjust the rules according to your actual situation.
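
Because the right set of protocols and ports differs per workload, it can help to print candidate rules for review before running them. The following is a minimal dry-run sketch: the function name notrack_rule is a hypothetical helper, and it only prints rules in the same form as the commands above without changing any iptables state.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a raw-table NOTRACK rule.
# $1: protocol (tcp or udp); $2: optional destination port
notrack_rule() {
    if [ -n "${2:-}" ]; then
        echo "iptables -t raw -A PREROUTING -p $1 --dport $2 -j NOTRACK"
    else
        echo "iptables -t raw -A PREROUTING -p $1 -j NOTRACK"
    fi
}

# Reproduce the two reference rules from this section.
notrack_rule udp
notrack_rule tcp 22
```

After rules have been applied with sudo, running sudo iptables -t raw -L PREROUTING -n -v lists them together with their packet counters, which shows whether traffic is matching them.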

Applicable to

  • Elastic Compute Service