Common kernel network parameters of ECS Linux instances and troubleshooting

Last Updated: Mar 05, 2018

This document introduces some Linux kernel parameters and provides solutions for related issues. Note the following before you modify kernel parameters:

  • Base every change on actual need and supporting data. We recommend that you do not modify kernel parameters arbitrarily.
  • Understand the specific function of each parameter. Kernel parameters may differ even between operating systems of the same type or version.
  • Back up the important data in your ECS instance. See Create snapshots for instructions.

Query and modify the kernel parameters of Linux instances

In the /proc/sys/ directory

Query the kernel parameters: Run cat to query the content of the corresponding file. For example, run cat /proc/sys/net/ipv4/tcp_tw_recycle to query the value of net.ipv4.tcp_tw_recycle.

Modify the kernel parameters: Run echo to modify the corresponding file of the kernel parameter. For example, run echo "0" > /proc/sys/net/ipv4/tcp_tw_recycle to change the value of net.ipv4.tcp_tw_recycle to 0.

Note:

  • The /proc/sys/ directory not only provides information about the Linux instance but also allows the system administrator to enable and disable kernel features immediately. Its net subdirectory stores the network kernel parameters currently in effect in the Linux instance. The directory tree mirrors the full name of each parameter: for example, the parameter net.ipv4.tcp_tw_recycle corresponds to the file /proc/sys/net/ipv4/tcp_tw_recycle, and the content of the file is the parameter value.
  • A value modified in the /proc/sys/ directory takes effect only until the next restart, after which the parameter reverts to its previous value. To change a parameter permanently, modify it in the sysctl.conf file.
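
The mapping between a parameter's full name and its file under /proc/sys/ can be sketched in Python as follows. This is a minimal illustration, not part of any official tooling: the helper names sysctl_to_proc_path and read_sysctl are hypothetical, and the plain dot-to-slash rule assumes that no single name component contains a dot (a name that embeds an interface such as eth0.100 would need special handling).

```python
from pathlib import PurePosixPath

PROC_SYS = PurePosixPath("/proc/sys")


def sysctl_to_proc_path(name: str) -> str:
    """Map a dotted parameter name to its file under /proc/sys."""
    return str(PROC_SYS.joinpath(*name.split(".")))


def read_sysctl(name: str) -> str:
    """Return the current value of a parameter (works on Linux only)."""
    with open(sysctl_to_proc_path(name)) as f:
        return f.read().strip()


print(sysctl_to_proc_path("net.ipv4.tcp_tw_recycle"))
# prints /proc/sys/net/ipv4/tcp_tw_recycle
```

Only sysctl_to_proc_path is exercised in the example, because read_sysctl depends on the parameter existing on the machine where the code runs.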

In the sysctl.conf file

Query the kernel parameters: Run sysctl -a to list all kernel parameters currently in effect in your ECS instance. For example:

  net.ipv4.tcp_app_win = 31
  net.ipv4.tcp_adv_win_scale = 2
  net.ipv4.tcp_tw_reuse = 0
  net.ipv4.tcp_frto = 2
  net.ipv4.tcp_frto_response = 0
  net.ipv4.tcp_low_latency = 0
  net.ipv4.tcp_no_metrics_save = 0
  net.ipv4.tcp_moderate_rcvbuf = 1
  net.ipv4.tcp_tso_win_divisor = 3
  net.ipv4.tcp_congestion_control = cubic
  net.ipv4.tcp_abc = 0
  net.ipv4.tcp_mtu_probing = 0
  net.ipv4.tcp_base_mss = 512
  net.ipv4.tcp_workaround_signed_windows = 0
  net.ipv4.tcp_challenge_ack_limit = 1000
  net.ipv4.tcp_limit_output_bytes = 262144
  net.ipv4.tcp_dma_copybreak = 4096
  net.ipv4.tcp_slow_start_after_idle = 1
  net.ipv4.cipso_cache_enable = 1
  net.ipv4.cipso_cache_bucket_size = 10
  net.ipv4.cipso_rbm_optfmt = 0
  net.ipv4.cipso_rbm_strictvalid = 1
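
If you want to inspect this output programmatically, the "name = value" lines can be parsed into a dictionary. The following Python sketch is one possible way to do it; parse_sysctl_output is a hypothetical helper name, and the sample text reuses three lines from the output above.

```python
def parse_sysctl_output(text: str) -> dict:
    """Parse 'name = value' lines, as printed by sysctl -a, into a dict."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip blank lines or lines without '='
        params[key.strip()] = value.strip()
    return params


sample = """\
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_congestion_control = cubic
net.ipv4.tcp_base_mss = 512
"""

params = parse_sysctl_output(sample)
print(params["net.ipv4.tcp_congestion_control"])  # prints cubic
```

To parse live data instead of the sample, you could pass in the output of subprocess.run(["sysctl", "-a"], capture_output=True, text=True).stdout.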

Modify the kernel parameters:

  1. To change a parameter for the running system only, run /sbin/sysctl -w kernel.parameter="example", for example sysctl -w net.ipv4.tcp_tw_recycle="0".
  2. To make the change permanent, run vi /etc/sysctl.conf and modify the parameters in the /etc/sysctl.conf file.
  3. Run /sbin/sysctl -p to load the settings from /etc/sysctl.conf and activate the configuration.

Note: Restart your ECS instance after you modify kernel parameters, because the kernel may be in an unstable state after the change.
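
Editing /etc/sysctl.conf by hand can leave duplicate or conflicting entries for the same parameter. The following Python sketch shows one way to set a value idempotently in the text of such a file. The function name set_sysctl_conf_value is hypothetical, and the sketch works on a string only; it deliberately does not write to /etc/sysctl.conf.

```python
def set_sysctl_conf_value(conf_text: str, name: str, value: str) -> str:
    """Return conf_text with 'name = value' set exactly once.

    An existing entry is replaced in place, later duplicates are dropped,
    and the entry is appended if the parameter was not present.
    """
    new_line = f"{name} = {value}"
    out = []
    replaced = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        is_comment = stripped.startswith(("#", ";"))
        key = stripped.partition("=")[0].strip()
        if not is_comment and "=" in stripped and key == name:
            if not replaced:
                out.append(new_line)
                replaced = True
            continue  # drop the old entry and any duplicates
        out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"


conf = "# kernel settings\nnet.ipv4.tcp_tw_recycle = 1\n"
print(set_sysctl_conf_value(conf, "net.ipv4.tcp_tw_recycle", "0"), end="")
```

After writing the result back to the file, you would still run /sbin/sysctl -p as described in the steps above.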

Troubleshooting common problems with Linux kernel parameters

Packet loss caused by a full hash table

The kernel parameters involved here are:

  • net.netfilter.nf_conntrack_buckets
  • net.nf_conntrack_max

Symptom

The ECS Linux instance experiences intermittent packet loss and you cannot connect to the instance. Link testing tools such as tracert or MTR show no anomaly. At the same time, the error message "table full, dropping packet." is repeated many times in the system log, as shown in the following code block.

  Feb 6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.
  Feb 6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.
  Feb 6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.
  Feb 6 16:05:07 i-*** kernel: nf_conntrack: table full, dropping packet.

Analysis

ip_conntrack is the kernel module that tracks connections for NAT in a Linux instance. The module uses a hash table to record each established TCP connection. When the hash table is full, new packets are dropped and the kernel logs the nf_conntrack: table full, dropping packet error. The Linux instance reserves space to maintain each TCP connection, and the size of this space depends on the values of nf_conntrack_buckets and nf_conntrack_max. The default value of nf_conntrack_max is 4 times that of nf_conntrack_buckets. Because nf_conntrack_buckets cannot be modified after the Linux instance starts, the usual approach is to increase the value of nf_conntrack_max.

Note: Because the operating system consumes memory for every tracked connection, make sure the Linux instance has enough memory before you increase the value of nf_conntrack_max. The appropriate value depends on the state of the Linux instance.
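
To judge how close the table is to overflowing before you change anything, you can compare the current number of tracked connections with the limit. The following Python sketch illustrates the arithmetic. The function names are hypothetical, the value 16384 is only an example, and read_conntrack_usage assumes that the nf_conntrack module is loaded so that the nf_conntrack_count and nf_conntrack_max files exist under /proc/sys/net/netfilter/.

```python
from pathlib import Path

NF_DIR = Path("/proc/sys/net/netfilter")


def default_conntrack_max(buckets: int) -> int:
    """Default limit as described above: 4 times the number of buckets."""
    return 4 * buckets


def conntrack_usage(count: int, maximum: int) -> float:
    """Fraction of the conntrack table currently in use."""
    if maximum <= 0:
        raise ValueError("maximum must be positive")
    return count / maximum


def read_conntrack_usage() -> float:
    """Read the live counters (Linux with nf_conntrack loaded only)."""
    count = int((NF_DIR / "nf_conntrack_count").read_text())
    maximum = int((NF_DIR / "nf_conntrack_max").read_text())
    return conntrack_usage(count, maximum)


print(default_conntrack_max(16384))  # prints 65536
print(conntrack_usage(50, 200))      # prints 0.25
```

A usage value that stays close to 1.0 under normal load suggests the limit is too low for the workload.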

Solution

  1. Log on to the ECS instance by using the management terminal.
  2. Run vi /etc/sysctl.conf to edit the kernel configuration.
  3. Set the maximum number of entries in the hash table: net.netfilter.nf_conntrack_max = 655350.
  4. Set the timeout for established connections: net.netfilter.nf_conntrack_tcp_timeout_established = 1200. The default value is 432000 (seconds).
  5. Run sysctl -p to activate the configuration.

The time wait bucket table overflow error

The kernel parameter involved here is net.ipv4.tcp_max_tw_buckets.

Symptom

You find many TIME_WAIT connections when you run netstat -ant|grep TIME_WAIT|wc -l to count the TCP connections in the TIME_WAIT state.

The /var/log/messages log is filled with error messages similar to kernel: TCP: time wait bucket table overflow, as in the following example:

  Feb 18 12:28:38 i-*** kernel: TCP: time wait bucket table overflow
  Feb 18 12:28:44 i-*** kernel: printk: 227 messages suppressed.
  Feb 18 12:28:44 i-*** kernel: TCP: time wait bucket table overflow
  Feb 18 12:28:52 i-*** kernel: printk: 121 messages suppressed.
  Feb 18 12:28:52 i-*** kernel: TCP: time wait bucket table overflow
  Feb 18 12:28:53 i-*** kernel: printk: 351 messages suppressed.
  Feb 18 12:28:53 i-*** kernel: TCP: time wait bucket table overflow
  Feb 18 12:28:59 i-*** kernel: printk: 319 messages suppressed.
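
Because printk rate-limits repeated messages, the visible overflow lines understate how often the limit was hit. The following Python sketch tallies both the logged overflow messages and the suppressed counts from the example above. The helper name summarize_overflow is hypothetical, and treating every suppressed message as an overflow message is an assumption, since printk does not say which messages it suppressed.

```python
import re

OVERFLOW = "TCP: time wait bucket table overflow"
SUPPRESSED = re.compile(r"printk: (\d+) messages suppressed")


def summarize_overflow(lines):
    """Return (logged overflow messages, messages suppressed by printk)."""
    logged = 0
    suppressed = 0
    for line in lines:
        if OVERFLOW in line:
            logged += 1
            continue
        match = SUPPRESSED.search(line)
        if match:
            suppressed += int(match.group(1))
    return logged, suppressed


sample = [
    "Feb 18 12:28:38 i-*** kernel: TCP: time wait bucket table overflow",
    "Feb 18 12:28:44 i-*** kernel: printk: 227 messages suppressed.",
    "Feb 18 12:28:44 i-*** kernel: TCP: time wait bucket table overflow",
    "Feb 18 12:28:52 i-*** kernel: printk: 121 messages suppressed.",
    "Feb 18 12:28:52 i-*** kernel: TCP: time wait bucket table overflow",
    "Feb 18 12:28:53 i-*** kernel: printk: 351 messages suppressed.",
    "Feb 18 12:28:53 i-*** kernel: TCP: time wait bucket table overflow",
    "Feb 18 12:28:59 i-*** kernel: printk: 319 messages suppressed.",
]

print(summarize_overflow(sample))  # prints (4, 1018)
```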

Analysis

The parameter net.ipv4.tcp_max_tw_buckets limits the number of TCP connections in the TIME_WAIT state in the kernel. When the number of connections that are in TIME_WAIT or about to enter TIME_WAIT exceeds the value of net.ipv4.tcp_max_tw_buckets, the kernel prints this error message in the message log and closes the excess connections immediately instead of moving them into the TIME_WAIT state.

Note: We recommend that you increase the value of net.ipv4.tcp_max_tw_buckets according to the actual situation, and at the same time optimize how your application uses TCP connections.

Solution

  1. Run netstat -anp |grep tcp |wc -l to count the TCP connections.
  2. Run vi /etc/sysctl.conf to look up the value of net.ipv4.tcp_max_tw_buckets. Compare the count with this value to check whether excess connections exist.
  3. Increase the value of net.ipv4.tcp_max_tw_buckets to raise the limit.
  4. Run sysctl -p to activate the configuration.
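
As an alternative to piping netstat through grep, the per-state connection counts can be derived from /proc/net/tcp, where the fourth field of each row is the connection state as a hexadecimal code. The following Python sketch is a minimal illustration. The function names are hypothetical, the sample rows are shortened to the first four fields for readability, and IPv6 connections (listed in /proc/net/tcp6) are not covered.

```python
from collections import Counter

# Hexadecimal state codes used in /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text: str) -> Counter:
    """Count connections per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts


def reaches_tw_limit(counts: Counter, max_tw_buckets: int) -> bool:
    """True if the TIME_WAIT count has reached the configured limit."""
    return counts["TIME_WAIT"] >= max_tw_buckets


sample = (
    "  sl  local_address rem_address   st\n"
    "   0: 0100007F:0019 00000000:0000 0A\n"
    "   1: 0100007F:1F90 0100007F:C350 06\n"
    "   2: 0100007F:1F90 0100007F:C351 06\n"
)

counts = count_tcp_states(sample)
print(counts["TIME_WAIT"])             # prints 2
print(reaches_tw_limit(counts, 5000))  # prints False
```

On a live instance you could pass open("/proc/net/tcp").read() instead of the sample. The same counter also reports the FIN_WAIT2 and CLOSE_WAIT states.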

Too many FIN_WAIT2 TCP connections in Linux instance

The kernel parameter involved here is net.ipv4.tcp_fin_timeout.

Symptom

Too many TCP connections in FIN_WAIT2 state.

Analysis

  • In an HTTP service, the server closes the TCP connection when the KEEPALIVE timeout expires, and the server side of the connection enters the FIN_WAIT2 state as a result.
  • TCP/IP allows half-closed connections, so a connection in the FIN_WAIT2 state does not time out (unlike the TIME_WAIT state). If the client does not close it, the connection remains in FIN_WAIT2 until the Linux instance restarts. An ever-growing number of FIN_WAIT2 connections can eventually crash the kernel.
  • You can decrease the value of net.ipv4.tcp_fin_timeout appropriately so that the Linux instance reclaims FIN_WAIT2 connections sooner.

Solution

  1. Run vi /etc/sysctl.conf to modify or add the following settings:

    net.ipv4.tcp_syncookies = 1
    net.ipv4.tcp_fin_timeout = 30
    net.ipv4.tcp_max_syn_backlog = 8192
    net.ipv4.tcp_max_tw_buckets = 5000
  2. Run sysctl -p to activate the configuration.

Note: Because TCP connections in the FIN_WAIT2 state subsequently enter the TIME_WAIT state, also see the section on the time wait bucket table overflow error.

Too many CLOSE_WAIT TCP connections in Linux instance

Symptom

You find a large number of CLOSE_WAIT TCP connections when you run netstat -atn|grep CLOSE_WAIT|wc -l in your Linux instance.

Analysis

Either the peer or the local end can initiate the request to close a TCP connection. If the peer initiates the request but the local end does not close the connection, the connection enters the CLOSE_WAIT state. A connection in this state is half-closed and can no longer exchange data with the peer in any useful way, so release it promptly. We recommend that your program logic checks in a timely manner whether the peer has closed a connection, and closes the local end when it has.

Solution

In programming languages such as Java and C, you can use the result of read or write to detect a connection that the peer has closed. For example:

Java:

  1. Call read and check the result. A return value of -1 means that the end of the stream has been reached, that is, the peer has closed the connection.
  2. Call close to close the connection.

C:

Check the return value of read:

  • If it is 0, close the connection.
  • If it is less than 0, check errno. If errno is not EAGAIN, close the connection.
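
The same check can be sketched in Python, where recv returns an empty bytes object when the peer has closed the connection and raises BlockingIOError (the equivalent of EAGAIN) when a non-blocking socket simply has no data yet. This is a minimal sketch under stated assumptions: peer_has_closed is a hypothetical helper, and a local socketpair stands in for a real client connection.

```python
import socket


def peer_has_closed(sock: socket.socket) -> bool:
    """Return True if the peer has closed its end of the connection.

    The socket is peeked at without blocking, so pending data is not
    consumed. The original timeout setting is restored afterwards.
    """
    old_timeout = sock.gettimeout()
    sock.setblocking(False)
    try:
        data = sock.recv(1, socket.MSG_PEEK)
    except BlockingIOError:
        return False  # EAGAIN: no data yet, connection still open
    except ConnectionError:
        return True   # reset or broken connection: treat as closed
    finally:
        sock.settimeout(old_timeout)
    return data == b""  # empty read means the peer sent FIN


local, peer = socket.socketpair()
print(peer_has_closed(local))  # prints False
peer.close()
print(peer_has_closed(local))  # prints True
local.close()                  # release the socket so it leaves CLOSE_WAIT
```

The final close call is the important step: a connection stays in CLOSE_WAIT for as long as the local application keeps the socket open after the peer has closed.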

Clients behind NAT fail to access ECS or RDS

The kernel parameters involved here are:

  • net.ipv4.tcp_tw_recycle
  • net.ipv4.tcp_timestamps

Symptom

A client behind NAT fails to access ECS or RDS. This includes ECS instances of the VPC network type that are configured with SNAT. Packet loss detection tools show that the server does not respond to the SYN packets sent by the client.

Analysis

If both net.ipv4.tcp_tw_recycle and net.ipv4.tcp_timestamps are set to 1, the server checks the timestamp in each TCP packet. If the timestamps in packets from the same source IP address are not increasing, the server does not respond to the packet. Clients behind the same NAT device share one source IP address, but their clocks may differ. From the server's point of view the timestamps then do not increase steadily, so the server may silently drop packets from some of these clients.
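
The effect can be illustrated with a deliberately simplified model in Python. It is not the kernel's actual algorithm, which applies additional conditions such as time windows and timestamp wraparound. The class name PerHostTimestampCheck, the timestamp values, and the address 203.0.113.10 (taken from a documentation-only range) are all assumptions made for illustration.

```python
class PerHostTimestampCheck:
    """Simplified model: remember the newest timestamp seen per source IP
    and reject packets whose timestamp is older than that."""

    def __init__(self):
        self.last_seen = {}

    def accept(self, src_ip: str, tsval: int) -> bool:
        last = self.last_seen.get(src_ip)
        if last is not None and tsval < last:
            return False  # looks like an old duplicate, so it is dropped
        self.last_seen[src_ip] = tsval
        return True


server = PerHostTimestampCheck()
nat_ip = "203.0.113.10"  # both clients appear with this address after NAT

print(server.accept(nat_ip, 500_000))  # client A: prints True
print(server.accept(nat_ip, 120_000))  # client B, older clock: prints False
print(server.accept(nat_ip, 500_100))  # client A again: prints True
```

Client B is rejected although it is a legitimate host, simply because its timestamp is lower than the one the server last recorded for the shared address.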

Solution

  • When the remote server is an ECS instance, set net.ipv4.tcp_tw_recycle to 0 on the server.
  • When the remote server is a PaaS service such as RDS, you cannot modify its kernel parameters directly. In this case, set net.ipv4.tcp_tw_recycle and net.ipv4.tcp_timestamps to 0 on the client.

Linux kernel parameters referenced in this document

| Parameter | Description |
| --- | --- |
| net.ipv4.tcp_max_syn_backlog | Determines the maximum number of TCP connections in the SYN_RECV state. SYN_RECV is the phase of the three-way handshake in which the Linux instance has received a SYN, has replied with SYN+ACK, and is waiting for the final ACK from the other party. |
| net.ipv4.tcp_syncookies | Determines whether TCP SYN cookies are enabled. The kernel must be compiled with CONFIG_SYN_COOKIES. SYN cookies protect a socket from overload when it receives too many connection requests. When the parameter is set to 1 and the SYN_RECV queue is full, the kernel changes how it replies to SYN packets: the initial sequence number of the SYN+ACK response is computed from the source IP, source port, destination IP, destination port, and a time value. Because this sequence number is not generated in the normal way, an attacker cannot respond to it correctly, whereas a legitimate client that receives the SYN+ACK responds correctly. Note: When tcp_syncookies is enabled, the parameter net.ipv4.tcp_max_syn_backlog is ignored. |
| net.ipv4.tcp_synack_retries | Determines the number of times that the SYN+ACK packet is retransmitted in the SYN_RECV state. |
| net.ipv4.tcp_abort_on_overflow | When this parameter is set to 1 and the ECS instance receives more requests in a short time than the associated application can process, the instance sends a Reset packet to terminate the connections directly. We recommend that you improve the processing capability of the application instead of relying on resets. The default value is 0. |
| net.core.somaxconn | Determines the maximum length of the listening queue of each port. It is a global parameter. It is related to net.ipv4.tcp_max_syn_backlog: the latter is the upper limit for half-open connections in the three-way handshake, whereas net.core.somaxconn is the upper limit for ESTABLISHED connections waiting in the queue. Increase this parameter if your instance runs a high-load business. The backlog argument of listen(2) also sets the upper limit of ESTABLISHED connections on a listening port. When the value of backlog is greater than net.core.somaxconn, the kernel uses the net.core.somaxconn value. |
| net.core.netdev_max_backlog | When the kernel processes packets more slowly than the network interface receives them, the excess packets are stored in a receive queue. This parameter determines the maximum length of that queue. |
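
The interaction between the backlog argument of listen(2) and net.core.somaxconn described above amounts to taking the smaller of the two values. The following Python sketch shows this rule. The function names are hypothetical, the numbers are examples only, and read_somaxconn works only on a Linux system.

```python
def effective_backlog(requested: int, somaxconn: int) -> int:
    """Queue length the kernel applies for a listen() call: the requested
    backlog, capped at net.core.somaxconn."""
    return min(requested, somaxconn)


def read_somaxconn(path: str = "/proc/sys/net/core/somaxconn") -> int:
    """Read the current net.core.somaxconn value (Linux only)."""
    with open(path) as f:
        return int(f.read())


print(effective_backlog(1024, 128))  # prints 128
print(effective_backlog(64, 128))    # prints 64
```

If an application requests a large backlog but net.core.somaxconn is left at a small value, raising the application setting alone has no effect.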
