Resolve CentOS instance hang or reboot caused by too many vSwitch instances

A CentOS 7 instance running kernel 3.10.0-514 may hang or reboot abnormally when it is in a Virtual Private Cloud (VPC) CIDR block that contains a large number of instances. This topic explains the cause of this Linux kernel issue and provides solutions.

Symptoms

An instance created from a CentOS public image or custom image may hang or experience an abnormal reboot under the following conditions:

The instance is running kernel version 3.10.0-514.
Note
Run the uname -a command to check the kernel version of the CentOS system.
The number of instances in the same VPC CIDR block exceeds 128.
Note
The more instances in the same VPC CIDR block, the more likely this issue occurs.
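The first condition above can be checked with a short script. The following is only a sketch: the function name is_affected_kernel is made up for illustration, and it assumes that every affected build reports a release string that starts with 3.10.0-514.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): succeeds when the given
# kernel release string belongs to the 3.10.0-514 series.
is_affected_kernel() {
    case "$1" in
        3.10.0-514.*) return 0 ;;
        *) return 1 ;;
    esac
}

# uname -r prints only the kernel release, for example 3.10.0-514.el7.x86_64.
release=$(uname -r)
if is_affected_kernel "$release"; then
    echo "kernel $release is in the affected 3.10.0-514 series"
else
    echo "kernel $release is not in the 3.10.0-514 series"
fi
```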

Cause

If the number of instances in a VPC CIDR block exceeds the value of the net.ipv4.neigh.default.gc_thresh1 kernel parameter, which defaults to 128, and the instances communicate with each other, the number of entries in the kernel's Address Resolution Protocol (ARP) cache table can also exceed this limit.

Exceeding this limit triggers the kernel's garbage collection mechanism for ARP entries. In kernel version 3.10.0-514 for CentOS, a race condition exists between the ARP garbage collection process and other kernel functions that handle ARP entries. This race can crash the kernel, which shows up as an abnormal reboot or a system hang. The following are examples of kernel stack traces from this type of crash:

PID: 35 TASK: ffff88023fe13ec0 CPU: 0 COMMAND: "kworker/0:1"
[exception RIP: __write_lock_failed+9]
RIP: ffffffff813275c9 RSP: ffff88023f7e3dc8 RFLAGS: 00000297
RAX: ffff88019c338000 RBX: ffff880035c89800 RCX: 000000000000000a
RDX: 0000000000000372 RSI: 000000012eeea6c0 RDI: ffff880035c8982c
RBP: ffff88023f7e3dc8 R8: ffffffff81aa7858 R9: 0001f955a06a7850
R10: 0001f955a06a7850 R11: 0000000000000000 R12: 0000000000000372
R13: ffffffff81aa7850 R14: ffff880035c89828 R15: ffff88019c339b90
CS: 0010 SS: 0018
#0 [ffff88023f7e3dd0] _raw_write_lock at ffffffff8168e7d7
#1 [ffff88023f7e3de0] neigh_periodic_work at ffffffff8157f3ac
#2 [ffff88023f7e3e20] process_one_work at ffffffff810a845b
#3 [ffff88023f7e3e68] worker_thread at ffffffff810a9296
#4 [ffff88023f7e3ec8] kthread at ffffffff810b0a4f
#5 [ffff88023f7e3f50] ret_from_fork at ffffffff81697758

PID: 0 TASK: ffff880173afce70 CPU: 20 COMMAND: "swapper/20"
[exception RIP: native_halt+5]
RIP: ffffffff81060ff5 RSP: ffff880173b1b878 RFLAGS: 00000046
RAX: 000000000000912c RBX: ffff881fbf30f380 RCX: 000000000000912e
RDX: 000000000000912c RSI: 000000000000912e RDI: ffff8801736a0000
RBP: ffff880173b1b878 R8: 0000000000000086 R9: 0000000000000000
R10: 0000000000000000 R11: ffff880173b1b95e R12: 0000000000000082
R13: 0000000000000014 R14: 0000000000000000 R15: 0000000000000e20
CS: 0010 SS: 0018
#0 [ffff880173b1b880] kvm_lock_spinning at ffffffff81060b5a
#1 [ffff880173b1b8b0] __raw_callee_save_kvm_lock_spinning at ffffffff8105ff05
#2 [ffff880173b1b900] _raw_spin_lock_irqsave at ffffffff8168dcd3
#3 [ffff880173b1b940] mod_timer at ffffffff81098e24
#4 [ffff880173b1b988] add_timer at ffffffff81098fe8
#5 [ffff880173b1b998] fbcon_add_cursor_timer at ffffffff81381069
#6 [ffff880173b1b9c0] fbcon_cursor at ffffffff8138422a
#7 [ffff880173b1ba10] hide_cursor at ffffffff813f6628
#8 [ffff880173b1ba28] vt_console_print at ffffffff813f8058
#9 [ffff880173b1ba90] call_console_drivers.constprop.15 at ffffffff81086ca1
#10 [ffff880173b1bab8] console_unlock at ffffffff810884be
#11 [ffff880173b1baf0] vprintk_emit at ffffffff810889d4
#12 [ffff880173b1bb60] vprintk_default at ffffffff81088d49
#13 [ffff880173b1bb70] printk at ffffffff8167f854
#14 [ffff880173b1bbd0] no_context at ffffffff8167ecbb
#15 [ffff880173b1bc20] __bad_area_nosemaphore at ffffffff8167ee29
#16 [ffff880173b1bc68] bad_area_nosemaphore at ffffffff8167ef93
#17 [ffff880173b1bc78] __do_page_fault at ffffffff81691f1e
#18 [ffff880173b1bcd8] trace_do_page_fault at ffffffff81692176
#19 [ffff880173b1bd18] do_async_page_fault at ffffffff8169181b
#20 [ffff880173b1bd30] async_page_fault at ffffffff8168e3b8
[exception RIP: get_next_timer_interrupt+440]
RIP: ffffffff810991a8 RSP: ffff880173b1bde0 RFLAGS: 00010017
RAX: 0000000000000000 RBX: 0098950e05e51640 RCX: 0000ffbc0000ffbc
RDX: 0000000b3fe32cf2 RSI: ffff8801736a1318 RDI: 0000000affe32d
RBP: ffff880173b1be30 R8: 0000000000000001 R9: 000000000000002f
R10: 000000000000002d R11: ffff8801736a1028 R12: 0000000affe32cf2
R13: ffff8801736a0000 R14: ffff880173b1bde8 R15: ffff880173b1be00
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#21 [ffff880173b1be38] tick_nohz_stop_sched_tick at ffffffff810f3418
#22 [ffff880173b1be90] __tick_nohz_idle_enter at ffffffff810f35be
#23 [ffff880173b1bec0] tick_nohz_idle_enter at ffffffff810f3aed
#24 [ffff880173b1bed0] cpu_startup_entry at ffffffff810e7c13
#25 [ffff880173b1bf28] start_secondary at ffffffff8104f11a
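
To see how close an instance is to the condition described under Cause, you can compare the current number of IPv4 neighbor (ARP) entries with gc_thresh1. This is a minimal sketch: exceeds_threshold is a hypothetical helper name, and the script assumes the ip command from iproute2 is installed and that the threshold can be read from /proc.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the entry count ($1) is greater
# than the threshold ($2).
exceeds_threshold() {
    [ "$1" -gt "$2" ]
}

thresh_file=/proc/sys/net/ipv4/neigh/default/gc_thresh1

# Only run the check when both the ip command and the threshold file exist.
if command -v ip >/dev/null 2>&1 && [ -r "$thresh_file" ]; then
    # One line of output per IPv4 neighbor entry.
    entries=$(ip -4 neigh show | wc -l | tr -d ' ')
    thresh=$(cat "$thresh_file")
    if exceeds_threshold "$entries" "$thresh"; then
        echo "ARP entries ($entries) exceed gc_thresh1 ($thresh)"
    else
        echo "ARP entries ($entries) do not exceed gc_thresh1 ($thresh)"
    fi
fi
```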

Resolution

Permanent solution

Run the sudo yum update kernel command to upgrade the kernel to version 3.10.0-693.21.1.el7.x86_64 or later.

Note

After upgrading the kernel, you must restart the instance. For more information, see Restart an instance.
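
After the restart, you may want to confirm that the running kernel is at least the version named above. The sketch below is one way to do that, not an official check: kernel_at_least is a hypothetical helper, and it assumes a sort command that supports the -V (version sort) option, as GNU coreutils on CentOS 7 does.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when kernel release $1 is at least
# version $2. The distribution suffix (.el7.x86_64 and similar) is
# stripped first so that only the numeric parts are compared.
kernel_at_least() {
    current=${1%%.el*}
    minimum=$2
    # With version sort, the smaller of the two values comes first.
    lowest=$(printf '%s\n%s\n' "$minimum" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$minimum" ]
}

running=$(uname -r)
if kernel_at_least "$running" "3.10.0-693.21.1"; then
    echo "running kernel $running is 3.10.0-693.21.1 or later"
else
    echo "running kernel $running is older than 3.10.0-693.21.1"
fi
```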

Temporary solutions

If you cannot immediately upgrade the kernel, use one of the following workarounds to mitigate the issue.

Method 1

Run the following commands as the root user to adjust the kernel parameters. Set gc_thresh1 to a value greater than the number of instances in the VPC CIDR block, and make sure that gc_thresh3 >= gc_thresh2 >= gc_thresh1. For example, you can set the three kernel parameters as follows:

sysctl -w net.ipv4.neigh.default.gc_thresh1=4096
sysctl -w net.ipv4.neigh.default.gc_thresh2=8192
sysctl -w net.ipv4.neigh.default.gc_thresh3=8192
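
Before applying values of your own, it can help to check them against the two rules stated above. The following sketch does only that: thresholds_ok is a hypothetical helper, and the instance count of 3000 is an invented example, not a recommendation.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when gc_thresh1 is greater than the
# instance count and gc_thresh3 >= gc_thresh2 >= gc_thresh1.
# Arguments: instance count, gc_thresh1, gc_thresh2, gc_thresh3.
thresholds_ok() {
    instances=$1
    t1=$2
    t2=$3
    t3=$4
    [ "$t1" -gt "$instances" ] && [ "$t2" -ge "$t1" ] && [ "$t3" -ge "$t2" ]
}

# Example: 3000 instances (made-up figure) with the values shown above.
if thresholds_ok 3000 4096 8192 8192; then
    echo "threshold values are consistent"
else
    echo "threshold values violate the rules"
fi
```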

Note

To make these settings persist across reboots, add them to the /etc/sysctl.conf file. Otherwise, the settings are lost when the instance restarts.
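
One possible way to add the settings without leaving duplicate lines is sketched below. The function set_sysctl_conf is a hypothetical helper written for this example. It rewrites the file you pass to it, so back up /etc/sysctl.conf before using it on a real system.

```shell
#!/bin/sh
# Hypothetical helper: sets "key = value" in a sysctl-style file.
# Any existing assignment of the same key is dropped and the new one is
# appended, so running it twice does not create duplicates.
# Arguments: file, key, value.
set_sysctl_conf() {
    file=$1
    key=$2
    value=$3
    tmp="$file.tmp.$$"
    touch "$file"
    # Keep every line whose parameter name (text before "=") differs from key.
    awk -F= -v k="$key" '{
        name = $1
        gsub(/[[:space:]]/, "", name)
        if (name != k) print
    }' "$file" > "$tmp"
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Write back in place so the ownership and mode of the file are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

As the root user, you could then call set_sysctl_conf /etc/sysctl.conf net.ipv4.neigh.default.gc_thresh1 4096, repeat the call for gc_thresh2 and gc_thresh3, and run sysctl -p to load the file.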

Method 2

During network planning, limit the number of instances in a single VPC CIDR block to avoid this issue.
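
As a rough planning aid, the total number of IPv4 addresses in a CIDR block follows from its prefix length, which gives an upper bound on how many instances the block can hold. The sketch below uses a hypothetical helper, addresses_in_prefix, and counts all addresses in the block. Any addresses that the platform reserves would lower the usable number further.

```shell
#!/bin/sh
# Hypothetical helper: total number of IPv4 addresses in a block with the
# given prefix length, that is, 2^(32 - prefix).
addresses_in_prefix() {
    echo $(( 1 << (32 - $1) ))
}

for prefix in 24 25 26; do
    echo "/$prefix -> $(addresses_in_prefix "$prefix") addresses"
done
```

A block with a prefix length of 25 or longer contains at most 128 addresses, so the number of instances in it cannot exceed the default gc_thresh1 value of 128.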