On Elastic Compute Service (ECS) Bare Metal instances, the kdump service that comes with the operating system may fail to generate crash dump files. This topic describes the cause of and solutions to this issue.
Problem description
- Instances of the sixth-generation ECS Bare Metal Instance families (including ebmg6, ebmc6, and ebmr6) use the following images:
  - CentOS 8.3 and earlier
  - Ubuntu 16 and Ubuntu 18
  - Debian 10
  - Alibaba Cloud Linux 2 with a kernel version earlier than 4.19.91-24.al7
- Instances of the seventh-generation ECS Bare Metal Instance families (including ebmg7, ebmc7, and ebmr7) use Debian 10 images.
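To check whether an Alibaba Cloud Linux 2 instance runs a kernel earlier than 4.19.91-24.al7, the running release can be compared against that version. The following is a minimal sketch, not an official tool: the `version_lt` helper is a hypothetical name, it assumes a `sort` that supports the `-V` (version sort) option, and it assumes the release string ends in `.x86_64`.

```shell
# Return success if version $1 sorts strictly before version $2.
# Relies on the version-sort option (-V) of sort.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Compare the running kernel release with the version named in this topic.
# The architecture suffix is stripped first (assumed to be .x86_64).
release=$(uname -r)
release=${release%.x86_64}
if version_lt "$release" "4.19.91-24.al7"; then
    echo "Kernel $release is earlier than 4.19.91-24.al7"
else
    echo "Kernel $release is not earlier than 4.19.91-24.al7"
fi
```

The comparison is only meaningful on Alibaba Cloud Linux 2. For the other images in the list, check the distribution version instead.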
Possible cause
In the crashkernel phase, the pci_resource of Elastic Block Storage (EBS) device vda cannot be allocated. As a result, crash dump files cannot be generated. The root cause of this issue is that the instance type is incompatible with the operating system.
A command output similar to the following one is returned.
Solutions
- Method 1: Upgrade the operating system kernel to version 5.10.
- Method 2: Add the following patches to the operating system and rebuild a kernel:
Benjamin Herrenschmidt (1):
  PCI: Don't auto-realloc if we're preserving firmware config
Kelsey Skunberg (1):
  PCI: Make pci_hotplug_io_size, mem_size, and bus_size private
Logan Gunthorpe (1):
  PCI: Don't disable bridge BARs when assigning bus resources
Nicholas Johnson (2):
  PCI: Add "pci=hpmmiosize" and "pci=hpmmioprefsize" parameters
  PCI: Avoid double hpmemsize MMIO window assignment
Note: Some kernel versions already contain some of the preceding patches, so add only the patches that the kernel is missing. For example, kernel version 4.19 of Debian 10 already contains the first and third patches, but the second, fourth, and fifth patches must be added before the operating system can use the kdump service.
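One way to see which of these patches a kernel source tree already contains is to search its commit history for the patch subject lines. The sketch below assumes that the kernel source is a git checkout whose history records the patches under their original subject lines, which is not true of every distribution source package. `patch_status` and the `KERNEL_TREE` variable are hypothetical names used only for illustration.

```shell
# Print "present" if a commit with exactly this subject line exists in the
# history of the git tree at $1, otherwise print "missing".
patch_status() {
    tree=$1
    subject=$2
    if git -C "$tree" log --fixed-strings --format=%s --grep="$subject" |
        grep -qxF -- "$subject"; then
        echo present
    else
        echo missing
    fi
}

# Check the five patches from this topic when KERNEL_TREE points to a checkout.
if [ -n "${KERNEL_TREE:-}" ]; then
    while IFS= read -r subject; do
        printf '%s: %s\n' "$(patch_status "$KERNEL_TREE" "$subject")" "$subject"
    done <<'EOF'
PCI: Don't auto-realloc if we're preserving firmware config
PCI: Make pci_hotplug_io_size, mem_size, and bus_size private
PCI: Don't disable bridge BARs when assigning bus resources
PCI: Add "pci=hpmmiosize" and "pci=hpmmioprefsize" parameters
PCI: Avoid double hpmemsize MMIO window assignment
EOF
fi
```

A result of "missing" only means that no commit with that subject was found. A backport that was applied under a different subject or as a source-package patch would not be detected this way.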
In addition to upgrading the kernel version and adding patches to the kernel, you must take note of the following items:
- Configure the crashkernel parameter to adjust the amount of memory that the operating system reserves for the crash kernel. Recommended settings: crashkernel=0M-2G:0M,2G-8G:192M,8G-:256M
- If too little memory is reserved, a failure occurs in the crashkernel phase, and crash dump files cannot be generated.
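The crashkernel values in this topic use the kernel's range syntax: each comma-separated entry has the form start-[end]:size, and the first range that contains the system memory size determines how much memory is reserved. The following sketch evaluates such a value for a given memory size so that the effect of a setting can be checked before it is applied. The function names are hypothetical, only the M and G suffixes are handled, and the start of a range is treated as inclusive and the end as exclusive.

```shell
# Convert a size such as 192M or 2G to whole megabytes.
# Only the M and G suffixes used in this topic are handled.
to_mb() {
    case "$1" in
        *G) echo "$(( ${1%G} * 1024 ))" ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# Print how many megabytes a range-style crashkernel value reserves
# on a system with $1 megabytes of memory. $2 is the crashkernel value.
crashkernel_reserved_mb() {
    total_mb=$1
    old_ifs=$IFS
    IFS=','
    set -- $2                      # split the value into range:size entries
    IFS=$old_ifs
    for entry in "$@"; do
        range=${entry%%:*}         # for example 2G-8G or 8G-
        size=${entry#*:}           # for example 192M
        start=$(to_mb "${range%%-*}")
        end=${range#*-}            # empty when the range has no upper bound
        if [ "$total_mb" -ge "$start" ] &&
            { [ -z "$end" ] || [ "$total_mb" -lt "$(to_mb "$end")" ]; }; then
            to_mb "$size"
            return 0
        fi
    done
    echo 0                         # no range matched, nothing is reserved
}

# Example: the recommended settings on an instance with 4 GB of memory.
crashkernel_reserved_mb 4096 '0M-2G:0M,2G-8G:192M,8G-:256M'
```

With the recommended settings, the sketch reports 0 MB below 2 GB of memory, 192 MB from 2 GB up to 8 GB, and 256 MB from 8 GB upward.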
For example, perform the following steps to set the crashkernel parameter so that 256 MB of memory is reserved on systems that have 384 MB of memory or more:
- Open the /etc/default/grub.d/kdump-tools.cfg file.
vim /etc/default/grub.d/kdump-tools.cfg
- Press the i key to enter the edit mode and change the crashkernel parameter settings to the following content:
GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=384M-:256M"
After the parameter is changed, press the Esc key, enter :wq, and press the Enter key to save and close the file.
- Update the configurations of GRand Unified Bootloader (GRUB).
update-grub
- Restart the ECS instance for the configurations to take effect.
We recommend that you restart ECS instances during off-peak hours to reduce the impact on your business operations caused by instance restarts. For more information about how to restart an ECS instance, see Reboot the instance.
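After the restart, it can be useful to confirm that the new crashkernel value reached the kernel command line and that memory was actually reserved. The following is a minimal sketch under a few assumptions: `crashkernel_arg` is a hypothetical helper, it keeps the last crashkernel= occurrence on the assumption that a later value overrides an earlier one, and the /sys/kernel/kexec_crash_size and /sys/kernel/kexec_crash_loaded files are read only if they exist on the system.

```shell
# Print the value of the last crashkernel= argument in a kernel command line.
# Returns a non-zero status if the command line has no such argument.
crashkernel_arg() {
    value=
    for word in $1; do
        case "$word" in
            crashkernel=*) value=${word#crashkernel=} ;;
        esac
    done
    [ -n "$value" ] && echo "$value"
}

# Show the crashkernel value that the running kernel was booted with.
if [ -r /proc/cmdline ]; then
    crashkernel_arg "$(cat /proc/cmdline)" ||
        echo "No crashkernel parameter on the kernel command line"
fi

# Size of the reserved region in bytes; 0 means that nothing is reserved.
if [ -r /sys/kernel/kexec_crash_size ]; then
    cat /sys/kernel/kexec_crash_size
fi

# 1 means that a crash kernel is loaded and kdump can capture a dump.
if [ -r /sys/kernel/kexec_crash_loaded ]; then
    cat /sys/kernel/kexec_crash_loaded
fi
```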