This topic describes how to resolve a kernel Oops exception that occurs when you hot-unplug a virtio device from an Elastic Compute Service (ECS) instance that runs a specific recent kernel version.
Problem description
When you hot-unplug a virtio device, such as a disk or a network interface controller (NIC), from an ECS instance that runs a specific recent kernel version, an Oops exception occurs. The result depends on the value of the kernel.panic_on_oops parameter:
When the kernel.panic_on_oops parameter is set to 1, a kernel panic occurs on the ECS instance.
When the kernel.panic_on_oops parameter is set to 0, the kernel on the ECS instance becomes unresponsive.
kernel.panic_on_oops is a kernel parameter that controls the kernel behavior when the kernel encounters an Oops exception.
Kernel panic: The system stops ongoing tasks, saves debugging information, and then restarts or shuts down to resolve issues and reduce potential impacts.
Kernel unresponsiveness: The kernel may attempt to continue operation. To prevent data corruption or other severe issues, do not allow the kernel to continue operation in production environments in this case.
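To see which of the two behaviors applies to an instance, you can read the current value of the parameter. The following is a minimal sketch, assuming a Linux system that exposes the parameter at /proc/sys/kernel/panic_on_oops. The function names and the enum are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical labels for the two behaviors described in this topic. */
enum oops_behavior {
    OOPS_BEHAVIOR_UNKNOWN,
    OOPS_BEHAVIOR_CONTINUE, /* kernel.panic_on_oops = 0 */
    OOPS_BEHAVIOR_PANIC     /* kernel.panic_on_oops = 1 */
};

/*
 * Reads an integer value from a procfs-style file.
 * Returns the value, or -1 if the file cannot be opened or parsed.
 */
int read_panic_on_oops(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int value = -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Maps the parameter value to the behavior it selects. */
enum oops_behavior classify_panic_on_oops(int value)
{
    switch (value) {
    case 0:
        return OOPS_BEHAVIOR_CONTINUE;
    case 1:
        return OOPS_BEHAVIOR_PANIC;
    default:
        return OOPS_BEHAVIOR_UNKNOWN;
    }
}
```

Passing "/proc/sys/kernel/panic_on_oops" to read_panic_on_oops on a Linux instance is expected to return the current setting. The sysctl kernel.panic_on_oops command reports the same value.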
Cause
The Linux upstream community added support for the admin virtqueue of virtio devices. For more information, see the virtio-pci: Introduce admin virtqueue commit.
In the commit:
The is_avq function pointer is added to the virtio_pci_device definition to determine whether a specific queue is the admin virtqueue.
A value is assigned to the is_avq function pointer in the virtio_pci_modern_probe function, which is used to initialize modern virtio devices.

When you hot-unplug a virtio device, the code checks whether the current queue is the admin virtqueue.

A legacy virtio device is not initialized by the virtio_pci_modern_probe function, so no value is assigned to the is_avq function pointer, and the is_avq function pointer in the virtio_pci_device structure of the legacy virtio device remains a null pointer. When you hot-unplug the virtio device, the code invokes if (vp_dev->is_avq(vdev, vq->index)) and dereferences the null pointer. As a result, the kernel raises the Oops exception.
Scope of impacts
Linux upstream community
The Linux upstream community has resolved the issue. For more information, see the virtio-pci: Check if is_avq is NULL commit. The patch in the commit checks whether the is_avq function pointer is null before the pointer is called.
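The null check can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel source: the model_ structures, the probe helpers, and the admin queue index 3 are hypothetical stand-ins, and only the is_avq field name is taken from this topic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical admin queue index, used only by this model. */
#define MODEL_ADMIN_VQ_INDEX 3u

/* Hypothetical stand-in for a virtio device. */
struct model_virtio_device {
    int id;
};

/* Simplified stand-in for virtio_pci_device; only the relevant field is kept. */
struct model_vp_dev {
    struct model_virtio_device vdev;
    bool (*is_avq)(struct model_virtio_device *vdev, unsigned int index);
};

static bool model_modern_is_avq(struct model_virtio_device *vdev,
                                unsigned int index)
{
    (void)vdev;
    return index == MODEL_ADMIN_VQ_INDEX;
}

/* Modern probe path: assigns the callback. */
void model_modern_probe(struct model_vp_dev *dev)
{
    assert(dev != NULL);
    dev->is_avq = model_modern_is_avq;
}

/*
 * Legacy probe path: no callback is provided. The model sets the pointer
 * to NULL explicitly to represent a pointer that was never assigned.
 */
void model_legacy_probe(struct model_vp_dev *dev)
{
    assert(dev != NULL);
    dev->is_avq = NULL;
}

/*
 * Guarded check: the callback is called only if it is not NULL.
 * Calling dev->is_avq(...) without the check would dereference a null
 * pointer for a device set up by model_legacy_probe.
 */
bool model_queue_is_admin(struct model_vp_dev *dev, unsigned int index)
{
    assert(dev != NULL);
    return dev->is_avq != NULL && dev->is_avq(&dev->vdev, index);
}
```

In this model, a device set up by model_legacy_probe reports that no queue is the admin virtqueue instead of crashing, which matches the intent of the patch as described above.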
Operating systems
Ubuntu 24.
Operating systems whose kernel meets all of the following conditions: the kernel version is close to 6.8, the kernel provides the admin virtqueue capability (virtio-pci: Introduce admin virtqueue), and the kernel does not include the virtio-pci: Check if is_avq is NULL patch that resolves the is_avq function pointer issue.
Note: You can run the uname -r command to view the kernel version.
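If you want to read the kernel version from a program instead of the command line, the following minimal sketch uses the POSIX uname function, which returns the same release string. The function names are hypothetical, and the sketch extracts only the leading major and minor numbers. It does not decide whether a kernel is affected, because that depends on which patches the kernel includes.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Parses the leading "major.minor" part of a kernel release string.
 * Returns 0 on success and -1 if the string does not start with
 * two dot-separated integers.
 */
int parse_kernel_release(const char *release, int *major, int *minor)
{
    assert(release != NULL && major != NULL && minor != NULL);
    return sscanf(release, "%d.%d", major, minor) == 2 ? 0 : -1;
}

/* Fills major and minor for the running kernel by calling uname(). */
int running_kernel_version(int *major, int *minor)
{
    struct utsname info;

    if (uname(&info) != 0)
        return -1;
    return parse_kernel_release(info.release, major, minor);
}
```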
Virtio devices
Legacy virtio devices that are used on ECS instances and are hot-unplugged.
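One hint about whether a device can be a legacy virtio device is its PCI device ID. The following sketch assumes the ID ranges defined in the virtio specification: vendor ID 0x1af4, device IDs 0x1000 to 0x103f for legacy and transitional devices, and device IDs 0x1040 to 0x107f for devices that expose only the modern interface. The function and enum names are hypothetical. A device in the lower range may still be driven through the modern interface, so treat the result as a hint rather than a definitive answer.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_VIRTIO_VENDOR_ID 0x1af4u

/* Hypothetical labels for the ID ranges. */
enum virtio_id_class {
    VIRTIO_ID_NOT_VIRTIO,
    VIRTIO_ID_LEGACY_OR_TRANSITIONAL, /* 0x1000 to 0x103f */
    VIRTIO_ID_MODERN_ONLY             /* 0x1040 to 0x107f */
};

/*
 * Parses a hexadecimal ID such as "0x1af4", the format used by the
 * vendor and device files under /sys/bus/pci/devices/.
 * Returns 0 on success and -1 if no digits could be parsed.
 */
int parse_pci_id(const char *text, unsigned int *out)
{
    assert(text != NULL && out != NULL);

    char *end = NULL;
    unsigned long value = strtoul(text, &end, 16);
    if (end == text)
        return -1;
    *out = (unsigned int)value;
    return 0;
}

/* Classifies a vendor and device ID pair by ID range. */
enum virtio_id_class classify_virtio_id(unsigned int vendor,
                                        unsigned int device)
{
    if (vendor != MODEL_VIRTIO_VENDOR_ID)
        return VIRTIO_ID_NOT_VIRTIO;
    if (device >= 0x1000u && device <= 0x103fu)
        return VIRTIO_ID_LEGACY_OR_TRANSITIONAL;
    if (device >= 0x1040u && device <= 0x107fu)
        return VIRTIO_ID_MODERN_ONLY;
    return VIRTIO_ID_NOT_VIRTIO;
}
```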
Solutions
Solution 1: To eliminate the issue, we recommend that you change the instance family of the ECS instance to an 8th-generation or later instance family, which uses modern virtio devices. For more information, see Change instance types. For information about instance families, see Overview of instance families.
Solution 2:
Upgrade to the latest kernel software packages and verify that the packages include the virtio-pci: Check if is_avq is NULL patch, which checks whether the is_avq function pointer is null.
(Conditionally required) If the preceding patch is not included in the latest kernel software packages, install the patch.
Appendix: Terms
The following table describes the terms that are used in this topic.
Term | Description |
virtio device | Virtio is a standardized framework that allows virtual machines to efficiently communicate with virtual hardware on hosts. Virtio devices are hardware devices, such as disks and NICs, that are emulated in virtualized environments. Virtio devices are categorized into legacy virtio devices and modern virtio devices. Legacy virtio devices and modern virtio devices use different configuration interfaces. |
admin virtqueue | A special virtio queue that is used to manage and operate devices, such as obtaining device status and configuring devices. The admin virtqueue is not supported by all virtio devices. |
virtio_pci_device | The data structure that is used in the kernel to indicate a virtio Peripheral Component Interconnect (PCI) device. This data structure contains pointers to various functions, such as the is_avq function pointer, which is added to determine whether a specific virtio queue is the admin virtqueue. |
is_avq | The function that is used to determine whether a specific virtio queue is the admin virtqueue. |
virtio_pci_modern_probe | The function that is used to detect and initialize modern virtio PCI devices. After a device is detected, this function is invoked to configure the device, including reading configuration space, checking device features, and allocating required resources. |
RIP | A register in x86 CPUs that stores the address of the next instruction to be executed. If a program encounters an exception, such as an attempt to call a function through a null pointer, RIP holds the address of the instruction that caused the exception. |