
What do I do if an Oops exception occurs when I hot-unplug a virtio device from an ECS instance that runs a recent kernel version?

Last Updated: May 09, 2025

This topic describes how to resolve an Oops exception that occurs when you hot-unplug a virtio device from an Elastic Compute Service (ECS) instance that runs a specific recent kernel version.

Problem description

When you hot-unplug a virtio device, such as a disk or a network interface controller (NIC), from an ECS instance that runs a specific recent kernel version, an Oops exception occurs. The result depends on the value of the kernel.panic_on_oops parameter:

  • When the kernel.panic_on_oops parameter is set to 1, a kernel panic occurs on the ECS instance.

  • When the kernel.panic_on_oops parameter is set to 0, the kernel on the ECS instance becomes unresponsive.

Note

kernel.panic_on_oops is a kernel parameter that controls the kernel behavior when the kernel encounters an Oops exception.

  • Kernel panic: The system stops ongoing tasks, saves debugging information, and then restarts or shuts down to resolve issues and reduce potential impacts.

  • Kernel unresponsiveness: The kernel may attempt to continue operation. To prevent data corruption or other severe issues, do not allow the kernel to continue operation in production environments in this case.
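To see which of the two behaviors applies to an instance, you can read the current value of kernel.panic_on_oops. The following is a minimal sketch in C, assuming the parameter is exposed at /proc/sys/kernel/panic_on_oops as on standard Linux systems. The function names read_panic_on_oops and describe_oops_policy are hypothetical and used only for illustration.

```c
#include <stdio.h>

/*
 * Reads an integer sysctl value from the given file.
 * Returns the value, or -1 if the file cannot be opened or parsed.
 */
int read_panic_on_oops(const char *path)
{
    FILE *f = fopen(path, "r");
    int value = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Maps the parameter value to the behavior described in this topic. */
const char *describe_oops_policy(int value)
{
    switch (value) {
    case 1:
        return "panic on Oops";
    case 0:
        return "continue after Oops";
    default:
        return "unknown";
    }
}
```

For example, describe_oops_policy(read_panic_on_oops("/proc/sys/kernel/panic_on_oops")) returns the behavior that applies to the running kernel, or "unknown" if the file cannot be read.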

Cause

The Linux upstream community added support for the admin virtqueue of virtio devices. For more information, see Commit.

In the commit:

  • The is_avq function pointer is added to the virtio_pci_device definition to determine whether the admin virtqueue exists.

  • A value is assigned to the is_avq function pointer in the virtio_pci_modern_probe function, which is used to initialize modern virtio devices.

  • When you hot-unplug a virtio device, the code checks whether the current queue is the admin virtqueue.


If the virtio device is a legacy virtio device, no value is assigned to the is_avq function pointer, so the is_avq function pointer in the virtio_pci_device structure of the device is a null pointer. When you hot-unplug the virtio device, the code invokes if (vp_dev->is_avq(vdev, vq->index)) and calls through the null pointer. The resulting null pointer dereference triggers the Oops exception.
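The failure and the null check can be illustrated with a small user-space model. This is a sketch, not the kernel source: the structures are simplified stand-ins that keep only the is_avq field, and modern_is_avq, queue_is_avq_unguarded, and queue_is_avq_guarded are hypothetical names. The queue index 2 used for the admin virtqueue is also an arbitrary assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; all other fields are omitted. */
struct virtio_device {
    int id;
};

struct virtio_pci_device {
    /* Assigned only for modern devices; stays NULL for legacy devices. */
    bool (*is_avq)(struct virtio_device *vdev, unsigned int index);
};

/* Hypothetical modern-device callback: treats queue 2 as the admin virtqueue. */
bool modern_is_avq(struct virtio_device *vdev, unsigned int index)
{
    (void)vdev;
    return index == 2;
}

/*
 * Unguarded call, in the shape of if (vp_dev->is_avq(vdev, vq->index)).
 * Calling this with a legacy device (is_avq == NULL) crashes.
 */
bool queue_is_avq_unguarded(struct virtio_pci_device *vp_dev,
                            struct virtio_device *vdev, unsigned int index)
{
    return vp_dev->is_avq(vdev, index);
}

/*
 * Guarded call: checks the pointer before calling through it, so a legacy
 * device is simply reported as having no admin virtqueue.
 */
bool queue_is_avq_guarded(struct virtio_pci_device *vp_dev,
                          struct virtio_device *vdev, unsigned int index)
{
    return vp_dev->is_avq != NULL && vp_dev->is_avq(vdev, index);
}
```

With a legacy device whose is_avq field is NULL, queue_is_avq_guarded returns false, whereas queue_is_avq_unguarded would dereference the null pointer, which is the pattern that produces the Oops exception in the kernel.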

Scope of impacts

  • Linux upstream community

    The Linux upstream community has already resolved the issue. For more information, see Commit. The commit provides a patch that checks whether the is_avq function pointer is null before the pointer is called.

  • Operating systems

  • Virtio devices

    Legacy virtio devices that are used on ECS instances and are hot-unplugged.

Solutions

  • Solution 1: To eliminate the issue, we recommend that you change the instance family of the ECS instance to an 8th-generation or later instance family, which uses modern virtio devices. For more information, see Change instance types. For information about instance families, see Overview of instance families.

  • Solution 2:

    1. Upgrade to the latest kernel software packages, and verify that the packages include the virtio-pci: Check if is_avq is NULL patch, which checks whether the is_avq function pointer is null.

    2. (Conditionally required) If the preceding patch is not included in the latest kernel software packages, install the patch.

Appendix: Terms

The following table describes the terms that are used in this topic.

Term

Description

virtio device

Virtio is a standardized framework that allows virtual machines to efficiently communicate with virtual hardware on hosts. Virtio devices are hardware devices, such as disks and NICs, that are emulated in virtualized environments. Virtio devices are categorized into legacy virtio devices and modern virtio devices. Legacy virtio devices and modern virtio devices use different configuration interfaces.

admin virtqueue

A special virtio queue that is used to manage and operate devices, such as obtaining device status and configuring devices. The admin virtqueue is not supported by all virtio devices.

virtio_pci_device

The data structure that is used in the kernel to indicate a virtio Peripheral Component Interconnect (PCI) device. This data structure contains pointers to various functions, such as the is_avq function pointer, which is added to determine whether a specific virtio queue is the admin virtqueue.

is_avq

The function that is used to determine whether a specific virtio queue is the admin virtqueue.

virtio_pci_modern_probe

The function that is used to detect and initialize modern virtio PCI devices. After a device is detected, this function is invoked to configure the device, including reading the configuration space, checking device features, and allocating required resources.

RIP

A register in x86 CPUs that stores the address of the next instruction to be executed. If a program encounters an exception, such as when the program attempts to execute an instruction at the address held by a null pointer, the RIP points to the address of the instruction that caused the exception.