This topic describes the cause of and solutions to an issue in which an Alibaba Cloud Linux 3 instance that runs kernel version 5.10.134-15.al8 crashes when you mount a file system of the Enhanced Read-Only File System (EROFS) type.
Problem description
The operating system crashes when you mount an EROFS file system on a generic block device on an Alibaba Cloud Linux 3 instance that meets the following requirements:
Image: Alibaba Cloud Linux 3.2104
Kernel: 5.10.134-15.al8
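To confirm that an instance runs the affected kernel before you try to mount anything, you can compare the output of uname -r with the version listed above. The following is a minimal sketch. The function name is_affected_kernel is hypothetical and is not part of any Alibaba Cloud tool, and the match assumes that the release string is the kernel version optionally followed by an architecture suffix, as in the call stack shown in this topic (5.10.134-15.al8.aarch64).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given kernel release string is
# the affected version 5.10.134-15.al8, with or without an architecture
# suffix such as .aarch64.
is_affected_kernel() {
    case "$1" in
        5.10.134-15.al8|5.10.134-15.al8.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the kernel that is currently running.
if is_affected_kernel "$(uname -r)"; then
    echo "affected kernel: $(uname -r)"
else
    echo "kernel not affected: $(uname -r)"
fi
```

A later release such as 5.10.134-15.1.al8 does not match the pattern, so the script reports it as not affected.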
Run the following commands to check whether the issue occurs. The commands create an EROFS image from an empty directory and mount the image through a loop device:
sudo yum install -y erofs-utils
mkdir -p test mnt
mkfs.erofs foo.erofs test
sudo mount -t erofs -o loop foo.erofs mnt
If the issue occurs, the operating system crashes when the EROFS file system is mounted and the following call stack information appears:
[ 225.747952] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000370
..
[ 225.752658] CPU: 3 PID: 5829 Comm: mount Kdump: loaded Not tainted 5.10.134-15.al8.aarch64 #1
[ 225.753089] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 1.0.0 01/01/2017
[ 225.753468] pstate: 62401005 (nZCv daif +PAN -UAO +TCO BTYPE=--)
[ 225.753775] pc : __erofs_bread+0x64/0x1d0 [erofs]
[ 225.754016] lr : erofs_read_metabuf+0x44/0x80 [erofs]
[ 225.754271] sp : ffff800013fcbb00
[ 225.754442] x29: ffff800013fcbb00 x28: ffff0000c5ac0000
[ 225.754711] x27: 0000000000000000 x26: ffff0000ef1dcf50
[ 225.754982] x25: ffff0000d9896b80 x24: 0000000000000000
[ 225.755271] x23: 0000000000000001 x22: ffff0000ef1dcdd8
[ 225.755540] x21: ffff800013fcbbb0 x20: 0000000000000000
[ 225.755810] x19: 0000000000000000 x18: 0000000000000000
[ 225.756079] x17: 0000000000000000 x16: 0000000000000000
[ 225.756347] x15: ffffffffffffffff x14: ffffff0000000000
[ 225.756618] x13: 00000000000003f3 x12: 0000000000000000
[ 225.756888] x11: 0000000000000040 x10: ffff800011d169b8
[ 225.757158] x9 : ffff800009124a84 x8 : ffff0000c7cd9e00
[ 225.757427] x7 : 0000000000000000 x6 : 000000000000003f
[ 225.757697] x5 : ffff0000c773f000 x4 : 0000000000000001
[ 225.757966] x3 : 0000000000000000 x2 : ffff0000ef1dcdd8
[ 225.758235] x1 : ffff800013fcbbb0 x0 : 0000000000000000
[ 225.758508] Call trace:
[ 225.758636] __erofs_bread+0x64/0x1d0 [erofs]
[ 225.758859] erofs_read_metabuf+0x44/0x80 [erofs]
[ 225.759112] erofs_read_superblock+0x60/0x264 [erofs]
[ 225.759370] erofs_fc_fill_super+0xf0/0x310 [erofs]
[ 225.759621] get_tree_bdev+0x15c/0x250
[ 225.760109] erofs_fc_get_tree+0x38/0x54 [erofs]
[ 225.760662] vfs_get_tree+0x2c/0xf0
[ 225.761157] do_new_mount+0x164/0x1d0
[ 225.761652] path_mount+0x1bc/0x570
[ 225.762133] __arm64_sys_mount+0x114/0x140
[ 225.762633] el0_svc_common+0x90/0x250
[ 225.763124] do_el0_svc+0x7c/0x90
[ 225.763579] el0_svc+0x1c/0x30
[ 225.764019] el0_sync_handler+0xa8/0xb0
[ 225.764498] el0_sync+0x168/0x180
[ 225.764949] Code: eb1b001f 540003e0 aa0103e0 97ffff6c (f941bb00)
[ 225.765543] ---[ end trace 50d06630866b5b03 ]---
Root cause
An EROFS-related feature that was added in kernel version 5.10.134-15.al8 modified the __erofs_bread() function. The modified function does not correctly handle the scenario in which an EROFS file system is mounted on a generic block device. As a result, the kernel dereferences a null pointer during the mount operation and the operating system crashes.
Solutions
Install a kernel hotfix.
sudo yum install -y kernel-hotfix-18359162-5.10.134-15
We recommend that you do not use kernel version 5.10.134-15.al8. You can upgrade the kernel to a later version, such as 5.10.134-15.1.al8. For more information, see Change the kernel version.
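After you install the hotfix, you may want to confirm that the package is present before you mount an EROFS file system again. The following is a minimal sketch that queries the RPM database with rpm -q. It assumes that the installed package name is identical to the name passed to yum install above. The function name hotfix_installed is hypothetical. The check only reports whether the package is installed and does not verify that the hotfix is active in the running kernel.

```shell
#!/bin/sh
# Package name taken from the yum install command in this topic.
HOTFIX_PKG="kernel-hotfix-18359162-5.10.134-15"

# Hypothetical helper: succeeds if the named package is recorded in the
# RPM database. Any failure, including a missing rpm command, counts as
# "not installed".
hotfix_installed() {
    rpm -q "$1" >/dev/null 2>&1
}

if hotfix_installed "$HOTFIX_PKG"; then
    echo "hotfix package installed: $HOTFIX_PKG"
else
    echo "hotfix package not found: $HOTFIX_PKG"
fi
```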