Buffered I/O write performance of the Ext4 file system on Aliyun Linux 2 ECS instances is lower than expected

Last Updated: Jul 23, 2020

Disclaimer: this document may contain information about third-party products that are for reference only. Alibaba Cloud does not make any guarantee, express or implied, with respect to the performance and reliability of third-party products, as well as potential impacts of operations on the products.

Problem Description

When an ECS instance that meets all of the following conditions performs buffered I/O writes (asynchronous writes through the page cache) on an Ext4 file system, write performance may be lower than expected.

  • Image: aliyun-2.1903-x64-20G-alibase-20190327.vhd and later versions.
  • Kernel: kernel-4.19.24-9.al7 and all later kernel versions.
  • Mount options: The Ext4 file system is mounted with both the dioread_nolock and nodelalloc options.
    Note: For more information about how to view the file system type and mount options, see the More information section.

The following scenarios are typical cases in which the lower-than-expected performance can be observed.

Note: For more information about the performance of the ECS block storage, see block storage performance.

  • Scenario 1: Copying a large file to the Ext4 file system with the cp command takes a long time. The copy rate reaches only about 30 MiB/s.
  • Scenario 2: Writing a file to the Ext4 file system with the dd command without a sync flag, as in the following command, takes a long time.
    dd if=/dev/zero of=/mnt/badfile bs=10M count=1000
    If you run iostat -xm 1 in another terminal at the same time to observe the write speed of the corresponding disk, the value in the wMB/s column is only about 30 MiB/s. Output similar to the following is displayed.
    avg-cpu: %user %nice %system %iowait %steal %idle
               0.00 0.00 12.77 0.00 0.00 87.23
    
    Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
    vda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    vdb 0.00 7194.00 0.00 57.00 0.00 28.05 1008.00 0.02 17.81 0.00 17.81 0.39 2.20
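
To read the write rate without scanning the full table, the wMB/s column can be extracted from the iostat output. The following is a minimal sketch, not part of the original procedure: the function name wmb_per_sec is hypothetical, and the device name vdb is taken from the sample output above. It locates the column by its header name because the column position differs between sysstat versions.

```shell
#!/bin/sh
# Hypothetical helper: print the wMB/s value of one device from
# `iostat -xm` output read on stdin. The column is located by name
# in the header line, not by a fixed position.
wmb_per_sec() {
    awk -v dev="$1" '
        # Header line: remember which column holds wMB/s.
        $1 ~ /^Device/ {
            for (i = 1; i <= NF; i++) if ($i == "wMB/s") col = i
            next
        }
        # Matching device line: print the remembered column.
        col && $1 == dev { print $col }
    '
}

# Example (not run here): take five one-second samples while dd is writing.
#   iostat -xm 1 5 | wmb_per_sec vdb
```

Note that the first report printed by iostat covers the time since boot, so only the later samples reflect the current write rate.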

Cause of problem

Both the cp command and the dd command without a sync flag write files in buffered (asynchronous, cached) I/O mode. Writing a file to a file system in this mode involves the following two stages:

  1. Page cache write stage. This stage completes normally: the write is performed only in memory, so it is extremely fast.
  2. Page cache writeback stage, in which the kernel writes the cached pages back to the file system. The problem occurs in this stage.
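
One way to watch the two stages is through the Dirty and Writeback counters in /proc/meminfo: Dirty grows quickly while data is written into the page cache, then drains only as fast as writeback proceeds. The following is a minimal sketch under that assumption; the function name dirty_and_writeback_kb is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "<Dirty kB> <Writeback kB>" from
# /proc/meminfo-style text read on stdin.
dirty_and_writeback_kb() {
    awk '
        $1 == "Dirty:"     { d = $2 }
        $1 == "Writeback:" { w = $2 }
        # "+ 0" prints 0 if a field was not present in the input.
        END { print d + 0, w + 0 }
    '
}

# Example (not run here): sample once per second while dd or cp is running.
#   while sleep 1; do dirty_and_writeback_kb < /proc/meminfo; done
```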

 

At the file system code level, the root cause is that when an Ext4 file system is mounted with the dioread_nolock and nodelalloc options combined, buffered writes generate a large number of dirty pages that the kernel tracks as "unwritten extents", each 4 KB in size. Because of a defect in the Ext4 processing logic, these 4 KB extents are not merged into larger ones for writeback but are processed one by one. Profiling the kernel page cache writeback path with the perf tool shows that most of the time is spent inside the Ext4 function ext4_writepages(), in the logic that looks up and maps 4 KB dirty pages, which results in extremely low file write performance.

Solution

Alibaba Cloud reminds you that:

  • Before you perform risky operations on an instance or its data, check the disaster tolerance and fault tolerance capabilities of the instance to ensure data security.
  • Before you modify the configuration or data of an instance (including but not limited to ECS and RDS instances), we recommend that you create snapshots or enable RDS log backup.
  • If you have granted permissions on the Alibaba Cloud platform or submitted security information such as a logon account and password, we recommend that you change that information as soon as possible.

This is a known issue in the Ext4 community, and no definitive fix is available yet. You can use the following temporary workaround.

  1. Run the following command to remount the Ext4 file system with delalloc, so that the dioread_nolock and nodelalloc mount options are no longer combined.
    sudo mount -o remount,delalloc [$Device] [$Mount_Point]
    Note:
    • [$Device]: the name of the device on which the Ext4 file system resides.
    • [$Mount_Point]: the mount point of the Ext4 file system.
  2. Edit the /etc/fstab file and delete the nodelalloc option from the entry for the Ext4 file system (delalloc is the default), so that the file system is also mounted without nodelalloc automatically at boot.
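
The /etc/fstab change in step 2 can be previewed before anything is edited. The following is a minimal sketch, not an official tool: the function name strip_nodelalloc is hypothetical, and it assumes the standard fstab layout in which the third field is the file system type and the fourth is the comma-separated option list. It reads fstab text on stdin and writes the modified text to stdout, so the original file is never changed by the function itself.

```shell
#!/bin/sh
# Hypothetical helper: remove nodelalloc from the options of ext4 entries
# in fstab-formatted text (stdin -> stdout). Other lines pass through as is.
strip_nodelalloc() {
    awk '
        # Comments, short lines, non-ext4 entries, and ext4 entries
        # without nodelalloc are printed unchanged.
        /^[[:space:]]*#/ || NF < 4 || $3 != "ext4" ||
            ("," $4 ",") !~ /,nodelalloc,/ { print; next }
        {
            n = split($4, opts, ",")
            out = ""
            for (i = 1; i <= n; i++)
                if (opts[i] != "nodelalloc")
                    out = (out == "" ? opts[i] : out "," opts[i])
            # An empty options field is not valid; fall back to "defaults".
            $4 = (out == "" ? "defaults" : out)
            print   # modified lines are re-joined with single spaces
        }
    '
}

# Example (not run here): write a candidate file and review the difference.
#   strip_nodelalloc < /etc/fstab > /tmp/fstab.new
#   diff /etc/fstab /tmp/fstab.new
```

Reviewing the diff and keeping a backup of /etc/fstab before replacing it is the safer choice, because a malformed fstab can prevent the instance from booting normally.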

More information

Follow these steps to check the file system type and mount options of the disk where the directory resides.

  1. Log on to the ECS instance and run the following command to check the disk partition where the directory is located.
    df [$DIR] | grep -v Filesystem | awk '{ print $1 }'
    Note: [$DIR] is the target directory for write operations.
  2. Run the following command to check the file system type and mount options of the target partition.
    mount | grep -w [$Partition] | grep ext4 | grep -w dioread_nolock | grep -w nodelalloc
    Note: [$Partition] is the name of the disk partition obtained in the previous step. If the command prints a line, the partition is an Ext4 file system mounted with both options.
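
The same check can be written as a small function that is easier to reuse in scripts. The following is a minimal sketch: the function name has_slow_combo is hypothetical, and the commented example assumes that the findmnt utility from util-linux is installed and that /mnt is the directory of interest.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the file system type ($1) is ext4 and the
# comma-separated mount option string ($2) contains both dioread_nolock and
# nodelalloc; return 1 otherwise.
has_slow_combo() {
    [ "$1" = "ext4" ] || return 1
    # Wrap the list in commas so that only whole option names match.
    case ",$2," in *,dioread_nolock,*) ;; *) return 1 ;; esac
    case ",$2," in *,nodelalloc,*) ;; *) return 1 ;; esac
    return 0
}

# Example (not run here): check the file system that holds /mnt.
#   has_slow_combo "$(findmnt -no FSTYPE --target /mnt)" \
#                  "$(findmnt -no OPTIONS --target /mnt)" && echo "affected"
```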

Applicable to

  • Elastic Compute Service