Alibaba Cloud Linux 2 supports the cgroup writeback feature for the cgroup v1 kernel interface in the 4.19.36-12.al7 kernel version and later. This feature allows you to limit the buffered I/O rate when you use the cgroup v1 kernel interface.
cgroups (abbreviated from control groups) are a Linux kernel mechanism that organizes processes hierarchically and distributes system resources along the hierarchy in a controlled and configurable manner. Two versions of cgroups are available: cgroup v1 and cgroup v2. For more information, visit What are control groups. This topic describes how to enable the cgroup writeback feature for cgroup v1 to limit the buffered I/O rate of processes.
After you enable the cgroup writeback feature, check whether the mappings between memory subsystems (memcgs) and I/O subsystems (blkcgs) conform to the following rule. If the mappings conform to the rule, limit the buffered I/O rate of processes.
The cgroup writeback feature requires cooperation between memcgs and blkcgs to limit the buffered I/O rate. However, by default, the control subsystems of cgroup v1 do not work together. Memcgs and blkcgs must be associated based on the rule that each memcg is mapped to a unique blkcg. The mappings from memcgs to blkcgs can be one-to-one or many-to-one, but cannot be one-to-many or many-to-many.
- When processes A and B belong to two different memcgs, each of the memcgs can be mapped to a unique blkcg. For example, A belongs to memcg1 and blkcg1, and B belongs to memcg2 and blkcg2.
- When A and B belong to two different memcgs, these memcgs can also be mapped to the same blkcg. For example, A belongs to memcg1 and B belongs to memcg2, and both A and B belong to blkcg1.
- When A and B belong to the same memcg, the memcg can be mapped only to one blkcg. For example, both A and B belong to memcg1 and blkcg1.
Before you limit the buffered I/O rate of a process, open the cgroup.procs interface for the corresponding blkcg, and write the ID of the process to the interface to ensure that the memcgs mapped to this blkcg are not mapped to any other blkcgs. You can also use a tool to view the mappings between memcgs and blkcgs. For more information, see Verify the mappings from memcgs to blkcgs.
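The mapping rule can also be checked mechanically. The following is a minimal sketch, assuming you have already collected the mappings as one "memcg blkcg" pair per line. The pair format and the check_mappings function name are hypothetical and are not provided by the kernel.

```shell
# Hypothetical helper: reads "memcg blkcg" pairs on stdin, one pair per line,
# and reports every memcg that is mapped to more than one blkcg.
# Exit status 0 means that all mappings are one-to-one or many-to-one.
check_mappings() {
    awk '
        # Skip blank or incomplete lines.
        NF < 2 { next }
        # Record the first blkcg seen for each memcg.
        !($1 in first) { first[$1] = $2; next }
        # A different blkcg for a known memcg violates the rule.
        first[$1] != $2 {
            printf "violation: %s -> %s and %s\n", $1, first[$1], $2
            bad = 1
        }
        END { exit bad + 0 }
    '
}
```

For example, the input "memcg1 blkcg1" followed by "memcg2 blkcg1" is many-to-one and passes, whereas "memcg1 blkcg1" followed by "memcg1 blkcg2" is one-to-many and is reported as a violation.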
Enable cgroup writeback
By default, the cgroup writeback feature is disabled for the cgroup v1 interface. To enable this feature, perform the following steps:
- Add the cgwb_v1 field to the grubby command and run the command to enable the cgroup writeback feature. In this example, the kernel version is 4.19.36-12.al7.x86_64. You must enter your actual kernel version during the operation. To query your kernel version, run the uname -r command.
sudo grubby --update-kernel="/boot/vmlinuz-4.19.36-12.al7.x86_64" --args="cgwb_v1"
- Restart the system for the cgroup writeback feature to take effect.
- Run the following command to read the /proc/cmdline kernel file. Ensure that the kernel command line includes the cgwb_v1 field. This way, the blkio.throttle.write_bps_device and blkio.throttle.write_iops_device interfaces in the corresponding blkcgs can limit the buffered I/O rate.
cat /proc/cmdline | grep cgwb_v1
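The steps above can be sketched as two small helpers. This is an illustration only: the grubby_cmd and has_cgwb_v1 function names are hypothetical, and grubby_cmd prints the command for a given kernel release instead of running it.

```shell
# Build the grubby command for a given kernel release (for example, the
# output of uname -r). The command is only printed, not executed.
grubby_cmd() {
    printf 'sudo grubby --update-kernel="/boot/vmlinuz-%s" --args="cgwb_v1"\n' "$1"
}

# Return 0 if the given kernel command line contains cgwb_v1 as a
# standalone parameter, and 1 otherwise.
has_cgwb_v1() {
    case " $1 " in
        *" cgwb_v1 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

After the restart, a call such as has_cgwb_v1 "$(cat /proc/cmdline)" succeeds only when the parameter is present. Unlike a plain grep, this check does not match a longer parameter that merely contains the string cgwb_v1.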
Verify the mappings from memcgs to blkcgs
Before you limit the buffered I/O rate of processes, you can use one of the following methods to check whether the mappings from memcgs to blkcgs are one-to-one or many-to-one.
- Run the following command to view the mappings from memcgs to blkcgs:
sudo cat /sys/kernel/debug/bdi/bdi_wb_link
The following sample response shows that the mapping from the memcg to the blkcg is one-to-one.
memory <---> blkio
memcg1: 35 <---> blkcg1: 48
- Use the ftrace kernel monitoring tool.
- Enable the insert_memcg_blkcg_link trace event.
sudo bash -c "echo 1 > /sys/kernel/debug/tracing/events/writeback/insert_memcg_blkcg_link/enable"
- View the output information.
sudo cat /sys/kernel/debug/tracing/trace_pipe
The following sample response contains memcg_ino=35 blkcg_ino=48, which indicates that the mapping from the memcg to the blkcg is one-to-one.
<...>-1537  .... 99.511327: insert_memcg_blkcg_link: memcg_ino=35 blkcg_ino=48 old_blkcg_ino=0
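To read the mappings from a longer trace, the relevant fields can be filtered out of the output. The following is a minimal sketch that assumes the trace lines follow the sample format shown above. The extract_links function name is hypothetical.

```shell
# Hypothetical helper: reads trace lines on stdin and prints
# "memcg_ino blkcg_ino" for every insert_memcg_blkcg_link event.
# Lines that do not contain the event are ignored.
extract_links() {
    sed -n 's/.*insert_memcg_blkcg_link: memcg_ino=\([0-9][0-9]*\) blkcg_ino=\([0-9][0-9]*\).*/\1 \2/p'
}
```

For example, you can pipe the output of sudo cat /sys/kernel/debug/tracing/trace_pipe into extract_links. If the same memcg inode number appears with two different blkcg inode numbers, the mapping is no longer one-to-one or many-to-one.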
Check whether cgroup writeback takes effect
In this example, two processes that generate I/O are simulated to check whether the cgroup writeback feature takes effect.
The dd command responds quickly and its output scrolls too fast to read. Run the iostat command to view the result of the dd command.
The dd command writes data sequentially. The system flushes sequential I/O each time 1 MB of data is generated. Therefore, you must set the threshold for blkio.throttle.write_bps_device to a value of no less than 1 MB (1,048,576 bytes). If you set the threshold to a value of less than 1 MB, I/O hangs may occur.
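Because the threshold is written in bytes per second, a small conversion helper can guard against values below the 1 MB minimum. The following is a minimal sketch. The mb_to_bps function name is hypothetical, and the helper accepts only whole numbers of MB/s.

```shell
# Convert a whole number of MB/s to bytes per second and refuse values
# below the 1 MB (1,048,576 bytes) minimum.
mb_to_bps() {
    case "$1" in
        # Reject empty input, non-digits, and leading zeros (which shell
        # arithmetic would otherwise treat as octal).
        ''|*[!0-9]*|0?*)
            echo "error: expected a whole number of MB/s" >&2
            return 2
            ;;
    esac
    if [ "$1" -lt 1 ]; then
        echo "error: threshold must be at least 1 MB/s" >&2
        return 1
    fi
    echo $(( $1 * 1048576 ))
}
```

For example, mb_to_bps 10 prints 10485760, which is the 10 MB/s value used in this topic.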
- Simulate two processes that generate I/O, and set the cgroup.procs interface of the blkcg based on the preceding limits.
sudo mkdir /sys/fs/cgroup/blkio/blkcg1
sudo mkdir /sys/fs/cgroup/memory/memcg1
sudo bash -c "echo $$ > /sys/fs/cgroup/blkio/blkcg1/cgroup.procs" # $$ expands to the ID of the current shell process.
sudo bash -c "echo $$ > /sys/fs/cgroup/memory/memcg1/cgroup.procs" # $$ expands to the ID of the current shell process.
- Use the blkio.throttle.write_bps_device interface in the blkcg to limit the buffered I/O rate.
sudo bash -c "echo 254:48 10485760 > /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.write_bps_device" # Set the writeback throttling threshold for the disk to 10 MB/s based on the device number.
- Use the dd command that does not have oflag set to sync to generate buffered I/O.
sudo dd if=/dev/zero of=/mnt/vdd/testfile bs=4k count=10000
- Use the iostat tool to query the results. View the wMB/s column in the command output. If the value is 10 MB/s, the cgroup writeback feature takes effect.
iostat -xdm 1 vdd
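The value in the wMB/s column fluctuates slightly around the configured limit, so a comparison with a tolerance is more practical than an exact match. The following is a minimal sketch. The near_limit function name is hypothetical, and the default tolerance of 10 percent is an assumption rather than a documented value.

```shell
# Hypothetical helper: succeed if the observed write rate (MB/s) is within
# the given tolerance (percent, default 10) of the configured limit (MB/s).
near_limit() {
    awk -v obs="$1" -v lim="$2" -v tol="${3:-10}" 'BEGIN {
        d = obs - lim
        if (d < 0) d = -d          # absolute difference
        exit !(d <= lim * tol / 100)
    }'
}
```

For example, near_limit 10.02 10 succeeds, whereas near_limit 85.3 10 fails, which would suggest that the throttling rule is not being applied to buffered writes.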