Alibaba Cloud Linux: Configure the blk-iocost weight-based throttling feature

Last Updated: Jul 14, 2025

The blk-iocost weight-based throttling feature is an Alibaba Cloud Linux enhancement of the weight-based disk throttling feature of the cgroup I/O subsystem (blkcg). blk-iocost is an I/O controller that distributes the I/O bandwidth of block devices among applications or processes in proportion to the weights that you assign to their control groups. This gives you finer control over disk I/O resources.

Note

cgroup v1 and cgroup v2 are two versions of the resource management feature in the Linux kernel. In the Alibaba Cloud Linux kernel, the blk-iocost feature supports both the cgroup v1 and cgroup v2 interfaces. In most cases, only one version is active in a system. To check which version is in use, run the stat -fc %T /sys/fs/cgroup command.

  • If tmpfs is returned, the cgroup v1 interface is used.

  • If cgroup2fs is returned, the cgroup v2 interface is used.
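
The check above can be scripted. The following is a minimal sketch; the function names cgroup_version_from_fstype and detect_cgroup_version are illustrative and are not part of Alibaba Cloud Linux.

```shell
# Map the filesystem type reported by `stat -fc %T /sys/fs/cgroup`
# to the cgroup interface version described in the note above.
cgroup_version_from_fstype() {
    case "$1" in
        tmpfs)     echo "v1" ;;
        cgroup2fs) echo "v2" ;;
        *)         echo "unknown" ;;
    esac
}

# Query the running system and print v1, v2, or unknown.
detect_cgroup_version() {
    cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"
}
```

A script can call detect_cgroup_version once and then choose between the blkio.* (cgroup v1) and io.* (cgroup v2) interface files described in the rest of this topic.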

Limits on operating systems

  • Alibaba Cloud Linux 2 with kernel version 4.19.81-17 or later

  • Alibaba Cloud Linux 3

Usage notes for the cost.qos interface

cost.qos is a read/write interface used to enable or disable the blk-iocost feature and to configure the latency-based quality of service (QoS) parameters that bound the I/O issue rate. The interface file exists only in the root group of blkcg, and its full name varies based on the cgroup version.

  • cgroup v1: blkio.cost.qos

  • cgroup v2: io.cost.qos

Configuration

Each line in the configuration file starts with the major (MAJ) and minor (MIN) numbers of a disk in the MAJ:MIN format, followed by the parameters in the following table. To query the MAJ and MIN numbers of a disk, run the lsblk | grep <disk name> command.

Parameter

Description

enable

Specifies whether to enable the blk-iocost feature.

  • 0: disables the blk-iocost feature. This is the default value.

  • 1: enables the blk-iocost feature.

ctrl

The control mode. Valid values:

  • auto: The system automatically detects the disk category and uses built-in parameters.

    Important

    If you set ctrl=auto and the category of the disk attached to an Elastic Compute Service (ECS) instance is SSD, such as standard SSD, Enterprise SSD (ESSD), or Non-Volatile Memory Express (NVMe) SSD, you must set the rotational attribute of the SSD to 0. This allows blk-iocost to evaluate I/O costs more accurately and tune scheduling policies to improve the I/O performance of SSDs. Sample command:

    sudo sh -c 'echo 0 > /sys/block/<DISK_NAME>/queue/rotational'

    Replace <DISK_NAME> with the actual disk name.

  • user: Configure the following control parameters:

    • rpct: the read latency percentile. Valid values: 0 to 100.

    • rlat: the read latency threshold. Unit: microseconds.

    • wpct: the write latency percentile. Valid values: 0 to 100.

    • wlat: the write latency threshold. Unit: microseconds.

    • min: the minimum scaling percentage. Valid values: 1 to 10000.

    • max: the maximum scaling percentage. Valid values: 1 to 10000.
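
Two lookups from this section, the MAJ:MIN numbers of a disk and its rotational attribute, can be wrapped in small helpers. This is a sketch under stated assumptions: majmin_of and is_non_rotational are hypothetical names, and majmin_of assumes the default lsblk column layout, in which NAME is the first column and MAJ:MIN is the second.

```shell
# Print the MAJ:MIN value of the disk named $1.
# Reads `lsblk` output on stdin and assumes the default column layout
# (NAME in column 1, MAJ:MIN in column 2).
majmin_of() {
    awk -v disk="$1" '$1 == disk { print $2; exit }'
}

# Succeed if the rotational attribute of disk $1 is 0.
# $2 is an optional sysfs root (default: /sys); it exists only so that
# the check can be tried against a scratch directory.
is_non_rotational() (
    sysroot="${2:-/sys}"
    [ "$(cat "$sysroot/block/$1/queue/rotational" 2>/dev/null)" = "0" ]
)
```

For example, lsblk | majmin_of vdd prints the device numbers of the vdd disk, and is_non_rotational vdd can gate a script that sets ctrl=auto on an SSD.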

Enable the blk-iocost feature

Enable the blk-iocost feature for a disk. In this example, the 254:48 disk is used, and the control mode is set to user. If the 95th percentile of read or write completion latency exceeds 5 milliseconds (rlat and wlat are set to 5000 microseconds), the disk is considered saturated. The kernel then adjusts the rate at which requests are issued to the disk within the range of 50% to 150%.

  • cgroup v1 interface

    sudo sh -c 'echo "254:48 enable=1 ctrl=user rpct=95.00 rlat=5000 wpct=95.00 wlat=5000 min=50.00 max=150.00" > /sys/fs/cgroup/blkio/blkio.cost.qos'
  • cgroup v2 interface

    sudo sh -c 'echo "254:48 enable=1 ctrl=user rpct=95.00 rlat=5000 wpct=95.00 wlat=5000 min=50.00 max=150.00" > /sys/fs/cgroup/io.cost.qos'
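
Because a malformed line is easy to write by hand, the configuration line can be generated and range-checked before it is written. The following is a minimal sketch; build_qos_line is a hypothetical helper, and it checks only the value ranges listed in the parameter table above.

```shell
# Print a ctrl=user line for the cost.qos interface after checking the
# value ranges listed in the parameter table.
# Arguments: MAJ:MIN rpct rlat wpct wlat min max
build_qos_line() {
    awk -v dev="$1" -v rpct="$2" -v rlat="$3" -v wpct="$4" -v wlat="$5" \
        -v min="$6" -v max="$7" 'BEGIN {
        if (rpct < 0 || rpct > 100 || wpct < 0 || wpct > 100) {
            print "rpct and wpct must be within 0 to 100" > "/dev/stderr"
            exit 1
        }
        if (min < 1 || min > 10000 || max < 1 || max > 10000) {
            print "min and max must be within 1 to 10000" > "/dev/stderr"
            exit 1
        }
        printf "%s enable=1 ctrl=user rpct=%.2f rlat=%d wpct=%.2f wlat=%d min=%.2f max=%.2f\n",
               dev, rpct, rlat, wpct, wlat, min, max
    }'
}
```

With the values from the example above, build_qos_line 254:48 95 5000 95 5000 50 150 prints the same line that the sample commands write. The output can then be piped to sudo tee with the interface file path for your cgroup version.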

Usage notes for the cost.model interface

cost.model is a read/write interface used to configure the cost model. The interface file exists only in the root group of blkcg, and its full name varies based on the cgroup version.

  • cgroup v1: blkio.cost.model

  • cgroup v2: io.cost.model

Configuration

Each line in the configuration file starts with the major (MAJ) and minor (MIN) numbers of a disk in the MAJ:MIN format, followed by the parameters in the following table. To query the MAJ and MIN numbers of a disk, run the lsblk | grep <disk name> command.

Parameter

Description

ctrl

The control mode. Valid values:

  • auto: The system automatically optimizes the I/O scheduling policy based on the current workload.

    Important

    If you set ctrl=auto and the category of the disk attached to an ECS instance is SSD, such as standard SSD, ESSD, or NVMe SSD, you must set the rotational attribute of the SSD to 0. This allows blk-iocost to evaluate I/O costs more accurately and tune scheduling policies to improve the I/O performance of SSDs. Sample command:

    sudo sh -c 'echo 0 > /sys/block/<DISK_NAME>/queue/rotational'

    Replace <DISK_NAME> with the actual disk name.

  • user: Configure model parameters.

model

The model parameter. Valid value: linear. If you set the model parameter to linear, specify the following modeling parameters:

  • [r|w]bps: the maximum sequential I/O throughput. Unit: bytes/second.

  • [r|w]seqiops: the sequential input/output operations per second (IOPS) limit.

  • [r|w]randiops: the random IOPS limit.

    Note

    You can use the tools/cgroup/iocost_coef_gen.py script in the kernel source code to generate the preceding parameters and then write the parameters to the cost.model interface file to configure the cost model.

Use the cost.model interface to configure a cost model

In this example, the 254:48 disk is used. Set the model parameter to linear and specify modeling parameters to configure a cost model.

  • cgroup v1 interface

    sudo sh -c 'echo "254:48 ctrl=user model=linear rbps=2706339840 rseqiops=89698 rrandiops=110036 wbps=1063126016 wseqiops=135560 wrandiops=130734" > /sys/fs/cgroup/blkio/blkio.cost.model'
  • cgroup v2 interface

    sudo sh -c 'echo "254:48 ctrl=user model=linear rbps=2706339840 rseqiops=89698 rrandiops=110036 wbps=1063126016 wseqiops=135560 wrandiops=130734" > /sys/fs/cgroup/io.cost.model'
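
After you write a model, you can read the interface file back to confirm a value. The sketch below assumes that reading the file returns one line per disk in the same key=value form that was written, as in the sample commands above; cost_param is a hypothetical helper name.

```shell
# Print the value of parameter $2 from the line that starts with the
# MAJ:MIN numbers $1. The interface file content is read from stdin.
cost_param() {
    awk -v dev="$1" -v key="$2" '$1 == dev {
        for (i = 2; i <= NF; i++) {
            split($i, kv, "=")
            if (kv[1] == key) { print kv[2]; exit }
        }
    }'
}
```

For example, cat /sys/fs/cgroup/io.cost.model | cost_param 254:48 rbps prints the configured sequential read throughput of the 254:48 disk. The same helper works on cost.qos lines because they use the same format.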

Usage notes for the weight/cost.weight interface

The weight interface and the cost.weight interface are the core kernel interfaces used to control I/O resource allocation. Both are read/write interfaces. You can dynamically allocate disk I/O bandwidth by setting weight values in the range of 1 to 10000. The interface file exists only in child groups of blkcg, not in the root group, and its full name varies based on the operating system and cgroup version.

  • Alibaba Cloud Linux 3

    • cgroup v1: blkio.cost.weight

    • cgroup v2: io.weight

  • Alibaba Cloud Linux 2

    • cgroup v1: blkio.cost.weight

    • cgroup v2: io.cost.weight

Configuration

  • Set a weight value <weight> for the interface to change the default weight of blkcg.

  • Set the device numbers and a weight value in the MAJ:MIN <weight> format for the interface to change the weight of the blkcg on the specified disk.

Modify the weight

After you enable the blk-iocost feature, create a control group (blkcg1 for cgroup v1 or cg1 for cgroup v2). Use the weight interface that matches your operating system and cgroup version to change the default weight of the control group to 50. Then, set the weight of the control group on the 254:48 disk to 50.

  • cgroup v1 interface

    sudo mkdir /sys/fs/cgroup/blkio/blkcg1    # Create the control group blkcg1
    sudo sh -c 'echo "50" > /sys/fs/cgroup/blkio/blkcg1/blkio.cost.weight'    # Change the default weight to 50
    sudo sh -c 'echo "254:48 50" > /sys/fs/cgroup/blkio/blkcg1/blkio.cost.weight'    # Set the weight on the disk to 50
  • cgroup v2 interface

    • Alibaba Cloud Linux 3

      sudo mkdir /sys/fs/cgroup/cg1    # Create the control group cg1
      sudo sh -c 'echo "50" > /sys/fs/cgroup/cg1/io.weight'    # Change the default weight to 50
      sudo sh -c 'echo "254:48 50" > /sys/fs/cgroup/cg1/io.weight'    # Set the weight on the disk to 50
    • Alibaba Cloud Linux 2

      sudo mkdir /sys/fs/cgroup/cg1    # Create the control group cg1
      sudo sh -c 'echo "50" > /sys/fs/cgroup/cg1/io.cost.weight'    # Change the default weight to 50
      sudo sh -c 'echo "254:48 50" > /sys/fs/cgroup/cg1/io.cost.weight'    # Set the weight on the disk to 50
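
To estimate how a weight translates into a share of disk bandwidth, the following sketch assumes that bandwidth is divided among active sibling groups in proportion to their weights. This assumption matches the iocost_monitor.py sample output later in this topic, where weights of 50 and 500 yield hweight values of 9.09% and 90.91%. expected_share is a hypothetical helper name.

```shell
# Estimate the percentage share of a group with weight $1 when the
# weights of all active sibling groups (including this one) sum to $2.
expected_share() {
    awk -v w="$1" -v total="$2" 'BEGIN { printf "%.2f\n", w * 100 / total }'
}

expected_share 50 550     # group with weight 50 next to a group with weight 500
expected_share 500 550    # the group with weight 500
```

The actual share also depends on which groups are issuing I/O at a given moment, so treat the result as an upper-level estimate rather than a guarantee.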

Common monitoring tools

To tune blk-iocost, you need to monitor and evaluate the I/O performance of your system. You can use the following tools or interfaces to monitor I/O resource usage and optimize resource utilization.

  • iocost monitor script

    The tools/cgroup/iocost_monitor.py script in the kernel source code uses the drgn debugger to obtain kernel parameters and provide I/O performance monitoring data. Perform the following steps to use the script:

    1. Install the drgn debugger.

      sudo pip3 install drgn
    2. Download the iocost_monitor.py script.

      wget https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/plain/tools/cgroup/iocost_monitor.py
    3. Run the iocost_monitor.py script.

      In this example, the vdd disk is used.

      sudo python3 ./iocost_monitor.py vdd

      The following command output is returned:

      vdd RUN  per=500.0ms cur_per=3930.839:v14620.321 busy= +1 vrate=6136.22% params=hdd
                                active    weight      hweight% inflt% dbt  delay usages%
      blkcg1                       *    50/   50   9.09/  9.09   0.00   0  0*000 009:009:009
      blkcg2                       *   500/  500  90.91/ 90.91   0.00   0  0*000 089:091:092
  • blkio.cost.stat interface file of cgroup v1

    The Alibaba Cloud Linux kernel provides the blkio.cost.stat interface file for blk-iocost in cgroup v1. This interface file records the QoS data of each controlled device. Run the following command to view the interface file:

    cat /sys/fs/cgroup/blkio/blkcg1/blkio.cost.stat

    The following command output is returned:

    254:48 is_active=1 active=50 inuse=50 hweight_active=5957 hweight_inuse=5957 vrate=159571
  • ftrace tool

    The Alibaba Cloud Linux kernel provides ftrace trace events for the blk-iocost feature. You can use these events to trace the decisions of the scheduler and the processing of I/O requests in detail for in-depth performance analysis. Perform the following steps to use ftrace:

    1. Set the enable attribute of the iocost events to 1 to enable tracing.

      sudo sh -c 'echo 1 > /sys/kernel/debug/tracing/events/iocost/enable'
    2. View the output information.

      sudo cat /sys/kernel/debug/tracing/trace_pipe

      The following command output is returned:

          dd-1593  [008] d...   688.565349: iocost_iocg_activate: [vdd:/blkcg1] now=689065289:57986587662878 vrate=137438 period=22->22 vtime=0->57986365150756 weight=50/50 hweight=65536/65536
          dd-1593  [008] d.s.   688.575374: iocost_ioc_vrate_adj: [vdd] vrate=137438->137438 busy=0 missed_ppm=0:0 rq_wait_pct=0 lagging=1 shortages=0 surpluses=1
      <idle>-0     [008] d.s.   688.608369: iocost_ioc_vrate_adj: [vdd] vrate=137438->137438 busy=0 missed_ppm=0:0 rq_wait_pct=0 lagging=1 shortages=0 surpluses=1
          dd-1594  [006] d...   688.620002: iocost_iocg_activate: [vdd:/blkcg2] now=689119946:57994099611644 vrate=137438 period=22->26 vtime=0->57993412421644 weight=250/250 hweight=65536/65536
      <idle>-0     [008] d.s.   688.631367: iocost_ioc_vrate_adj: [vdd] vrate=137438->137438 busy=0 missed_ppm=0:0 rq_wait_pct=0 lagging=1 shortages=0 surpluses=1
      <idle>-0     [008] d.s.   688.642368: iocost_ioc_vrate_adj: [vdd] vrate=137438->137438 busy=0 missed_ppm=0:0 rq_wait_pct=0 lagging=1 shortages=0 surpluses=1
      <idle>-0     [008] d.s.   688.653366: iocost_ioc_vrate_adj: [vdd] vrate=137438->137438 busy=0 missed_ppm=0:0 rq_wait_pct=0 lagging=1 shortages=0 surpluses=1
      <idle>-0     [008] d.s.   688.664366: iocost_ioc_vrate_adj: [vdd] vrate=137438->137438 busy=0 missed_ppm=0:0 rq_wait_pct=0 lagging=1 shortages=0 surpluses=1
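
The hweight values in the blkio.cost.stat sample output above can be converted to percentages for easier reading. The sketch below assumes that 65536 represents 100%, which is consistent with the hweight=65536/65536 values in the ftrace sample output; verify this scale on your kernel before you rely on it. hweight_pct is a hypothetical helper name.

```shell
# Convert the field named $1 (for example, hweight_active) of a
# blkio.cost.stat line read from stdin into a percentage.
# Assumption: a raw value of 65536 corresponds to 100%.
hweight_pct() {
    awk -v key="$1" '{
        for (i = 2; i <= NF; i++) {
            split($i, kv, "=")
            if (kv[1] == key) { printf "%.2f\n", kv[2] * 100 / 65536; exit }
        }
    }'
}
```

For example, cat /sys/fs/cgroup/blkio/blkcg1/blkio.cost.stat | hweight_pct hweight_active converts the hweight_active value of the first listed device. For the sample line above, 5957 corresponds to about 9.09%, which agrees with a weight of 50 next to a sibling weight of 500.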