Aliyun Linux 2 provides an interface in kernel version 4.19.81-17.al7 that collects additional statistics on Linux block I/O throttling, which makes throttling easier to monitor. This topic describes the interface and how to use it.

Background information

Linux block I/O throttling, by bandwidth in bytes per second (bps) or by IOPS, is required in many scenarios, especially those where cgroup writeback is enabled. Aliyun Linux 2 provides an interface that enhances the monitoring of block I/O throttling to simplify operations and troubleshooting.

Interface description

Interface Description
blkio.throttle.io_service_time The total time from when I/O operations are issued by the block I/O throttling layer to when they are completed. Unit: ns.
blkio.throttle.io_wait_time The total time that I/O operations were held at the block I/O throttling layer. Unit: ns.
blkio.throttle.io_completed The number of completed I/O operations. This value is used to calculate the average latency of the block I/O throttling layer. Unit: operations.
blkio.throttle.total_io_queued The cumulative number of I/O operations that have been throttled. By sampling this value periodically, you can calculate the number of I/O operations throttled in the current interval and determine whether an I/O latency is related to throttling. Unit: operations.
blkio.throttle.total_bytes_queued The cumulative number of bytes of I/O that have been throttled. Unit: bytes.

The preceding files are located in /sys/fs/cgroup/blkio/<cgroup>/, where <cgroup> is the name of the control group.
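The interval calculation described for blkio.throttle.total_io_queued can be sketched as follows. This is a minimal sketch, not part of the interface: the helper names read_throttle_stat and throttled_in_interval are hypothetical, and the code assumes that each line of a statistics file has the form "<major>:<minor> <Operation> <value>".

```shell
#!/bin/sh
# Hypothetical helper: print the counter for one device and one operation
# from a blkio.throttle.* statistics file.
# Assumption: each line has the form "<major>:<minor> <Operation> <value>".
read_throttle_stat() {
    # $1 = statistics file, $2 = device as major:minor, $3 = operation (for example, Write)
    awk -v dev="$2" -v op="$3" '$1 == dev && $2 == op { print $3 }' "$1"
}

# Hypothetical helper: number of I/O operations throttled between two samples
# of blkio.throttle.total_io_queued.
throttled_in_interval() {
    # $1 = earlier sample, $2 = later sample
    echo $(( $2 - $1 ))
}

# Possible usage (replace <cgroup> and <major>:<minor> with your own values):
#   f=/sys/fs/cgroup/blkio/<cgroup>/blkio.throttle.total_io_queued
#   before=$(read_throttle_stat "$f" <major>:<minor> Write)
#   sleep 5
#   after=$(read_throttle_stat "$f" <major>:<minor> Write)
#   throttled_in_interval "$before" "$after"
```

A nonzero result for an interval in which latency increased suggests that throttling contributed to the latency. A result of zero suggests that the cause lies elsewhere.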

Example

You can use the enhanced monitoring interface to obtain the average I/O latency of a disk. In this example, the write statistics of the vdd disk are sampled at two points in time that are five seconds apart, and the average write latency of the disk during that interval is calculated from the two samples. The following table describes the variables that are used.

Parameter Description
write_wait_time<N> The total time that I/O operations were held at the block I/O throttling layer, sampled at time T<N>.
write_service_time<N> The total time from when I/O operations are issued by the block I/O throttling layer to when they are completed, sampled at time T<N>.
write_completed<N> The number of completed I/O operations, sampled at time T<N>.
  1. Obtain the monitoring data at time T1. In these commands, blkcg1 is the example control group and 254:48 is the major:minor device number of the vdd disk in this example.
    write_wait_time1=$(grep -w "254:48 Write" /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_wait_time | awk '{print $3}')
    write_service_time1=$(grep -w "254:48 Write" /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_service_time | awk '{print $3}')
    write_completed1=$(grep -w "254:48 Write" /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_completed | awk '{print $3}')
  2. Wait five seconds, and then obtain the monitoring data at time T2 (T1 + 5s).
    write_wait_time2=$(grep -w "254:48 Write" /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_wait_time | awk '{print $3}')
    write_service_time2=$(grep -w "254:48 Write" /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_service_time | awk '{print $3}')
    write_completed2=$(grep -w "254:48 Write" /sys/fs/cgroup/blkio/blkcg1/blkio.throttle.io_completed | awk '{print $3}')
  3. Calculate the average I/O latency during the five seconds.
    Average I/O latency = (Total I/O duration at T2 - Total I/O duration at T1)/(Number of completed I/O operations at T2 - Number of completed I/O operations at T1), where the total I/O duration is the sum of the wait time and the service time.
    avg_delay=$(echo "(($write_wait_time2 + $write_service_time2) - ($write_wait_time1 + $write_service_time1)) / ($write_completed2 - $write_completed1)" | bc)
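The calculation in step 3 divides by the number of I/O operations completed in the interval, which is zero if the disk was idle. The following is a minimal sketch of the same formula with a guard for that case. The function name avg_io_latency and the "n/a" result are hypothetical choices for illustration and are not part of the interface.

```shell
#!/bin/sh
# Hypothetical helper: average latency in ns per completed I/O operation
# between two samples.
# Arguments: wait1 service1 completed1 wait2 service2 completed2
avg_io_latency() {
    completed=$(( $6 - $3 ))
    if [ "$completed" -le 0 ]; then
        # No I/O completed in the interval, so the average is undefined.
        echo "n/a"
        return 0
    fi
    # Total I/O duration = wait time + service time; take the T2 - T1 difference.
    total=$(( ($4 + $5) - ($1 + $2) ))
    # Integer division, matching the default behavior of bc (scale = 0).
    echo $(( total / completed ))
}
```

With the variables from steps 1 and 2, the function could be called as avg_io_latency "$write_wait_time1" "$write_service_time1" "$write_completed1" "$write_wait_time2" "$write_service_time2" "$write_completed2". The result is in nanoseconds because both time counters are reported in ns.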