Simple Application Server: Detect I/O hangs of file systems and block layers

Last Updated: Feb 27, 2026

Alibaba Cloud Linux 2 and Alibaba Cloud Linux 3 provide kernel-level interfaces that detect I/O hangs in file systems and block layers with minimal system overhead.

Overview

An I/O hang occurs when I/O requests take so long to complete that the system becomes unstable or even goes down.

Alibaba Cloud Linux extends core kernel data structures to expose I/O hang detection through sysfs, debugfs, and procfs interfaces. Use these interfaces to:

  • Set and adjust I/O hang detection thresholds.

  • Monitor the number of hung I/O operations on a block storage device.

  • Inspect detailed information about individual hung I/O requests.

  • Identify the specific resources that a process or thread is waiting for.

Prerequisites

  • The instance runs Alibaba Cloud Linux 2 or Alibaba Cloud Linux 3.

Interfaces

The following tables list all available interfaces. Replace the placeholders in the paths with your actual values:

  • <device>: The name of the block storage device, such as vdb.

  • <pid>: The process ID.

  • <tid>: The thread ID.

Threshold and monitoring interfaces

Use these interfaces to configure detection sensitivity and check whether I/O hangs have occurred.

| Interface | Path | Description |
| --- | --- | --- |
| Hang threshold | /sys/block/<device>/queue/hang_threshold | Query or set the threshold for I/O hangs in milliseconds. An I/O request that exceeds this duration is classified as hung. Default: 5000 (5 seconds). |
| Hang count | /sys/block/<device>/hang | Query the number of I/O operations that exceed the threshold for I/O hangs on the device. Returns two space-separated integers: the read hang count and the write hang count. |

Diagnostic interfaces

Use these interfaces to investigate the root cause of detected I/O hangs.

| Interface | Path | Description |
| --- | --- | --- |
| Hang details | /sys/kernel/debug/block/<device>/rq_hang | Query detailed information about each hung I/O request through debugfs. Output includes the operation type, request state, and timestamps. |
| Process wait resource | /proc/<pid>/wait_res | Query information about the resources for which a process is waiting. |
| Thread wait resource | /proc/<pid>/task/<tid>/wait_res | Query information about the resources for which a thread is waiting. |
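Before relying on the device-level interfaces listed above, it can help to confirm that they exist on a given instance. The following is a minimal sketch, not an official tool: check_hang_interfaces is a hypothetical helper name, and its optional second argument (a root prefix) is an assumption added only so the function can be pointed at a test directory. Because rq_hang lives in debugfs, which is typically mounted at /sys/kernel/debug and readable only by root, a "missing" result for that path may simply mean that debugfs is not mounted or that the check was not run as root.

```shell
# Sketch: report which device-level I/O hang interfaces are readable.
# check_hang_interfaces is a hypothetical helper, not part of Alibaba Cloud Linux.
# $1 = block device name (for example vdb)
# $2 = optional root prefix (assumption: used only for testing; empty means "/")
check_hang_interfaces() {
    dev=$1
    root=${2:-}
    for p in \
        "/sys/block/$dev/queue/hang_threshold" \
        "/sys/block/$dev/hang" \
        "/sys/kernel/debug/block/$dev/rq_hang"
    do
        if [ -r "$root$p" ]; then
            echo "present $p"
        else
            echo "missing $p"
        fi
    done
}
```

For example, check_hang_interfaces vdb prints one "present" or "missing" line for each of the three paths.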

wait_res output fields

The wait_res interfaces return four space-separated fields:

| Field | Position | Description |
| --- | --- | --- |
| Resource type | 1 | The type of resource the process or thread is waiting for. 1 = page cache in the file system. 2 = block I/O layer. |
| Address | 2 | The address of the resources (page cache or block I/O layers) being waited on. |
| Wait start time | 3 | The time at which the process began waiting for resources. |
| Current time | 4 | The current time when the file is being read. The difference between Field 4 and Field 3 is the wait duration. |

Examples

Adjust the I/O hang threshold

By default, the hang threshold is 5,000 ms. To change the threshold to 10,000 ms for device vdb, run the following command:

echo 10000 > /sys/block/vdb/queue/hang_threshold

Verify the change:

cat /sys/block/vdb/queue/hang_threshold

Expected output:

10000

A higher threshold reduces false positives in environments with expected high-latency I/O. A lower threshold increases detection sensitivity.
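The write and the read-back can be combined into one step with basic input validation. This is a sketch under stated assumptions: set_hang_threshold is a hypothetical helper name, the optional third argument (a root prefix) exists only so the function can be tested against a temporary directory, and the only validation applied is that the value is a non-negative integer. The range of values that the kernel accepts is not stated here and is not checked.

```shell
# Sketch: set the hang threshold for a device and print the value read back.
# set_hang_threshold is a hypothetical helper, not a system command.
# $1 = block device name, $2 = threshold in milliseconds
# $3 = optional root prefix (assumption: used only for testing)
set_hang_threshold() {
    dev=$1
    ms=$2
    root=${3:-}
    file="$root/sys/block/$dev/queue/hang_threshold"

    # Reject empty or non-numeric input before touching sysfs.
    case $ms in
        ''|*[!0-9]*)
            echo "threshold must be a non-negative integer (ms)" >&2
            return 1
            ;;
    esac

    # The interface file must exist and be writable (usually requires root).
    if [ ! -w "$file" ]; then
        echo "cannot write $file" >&2
        return 1
    fi

    echo "$ms" > "$file" || return 1
    cat "$file"
}
```

For example, set_hang_threshold vdb 10000 performs the same write and verification as the two commands above.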

Check for hung I/O operations

Query the number of hung I/O operations on device vdb:

cat /sys/block/vdb/hang

Sample output:

0        1

The left value (0) is the number of hung read operations, and the right value (1) is the number of hung write operations. In this example, one write operation has exceeded the hang threshold.
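For scripted monitoring, the two counters can be split into named values. The following is a minimal sketch: report_hangs is a hypothetical helper name, the optional root prefix is an assumption used only for testing, and the choice to return a non-zero exit status when any hang is recorded is a design decision of this example rather than behavior of the interface.

```shell
# Sketch: print the read and write hang counts for a device.
# report_hangs is a hypothetical helper, not a system command.
# Exit status: 0 if both counters are zero, non-zero otherwise.
# $1 = block device name, $2 = optional root prefix (assumption: testing only)
report_hangs() {
    dev=$1
    root=${2:-}
    reads=
    writes=
    # read splits on any run of whitespace, so wide column gaps are fine.
    read -r reads writes < "$root/sys/block/$dev/hang" || [ -n "$reads" ] || return 2
    echo "$dev: read_hangs=$reads write_hangs=$writes"
    [ $((reads + writes)) -eq 0 ]
}
```

With the sample output above, report_hangs vdb prints "vdb: read_hangs=0 write_hangs=1" and returns a non-zero status, which makes it usable in a condition such as: report_hangs vdb || echo "hung I/O detected".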

Inspect hung I/O request details

For detailed information about hung I/O requests on device vdb, query the debugfs interface:

cat /sys/kernel/debug/block/vdb/rq_hang

Sample output:

ffff9e50162fc600 {.op=WRITE, .cmd_flags=SYNC, .rq_flags=STARTED|ELVPRIV|IO_STAT|STATS, .state=in_flight, .tag=118, .internal_tag=67, .start_time_ns=1260981417094, .io_start_time_ns=1260981436160, .current_time=1268458297417, .bio = ffff9e4907c31c00, .bio_pages = { ffffc85960686740 }, .bio = ffff9e4907c31500, .bio_pages = { ffffc85960639000 }, .bio = ffff9e4907c30300, .bio_pages = { ffffc85960651700 }, .bio = ffff9e4907c31900, .bio_pages = { ffffc85960608b00 }}

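The output shows the details of one hung I/O request. In this sample, .current_time minus .start_time_ns is 7,476,880,323 ns (about 7.48 seconds), which exceeds the default threshold of 5,000 ms. Key fields include: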

| Field | Description |
| --- | --- |
| .op | The I/O operation type, such as WRITE or READ. |
| .cmd_flags | The command flags for the request, such as SYNC. |
| .rq_flags | The request flags, joined by vertical bars in the output. The sample shows STARTED, ELVPRIV, IO_STAT, and STATS. |
| .state | The current state of the request, such as in_flight (submitted to the device driver). |
| .tag, .internal_tag | The block layer tag and internal tag assigned to the request. |
| .start_time_ns | The time when the I/O request was created, in nanoseconds. |
| .io_start_time_ns | The time when the I/O request was dispatched to the device driver, in nanoseconds. If this field has an assigned value, the request was dispatched but was not processed in a timely manner. |
| .current_time | The current kernel time, in nanoseconds. |
| .bio, .bio_pages | The associated block I/O (bio) structures and their page addresses. |
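The timestamps in each rq_hang entry can be turned into durations. The following sketch reads rq_hang lines from standard input and prints the operation type, the age of the request in milliseconds, and the time it spent before dispatch in microseconds. It assumes that each request occupies one line in the format of the sample above, that all three timestamps use the same clock, and that a missing or zero .io_start_time_ns means the request has not been dispatched. summarize_rq_hang is a hypothetical helper name.

```shell
# Sketch: summarize rq_hang entries read from standard input.
# summarize_rq_hang is a hypothetical helper; the one-request-per-line
# format is an assumption based on the sample output.
summarize_rq_hang() {
    while IFS= read -r line || [ -n "$line" ]; do
        # Extract a field by its ".name=" prefix. The leading literal dot keeps
        # ".start_time_ns" from matching inside ".io_start_time_ns".
        op=$(printf '%s\n' "$line" | sed -n 's/.*\.op=\([A-Z_]*\).*/\1/p')
        start=$(printf '%s\n' "$line" | sed -n 's/.*\.start_time_ns=\([0-9][0-9]*\).*/\1/p')
        issued=$(printf '%s\n' "$line" | sed -n 's/.*\.io_start_time_ns=\([0-9][0-9]*\).*/\1/p')
        now=$(printf '%s\n' "$line" | sed -n 's/.*\.current_time=\([0-9][0-9]*\).*/\1/p')

        # Skip lines that do not carry the timestamps.
        [ -n "$start" ] && [ -n "$now" ] || continue

        # Assumption: zero or missing means the request was never dispatched.
        if [ "${issued:-0}" -gt 0 ]; then
            queued=$(( (issued - start) / 1000 ))
        else
            queued=not-dispatched
        fi

        echo "op=$op age_ms=$(( (now - start) / 1000000 )) queued_us=$queued"
    done
}
```

A typical invocation would be: summarize_rq_hang < /sys/kernel/debug/block/vdb/rq_hang. For the sample entry above, this prints "op=WRITE age_ms=7476 queued_us=19", which suggests that the request waited only briefly before dispatch and spent nearly all of its time after being handed to the driver.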

Identify resources a process is waiting for

Query the resources for which process 577 is waiting:

cat /proc/577/wait_res

Sample output:

1 0000000000000000 4310058496 4310061448

Based on the wait_res output fields table, interpret this output as follows:

  • Resource type = 1: The process is waiting for a page cache operation in the file system.

  • Address = 0000000000000000: The address of the page cache entry.

  • Wait start time = 4310058496 and Current time = 4310061448: The difference between these values, 2,952, is the amount of time the process has been waiting, expressed in the same unit as the two timestamps.
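The interpretation above can be automated. The following sketch reads one wait_res line from standard input, maps the resource type to a label, and subtracts the two timestamps. describe_wait_res is a hypothetical helper name, the output labels are this example's own, and the wait duration is printed without a unit because it simply carries over the unit of the two timestamps.

```shell
# Sketch: decode one line of wait_res output read from standard input.
# describe_wait_res is a hypothetical helper, not a system command.
describe_wait_res() {
    rtype=
    read -r rtype addr start now || [ -n "$rtype" ] || return 1
    case $rtype in
        1) what=page_cache ;;       # page cache in the file system
        2) what=block_io ;;         # block I/O layer
        *) what="unknown($rtype)" ;;
    esac
    # Field 4 minus Field 3 is the wait duration.
    echo "resource=$what address=$addr waited=$((now - start))"
}
```

For example, describe_wait_res < /proc/577/wait_res would print "resource=page_cache address=0000000000000000 waited=2952" for the sample output above.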

To query a specific thread within a process, use the thread-level interface:

cat /proc/<pid>/task/<tid>/wait_res
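When the thread of interest is not known in advance, the thread-level interface can be read for every thread of a process. The following is a minimal sketch: list_thread_waits is a hypothetical helper name, and the optional root prefix is an assumption used only for testing. Threads whose wait_res file cannot be read (for example, because the thread has exited) are skipped.

```shell
# Sketch: print the wait_res line of every thread in a process.
# list_thread_waits is a hypothetical helper, not a system command.
# $1 = process ID, $2 = optional root prefix (assumption: testing only)
list_thread_waits() {
    pid=$1
    root=${2:-}
    for f in "$root/proc/$pid"/task/*/wait_res; do
        # Skip unmatched globs and threads that exited in the meantime.
        [ -r "$f" ] || continue
        # Derive the thread ID from the path .../task/<tid>/wait_res.
        tid=${f%/wait_res}
        tid=${tid##*/}
        echo "tid=$tid $(cat "$f")"
    done
}
```

For example, list_thread_waits 577 prints one line per thread, prefixed with the thread ID and followed by the four wait_res fields.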