What do I do if a Linux instance is unresponsive and error messages similar to "jbd2/vda1-8:366 blocked for more than 120 seconds" appear in the /var/log/messages file?

This topic describes why a Linux Elastic Compute Service (ECS) instance becomes unresponsive while error messages similar to "INFO:task jbd2/vda1-8:366 blocked for more than 120 seconds" appear in the /var/log/messages file, and how to resolve the issue.

Problem description

A Linux ECS instance is unresponsive or slow, has a high system load, or cannot run specific processes as expected. A large number of error messages similar to the following appear in the /var/log/messages file:

[8291809.483930] INFO:task jbd2/vda1-8:366 blocked for more than 120 seconds.
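
To gauge how often the warning occurs and which tasks it names, the log can be summarized with standard tools. The following is a minimal sketch rather than part of the official procedure: the function names count_hung_tasks and list_hung_tasks are hypothetical, and the default path assumes that kernel messages are written to /var/log/messages.

```shell
# Count hung-task warnings in a log file (default: /var/log/messages).
count_hung_tasks() {
    log_file="${1:-/var/log/messages}"
    grep -c 'blocked for more than [0-9]* seconds' "$log_file"
}

# List the reported task names (name:PID), most frequent first.
list_hung_tasks() {
    log_file="${1:-/var/log/messages}"
    grep -o 'task [^ ]* blocked for more than' "$log_file" |
        awk '{print $2}' | sort | uniq -c | sort -rn
}

# Example usage (reading the log may require root privileges):
#   sudo sh -c '. ./hung_tasks.sh; count_hung_tasks'
#   list_hung_tasks /var/log/messages
```

If the same jbd2 thread dominates the list, the file system on the corresponding partition is the likely point of contention.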

Cause

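Journaling Block Device 2 (jbd2) is the journaling layer in the Linux kernel that journaled file systems such as Ext4 rely on. XFS implements its own journal and does not use jbd2. The jbd2/vda1-8 task named in the message is the kernel journal thread for the file system on the vda1 partition. Error messages similar to "INFO:task jbd2/vda1-8:366 blocked for more than 120 seconds" in the /var/log/messages file indicate that this thread could not make progress for longer than the reported time, which points to a bottleneck in disk operations. This issue may occur due to the following reasons: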

Processes are blocked. If a process encounters an issue, such as a deadlock or a memory leak, the process may be blocked and unable to continue to run.
Kernel issues exist. Issues in the kernel, such as bugs or vulnerabilities, can cause tasks to be blocked.
System resources are constrained. The system resources of a Linux ECS instance, such as CPUs, memory, and disks, are overutilized by applications or processes.
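
The kernel's hung-task detector reports tasks that stay in uninterruptible sleep (state D in ps output) for longer than its timeout, so listing D-state processes can help tell which of the preceding causes applies. The following is a minimal sketch; the function name filter_d_state is hypothetical, and the input is assumed to have the process state in the second column.

```shell
# Keep only the lines whose second column (process state) starts with D,
# that is, processes in uninterruptible sleep, usually waiting on I/O.
filter_d_state() {
    awk '$2 ~ /^D/'
}

# Example usage:
#   ps -eo pid=,stat=,comm= | filter_d_state
#
# Related checks:
#   cat /proc/sys/kernel/hung_task_timeout_secs   # detector timeout (commonly 120)
#   uptime                                        # load averages
#   free -m                                       # memory usage in MiB
```

Many D-state processes together with a high load average suggest disk or resource pressure, whereas a single process stuck in state D for a long time is more consistent with a blocked process or a kernel issue.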

Solution

Perform the following steps to troubleshoot the issue:

Restart the Linux ECS instance.
If the Linux ECS instance cannot run as expected, restart the instance to restore normal operation. For more information, see Restart an instance.

Check the journaling configuration of file systems.

Run the following command to query information about the jbd2 process:

sudo ps -ef | grep jbd2

The following command output is returned:

root       370     2  0 15:50 ?        00:00:00 [jbd2/vda1-8]
root       371     2  0 15:50 ?        00:00:00 [jbd2-ckpt/vda1-]
root      1910  1833  0 15:52 pts/0    00:00:00 grep --color=auto jbd2

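Run the following command to check whether the journaling feature is enabled for the file system on a partition. In this example, the /dev/vda1 partition is used. The dumpe2fs command applies only to Ext2, Ext3, and Ext4 file systems.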

sudo dumpe2fs /dev/vda1 | grep has_journal

If the journaling feature is enabled for the file system on the partition, has_journal is displayed in the command output, as shown in the following example:

dumpe2fs 1.43.5 (04-Aug-2017)
Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
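
When this check needs to be scripted, for example across several partitions, the feature list can be tested instead of read by eye. The following is a minimal sketch; the function name has_journal_feature is hypothetical, and it assumes dumpe2fs output in the format shown above.

```shell
# Read dumpe2fs output on stdin and succeed (exit status 0) only if the
# "Filesystem features:" line lists has_journal as a whole word.
has_journal_feature() {
    grep '^Filesystem features:' | grep -qw 'has_journal'
}

# Example usage (-h prints only the superblock information):
#   if sudo dumpe2fs -h /dev/vda1 2>/dev/null | has_journal_feature; then
#       echo "journaling is enabled on /dev/vda1"
#   fi
```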

Run the following command to view the kernel version of the operating system:
```
sudo uname -a
```
(Optional) Upgrade the kernel version.
If the kernel version is outdated, upgrade the kernel version to resolve the issue. For more information, see Upgrade the operating system kernel of a Linux ECS instance.
Warning
A kernel upgrade is a complex, high-risk operation that may cause system instability or incompatibility issues. Make sure that you fully understand the upgrade procedure and its impacts, and proceed with caution. Before you upgrade the kernel version, we recommend that you back up important data by creating snapshots for the disks attached to the Linux ECS instance. For more information, see Create a snapshot.
(Optional) Upgrade the instance type.
If the Linux ECS instance has high system load or does not have sufficient resources due to low instance specifications, upgrade the instance type to resolve the issue. For more information, see Troubleshoot and resolve high load issues on Linux instances and Overview of instance configuration changes.