
Elastic Compute Service: Resolve the issue of disk space insufficiency on a Linux instance

Last Updated: Jan 24, 2024

If the No space left on device error message is returned when you create a file or application on a Linux Elastic Compute Service (ECS) instance, the instance has insufficient disk space. If the insufficiency is caused by your operations, refer to this topic to resolve the issue.

Note

If the disk space insufficiency is not caused by your operations, you can create more disks or resize existing disks to resolve the issue. For more information, see Create a disk, Attach a data disk, and Overview.

Causes

In most cases, disk space insufficiency occurs due to the following causes:

  1. The space usage of disk partitions reaches 100%.

  2. The inode usage of disk partitions reaches 100%.

  3. Leftover files (zombies) exist.

    Note

    Processes still hold open file handles to the deleted files. In this case, the disk space that is consumed by the deleted files cannot be released.

  4. Mount points are overwritten.

    Note

    For example, the file system of a disk partition that contains a large number of files is mounted on a directory, which is referred to as a mount point. If the file system of another disk partition is then mounted on the same mount point, the original file system is hidden (overwritten) by the new one. However, applications in your system may still read data from and write data to the original file system, so an error message that indicates insufficient disk space may be returned. If you run the df or du command to check disk space usage, the hidden files are not reported, because df and du return information only about the disk partition that corresponds to the current mount point.

  5. The upper limit on inotify watches is reached.

    The inotify API provides a mechanism for monitoring file system events in Linux and can be used to monitor file changes in real time. Although the error message mentions disk space, this error is not caused by insufficient disk space. It is described in this topic to help you troubleshoot.

Solutions

To resolve the issue, perform the following operations based on the cause of the issue:

1. The space usage of disk partitions reaches 100%

To resolve this issue, you can delete files or directories that consume a large amount of disk space, resize existing disks, or create more disks. The following sections describe these operations.

Delete files or directories that consume a large amount of disk space

  1. Connect to the ECS instance.

    For more information, see Connect to a Linux instance by using a password or key.

  2. Run the following command to query the usage of disk space:

    df -h

    A command output similar to the following one is displayed. The following command output indicates that the space usage of the /dev/xvda1 partition is 15%.

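    The output may look similar to the following sketch. Except for the /dev/xvda1 partition at 15% usage described above, the device names and values are hypothetical:

    Filesystem      Size  Used Avail Use% Mounted on
    /dev/xvda1       40G  5.6G   33G  15% /
    devtmpfs        3.9G     0  3.9G   0% /dev
    tmpfs           3.9G     0  3.9G   0% /dev/shm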

  3. Run the following commands to enter the root directory and identify which directory consumes the largest amount of disk space:

    cd /
    du -sh *

    A command output similar to the following one is displayed. In this example, the /usr directory consumes the largest amount of disk space, so you need to drill down into the /usr directory as described in the next step.
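    The output may look similar to the following sketch; all sizes are hypothetical, with /usr as the largest consumer:

    0       bin
    119M    boot
    0       dev
    31M     etc
    24K     home
    8.5G    usr
    1.2G    var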

  4. Run the following commands in sequence to identify, level by level, which directory consumes the largest amount of disk space. In this example, the /usr directory consumes the largest amount of disk space, so you need to further identify which file or directory in /usr consumes the most space.

    cd /usr
    du -sh *

    A command output similar to the following one is displayed. In this example, the local directory consumes the largest amount of disk space in the /usr directory, so you need to drill down into the local directory next. Repeat this operation until you identify the file or directory that consumes the largest amount of disk space.
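    The output may look similar to the following sketch; all sizes are hypothetical, with the local directory as the largest consumer in /usr:

    306M    bin
    26M     include
    195M    lib
    6.8G    local
    44M     sbin
    368M    share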

  5. Delete files or directories that are no longer required based on your business requirements.

Resize existing disks or create more disks

If no files can be deleted to release disk space, you can resize existing disks or create more disks. For more information, see Create a disk, Attach a data disk, and Overview.

2. The inode usage of disk partitions reaches 100%

When the inode usage of disk partitions reaches 100%, applications cannot create directories or files even if disk space is still available. This condition is easy to overlook because commands such as df -h may still show free space. To resolve this issue, you can delete files or directories that use a large number of inodes or increase the number of inodes.

Note

In Linux, an inode contains the following information about a file: the type, size, permissions, and owner, the number of links to the file, the creation time and update time, and pointers to the data blocks. Modify inode configurations only when the inode usage reaches 100%.

Query the inode usage

  1. Connect to the ECS instance.

    For more information, see Connect to a Linux instance by using a password or key.

  2. Run the following command to query the inode usage:

    df -i

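    The output may look similar to the following sketch; all values are hypothetical. The IUse% column shows the inode usage of each partition. In this hypothetical example, the inode usage of the /dev/xvda1 partition has reached 100%:

    Filesystem      Inodes   IUsed   IFree IUse% Mounted on
    /dev/xvda1     2621440 2621440       0  100% /
    devtmpfs        998982     388  998594    1% /dev
    tmpfs          1001101       1 1001100    1% /dev/shm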

  3. If the inode usage reaches or is about to reach 100%, perform the following operations:

Delete files or directories that use a large number of inodes

If you do not want to format disks to increase the number of inodes, perform the following operations to delete files or directories that use a large number of inodes:

  1. Run the following command to query the number of files that exist in each subdirectory of the root directory:

    for i in /*; do echo "$i"; find "$i" | wc -l; done

    A command output similar to the following one is displayed. The command output indicates that the /usr directory has the largest number of files. Then, you need to identify which directory in the /usr directory has the largest number of files. Inode usage increases with the number of files. Perform operations based on the actual scenario.

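    The output may look similar to the following sketch; the command prints each directory name followed by its file count. All counts are hypothetical, with /usr containing the most files:

    /bin
    1
    /boot
    119
    /etc
    1432
    /home
    5
    /usr
    152863
    /var
    6440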

  2. Adapt the preceding command to each level of subdirectories (for example, replace /* with /usr/*) to identify which file or directory uses the largest number of inodes. Then, delete files or directories that are no longer required.

Increase the number of inodes

If files stored on disks cannot be deleted or if the inode usage remains high after files are deleted, back up disk data, format the disks, and then copy the data back to the disks to increase the number of inodes in the file systems.

Warning
  • To increase the number of inodes on a disk, the disk must be formatted, which deletes all data stored on the disk. Before you format the disk, back up the data by copying it or by creating snapshots of the disk. For information about how to create snapshots, see Create a snapshot of a disk.

  • To increase the number of inodes, you must unmount the file systems from mount points, which may interrupt your application services. We recommend that you unmount the file systems during off-peak hours.

  1. Run the following command to unmount a file system from a mount point.

    In this example, a file system is unmounted from the /home mount point. Modify the command based on the actual scenario.

    umount /home
  2. Run the following command to create a file system and specify a larger number of inodes.

    In this example, an ext3 file system is created for the /dev/xvdb partition and the number of inodes is set to 1,638,400. Modify the command based on the actual scenario.

    mkfs.ext3 /dev/xvdb -N 1638400
    Note

    In Linux, the number of inodes is limited by the disk capacity. In most cases, file systems allocate one inode for every 16 KB of disk capacity: Number of inodes = Disk capacity (in KB)/16. For example, a 40 GB disk can have 2,621,440 inodes based on this formula. The maximum number of inodes allowed for a disk is 2^32, which is approximately 4.3 billion. Specify an appropriate number of inodes based on the disk capacity.

  3. Run the following command to mount the new file system to the mount point.

    In this example, the new file system is mounted to the /home mount point based on the configurations in the /etc/fstab file. Modify the command based on the actual scenario.

    mount -a
  4. (Optional) Run the following command to view the number of inodes and check whether the number of inodes is increased:

    dumpe2fs -h /dev/xvdb | grep node

    A command output similar to the following one is displayed. The command output indicates that the number of inodes is increased. You can copy backup data back to the disk and restore the affected applications.

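    The output may look similar to the following sketch, assuming that the file system was re-created with 1,638,400 inodes as in the preceding example; the other values are hypothetical:

    Inode count:              1638400
    Free inodes:              1638389
    Inodes per group:         8192
    Inode blocks per group:   512
    First inode:              11
    Inode size:               256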

3. Leftover files (zombies) exist

If no exceptions are detected for the space usage and inode usage of disk partitions, disk space insufficiency on the instance may be caused by a large number of leftover files. If files are deleted but are still held open by processes, the files are displayed in the deleted state and the disk space that they occupy cannot be released. This space is counted as used by the df command but is not visible to the du command, which is why the two commands can report inconsistent results. An excessively large number of leftover files can occupy a large amount of disk space. Perform the following operations to query and delete leftover files:

  1. Connect to the ECS instance.

    For more information, see Connect to a Linux instance by using a password or key.

  2. If lsof is not installed on the operating system, run one of the following commands to install lsof:

    • Alibaba Cloud Linux and CentOS:

      yum install -y lsof
    • Debian and Ubuntu:

      apt-get install -y lsof
  3. Run the following command to query the space usage of leftover files:

    lsof | grep delete | sort -k7 -rn | more

    A command output similar to the following one is displayed. The sizes of files that are in the deleted state are displayed in the seventh column. Check whether the sum of these sizes is close to the unexpected disk space usage. If it is, leftover files are consuming the disk space.
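    The following single line of sample output is illustrative; all names and values are hypothetical. In this example, a deleted log file of about 4 GB is still held open:

    mysqld    11305   mysql    5u   REG   253,1   4294967296   123456   /var/log/mysql/slow.log (deleted)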

  4. Use one of the following methods to release the file handles, clear the leftover files, and then release the disk space:

    • Restart the instance

      After the instance is restarted, the system terminates running processes and releases the handles of deleted files.

      Important

      Instance restarts may interrupt services. We recommend that you restart instances during off-peak hours.

    • Run the kill command

      After you run the lsof command, the process IDs (PIDs) of running processes that correspond to the leftover files are usually displayed in the second column. You can specify a PID in the kill command to terminate the corresponding process.

      1. Run the following command to list the PIDs of processes:

        lsof | grep delete
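        A sample line of output (with hypothetical values) follows; the PID appears in the second column, 11305 in this case:

        mysqld    11305   mysql    5u   REG   253,1   4294967296   123456   /var/log/mysql/slow.log (deleted)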
      2. Run the following command to terminate a process that corresponds to a leftover file based on your business requirements:

        kill <PID>
        Important

        When processes are terminated, services that run on instances may be affected. Proceed with caution.

4. Mount points are overwritten

If the issue persists after you perform the preceding operations, check whether it is caused by overwritten mount points, as shown in the following example.

In the following example, the space usage of the 30 GB /dev/vda1 system disk reaches 95%, and the du command shows that the /home directory consumes 24 GB of disk space.

After /dev/vdb1 is mounted on the /home directory, the space usage of the /dev/vda1 system disk is still 95%. However, in the root partition of the system disk, only the /usr directory consumes more than 1 GB of disk space, and the /home directory now consumes only 20 KB instead of the 24 GB found in the previous check. You cannot identify which directory consumes the missing disk space. In this case, the issue may be caused by an overwritten mount point.

To resolve this issue, unmount the disk partition and check the space usage of the original mount point.

Warning

The unmount operation may interrupt your application services. We recommend that you unmount partitions during off-peak hours.
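A minimal sketch of this check, assuming that /dev/vdb1 is mounted on the /home directory as in the preceding example:

    umount /home            # unmount the partition that overwrites the mount point
    du -sh /home            # check the space usage of the original /home directory
    mount /dev/vdb1 /home   # remount the partition after the check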

5. The upper limit on inotify watches is reached

If an error message similar to tail: cannot watch '...': No space left on device is returned when you run commands such as tail -f, the upper limit on inotify watches is reached. In this case, you can increase the upper limit on inotify watches to resolve the issue.

  1. Run the following command to view the upper limit on inotify watches:

    cat /proc/sys/fs/inotify/max_user_watches
  2. Run the following command to change the upper limit on inotify watches:

    sudo sysctl fs.inotify.max_user_watches=<new upper limit>

    Replace <new upper limit> with the upper limit that you want to specify for inotify watches.
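    For example, the following command raises the limit to 524288, a value chosen here only for illustration:

    sudo sysctl fs.inotify.max_user_watches=524288

    This change does not persist across reboots. A common way to make it permanent is to add the line fs.inotify.max_user_watches=524288 to the /etc/sysctl.conf file and then run sudo sysctl -p.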

    Note

    If you increase the upper limit on inotify watches, they may consume more system memory. Before you change the limit, evaluate the memory and performance of the system and the possible impact. You can run the man 7 inotify command to learn more about inotify watches and the related settings.