Troubleshoot 100% CPU usage on a Linux instance

Last Updated: Dec 29, 2020

Disclaimer: This article may contain information about third-party products. Such information is for reference only. Alibaba Cloud does not make any guarantee, express or implied, with respect to the performance and reliability of third-party products, as well as potential impacts of operations on the products.

Overview

This topic describes how to troubleshoot abnormal CPU usage on a Linux ECS instance.

Background

Alibaba Cloud reminds you that:

  • Before you perform operations that may cause risks, such as modifying instance configurations or data, we recommend that you check the disaster recovery and fault tolerance capabilities of the instances to ensure data security.
  • If you modify the configurations and data of instances including but not limited to ECS and RDS instances, we recommend that you create snapshots or enable RDS log backup.
  • If you have authorized or submitted security information such as the logon account and password in the Alibaba Cloud Management console, we recommend that you modify such information in a timely manner.

If CPU usage reaches 100%, you may be unable to run the top or htop command to identify the processes that consume CPU resources. In that case, use the following three checks to troubleshoot the problem.

View monitoring data in the CloudMonitor console

Log on to the CloudMonitor console and click Host Monitoring to open the monitoring chart of the abnormal host. In the operating system monitoring area, view the CPU usage of the host.

Note: In the monitoring data, note the time at which CPU usage first reached 100% and the period during which it stayed at 100% without dropping.

Check whether system commands on the Linux instance have been modified

  1. Connect to the Linux instance and run the following command to check whether a system command has been modified recently. Compare the change time of the command with the time at which CPU usage reached 100%.
    stat /usr/bin/top
    The output includes the Modify and Change timestamps of the file.
  2. Run the following rpm commands in sequence to verify whether the system commands have been modified. If the files are unmodified, the commands return no output.
    rpm -Vf /bin/ps
    rpm -Vf /usr/bin/top
    If a system command has been modified, rpm lists the affected files together with the attributes that differ from the package database.
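
The two checks above can also be scripted. The following is a minimal sketch, assuming GNU stat and awk are available. The function names changed_since and digest_mismatches are hypothetical helpers, not part of any Alibaba Cloud tool. changed_since prints the files whose change time is at or after a given epoch timestamp, and digest_mismatches filters rpm -V output for files whose content digest differs (a 5 in the third position of the attribute string).

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical.

# Print each given file whose change time (ctime) is at or after SINCE,
# where SINCE is given in seconds since the epoch. Requires GNU stat.
changed_since() {
    since="$1"
    shift
    for f in "$@"; do
        # Skip paths that do not exist on this system
        [ -e "$f" ] || continue
        # %Z = time of last status change, in seconds since the epoch
        ctime=$(stat -c %Z "$f") || continue
        if [ "$ctime" -ge "$since" ]; then
            echo "$f"
        fi
    done
}

# Read `rpm -V` output on stdin and print the paths whose attribute string
# has "5" in the third position, which means the file digest differs.
digest_mismatches() {
    awk 'substr($1, 3, 1) == "5" { print $NF }'
}

# Example (the timestamp is an assumption; use the time seen in CloudMonitor):
# changed_since "$(date -d '2020-12-01 03:00' +%s)" /usr/bin/top /bin/ps
# rpm -Vf /usr/bin/top | digest_mismatches
```

If changed_since prints a system command whose change time matches the start of the CPU spike, and digest_mismatches prints the same path, the command has likely been replaced. Paths that contain spaces are not handled by this sketch.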

View the external connections of a Linux instance

Run the following command to check whether the instance is connected to an abnormal domain name, such as crypto-pool.fr in this example.

iftop -i [$Device] -n -P

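Note: [$Device] is the name of the network interface controller (NIC) that the system uses for external connections, such as eth0. The -n option disables hostname lookups, so omit it if you want iftop to display resolved host names instead of IP addresses.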

In the output, look for connections to unexpected hosts. In this example, crypto-pool.fr is the abnormal domain name.
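
Because iftop is interactive, a non-interactive cross-check can be useful. The following is a minimal sketch, assuming the ss utility from iproute2 is installed. remote_peers is a hypothetical helper that counts established TCP connections per remote address so that unexpected peers stand out. It prints IP addresses, not domain names.

```shell
#!/bin/sh
# Sketch only: remote_peers is a hypothetical helper name.

# Read `ss -tn` output on stdin and print "COUNT ADDRESS" for every remote
# address that has at least one established TCP connection, busiest first.
remote_peers() {
    awk '$1 == "ESTAB" {
             peer = $5                  # fifth column: Peer Address:Port
             sub(/:[0-9]+$/, "", peer)  # strip the trailing port number
             n[peer]++
         }
         END { for (p in n) print n[p], p }' | sort -rn
}

# Example:
# ss -tn | remote_peers
```

An address with many connections that you do not recognize can then be examined more closely, for example with a reverse DNS lookup.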

Solution

If the results of all the preceding checks match the characteristics described above, your Linux instance has probably been infected by a virus. Perform the following operations to fix the problem.

  1. Back up the data of the instance. For the steps, see Create a snapshot.
  2. Reinitialize the disk, and then refer to the solution for Trojan viruses on ECS servers to harden the system.

Application scope

  • ECS