Elastic Compute Service: High system load drills

Last Updated: Apr 23, 2025

System load is a metric that measures the workload of a system. It is the average number of processes in the runnable or uninterruptible state over a specific time interval. You can use system load to assess your business workload, raise timely alerts, and take response actions.
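On Linux, the kernel exposes these averages in /proc/loadavg. The following is a minimal sketch for reading them, assuming a Linux instance. The function name show_loadavg and its optional file argument are illustrative and not part of any ECS tooling.

```shell
# Print the 1-, 5-, and 15-minute load averages.
# The optional argument (default: /proc/loadavg) exists only so that the
# function can be tried against a saved sample file.
show_loadavg() {
  # Field layout: 1min 5min 15min runnable/total last_pid
  read -r la1 la5 la15 la_rest < "${1:-/proc/loadavg}" || [ -n "$la1" ] || return 1
  printf '1min=%s 5min=%s 15min=%s\n' "$la1" "$la5" "$la15"
}
```

Calling show_loadavg with no argument prints the current averages of the instance.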

Implementation

A high system load drill uses the ACS-ECS-HighLoad Cloud Assistant plug-in to create a specified number of processes by calling vfork, which raises the system load toward a target value. Each created process sleeps until it times out and then exits. The drill has minimal impact on system operations.
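Because the load average counts processes in the runnable (R) and uninterruptible (D) states, one way to watch a drill take effect is to tally processes by state. The sketch below assumes a Linux /proc filesystem. count_by_state is a hypothetical helper, and this document does not state which state the plug-in's processes appear in.

```shell
# Tally processes by state letter (R, S, D, ...), taken from the "State:"
# line of each /proc/<pid>/status file.
# The optional argument (default: /proc) lets the helper run against a copy.
count_by_state() {
  # Processes may exit while being read, so read errors are discarded.
  cat "${1:-/proc}"/[0-9]*/status 2>/dev/null |
    awk '$1 == "State:" { n[$2]++ } END { for (s in n) print s, n[s] }' |
    sort
}
```

Running count_by_state before and during a drill gives a rough view of how many additional processes contribute to the load.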

Procedure

Prerequisites

Inject a fault

  1. Log on to the ECS instance.

    For more information, see Use Workbench to connect to a Linux instance over SSH.

  2. Run the ACS-ECS-HighLoad Cloud Assistant plug-in as a user with sudo privileges.

    sudo acs-plugin-manager --exec --plugin ACS-ECS-HighLoad --params inject,[num-processes=paramA],[duration=paramB]

    The following fault injection parameters enclosed in brackets [] are optional:

    • num-processes: The number of processes that you want to create. The peak system load is expected to be approximately equal to this value. Default value: 100.

    • duration: The duration in seconds. Default value: 300.

    Output similar to the following indicates that the ACS-ECS-HighLoad Cloud Assistant plug-in is running.

    image

  3. Check whether the fault is injected.

    • Run the top command to check the 1-minute, 5-minute, and 15-minute system load averages.

    • View the system load curve in the CloudMonitor console.
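The injection and verification steps above can be sketched as a few shell helpers. This is an illustration under stated assumptions, not part of the plug-in: the function names highload_params, load_at_least, and wait_for_load are hypothetical, the values 50, 120, and 40 are example inputs, and the sketch assumes that either optional parameter can be supplied on its own.

```shell
# Build the --params value. Both arguments are optional, matching the
# bracketed parameters: $1 = num-processes, $2 = duration in seconds.
highload_params() {
  hp="inject"
  if [ -n "$1" ]; then hp="$hp,num-processes=$1"; fi
  if [ -n "$2" ]; then hp="$hp,duration=$2"; fi
  printf '%s\n' "$hp"
}

# Succeed when the 1-minute load average is at least the target ($1).
# $2 is an optional sample file (default: /proc/loadavg).
load_at_least() {
  awk -v target="$1" '
    BEGIN   { rc = 1 }
    NR == 1 { rc = !($1 + 0 >= target + 0) }
    END     { exit rc }' "${2:-/proc/loadavg}"
}

# Poll every 5 seconds until the target load ($1) is reached or the
# timeout in seconds ($2, default 120) expires. $3 is an optional sample file.
wait_for_load() {
  wl_deadline=$(( $(date +%s) + ${2:-120} ))
  until load_at_least "$1" "${3:-/proc/loadavg}"; do
    if [ "$(date +%s)" -ge "$wl_deadline" ]; then return 1; fi
    sleep 5
  done
}

# Usage on an ECS instance (commented out because it starts a real drill):
#   sudo acs-plugin-manager --exec --plugin ACS-ECS-HighLoad \
#     --params "$(highload_params 50 120)"
#   wait_for_load 40 120 && echo "load target reached"
```

The target passed to wait_for_load is set below num-processes because the 1-minute average is smoothed and rises gradually rather than jumping to the peak.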

Recover from the fault

  • Method 1 (recommended): Wait for automatic recovery after a timeout.

  • Method 2: Run the following command on the ECS instance to recover from the fault:

    sudo acs-plugin-manager --exec --plugin ACS-ECS-HighLoad --params recover

Example

  1. Run the following command to inject a fault:

    sudo acs-plugin-manager --exec --plugin ACS-ECS-HighLoad --params inject

    The following output shows that the fault is injected.

    image

  2. Check the fault injection effect.

    • Run the top command to view the system load averages. The following figure shows that the 1-minute system load average is 98.33, the 5-minute system load average is 58.24, and the 15-minute system load average is 32.66.

      image

    • View the system load averages in the CloudMonitor console.

      image

  3. Use one of the following methods to recover from the fault:

    • Wait for the system load to return to normal after a timeout. The following figure shows that the 1-minute system load average is lower than the 5-minute system load average, which indicates that the ECS instance is recovering from the fault.

      image

    • Run the following command:

      sudo acs-plugin-manager --exec --plugin ACS-ECS-HighLoad --params recover
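The recovery check described in step 3 (the 1-minute average dropping below the 5-minute average) can be scripted. The following is a minimal sketch, assuming a Linux instance. load_is_falling is a hypothetical helper name, and a falling load is a sign that recovery is in progress, not proof that the load is back to its baseline.

```shell
# Succeed when the 1-minute load average is lower than the 5-minute average.
# The optional argument (default: /proc/loadavg) is a sample file for testing.
load_is_falling() {
  awk '
    BEGIN   { rc = 1 }
    NR == 1 { rc = !($1 + 0 < $2 + 0) }
    END     { exit rc }' "${1:-/proc/loadavg}"
}
```

For example, `load_is_falling && echo "recovering"` prints a message once the load has started to decline.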