Network delay affects the response speed of applications and services, and high network delay degrades user experience. Factors that lead to network delay include increased network traffic and unstable lines. This drill scenario verifies the system alert and recovery mechanisms when network delay occurs.
Limits
This drill scenario requires the tc (Traffic Control) utility and the kernel Network Emulator (NetEm) component that tc depends on.
If the system lacks the tc utility, run the sudo yum install -y iproute-tc or sudo apt-get install -y iproute2 command to install it, or set the install-tc parameter to true to install it automatically when you run the ACS-ECS-NetDelay plugin.
If the system kernel lacks the NetEm component, such as in CentOS, run the sudo yum install kernel-modules-extra command to install the component package, and then restart the system.
Warning
Installing the kernel-modules-extra package changes the kernel version. Proceed with caution. We recommend that you perform the drill on Elastic Compute Service (ECS) instances that run other operating systems.
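Before you start the drill, you may want to confirm that both dependencies are present. The following is a minimal sketch, not part of the plugin. The function names have_cmd and check_netdelay_deps are hypothetical, and the check assumes that NetEm is provided by the sch_netem kernel module and that modinfo can find it. A "not detected" result therefore does not prove that NetEm is missing.

```shell
# Returns success if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Prints one line per dependency. Assumption: NetEm ships as the
# sch_netem kernel module and is visible to modinfo.
check_netdelay_deps() {
  if have_cmd tc; then
    echo "tc: found"
  else
    echo "tc: missing"
  fi
  if have_cmd modinfo && modinfo sch_netem >/dev/null 2>&1; then
    echo "netem: found"
  else
    echo "netem: not detected"
  fi
}

# Example: check_netdelay_deps
```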
Implementation
This drill uses the ACS-ECS-NetDelay Cloud Assistant plugin, which uses the tc utility and the NetEm component to add rules to network interface controllers (NICs) to control Linux kernel traffic. The plugin can delay traffic for either all IP addresses or a single IP address, without affecting traffic for the Cloud Assistant CIDR block 100.100.0.0/16.
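The exact rules that the plugin adds are not documented here. As an illustration only, the following sketch prints (but does not run) a common tc pattern for delaying traffic to a single destination: a prio qdisc with an extra band, a netem qdisc attached to that band, and a u32 filter that steers the target IP address into it. The function name print_netem_rules, the handle numbers, and the example IP address are assumptions, and the sketch does not cover the all-IP-addresses case or the Cloud Assistant CIDR exclusion.

```shell
# Prints tc commands that would delay traffic to one destination IP.
# Args: device, delay in ms, jitter in ms, target IP address.
# Illustrative only; this is not the plugin's actual rule set.
print_netem_rules() {
  dev=$1
  delay=$2
  jitter=$3
  ip=$4
  # With 4 bands, the default priomap uses only bands 1-3, so band 4
  # receives only the traffic that the filter below sends to it.
  echo "tc qdisc add dev $dev root handle 1: prio bands 4"
  echo "tc qdisc add dev $dev parent 1:4 handle 40: netem delay ${delay}ms ${jitter}ms"
  echo "tc filter add dev $dev protocol ip parent 1:0 prio 4 u32 match ip dst $ip/32 flowid 1:4"
}

# Example: print_netem_rules eth0 100 10 203.0.113.10
```

Because only the matched destination is routed through the netem band, other traffic on the NIC keeps its normal latency in this sketch.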
Procedure
Prerequisites
Cloud Assistant Agent is installed on the ECS instance for which you want to perform a drill.
The status of Cloud Assistant is Normal on the ECS instance. For more information, see View the status of Cloud Assistant and handle anomalies.
Inject a fault
Connect to the ECS instance as a user with sudo privileges.
For more information, see Use Workbench to connect to a Linux instance over SSH.
Run the ACS-ECS-NetDelay Cloud Assistant plugin.
sudo acs-plugin-manager --exec --plugin ACS-ECS-NetDelay --params inject,dev=eth0,[time=paramA],[jitter=paramB],[target-ip=paramC],[replace=paramD],[duration=paramE],[install-tc=paramF]
The optional fault injection parameters are enclosed in brackets ([]).
dev (required): the NIC into which you want to inject a fault, such as eth0. You can run the ifconfig command to view the available NICs in the system.
time (optional): delay time. Unit: milliseconds. Default value: 100.
jitter (optional): jitter range. Unit: milliseconds. Default value: 10.
target-ip (optional): the target IP address that you want to affect. This parameter is left empty by default, which indicates that all IP addresses are affected. If you specify a target IP address, only traffic for that IP address is affected.
replace (optional): If TC rules are already configured for the NIC, fault injection may fail due to conflicts. To overwrite the existing rules, set this parameter to true.
duration (optional): fault duration. Unit: seconds. Default value: 300.
install-tc (optional): If the system lacks the tc utility, set this parameter to true for automatic installation. Default value: false.
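The parameter list above maps onto a single comma-separated --params value. The following sketch shows one way to assemble that value so that optional parameters are included only when you pass them. The helper name build_inject_params is hypothetical, and it does not validate parameter names or values.

```shell
# Builds the --params value from a required NIC name followed by any
# number of optional key=value pairs (time, jitter, target-ip, replace,
# duration, install-tc). No validation is performed.
build_inject_params() {
  params="inject,dev=$1"
  shift
  for kv in "$@"; do
    params="$params,$kv"
  done
  echo "$params"
}

# Example (not executed here):
#   sudo acs-plugin-manager --exec --plugin ACS-ECS-NetDelay \
#     --params "$(build_inject_params eth0 time=200 duration=60)"
```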
The following command output indicates that the ACS-ECS-NetDelay plugin is run as expected.
Ping the target network to check whether a fault is injected.
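To compare latency before and after injection, it can help to extract the average round-trip time from the ping summary. The following sketch assumes the Linux iputils summary format, in which the last line starts with "rtt min/avg/max/mdev". The function name avg_rtt is hypothetical, and other ping implementations may print a different summary line.

```shell
# Reads ping output on stdin and prints the average RTT in milliseconds.
# Assumes a summary line such as:
#   rtt min/avg/max/mdev = 1001.2/1010.5/1029.9/9.1 ms
avg_rtt() {
  awk -F'/' '/^rtt / { print $5 }'
}

# Example (replace <target> with the host you want to test):
#   ping -c 5 <target> | avg_rtt
```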
Recover from the fault
Method 1: Wait for automatic recovery after the fault times out.
Method 2: Run the following fault recovery command:
sudo acs-plugin-manager --exec --plugin ACS-ECS-NetDelay --params recover
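After either recovery method, you can inspect the queueing disciplines on the NIC to confirm that no delay rule remains. The following sketch assumes that the injected rule appears as a netem entry in the tc qdisc show output, which is likely given that the plugin relies on NetEm but is not confirmed by this document. The function name has_netem is hypothetical.

```shell
# Reads `tc qdisc show` output on stdin.
# Succeeds if a netem qdisc is listed, fails otherwise.
has_netem() {
  grep -q 'netem'
}

# Example (not executed here):
#   if tc qdisc show dev eth0 | has_netem; then
#     echo "netem rule still present"
#   else
#     echo "no netem rule found"
#   fi
```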
Drill case
Simulate business access to the target network.
ping www.taobao.com
In this example, www.taobao.com is used as the target network. The following command output is returned.

Inject a fault.
sudo acs-plugin-manager --exec --plugin ACS-ECS-NetDelay --params inject,dev=eth0,time=1000,jitter=30,duration=120
In this example, a delay of 1,000 milliseconds and jitter of 30 milliseconds are injected into eth0 for 120 seconds. Because target-ip is not specified, all IP addresses are affected.

Check the injection result.
Check the network delay status. After you inject the fault, the delay reported by the ping command increases by about 1,000 milliseconds. With a jitter of 30 milliseconds, the added delay ranges from 970 to 1,030 milliseconds.
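The expected range follows from the injection parameters: the added delay should fall between time - jitter and time + jitter. The following sketch checks an observed value against that range. The function name added_delay_in_range is hypothetical, and the check ignores ordinary network variance, so a real measurement may occasionally fall slightly outside the range.

```shell
# Succeeds if (observed - baseline) lies within delay +/- jitter.
# Args: baseline RTT, observed RTT, injected delay, jitter (all in ms).
added_delay_in_range() {
  awk -v b="$1" -v o="$2" -v d="$3" -v j="$4" \
    'BEGIN { a = o - b; exit !(a >= d - j && a <= d + j) }'
}

# Example: a 25 ms baseline and a 1010 ms observed RTT give an added
# delay of 985 ms, which is within 1000 +/- 30.
#   added_delay_in_range 25 1010 1000 30 && echo "within expected range"
```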

Wait for fault recovery.
After the injection times out, the delay decreases back to the original level, and the network recovers.
