On instances that use the following image and kernel versions, the kernel TCP congestion control algorithm is set to BBR by default in the console. When CPU usage and the packet forwarding rate (PPS) are both high, BBR may degrade network performance. For example, the performance of a Redis database may be reduced.
aliyun_2_1903_64_20G_alibase_20190619.vhd and all earlier image versions.
kernel-4.19.48-14.al7 and all earlier kernel versions.
Cause of the problem
The Aliyun Linux 2 kernel currently supports three TCP congestion control algorithms: reno, bbr, and cubic. Their performance differs across network scenarios. The BBR algorithm estimates the bandwidth and round-trip time (RTT) of the current connection and adjusts the congestion window accordingly. BBR relies on TCP pacing, which is implemented in one of two ways:
- If the network interface controller (NIC) uses the tc-fq qdisc, the flow-based pacing provided by tc-fq is reused directly.
- If the NIC does not use the tc-fq qdisc, TCP falls back to its internal pacing implementation.
TCP internal pacing depends on the Linux high-resolution timer (hrtimer), which consumes additional CPU resources. When CPU usage and network PPS are both high, the impact of BBR on network performance is more pronounced; when the CPU is idle and PPS is low, the impact is small.
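To see which pacing path applies on a given instance, you can check the active congestion control algorithm and the qdisc attached to the NIC. A minimal sketch, assuming the NIC is named eth0 (a placeholder; substitute your own device name):

```shell
# Show the congestion control algorithm currently in use
sysctl net.ipv4.tcp_congestion_control

# List the qdisc attached to the NIC; if the output does not show "fq",
# BBR falls back to its hrtimer-based internal pacing
tc qdisc show dev eth0
```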
Alibaba Cloud reminds you that:
- Before you perform any risky operations on an instance or its data, pay attention to the disaster recovery and fault tolerance capabilities of the instance to ensure data security.
- If you modify the configuration and data of an instance (including but not limited to ECS and RDS), we recommend that you create snapshots or enable RDS log backup.
- If you have granted permissions on the Alibaba Cloud platform or submitted security information such as the logon account and password, we recommend that you modify the information as soon as possible.
See the following TCP congestion control algorithm recommendations to select the solution that meets your service needs.
- If the applications in the ECS instance provide services only for the internal network, we recommend that you run the following commands to change the TCP congestion control algorithm to cubic, because the internal network offers high bandwidth and low latency.
sysctl -w net.ipv4.tcp_congestion_control=cubic
sh -c "echo 'net.ipv4.tcp_congestion_control=cubic' >> /etc/sysctl.d/50-aliyun.conf"
- If the applications in the ECS instance provide services for the Internet, we recommend that you continue to use the BBR algorithm, but change the scheduling policy of the NIC to tc-fq. For more information, see Fair Queue traffic policing. The command is as follows:
tc qdisc add dev [$Dev] root fq
Note: [$Dev] indicates the name of the NIC to be adjusted.
- We recommend that you do not use scheduling policies other than tc-fq together with the BBR algorithm, because doing so consumes additional CPU resources.
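After applying either change, you can verify that it took effect. A minimal sketch, again assuming the NIC is named eth0 (substitute your own device name):

```shell
# Confirm the congestion control algorithm now in effect
sysctl net.ipv4.tcp_congestion_control

# Confirm that the fq qdisc is attached to the NIC (solution 2)
tc qdisc show dev eth0

# Inspect established TCP connections; the output of ss -ti includes
# the congestion control algorithm used by each connection
ss -ti
```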
Upgrade the kernel of the ECS instance to kernel-4.19.57-15.al7 or a later version.
The Aliyun Linux 2 system allows different connections to use different congestion control algorithms, which can be controlled per network namespace (netns). If an ECS instance runs multiple containers that belong to different network namespaces, where some containers provide only external services and others provide only internal services, you can configure a different congestion control algorithm for each group of containers.
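As an illustration of per-namespace control, the following sketch sets cubic inside one network namespace while other namespaces keep their own setting. The namespace name internal-ns is a hypothetical example; container runtimes typically create and manage these namespaces for you:

```shell
# Create a network namespace for containers that serve internal traffic only
ip netns add internal-ns

# Set cubic inside that namespace; other namespaces are unaffected
ip netns exec internal-ns sysctl -w net.ipv4.tcp_congestion_control=cubic

# Confirm the per-namespace setting
ip netns exec internal-ns sysctl net.ipv4.tcp_congestion_control
```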