CPU Burst is a service level objective (SLO)-aware resource scheduling feature provided by Container Service for Kubernetes (ACK). You can use CPU Burst to improve the performance of latency-sensitive applications. When a container reaches its CPU limit, the kernel throttles the container, which degrades the performance of the application. The ack-koordinator component automatically detects CPU throttling events and adjusts the CPU limit to a proper value, which improves the performance of latency-sensitive applications. This topic introduces CPU Burst and describes how to use CPU Burst and how to verify the performance improvement.
Prerequisites
An ACK Pro cluster is created. CPU Burst is supported only by ACK Pro clusters. For more information, see Create an ACK Pro cluster.
ack-koordinator is installed. For more information, see ack-koordinator.
Limits
The following table describes the versions of the system components that are required for enabling CPU Burst.
Component | Required version |
Kubernetes | ≥ 1.18 |
ack-koordinator | ≥ 0.8.0 |
Use scenarios
CPU Burst is suitable in the following scenarios:
CPU throttling is triggered for an application even though the CPU usage of the application is less than the CPU limit of the application. As a result, the performance of the application is degraded. You can enable CPU Burst to resolve this issue and improve the performance of the application.
The CPU usage during an application startup is higher than the CPU usage after the application is started. You can enable CPU Burst to meet the CPU requirements during an application startup. This way, you do not need to specify an excessively high CPU request for the application, which reduces resource waste.
How CPU Burst works
Kubernetes allows you to specify CPU limits, which can be reused based on time-sharing. If you specify a CPU limit for a container, the OS limits the amount of CPU resources that can be used by the container within a specific time period. For example, if you set the CPU limit of a container to 2, the OS kernel limits the CPU time slices that the container can use to 200 milliseconds within each 100-millisecond period.
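The arithmetic behind the example above can be sketched as a small shell function. This is a minimal sketch that assumes the default CFS period of 100 milliseconds (100000 microseconds); the function name cfs_quota_us is hypothetical and is not part of ACK or ack-koordinator.

```shell
# Hypothetical helper: convert a CPU limit in millicores into the CFS quota
# (in microseconds) that the container may consume in one period.
# Assumes the default CFS period of 100000 microseconds (100 ms).
cfs_quota_us() {
  local limit_millicores=$1
  local period_us=${2:-100000}
  echo $(( limit_millicores * period_us / 1000 ))
}

cfs_quota_us 2000   # CPU limit of 2 -> 200000 microseconds (200 ms) per period
```

The limit is passed in millicores so that fractional limits such as 500m can be handled with integer arithmetic.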
CPU utilization is a key metric that is used to evaluate the performance of a container. In most cases, the CPU limit is specified based on CPU utilization. CPU utilization on a per-millisecond basis shows more spikes than on a per-second basis. If the CPU utilization of a container reaches the limit within a 100-millisecond period, CPU throttling is enforced by the OS kernel and threads in the container are suspended for the rest of the time period, as shown in the following figure.
The following figure shows the thread allocation of a web application container that runs on a node with four vCores. The CPU limit of the container is set to 2. The overall CPU utilization within the last second is low. However, Thread 2 cannot be resumed until the third 100-millisecond period starts because CPU throttling is enforced somewhere in the second 100-millisecond period. This increases the response time (RT) and causes long-tail latency problems in containers.
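To see how often a container is throttled, you can read the throttling counters that the kernel exposes for its cgroup. The following is a minimal sketch that assumes a cgroup v1 cpu.stat file containing the nr_periods and nr_throttled counters; the function name throttled_ratio is hypothetical, and defining the ratio as throttled periods divided by total periods is an assumption made for illustration.

```shell
# Hypothetical helper: print the share of CFS periods in which the cgroup
# was throttled, in percent with one decimal place.
# $1 is the path of a cgroup v1 cpu.stat file.
throttled_ratio() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) {
        printf "%.1f\n", 100 * throttled / periods
      } else {
        print "0.0"
      }
    }
  ' "$1"
}

# Usage (the cgroup path is a placeholder and depends on the container
# runtime and cgroup driver of the node):
# throttled_ratio /sys/fs/cgroup/cpu/<pod-cgroup-path>/cpu.stat
```

A ratio that stays well above zero while the average CPU utilization is low suggests the millisecond-level spikes described above.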
Alibaba Cloud Linux 2 provides the CPU Burst feature in kernel version 4.19.91-22.al7. CPU Burst allows a container to accumulate CPU time slices when the container is idle. The container can use the accumulated CPU time slices to burst above the CPU limit when CPU utilization spikes. This improves performance and reduces the RT of the container.
For kernel versions that do not support CPU Burst, ack-koordinator detects CPU throttling events and dynamically adjusts the CPU limit to achieve the same effect as CPU Burst.
ack-koordinator achieves this by modifying the value of the CFS quota in the cgroup parameters instead of modifying the value of the CPU limit in the pod specifications.
The preceding Completely Fair Scheduler (CFS) quota adjustment policy can be used to handle CPU usage spikes. For example, when traffic spikes occur, ack-koordinator can eliminate CPU bottlenecks within a few seconds, while ensuring a proper number of workloads on the node.
The CPU Burst feature supported by the kernel of Alibaba Cloud Linux handles CPU usage spikes at a faster rate. We recommend that you enable the CPU Burst feature provided by the kernel of Alibaba Cloud Linux for latency-sensitive applications. For more information about CPU Burst, see the Alibaba Cloud presentation at KubeCon 2021: CPU Burst: Getting Rid of Unnecessary Throttling, Achieving High CPU Utilization and Application Performance at the Same Time.
How to use CPU Burst
Use an annotation to enable CPU Burst
Important
To enable CPU Burst for a pod, configure the annotations parameter in the metadata section of the pod configuration.
To enable CPU Burst for a Deployment, configure the annotations parameter in the template.metadata section of the Deployment configuration.
annotations:
  # Set the value to auto to enable CPU Burst for the pod.
  koordinator.sh/cpuBurst: '{"policy": "auto"}'
  # Set the value to none to disable CPU Burst for the pod.
  #koordinator.sh/cpuBurst: '{"policy": "none"}'
Use a ConfigMap to enable CPU Burst for all pods in a cluster
Modify the ack-slo-config ConfigMap based on the following content to enable CPU Burst for all pods in a cluster:
apiVersion: v1
data:
  cpu-burst-config: '{"clusterStrategy": {"policy": "auto"}}'
  #cpu-burst-config: '{"clusterStrategy": {"policy": "cpuBurstOnly"}}'
  #cpu-burst-config: '{"clusterStrategy": {"policy": "none"}}'
kind: ConfigMap
metadata:
  name: ack-slo-config
  namespace: kube-system
To avoid modifying other configurations in the ConfigMap, we recommend that you update the ConfigMap by using a patch. To do this, run the following command:
kubectl patch cm -n kube-system ack-slo-config --patch "$(cat configmap.yaml)"
Enable CPU Burst for pods in specified namespaces
apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-slo-pod-config
  namespace: kube-system
data:
  # Enable or disable CPU Burst for pods in specified namespaces.
  cpu-burst: |
    {
      "enabledNamespaces": ["white-ns"],
      "disabledNamespaces": ["black-ns"]
    }
Advanced configurations
You can specify advanced configurations in the ConfigMap or the annotations parameter in the metadata section of the pod configuration.
# Example of the ack-slo-config ConfigMap.
data:
  cpu-burst-config: |
    {
      "clusterStrategy": {
        "policy": "auto",
        "cpuBurstPercent": 1000,
        "cfsQuotaBurstPercent": 300,
        "sharePoolThresholdPercent": 50,
        "cfsQuotaBurstPeriodSeconds": -1
      }
    }

# Example of pod annotations.
koordinator.sh/cpuBurst: '{"policy": "auto", "cpuBurstPercent": 1000, "cfsQuotaBurstPercent": 300, "cfsQuotaBurstPeriodSeconds": -1}'
The following table describes the advanced parameters of CPU Burst.
Parameter | Type | Description |
policy | string | Valid values: none: disables CPU Burst. If you set the value to none, the related fields are reset to their original values. This is the default value. cpuBurstOnly: enables only the CPU Burst feature provided by the kernel of Alibaba Cloud Linux 2. cfsQuotaBurstOnly: enables only the automatic adjustment of CFS quotas of general kernel versions. auto: enables CPU Burst and all the related features, including CPU Burst for the kernel of Alibaba Cloud Linux and automatic adjustment of CFS quotas of general kernel versions. |
cpuBurstPercent | int | Default value: 1000. Unit: %. This field is used to configure the CPU Burst feature for the kernel of Alibaba Cloud Linux 2. This field specifies the percentage to which the CPU limit can be increased by CPU Burst. For example, if the CPU limit is set to 1, CPU Burst can increase the limit to 10 by default. For more information, see Enable the CPU burst feature for cgroup v1. |
cfsQuotaBurstPercent | int | Default value: 300. Unit: %. This field specifies the maximum percentage to which the value of cfs_quota in the cgroup parameters can be increased. By default, the value of cfs_quota can be increased to at most three times its original value. |
cfsQuotaBurstPeriodSeconds | int | Default value: -1, which indicates that the time period in which the container can run with an increased CFS quota is unlimited. Unit: seconds. This field specifies the time period in which the container can run with an increased CFS quota, which cannot exceed the upper limit specified by cfsQuotaBurstPercent. |
sharePoolThresholdPercent | int | Default value: 50. Unit: %. This field specifies the CPU utilization threshold of the node. If the CPU utilization of the node exceeds the threshold, the value of cfs_quota in the cgroup parameters is reset to the original value. |

Important
After you set policy to cfsQuotaBurstOnly or auto, the CFS quota assigned by the node OS is automatically adjusted based on whether CPU throttling is triggered.
When you perform stress tests on a container, we recommend that you record the CPU utilization of the container throughout the test period or set policy to cpuBurstOnly or none. This ensures higher resource elasticity for your production environment.
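The way cfsQuotaBurstPercent and sharePoolThresholdPercent bound the CFS quota can be sketched with two shell functions. This is a minimal sketch based only on the parameter descriptions above, not the actual ack-koordinator implementation; the function names max_cfs_quota_us and quota_ceiling_us are hypothetical, and the base quota is assumed to be the value derived from the CPU limit.

```shell
# Hypothetical helper: upper bound of cfs_quota (in microseconds) for a given
# base quota and cfsQuotaBurstPercent (default 300).
max_cfs_quota_us() {
  local base_quota_us=$1
  local burst_percent=${2:-300}
  echo $(( base_quota_us * burst_percent / 100 ))
}

# Hypothetical helper: quota ceiling for the current node CPU utilization.
# If the utilization exceeds sharePoolThresholdPercent (default 50), the
# ceiling falls back to the base quota; otherwise the burst ceiling applies.
quota_ceiling_us() {
  local base_quota_us=$1
  local node_util_percent=$2
  local burst_percent=${3:-300}
  local threshold_percent=${4:-50}
  if [ "$node_util_percent" -gt "$threshold_percent" ]; then
    echo "$base_quota_us"
  else
    max_cfs_quota_us "$base_quota_us" "$burst_percent"
  fi
}

quota_ceiling_us 400000 30   # CPU limit of 4, node at 30% -> 1200000
quota_ceiling_us 400000 70   # CPU limit of 4, node at 70% -> 400000
```

With the default values, a container whose CPU limit is 4 can have its quota raised to at most the equivalent of 12 CPUs while the node utilization stays at or below 50%.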
Verify the effect of CPU Burst
Use the following YAML template to create an apache-demo.yaml file:
To enable CPU Burst for a pod, specify an annotation in the annotations parameter of the metadata section of the pod configuration.
apiVersion: v1
kind: Pod
metadata:
  name: apache-demo
  annotations:
    koordinator.sh/cpuBurst: '{"policy": "auto"}' # The annotation is used to enable or disable CPU Burst.
spec:
  containers:
  - command:
    - httpd
    - -D
    - FOREGROUND
    image: registry.cn-zhangjiakou.aliyuncs.com/acs/apache-2-4-51-for-slo-test:v0.1
    imagePullPolicy: Always
    name: apache
    resources:
      limits:
        cpu: "4"
        memory: 10Gi
      requests:
        cpu: "4"
        memory: 10Gi
  nodeName: $nodeName # Replace nodeName with the actual node name.
  hostNetwork: False
  restartPolicy: Never
  schedulerName: default-scheduler
Run the following command to create an application by using Apache HTTP Server:
kubectl apply -f apache-demo.yaml
Use the wrk2 tool to perform stress tests.
Replace the IP address in the command with the pod IP address of the Apache application.
You can modify the -R field to change the number of queries per unit time from the sender.
# Download, decompress, and then install the wrk2 package. For more information, visit https://github.com/giltene/wrk2.
# Gzip compression is enabled in the Apache image to simulate the request processing logic of the server.
# Run the following command to send requests. Replace the IP address in the command with the IP address of the application.
./wrk -H "Accept-Encoding: deflate, gzip" -t 2 -c 12 -d 120 --latency --timeout 2s -R 24 ht
Analyze the result
The following tables show metrics before and after CPU Burst is enabled for Alibaba Cloud Linux 2 and CentOS 7.
The Disabled column shows the metrics when the CPU Burst policy is set to none.
The Enabled column shows the metrics when the CPU Burst policy is set to auto.
Alibaba Cloud Linux 2 | Disabled | Enabled |
apache RT-p99 | 107.37 ms | 67.18 ms (-37.4%) |
CPU Throttled Ratio | 33.3% | 0% |
Average pod CPU utilization | 31.8% | 32.6% |
CentOS 7 | Disabled | Enabled |
apache RT-p99 | 111.69 ms | 71.30 ms (-36.2%) |
CPU Throttled Ratio | 33% | 0% |
Average pod CPU utilization | 32.5% | 33.8% |
The preceding metrics indicate the following information:
After CPU Burst is enabled, the P99 latency is greatly reduced.
After CPU Burst is enabled, CPU throttling is stopped and the average pod CPU utilization remains approximately at the same value.
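The relative changes shown in the RT-p99 rows can be reproduced from the measured values with a short calculation. This is a minimal sketch; the function name pct_change is hypothetical.

```shell
# Hypothetical helper: relative change from a baseline to a new value,
# in percent with one decimal place.
pct_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

pct_change 107.37 67.18   # Alibaba Cloud Linux 2, apache RT-p99 -> -37.4
pct_change 111.69 71.30   # CentOS 7, apache RT-p99 -> -36.2
```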
FAQ
Is the CPU Burst feature that is enabled based on the earlier version of the ack-slo-manager protocol supported after I upgrade ack-slo-manager to ack-koordinator?
The earlier version of the pod protocol requires you to add the alibabacloud.com/cpuBurst annotation. ack-koordinator is fully compatible with the earlier protocol version. You can seamlessly upgrade from ack-slo-manager to ack-koordinator.
ack-koordinator is compatible with the earlier protocol version until July 30, 2023. We recommend that you upgrade the resource parameters of the earlier protocol version to the latest version.
The following table describes the compatibilities between ack-koordinator and different types of protocols.
ack-koordinator version | alibabacloud.com protocol | koordinator.sh protocol |
≥ 0.2.0 | Supported | Not supported |
≥ 0.8.0 | Supported | Supported |