CPU Burst is a service level objective (SLO)-aware resource scheduling feature provided by Container Service for Kubernetes (ACK). You can use CPU Burst to improve the performance of latency-sensitive applications. When a container reaches its CPU limit, the kernel throttles the container, which degrades the performance of the application. The ack-slo-manager component automatically detects CPU throttling events and adjusts the CPU limit to a proper value, which improves the performance of latency-sensitive applications. This topic introduces CPU Burst, describes how to use it, and shows how to verify the performance improvement.

Prerequisites

  • An ACK Pro cluster is created. CPU Burst is supported only in ACK Pro clusters. For more information, see Create an ACK Pro cluster.
  • ack-slo-manager is installed in the cluster. For more information, see Usage notes.

Limits

The following table describes the versions of the system components that are required to enable CPU Burst.

Component          Required version
Kubernetes         1.18 and later
ack-slo-manager    0.2.0 and later
Helm               3.0 and later
Kernel and OS      Alibaba Cloud Linux 2, CentOS 7.6, and CentOS 7.7

How CPU Burst works

Kubernetes allows you to specify CPU limits. CPU is a time-shared resource, so a CPU limit is enforced as a quota of CPU time: if you specify a CPU limit for a container, the OS limits the amount of CPU time that the container can use within a specific time period. For example, if you set the CPU limit of a container to 2, the OS kernel limits the CPU time that the container can use to 200 milliseconds within each 100-millisecond period.
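
The arithmetic in this example can be sketched in Python. This is a minimal illustration, not part of ACK or ack-slo-manager: the function name cfs_quota_us is hypothetical, and the sketch assumes that the period is expressed in microseconds, as in the cgroup v1 files cpu.cfs_period_us and cpu.cfs_quota_us.

```python
def cfs_quota_us(cpu_limit: float, period_us: int = 100_000) -> int:
    """Return the CPU time, in microseconds, allowed per scheduling period.

    Assumption: the period is 100 ms (100,000 microseconds), as described
    in the text. The function name is hypothetical.
    """
    if cpu_limit <= 0:
        raise ValueError("cpu_limit must be positive")
    return int(round(cpu_limit * period_us))


if __name__ == "__main__":
    # A CPU limit of 2 corresponds to 200 ms of CPU time per 100 ms period.
    quota = cfs_quota_us(2)
    print(f"{quota / 1000:.0f} ms of CPU time per 100 ms period")
```

The quota can exceed the period because the container may run on more than one vCPU at the same time.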

CPU utilization is a key metric that is used to evaluate the performance of a container. In most cases, the CPU limit is specified based on CPU utilization. CPU utilization on a per-millisecond basis shows more spikes than on a per-second basis. If the CPU utilization of a container reaches the limit within a 100-millisecond period, CPU throttling is enforced by the OS kernel and threads in the container are suspended for the rest of the time period, as shown in the following figure.

Figure: How CPU Burst works

The following figure shows the thread allocation of a web application container that runs on a node with four vCPUs. The CPU limit of the container is set to 2. The overall CPU utilization within the last second is low. However, Thread 2 cannot be resumed until the third 100-millisecond period starts because CPU throttling is enforced partway through the second 100-millisecond period. This increases the response time (RT) and causes long-tail latency in the container.

Figure: ack-slo-manager example

Alibaba Cloud Linux 2 provides the CPU Burst feature in kernel version 4.19.91-22.al7. CPU Burst allows a container to accumulate CPU time slices when the container is idle. The container can use the accumulated CPU time slices to burst above the CPU limit when CPU utilization spikes. This improves performance and reduces the RT of the container.

Figure: CPU Burst
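
The effect of accumulating unused CPU time can be illustrated with a small simulation. This is a simplified model written for this topic, not the kernel algorithm: the function simulate, the parameter burst_ms (a cap on the accumulated CPU time), and the demand figures are all hypothetical.

```python
def simulate(demand_ms, quota_ms, burst_ms=0.0):
    """Return the number of periods in which the container is throttled.

    demand_ms: CPU time the container wants in each 100 ms period.
    quota_ms:  CPU time allowed per period (200 for a CPU limit of 2).
    burst_ms:  cap on unused CPU time that can be carried over
               (0 models a kernel without CPU Burst). Hypothetical model.
    """
    saved = 0.0     # unused CPU time accumulated while the container is idle
    backlog = 0.0   # work delayed to a later period because of throttling
    throttled = 0
    for demand in demand_ms:
        pending = demand + backlog
        available = quota_ms + saved
        used = min(pending, available)
        backlog = pending - used
        saved = min(burst_ms, available - used)
        if backlog > 0:
            throttled += 1
    return throttled


if __name__ == "__main__":
    # Mostly idle workload with two short spikes (illustrative values only).
    demand = [50, 50, 350, 50, 50, 50, 300, 50, 50, 50]
    print("throttled periods without burst:", simulate(demand, 200))
    print("throttled periods with burst:   ", simulate(demand, 200, burst_ms=200))
```

In this model the average demand stays well below the quota, yet the two spikes are throttled unless previously unused CPU time can be spent on them.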

For kernel versions that do not support CPU Burst, ack-slo-manager detects CPU throttling events and dynamically adjusts the CPU limit to achieve the same effect as CPU Burst.

Note: ack-slo-manager achieves this by modifying the value of the CFS quota in the cgroup parameters. It does not modify the value of the CPU limit in the pod specifications.

The preceding Completely Fair Scheduler (CFS) quota adjustment policy can be used to handle CPU usage spikes. For example, when traffic spikes occur, ack-slo-manager can eliminate CPU bottlenecks within a few seconds, while ensuring a proper number of workloads on the node.
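
Throttling can be observed from the counters that the kernel exposes in the cpu.stat file of a CPU cgroup (nr_periods, nr_throttled, and throttled_time). The following sketch parses two samples of that file and computes the share of periods that were throttled between them. It is an assumption that ack-slo-manager relies on these counters; the source only states that it detects CPU throttling events. The function names and sample values are hypothetical.

```python
def parse_cpu_stat(text: str) -> dict:
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(before: dict, after: dict) -> float:
    """Return the fraction of scheduling periods throttled between two samples."""
    periods = after["nr_periods"] - before["nr_periods"]
    throttled = after["nr_throttled"] - before["nr_throttled"]
    return throttled / periods if periods > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical samples taken some time apart from the same container.
    first = parse_cpu_stat("nr_periods 100\nnr_throttled 10\nthrottled_time 500\n")
    second = parse_cpu_stat("nr_periods 400\nnr_throttled 110\nthrottled_time 9000\n")
    print(f"throttled ratio: {throttled_ratio(first, second):.1%}")
```

Working with the difference between two samples matters because the counters are cumulative since the cgroup was created.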

The CPU Burst feature supported by the kernel of Alibaba Cloud Linux handles CPU usage spikes more quickly. We recommend that you enable the CPU Burst feature provided by the kernel of Alibaba Cloud Linux for latency-sensitive applications. For more information about CPU Burst, see the Alibaba Cloud presentation at KubeCon 2021: CPU Burst: Getting Rid of Unnecessary Throttling, Achieving High CPU Utilization and Application Performance at the Same Time.

How to use CPU Burst

  • Use an annotation to enable CPU Burst

    Add the following annotation to the pod configuration to enable CPU Burst:

    annotations:
      # Set the value to auto to enable CPU Burst for the pod. 
      alibabacloud.com/cpuBurst: '{"policy": "auto"}'
      # To disable CPU Burst for the pod, set the value to none. 
      #alibabacloud.com/cpuBurst: '{"policy": "none"}'
  • Use a ConfigMap to enable CPU Burst for all pods in a cluster

    Modify the ack-slo-manager-config ConfigMap based on the following content to enable CPU Burst for all pods in a cluster:

    apiVersion: v1
    data:
      cpu-burst-config: '{"clusterStrategy": {"policy": "auto"}}'
      #cpu-burst-config: '{"clusterStrategy": {"policy": "cpuBurstOnly"}}'
      #cpu-burst-config: '{"clusterStrategy": {"policy": "none"}}'
    kind: ConfigMap
    metadata:
      name: ack-slo-manager-config
      namespace: kube-system
    To avoid modifying other configurations in the ConfigMap, we recommend that you update the ConfigMap by running the kubectl patch command:
    kubectl patch cm -n kube-system ack-slo-manager-config --patch "$(cat configmap.yaml)"
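  • Enable CPU Burst for pods in specified namespaces

    Modify the ack-slo-pod-config ConfigMap based on the following content to enable or disable CPU Burst for pods in specified namespaces:
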
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-slo-pod-config
      namespace: kube-system
    data:
      # Enable or disable CPU Burst for pods in specified namespaces. 
      cpu-burst: |
        {
          "enabledNamespaces": ["white-ns"],
          "disabledNamespaces": ["black-ns"]
        }
  • Advanced configurations
    The following code block shows the pod annotations and ConfigMap fields that you can use for advanced configurations:
    # Example of the ack-slo-manager-config ConfigMap. 
    data:
      cpu-burst-config: |
        {
          "clusterStrategy": {
            "policy": "auto",
            "cpuBurstPercent": 1000,
            "cfsQuotaBurstPercent": 300,
            "sharePoolThresholdPercent": 50,
            "cfsQuotaBurstPeriodSeconds": -1
          }
        }
    
    # Example of pod annotations. 
    annotations:
      alibabacloud.com/cpuBurst: '{"policy": "auto", "cpuBurstPercent": 1000, "cfsQuotaBurstPercent": 300, "cfsQuotaBurstPeriodSeconds": -1}'

    The following table describes the ConfigMap fields that you can use for advanced configurations of CPU Burst. If you need assistance, see Submit a ticket.

    Field    Data type    Description

    policy    string

    • none: disables CPU Burst. If you set the value to none, the related fields are reset to their original values. This is the default value.
    • cpuBurstOnly: enables only the CPU Burst feature provided by the kernel of Alibaba Cloud Linux 2.
    • cfsQuotaBurstOnly: enables only the automatic adjustment of CFS quotas, which works with general kernel versions.
    • auto: enables CPU Burst and all the related features, including CPU Burst for the kernel of Alibaba Cloud Linux and automatic adjustment of CFS quotas of general kernel versions.

    cpuBurstPercent    int

    Default value: 1000. Unit: %.

    This field is used to configure the CPU Burst feature for the kernel of Alibaba Cloud Linux 2. This field specifies the percentage to which the CPU limit can be increased by CPU Burst. If the CPU limit is set to 1, CPU Burst can increase the limit to 10 by default. For more information, see Enable the CPU burst feature for cgroup v1.

    cfsQuotaBurstPercent    int

    Default value: 300. Unit: %.

    This field specifies the maximum percentage to which the value of cfs_quota in the cgroup parameters can be increased. By default, the value of cfs_quota can be increased to at most three times its original value.

    cfsQuotaBurstPeriodSeconds    int

    Default value: -1. Unit: seconds. The default value indicates that the time period in which the container can run with an increased CFS quota is unlimited.

    This field specifies the time period in which the container can run with an increased CFS quota. The increased quota cannot exceed the upper limit specified by cfsQuotaBurstPercent.

    sharePoolThresholdPercent    int

    Default value: 50. Unit: %.

    This field specifies the CPU utilization threshold of the node. If the CPU utilization of the node exceeds the threshold, the value of cfs_quota in the cgroup parameters is reset to the original value.

    Notice
    • After you set policy to cfsQuotaBurstOnly or auto, the CFS quota assigned by the node OS is automatically adjusted based on whether CPU throttling is triggered.
    • When you perform stress tests on a container, we recommend that you record the CPU utilization of the container throughout the test period or set policy to cpuBurstOnly or none. This ensures higher resource elasticity for your production environment.
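
The fields described above can be combined into an annotation value, and the two percentage fields can be turned into concrete ceilings. The following Python sketch shows one way to do this under stated assumptions: all function names are hypothetical helpers written for this topic, the 100-millisecond CFS period is taken from the earlier section, and the ceilings are computed as the original value multiplied by the percentage, which matches the defaults described in the table (a limit of 1 can burst to 10, and cfs_quota can grow to three times its original value).

```python
import json

# Policy values listed in the table above.
ALLOWED_POLICIES = {"none", "cpuBurstOnly", "cfsQuotaBurstOnly", "auto"}


def cpu_burst_annotation(policy="auto", cpu_burst_percent=1000,
                         cfs_quota_burst_percent=300,
                         cfs_quota_burst_period_seconds=-1) -> str:
    """Build the JSON value of the alibabacloud.com/cpuBurst annotation."""
    if policy not in ALLOWED_POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    return json.dumps({
        "policy": policy,
        "cpuBurstPercent": cpu_burst_percent,
        "cfsQuotaBurstPercent": cfs_quota_burst_percent,
        "cfsQuotaBurstPeriodSeconds": cfs_quota_burst_period_seconds,
    })


def max_burst_cpus(cpu_limit: float, cpu_burst_percent: int = 1000) -> float:
    """Ceiling that the CPU limit can reach with kernel CPU Burst."""
    return cpu_limit * cpu_burst_percent / 100


def max_cfs_quota_us(cpu_limit: float, cfs_quota_burst_percent: int = 300,
                     period_us: int = 100_000) -> int:
    """Ceiling of cfs_quota, in microseconds (assumes a 100 ms period)."""
    return int(round(cpu_limit * period_us * cfs_quota_burst_percent / 100))


if __name__ == "__main__":
    print(cpu_burst_annotation())
    print("burst ceiling for a limit of 1:", max_burst_cpus(1), "CPUs")
    print("cfs_quota ceiling for a limit of 2:", max_cfs_quota_us(2), "us")
```

Rejecting unknown policy values before the annotation is applied avoids shipping a pod whose setting is not one of the documented values.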

Verify the effect of CPU Burst

  1. Use the following YAML template to create an apache-demo.yaml file:
    apiVersion: v1
    kind: Pod
    metadata:
      name: apache-demo
      annotations:
        alibabacloud.com/cpuBurst: '{"policy": "auto"}'   # Use this annotation to enable or disable CPU Burst. 
    spec:
      containers:
      - command:
        - httpd
        - -D
        - FOREGROUND
        image: registry.cn-zhangjiakou.aliyuncs.com/acs/apache-2-4-51-for-slo-test:v0.1
        imagePullPolicy: Always
        name: apache
        resources:
          limits:
            cpu: "4"
            memory: 10Gi
          requests:
            cpu: "4"
            memory: 10Gi
      nodeName: $nodeName # Replace $nodeName with the name of the node that you use. 
      hostNetwork: false
      restartPolicy: Never
      schedulerName: default-scheduler
  2. Run the following command to deploy the Apache HTTP Server application:
    kubectl apply -f apache-demo.yaml
  3. Use the wrk2 tool to perform stress tests.
    # Download, decompress, and then install the wrk2 package. 
    
    # The Gzip module is enabled in the configuration of the Apache application. The Gzip module is used to simulate the logic of processing requests on the server. 
    # Run the following command to send requests. 
    ./wrk -H "Accept-Encoding: deflate, gzip" -t 2 -c 12 -d 120 --latency --timeout 2s -R 24 http://$target_ip_address:8010/static/file.1m.test
    Note
    • Replace $target_ip_address in the command with the pod IP address of the Apache application.
    • You can modify the -R parameter to change the request rate of the sender, in requests per second.

Analyze the result

The following tables show metrics before and after CPU Burst is enabled for Alibaba Cloud Linux 2 and CentOS 7.
  • The Disabled column shows the metrics when the CPU Burst policy is set to none.
  • The Enabled column shows the metrics when the CPU Burst policy is set to auto.

Alibaba Cloud Linux 2          Disabled     Enabled
apache RT-p99                  107.37 ms    67.18 ms (-37.4%)
CPU Throttled Ratio            33.3%        0%
Average pod CPU utilization    31.8%        32.6%

CentOS 7                       Disabled     Enabled
apache RT-p99                  111.69 ms    71.30 ms (-36.2%)
CPU Throttled Ratio            33%          0%
Average pod CPU utilization    32.5%        33.8%

The preceding metrics indicate the following information:

  • After CPU Burst is enabled, the P99 latency decreases by about 36% to 37%.
  • After CPU Burst is enabled, the CPU throttled ratio drops to 0%, while the average pod CPU utilization remains approximately the same.
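
The percentages reported for the P99 latency can be reproduced from the measured values in the tables. The following Python sketch recomputes them; the function name relative_change_percent is a hypothetical helper, and the input values are copied from the tables above.

```python
def relative_change_percent(before: float, after: float) -> float:
    """Return the change from 'before' to 'after' as a percentage of 'before'."""
    return (after - before) / before * 100


if __name__ == "__main__":
    # apache RT-p99 in milliseconds: (CPU Burst disabled, CPU Burst enabled).
    results = {
        "Alibaba Cloud Linux 2": (107.37, 67.18),
        "CentOS 7": (111.69, 71.30),
    }
    for name, (disabled, enabled) in results.items():
        change = relative_change_percent(disabled, enabled)
        print(f"{name}: {change:.1f}%")
```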