Container Service for Kubernetes: CPU Burst

Last Updated: Sep 04, 2023

CPU Burst is a service level objective (SLO)-aware resource scheduling feature provided by Container Service for Kubernetes (ACK). You can use CPU Burst to improve the performance of latency-sensitive applications. When a container reaches its CPU limit, the kernel throttles its CPU scheduling, which degrades application performance. The ack-koordinator component automatically detects CPU throttling events and adjusts the CPU limit to a suitable value, which reduces the latency of these applications. This topic introduces CPU Burst, describes how to use it, and shows how to verify the performance improvement.

Prerequisites

  • An ACK Pro cluster is created. CPU Burst is supported only by ACK Pro clusters. For more information, see Create an ACK Pro cluster.

  • ack-koordinator is installed. For more information, see ack-koordinator.

Limits

The following table describes the versions of the system components that are required for enabling CPU Burst.

| Component | Required version |
| --- | --- |
| Kubernetes | ≥ 1.18 |
| ack-koordinator | ≥ 0.8.0 |

Use scenarios

CPU Burst is suitable in the following scenarios:

  1. CPU throttling is triggered for an application even though the CPU usage of the application is lower than its CPU limit. As a result, the performance of the application is degraded. You can enable CPU Burst to resolve this issue and improve the performance of the application.

  2. The CPU usage during application startup is higher than the CPU usage after the application is started. You can enable CPU Burst to meet the CPU requirements during startup. This way, you do not need to specify an excessively high CPU request for the application, which reduces resource waste.

How CPU Burst works

Kubernetes allows you to specify CPU limits for containers. CPU is a time-shared resource: if you specify a CPU limit for a container, the OS limits the amount of CPU time that the container can use within each scheduling period. For example, if you set the CPU limit of a container to 2, the OS kernel limits the container to 200 milliseconds of CPU time within each 100-millisecond period.
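The arithmetic in the example above can be sketched in shell. This is only an illustration: it assumes the default CFS period of 100,000 microseconds (100 milliseconds), and cpu_limit_to_quota_us is a hypothetical helper name, not part of ACK or ack-koordinator.

```shell
# Hypothetical helper: convert a CPU limit in millicores to the CPU time
# (in microseconds) that a container may use per scheduling period.
# Assumption: the period defaults to 100000 us (100 ms).
cpu_limit_to_quota_us() {
  limit_millicores=$1
  period_us=${2:-100000}
  echo $(( limit_millicores * period_us / 1000 ))
}

cpu_limit_to_quota_us 2000   # CPU limit of 2   -> 200000 us (200 ms) per 100 ms period
cpu_limit_to_quota_us 500    # CPU limit of 0.5 -> 50000 us (50 ms) per 100 ms period
```

A quota larger than the period is possible because the container's threads can run on several cores at the same time.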

CPU utilization is a key metric that is used to evaluate the performance of a container. In most cases, the CPU limit is specified based on CPU utilization. CPU utilization on a per-millisecond basis shows more spikes than on a per-second basis. If the CPU utilization of a container reaches the limit within a 100-millisecond period, CPU throttling is enforced by the OS kernel and threads in the container are suspended for the rest of the time period, as shown in the following figure.

[Figure: How CPU throttling works]

The following figure shows the thread allocation of a web application container that runs on a node with four vCores. The CPU limit of the container is set to 2. The overall CPU utilization within the last second is low. However, Thread 2 cannot be resumed until the third 100-millisecond period starts because CPU throttling is enforced somewhere in the second 100-millisecond period. This increases the response time (RT) and causes long-tail latency problems in containers.

[Figure: ack-slo-manager example]
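Throttling of this kind can be observed in the CPU statistics of the container's cgroup. The following is a minimal sketch, assuming a cpu.stat file that contains the nr_periods and nr_throttled counters exposed by the Linux CFS bandwidth controller. The throttled_ratio helper is hypothetical, and the sample numbers are illustrative rather than measured.

```shell
# Hypothetical helper: print the percentage of CFS periods in which the
# cgroup was throttled. Reads cpu.stat content from the given file, or
# from standard input when no file is given.
throttled_ratio() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) printf "%.1f\n", 100 * throttled / periods
      else print "0.0"
    }
  ' "$@"
}

# Illustrative sample: throttled in 400 of 1200 periods.
printf 'nr_periods 1200\nnr_throttled 400\nthrottled_time 18000000000\n' | throttled_ratio
```

On a node that uses cgroup v1, the file is typically visible inside a container at /sys/fs/cgroup/cpu/cpu.stat, so the helper could be called as `throttled_ratio /sys/fs/cgroup/cpu/cpu.stat`. The exact path depends on the cgroup version and the container runtime.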

Alibaba Cloud Linux 2 provides the CPU Burst feature in kernel version 4.19.91-22.al7. CPU Burst allows a container to accumulate CPU time slices when the container is idle. The container can use the accumulated CPU time slices to burst above the CPU limit when CPU utilization spikes. This improves performance and reduces the RT of the container.

[Figure: CPU Burst]

For kernel versions that do not support CPU Burst, ack-koordinator detects CPU throttling events and dynamically adjusts the CPU limit to achieve the same effect as CPU Burst.

Note

ack-koordinator achieves this by modifying the value of the Completely Fair Scheduler (CFS) quota in the cgroup parameters instead of modifying the value of the CPU limit in the pod specifications.

The preceding CFS quota adjustment policy can be used to handle CPU usage spikes. For example, when traffic spikes occur, ack-koordinator can eliminate CPU bottlenecks within a few seconds, while ensuring a proper number of workloads on the node.

The CPU Burst feature supported by the kernel of Alibaba Cloud Linux handles CPU usage spikes at a faster rate. We recommend that you enable the CPU Burst feature provided by the kernel of Alibaba Cloud Linux for latency-sensitive applications. For more information about CPU Burst, see the Alibaba Cloud presentation at KubeCon 2021: CPU Burst: Getting Rid of Unnecessary Throttling, Achieving High CPU Utilization and Application Performance at the Same Time.

How to use CPU Burst

  • Use an annotation to enable CPU Burst

    Important
    • To enable CPU Burst for a pod, configure the annotations parameter in the metadata section of the pod configuration.

    • To enable CPU Burst for a Deployment, configure the annotations parameter in the template.metadata section of the Deployment configuration.

    annotations:
      # Set the value to auto to enable CPU Burst for the pod. 
      koordinator.sh/cpuBurst: '{"policy": "auto"}'
      # Set the value to none to disable CPU Burst for the pod. 
      #koordinator.sh/cpuBurst: '{"policy": "none"}'
  • Use a ConfigMap to enable CPU Burst for all pods in a cluster

    Modify the ack-slo-config ConfigMap based on the following content to enable CPU Burst for all pods in a cluster:

    apiVersion: v1
    data:
      cpu-burst-config: '{"clusterStrategy": {"policy": "auto"}}'
      #cpu-burst-config: '{"clusterStrategy": {"policy": "cpuBurstOnly"}}'
      #cpu-burst-config: '{"clusterStrategy": {"policy": "none"}}'
    kind: ConfigMap
    metadata:
      name: ack-slo-config
      namespace: kube-system

    To avoid modifying other configurations in the ConfigMap, we recommend that you update the ConfigMap by using a patch. To do this, run the following command:

    kubectl patch cm -n kube-system ack-slo-config --patch "$(cat configmap.yaml)"
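  • Enable CPU Burst for pods in specified namespaces

    Modify the ack-slo-pod-config ConfigMap based on the following content to enable or disable CPU Burst for pods in specified namespaces:
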
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-slo-pod-config
      namespace: kube-system
    data:
      # Enable or disable CPU Burst for pods in specified namespaces. 
      cpu-burst: |
        {
          "enabledNamespaces": ["white-ns"],
          "disabledNamespaces": ["black-ns"]
        }
  • Advanced configurations

    You can specify advanced configurations in the ConfigMap or the annotations parameter in the metadata section of the pod configuration.

    # Example of the ack-slo-config ConfigMap. 
    data:
      cpu-burst-config: |
        {
          "clusterStrategy": {
            "policy": "auto",
            "cpuBurstPercent": 1000,
            "cfsQuotaBurstPercent": 300,
            "sharePoolThresholdPercent": 50,
            "cfsQuotaBurstPeriodSeconds": -1
          }
        }
    
    # Example of pod annotations. 
    annotations:
      koordinator.sh/cpuBurst: '{"policy": "auto", "cpuBurstPercent": 1000, "cfsQuotaBurstPercent": 300, "cfsQuotaBurstPeriodSeconds": -1}'

    The following table describes the advanced parameters of CPU Burst.

    | Parameter | Type | Description |
    | --- | --- | --- |
    | policy | string | none: disables CPU Burst. If you set the value to none, the related fields are reset to their original values. This is the default value. |
    |  |  | cpuBurstOnly: enables only the kernel-level CPU Burst feature of Alibaba Cloud Linux 2. |
    |  |  | cfsQuotaBurstOnly: enables only the automatic adjustment of CFS quotas, which works on general kernel versions. |
    |  |  | auto: enables CPU Burst and all the related features, including the CPU Burst feature of the Alibaba Cloud Linux kernel and the automatic adjustment of CFS quotas on general kernel versions. |
    | cpuBurstPercent | int | Default value: 1000. Unit: %. This field is used to configure the CPU Burst feature for the kernel of Alibaba Cloud Linux 2. It specifies the percentage to which the CPU limit can be increased by CPU Burst. If the CPU limit is set to 1, CPU Burst can increase the limit to 10 by default. For more information, see Enable the CPU burst feature for cgroup v1. |
    | cfsQuotaBurstPercent | int | Default value: 300. Unit: %. This field specifies the maximum percentage to which the value of cfs_quota in the cgroup parameters can be increased. By default, the value of cfs_quota can be increased to at most three times its original value. |
    | cfsQuotaBurstPeriodSeconds | int | Default value: -1, which indicates that the time period is unlimited. Unit: seconds. This field specifies the time period during which the container can run with an increased CFS quota. The increased quota cannot exceed the upper limit specified by cfsQuotaBurstPercent. |
    | sharePoolThresholdPercent | int | Default value: 50. Unit: %. This field specifies the CPU utilization threshold of the node. If the CPU utilization of the node exceeds the threshold, the value of cfs_quota in the cgroup parameters is reset to the original value. |

    Important
    • After you set policy to cfsQuotaBurstOnly or auto, the CFS quota assigned by the node OS is automatically adjusted based on whether CPU throttling is triggered.

    • When you perform stress tests on a container, we recommend that you record the CPU utilization of the container throughout the test period or set policy to cpuBurstOnly or none. This ensures higher resource elasticity for your production environment.
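The default ceilings described in the parameter table can be worked through for a concrete CPU limit. The following is a minimal sketch that takes the parameter descriptions literally (a percentage of the CPU limit). scale_by_percent is a hypothetical helper, and how ack-koordinator applies these ceilings internally is not shown here.

```shell
# Hypothetical helper: scale a CPU limit (in millicores) by a percentage.
scale_by_percent() {
  echo $(( $1 * $2 / 100 ))
}

# cfsQuotaBurstPercent=300 applied to a CPU limit of 4 (4000 millicores):
scale_by_percent 4000 300     # -> 12000 millicores, a ceiling of 12 CPUs

# cpuBurstPercent=1000 applied to a CPU limit of 1 (1000 millicores):
scale_by_percent 1000 1000    # -> 10000 millicores, a ceiling of 10 CPUs
```

These values are upper bounds only. The CPU time that a container actually receives is still bounded by the capacity and current load of the node.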

Verify the effect of CPU Burst

  1. Use the following YAML template to create an apache-demo.yaml file:

    To enable CPU Burst for a pod, specify an annotation in the annotations parameter of the metadata section of the pod configuration.

    apiVersion: v1
    kind: Pod
    metadata:
      name: apache-demo
      annotations:
        koordinator.sh/cpuBurst: '{"policy": "auto"}'   # The annotation is used to enable or disable CPU Burst. 
    spec:
      containers:
      - command:
        - httpd
        - -D
        - FOREGROUND
        image: registry.cn-zhangjiakou.aliyuncs.com/acs/apache-2-4-51-for-slo-test:v0.1
        imagePullPolicy: Always
        name: apache
        resources:
          limits:
            cpu: "4"
            memory: 10Gi
          requests:
            cpu: "4"
            memory: 10Gi
      nodeName: $nodeName # Replace $nodeName with the actual node name. 
      hostNetwork: false
      restartPolicy: Never
      schedulerName: default-scheduler
  2. Run the following command to create an application by using Apache HTTP Server:

    kubectl apply -f apache-demo.yaml
  3. Use the wrk2 tool to perform stress tests.

    # Download, decompress, and then install the wrk2 package. For more information, visit https://github.com/giltene/wrk2. 
    # Gzip compression is enabled in the Apache image to simulate the request processing logic of the server. 
    # Run the following command to send requests. Replace the IP address in the command with the IP address of the application. 
    ./wrk -H "Accept-Encoding: deflate, gzip" -t 2 -c 12 -d 120 --latency --timeout 2s -R 24 ht
    Note
    • Replace the IP address in the command with the pod IP address of the Apache application.

    • You can modify the value of the -R option to change the request rate (in requests per second) of the sender.

Analyze the result

The following tables show metrics before and after CPU Burst is enabled for Alibaba Cloud Linux 2 and CentOS 7.

  • The Disabled column shows the metrics when the CPU Burst policy is set to none.

  • The Enabled column shows the metrics when the CPU Burst policy is set to auto.

| Alibaba Cloud Linux 2 | Disabled | Enabled |
| --- | --- | --- |
| apache RT-p99 | 107.37 ms | 67.18 ms (-37.4%) |
| CPU Throttled Ratio | 33.3% | 0% |
| Average pod CPU utilization | 31.8% | 32.6% |

| CentOS 7 | Disabled | Enabled |
| --- | --- | --- |
| apache RT-p99 | 111.69 ms | 71.30 ms (-36.2%) |
| CPU Throttled Ratio | 33% | 0% |
| Average pod CPU utilization | 32.5% | 33.8% |

The preceding metrics indicate the following information:

  • After CPU Burst is enabled, the P99 response time drops by about 36% to 37% on both operating systems.

  • After CPU Burst is enabled, CPU throttling no longer occurs (the throttled ratio drops from about 33% to 0%), and the average pod CPU utilization remains approximately the same.
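The percentage changes reported for the P99 response time can be reproduced from the raw values. The following is a small sketch of that arithmetic. percent_change is a hypothetical helper.

```shell
# Hypothetical helper: relative change from a baseline value, in percent,
# rounded to one decimal place.
percent_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", 100 * (after - before) / before }'
}

percent_change 107.37 67.18   # Alibaba Cloud Linux 2: -37.4
percent_change 111.69 71.30   # CentOS 7: -36.2
```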

FAQ

Is the CPU Burst feature that is enabled based on the earlier version of the ack-slo-manager protocol supported after I upgrade ack-slo-manager to ack-koordinator?

The earlier version of the pod protocol requires you to add the alibabacloud.com/cpuBurst annotation. ack-koordinator is fully compatible with the earlier protocol version. You can seamlessly upgrade from ack-slo-manager to ack-koordinator.

Note

ack-koordinator is compatible with the earlier protocol version until July 30, 2023. We recommend that you upgrade the resource parameters of the earlier protocol version to the latest version.

The following table describes the compatibility between ack-koordinator versions and the two protocols.

| ack-koordinator version | alibabacloud.com protocol | koordinator.sh protocol |
| --- | --- | --- |
| ≥ 0.2.0 | Supported | Not supported |
| ≥ 0.8.0 | Supported | Supported |