
Container Service for Kubernetes:Enable CPU QoS for containers

Last Updated:Mar 26, 2026

In colocation environments, latency-sensitive (LS) and best-effort (BE) applications share the same node. CPU requests and limits alone cannot prevent BE workloads from interfering with LS workloads—especially under high load. CPU QoS uses Linux scheduling priorities via the group identity feature to give LS Pods higher kernel scheduling priority than BE Pods, reducing interference in colocation scenarios.

To understand the underlying concepts, see Pod quality of service classes and Assign memory resources to containers and Pods in the Kubernetes documentation.

How CPU QoS works

ack-koordinator assigns a group identity to each CPU cgroup. The kernel uses these identities to schedule tasks with differentiated priorities:

  • Faster scheduling for LS tasks: The OS schedules LS application tasks with higher priority, improving response speed and throughput.

  • No preemption by BE tasks: When a BE task wakes up, it cannot preempt a running LS process.

  • SMT isolation: In Simultaneous MultiThreading (SMT) environments, BE tasks do not run in parallel with LS tasks on the same physical core, preventing resource contention at the hardware level.

Choose a QoS class

QoS class                 CPU scheduling priority
LS (latency-sensitive)    High (group identity: 2)
BE (best-effort)          Low (group identity: -1)

Pods labeled koordinator.sh/qosClass: LS are treated as high-priority. Pods labeled koordinator.sh/qosClass: BE are treated as low-priority. For Pods without this label, ack-koordinator falls back to the native Kubernetes QoS class: BestEffort Pods map to BE, and all others map to LS.
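The label-and-fallback rule above can be sketched as a small shell function. This is an illustration of the assumed mapping, not ack-koordinator's actual code:

```shell
# Sketch of how a Pod resolves to a koordinator QoS class (assumed logic,
# mirroring the fallback rule described above).
#   $1: value of the koordinator.sh/qosClass label ("" if the label is absent)
#   $2: native Kubernetes QoS class (Guaranteed / Burstable / BestEffort)
koord_qos_class() {
  if [ -n "$1" ]; then
    echo "$1"                          # an explicit label always wins
  elif [ "$2" = "BestEffort" ]; then
    echo "BE"                          # unlabeled BestEffort Pods map to BE
  else
    echo "LS"                          # all other unlabeled Pods map to LS
  fi
}

koord_qos_class "" "BestEffort"  # -> BE
koord_qos_class "" "Burstable"   # -> LS
```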

Prerequisites

Before you begin, make sure you have:

  • An ACK cluster with the ack-koordinator component installed.

  • Worker nodes that run Alibaba Cloud Linux, which provides the group identity kernel feature.

If your nodes do not run Alibaba Cloud Linux, use CPU Suppress to limit BE Pod CPU usage instead.

Billing

ack-koordinator is free to install and use. Additional costs may apply in the following cases:

  • Worker node resources: ack-koordinator is self-managed and consumes worker node CPU and memory after installation. Configure resource requests for each module during installation.

  • Prometheus monitoring: If you enable the Enable Prometheus Monitoring for ACK-Koordinator option and use Alibaba Cloud Prometheus, the exposed metrics are billed as custom metrics. Costs depend on cluster size and application count. Review the billing of Prometheus instances documentation before enabling this option, and query usage data to monitor your consumption.

Enable CPU QoS

Enable CPU QoS at the cluster level using a ConfigMap, then label your Pods with the appropriate QoS class.

Step 1: Apply the ConfigMap

  1. Create a file named configmap.yaml with the following content:

    Note: To apply CPU QoS to a workload such as a Deployment, add the koordinator.sh/qosClass label to the Pod template under spec.template.metadata.labels.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ack-slo-config
      namespace: kube-system
    data:
      # Enable the CPU QoS feature for containers.
      resource-qos-config: |
        {
          "clusterStrategy": {
            "lsClass": {
              "cpuQOS": {
                "enable": true,
                "groupIdentity": 2
              }
            },
            "beClass": {
              "cpuQOS": {
                "enable": true,
                "groupIdentity": -1
              }
            }
          }
        }

    The lsClass field configures LS Pods; beClass configures BE Pods. The following table describes the key parameters under cpuQOS.

    Parameter       Type      Valid values    Description
    enable          Boolean   true / false    true: enables CPU QoS cluster-wide. false: disables it.
    groupIdentity   Integer   -1 to 2         The kernel scheduling priority for the cgroup. Higher values mean higher priority. Default: 2 for LS Pods, -1 for BE Pods. Set to 0 to disable group identity for that class.
  2. Check whether the ack-slo-config ConfigMap already exists in the kube-system namespace:

    • If it exists, run the following patch command to update it without overwriting other configuration fields:

      kubectl patch cm -n kube-system ack-slo-config --patch "$(cat configmap.yaml)"
    • If it does not exist, create it:

      kubectl apply -f configmap.yaml
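If you tune groupIdentity away from the defaults shown above, keep it within the kernel's accepted -1 to 2 range. A minimal local check, as a convenience sketch (the helper is not part of ack-koordinator):

```shell
# Convenience sketch: verify a groupIdentity value is in the accepted -1..2
# range before writing it into the ack-slo-config ConfigMap.
valid_group_identity() {
  case "$1" in
    -1|0|1|2) return 0 ;;   # accepted by the group identity feature
    *)        return 1 ;;   # rejected: out of range or not an integer
  esac
}

valid_group_identity 2 && echo "2: valid"
valid_group_identity 3 || echo "3: out of range"
```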

Step 2: Deploy an LS Pod and verify

  1. Create a file named ls-pod-demo.yaml with the following content:

    apiVersion: v1
    kind: Pod
    metadata:
      name: ls-pod-demo
      labels:
        koordinator.sh/qosClass: 'LS' # Specify the QoS class of the Pod as LS.
    spec:
      containers:
      - command:
        - httpd
        - -D
        - FOREGROUND
        image: registry.cn-zhangjiakou.aliyuncs.com/acs/apache-2-4-51-for-slo-test:v0.1
        imagePullPolicy: Always
        name: apache
        resources:
          limits:
            cpu: "4"
            memory: 10Gi
          requests:
            cpu: "4"
            memory: 10Gi
      restartPolicy: Never
      schedulerName: default-scheduler
  2. Deploy the Pod:

    kubectl apply -f ls-pod-demo.yaml
  3. On the node, check the group identity assigned to the LS Pod's cgroup:

    cat /sys/fs/cgroup/cpu/kubepods.slice/kubepods-pod1c20f2ad****.slice/cpu.bvt_warp_ns

    Expected output:

    # The group identity of the LS Pod is 2, which indicates high priority.
    2

Step 3: Deploy a BE Pod and verify

  1. Create a file named be-pod-demo.yaml with the following content:

    apiVersion: v1
    kind: Pod
    metadata:
      name: be-pod-demo
      labels:
        koordinator.sh/qosClass: 'BE' # Specify the QoS class of the Pod as BE.
    spec:
      containers:
        - args:
            - '-c'
            - '1'
            - '--vm'
            - '1'
          command:
            - stress
          image: registry-cn-beijing.ack.aliyuncs.com/acs/stress:v1.0.4
          imagePullPolicy: Always
          name: stress
      restartPolicy: Always
      schedulerName: default-scheduler
  2. Deploy the Pod:

    kubectl apply -f be-pod-demo.yaml
  3. On the node, check the group identity assigned to the BE Pod's cgroup:

    cat /sys/fs/cgroup/cpu/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod4b6e96c8****.slice/cpu.bvt_warp_ns

    Expected output:

    # The group identity of the BE Pod is -1, which indicates low priority.
    -1

The LS Pod has a group identity of 2 and the BE Pod has a group identity of -1. This difference confirms that CPU QoS is active: the kernel prioritizes CPU resources for LS workloads over BE workloads.
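The raw values read from cpu.bvt_warp_ns in the two verification steps map one-to-one to the configured groupIdentity. A small local sketch of that interpretation (the helper name is illustrative; the mapping follows the parameter table above):

```shell
# Interpret a value read from a Pod's cpu.bvt_warp_ns file (sketch; mapping
# per the groupIdentity table: 2 = LS/high, -1 = BE/low, 0 = feature disabled).
interpret_bvt() {
  case "$1" in
    2)  echo "LS (high priority)" ;;
    -1) echo "BE (low priority)" ;;
    0)  echo "group identity disabled" ;;
    *)  echo "other priority level: $1" ;;
  esac
}

interpret_bvt 2   # -> LS (high priority)
interpret_bvt -1  # -> BE (low priority)
```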

FAQ

Protocol compatibility when upgrading to ack-koordinator

Earlier versions of ack-slo-manager (v0.8.0 and earlier) used the alibabacloud.com/qosClass annotation to configure CPU QoS. ack-koordinator supports both protocols, so you can upgrade the component and migrate your Pods to the koordinator.sh protocol incrementally. However, support for the legacy alibabacloud.com protocol ended on July 30, 2023, so update your resources to the new protocol as soon as possible.

Component version       alibabacloud.com protocol    koordinator.sh protocol
>= 0.5.2 and < 0.8.0    Supported                    Not supported
>= 0.8.0                Supported                    Supported
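To find Pods that still use the legacy annotation, you can filter kubectl output. The sketch below is illustrative, not an official migration tool; the jsonpath expression in the comment is one assumed way to produce its input:

```shell
# Sketch: print Pods that still carry the legacy alibabacloud.com/qosClass
# annotation. Feed it "namespace/name [annotation-value]" lines, e.g. from:
#   kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name} {.metadata.annotations.alibabacloud\.com/qosClass}{"\n"}{end}'
legacy_pods() {
  awk 'NF == 2 { print $1 " still uses legacy qosClass=" $2 }'
}

# Local example: old-pod still has the legacy annotation, new-pod does not.
printf 'default/old-pod LS\nkube-system/new-pod\n' | legacy_pods
# -> default/old-pod still uses legacy qosClass=LS
```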

What's next