When high-priority and low-priority workloads share the same node, they compete for hardware resources like the L3 cache (last level cache, LLC) and memory bandwidth. This competition degrades quality of service (QoS) for latency-sensitive (LS) applications. By using Intel Resource Director Technology (RDT) with ack-koordinator, you can cap the L3 cache and Memory Bandwidth Allocation (MBA) available to BestEffort (BE) pods, protecting LS applications without changing pod scheduling.
How it works
RDT partitions hardware resources at the node level. ack-koordinator reads the ack-slo-config ConfigMap and applies resctrl group settings to each pod based on its QoS class label. By default, BE pods get 30% of the L3 cache and LS pods get 100%. Memory bandwidth defaults to 100% for both classes and can be restricted as needed.
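On the node, these per-class settings surface as groups in the kernel's resctrl filesystem, each with a `schemata` file holding an L3 cache-way bitmask and a memory-bandwidth percentage per socket. The sketch below fabricates a sample group under `/tmp` to show the file's shape; the real filesystem is mounted at `/sys/fs/resctrl`, and the group name `BE` plus the mask values here are illustrative assumptions, not captured ack-koordinator output:

```shell
# Illustrative only: the real resctrl filesystem lives at /sys/fs/resctrl,
# and the "BE" group name and values below are assumed examples.
mkdir -p /tmp/resctrl-demo/BE
cat > /tmp/resctrl-demo/BE/schemata <<'EOF'
L3:0=3f;1=3f
MB:0=100;1=100
EOF
# The L3 line lists a cache-way bitmask per socket (0x3f = 6 of 20 ways, ~30%);
# the MB line lists the allowed memory-bandwidth percentage per socket.
cat /tmp/resctrl-demo/BE/schemata
```

On an RDT-enabled node you can inspect the real groups with `cat /sys/fs/resctrl/<group>/schemata`.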
Prerequisites
Before you begin, ensure that you have:

- An ACK cluster with at least one Elastic Compute Service (ECS) bare metal instance whose CPU supports RDT. See ECS bare metal instance overview and intel-cmt-cat for supported CPU models.
- ack-koordinator installed at version 0.8.0 or later. See ack-koordinator (FKA ack-slo-manager).
Billing
Installing and using ack-koordinator is free. Costs may apply in two situations:

- Node resource consumption: ack-koordinator is a non-managed component and runs on worker nodes. Specify per-module resource requests at install time to control overhead.
- Prometheus metrics: If you enable Prometheus metrics for ack-koordinator and use Managed Service for Prometheus, the metrics count as custom metrics and are billed accordingly. Charges depend on cluster size and the number of applications. Review the Managed Service for Prometheus billing page before enabling, and monitor usage with Query observable data and bills.
Step 1: Check and enable RDT on the node kernel
1. Check whether the kernel has RDT enabled:

   ```shell
   cat /proc/cmdline
   ```

   Look for `l3cat` and `mba` in the output:

   ```
   # Other fields omitted. This example shows only the RDT portion of the BOOT_IMAGE field.
   BOOT_IMAGE=... rdt=cmt,l3cat,l3cdp,mba
   ```

   If `l3cat` and `mba` appear, RDT is already enabled; skip to Step 2. Otherwise, continue with the steps below.

2. Add the RDT options to the kernel boot parameters. Edit `/etc/default/grub` and append the RDT options to the `GRUB_CMDLINE_LINUX` field, separated from existing entries by a space.

   Important: Separate the RDT options from existing settings with a space. Do not overwrite other kernel parameters.

   ```
   # Other fields omitted. This example shows only the RDT portion of the GRUB_CMDLINE_LINUX field.
   GRUB_CMDLINE_LINUX="... rdt=cmt,mbmtotal,mbmlocal,l3cat,l3cdp,mba"
   ```

3. Regenerate the GRUB configuration:

   ```shell
   # The file path may differ depending on your OS distribution.
   sudo grub2-mkconfig -o /boot/grub2/grub.cfg
   ```

4. Restart the node:

   ```shell
   sudo systemctl reboot
   ```

5. After the node restarts, rerun `cat /proc/cmdline` to confirm that `l3cat` and `mba` appear in the output.
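If you need to check many nodes, the same verification can be scripted. The helper below is a hypothetical convenience function (the function name and sample files are ours, not part of ack-koordinator or any CLI); it succeeds only when both flags are present:

```shell
# Hypothetical helper (not part of any tool): succeeds only when both
# l3cat and mba appear in the given kernel command-line file.
rdt_enabled() {
  grep -q 'l3cat' "$1" && grep -q 'mba' "$1"
}

# Demonstrate against a sample command line; on a real node, pass /proc/cmdline.
printf 'BOOT_IMAGE=... rdt=cmt,l3cat,l3cdp,mba\n' > /tmp/cmdline.sample
rdt_enabled /tmp/cmdline.sample && echo "RDT enabled"
```

On a live node, call it as `rdt_enabled /proc/cmdline`.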
Step 2: Configure L3 cache and MBA isolation
L3 cache and MBA isolation is controlled through the ack-slo-config ConfigMap in the kube-system namespace. After you apply the ConfigMap, label pods with their QoS class so the policy takes effect.
Apply the ConfigMap
Create a file named configmap.yaml with the following content:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-slo-config
  namespace: kube-system
data:
  # resource-qos-config: configures QoS-based resource isolation features.
  # resctrlQOS: controls Intel RDT (L3 cache and MBA) isolation per QoS class.
  # enable: set to true to activate L3 cache and MBA isolation for pods of this class.
  resource-qos-config: |
    {
      "clusterStrategy": {
        "beClass": {
          "resctrlQOS": {
            "enable": true
          }
        }
      }
    }
```

Then apply the ConfigMap:

- If `ack-slo-config` already exists in `kube-system`, patch it to preserve other settings:

  ```shell
  kubectl patch cm -n kube-system ack-slo-config --patch "$(cat configmap.yaml)"
  ```

- If the ConfigMap does not exist, create it:

  ```shell
  kubectl apply -f configmap.yaml
  ```
(Optional) Tune isolation percentages per QoS class
To fine-tune the L3 cache and memory bandwidth allocation for each QoS class, update the ConfigMap with explicit percentages:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ack-slo-config
  namespace: kube-system
data:
  # Defaults: LS pods get 100% of the L3 cache and 100% memory bandwidth;
  # BE pods get 30% of the L3 cache and 100% memory bandwidth.
  # Do not place comments inside the block below: the value of
  # resource-qos-config must be valid JSON.
  resource-qos-config: |
    {
      "clusterStrategy": {
        "lsClass": {
          "resctrlQOS": {
            "enable": true,
            "catRangeEndPercent": 100,
            "mbaPercent": 100
          }
        },
        "beClass": {
          "resctrlQOS": {
            "enable": true,
            "catRangeEndPercent": 30,
            "mbaPercent": 100
          }
        }
      }
    }
```

The following table describes the resctrlQOS parameters:
| Parameter | Type | Valid values | Default (LS class) | Default (BE class) | Description |
|---|---|---|---|---|---|
| enable | Boolean | true / false | — | — | Enables or disables L3 cache and MBA isolation for workloads of the QoS class. |
| catRangeEndPercent | Int | 0–100 (%) | 100 | 30 | Percentage of the L3 cache allocated to the QoS class. |
| mbaPercent | Int | 0–100 (%), must be a multiple of 10 | 100 | 100 | Percentage of memory bandwidth available to the QoS class. |
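To make the catRangeEndPercent semantics concrete, the sketch below converts a percentage into the contiguous cache-way bitmask format that resctrl expects. The function name, the 20-way cache size, and the round-down rule are our illustrative assumptions, not ack-koordinator's actual code:

```shell
# Illustrative only: ack-koordinator computes the real mask internally.
# Assumes a CPU whose L3 cache exposes 20 allocation ways.
percent_to_mask() {
  percent=$1
  total_ways=$2
  # Round down to whole cache ways, but always keep at least one way.
  ways=$(( percent * total_ways / 100 ))
  [ "$ways" -lt 1 ] && ways=1
  # resctrl requires a contiguous run of set bits, so use the lowest $ways bits.
  printf '0x%x\n' $(( (1 << ways) - 1 ))
}

percent_to_mask 100 20   # -> 0xfffff (all 20 ways)
percent_to_mask 30 20    # -> 0x3f   (6 of 20 ways)
```

A BE group configured with catRangeEndPercent of 30 would thus, on such a CPU, be confined to roughly a third of the cache ways.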
Label pods with a QoS class
Apply the koordinator.sh/qosClass label to the pods you want to isolate. The following example creates a BE pod that runs a memory stress workload:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-demo
  labels:
    koordinator.sh/qosClass: 'BE' # Assigns this pod to the BE QoS class.
spec:
  containers:
  - name: pod-demo
    image: polinux/stress
    resources:
      requests:
        cpu: 1
        memory: "50Mi"
      limits:
        cpu: 1
        memory: "1Gi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "256M", "-c", "2", "--vm-hang", "1"]
```

Note: If you create pods through a Deployment or another workload controller, set the koordinator.sh/qosClass label in the template.metadata field, not at the Deployment level.

Deploy the pod:

```shell
kubectl apply -f pod-demo.yaml
```
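Once the pod is running, ack-koordinator steers its tasks into a resctrl group matching the QoS class. The sketch below models that mapping so you know where to look on the node; the function, the group paths, and the fallback for unlabeled pods are our assumptions for illustration, so verify the actual layout under /sys/fs/resctrl on your node:

```shell
# Illustrative mapping only; the actual group layout is managed by ack-koordinator.
qos_group() {
  case "$1" in
    BE) echo "/sys/fs/resctrl/BE" ;;
    LS) echo "/sys/fs/resctrl/LS" ;;
    *)  echo "/sys/fs/resctrl" ;;  # root group; handling of other classes omitted here
  esac
}

qos_group BE   # -> /sys/fs/resctrl/BE
```

On the node, `cat <group>/schemata` shows the limits applied to that class, and `cat <group>/tasks` lists the thread IDs assigned to it.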