In Kubernetes 1.27 and earlier, changing container resource parameters for a running pod requires updating the PodSpec and resubmitting it, which deletes and recreates the pod. ACK lets you dynamically modify CPU, memory, and disk I/O isolation parameters of a pod on a single node using cgroup files — without restarting the pod.
This feature is intended for temporary emergency adjustments only, not for formal or routine operations. For regular resource management, use CPU Burst, CPU topology-aware scheduling, or resource profiling.
How it works
ACK uses a custom resource definition (CRD) of kind: Cgroups to communicate resource changes to the ack-koordinator component. When you apply a Cgroups resource, the koordlet daemon on the node writes the new values directly to the cgroup files on the host — bypassing the Kubernetes scheduler and kubelet resource reconciliation loop. The pod's PodSpec remains unchanged; only the cgroup values on the node are updated.
Use the spec.pod field to target a specific pod, or spec.deployment to apply the change to all pods in a Deployment.
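Both targeting modes share the same overall shape. The following skeleton is for orientation only; the resource name `cgroups-skeleton` is hypothetical, and complete, working examples appear in the procedures below.

```bash
# Skeleton of a Cgroups resource (illustrative sketch; see the full examples below).
kubectl apply -f - <<'EOF'
apiVersion: resources.alibabacloud.com/v1alpha1
kind: Cgroups
metadata:
  name: cgroups-skeleton        # hypothetical name
spec:
  pod:                          # use "deployment:" instead to target all pods in a Deployment
    name: pod-demo              # name of the target pod (or Deployment)
    namespace: default
    containers:
    - name: pod-demo            # container whose cgroup values are rewritten
      memory: 5Gi               # values to apply, such as memory, cpuset-cpus, or blkio
EOF
```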
Prerequisites
Before you begin, ensure that you have:
- A kubectl client connected to the ACK cluster. For more information, see Connect to an ACK cluster using kubectl.
- ack-koordinator version 0.5.0 or later installed. For more information, see ack-koordinator (ack-slo-manager).
Billing
ack-koordinator is free to install and use. Additional charges may apply in these situations:
- Worker node resources: ack-koordinator is an unmanaged component that consumes resources on worker nodes after installation. Specify the resource requests for each module when you install the component.
- Prometheus monitoring metrics: If you enable the Enable Prometheus monitoring metrics for ACK-Koordinator option and use Managed Service for Prometheus, the metrics are billed as custom metrics. Charges depend on cluster size and the number of applications. Review the Prometheus instance billing documentation and use the usage query feature to monitor consumption before enabling this feature.
Limitations
| Constraint | Detail |
|---|---|
| Scope | Temporary adjustments only. Does not update the pod's PodSpec or persist across pod restarts. |
| Cluster version (memory) | Cluster version 1.22 or later requires ack-koordinator v1.5.0-ack1.14 or later. Earlier component versions support only clusters running version 1.22 or earlier. |
| Disk I/O | Worker nodes must run Alibaba Cloud Linux. |
| cgroup v1 buffered I/O | In a cgroup v1 environment, blkio limits apply only to direct I/O. To limit buffered I/O, enable the cgroup writeback feature in Alibaba Cloud Linux. |
| cgroup v2 | Disk I/O throttling via blkio is not supported in cgroup v2 environments. |
| CPU limit | For temporary CPU limit adjustments, see the procedure in Migrate from resource-controller to ack-koordinator. |
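The cgroup v1 and cgroup v2 constraints above depend on which hierarchy the worker node runs. If you are unsure, one quick check, using standard Linux tooling rather than anything ACK-specific, is the following sketch run directly on the node:

```bash
# Print the filesystem type mounted at /sys/fs/cgroup on the worker node.
stat -fc %T /sys/fs/cgroup/
# "tmpfs"     -> cgroup v1: per-controller hierarchies (memory, cpuset, blkio) are separate mounts
# "cgroup2fs" -> cgroup v2: the unified hierarchy, where blkio-based disk I/O throttling is not supported
```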
Modify the memory limit
If a pod's memory usage is rising and you need to raise the limit without triggering the out-of-memory (OOM) killer, follow these steps. The example creates a pod with an initial memory limit of 1 GiB, then raises it to 5 GiB using a cgroup file.
1. Create `pod-demo.yaml` with the following content.

   ```yaml
   apiVersion: v1
   kind: Pod
   metadata:
     name: pod-demo
   spec:
     containers:
     - name: pod-demo
       image: registry-cn-beijing.ack.aliyuncs.com/acs/stress:v1.0.4
       resources:
         requests:
           cpu: 1
           memory: "50Mi"
         limits:
           cpu: 1
           memory: "1Gi" # Initial memory limit: 1 GiB
       command: ["stress"]
       args: ["--vm", "1", "--vm-bytes", "256M", "-c", "2", "--vm-hang", "1"]
   ```

2. Deploy the pod.

   ```bash
   kubectl apply -f pod-demo.yaml
   ```

3. Verify the initial memory limit. The cgroup path is constructed from the pod UID and container ID; a sketch for locating this path yourself follows the procedure.

   ```bash
   cat /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podaf44b779_41d8_43d5_a0d8_8a7a0b17****.slice/memory.limit_in_bytes
   ```

   Expected output:

   ```
   1073741824
   ```

   The value `1073741824` equals 1 × 1024³ bytes (1 GiB), which matches `spec.containers.resources.limits.memory` in the pod definition.

4. Create `cgroups-sample.yaml` to set the new memory limit.

   ```yaml
   apiVersion: resources.alibabacloud.com/v1alpha1
   kind: Cgroups
   metadata:
     name: cgroups-sample
   spec:
     pod:
       name: pod-demo
       namespace: default
       containers:
       - name: pod-demo
         memory: 5Gi # New memory limit: 5 GiB
   ```

5. Apply the Cgroups resource.

   ```bash
   kubectl apply -f cgroups-sample.yaml
   ```

6. Verify the updated memory limit.

   ```bash
   cat /sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podaf44b779_41d8_43d5_a0d8_8a7a0b17****.slice/memory.limit_in_bytes
   ```

   Expected output:

   ```
   5368709120
   ```

   The value `5368709120` equals 5 × 1024³ bytes (5 GiB), which matches `spec.pod.containers.memory` in the Cgroups resource.

7. Confirm that the pod was not restarted.

   ```bash
   kubectl describe pod pod-demo
   ```

   In the `Events` section, confirm there are no restart events, only the original scheduling and startup events:

   ```
   Events:
     Type    Reason          Age   From               Message
     ----    ------          ----  ----               -------
     Normal  Scheduled       36m   default-scheduler  Successfully assigned default/pod-demo to cn-hangzhou.192.168.0.50
     Normal  AllocIPSucceed  36m   terway-daemon      Alloc IP 192.XX.XX.51/24 took 4.490542543s
     Normal  Pulling         36m   kubelet            Pulling image "registry-cn-beijing.ack.aliyuncs.com/acs/stress:v1.0.4"
     Normal  Pulled          36m   kubelet            Successfully pulled image "registry-cn-beijing.ack.aliyuncs.com/acs/stress:v1.0.4" in 2.204s (2.204s including waiting). Image size: 7755078 bytes.
     Normal  Created         36m   kubelet            Created container pod-demo
     Normal  Started         36m   kubelet            Started container pod-demo
   ```

   The absence of restart events confirms that the memory limit was updated in place.
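The masked path in step 3 comes from the pod's UID. The following is a minimal sketch for locating the same file yourself, assuming a cgroup v1 node with the systemd cgroup driver (which matches the `kubepods.slice` layout in this example) and a Burstable QoS pod.

```bash
# Look up the pod UID and build the pod-level memory cgroup path (sketch: cgroup v1 + systemd driver).
POD_UID=$(kubectl get pod pod-demo -o jsonpath='{.metadata.uid}')
# systemd slice names use underscores where the UID has dashes.
SLICE_UID=$(echo "$POD_UID" | tr '-' '_')
# Run the cat on the worker node that hosts the pod.
cat "/sys/fs/cgroup/memory/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod${SLICE_UID}.slice/memory.limit_in_bytes"
```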
Modify the CPU core binding scope
For CPU-intensive applications requiring stricter resource isolation, bind a pod to specific CPU cores. The example creates a pod with no CPU binding (cores 0–31 available), then restricts it to cores 2–3.
For persistent CPU core binding in production, use CPU topology-aware scheduling instead.
1. Create `pod-cpuset-demo.yaml` with the following content.

   ```yaml
   apiVersion: v1
   kind: Pod
   metadata:
     name: pod-cpuset-demo
   spec:
     containers:
     - name: pod-cpuset-demo
       image: registry-cn-beijing.ack.aliyuncs.com/acs/stress:v1.0.4
       resources:
         requests:
           memory: "50Mi"
         limits:
           memory: "1000Mi"
           cpu: 0.5
       command: ["stress"]
       args: ["--vm", "1", "--vm-bytes", "556M", "-c", "2", "--vm-hang", "1"]
   ```

2. Deploy the pod.

   ```bash
   kubectl apply -f pod-cpuset-demo.yaml
   ```

3. Check the current CPU core binding. The path is constructed from the pod UID and container ID.

   ```bash
   cat /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf9b79bee_eb2a_4b67_befe_51c270f8****.slice/cri-containerd-aba883f8b3ae696e99c3a920a578e3649fa957c51522f3fb00ca943dc2c7****.scope/cpuset.cpus
   ```

   Expected output:

   ```
   0-31
   ```

   The range `0-31` means the container has access to all 32 CPU cores with no constraints.

4. Create `cgroups-sample-cpusetpod.yaml` to set the CPU binding.

   ```yaml
   apiVersion: resources.alibabacloud.com/v1alpha1
   kind: Cgroups
   metadata:
     name: cgroups-sample-cpusetpod
   spec:
     pod:
       name: pod-cpuset-demo
       namespace: default
       containers:
       - name: pod-cpuset-demo
         cpuset-cpus: 2-3 # Restrict the pod to CPU cores 2 and 3
   ```

5. Apply the Cgroups resource.

   ```bash
   kubectl apply -f cgroups-sample-cpusetpod.yaml
   ```

6. Verify the updated CPU binding.

   ```bash
   cat /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf9b79bee_eb2a_4b67_befe_51c270f8****.slice/cri-containerd-aba883f8b3ae696e99c3a920a578e3649fa957c51522f3fb00ca943dc2c7****.scope/cpuset.cpus
   ```

   Expected output:

   ```
   2-3
   ```

   The output confirms the container is bound to CPU cores 2 and 3, matching `spec.pod.containers.cpuset-cpus` in the Cgroups resource. A sketch for checking the binding from inside the pod follows this procedure.

7. Confirm that the pod was not restarted.

   ```bash
   kubectl describe pod pod-cpuset-demo
   ```

   The `Events` section includes a `CPUSetBind` event from koordlet but no restart events:

   ```
   Events:
     Type    Reason          Age    From               Message
     ----    ------          ----   ----               -------
     Normal  Scheduled       7m7s   default-scheduler  Successfully assigned default/pod-cpuset-demo to cn-hangzhou.192.XX.XX.50
     Normal  AllocIPSucceed  7m5s   terway-daemon      Alloc IP 192.XX.XX.56/24 took 2.060752512s
     Normal  Pulled          7m5s   kubelet            Container image "registry-cn-beijing.ack.aliyuncs.com/acs/stress:v1.0.4" already present on machine
     Normal  Created         7m5s   kubelet            Created container pod-cpuset-demo
     Normal  Started         7m5s   kubelet            Started container pod-cpuset-demo
     Normal  CPUSetBind      84s    koordlet           set cpuset 2-3 to container pod-cpuset-demo success
   ```
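Instead of locating cgroup files on the node, you can also read the binding from inside the pod. This is a sketch and assumes the container image provides `grep`:

```bash
# The kernel exposes the effective CPU affinity of the container's main process (PID 1).
kubectl exec pod-cpuset-demo -- grep Cpus_allowed_list /proc/1/status
# Expected to show "Cpus_allowed_list: 2-3" once the Cgroups resource has been applied.
```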
Modify disk I/O parameters
Worker nodes must run Alibaba Cloud Linux to use this feature. In a cgroup v1 environment, blkio limits apply only to direct I/O. To limit buffered I/O, enable the cgroup writeback feature for cgroup v1 in Alibaba Cloud Linux. This feature is not supported in cgroup v2 environments.
The example deploys an I/O-intensive test application using fio, then limits its write throughput using a cgroup file.
1. Create `fio-demo.yaml` with the following content. The host directory `/mnt` is mounted into the pod at `/data`, corresponding to the disk device `/dev/vda1`.

   ```yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: fio-demo
     labels:
       app: fio-demo
   spec:
     selector:
       matchLabels:
         app: fio-demo
     template:
       metadata:
         labels:
           app: fio-demo
       spec:
         containers:
         - name: fio-demo
           image: registry.cn-zhangjiakou.aliyuncs.com/acs/fio-for-slo-test:v0.1
           command: ["sh", "-c"]
           # Run a sequential write test on disk I/O using fio
           args: ["fio -filename=/data/test -direct=1 -iodepth 1 -thread -rw=write -ioengine=psync -bs=16k -size=2G -numjobs=10 -runtime=12000 -group_reporting -name=mytest"]
           volumeMounts:
           - name: pvc
             mountPath: /data
         volumes:
         - name: pvc
           hostPath:
             path: /mnt
   ```

2. Deploy the application.

   ```bash
   kubectl apply -f fio-demo.yaml
   ```

3. Limit the write throughput using a cgroup file.

   1. Create `cgroups-sample-fio.yaml` to set a bytes-per-second (BPS) write limit on `/dev/vda1`.

      ```yaml
      apiVersion: resources.alibabacloud.com/v1alpha1
      kind: Cgroups
      metadata:
        name: cgroups-sample-fio
      spec:
        deployment:
          name: fio-demo
          namespace: default
          containers:
          - name: fio-demo
            blkio:
              # BPS limit in bytes per second (e.g., 1048576 = 1 MiB/s)
              device_write_bps: [{device: "/dev/vda1", value: "1048576"}]
      ```

   2. Apply the Cgroups resource.

      ```bash
      kubectl apply -f cgroups-sample-fio.yaml
      ```

   3. Verify the updated disk I/O limit. The path is constructed from the pod UID and container ID.

      ```bash
      cat /sys/fs/cgroup/blkio/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod0840adda_bc26_4870_adba_f193cd00****.slice/cri-containerd-9ea6cc97a6de902d941199db2fcda872ddd543485f5f987498e40cd706dc****.scope/blkio.throttle.write_bps_device
      ```

      Expected output:

      ```
      253:0 1048576
      ```

      The output confirms the write BPS limit is set to `1048576` bytes per second (1 MiB/s) for device `253:0`, which corresponds to `/dev/vda1`. The pod was not restarted during the modification. A sketch for mapping device names to these major:minor numbers follows this procedure.

4. To view disk monitoring data in Prometheus, go to the console and choose Operations > Prometheus Monitoring. On the Application Monitoring tab, filter for the sample application. For setup instructions, see Connect to and configure Alibaba Cloud Prometheus Monitoring.
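The `253:0` prefix in the verification output is the kernel's major:minor device number. To confirm which block device it maps to on your node, a sketch using standard tooling:

```bash
# List block devices with their major:minor numbers on the worker node.
lsblk -o NAME,MAJ:MIN,SIZE,MOUNTPOINT
# The MAJ:MIN column shows which device corresponds to the 253:0 entry
# in blkio.throttle.write_bps_device.
```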
Apply changes at the Deployment level
All the procedures above work at the Deployment level as well. Pod-level modifications use spec.pod; Deployment-level modifications use spec.deployment. The following example applies CPU core binding to a Deployment.
1. Create `go-demo.yaml` with the following content. The Deployment runs two replicas, each using 0.5 CPU cores.

   ```yaml
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: go-demo
     labels:
       app: go-demo
   spec:
     replicas: 2
     selector:
       matchLabels:
         app: go-demo
     template:
       metadata:
         labels:
           app: go-demo
       spec:
         containers:
         - name: go-demo
           image: registry-cn-beijing.ack.aliyuncs.com/acs/stress:v1.0.4
           command: ["stress"]
           args: ["--vm", "1", "--vm-bytes", "556M", "-c", "1", "--vm-hang", "1"]
           imagePullPolicy: Always
           resources:
             requests:
               cpu: 0.5
             limits:
               cpu: 0.5
   ```

2. Deploy the application.

   ```bash
   kubectl apply -f go-demo.yaml
   ```

3. Create `cgroups-cpuset-sample.yaml` to bind the Deployment's pods to specific CPU cores.

   ```yaml
   apiVersion: resources.alibabacloud.com/v1alpha1
   kind: Cgroups
   metadata:
     name: cgroups-cpuset-sample
   spec:
     deployment: # Targets a Deployment, not a single pod
       name: go-demo
       namespace: default
       containers:
       - name: go-demo
         cpuset-cpus: 2,3 # Bind to CPU cores 2 and 3
   ```

4. Apply the Cgroups resource.

   ```bash
   kubectl apply -f cgroups-cpuset-sample.yaml
   ```

5. Verify the CPU binding for one of the pods. The path is constructed from the pod UID and container ID. To check every replica at once, see the sketch after this procedure.

   ```bash
   cat /sys/fs/cgroup/cpuset/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod06de7408_346a_4d00_ba25_02833b6c****.slice/cri-containerd-733a0dc93480eb47ac6c5abfade5c22ed41639958e3d304ca1f85959edc3****.scope/cpuset.cpus
   ```

   Expected output:

   ```
   2-3
   ```

   The output confirms the container is bound to CPU cores 2 and 3, matching `spec.deployment.containers.cpuset-cpus` in the Cgroups resource.
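Because the Cgroups resource targets the whole Deployment, every replica should receive the same binding. The following sketch checks all replicas at once, assuming the container image provides `grep`:

```bash
# Read the effective CPU affinity of PID 1 in each go-demo replica.
for p in $(kubectl get pods -l app=go-demo -o jsonpath='{.items[*].metadata.name}'); do
  echo -n "$p: "
  kubectl exec "$p" -- grep Cpus_allowed_list /proc/1/status
done
```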
What's next
- CPU Burst: Containers accumulate unused CPU time slices and spend them during traffic spikes, reducing latency and improving quality of service. See Enable the CPU Burst policy.
- CPU topology-aware scheduling: Pin pods to specific CPU cores at scheduling time to eliminate CPU context-switching overhead and cross-NUMA memory access. See Enable CPU topology-aware scheduling.
- Dynamic resource overselling: Reclaim allocated-but-unused resources and make them available to lower-priority workloads. See Enable dynamic resource overselling.
- Resource profiling: Analyze historical usage data to get right-sizing recommendations for container requests and limits. See Resource profiling.