cgroup v2 unifies resource control, improves memory handling, and ensures Kubernetes compatibility.
Cgroup versions
The Linux kernel provides cgroup v1 and cgroup v2 to limit, account for, and isolate physical resources (such as CPU, memory, and I/O) for process groups. cgroup v2 resolves v1's multiple-hierarchy issues with a unified controller model. The filesystem interfaces are incompatible, so applications that directly access cgroupfs must be updated.
See Differences between cgroup v1 and cgroup v2.
Kubernetes v1.31 moved cgroup v1 support to maintenance mode, and v1.35 dropped support. Key advantages of cgroup v2:
- Improved stability: Unified memory accounting manages the page cache effectively, resolving cgroup v1's issue where high-disk-I/O applications preempt memory and cause OOMKilled events.
- Unified hierarchy: All resource controllers (such as CPU and memory) share a single hierarchy, eliminating configuration conflicts from cgroup v1's parallel hierarchies.
- Enhanced resource observability: Pressure Stall Information (PSI) measures time stalled on CPU, memory, or I/O, providing fine-grained metrics for bottleneck analysis.
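To illustrate the PSI metrics mentioned above, the following sketch extracts a single value from PSI output. It assumes the usual PSI line format (`some avg10=... avg60=... avg300=... total=...`). The function name `psi_field` is hypothetical, and PSI files are only present when PSI is enabled in the kernel.

```shell
# Hypothetical helper: print one PSI field from PSI text read on stdin.
#   $1 = line type ("some" or "full")
#   $2 = field name (avg10, avg60, avg300, or total)
psi_field() (
  awk -v kind="$1" -v key="$2" '
    $1 == kind {
      # Each remaining column has the form name=value
      for (i = 2; i <= NF; i++) {
        split($i, kv, "=")
        if (kv[1] == key) { print kv[2]; exit }
      }
    }'
)
```

For example, `psi_field some avg10 < /proc/pressure/memory` would print the percentage of the last 10 seconds during which at least one task was stalled on memory. Under cgroup v2, per-cgroup files such as `memory.pressure` use the same format.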
Check the cgroup version
Log on to a node and run the following command to check its cgroup version.
# Run this command after logging on to the target node
stat -fc %T /sys/fs/cgroup/
# Expected output:
# cgroup2fs --> Indicates cgroup v2
# tmpfs --> Indicates cgroup v1
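When the check has to run across many nodes, a script-friendly form can help. The following is a minimal sketch that assumes GNU `stat`, as in the command above. The function names `cgroup_version_from_fstype` and `detect_cgroup_version` are hypothetical.

```shell
# Map the filesystem type reported by `stat -fc %T` to a cgroup version label.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# Print the cgroup version of the current node.
#   $1 = cgroup mount point (optional, defaults to /sys/fs/cgroup/)
detect_cgroup_version() (
  # An unreadable or missing path yields an empty type, reported as "unknown"
  fstype="$(stat -fc %T "${1:-/sys/fs/cgroup/}" 2>/dev/null)" || fstype=""
  cgroup_version_from_fstype "$fstype"
)
```

Running `detect_cgroup_version` on a node prints `v2`, `v1`, or `unknown`, which is easier to collect and compare than the raw filesystem type.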
Migration procedure
Node level: Change the operating system
The cgroup version of a node is determined by its operating system.
- ECS (including EGS) node pools: Change the operating system at the node pool level. The following operating systems use cgroup v2 by default:
  - Alibaba Cloud Linux 3.2104 LTS 64-bit container-optimized
  - Alibaba Cloud Linux 4 LTS 64-bit container-optimized
  - ContainerOS 3.3 and later
  - RHEL 9 and later
  - Ubuntu 22 and later
- Reinstall the nodes and rejoin them to the cluster:
  - Remove the node: Remove the Lingjun node from the ACK cluster. Optionally drain the node before removal. A node drain requires other nodes with sufficient resources to accommodate the evicted Pods, so ensure that the cluster has enough capacity.
  - Reinstall the OS: In the Lingjun console, reinstall the node with a cgroup v2-enabled OS image.
  - Rejoin the cluster: Add the Lingjun node back to the ACK cluster.
- For nodes whose OS you manage yourself, upgrade to a cgroup v2-compatible OS to prevent failures when upgrading or rejoining nodes.
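The optional drain before removing a node can be done with `kubectl drain`. The following is a minimal sketch, not a prescribed procedure: the helper name `drain_node`, the `KUBECTL` override variable, and the 300-second timeout are assumptions made for illustration. Note that `--delete-emptydir-data` discards data stored in `emptyDir` volumes on the node.

```shell
# Hypothetical helper: evict Pods from a node before it is removed.
#   $1 = node name
# Set KUBECTL=echo to preview the command without contacting a cluster.
drain_node() (
  node="$1"
  if [ -z "$node" ]; then
    echo "usage: drain_node <node-name>" >&2
    exit 1
  fi
  # drain cordons the node first, then evicts Pods not managed by a DaemonSet
  "${KUBECTL:-kubectl}" drain "$node" \
    --ignore-daemonsets \
    --delete-emptydir-data \
    --timeout=300s
)
```

If the drain times out, the usual cause is insufficient capacity on the remaining nodes or a PodDisruptionBudget that blocks eviction.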
Application level: Ensure workload compatibility
cgroup v1 and v2 have incompatible filesystem structures and parameter names. Any application that directly reads /sys/fs/cgroup must be verified or upgraded for cgroup v2.
| Category | Description |
| --- | --- |
| Java applications | |
| Go applications | If you use uber-go/automaxprocs, upgrade to v1.5.1 or later. |
| cAdvisor | If you deploy cAdvisor as a standalone DaemonSet, update to v0.43.0 or later. |
| Nginx Ingress | Older versions may trigger OOMKilled errors due to incorrect CPU core parsing in cgroup v2. Upgrade to v1.11.2 or later. See GitHub Issue #9665. To upgrade the ACK Nginx Ingress Controller, see Upgrade the Nginx Ingress Controller component. |
Other applications and components
- Third-party monitoring and APM agents: Tools such as Prometheus Node Exporter, Datadog Agent, and SkyWalking read cgroupfs for metrics. Incompatible versions may cause data loss or anomalies. Upgrade to a version that supports cgroup v2.
- Security and auditing tools: Tools such as Falco and Sysdig use cgroup data to attribute events. Incompatible versions may cause detection rule failures or false positives. Upgrade to a compatible version and verify rules in a test environment.
- Performance-sensitive applications and custom scripts: Startup scripts that read cgroup files for auto-tuning (such as setting thread count from CPU quota) will fail under cgroup v2 due to path changes. Review and update these scripts for cgroup v2.
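For startup scripts that derive a thread count from the CPU quota, one way to handle both versions is to check for the cgroup v2 file first and fall back to the cgroup v1 files. The following sketch assumes the script runs inside a container where the cgroup is mounted at `/sys/fs/cgroup`. The function name `cpu_limit_cores` is hypothetical, and the v1 path `cpu/cpu.cfs_quota_us` may differ by distribution.

```shell
# Hypothetical helper: print the CPU limit in whole cores (rounded up),
# or "unlimited" when no quota is set.
#   $1 = cgroup root (optional, defaults to /sys/fs/cgroup)
cpu_limit_cores() (
  root="${1:-/sys/fs/cgroup}"
  if [ -r "$root/cpu.max" ]; then
    # cgroup v2: one file holding "<quota> <period>"; quota may be "max"
    read -r quota period < "$root/cpu.max"
  elif [ -r "$root/cpu/cpu.cfs_quota_us" ]; then
    # cgroup v1: separate files; a quota of -1 means no limit
    quota="$(cat "$root/cpu/cpu.cfs_quota_us")"
    period="$(cat "$root/cpu/cpu.cfs_period_us")"
  else
    echo "unknown"
    exit 1
  fi
  if [ "$quota" = "max" ] || [ "$quota" -le 0 ]; then
    echo "unlimited"
  else
    # Round up so that a quota of 1.5 cores yields 2
    echo $(( (quota + period - 1) / period ))
  fi
)
```

A startup script could then fall back to `nproc` when the result is `unlimited` or `unknown`, and otherwise size its thread pool from the printed value.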
Production recommendations
- Application compatibility: Verify that your applications and scripts do not rely on cgroup v1 files such as cpu.cfs_quota_us, because cgroup v2 uses incompatible interfaces.
- Custom node configurations: Changing the OS resets the node. Use node pool features to persist custom modifications.
- Monitoring and alerting: Enable Alibaba Cloud Prometheus monitoring to observe cluster health and container resource usage for prompt anomaly detection.
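For the application compatibility check, a quick first pass is to search scripts and configuration for cgroup v1 file names. The following is a rough sketch: the function name `find_cgroup_v1_refs` is hypothetical, and apart from `cpu.cfs_quota_us` the file names in the pattern are additional cgroup v1 interface files chosen for illustration, not an exhaustive list.

```shell
# Hypothetical helper: list files under a directory that mention
# cgroup v1-only interface files.
#   $1 = directory to scan (optional, defaults to the current directory)
find_cgroup_v1_refs() (
  dir="${1:-.}"
  pattern='cpu\.cfs_quota_us|cpu\.cfs_period_us|cpu\.shares|memory\.limit_in_bytes|memory\.usage_in_bytes'
  # -r: recurse, -l: print matching file names only, -E: extended regex
  # "|| true" keeps a no-match result from being treated as an error
  grep -rlE "$pattern" "$dir" 2>/dev/null || true
)
```

An empty result does not prove compatibility, because compiled binaries and libraries may read cgroupfs without these names appearing in your own scripts.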