Alibaba Cloud Linux: Differences between cgroup v1 and cgroup v2

Last Updated: Apr 02, 2026

Control groups (cgroups) are a Linux kernel feature that restricts, accounts for, and isolates the physical resources (such as CPU, memory, and I/O) of process groups. A parent process can use cgroups to manage the resource consumption of its child process groups.

This page lists the interface differences between cgroup v1 and cgroup v2. Each table maps v1 interfaces to their v2 equivalents, or marks them as N/A when no equivalent exists in v2.

How cgroup v2 differs from cgroup v1

Before reviewing the interface tables, understanding two core architectural changes in cgroup v2 helps explain why some interfaces were removed, renamed, or restructured.

No processes in inner nodes. In cgroup v2, processes can only be attached to leaf cgroups: a cgroup that distributes resources to child cgroups cannot directly contain processes (the root cgroup is exempt). This constraint removes the ambiguity that arose in v1 when processes attached to a parent cgroup competed for resources with that parent's child cgroups.

Unified hierarchy. cgroup v1 allowed each subsystem to be mounted on a separate hierarchy. cgroup v2 uses a single unified hierarchy, which is why per-subsystem mount options and named hierarchies are gone.

These two rules explain why several v1 interfaces were dropped or replaced in v2: tasks (superseded by cgroup.threads), cgroup.clone_children, cgroup.sane_behavior, and the entire cpuacct subsystem (its accounting functionality is folded into the cpu subsystem in v2).
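
Because cgroup v2 exposes a single hierarchy, a script can usually tell the two versions apart by looking for cgroup.controllers at the root of the mount, a file that the tables on this page list only for v2. The following is a minimal sketch under that assumption. The function name cgroup_version and the default mount point /sys/fs/cgroup are illustrative choices, not interfaces described on this page.

```shell
# Print 2 if the directory looks like a cgroup v2 (unified) mount, else 1.
# Assumption: cgroup.controllers exists at the root of a v2 mount only.
cgroup_version() {
    root=${1:-/sys/fs/cgroup}   # hypothetical default mount point
    if [ -f "$root/cgroup.controllers" ]; then
        echo 2
    else
        echo 1
    fi
}
```

Calling cgroup_version with no argument checks /sys/fs/cgroup. On a system that mounts v2 elsewhere, pass that mount point as the first argument.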

Naming conventions in v2. cgroup v2 uses consistent naming patterns across subsystems:

  • min / max — hard guarantees and limits

  • low / high — soft guarantees and limits

  • weight — proportional resource distribution (range: 1–10000, default: 100)

For example, memory.min is a hard memory guarantee, memory.low is a soft guarantee, and memory.high is a soft limit — all follow this pattern.

General interface differences

cgroup v1 interfaces

Interface name Purpose In-house interface Corresponding cgroup v2 interface
cgroup.procs Moves a process into the cgroup by writing its PID to this file. No cgroup.procs
cgroup.clone_children When set to 1, a child cgroup inherits the cpuset configuration of its parent. Applies only to the cpuset subsystem. No N/A
cgroup.sane_behavior An interface for an experimental v2 feature, kept for backward compatibility after the official release of v2. No N/A
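notify_on_release When set to 1, the system runs the release_agent script when the cgroup becomes empty. No cgroup.events, which provides similar functionality
release_agent Specifies the path of the script that runs when a cgroup with notify_on_release set to 1 becomes empty. Exists only in the root cgroup. No cgroup.events, which provides similar functionality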
tasks Moves a thread into the cgroup by writing its thread ID (TID) to this file. No cgroup.threads
pool_size Controls the cgroup cache pool size. In high-concurrency scenarios, this accelerates cgroup creation and binding. Depends on cgroup_rename. Not available in cgroup v2. Yes N/A

cgroup v2 interfaces

Interface name Purpose In-house interface Corresponding cgroup v1 interface
cgroup.procs Moves a process into the cgroup by writing its PID to this file. No cgroup.procs
cgroup.type Write threaded to enable thread-granularity control. Supported only for the cpu, pids, and perf_event subsystems. No N/A
cgroup.threads Moves a thread into the cgroup by writing its TID to this file. Requires cgroup.type to be set to threaded. No tasks
cgroup.controllers Lists the subsystems enabled for the current cgroup. No N/A
cgroup.subtree_control Controls which subsystems are enabled for child cgroups. The enabled subsystems must be a subset of those listed in cgroup.controllers. No N/A
cgroup.events Records whether the cgroup is managing processes and whether it is frozen. Monitor status changes using fsnotify. Does not exist in the root cgroup. No notify_on_release and release_agent (similar functionality)
cgroup.max.descendants Controls the maximum number of descendant cgroups. No N/A
cgroup.max.depth Controls the maximum depth of descendant cgroups. No N/A
cgroup.stat Shows the number of descendant cgroups and the number of descendant cgroups in a dying state (being destroyed). No N/A
cgroup.freeze Freezes or unfreezes all processes in the cgroup. Does not exist in the root cgroup. No freezer.state in the freezer subsystem
cpu.stat Shows CPU usage statistics. No N/A
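io.pressure Shows Pressure Stall Information (PSI) for I/O. Supports poll. See psi.rst and Enable the PSI feature for cgroup v1. No io.pressure under the cpuacct subsystem (after enabling PSI for cgroup v1)
memory.pressure Shows PSI for memory. Supports poll. See psi.rst and Enable the PSI feature for cgroup v1. No memory.pressure under the cpuacct subsystem (after enabling PSI for cgroup v1)
cpu.pressure Shows PSI for CPU. Supports poll. See psi.rst and Enable the PSI feature for cgroup v1. No cpu.pressure under the cpuacct subsystem (after enabling PSI for cgroup v1)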
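
The cgroup.subtree_control and cgroup.procs rows above combine into a short workflow: enable controllers for the children of a parent cgroup, then place a process in a leaf. The sketch below assumes the usual v2 syntax in which each controller name is prefixed with + when written to cgroup.subtree_control. The helper names and the directory arguments are hypothetical.

```shell
# Build the string written to cgroup.subtree_control, e.g. "+cpu +memory".
# Assumption: controllers are enabled by writing "+<name>" tokens.
subtree_control_args() {
    out=""
    for ctrl in "$@"; do
        out="${out:+$out }+$ctrl"
    done
    echo "$out"
}

# Enable the given controllers for the children of a parent cgroup.
# $1 is the parent cgroup directory; the remaining arguments are controller names.
enable_controllers() {
    parent=$1
    shift
    subtree_control_args "$@" > "$parent/cgroup.subtree_control"
}

# Move a process into a leaf cgroup by writing its PID to cgroup.procs.
# $1 is the leaf cgroup directory; $2 is the PID.
attach_pid() {
    echo "$2" > "$1/cgroup.procs"
}
```

On a real v2 mount these writes typically require root privileges, and the kernel rejects a controller name that is not listed in the parent's cgroup.controllers.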

Subsystem interface differences

CPU

In cgroup v2, the cpuacct subsystem no longer exists. Its functionality — CPU accounting and extended statistics — is now part of the cpu subsystem.

cgroup v1 interfaces

Interface name Purpose In-house interface Corresponding cgroup v2 interface
cpu.shares Controls the weight used to allocate CPU time proportionally. Default value: 1024. No cpu.weight, cpu.weight.nice (different units)
cpu.idle Sets the cgroup scheduling policy to idle. An idle group receives the minimum CPU share and is no longer guaranteed a minimum runtime, making it more likely to yield the CPU to non-idle processes. When cpu.idle is 1, cpu.shares becomes read-only and changes to 3. No cpu.idle
cpu.priority Sets a fine-grained preemptive priority. Preemption is determined at clock interrupts or wake-ups and adjusted based on the priority difference, making it easier for high-priority tasks to preempt low-priority ones. Yes cpu.priority
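cpu.cfs_quota_us The maximum CPU runtime, in microseconds, for tasks in a cgroup within one period defined by cpu.cfs_period_us (CFS bandwidth control). No cpu.max
cpu.cfs_period_us The length, in microseconds, of the period over which cpu.cfs_quota_us is enforced (CFS bandwidth control). No cpu.max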
cpu.cfs_burst_us The burst time a process can run within a cpu.cfs_period_us period. See Enable the CPU burst feature for cgroup v1. No cpu.max.burst
cpu.cfs_init_buffer_us The burst time a process can run at startup. Yes cpu.max.init_buffer
cpu.stat Shows statistics related to CFS bandwidth control, such as the number of periods elapsed and the number of throttling events. No cpu.stat
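cpu.rt_runtime_us Real-time (RT) task bandwidth control. The maximum time that processes in the group can run within one period defined by cpu.rt_period_us. No N/A
cpu.rt_period_us Real-time (RT) task bandwidth control. The length of the period over which cpu.rt_runtime_us is enforced. No N/A
cpu.bvt_warp_ns Controls the group identity attribute to differentiate online and offline processes, providing better CPU quality of service (QoS) for online processes. See Group identity feature. Yes cpu.bvt_warp_ns
cpu.identity Controls the group identity attribute to differentiate online and offline processes, providing better CPU QoS for online processes. See Group identity feature. Yes cpu.identity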
cpu.ht_stable Controls whether to generate noise on the simultaneous multithreading (SMT) sibling to stabilize SMT computing power. Yes N/A
cpu.ht_ratio Controls whether extra quota is calculated when an SMT sibling is idle, used to stabilize SMT computing power. Yes cpu.ht_ratio
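
The cpu.shares row notes that cpu.weight uses different units. Judging from the defaults on this page (1024 for cpu.shares, 100 for cpu.weight), a linear scale in which 1024 shares equal a weight of 100 is one reasonable way to translate values. The sketch below assumes that scale and clamps the result to the weight range of 1 to 10000. The function name is hypothetical, and other tools may use a different mapping.

```shell
# Convert a v1 cpu.shares value to an approximate v2 cpu.weight.
# Assumption: linear scale where 1024 shares == weight 100, rounded to nearest.
shares_to_weight() {
    weight=$(( ($1 * 100 + 512) / 1024 ))
    # Clamp to the writable cpu.weight range.
    if [ "$weight" -lt 1 ]; then
        weight=1
    fi
    if [ "$weight" -gt 10000 ]; then
        weight=10000
    fi
    echo "$weight"
}
```

With this scale, shares_to_weight 1024 prints 100 and shares_to_weight 2048 prints 200. Very small share values collapse to the minimum weight of 1.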

cgroup v2 interfaces

Interface name Purpose In-house interface Corresponding cgroup v1 interface
cpu.weight Controls the weight used to allocate CPU time proportionally. Default value: 100. No cpu.shares (different units)
cpu.weight.nice Controls the weight used to allocate CPU time proportionally, expressed as a nice value. Default value: 0. No cpu.shares (different units)
cpu.idle Sets the cgroup scheduling policy to idle. An idle group receives the minimum CPU share and yields the CPU to non-idle processes. When cpu.idle is 1, cpu.weight and cpu.weight.nice become read-only and are set to the minimum weight (0.3). Due to rounding, reading cpu.weight returns 0. No cpu.idle
cpu.priority Sets a fine-grained preemptive priority. Preemption is determined at clock interrupts or wake-ups and scaled based on the priority difference. Yes cpu.priority
cpu.max CFS bandwidth control. Contains two values: quota and period. Within the period, processes in the group can run for a maximum of quota time. No cpu.cfs_quota_us, cpu.cfs_period_us
cpu.max.burst The burst time a process can run within the period defined by cpu.max. No cpu.cfs_burst_us
cpu.max.init_buffer The burst time a process can run at startup. Yes cpu.cfs_init_buffer_us
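cpu.bvt_warp_ns Controls the group identity attribute to differentiate online and offline processes, providing better CPU QoS for online processes. See Group identity feature. Yes cpu.bvt_warp_ns
cpu.identity Controls the group identity attribute to differentiate online and offline processes, providing better CPU QoS for online processes. See Group identity feature. Yes cpu.identity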
cpu.sched_cfs_statistics Provides CFS-related statistics, such as run time and time spent waiting for sibling or non-sibling cgroups. Requires kernel.sched_schedstats to be enabled. Yes cpuacct.sched_cfs_statistics
cpu.wait_latency The latency distribution of processes waiting in the run queue. Requires kernel.sched_schedstats and /proc/cpusli/sched_lat_enabled to be enabled. Yes cpuacct.wait_latency
cpu.cgroup_wait_latency The latency distribution of process groups waiting in the run queue. Tracks the group sched_entity, while cpu.wait_latency tracks the task sched_entity. Requires kernel.sched_schedstats and /proc/cpusli/sched_lat_enabled to be enabled. Yes cpuacct.cgroup_wait_latency
cpu.block_latency The latency distribution of processes blocked for non-I/O reasons. Requires kernel.sched_schedstats and /proc/cpusli/sched_lat_enabled to be enabled. Yes cpuacct.block_latency
cpu.ioblock_latency The latency distribution of processes blocked for I/O reasons. Requires kernel.sched_schedstats and /proc/cpusli/sched_lat_enabled to be enabled. Yes cpuacct.ioblock_latency
cpu.ht_ratio Controls whether extra quota is calculated when an SMT sibling is idle, used to stabilize SMT computing power. Takes effect only when core scheduling is enabled. Yes cpu.ht_ratio
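
As the cpu.max row describes, the two v1 files cpu.cfs_quota_us and cpu.cfs_period_us collapse into one v2 file that holds a quota and a period. The sketch below formats that pair. It assumes that a negative v1 quota means no limit, that v2 spells this as the literal max, and that 100000 microseconds is a sensible default period. The function name is illustrative.

```shell
# Format v1 CFS bandwidth settings as a cpu.max value ("<quota> <period>").
# $1: cpu.cfs_quota_us value; a negative value is treated as "no limit".
# $2: cpu.cfs_period_us value; defaults to 100000 (an assumption).
cfs_to_cpu_max() {
    quota=$1
    period=${2:-100000}
    if [ "$quota" -lt 0 ]; then
        quota=max
    fi
    echo "$quota $period"
}
```

For example, cfs_to_cpu_max 50000 100000 prints "50000 100000" (half a CPU), and cfs_to_cpu_max -1 100000 prints "max 100000". The output can be redirected to the cpu.max file of the target cgroup.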

cpuset

cgroup v1 interfaces

Interface name Purpose In-house interface Corresponding cgroup v2 interface
cpuset.cpus Controls the CPUs on which tasks can run. Tasks cannot be attached to a cgroup when this interface is empty. No cpuset.cpus
cpuset.mems Controls the non-uniform memory access (NUMA) nodes that can be allocated to tasks in a cgroup. Tasks cannot be attached to a cgroup when this interface is empty. No cpuset.mems
cpuset.effective_cpus Queries the effective CPUs on which tasks are running. Affected by CPU hotplug events. No cpuset.cpus.effective
cpuset.effective_mems Queries the effective NUMA nodes allocated to running tasks. Affected by memory node hotplug events. No cpuset.mems.effective
cpuset.cpu_exclusive Controls which CPUs are used exclusively by this cgroup and cannot be used by other cpusets at the same level. No cpuset.cpus.partition (similar functionality)
cpuset.mem_exclusive Controls which NUMA nodes are used exclusively by this cgroup and cannot be used by other cpusets at the same level. No N/A
cpuset.mem_hardwall When set to 1, tasks can only allocate memory from the memory nodes attached to the cpuset. No N/A
cpuset.sched_load_balance Controls whether CPUs are load-balanced within the cpuset. Enabled by default. No N/A
cpuset.sched_relax_domain_level Controls the search range for CPUs when the scheduler migrates tasks to balance load. Default value: -1. Values: -1 (default system policy), 0 (no search), 1 (hyperthreads in same core), 2 (cores in same package), 3 (CPUs on same node), 4 (CPUs on same chunk), 5 (entire system). No N/A
cpuset.memory_migrate When set to a non-zero value, if a task is allocated a memory page in a cpuset and then migrated to another cpuset, the memory page migrates to the new cpuset as well. No N/A
cpuset.memory_pressure Calculates the memory paging pressure of the current cpuset. No N/A
cpuset.memory_spread_page When set to 1, the kernel distributes the page cache evenly across the cpuset's memory nodes. No N/A
cpuset.memory_spread_slab When set to 1, the kernel distributes slab caches evenly across the cpuset's memory nodes. No N/A
cpuset.memory_pressure_enabled When set to 1, enables memory pressure statistics collection for the cpuset. No N/A

cgroup v2 interfaces

Interface name Purpose In-house interface Corresponding cgroup v1 interface
cpuset.cpus Controls the CPUs on which tasks can run. When empty, the CPUs of the parent cpuset are used. No cpuset.cpus
cpuset.mems Controls the NUMA nodes that can be allocated to tasks in a cgroup. When empty, the NUMA nodes of the parent cpuset are used. No cpuset.mems
cpuset.cpus.effective Queries the effective CPUs on which tasks are running. Affected by CPU hotplug events. No cpuset.effective_cpus
cpuset.mems.effective Queries the effective NUMA nodes allocated to running tasks. Affected by memory node hotplug events. No cpuset.effective_mems
cpuset.cpus.partition Controls whether the CPUs of a cpuset are used exclusively. Write root to enable exclusive use. No cpuset.cpu_exclusive (similar functionality)
.__DEBUG__.cpuset.cpus.subpartitions Queries which CPUs are used exclusively when root is written to cpuset.cpus.partition. Available only when the cgroup_debug feature is enabled in the kernel cmdline. No N/A
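
Scripts that read cpuset.cpus.effective (or cpuset.effective_cpus in v1) often need the number of CPUs rather than the raw list. The sketch below counts them, assuming the value is a comma-separated list of single CPUs and inclusive ranges such as 0-3,8. The function name is hypothetical.

```shell
# Count the CPUs in a cpuset list such as "0-3,8".
# Assumption: entries are comma-separated; ranges are inclusive "start-end".
count_cpus() {
    list=$1
    total=0
    old_ifs=$IFS
    IFS=,
    for range in $list; do
        case $range in
            *-*)
                start=${range%-*}
                end=${range#*-}
                total=$(( total + end - start + 1 ))
                ;;
            *)
                total=$(( total + 1 ))
                ;;
        esac
    done
    IFS=$old_ifs
    echo "$total"
}
```

For example, count_cpus "0-3,8" prints 5. An empty list prints 0, which in v2 corresponds to a cpuset.cpus file that has not been set.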

blkio

cgroup v1 interfaces

Interface name Purpose In-house interface Corresponding cgroup v2 interface
blkio.throttle.read_bps_device Specifies the maximum read throughput (bytes/second) for a device. Format: echo "<major>:<minor> <bps>" > /sys/fs/cgroup/blkio/<cgroup>/blkio.throttle.read_bps_device No io.max
blkio.throttle.write_bps_device Specifies the maximum write throughput (bytes/second) for a device. Format: echo "<major>:<minor> <bps>" > /sys/fs/cgroup/blkio/<cgroup>/blkio.throttle.write_bps_device No io.max
blkio.throttle.read_iops_device Specifies the maximum read IOPS for a device. Format: echo "<major>:<minor> <iops>" > /sys/fs/cgroup/blkio/<cgroup>/blkio.throttle.read_iops_device No io.max
blkio.throttle.write_iops_device Specifies the maximum write IOPS for a device. Format: echo "<major>:<minor> <iops>" > /sys/fs/cgroup/blkio/<cgroup>/blkio.throttle.write_iops_device No io.max
blkio.throttle.io_service_bytes Queries bandwidth statistics (read, write, sync, async, discard, and total) across all devices. Unit: bytes. No io.stat
blkio.throttle.io_service_bytes_recursive Recursive version of blkio.throttle.io_service_bytes. Includes statistics from descendant cgroups. No N/A
blkio.throttle.io_serviced Queries IOPS statistics (read, write, sync, async, discard, and total) across all devices. No io.stat
blkio.throttle.io_serviced_recursive Recursive version of blkio.throttle.io_serviced. Includes statistics from descendant cgroups. No N/A
blkio.throttle.io_service_time Queries the duration between request dispatch and completion for I/O operations. Used to calculate the average I/O latency. See Enhance the monitoring of block I/O throttling. Yes io.extstat
blkio.throttle.io_wait_time Queries how long I/O operations wait in scheduler queues. Used to calculate the average I/O latency. See Enhance the monitoring of block I/O throttling. Yes io.extstat
blkio.throttle.io_completed Queries the number of completed I/O operations. Used to calculate the average I/O latency. See Enhance the monitoring of block I/O throttling. Yes io.extstat
blkio.throttle.total_bytes_queued Queries the number of throttled I/O bytes, used to determine whether I/O latency is throttling-related. See Enhance the monitoring of block I/O throttling. Yes io.extstat
blkio.throttle.total_io_queued Queries the number of throttled I/O operations, used to determine whether I/O latency is throttling-related. See Enhance the monitoring of block I/O throttling. Yes io.extstat
blkio.cost.model Specifies the blk-iocost cost model. The control mode (ctrl) can be auto or user. Exists only in the root cgroup. Format: echo "<major>:<minor> ctrl=user model=linear rbps=<rbps> rseqiops=<rseqiops> rrandiops=<rrandiops> wbps=<wbps> wseqiops=<wseqiops> wrandiops=<wrandiops>" > /sys/fs/cgroup/blkio/blkio.cost.model. See Configure the blk-iocost weight-based throttling feature. Yes io.cost.model
blkio.cost.qos Controls the blk-iocost feature and configures a QoS policy to detect disk congestion. Exists only in the root cgroup. Format: echo "<major>:<minor> enable=1 ctrl=user rpct= rlat=5000 wpct=95.00 wlat=5000 min=50.00 max=150.00" > /sys/fs/cgroup/blkio/blkio.cost.qos. See Configure the blk-iocost weight-based throttling feature. Yes io.cost.qos
blkio.cost.weight Specifies the cgroup weight for blk-iocost. Exists only in non-root cgroups. Modes: weight (same weight for all devices) or major:minor weight (per-device weight). See Configure the blk-iocost weight-based throttling feature. Yes io.cost.weight
blkio.cost.stat Queries blk-iocost statistics. Exists only in non-root cgroups. Yes N/A

cgroup v2 interfaces

Interface name Purpose In-house interface Corresponding cgroup v1 interface
io.max Throttling interface. Specifies read and write limits in bytes/second and IOPS. Format: echo "<major>:<minor> rbps=<bps> wbps=<bps> riops=<iops> wiops=<iops>" > /sys/fs/cgroup/<cgroup>/io.max No blkio.throttle.read_bps_device, blkio.throttle.read_iops_device, blkio.throttle.write_bps_device, blkio.throttle.write_iops_device
io.stat Queries per-device I/O statistics: the number of bytes and the number of I/O operations read, written, and discarded. No blkio.throttle.io_service_bytes, blkio.throttle.io_serviced
io.extstat Queries extended I/O statistics, including wait time, service time, number of completed I/O operations, and throttling rates. No blkio.throttle.io_service_time, blkio.throttle.io_wait_time, blkio.throttle.io_completed, blkio.throttle.total_bytes_queued, blkio.throttle.total_io_queued
io.cost.model Specifies the blk-iocost cost model. The control mode (ctrl) can be auto or user. Exists only in the root cgroup. Format: echo "<major>:<minor> ctrl=user model=linear rbps=<rbps> rseqiops=<rseqiops> rrandiops=<rrandiops> wbps=<wbps> wseqiops=<wseqiops> wrandiops=<wrandiops>" > /sys/fs/cgroup/io.cost.model. See Configure the blk-iocost weight-based throttling feature. No blkio.cost.model
io.cost.qos Controls the blk-iocost feature and configures a QoS policy to detect disk congestion. Exists only in the root cgroup. Format: echo "<major>:<minor> enable=1 ctrl=user rpct= rlat=5000 wpct=95.00 wlat=5000 min=50.00 max=150.00" > /sys/fs/cgroup/io.cost.qos. See Configure the blk-iocost weight-based throttling feature. No blkio.cost.qos
io.cost.weight Specifies the cgroup weight for blk-iocost. Exists only in non-root cgroups. Modes: weight (same weight for all devices) or major:minor weight (per-device weight). See Configure the blk-iocost weight-based throttling feature. No blkio.cost.weight
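
The four blkio.throttle.*_device files in v1 map to a single io.max line in v2, using the format shown in the io.max row. The sketch below builds that line from the four v1 values. It assumes that an unset limit can be written as the literal max. The function name and argument order are illustrative.

```shell
# Build an io.max line from the four v1 throttle values.
# $1: device as "<major>:<minor>"
# $2: read bytes/second   $3: write bytes/second
# $4: read IOPS           $5: write IOPS
# An empty or missing value is written as "max" (assumed to mean unlimited).
io_max_line() {
    dev=$1
    echo "$dev rbps=${2:-max} wbps=${3:-max} riops=${4:-max} wiops=${5:-max}"
}
```

For example, io_max_line 8:16 1048576 "" 100 prints "8:16 rbps=1048576 wbps=max riops=100 wiops=max". The output can be redirected to /sys/fs/cgroup/<cgroup>/io.max.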

memory

cgroup v1 interfaces

Interface name Purpose In-house interface Corresponding cgroup v2 interface
memory.usage_in_bytes Queries the current memory usage. No N/A
memory.max_usage_in_bytes Queries the peak memory usage. No N/A
memory.limit_in_bytes Sets a hard upper limit on memory usage. No memory.max
memory.soft_limit_in_bytes Sets a soft limit on memory usage. No N/A
memory.failcnt Queries the number of times memory usage reached the upper limit. No N/A
memory.mglru_batch_size Specifies the batch size for proactive memory reclamation based on the Multi-Generational Least Recently Used (MGLRU) framework. The kernel attempts to release CPUs between reclamation batches. Yes N/A
memory.mglru_reclaim_kbytes Specifies the amount of memory to reclaim proactively based on the MGLRU framework. Yes N/A
memory.wmark_ratio Controls the memcg backend asynchronous reclaim feature by setting the memory watermark that triggers asynchronous reclamation. Unit: percentage of the memcg memory limit. Valid values: 0–100. When 0 (default), asynchronous reclaim is disabled. See Memcg backend asynchronous reclaim. Yes memory.wmark_ratio
memory.wmark_high Read-only. When memory usage exceeds this value, backend asynchronous reclamation starts. Calculated as: memory.limit_in_bytes × memory.wmark_ratio / 100. Not stored in the memcg root directory. See Memcg backend asynchronous reclaim. Yes memory.wmark_high
memory.wmark_low Read-only. When memory usage falls below this value, backend asynchronous reclamation ends. Calculated as: memory.wmark_high - memory.limit_in_bytes × memory.wmark_scale_factor / 10000. Not stored in the memcg root directory. See Memcg backend asynchronous reclaim. Yes memory.wmark_low
memory.wmark_scale_factor Specifies the interval between memory.wmark_high and memory.wmark_low. Unit: 0.01% of the memcg memory limit. Valid values: 1–1000. Default: 50 (0.50% of the limit), inherited from the parent group. Not stored in the memcg root directory. See Memcg backend asynchronous reclaim. Yes memory.wmark_scale_factor
memory.wmark_min_adj Adjusts the global minimum watermark for this memcg. Valid values: -25–50. Default: 0, inherited from the parent cgroup. Negative values adjust within the [0, WMARK_MIN] range; positive values adjust within the [WMARK_MIN, WMARK_LOW] range. When the adjusted minimum watermark is exceeded, throttling occurs; throttling time is linearly proportional to excess memory usage (1–1000 ms). See Memcg global minimum watermark rating. Yes memory.wmark_min_adj
memory.force_empty Forcefully reclaims memory pages from the cgroup. No N/A
memory.use_hierarchy Controls whether hierarchical memory accounting is enabled. Yes N/A
memory.swappiness Sets the swappiness parameter for vmscan, which controls how aggressively the kernel uses the swap partition. No N/A
memory.priority Sets the memcg out-of-memory (OOM) priority. Valid values: 0–12. A higher value indicates higher priority. Default: 0. Not inherited by descendant cgroups. Priority sorting applies only to sibling cgroups within the same parent; sibling memcgs with the same priority are sorted by memory usage, with the largest consumer triggering an OOM error first. See Memcg OOM priority policy. Yes memory.priority
memory.move_charge_at_immigrate Controls whether a task's memory charges move with it when migrated between cgroups. No N/A
memory.oom_control Controls whether the OOM killer terminates tasks when an OOM error occurs, and generates OOM status notifications. No N/A
memory.oom.group Enables the OOM group feature, which terminates all tasks in a memcg when an OOM error occurs. Yes memory.oom.group
memory.pressure_level Configures memory pressure notifications. No N/A
memory.kmem.limit_in_bytes Sets a hard limit on kernel memory usage. No N/A
memory.kmem.usage_in_bytes Queries current kernel memory usage. No N/A
memory.kmem.failcnt Queries the number of times kernel memory usage reached the upper limit. No N/A
memory.kmem.max_usage_in_bytes Queries the peak kernel memory usage. No N/A
memory.kmem.slabinfo Queries kernel slab memory usage. No N/A
memory.kmem.tcp.limit_in_bytes Sets a hard limit on kernel TCP memory usage. No N/A
memory.kmem.tcp.usage_in_bytes Queries current kernel TCP memory usage. No N/A
memory.kmem.tcp.failcnt Queries the number of times kernel TCP memory usage reached the upper limit. No N/A
memory.kmem.tcp.max_usage_in_bytes Queries the peak kernel TCP memory usage. No N/A
memory.memsw.usage_in_bytes Queries combined memory and swap usage. No N/A
memory.memsw.max_usage_in_bytes Queries the peak combined memory and swap usage. No N/A
memory.memsw.limit_in_bytes Sets an upper limit on the total memory and swap usage of tasks in the cgroup. No memory.swap.max (limits swap only)
memory.memsw.failcnt Queries the number of times the combined memory and swap usage reached the upper limit. No N/A
memory.swap.high Sets an upper limit on swap usage in a cgroup. Yes memory.swap.high
memory.swap.events Queries events that occurred when swap usage reached the upper limit. Yes memory.swap.events
memory.min Sets a minimum memory amount that the cgroup must retain (hard guarantee). See Memcg QoS feature of the cgroup v1 interface. Yes memory.min
memory.low Sets a minimum memory amount that the cgroup should retain (soft guarantee). See Memcg QoS feature of the cgroup v1 interface. Yes memory.low
memory.high Sets a throttle limit on memory usage. See Memcg QoS feature of the cgroup v1 interface. Yes memory.high
memory.allow_duptext Controls the code duptext feature for tasks in a specific memcg, when the global code duptext feature is enabled via /sys/kernel/mm/duptext/enabled. Valid values: 0 (disabled, default) or 1 (enabled). See Code duptext feature. Yes memory.allow_duptext
memory.allow_duptext_refresh Controls whether the code duptext feature starts immediately when a binary file is generated or downloaded. Uses an asynchronous task mode to refresh tasks when code duptext does not take effect in PageDirty or PageWriteback scenarios. Yes memory.allow_duptext_refresh
memory.duptext_nodes Limits the memory nodes used for duptext memory allocation. Yes memory.duptext_nodes
memory.allow_text_unevictable Controls whether the memcg code snippet is locked (unevictable). Yes memory.allow_text_unevictable
memory.text_unevictable_percent Specifies the ratio of locked memcg code snippet memory to total memcg code memory. Yes memory.text_unevictable_percent
memory.thp_reclaim Controls the Transparent Huge Pages (THP) reclaim feature. Valid values: reclaim (enabled), swap (reserved for future use), disable (disabled, default). See THP reclaim. Yes memory.thp_reclaim
memory.thp_reclaim_stat Queries THP reclaim status. Reports per-NUMA-node counts for queue_length (THPs in the reclaim queue), split_hugepage (THPs split by reclaim), and reclaim_subpage (zero subpages reclaimed). See THP reclaim. Yes memory.thp_reclaim_stat
memory.thp_reclaim_ctrl Configures how THP reclaim is triggered. Parameters: threshold (maximum zero subpages in a THP before reclaim is triggered; default: 16) and reclaim (triggers reclaim). See THP reclaim. Yes memory.thp_reclaim_ctrl
memory.thp_control Controls the memcg THP feature. Prevents use of anonymous, shmem, and file THPs, reducing THP contention and memory waste for offline memcgs. Yes memory.thp_control
memory.reclaim_caches Controls whether the kernel proactively reclaims cache in memcgs. Example: echo 100M > memory.reclaim_caches. Yes memory.reclaim_caches
memory.pgtable_bind Controls whether page table memory is forcefully allocated on the current node. Yes memory.pgtable_bind
memory.pgtable_misplaced Queries statistics about page table memory allocated across NUMA nodes. Yes memory.pgtable_misplaced
memory.oom_offline Marks a memcg as belonging to an offline task in the Quick OOM feature. Yes memory.oom_offline
memory.async_fork Controls the Async-fork feature (formerly fast convergent merging, or FCM) for memcgs. Yes memory.async_fork
memory.direct_compact_latency Specifies the latency threshold for direct memory compaction in the memsli feature. Yes memory.direct_compact_latency
memory.direct_reclaim_global_latency Specifies the latency threshold for direct global memory reclamation in the memsli feature. Yes memory.direct_reclaim_global_latency
memory.direct_reclaim_memcg_latency Specifies the latency threshold for direct memcg memory reclamation in the memsli feature. Yes memory.direct_reclaim_memcg_latency
memory.direct_swapin_latency Specifies the latency threshold for direct memory swap-in in the memsli feature. Yes memory.direct_swapin_latency
memory.direct_swapout_global_latency Specifies the latency threshold for direct global memory swap-out in the memsli feature. Yes memory.direct_swapout_global_latency
memory.direct_swapout_memcg_latency Specifies the latency threshold for direct memcg memory swap-out in the memsli feature. Yes memory.direct_swapout_memcg_latency
memory.exstat Queries extended memory statistics for in-house features: wmark_min_throttled_ms (throttling time since the adjusted minimum watermark was exceeded), wmark_reclaim_work_ms (duration of kernel memory reclamation attempts), unevictable_text_size_kb (size of locked code snippets), and pagecache_limit_reclaimed_kb (page cache limit reclaimed). See Memcg Exstat feature. Yes memory.exstat
memory.idle_page_stats Queries kidled memory usage statistics for a memcg and its cgroup hierarchy. Yes memory.idle_page_stats
memory.idle_page_stats.local Queries kidled memory usage statistics for a memcg only (non-hierarchical). Yes memory.idle_page_stats.local
memory.numa_stat Queries NUMA statistics for anonymous, file, and locked memory. No memory.numa_stat
memory.pagecache_limit.enable Controls the Page Cache Limit feature. See Page Cache Limit feature. Yes memory.pagecache_limit.enable
memory.pagecache_limit.size Specifies the page cache size limit. See Page Cache Limit feature. Yes memory.pagecache_limit.size
memory.pagecache_limit.sync Specifies the Page Cache Limit mode: synchronous or asynchronous. See Page Cache Limit feature. Yes memory.pagecache_limit.sync
memory.reap_background Controls whether zombie memcg reapers reclaim memcg memory asynchronously in the background. Yes memory.reap_background
memory.stat Queries memory statistics. No memory.stat
memory.use_priority_oom Controls the memcg OOM priority policy feature. See Memcg OOM priority policy. Yes memory.use_priority_oom
memory.use_priority_swap Controls whether memory is swapped based on cgroup priorities. See Memcg OOM priority policy. Yes memory.use_priority_swap
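
The memory.wmark_high and memory.wmark_low rows give their formulas inline. The sketch below evaluates them with integer arithmetic so the resulting watermarks can be checked before memory.wmark_ratio is changed. The function names are hypothetical, and the kernel's own rounding may differ from the truncating division used here.

```shell
# wmark_high = limit * wmark_ratio / 100
# $1: memory limit in bytes   $2: memory.wmark_ratio (0-100)
wmark_high() {
    echo $(( $1 * $2 / 100 ))
}

# wmark_low = wmark_high - limit * wmark_scale_factor / 10000
# $1: memory limit in bytes   $2: memory.wmark_ratio   $3: memory.wmark_scale_factor
wmark_low() {
    high=$(( $1 * $2 / 100 ))
    echo $(( high - $1 * $3 / 10000 ))
}
```

With a limit of 1000000 bytes, a ratio of 80, and the default scale factor of 50, wmark_high prints 800000 and wmark_low prints 795000. The gap of 5000 bytes is 0.50% of the limit, matching the memory.wmark_scale_factor description.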

cgroup v2 interfaces

Interface name Purpose In-house interface Corresponding cgroup v1 interface
memory.current Queries the current memory usage. No N/A
memory.min Sets a minimum memory amount that the cgroup must retain (hard guarantee). See Memcg QoS feature of the cgroup v1 interface. No memory.min
memory.low Sets a minimum memory amount that the cgroup should retain (soft guarantee). See Memcg QoS feature of the cgroup v1 interface. No memory.low
memory.high Sets an upper throttle limit on memory usage. See Memcg QoS feature of the cgroup v1 interface. No memory.high
memory.max Sets a hard limit on memory usage. No memory.limit_in_bytes
memory.swap.current Queries current swap memory in use. No N/A
memory.swap.high Sets an upper limit on swap usage in a cgroup. No memory.swap.high
memory.swap.max Sets a hard limit on swap memory. No memory.memsw.limit_in_bytes
memory.swap.events Queries events that occurred when swap usage reached the upper limit. No memory.swap.events
memory.oom.group Controls the OOM group feature, which terminates all tasks in a memcg when an OOM error occurs. No memory.oom.group
memory.wmark_ratio Controls the memcg backend asynchronous reclaim feature. Unit: percentage of the memcg memory limit. Valid values: 0–100. Default: 0 (disabled). See Memcg backend asynchronous reclaim. Yes memory.wmark_ratio
memory.wmark_high Read-only. When memory usage exceeds this value, backend asynchronous reclamation starts. Calculated as: memory.limit_in_bytes × memory.wmark_ratio / 100. Not stored in the memcg root directory. See Memcg backend asynchronous reclaim. Yes memory.wmark_high
memory.wmark_low Read-only. When memory usage falls below this value, backend asynchronous reclamation ends. Calculated as: memory.wmark_high - memory.limit_in_bytes × memory.wmark_scale_factor / 10000. Not stored in the memcg root directory. See Memcg backend asynchronous reclaim. Yes memory.wmark_low
memory.wmark_scale_factor Specifies the interval between memory.wmark_high and memory.wmark_low. Unit: 0.01% of the memcg memory limit. Valid values: 1–1000. Default: 50 (0.50% of the limit), inherited from the parent group. Not stored in the memcg root directory. See Memcg backend asynchronous reclaim. Yes memory.wmark_scale_factor
memory.wmark_min_adj Adjusts the global minimum watermark for this memcg. Valid values: -25–50. Default: 0, inherited from the parent cgroup. See Memcg global minimum watermark rating. Yes memory.wmark_min_adj
memory.priority Sets the memcg OOM priority. Valid values: 0–12. A higher value indicates higher priority. Default: 0. Not inherited by descendant cgroups. See Memcg OOM priority policy. Yes memory.priority
memory.use_priority_oom Controls the memcg OOM priority policy feature. See Memcg OOM priority policy. Yes memory.use_priority_oom
memory.use_priority_swap Controls whether memory is swapped based on cgroup priorities. See Memcg OOM priority policy. Yes memory.use_priority_swap
memory.direct_reclaim_global_latency Specifies the latency threshold for direct global memory reclamation in the memsli feature. Yes memory.direct_reclaim_global_latency
memory.direct_reclaim_memcg_latency Specifies the latency threshold for direct memcg memory reclamation in the memsli feature. Yes memory.direct_reclaim_memcg_latency
memory.direct_compact_latency Specifies the latency threshold for direct memory compaction in the memsli feature. Yes memory.direct_compact_latency
memory.direct_swapout_global_latency Specifies the latency threshold for direct global memory swap-out in the memsli feature. Yes memory.direct_swapout_global_latency
memory.direct_swapout_memcg_latency Specifies the latency threshold for direct memcg memory swap-out in the memsli feature. Yes memory.direct_swapout_memcg_latency
memory.direct_swapin_latency Specifies the latency threshold for direct memory swap-in in the memsli feature. Yes memory.direct_swapin_latency
memory.exstat Queries extended memory statistics for in-house features: wmark_min_throttled_ms, wmark_reclaim_work_ms, unevictable_text_size_kb, and pagecache_limit_reclaimed_kb. See Memcg Exstat. Yes memory.exstat
memory.pagecache_limit.enable Controls the Page Cache Limit feature. See Page Cache Limit feature. Yes memory.pagecache_limit.enable
memory.pagecache_limit.size Specifies the page cache size limit. See Page Cache Limit feature. Yes memory.pagecache_limit.size
memory.pagecache_limit.sync Specifies the Page Cache Limit mode: synchronous or asynchronous. See Page Cache Limit feature. Yes memory.pagecache_limit.sync
memory.idle_page_stats Queries kidled memory usage statistics for individual memcgs at each hierarchy level. Yes memory.idle_page_stats
memory.idle_page_stats.local Queries kidled memory usage statistics for individual memcgs (non-hierarchical). Yes memory.idle_page_stats.local
memory.numa_stat Queries NUMA statistics for anonymous, file, and locked memory. Yes memory.numa_stat
memory.reap_background Controls whether zombie memcg reapers reclaim memcg memory asynchronously in the background. Yes memory.reap_background
memory.stat Queries memory statistics. No memory.stat
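
The formulas given above for memory.wmark_high and memory.wmark_low can be checked with a short calculation. The following Python sketch is illustrative only: the function name and the sample limit are assumptions, and the kernel may round differently from the integer division used here.

```python
from typing import Optional, Tuple


def memcg_watermarks(limit_in_bytes: int, wmark_ratio: int,
                     wmark_scale_factor: int = 50) -> Optional[Tuple[int, int]]:
    """Return (wmark_high, wmark_low) in bytes, or None when reclaim is disabled.

    Hypothetical helper that applies the formulas from the table;
    integer division is an assumption, not confirmed kernel behavior.
    """
    if not 0 <= wmark_ratio <= 100:
        raise ValueError("wmark_ratio must be in the range 0-100")
    if not 1 <= wmark_scale_factor <= 1000:
        raise ValueError("wmark_scale_factor must be in the range 1-1000")
    if wmark_ratio == 0:
        # A ratio of 0 means the feature is disabled, so no watermarks apply.
        return None
    wmark_high = limit_in_bytes * wmark_ratio // 100
    wmark_low = wmark_high - limit_in_bytes * wmark_scale_factor // 10000
    return wmark_high, wmark_low


# Sample values: a 1,000,000-byte limit, ratio 80, default scale factor 50.
print(memcg_watermarks(1_000_000, 80))
```

With these sample values, reclaim would start above 800,000 bytes and stop below 795,000 bytes, which shows that memory.wmark_scale_factor sets the width of the gap between the two watermarks.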

cpuacct

The cpuacct subsystem exists only in cgroup v1. In cgroup v2, CPU accounting is handled by the cpu subsystem.
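
As a rough illustration of reading CPU accounting data from the cpu subsystem in cgroup v2, the following Python sketch parses the flat key-value format of cpu.stat and converts the total usage from microseconds to nanoseconds, the unit used by cpuacct.usage. The function name and the sample text are assumptions made for this example. The keys usage_usec, user_usec, and system_usec follow the upstream kernel documentation.

```python
def parse_cpu_stat(text: str) -> dict:
    """Parse 'key value' lines from a cpu.stat file into a dict of integers.

    Hypothetical helper for illustration only.
    """
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            stats[fields[0]] = int(fields[1])
    return stats


# Made-up sample content; real values come from reading a cgroup's cpu.stat file.
SAMPLE_CPU_STAT = """usage_usec 1500000
user_usec 1000000
system_usec 500000
"""

stats = parse_cpu_stat(SAMPLE_CPU_STAT)
# cpu.stat reports microseconds; cpuacct.usage reports nanoseconds.
usage_ns = stats["usage_usec"] * 1000
print(usage_ns)
```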

Interface name Purpose In-house interface Corresponding cgroup v2 interface
cpuacct.usage Queries the total CPU time used. Unit: nanoseconds. No cpu.stat (similar data)
cpuacct.usage_user Queries the CPU time used in user mode. Unit: nanoseconds. No cpu.stat (similar data)
cpuacct.usage_sys Queries the CPU time used in kernel mode. Unit: nanoseconds. No cpu.stat (similar data)
cpuacct.usage_percpu Queries the CPU time used per CPU. Unit: nanoseconds. No cpu.stat (similar data)
cpuacct.usage_percpu_user Queries the per-CPU time used in user mode. Unit: nanoseconds. No cpu.stat (similar data)
cpuacct.usage_percpu_sys Queries the per-CPU time used in kernel mode. Unit: nanoseconds. No cpu.stat (similar data)
cpuacct.usage_all Queries a summary of cpuacct.usage_percpu_user and cpuacct.usage_percpu_sys. Unit: nanoseconds. No cpu.stat (similar data)
cpuacct.stat Queries CPU time used in user mode and kernel mode. Unit: ticks. No cpu.stat (similar data)
cpuacct.proc_stat Queries container-level data including CPU time, average load, and running task count. Yes cpu.stat (similar data)
cpuacct.enable_sli Controls whether container-level load average counting is enabled. Yes N/A
cpuacct.sched_cfs_statistics Queries CFS statistics, including cgroup runtime and wait time relative to sibling and non-sibling cgroups. Yes cpu.sched_cfs_statistics
cpuacct.wait_latency Queries the latency of tasks waiting in the run queue. Yes cpu.wait_latency
cpuacct.cgroup_wait_latency Queries the latency of cgroups waiting in the run queue. This interface tracks the group sched_entity, whereas cpuacct.wait_latency tracks the task sched_entity. Yes cpu.cgroup_wait_latency
cpuacct.block_latency Queries the latency of tasks blocked for non-I/O reasons. Yes cpu.block_latency
cpuacct.ioblock_latency Queries the latency of tasks blocked due to I/O operations. Yes cpu.ioblock_latency
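io.pressure Queries pressure stall information (PSI) for I/O. Supports polling. See psi.rst and Enable the PSI feature for cgroup v1. No N/A
memory.pressure Queries PSI for memory. Supports polling. See psi.rst and Enable the PSI feature for cgroup v1. No N/A
cpu.pressure Queries PSI for CPU. Supports polling. See psi.rst and Enable the PSI feature for cgroup v1. No N/A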
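
The pressure files listed above share one text format, described in psi.rst: a "some" line and usually a "full" line, each with avg10, avg60, avg300, and total fields. The following Python sketch shows one possible way to parse that format. The function name and the sample text are assumptions for illustration.

```python
def parse_pressure(text: str) -> dict:
    """Parse the content of a PSI pressure file into nested dicts.

    Hypothetical helper: averages become floats (percentages),
    and 'total' becomes an int (stall time in microseconds).
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        kind, pairs = fields[0], fields[1:]
        values = {}
        for pair in pairs:
            key, _, value = pair.partition("=")
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


# Made-up sample content in the documented format.
SAMPLE_PRESSURE = """some avg10=1.50 avg60=0.75 avg300=0.10 total=123456
full avg10=0.00 avg60=0.00 avg300=0.00 total=0
"""

psi = parse_pressure(SAMPLE_PRESSURE)
print(psi["some"]["avg10"], psi["full"]["total"])
```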

freezer

The freezer subsystem exists only in cgroup v1. In cgroup v2, the equivalent functionality is provided by cgroup.freeze in the general interface.

Interface name Purpose In-house interface Corresponding cgroup v2 interface
freezer.state Controls the freeze state. Valid values: FROZEN and THAWED. No cgroup.freeze
freezer.self_freezing Queries whether a cgroup is frozen due to its own frozen state. No N/A
freezer.parent_freezing Queries whether a cgroup is frozen because an ancestor cgroup is frozen. No N/A
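
When migrating tooling from freezer.state to cgroup.freeze, the state strings must be translated, because cgroup.freeze takes 1 (freeze) or 0 (thaw) rather than FROZEN or THAWED. The following Python sketch shows one way to express that mapping. The names are assumptions for illustration, and the sketch does not write to any cgroup file.

```python
# Mapping from the v1 freezer.state strings to the values written to cgroup.freeze.
V1_TO_V2_FREEZE = {"FROZEN": "1", "THAWED": "0"}


def freezer_state_to_cgroup_freeze(state: str) -> str:
    """Translate a v1 freezer.state value into the cgroup.freeze value.

    Hypothetical helper; any string other than FROZEN or THAWED is rejected.
    """
    key = state.strip().upper()
    if key not in V1_TO_V2_FREEZE:
        raise ValueError(f"unsupported freezer.state value: {state!r}")
    return V1_TO_V2_FREEZE[key]


print(freezer_state_to_cgroup_freeze("FROZEN"))
```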

ioasids

The ioasids subsystem interfaces are the same in cgroup v1 and cgroup v2.

Interface name Purpose In-house interface
ioasids.current Queries the number of ioasids allocated to the current cgroup. Yes
ioasids.events Queries the number of events that occurred because the upper limit of allocable ioasids was exceeded. Yes
ioasids.max Queries the total number of ioasids that can be allocated to the current cgroup. Yes

net_cls and net_prio

The net_cls and net_prio interfaces are removed in cgroup v2. Use eBPF to filter and shape network traffic instead.

Interface name Purpose In-house interface Corresponding cgroup v2 interface
net_cls.classid Specifies the class identifier that tags network packets from the current cgroup. Works with qdisc or iptables. No N/A
net_prio.prioidx Queries the index value of the current cgroup in the internal data structure. Read-only; used internally by the kernel. No N/A
net_prio.ifpriomap Specifies the network priority for each network interface controller (NIC). No N/A
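
For reference when auditing existing cgroup v1 setups, the value in net_cls.classid packs a tc handle into one number of the form 0xAAAABBBB, where AAAA is the major number and BBBB is the minor number. The following Python sketch decodes that value. The function name is an assumption for illustration.

```python
def classid_to_handle(classid: int) -> str:
    """Convert a net_cls.classid value into a tc-style 'major:minor' handle.

    Hypothetical helper; both parts are printed in hexadecimal, as tc does.
    """
    if not 0 <= classid <= 0xFFFFFFFF:
        raise ValueError("classid must fit in 32 bits")
    major = (classid >> 16) & 0xFFFF
    minor = classid & 0xFFFF
    return f"{major:x}:{minor:x}"


# 0x100001 corresponds to the tc handle 10:1.
print(classid_to_handle(0x100001))
```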

perf_event

The perf_event subsystem provides no interfaces in either cgroup v1 or cgroup v2. It is enabled by default in cgroup v2 and provides the same functionality as in cgroup v1.

pids

The pids subsystem interfaces are the same in cgroup v1 and cgroup v2.

Interface name Purpose In-house interface
pids.max Specifies the maximum number of tasks in a cgroup. No
pids.current Queries the current number of tasks in a cgroup. No
pids.events Queries the number of events where a fork operation failed because the maximum task count was reached. Supports fsnotify for filesystem notifications. No
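
The three pids files can be combined to check how close a cgroup is to its task limit. The following Python sketch is a minimal illustration that works on file contents passed in as strings. The function names are assumptions, and the formats assumed here (the literal max for an unlimited pids.max and a "max N" line in pids.events) follow the upstream kernel documentation.

```python
from typing import Optional


def pids_headroom(pids_max: str, pids_current: str) -> Optional[int]:
    """Return how many more tasks may be created, or None if unlimited.

    Hypothetical helper that takes the raw text of pids.max and pids.current.
    """
    limit = pids_max.strip()
    if limit == "max":
        return None
    return int(limit) - int(pids_current.strip())


def fork_failures(pids_events: str) -> int:
    """Return the count on the 'max' line of pids.events, or 0 if absent."""
    for line in pids_events.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "max":
            return int(fields[1])
    return 0


# Made-up sample contents.
print(pids_headroom("100\n", "12\n"), fork_failures("max 3\n"))
```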

rdma

The rdma subsystem interfaces are the same in cgroup v1 and cgroup v2.

Interface name Purpose In-house interface
rdma.max Specifies the upper limit on Remote Direct Memory Access (RDMA) adapter resource usage. No
rdma.current Queries the current RDMA adapter resource usage. No
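
Both rdma.max and rdma.current use one line per device, followed by name=value pairs such as hca_handle and hca_object, where a value of max means no limit. The following Python sketch parses that format. The function name and the sample device names are assumptions for illustration.

```python
def parse_rdma_limits(text: str) -> dict:
    """Parse rdma.max or rdma.current content into {device: {resource: value}}.

    Hypothetical helper; a value of 'max' is returned as None (no limit).
    """
    limits = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        device, pairs = fields[0], fields[1:]
        limits[device] = {}
        for pair in pairs:
            name, _, value = pair.partition("=")
            limits[device][name] = None if value == "max" else int(value)
    return limits


# Made-up sample content in the documented format.
SAMPLE_RDMA_MAX = """mlx4_0 hca_handle=2 hca_object=2000
ocrdma1 hca_handle=3 hca_object=max
"""

print(parse_rdma_limits(SAMPLE_RDMA_MAX))
```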