Control groups (cgroups) are a Linux kernel feature that restricts, accounts for, and isolates the physical resources (such as CPU, memory, and I/O) of process groups. A parent process can use cgroups to manage the resource consumption of its child process groups. This document outlines the key differences between the two major versions, cgroup v1 and cgroup v2.
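Both versions expose this management through an ordinary file interface: creating a directory creates a child cgroup, and writing a PID moves a process into it. The mechanics can be sketched as follows; a real v2 hierarchy lives under /sys/fs/cgroup and requires root, so a scratch directory stands in here purely for illustration.

```shell
# Sketch of the cgroup file protocol. A real v2 hierarchy lives under
# /sys/fs/cgroup; a scratch directory stands in so this runs without root.
CGROOT=$(mktemp -d)                      # stand-in for /sys/fs/cgroup
mkdir "$CGROOT/mygroup"                  # mkdir creates a child cgroup
echo $$ > "$CGROOT/mygroup/cgroup.procs" # writing a PID moves the process
cat "$CGROOT/mygroup/cgroup.procs"       # prints this shell's PID
```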
General interface differences
cgroup v1 interfaces
Interface name | Purpose | In-house interface | Corresponding cgroup v2 interface |
cgroup.procs | Moves a process into the cgroup by writing its PID to this file. | No | cgroup.procs |
cgroup.clone_children | If set to 1, a child cgroup inherits the cpuset configuration of its parent. Note: This applies only to the cpuset subsystem. | No | N/A |
cgroup.sane_behavior | An interface for an experimental v2 feature. It is kept for backward compatibility after the official release of v2. | No | N/A |
notify_on_release | When set to 1, the kernel runs the command specified in release_agent after the last process leaves the cgroup. | No | cgroup.events, which implements similar functionality |
release_agent | Specifies the command to run when notify_on_release is triggered. Note: This file exists only in the root cgroup. | No | N/A |
tasks | Moves a thread to the cgroup when its thread ID (TID) is written to this file. | No | cgroup.threads |
pool_size | Controls the size of the cgroup cache pool. In high-concurrency scenarios, this can accelerate cgroup creation and binding. Note This depends on | Yes | N/A |
cgroup v2 interfaces
Interface name | Purpose | In-house interface | Corresponding cgroup v1 interface |
cgroup.procs | Moves a process into the cgroup by writing its PID to this file. | No | cgroup.procs |
cgroup.type | Queries or sets the type of the cgroup. Writing threaded converts the cgroup to the threaded mode. Note: This is only supported for the cpu, pids, and perf_event subsystems. | No | N/A |
cgroup.threads | Moves a thread to the cgroup when its TID is written to this file. | No | tasks |
cgroup.controllers | Lists the subsystems enabled for the current cgroup. | No | N/A |
cgroup.subtree_control | Controls which subsystems are enabled for child cgroups. Note: The subsystems must be a subset of those listed in cgroup.controllers. | No | N/A |
cgroup.events | Records whether the cgroup contains live processes and whether it is frozen. Note: This file does not exist in the root cgroup. | No | notify_on_release, which implements similar functionality |
cgroup.max.descendants | Controls the maximum number of descendant cgroups. | No | N/A |
cgroup.max.depth | Controls the maximum depth of descendant cgroups. | No | N/A |
cgroup.stat | Shows the number of descendant cgroups and the number of descendant cgroups in the dying state. | No | N/A |
cgroup.freeze | Freezes or unfreezes all processes in the cgroup. Note This file does not exist in the root cgroup. | No | freezer.state in the freezer subsystem |
cpu.stat | Shows CPU usage statistics. | No | N/A |
io.pressure | Shows the I/O Pressure Stall Information (PSI). Supports poll notifications. | No | After you enable PSI for cgroup v1, similar functionality is provided under the cpuacct subsystem |
memory.pressure | Shows the memory PSI. Supports poll notifications. | No | Same as above |
cpu.pressure | Shows the CPU PSI. Supports poll notifications. | No | Same as above |
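The subset rule for cgroup.subtree_control can be sketched as a small validator. The +/- token syntax follows the v2 interface; the controller list and the can_enable helper below are assumptions for illustration.

```shell
# Sketch: cgroup.subtree_control accepts "+ctrl"/"-ctrl" tokens, and each
# controller must appear in cgroup.controllers. can_enable() is a
# hypothetical helper that mimics that check against a sample list.
controllers="cpu io memory"
can_enable() {
  tok=${1#[+-]}                          # strip the leading + or -
  case " $controllers " in
    *" $tok "*) return 0 ;;
    *)          return 1 ;;
  esac
}
can_enable "+cpu"  && echo "cpu can be enabled for children"
can_enable "+pids" || echo "pids is not in cgroup.controllers"
```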
Subsystem interface differences
CPU
cgroup v1 interfaces
Interface name | Purpose | In-house interface | Corresponding cgroup v2 interface |
cpu.shares | Controls the weight used to allocate CPU time slices proportionally. The default value is 1024. | No | cpu.weight and cpu.weight.nice, which use different units |
cpu.idle | Sets the scheduling policy of the cgroup to SCHED_IDLE so that its tasks run only on otherwise idle CPU time. | No | cpu.idle |
cpu.priority | Sets the fine-grained preemptive priority. Preemption is determined during clock interrupts or wake-ups and adjusted based on the priority difference. This makes it easier for high-priority tasks to preempt low-priority ones. | Yes | cpu.priority |
cpu.cfs_quota_us | The CPU runtime controlled by using the Completely Fair Scheduler (CFS). cpu.cfs_quota_us specifies the maximum CPU runtime of tasks in a cgroup within a period defined by the cpu.cfs_period_us interface. | No | cpu.max |
cpu.cfs_period_us | Specifies the length of the CFS bandwidth period, in microseconds. | No | cpu.max |
cpu.cfs_burst_us | The amount of burst time a process is allowed to run within a period defined by cpu.cfs_period_us. | No | cpu.max.burst |
cpu.cfs_init_buffer_us | The amount of burst time a process is allowed to run at startup. | Yes | cpu.max.init_buffer |
cpu.stat | Shows statistics related to CPU bandwidth control, such as the number of periods elapsed and the number of times throttling occurred. | No | cpu.stat |
cpu.rt_runtime_us | Real-time (RT) task bandwidth control. Within a period defined by cpu.rt_period_us, cpu.rt_runtime_us specifies the maximum CPU runtime of the RT processes in a cgroup. | No | N/A |
cpu.rt_period_us | Specifies the length of the period for RT bandwidth control, in microseconds. | No | N/A |
cpu.bvt_warp_ns | Controls the group identity attribute to differentiate between online and offline processes. This provides better CPU quality of service (QoS) for online processes. For more information, see Group identity feature. | Yes | cpu.bvt_warp_ns |
cpu.identity | Yes | cpu.identity | |
cpu.ht_stable | Controls whether to generate noise on the simultaneous multithreading (SMT) sibling to stabilize SMT computing power. | Yes | N/A |
cpu.ht_ratio | Controls whether extra quota is calculated due to an idle SMT sibling. This is used to stabilize SMT computing power. | Yes | cpu.ht_ratio |
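cpu.shares is purely relative: CPU time divides in proportion to the weights of contending sibling cgroups. The arithmetic can be sketched with assumed weights (1024 is the documented default).

```shell
# Sketch: with two always-busy sibling cgroups, CPU time divides by weight.
# The weights below are assumed sample values.
a=1024 b=2048
pct_b=$(( 100 * b / (a + b) ))           # integer percent for group B
echo "group B gets about ${pct_b}% of the CPU under contention"
```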
cgroup v2 interfaces
Because cgroup v2 no longer supports the cpuacct subsystem, some of its interfaces or related features are now implemented in the cpu subsystem.
Interface name | Purpose | In-house interface | Corresponding cgroup v1 interface |
cpu.weight | Controls the weight used to allocate CPU time slices proportionally. The default value is 100. | No | cpu.shares, which uses a different unit |
cpu.weight.nice | Controls the weight used to allocate CPU time slices proportionally, expressed as a nice value. The default value is 0. | No | cpu.shares, which uses a different unit |
cpu.idle | Sets the scheduling policy of the cgroup to SCHED_IDLE so that its tasks run only on otherwise idle CPU time. | No | cpu.idle |
cpu.priority | Sets the fine-grained preemptive priority. Preemption is determined during clock interrupts or wake-ups and scaled based on the priority difference. This makes it easier for high-priority tasks to preempt low-priority ones. | Yes | cpu.priority |
cpu.max | The CFS bandwidth control. Contains two values: the maximum runtime (quota) and the period, in microseconds. | No | cpu.cfs_quota_us, cpu.cfs_period_us |
cpu.max.burst | The amount of burst time a process is allowed to run within the period defined in cpu.max. | No | cpu.cfs_burst_us |
cpu.max.init_buffer | The amount of burst time a process is allowed to run at startup. | Yes | cpu.cfs_init_buffer_us |
cpu.bvt_warp_ns | Controls the group identity attribute to differentiate between online and offline processes. This provides better CPU QoS for online processes. | Yes | cpu.bvt_warp_ns |
cpu.identity | Yes | cpu.identity | |
cpu.sched_cfs_statistics | Provides CFS-related statistics, such as run time and time spent waiting for sibling or non-sibling cgroups. Note Requires | Yes | cpuacct.sched_cfs_statistics |
cpu.wait_latency | The latency distribution of processes waiting in the queue. Note Requires | Yes | cpuacct.wait_latency |
cpu.cgroup_wait_latency | The latency distribution of process groups waiting in the queue. The difference from Note Requires | Yes | cpuacct.cgroup_wait_latency |
cpu.block_latency | The latency distribution of processes blocked for non-I/O reasons. Note Requires | Yes | cpuacct.block_latency |
cpu.ioblock_latency | The latency distribution of processes blocked for I/O reasons. Note Requires | Yes | cpuacct.ioblock_latency |
cpu.ht_ratio | Controls whether extra quota is calculated due to an idle SMT sibling. This is used to stabilize SMT computing power. Note This takes effect only when core scheduling is enabled. | Yes | cpu.ht_ratio |
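cpu.max combines the former quota and period in a single file. Parsing a sample value and deriving the CPU cap can be sketched as follows; the sample value is an assumption for illustration.

```shell
# Sketch: cpu.max holds "<quota> <period>" in microseconds ("max" means
# unlimited). The sample value below is an assumption.
cpu_max="50000 100000"                   # i.e. half of one CPU
set -- $cpu_max                          # split into quota and period
quota=$1 period=$2
cap_pct=$(( 100 * quota / period ))
echo "cap: ${cap_pct}% of one CPU"
```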
cpuset
cgroup v1 interfaces
Interface name | Purpose | In-house interface | Corresponding cgroup v2 interface |
cpuset.cpus | Controls the CPUs on which tasks can run. Note Tasks cannot be attached to a cgroup when this interface is empty. | No | cpuset.cpus |
cpuset.mems | Controls the non-uniform memory access (NUMA) nodes that can be allocated to tasks in a cgroup. Note Tasks cannot be attached to a cgroup when this interface is empty. | No | cpuset.mems |
cpuset.effective_cpus | Queries the effective CPUs on which tasks are running. The value of this interface is affected by CPU hotplug events. | No | cpuset.cpus.effective |
cpuset.effective_mems | Queries the effective NUMA nodes that are allocated to the running tasks. The value of this interface is affected by memory nodes hotplug events. | No | cpuset.mems.effective |
cpuset.cpu_exclusive | Controls which CPUs are exclusively used by a cgroup and cannot be used by other cpusets at the same level. | No | cpuset.cpus.partition, which implements similar functionality |
cpuset.mem_exclusive | Controls which NUMA nodes are exclusively used by a cgroup and cannot be used by other cpusets at the same level. | No | N/A |
cpuset.mem_hardwall | A value of 1 indicates that memory only from the memory nodes that are attached to the cpuset can be allocated to tasks. | No | N/A |
cpuset.sched_load_balance | Controls whether CPUs are load-balanced within the cpuset. By default, the feature is enabled. | No | N/A |
cpuset.sched_relax_domain_level | Controls the range in which the scheduler searches for CPUs when it migrates tasks for load balancing. Default value: -1. | No | N/A |
cpuset.memory_migrate | A non-zero value indicates that if a task is allocated a memory page in a cpuset and migrated to another cpuset, the memory page can also be migrated to the new cpuset. | No | N/A |
cpuset.memory_pressure | Calculates the memory paging pressure of the current cpuset. | No | N/A |
cpuset.memory_spread_page | A value of 1 indicates that the kernel evenly allocates the page cache to the memory nodes of the cpuset. | No | N/A |
cpuset.memory_spread_slab | A value of 1 indicates that the kernel evenly allocates the slab caches to the memory nodes of the cpuset. | No | N/A |
cpuset.memory_pressure_enabled | A value of 1 indicates that memory pressure statistics collection is enabled for the cpuset. | No | N/A |
cgroup v2 interfaces
Interface name | Purpose | In-house interface | Corresponding cgroup v1 interface |
cpuset.cpus | Controls the CPUs on which tasks can run. Note When the value of this interface is empty, the CPUs of the parent cpuset are used. | No | cpuset.cpus |
cpuset.mems | Controls the NUMA nodes that can be allocated to tasks in a cgroup. Note When the value of this interface is empty, the NUMA nodes of the parent cpuset are used. | No | cpuset.mems |
cpuset.cpus.effective | Queries the effective CPUs on which tasks are running. The value of this interface is affected by CPU hotplug events. | No | cpuset.effective_cpus |
cpuset.mems.effective | Queries the effective NUMA nodes that are allocated to the running tasks. The value of this interface is affected by memory nodes hotplug events. | No | cpuset.effective_mems |
cpuset.cpus.partition | Controls whether CPUs of a cpuset are exclusively used. If root is written into the interface, CPUs of a cpuset are exclusively used. | No | cpuset.cpu_exclusive, which implements similar functionality |
.__DEBUG__.cpuset.cpus.subpartitions | Queries which CPUs are used exclusively when root is written into the cpuset.cpus.partition interface. Note: This interface is available only if the cgroup_debug feature is enabled in the kernel command line. | No | N/A |
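cpuset.cpus and cpuset.mems use the kernel's list format, such as 0-3,8. Counting the CPUs in such a list is pure string handling; the sample list below is an assumption for illustration.

```shell
# Sketch: counting CPUs in a cpuset-style list such as "0-3,8".
cpulist="0-3,8"
count=0
IFS=,
for part in $cpulist; do
  case $part in
    *-*) lo=${part%-*}; hi=${part#*-}; count=$(( count + hi - lo + 1 )) ;;
    *)   count=$(( count + 1 )) ;;
  esac
done
unset IFS                                # restore default word splitting
echo "$cpulist covers $count CPUs"
```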
blkio
cgroup v1 interfaces
Interface name | Purpose | In-house interface | Corresponding cgroup v2 interface |
blkio.throttle.read_bps_device | Specifies the maximum number of bytes per second that a cgroup can read from a device. Example: echo "<major>:<minor> <bps>" > /sys/fs/cgroup/blkio/<cgroup>/blkio.throttle.read_bps_device | No | io.max |
blkio.throttle.write_bps_device | Specifies the maximum number of bytes per second that a cgroup can write to a device. Example: echo "<major>:<minor> <bps>" > /sys/fs/cgroup/blkio/<cgroup>/blkio.throttle.write_bps_device | No | io.max |
blkio.throttle.read_iops_device | Specifies the maximum number of read operations per second that a cgroup can perform on a device. Example: echo "<major>:<minor> <iops>" > /sys/fs/cgroup/blkio/<cgroup>/blkio.throttle.read_iops_device | No | io.max |
blkio.throttle.write_iops_device | Specifies the maximum number of write operations per second that a cgroup can perform on a device. Example: echo "<major>:<minor> <iops>" > /sys/fs/cgroup/blkio/<cgroup>/blkio.throttle.write_iops_device | No | io.max |
blkio.throttle.io_service_bytes | Queries bandwidth statistics. This interface collects the read, write, sync, async, discard, and total bandwidth statistics of all devices. Unit: bytes. | No | io.stat |
blkio.throttle.io_service_bytes_recursive | The recursive version of the blkio.throttle.io_service_bytes interface. Its statistics also include the data of descendant cgroups. | No | N/A |
blkio.throttle.io_serviced | Queries IOPS statistics. This interface collects the read, write, sync, async, discard, and total IOPS statistics of all devices. | No | io.stat |
blkio.throttle.io_serviced_recursive | The recursive version of the blkio.throttle.io_serviced interface. Its statistics also include the data of descendant cgroups. | No | N/A |
blkio.throttle.io_service_time | Queries the duration between request dispatch and request completion for I/O operations, which is used to measure the average I/O latency. For more information, see Enhance the monitoring of block I/O throttling. | Yes | io.extstat |
blkio.throttle.io_wait_time | Queries the duration when I/O operations wait in scheduler queues, which is used to measure the average I/O latency. For more information, see Enhance the monitoring of block I/O throttling. | Yes | io.extstat |
blkio.throttle.io_completed | Queries the number of completed I/O operations, which is used to measure the average I/O latency. For more information, see Enhance the monitoring of block I/O throttling. | Yes | io.extstat |
blkio.throttle.total_bytes_queued | Queries the number of I/O bytes that were throttled, which is used to analyze whether I/O latency is related to throttling. For more information, see Enhance the monitoring of block I/O throttling. | Yes | io.extstat |
blkio.throttle.total_io_queued | Queries the number of I/O operations that were throttled, which is used to analyze whether I/O latency is related to throttling. For more information, see Enhance the monitoring of block I/O throttling. | Yes | io.extstat |
blkio.cost.model | Specifies the blk-iocost cost model. The control mode (ctrl) can be set to auto or user. This interface exists only in the root cgroup. Example: echo "<major>:<minor> ctrl=user model=linear rbps=<rbps> rseqiops=<rseqiops> rrandiops=<rrandiops> wbps=<wbps> wseqiops=<wseqiops> wrandiops=<wrandiops>" > /sys/fs/cgroup/blkio/blkio.cost.model For more information, see Configure the blk-iocost weight-based throttling feature. | Yes | io.cost.model |
blkio.cost.qos | Controls the blk-iocost feature and configures a QoS policy to check for disk congestion. This interface exists only in the root cgroup. Example: echo "<major>:<minor> enable=1 ctrl=user rpct= rlat=5000 wpct=95.00 wlat=5000 min=50.00 max=150.00" > /sys/fs/cgroup/blkio/blkio.cost.qos For more information, see Configure blk-iocost weight throttling. | Yes | io.cost.qos |
blkio.cost.weight | Specifies the cgroup weight. This interface exists only in non-root cgroups and can be configured in the following modes:
For more information, see Configure the blk-iocost weight-based throttling feature. | Yes | io.cost.weight |
blkio.cost.stat | Queries the blk-iocost statistics. The interface exists only in non-root cgroups. | Yes | N/A |
cgroup v2 interfaces
Interface name | Purpose | In-house interface | Corresponding cgroup v1 interface |
io.max | The throttling interface that specifies the read and write throttling rates in byte/s and IOPS. Example: echo "<major>:<minor> rbps=<bps> wbps=<bps> riops=<iops> wiops=<iops>" > /sys/fs/cgroup/<cgroup>/io.max | No | blkio.throttle.read_bps_device blkio.throttle.read_iops_device blkio.throttle.write_bps_device blkio.throttle.write_iops_device |
io.stat | Queries I/O operation statistics, which include the rates of read, write, and discard operations in byte/s and IOPS. | No | blkio.throttle.io_service_bytes blkio.throttle.io_serviced |
io.extstat | Queries extended I/O statistics, including the wait time, service time, number of completed I/O operations, and throttling rates in byte/s and IOPS. | No | blkio.throttle.io_service_time blkio.throttle.io_wait_time blkio.throttle.io_completed blkio.throttle.total_bytes_queued blkio.throttle.total_io_queued |
io.cost.model | Specifies the blk-iocost cost model. The control mode (ctrl) can be set to auto or user. This interface exists only in the root cgroup. Example: echo "<major>:<minor> ctrl=user model=linear rbps=<rbps> rseqiops=<rseqiops> rrandiops=<rrandiops> wbps=<wbps> wseqiops=<wseqiops> wrandiops=<wrandiops>" > /sys/fs/cgroup/io.cost.model For more information, see Configure blk-iocost weight throttling. | No | blkio.cost.model |
io.cost.qos | Controls the blk-iocost feature and configures a QoS policy to check for disk congestion. This interface exists only in the root cgroup. Example: echo "<major>:<minor> enable=1 ctrl=user rpct= rlat=5000 wpct=95.00 wlat=5000 min=50.00 max=150.00" > /sys/fs/cgroup/io.cost.qos For more information, see Configure blk-iocost weight throttling. | No | blkio.cost.qos |
io.cost.weight | Specifies the cgroup weight. This interface exists only in non-root cgroups and can be configured in the following modes:
For more information, see Configure blk-iocost weight throttling. | No | blkio.cost.weight |
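io.max packs a device number and key=value tokens into a single line. Splitting a sample setting into its fields can be sketched as follows; the device number and values are assumptions for illustration.

```shell
# Sketch: splitting an io.max-style setting into device and key=value tokens.
setting="253:0 rbps=1048576 wiops=120"   # sample values, for illustration
dev=${setting%% *}                       # "<major>:<minor>"
echo "device $dev"
for kv in ${setting#* }; do
  echo "  ${kv%%=*} = ${kv#*=}"
done
```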
memory
cgroup v1 interfaces
Interface name | Purpose | In-house interface | Corresponding cgroup v2 interface |
memory.usage_in_bytes | Queries the current memory usage. | No | N/A |
memory.max_usage_in_bytes | Queries the maximum memory usage. | No | N/A |
memory.limit_in_bytes | Specifies the hard upper limit on memory usage. | No | N/A |
memory.soft_limit_in_bytes | Specifies the soft upper limit on memory usage. | No | N/A |
memory.failcnt | Queries the number of times the memory usage reached the upper limit. | No | N/A |
memory.mglru_batch_size | Specifies the size of memory that is proactively reclaimed based on the Multi-Generational Least Recently Used (MGLRU) framework. An attempt is made to release CPUs between batches of memory reclamation. | Yes | N/A |
memory.mglru_reclaim_kbytes | Specifies the size of memory that is proactively reclaimed based on the MGLRU framework. | Yes | N/A |
memory.wmark_ratio | Controls the memcg backend asynchronous reclaim feature and sets the memcg memory watermark that triggers asynchronous reclamation. Unit: percent of the memcg memory upper limit. Valid values: 0 to 100.
For more information, see Memcg backend asynchronous reclaim. | Yes | memory.wmark_ratio |
memory.wmark_high | A read-only interface. When the memcg memory usage exceeds this watermark, backend asynchronous reclamation starts. For more information, see Memcg backend asynchronous reclaim. | Yes | memory.wmark_high |
memory.wmark_low | A read-only interface. When the memcg memory usage falls below this watermark, backend asynchronous reclamation stops. For more information, see Memcg backend asynchronous reclaim. | Yes | memory.wmark_low |
memory.wmark_scale_factor | Specifies the interval between the memory.wmark_high value and the memory.wmark_low value. Unit: 0.01 percent of the memcg memory upper limit. Valid values: 1 to 1000. For more information, see Memcg backend asynchronous reclaim. | Yes | memory.wmark_scale_factor |
memory.wmark_min_adj | The factor that is used in the memcg global minimum watermark rating feature. The value indicates a percentage adjustment over the global minimum watermark. Valid values: -25 to 50. For more information, see Memcg global minimum watermark rating. | Yes | memory.wmark_min_adj |
memory.force_empty | Specifies whether to forcefully reclaim memory pages. | No | N/A |
memory.use_hierarchy | Specifies whether to collect hierarchical statistics. | Yes | N/A |
memory.swappiness | Specifies the swappiness parameter of vmscan, which controls the tendency of the kernel to use the swap partition. | No | N/A |
memory.priority | Specifies the memcg priority. This interface provides 13 memcg out-of-memory (OOM) priorities to rank workloads. Valid values: 0 to 12. A larger value indicates a higher priority. The priority of a parent cgroup is not inherited by its descendant cgroups. Default value: 0. | Yes | memory.priority |
memory.move_charge_at_immigrate | Specifies whether charges of a task are moved along the task when the task is migrated between cgroups, which is a statistical control policy. | No | N/A |
memory.oom_control | Specifies whether to trigger the OOM killer to terminate tasks when an OOM error occurs and generate notifications about OOM status. | No | N/A |
memory.oom.group | Controls the OOM group feature that can terminate all tasks in a memcg if an OOM error occurs. | Yes | memory.oom.group |
memory.pressure_level | Specifies memory pressure notifications. | No | N/A |
memory.kmem.limit_in_bytes | Specifies the hard limit on the memory usage of the kernel. | No | N/A |
memory.kmem.usage_in_bytes | Queries the memory usage of the kernel. | No | N/A |
memory.kmem.failcnt | Queries the number of times the memory usage of the kernel reached the upper limit. | No | N/A |
memory.kmem.max_usage_in_bytes | Queries the maximum memory usage of the kernel. | No | N/A |
memory.kmem.slabinfo | Queries the slab memory usage of the kernel. | No | N/A |
memory.kmem.tcp.limit_in_bytes | Specifies the hard limit on the TCP memory usage of the kernel. | No | N/A |
memory.kmem.tcp.usage_in_bytes | Queries the TCP memory usage of the kernel. | No | N/A |
memory.kmem.tcp.failcnt | Queries the number of times the TCP memory usage of the kernel reached the upper limit. | No | N/A |
memory.kmem.tcp.max_usage_in_bytes | Queries the maximum TCP memory usage of the kernel. | No | N/A |
memory.memsw.usage_in_bytes | Queries the memory usage and swap memory usage. | No | N/A |
memory.memsw.max_usage_in_byte | Queries the maximum usage of memory and swap memory. | No | N/A |
memory.memsw.limit_in_bytes | Specifies the upper limit on the total usage of memory and swap memory used by tasks in the cgroup. | No | N/A |
memory.memsw.failcnt | Queries the number of times the total usage of memory and swap memory reached the upper limit. | No | N/A |
memory.swap.high | Specifies the upper limit on available swap memory usage in a cgroup. | Yes | memory.swap.high |
memory.swap.events | Queries the events that occur when the swap memory usage reaches the upper limit. | Yes | memory.swap.events |
memory.min | Specifies a minimum amount of memory that a cgroup must retain, which is a hard guarantee of memory. For more information, see Memcg QoS feature of the cgroup v1 interface. | Yes | memory.min |
memory.low | Specifies the lower limit of memory that a cgroup can retain, which is a soft guarantee of memory. For more information, see Memcg QoS feature of the cgroup v1 interface. | Yes | memory.low |
memory.high | Specifies the throttle limit of the memory usage. For more information, see Memcg QoS feature of the cgroup v1 interface. | Yes | memory.high |
memory.allow_duptext | When the /sys/kernel/mm/duptext/enabled parameter is configured to globally enable the code duptext feature, the interface is used to control whether to enable the code duptext feature for tasks in a specific memcg. Valid values: 0 and 1. Default value: 0.
For more information, see Code duptext feature. | Yes | memory.allow_duptext |
memory.allow_duptext_refresh | Specifies whether the code duptext feature is applied immediately after a binary file is generated or downloaded. The feature does not take effect on PageDirty or PageWriteback pages; in those cases, this interface uses asynchronous tasks to refresh them. | Yes | memory.allow_duptext_refresh |
memory.duptext_nodes | Limits the duptext memory allocation nodes. | Yes | memory.duptext_nodes |
memory.allow_text_unevictable | Specifies whether the code segments (text) of the memcg are locked in memory. | Yes | memory.allow_text_unevictable |
memory.text_unevictable_percent | Specifies the maximum percentage of the memcg code segment memory that can be locked. | Yes | memory.text_unevictable_percent |
memory.thp_reclaim | Controls the Transparent Huge Pages (THP) reclaim feature. Valid values:
Default value: disable. For more information, see THP reclaim. | Yes | memory.thp_reclaim |
memory.thp_reclaim_stat | Queries the status of the THP reclaim feature. Parameters of this interface:
The values of the preceding parameters are listed in ascending order by NUMA node ID, such as node0 and node1, from left to right. For more information, see THP reclaim. | Yes | memory.thp_reclaim_stat |
memory.thp_reclaim_ctrl | Specifies how the THP reclaim feature is triggered. Parameters of this interface:
For more information, see THP reclaim. | Yes | memory.thp_reclaim_ctrl |
memory.thp_control | Controls the memcg THP feature. This interface can be used to prohibit the application of anon, shmem, and file THPs. For example, an offline memcg is not allowed to use THPs. This helps reduce THP contention and memory waste, even though memory fragmentation cannot be prevented. | Yes | memory.thp_control |
memory.reclaim_caches | Specifies whether the kernel proactively reclaims the cache in memcgs. Example: | Yes | memory.reclaim_caches |
memory.pgtable_bind | Specifies whether to forcefully apply for page table memory on the current node. | Yes | memory.pgtable_bind |
memory.pgtable_misplaced | Queries statistics about page memory in page tables when page memory is allocated across nodes. | Yes | memory.pgtable_misplaced |
memory.oom_offline | In the Quick OOM feature, you can use this interface to mark the memcg of an offline task. | Yes | memory.oom_offline |
memory.async_fork | Controls the Async-fork feature, formerly known as fast convergent merging (FCM), for memcgs. | Yes | memory.async_fork |
memory.direct_compact_latency | Specifies the latency in direct memory compaction of the memsli feature. | Yes | memory.direct_compact_latency |
memory.direct_reclaim_global_latency | Specifies the latency in direct global memory reclamation of the memsli feature. | Yes | memory.direct_reclaim_global_latency |
memory.direct_reclaim_memcg_latency | Specifies the latency in direct memcg memory reclamation of the memsli feature. | Yes | memory.direct_reclaim_memcg_latency |
memory.direct_swapin_latency | Specifies the latency in direct memory swap-in of the memsli feature. | Yes | memory.direct_swapin_latency |
memory.direct_swapout_global_latency | Specifies the latency in direct global memory swap-out of the memsli feature. | Yes | memory.direct_swapout_global_latency |
memory.direct_swapout_memcg_latency | Specifies the latency in direct memcg memory swap-out of the memsli feature. | Yes | memory.direct_swapout_memcg_latency |
memory.exstat | Queries statistics about extended memory and extra memory. Statistics about the following in-house features are collected:
For more information, see Memcg Exstat feature. | Yes | memory.exstat |
memory.idle_page_stats | Queries statistics about kidled memory usage of a memcg and the hierarchical information of the cgroup. | Yes | memory.idle_page_stats |
memory.idle_page_stats.local | Queries statistics about kidled memory usage of a memcg. | Yes | memory.idle_page_stats.local |
memory.numa_stat | Queries NUMA statistics for anonymous, file, and locked memory. | No | memory.numa_stat |
memory.pagecache_limit.enable | Controls the Page Cache Limit feature. For more information, see Page Cache Limit feature. | Yes | memory.pagecache_limit.enable |
memory.pagecache_limit.size | Specifies the size of the limited page cache. | Yes | memory.pagecache_limit.size |
memory.pagecache_limit.sync | Specifies the mode of the Page Cache Limit feature, which is synchronous or asynchronous. | Yes | memory.pagecache_limit.sync |
memory.reap_background | Specifies whether the zombie memcg reaper reclaims the memory of memcgs asynchronously in the background. | Yes | memory.reap_background |
memory.stat | Queries memory statistics. | No | memory.stat |
memory.use_priority_oom | Controls the memcg OOM priority policy feature. For more information, see Memcg OOM priority policy. | Yes | memory.use_priority_oom |
memory.use_priority_swap | Specifies whether the memory is swapped based on the priorities of cgroups. For more information, see Memcg OOM priority policy. | Yes | memory.use_priority_swap |
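memory.wmark_ratio is interpreted as a percentage of the memcg memory limit. The resulting trigger point can be sketched as follows; the limit and ratio are assumed sample values.

```shell
# Sketch: with wmark_ratio=80 on a 2 GiB limit, asynchronous reclamation is
# triggered near 1.6 GiB (integer arithmetic; sample values are assumptions).
limit=$(( 2 * 1024 * 1024 * 1024 ))      # 2 GiB memcg limit
wmark_ratio=80
wmark_high=$(( limit / 100 * wmark_ratio ))
echo "async reclaim triggers above $wmark_high bytes"
```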
cgroup v2 interfaces
Interface name | Purpose | In-house interface | Corresponding cgroup v1 interface |
memory.current | Queries the memory usage. | No | N/A |
memory.min | Specifies a minimum amount of memory that a cgroup must retain, which is a hard guarantee of memory. For more information, see Memcg QoS feature of the cgroup v1 interface. | No | memory.min |
memory.low | Specifies the lower limit of memory that a cgroup can retain, which is a soft guarantee of memory. For more information, see Memcg QoS feature of the cgroup v1 interface. | No | memory.low |
memory.high | Specifies the throttle limit on memory usage. For more information, see Memcg QoS feature of the cgroup v1 interface. | No | memory.high |
memory.max | Specifies the hard upper limit on memory usage. | No | memory.max |
memory.swap.current | Queries swap memory in use. | No | N/A |
memory.swap.high | Specifies the upper limit on available swap memory usage in a cgroup. | No | N/A |
memory.swap.max | Specifies a hard limit on swap memory. | No | N/A |
memory.swap.events | Queries the events that occur when the swap memory usage reaches the upper limit. | No | N/A |
memory.oom.group | Specifies whether the OOM group feature is enabled. This feature can terminate all tasks in a memcg if an OOM error occurs. | No | memory.oom.group |
memory.wmark_ratio | Controls the memcg backend asynchronous reclaim feature and sets the memcg memory watermark that triggers asynchronous reclamation. Unit: percent of the memcg memory upper limit. Valid values: 0 to 100.
For more information, see Memcg backend asynchronous reclaim. | Yes | memory.wmark_ratio |
memory.wmark_high | A read-only interface. When the memcg memory usage exceeds this watermark, backend asynchronous reclamation starts. For more information, see Memcg backend asynchronous reclaim. | Yes | memory.wmark_high |
memory.wmark_low | A read-only interface. When the memcg memory usage falls below this watermark, backend asynchronous reclamation stops. For more information, see Memcg backend asynchronous reclaim. | Yes | memory.wmark_low |
memory.wmark_scale_factor | Specifies the interval between the memory.wmark_high value and the memory.wmark_low value. Unit: 0.01 percent of the memcg memory upper limit. Valid values: 1 to 1000.
For more information, see Memcg backend asynchronous reclaim. | Yes | memory.wmark_scale_factor |
memory.wmark_min_adj | The factor that is used in the memcg global minimum watermark rating feature. The value of this interface indicates an adjustment in percentage over the global minimum watermark. Valid values: -25 to 50.
For more information, see Memcg global minimum watermark rating. | Yes | memory.wmark_min_adj |
memory.priority | Specifies the memcg priority. This interface provides 13 memcg OOM priorities to rank workloads. Valid values: 0 to 12. A larger value indicates a higher priority. The priority of a parent cgroup is not inherited by its descendant cgroups. Default value: 0. For more information, see Memcg OOM priority policy. | Yes | memory.priority |
memory.use_priority_oom | Controls the memcg OOM priority policy feature. For more information, see Memcg OOM priority policy. | Yes | memory.use_priority_oom |
memory.use_priority_swap | Specifies whether the memory is swapped based on the priorities of cgroups. For more information, see Memcg OOM priority policy. | Yes | memory.use_priority_swap |
memory.direct_reclaim_global_latency | Queries the latency of direct global memory reclamation, as measured by the memsli feature. | Yes | memory.direct_reclaim_global_latency |
memory.direct_reclaim_memcg_latency | Queries the latency of direct memcg memory reclamation, as measured by the memsli feature. | Yes | memory.direct_reclaim_memcg_latency |
memory.direct_compact_latency | Queries the latency of direct memory compaction, as measured by the memsli feature. | Yes | memory.direct_compact_latency |
memory.direct_swapout_global_latency | Queries the latency of direct global memory swap-out, as measured by the memsli feature. | Yes | memory.direct_swapout_global_latency |
memory.direct_swapout_memcg_latency | Queries the latency of direct memcg memory swap-out, as measured by the memsli feature. | Yes | memory.direct_swapout_memcg_latency |
memory.direct_swapin_latency | Queries the latency of direct memory swap-in, as measured by the memsli feature. | Yes | memory.direct_swapin_latency |
memory.exstat | Queries extended statistics about memory that are collected for in-house features.
For more information, see Memcg Exstat. | Yes | memory.exstat |
memory.pagecache_limit.enable | Controls the Page Cache Limit feature. For more information, see Page Cache Limit feature. | Yes | memory.pagecache_limit.enable |
memory.pagecache_limit.size | Specifies the size of the limited page cache. For more information, see Page Cache Limit feature. | Yes | memory.pagecache_limit.size |
memory.pagecache_limit.sync | Specifies the mode of the Page Cache Limit feature, which is synchronous or asynchronous. For more information, see Page Cache Limit feature. | Yes | memory.pagecache_limit.sync |
memory.idle_page_stats | Queries hierarchical statistics about idle memory tracked by kidled for each memcg. | Yes | memory.idle_page_stats |
memory.idle_page_stats.local | Queries non-hierarchical statistics about idle memory tracked by kidled for each individual memcg. | Yes | memory.idle_page_stats.local |
memory.numa_stat | Queries NUMA statistics for anonymous, file, and locked memory. | Yes | memory.numa_stat |
memory.reap_background | Specifies whether the zombie memcg reaper asynchronously reclaims the memory of memcgs in the background. | Yes | memory.reap_background |
memory.stat | Queries memory statistics. | No | memory.stat |
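The memcg OOM priority interfaces above are ordinary cgroupfs files, so they can be driven from any language. A minimal Python sketch, assuming the in-house memory.priority and memory.use_priority_oom files exist in the target memcg directory (the function name and example path are illustrative, not part of any official API):

```python
import os

def set_memcg_oom_priority(memcg_dir, priority, use_priority_oom=True):
    """Set the memcg OOM priority (0 to 12; a larger value means a higher
    priority) and toggle the priority-based OOM policy. Both files are
    in-house interfaces and may be absent on stock kernels."""
    if not 0 <= priority <= 12:
        raise ValueError("memory.priority accepts values 0 to 12")
    with open(os.path.join(memcg_dir, "memory.priority"), "w") as f:
        f.write(str(priority))
    with open(os.path.join(memcg_dir, "memory.use_priority_oom"), "w") as f:
        f.write("1" if use_priority_oom else "0")

# Example (the cgroup path is hypothetical):
# set_memcg_oom_priority("/sys/fs/cgroup/memory/app", 10)
```

Because the priority is not inherited, each descendant memcg that needs a non-default priority must be configured individually.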
cpuacct
Interface name | Purpose | In-house interface | Corresponding cgroup v2 interface |
cpuacct.usage | Queries the total CPU time used. Unit: nanoseconds. | No | cpu.stat, which displays similar data |
cpuacct.usage_user | Queries the CPU time used in user mode. Unit: nanoseconds. | No | |
cpuacct.usage_sys | Queries the CPU time used in kernel mode. Unit: nanoseconds. | No | |
cpuacct.usage_percpu | Queries the usage time of each CPU. Unit: nanoseconds. | No |
cpuacct.usage_percpu_user | Queries the usage time of each CPU in user mode. Unit: nanoseconds. | No |
cpuacct.usage_percpu_sys | Queries the usage time of each CPU in kernel mode. Unit: nanoseconds. | No |
cpuacct.usage_all | Queries the combined data of the cpuacct.usage_percpu_user and cpuacct.usage_percpu_sys interfaces. Unit: nanoseconds. | No |
cpuacct.stat | Queries the CPU time used in user mode and kernel mode. Unit: ticks. | No |
cpuacct.proc_stat | Queries data such as the CPU time, average loads (loadavg), and number of running tasks at the container level. | Yes | |
cpuacct.enable_sli | Controls whether to collect load averages (loadavg) at the container level. | Yes | N/A |
cpuacct.sched_cfs_statistics | Queries statistics about CFS, such as the runtime of a cgroup and the waiting time of cgroups at the same level or different levels. | Yes | cpu.sched_cfs_statistics |
cpuacct.wait_latency | Queries the latency of tasks waiting in the queue. | Yes | cpu.wait_latency |
cpuacct.cgroup_wait_latency | Queries the latency of cgroups waiting in the queue. The wait_latency interface counts the latency of task scheduling entities (SEs), whereas the cgroup_wait_latency interface counts the latency of group SEs. | Yes | cpu.cgroup_wait_latency |
cpuacct.block_latency | Queries the latency of tasks blocked due to non-I/O causes. | Yes | cpu.block_latency |
cpuacct.ioblock_latency | Queries the latency of tasks blocked due to I/O operations. | Yes | cpu.ioblock_latency |
io.pressure | Queries Pressure Stall Information (PSI) for I/O, memory, and CPU. The information can be polled. | No | N/A |
memory.pressure | No | ||
cpu.pressure | No |
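The three pressure files share the standard PSI text format: a `some` line and, for memory and I/O, a `full` line, each carrying avg10/avg60/avg300 percentages plus a cumulative `total` stall time in microseconds. A small Python parser, as a sketch:

```python
def parse_psi(text):
    """Parse PSI output such as cpu.pressure or io.pressure, e.g.:
        some avg10=0.12 avg60=0.05 avg300=0.01 total=1830
        full avg10=0.00 avg60=0.00 avg300=0.00 total=273
    Returns {"some": {"avg10": 0.12, ..., "total": 1830}, "full": {...}}."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        metrics = {}
        for field in fields:
            key, value = field.split("=", 1)
            # total is cumulative stall time in microseconds; the rest are percentages
            metrics[key] = int(value) if key == "total" else float(value)
        result[kind] = metrics
    return result
```

The same parser works for all three files because cpu.pressure simply omits the `full` line on older kernels.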
freezer
Interface name | Purpose | In-house interface | Corresponding cgroup v2 interface |
freezer.state | Controls the freeze status. Valid values: FROZEN and THAWED. Reads can also return FREEZING while a freeze is in progress. | No | cgroup.freeze |
freezer.self_freezing | Queries whether a cgroup is frozen because of its own frozen state. | No | N/A |
freezer.parent_freezing | Queries whether a cgroup is frozen because its ancestor is frozen. | No | N/A |
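Freezing a cgroup is just a write to the appropriate file. A hedged Python sketch that covers both hierarchies (the function name and directory layout are illustrative):

```python
import os

def set_frozen(cgroup_dir, frozen):
    """Freeze or thaw every task in a cgroup. On cgroup v1 this writes
    FROZEN or THAWED to freezer.state; on cgroup v2 it writes 1 or 0 to
    cgroup.freeze. Reads of freezer.state may also return FREEZING
    while the transition is still in progress."""
    v1_path = os.path.join(cgroup_dir, "freezer.state")
    if os.path.exists(v1_path):
        value, path = ("FROZEN" if frozen else "THAWED"), v1_path
    else:
        value, path = ("1" if frozen else "0"), os.path.join(cgroup_dir, "cgroup.freeze")
    with open(path, "w") as f:
        f.write(value)
```

Note that freezing is asynchronous on both hierarchies; a caller that needs a hard guarantee should poll the state file (v1) or cgroup.events (v2) until the frozen state is reported.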
ioasids
The cgroup v1 interfaces and the cgroup v2 interfaces of the ioasids subsystem are the same.
Interface name | Purpose | In-house interface |
ioasids.current | Queries the number of ioasids allocated to the current cgroup. | Yes |
ioasids.events | Queries the number of events that occurred because the upper limit of allocable ioasids was exceeded. | Yes |
ioasids.max | Specifies the maximum number of ioasids that can be allocated to the current cgroup. | Yes |
net_cls and net_prio
Interface name | Purpose | In-house interface | Corresponding cgroup v2 interface |
net_cls.classid | Specifies the class identifier that tags network packets of the current cgroup. This interface works with qdisc or iptables. | No | N/A Note The corresponding interfaces are removed from cgroup v2. You can use eBPF to filter and shape traffic. |
net_prio.prioidx | Queries the index value of the current cgroup in the data structure. The interface is read-only and used internally by the kernel. | No | |
net_prio.ifpriomap | Specifies the network priority value for each network interface controller (NIC). | No |
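net_cls.classid stores a tc class handle `major:minor` packed into one 32-bit value (major in the upper 16 bits, minor in the lower 16). Small helpers to encode and decode it, as a sketch:

```python
def make_classid(major, minor):
    """Pack a tc handle major:minor into the 32-bit net_cls.classid value."""
    if not (0 <= major <= 0xFFFF and 0 <= minor <= 0xFFFF):
        raise ValueError("major and minor must fit in 16 bits")
    return (major << 16) | minor

def split_classid(classid):
    """Recover (major, minor) from a net_cls.classid value."""
    return classid >> 16, classid & 0xFFFF
```

For example, tc class 10:1 becomes 0x100001; writing that value to net_cls.classid lets an iptables cgroup match or a tc filter classify the cgroup's traffic.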
perf_event
The perf_event subsystem does not provide interfaces. It is enabled by default in cgroup v2 and provides the same functionality as in cgroup v1.
pids
The cgroup v1 interfaces and the cgroup v2 interfaces of the pids subsystem are the same.
Interface name | Purpose | In-house interface |
pids.max | Specifies the maximum number of tasks in a cgroup. | No |
pids.current | Queries the current number of tasks in a cgroup. | No |
pids.events | Queries the number of times a fork operation failed because the maximum number of tasks was reached. fsnotify is supported to provide filesystem notifications about these events. | No |
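pids.max accepts either an integer or the literal string `max` (no limit), and pids.events reports fork failures as a single line of the form `max <count>`. A Python sketch (the function names and paths are illustrative):

```python
import os

def set_pids_max(cgroup_dir, limit):
    """Write an integer limit, or the string "max" for unlimited,
    to pids.max."""
    with open(os.path.join(cgroup_dir, "pids.max"), "w") as f:
        f.write(str(limit))

def failed_forks(cgroup_dir):
    """Return the fork-failure count from the 'max <count>' line
    of pids.events."""
    with open(os.path.join(cgroup_dir, "pids.events")) as f:
        return int(f.read().split()[1])
```

A nonzero failed-fork count is usually the first visible symptom of a pids.max limit that is set too low for the workload.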
rdma
The cgroup v1 interfaces and the cgroup v2 interfaces of the rdma subsystem are the same.
Interface name | Purpose | In-house interface |
rdma.max | Specifies the upper limit on the resource usage of the Remote Direct Memory Access (RDMA) adapter. | No |
rdma.current | Queries the resource usage of the RDMA adapter. | No |
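Both rdma files list one line per RDMA device in the form `<device> hca_handle=<n> hca_object=<n>`, where a value may also be the string `max` for unlimited. A small Python parser, as a sketch:

```python
def parse_rdma(text):
    """Parse rdma.current or rdma.max output, e.g.:
        mlx4_0 hca_handle=2 hca_object=2000
        ocrdma1 hca_handle=3 hca_object=max
    Numeric values are returned as ints; "max" is kept as a string."""
    devices = {}
    for line in text.strip().splitlines():
        device, *fields = line.split()
        limits = {}
        for field in fields:
            key, value = field.split("=", 1)
            limits[key] = value if value == "max" else int(value)
        devices[device] = limits
    return devices
```

Writing a limit uses the same line format, for example `echo "mlx4_0 hca_handle=2 hca_object=2000" > rdma.max`.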