Container Service for Kubernetes: Memory diagnostics

Last Updated: May 06, 2025

Container Intelligence Service provides the memory diagnostics feature to help you identify common memory issues in Container Service for Kubernetes (ACK) clusters, including memory leaks, memory fragmentation, and out of memory (OOM) errors. Diagnostic results are displayed in charts and tables to show the system memory usage and make your O&M work easier. This topic introduces memory diagnostics.

Memory diagnostics consist of memory overview, memory analysis, and OOM analysis. You can view the memory usage of nodes and pods.

The diagnostic items may vary based on the cluster configuration. The items shown on the diagnostics page take precedence.

Important

When you use the diagnostics feature, ACK runs a data collection program on each node in the cluster and collects diagnostic results. The collected information includes the system version, the status of workloads, Docker, and kubelet, and key error information in system logs. ACK does not collect business information or sensitive data.

Memory overview

The memory overview feature displays diagnostic items related to memory risks. The following table describes the diagnostic items.

| Diagnostic item | Description |
| --- | --- |
| Leaked Memory | Checks for system kernel memory leaks in the Slab, Vmalloc, and buddy system (allocpage). |
| Memory Usage | Displays the utilization of system memory. |
| Memcg | Evaluates whether the unreleased memory cgroups compromise system performance and cause statistical errors. |
| Memory Fragmentation | Checks for memory fragmentation, which compromises system performance. |
| THPZeroPage | Evaluates the ratio of THP waste. |

Information about system memory usage, including kernel memory, application memory in user mode, and free memory, is displayed in charts.

  • Kernel memory (kernel): the total amount of memory used by the operating system kernel.

  • Application memory (app): the total amount of memory used by programs in user mode.

  • Free memory (free): the amount of free system memory.

Terms

| Term | Description |
| --- | --- |
| memory leaks | A memory leak occurs when dynamically allocated memory is not released after it is no longer needed, which causes system memory utilization to keep increasing. Memory leaks can compromise the performance of programs or even cause system crashes. |
| memory utilization | Memory utilization = (Total memory - Free memory) × 100/Total memory. File caches are counted as free memory and do not affect memory utilization, because the system can reclaim and reuse file caches at any time. |
| unreleased Memcg | Memory cgroups that are not released due to system exceptions. These memory cgroups may compromise system performance. |
| memory fragmentation | Memory fragmentation refers to a state in which requests for contiguous memory cannot be fulfilled because the free contiguous memory blocks have become too small after the system has been running for a long period of time. Such failures delay memory allocation and cause business jitter. |
| ratio of THP waste | Transparent Huge Pages (THPs) are huge pages whose size is 2 MiB or 1 GiB in the kernel. The size of a subpage is 4 KiB. When THPs are enabled, the kernel dynamically allocates THPs to reduce Translation Lookaside Buffer (TLB) misses and improve application performance. However, THPs may cause memory bloat. The kernel allocates memory in 2 MiB THPs, each of which is equivalent to 512 subpages. This can waste memory and lead to memory overcommitment, and memory bloat may lead to OOM errors. For example, when an application that requests only 8 KiB of memory (2 subpages) is assigned a 2 MiB THP, the remaining 510 subpages are zero subpages, which inflate the resident set size (RSS) and may cause an OOM error. Ratio of THP waste = Number of zero THPs × 100%/Total number of THPs |
| buddy system | The buddy system is the algorithm that the Linux kernel uses to manage memory pages. It divides memory blocks into 11 groups. In most cases, a memory page is 4 KB in size. The number of pages in a block doubles from one group to the next, so the block sizes are 4 KB, 8 KB, 16 KB, 32 KB, ..., 4 MB. |
| Slab | A memory allocator that allocates small pieces of memory based on the buddy system of Linux. |
| Vmalloc | A memory allocator that uses nonlinear mapping based on the buddy system of Linux. The allocated memory is contiguous in virtual address space but does not need to be physically contiguous. |
| filecache | When Linux reads or writes a file, it caches the file content in memory. This way, programs can directly read or write the content in memory, which is much faster than reading or writing the file on disk. |
| anonymous memory | Anonymous memory is memory that is dynamically allocated for the heap and stack of a process, for example through new, malloc, or mmap. Anonymous memory is not backed by a file system. |
| shared memory | A memory block shared by two or more processes for communication. |
| tmpfs | A temporary file system of Linux based on memory. The file system caches the content that it reads or writes in memory. |
| hugetlb | The amount of memory consumed by huge pages in a file system. |
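The formulas and figures in the terms above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the inputs are made-up sample values rather than output from the diagnostics feature.

```python
KIB = 1024
MIB = 1024 * KIB


def memory_utilization(total, free):
    """Memory utilization in percent: (total - free) * 100 / total.

    `free` is assumed to already include file caches, because the
    definition above counts file caches as free memory.
    """
    return (total - free) * 100 / total


def subpages_per_thp(thp_size=2 * MIB, subpage_size=4 * KIB):
    """Number of subpages that make up one THP."""
    return thp_size // subpage_size


def thp_waste_ratio(zero_thps, total_thps):
    """Ratio of THP waste in percent, following the formula above."""
    return zero_thps * 100 / total_thps


def buddy_block_sizes_kb(page_kb=4, groups=11):
    """Block size of each buddy-system group, doubling per group."""
    return [page_kb * 2 ** order for order in range(groups)]


# Hypothetical node: 16 GiB total, 4 GiB free (file caches included).
print(memory_utilization(16 * 1024 * MIB, 4 * 1024 * MIB))  # 75.0
print(subpages_per_thp())                                    # 512
# An 8 KiB request uses 2 subpages of a 2 MiB THP; the rest stay zero.
print(subpages_per_thp() - (8 * KIB) // (4 * KIB))            # 510
print(thp_waste_ratio(zero_thps=25, total_thps=100))          # 25.0
# 11 block sizes, from 4 KB up to 4096 KB (4 MB).
print(buddy_block_sizes_kb())
```

The 25 zero THPs out of 100 are arbitrary numbers chosen to make the ratio easy to read.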

Kernel memory

In most cases, abnormal memory usage by Sunreclaim or the buddy system indicates a kernel memory leak. Pay close attention to these two metrics when you view memory usage in kernel mode.

| Metric | Description |
| --- | --- |
| Sreclaimable | Slab memory that can be reclaimed. |
| Sunreclaim | Slab memory that cannot be reclaimed. |
| PageTables | Memory occupied by kernel page tables. |
| Vmalloc | Memory allocated by calling the Vmalloc function. |
| KernelStack | Total memory occupied by the kernel stacks of tasks. |
| AllocPages | Memory allocated directly from the buddy system by calling functions such as alloc_pages. This memory is not reported in any file on the node, so excessive use creates a memory "black hole": memory that is in use but cannot be attributed to any metric. |
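As a rough illustration of the kernel metrics above, the following Python sketch sums the corresponding fields from text in the /proc/meminfo format. The mapping from metric names to /proc/meminfo fields is an assumption, the sample values are invented, and this is not necessarily how the diagnostics feature collects its data. AllocPages is deliberately absent because, as noted above, no node file reports it.

```python
# Invented sample in /proc/meminfo format; values are in kB.
SAMPLE_MEMINFO = """\
MemTotal:       16384000 kB
MemFree:         2048000 kB
SReclaimable:     300000 kB
SUnreclaim:       150000 kB
PageTables:        40000 kB
VmallocUsed:       20000 kB
KernelStack:       10000 kB
"""

# Assumed mapping: metric name in this topic -> /proc/meminfo field.
KERNEL_FIELDS = {
    "Sreclaimable": "SReclaimable",
    "Sunreclaim": "SUnreclaim",
    "PageTables": "PageTables",
    "Vmalloc": "VmallocUsed",
    "KernelStack": "KernelStack",
}


def parse_meminfo(text):
    """Parse 'Name:  value kB' lines into a {name: value} dict."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def kernel_memory_kb(meminfo):
    """Return the kernel metrics that /proc/meminfo can account for."""
    return {metric: meminfo.get(field, 0)
            for metric, field in KERNEL_FIELDS.items()}


breakdown = kernel_memory_kb(parse_meminfo(SAMPLE_MEMINFO))
print(breakdown)
print(sum(breakdown.values()))  # 520000
```

On a real node, the sample string could be replaced with the contents of /proc/meminfo.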

Application memory

When you view the memory usage of applications in user mode, pay close attention to anonymous memory, shared memory, and file caches.

| Metric | Description |
| --- | --- |
| filecache | File caches, which can be reclaimed by dropping caches (for example, by writing to /proc/sys/vm/drop_caches). |
| anon | The anonymous memory occupied by the heap and stack of a program. If a large amount of anonymous memory is occupied, check for memory leaks in the process and check whether THPs are enabled. |
| mlock | Memory locked by the system. |
| huge | Memory occupied by huge pages. |
| buffer | The memory occupied by the metadata of the block device and file system. |
| shmem | Shared memory (tmpfs). Shared memory leaks occur if a tmpfs file is not deleted after the process that uses it is terminated, or if a tmpfs file is deleted while it is still open. |
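The application metrics above can be approximated from /proc/meminfo in a similar way. The sketch below is a hedged illustration: the field mapping is an assumption, the sample values are invented, and the diagnostics feature may compute these metrics differently. It subtracts Shmem from Cached because the Cached field in /proc/meminfo also counts tmpfs and shared memory.

```python
# Invented sample in /proc/meminfo format.
SAMPLE_MEMINFO = """\
Buffers:          200000 kB
Cached:          4000000 kB
AnonPages:       6000000 kB
Mlocked:            1000 kB
Shmem:            500000 kB
HugePages_Total:      64
Hugepagesize:       2048 kB
"""


def parse_meminfo(text):
    """Parse 'Name:  value [kB]' lines into a {name: value} dict."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def user_memory_kb(info):
    """Approximate the user-mode metrics in kB (assumed mapping)."""
    shmem = info.get("Shmem", 0)
    return {
        # Cached includes shared memory, so remove it to get file caches.
        "filecache": info.get("Cached", 0) - shmem,
        "anon": info.get("AnonPages", 0),
        "mlock": info.get("Mlocked", 0),
        # Number of huge pages times the size of one huge page.
        "huge": info.get("HugePages_Total", 0) * info.get("Hugepagesize", 0),
        "buffer": info.get("Buffers", 0),
        "shmem": shmem,
    }


metrics = user_memory_kb(parse_meminfo(SAMPLE_MEMINFO))
for name, value_kb in metrics.items():
    print(f"{name}: {value_kb} kB")
```

The values are not summed here because locked memory can overlap with anonymous memory or file caches.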

Memory analysis

Memory analysis consists of process memory analysis and pod memory analysis.

Process memory

Memory usage information is displayed by process, including anonymous memory, file caches, and shared memory.

Pod memory

The pod memory analysis feature allows you to view the files that occupy the file caches and shared memory of containers and pods, the ratio of active caches, and the ratio of inactive caches.

| Diagnostic item | Description |
| --- | --- |
| Pod | The name of the pod. |
| Container | The name of the container. |
| File | The full path of the file, which includes the file name. |
| Cache | The file cache (filecache) occupied by the file. |
| Container Cache | The file cache that the file occupies within the container. Different processes in a container may use the same file. |
| Active Cache | The file cache that is in use. |
| Inactive Cache | The file cache that is not in use. |
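To make the active and inactive cache ratios concrete, the following Python sketch computes them from text in the format of a cgroup memory.stat file, using its active_file and inactive_file counters. This is an assumption for illustration: the topic does not state how the feature derives these ratios, the feature reports them per file rather than per cgroup, and the sample values are invented.

```python
# Invented sample in the "key value" format of a cgroup memory.stat file.
# 75 MiB of active and 225 MiB of inactive file cache, in bytes.
SAMPLE_MEMORY_STAT = """\
anon 104857600
active_file 78643200
inactive_file 235929600
"""


def cache_ratios(stat_text):
    """Return (active %, inactive %) of the file cache in a cgroup."""
    stat = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stat[parts[0]] = int(parts[1])
    active = stat.get("active_file", 0)
    inactive = stat.get("inactive_file", 0)
    total = active + inactive
    if total == 0:
        # No file cache at all: avoid dividing by zero.
        return 0.0, 0.0
    return active * 100 / total, inactive * 100 / total


active_pct, inactive_pct = cache_ratios(SAMPLE_MEMORY_STAT)
print(f"active: {active_pct}%, inactive: {inactive_pct}%")
```

A high inactive ratio suggests that most of the cache is not currently in use.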

OOM analysis

The OOM analysis feature quickly diagnoses OOM errors and displays the following diagnostic items.

| Diagnostic item | Description |
| --- | --- |
| OS OOM Count | The total number of OOM errors that have occurred from the time when the host starts up to the time when the diagnostic is performed. |
| Available Memory | The amount of free system memory. |
| Low Watermark | The low watermark for free memory. When the amount of free memory drops below this threshold, an asynchronous memory reclaim operation is triggered. |
| Container | The name of the pod, ID of the container, or name of the cgroup. |
| limit | The memory limit of the container. |
| usage | The amount of memory used by the container. |
| OOM Count | The total number of OOM errors that have occurred in the container. |
| OOM Type | The type of OOM error, which can be Host or cgroup. |
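The distinction between the Host and cgroup OOM types can be illustrated by classifying kernel log lines. The Python sketch below is a minimal example under stated assumptions: it relies on the "Out of memory:" and "Memory cgroup out of memory:" prefixes that the Linux kernel prints when the OOM killer runs, whose exact wording varies across kernel versions. The sample log lines are invented, and this is not necessarily how the diagnostics feature detects OOM errors.

```python
import re
from collections import Counter

# Invented kernel log lines for illustration.
SAMPLE_LOG = """\
[12345.678901] Memory cgroup out of memory: Killed process 4321 (java) total-vm:8388608kB, anon-rss:2097152kB, file-rss:0kB, shmem-rss:0kB
[22345.000001] Out of memory: Killed process 987 (python3) total-vm:4194304kB, anon-rss:3145728kB, file-rss:0kB, shmem-rss:0kB
[22346.000002] usb 1-1: new high-speed USB device number 2 using xhci_hcd
"""

# Matches both "Killed process" and the older "Kill process" wording.
OOM_PATTERN = re.compile(
    r"(Memory cgroup out of memory|Out of memory): "
    r"Kill(?:ed)? process (\d+) \(([^)]+)\)"
)


def classify_oom_events(log_text):
    """Return a list of (oom_type, pid, process_name) tuples."""
    events = []
    for line in log_text.splitlines():
        match = OOM_PATTERN.search(line)
        if match is None:
            continue
        prefix, pid, name = match.groups()
        # A cgroup OOM is triggered by a memory limit; a host OOM by the
        # node itself running out of memory.
        oom_type = "cgroup" if prefix.startswith("Memory cgroup") else "Host"
        events.append((oom_type, int(pid), name))
    return events


events = classify_oom_events(SAMPLE_LOG)
counts = Counter(oom_type for oom_type, _, _ in events)
print(events)
print("Host:", counts["Host"], "cgroup:", counts["cgroup"])
```

On a real node, the sample string could be replaced with kernel log text, for example the output of dmesg.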