Alibaba Cloud Linux: Troubleshoot the OOM Killer

Last Updated: May 27, 2026

When memory reclamation cannot resolve low memory on a Linux system, the kernel invokes the OOM Killer to forcibly terminate processes. This topic covers common causes of OOM Killer events on Alibaba Cloud Linux and how to resolve them.

Problem description

The following log shows the test process triggering the OOM Killer:

[Sat Sep 11 12:24:42 2021] test invoked oom-killer: gfp_mask=0x62****(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[Sat Sep 11 12:24:42 2021] test cpuset=/ mems_allowed=0
[Sat Sep 11 12:24:42 2021] CPU: 1 PID: 29748 Comm: test Kdump: loaded Not tainted 4.19.91-24.1.al7.x86_64 #1
[Sat Sep 11 12:24:42 2021] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS e62**** 04/01/2014
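To locate such events in a long kernel log, a small filter can help. The following is a minimal sketch, not an official tool: the function name oom_events is hypothetical, and it only matches the two message patterns that appear in the logs in this topic.

```shell
# Print only the kernel log lines that mark an OOM Killer event.
# Reads log text on stdin, so it works with dmesg output or a saved log file.
# Assumption: the log uses the message wording shown in this topic.
oom_events() {
    grep -E 'invoked oom-killer|killed as a result of limit of'
}

# Usage (reading the kernel log may require root on some systems):
#   sudo dmesg -T | oom_events
```

The -T option of dmesg prints human-readable timestamps in the same format as the log above.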

Potential causes

OOM Killer events result from either insufficient global memory or insufficient cgroup memory. The following table lists common scenarios:

Cause

Scenario example

Insufficient cgroup memory

In this log, the OOM Killer fires for cgroup /mm_test containing the test process:

[Wed Sep  8 18:01:32 2021] test invoked oom-killer: gfp_mask=0x240****(GFP_KERNEL), nodemask=0, order=0, oom_score_adj=0
[Wed Sep  8 18:01:32 2021] Task in /mm_test killed as a result of limit of /mm_test
[Wed Sep  8 18:01:32 2021] memory: usage 204800kB, limit 204800kB, failcnt 26

Cause: The /mm_test cgroup reached its 200 MB memory limit.

Insufficient parent cgroup memory

Here, the test process in cgroup /mm_test/2 is killed because its parent cgroup /mm_test reached its memory limit:

[Fri Sep 10 16:15:14 2021] test invoked oom-killer: gfp_mask=0x240****(GFP_KERNEL), nodemask=0, order=0, oom_score_adj=0
[Fri Sep 10 16:15:14 2021] Task in /mm_test/2 killed as a result of limit of /mm_test
[Fri Sep 10 16:15:14 2021] memory: usage 204800kB, limit 204800kB, failcnt 1607

Cause: The parent cgroup /mm_test hit its 200 MB limit, triggering OOM Killer on the child cgroup /mm_test/2 even though the child had not reached its own limit.

Insufficient global memory

In this log, limit of host indicates insufficient global memory. Free memory (free) on node 0 has dropped below the low watermark (low):

[Sat Sep 11 12:24:42 2021] test invoked oom-killer: gfp_mask=0x62****(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0,
[Sat Sep 11 12:24:42 2021] Task in /user.slice killed as a result of limit of host
[Sat Sep 11 12:24:42 2021] Node 0 DMA32 free:155160kB min:152412kB low:190512kB high:228612kB
[Sat Sep 11 12:24:42 2021] Node 0 Normal free:46592kB min:46712kB low:58388kB high:70064kB

Cause: Free memory in the Node 0 Normal zone (46592 kB) fell below the min watermark (46712 kB), and memory reclamation could not free enough pages.
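The watermark comparison can be automated for longer logs. The following sketch assumes the per-zone line format shown in the log above (Node, zone name, then free:, min:, and low: values in kB). The function name zone_watermarks and the output format are hypothetical.

```shell
# For every per-zone line in an OOM log, report where free memory sits
# relative to the min and low watermarks. Reads log text on stdin.
zone_watermarks() {
    awk '
    # Strip the "name:" prefix; awk then reads the leading digits of "46592kB".
    function val(s) { sub(/^[a-z]+:/, "", s); return s + 0 }

    / free:[0-9]+kB/ && / min:[0-9]+kB/ {
        node = ""; zone = ""
        for (i = 1; i <= NF; i++) {
            if ($i == "Node")        { node = $(i + 1); zone = $(i + 2) }
            else if ($i ~ /^free:/)  free = val($i)
            else if ($i ~ /^min:/)   min  = val($i)
            else if ($i ~ /^low:/)   low  = val($i)
        }
        if (free < min)      state = "below min"
        else if (free < low) state = "below low"
        else                 state = "above low"
        printf "Node %s %s: free=%dkB min=%dkB low=%dkB (%s)\n", node, zone, free, min, low, state
    }'
}

# Usage:
#   sudo dmesg -T | zone_watermarks
```

For the log above, this reports the DMA32 zone as below low and the Normal zone as below min.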

Insufficient memory on a memory node

Key indicators in this log:

  • limit of host indicates a memory node ran out of memory.

  • The instance has two memory nodes: Node 0 and Node 1.

  • The free memory (free) on Node 1 is below its low watermark (low).

  • Total free memory remains high (free:4111496, counted in pages).

[Sat Sep 11 09:46:24 2021] main invoked oom-killer: gfp_mask=0x62****(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[Sat Sep 11 09:46:24 2021] main cpuset=mm_cpuset mems_allowed=1
[Sat Sep 11 09:46:24 2021] Task in / killed as a result of limit of host
[Sat Sep 11 09:46:24 2021] Mem-Info:
[Sat Sep 11 09:46:24 2021] active_anon:172 inactive_anon:4518735 isolated_anon:
    free:4111496 free_pcp:1 free_cma:0
[Sat Sep 11 09:46:24 2021] Node 1 Normal free:43636kB min:45148kB low:441424kB high:837700kB
[Sat Sep 11 09:46:24 2021] Node 1 Normal: 856*4kB (UME) 375*8kB (UME) 183*16kB (UME) 184*32kB (UME) 87*64kB (ME) 45*128kB (UME) 16*256kB (UME) 5*512kB (UE) 14*1024kB (UME) 0*2048kB 0*4096kB = 47560kB
[Sat Sep 11 09:46:24 2021] Node 0 hugepages_total=360 hugepages_free=360 hugepages_surp=0 hugepages_size=1048576kB
[Sat Sep 11 09:46:24 2021] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Sat Sep 11 09:46:24 2021] Node 1 hugepages_total=360 hugepages_free=360 hugepages_surp=0 hugepages_size=1048576kB
[Sat Sep 11 09:46:25 2021] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB

Cause: In a NUMA architecture, cpuset.mems can restrict a cgroup to specific memory nodes. If those nodes are exhausted, the OOM Killer fires even when other nodes have free memory. Run cat /proc/buddyinfo to view the free blocks in each node and zone.

Insufficient buddy system memory due to memory fragmentation

Key indicators in this log:

  • The OOM Killer was triggered by an order=3 allocation, that is, a request for 2^3 contiguous 4 KB pages (one 32 KB block).

  • Free memory (free) on node 0 is above the low watermark (low).

  • The buddy system has no blocks of the required size (0*32kB (M)).

[Sat Sep 11 15:22:46 2021] insmod invoked oom-killer: gfp_mask=0x60****(GFP_KERNEL), nodemask=(null), order=3, oom_score_adj=0
[Sat Sep 11 15:22:46 2021] insmod cpuset=/ mems_allowed=0
[Sat Sep 11 15:22:46 2021] Task in /user.slice killed as a result of limit of host
[Sat Sep 11 15:22:46 2021] Node 0 Normal free:23500kB min:15892kB low:19864kB high:23836kB active_anon:308kB inactive_anon:194492kB active_file:384kB inactive_file:420kB unevictable:0kB writepending:464kB present:917504kB managed:852784kB mlocked:0kB kernel_stack:2928kB pagetables:9188kB bounce:0kB
[Sat Sep 11 15:22:46 2021] Node 0 Normal: 1325*4kB (UME) 966*8kB (UME) 675*16kB (UME) 0*32kB (M) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB =

Cause: Memory fragmentation prevents the buddy system from allocating a contiguous block of the required size, even though total free memory is sufficient.

Note

The buddy system is a Linux kernel memory allocator that manages contiguous blocks of varying sizes to reduce fragmentation.
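As a first triage step, the "killed as a result of limit of" line separates cgroup-level causes from host-level ones. The following sketch classifies that line based only on the log wording shown in this topic. The function name classify_oom and its output text are hypothetical, and a host result still needs the watermark, mems_allowed, and order checks described above to narrow down the cause.

```shell
# Classify OOM events by comparing the cgroup of the killed task with the
# owner of the limit that was hit. Reads log text on stdin.
classify_oom() {
    sed -n 's/.*Task in \([^ ]*\) killed as a result of limit of \([^ ]*\).*/\1 \2/p' |
    while read -r task limit; do
        if [ "$limit" = "host" ]; then
            echo "host limit (global memory, memory node, or fragmentation); task cgroup: $task"
        elif [ "$task" = "$limit" ]; then
            echo "cgroup limit: $limit"
        else
            echo "parent cgroup limit: $limit; task cgroup: $task"
        fi
    done
}

# Usage:
#   sudo dmesg -T | classify_oom
```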

Solutions

Resolve the issue based on its cause.

Insufficient memory in a cgroup or parent cgroup

Terminate unnecessary processes to free memory. If your workload requires more memory, upgrade the instance.

  1. Upgrade the instance.

    For more information, see Configuration change overview.

  2. After upgrading, manually adjust the cgroup's memory limit.

    sudo bash -c 'echo <value> > /sys/fs/cgroup/memory/<cgroup_name>/memory.limit_in_bytes'

    Replace <value> with the new memory limit for the cgroup in bytes and <cgroup_name> with the name of your cgroup.
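Before and after changing the limit, it can help to compare the cgroup's current usage with its limit. The following sketch assumes a cgroup v1 memory hierarchy, as in the path above, and reads the memory.usage_in_bytes, memory.limit_in_bytes, and memory.failcnt files. The function names cgroup_mem_report and mib_to_bytes are hypothetical.

```shell
# Print usage, limit, and failure count for one cgroup v1 memory cgroup,
# in the same units as the "memory: usage ... limit ... failcnt ..." log line.
# $1: cgroup directory, for example /sys/fs/cgroup/memory/<cgroup_name>
cgroup_mem_report() (
    dir=$1
    usage=$(cat "$dir/memory.usage_in_bytes")
    limit=$(cat "$dir/memory.limit_in_bytes")
    failcnt=$(cat "$dir/memory.failcnt")
    echo "usage=$((usage / 1024))kB limit=$((limit / 1024))kB failcnt=$failcnt"
)

# Convert a size in MiB to the byte value that memory.limit_in_bytes expects.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Usage:
#   cgroup_mem_report /sys/fs/cgroup/memory/<cgroup_name>
#   mib_to_bytes 200
```

A usage value that sits at the limit together with a growing failcnt suggests that the cgroup is still hitting its limit.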

Insufficient global memory

Investigate the following areas:

  • Check the slab_unreclaimable memory usage.

    cat /proc/meminfo | grep "SUnreclaim"

    slab_unreclaimable is slab memory that the kernel cannot reclaim. If it exceeds 10% of total memory, a slab memory leak is likely. To troubleshoot, see "What do I do if an instance has a high percentage of slab_unreclaimable memory?" If the issue persists, submit a ticket.

  • Check the systemd memory usage.

    cat /proc/1/status | grep "RssAnon"

    The kernel never selects process 1 (systemd) as an OOM victim, so an OOM kill does not release memory held by systemd. The memory usage of systemd should not exceed 200 MB. If usage is abnormally high, update systemd to a newer version.

  • Review the performance of Transparent Huge Pages (THP).

    THP can cause memory bloating that leads to OOM events. To mitigate this, tune THP settings as described in How do I use THP to tune performance in Alibaba Cloud Linux?
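The first two checks above can be scripted. The following sketch computes SUnreclaim as a percentage of MemTotal, for comparison with the 10% guideline, and prints the RssAnon value of a process in kB, for comparison with the 200 MB guideline (204800 kB). The function names slab_unreclaim_pct and rss_anon_kb are hypothetical. Both accept an optional file path so that they can also be run against saved copies of the files.

```shell
# Print SUnreclaim as a percentage of MemTotal.
# $1: optional meminfo file (defaults to /proc/meminfo)
slab_unreclaim_pct() {
    awk '
    $1 == "MemTotal:"   { total = $2 }
    $1 == "SUnreclaim:" { slab = $2 }
    END { if (total > 0) printf "%.1f\n", slab * 100 / total }
    ' "${1:-/proc/meminfo}"
}

# Print the anonymous resident memory (RssAnon) of a process in kB.
# $1: optional status file (defaults to /proc/1/status, that is, systemd)
rss_anon_kb() {
    awk '$1 == "RssAnon:" { print $2 }' "${1:-/proc/1/status}"
}

# Usage:
#   slab_unreclaim_pct
#   rss_anon_kb
```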

Insufficient memory on a memory node

Reconfigure cpuset.mems to allow the cgroup to use memory from additional nodes.

  1. Identify the memory nodes in your system.

    cat /proc/buddyinfo

  2. Configure the cpuset.mems parameter.

    sudo bash -c 'echo <value> > /sys/fs/cgroup/cpuset/<cgroup_name>/cpuset.mems'

    Replace <value> with the corresponding memory node numbers and <cgroup_name> with the name of your cgroup.

    For example, if your system has three nodes (Node 0, Node 1, and Node 2) and you want the cgroup to use memory from Node 0 and Node 2, set <value> to 0,2.
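To decide which nodes to add, it can help to see how much free memory each node has. The following sketch sums the per-order block counts in /proc/buddyinfo for each node. It assumes 4 kB base pages, which matches the smallest block size in the logs in this topic. The function name node_free_kb is hypothetical.

```shell
# Sum the free memory that /proc/buddyinfo reports for each NUMA node.
# Assumption: 4 kB base pages, so the order-N column counts blocks of 4 * 2^N kB.
# $1: optional buddyinfo file (defaults to /proc/buddyinfo)
node_free_kb() {
    awk '
    $1 == "Node" {
        node = $2; sub(/,$/, "", node)       # "0," -> "0"
        for (i = 5; i <= NF; i++)            # field 5 is order 0
            free[node] += $i * 4 * 2 ^ (i - 5)
    }
    END { for (n in free) printf "node %s: %d kB free\n", n, free[n] }
    ' "${1:-/proc/buddyinfo}" | sort
}

# Usage:
#   node_free_kb
```

The result covers only memory on the buddy free lists. It does not include memory reserved for huge pages.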

Insufficient buddy system memory due to memory fragmentation

Perform memory compaction during off-peak hours:

sudo bash -c 'echo 1 > /proc/sys/vm/compact_memory'
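To check whether compaction helped, count the free blocks that are large enough for the failed allocation before and after running the command. The following sketch reads /proc/buddyinfo and assumes 4 kB base pages. The function names order_to_kb and blocks_at_or_above_order are hypothetical.

```shell
# Convert an allocation order to a block size in kB (assumes 4 kB pages).
order_to_kb() {
    echo $(( (1 << $1) * 4 ))
}

# For each node and zone, count free blocks that can satisfy an allocation
# of the given order (blocks of that order or larger).
# $1: order from the OOM log; $2: optional buddyinfo file
blocks_at_or_above_order() {
    awk -v order="$1" '
    $1 == "Node" {
        count = 0
        for (i = 5 + order; i <= NF; i++)    # field 5 is order 0
            count += $i
        node = $2; sub(/,$/, "", node)
        printf "node %s zone %s: %d free blocks of order >= %d\n", node, $4, count, order
    }' "${2:-/proc/buddyinfo}"
}

# Usage, for the order=3 failure shown in this topic:
#   order_to_kb 3
#   blocks_at_or_above_order 3
```

A count of 0 for the relevant zone means that an allocation of that order cannot be satisfied without compaction or reclamation.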