What do I do if a memory leak occurs on the Intel SGX driver of an Alibaba Cloud Linux 2 ECS instance?

This topic describes the cause of and solutions to a memory leak in the Intel Software Guard Extensions (SGX) driver on an Elastic Compute Service (ECS) instance that runs Alibaba Cloud Linux 2.

Problem description

When a memory leak occurs on the Intel SGX driver of an Alibaba Cloud Linux 2 instance with the following configurations, system memory may be exhausted:

Image: Alibaba Cloud Linux 2.1903 LTS 64-bit.
Kernel version: kernel-4.19.91-23.al7 or earlier. You can run the uname -r command to view the kernel version.
Instance family: c7t, r7t, or g7t.
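
The kernel condition above can be checked with a short script. The following is a minimal sketch, not an official tool: the function name `is_affected_kernel` is hypothetical, and the comparison relies on GNU `sort -V`, which only approximates RPM version ordering. It treats any release that sorts at or below 4.19.91-23.al7 as affected.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if the kernel release is at or below
# 4.19.91-23.al7, the last affected release named in this topic.
# With no argument, the running kernel (uname -r) is checked.
is_affected_kernel() {
    local release="${1:-$(uname -r)}"
    local last_affected="4.19.91-23.al7"
    release="${release%.x86_64}"    # strip the architecture suffix
    # If last_affected sorts last (or equal), release <= last_affected.
    [ "$(printf '%s\n%s\n' "$release" "$last_affected" | sort -V | tail -n 1)" = "$last_affected" ]
}

# Usage: is_affected_kernel && echo "kernel is in the affected range"
```

The script checks only the kernel version. The image and instance family conditions still need to be confirmed separately.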

Most of the memory is occupied by the Intel SGX test application, and the kernel log reports an out-of-memory (OOM) error similar to the following:

[   71.938733] systemd-journal invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[   71.938735] systemd-journal cpuset=/ mems_allowed=0
[   71.938738] CPU: 0 PID: 415 Comm: systemd-journal Not tainted 4.19.91-23.al7.x86_64 #1
[   71.938738] Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 0.0.0 02/06/2015
[   71.938739] Call Trace:
[   71.938746]  dump_stack+0x66/0x8b
[   71.938749]  dump_global_header+0x12/0x10f
[   71.938750]  oom_kill_process+0x2cf/0x310
[   71.938752]  out_of_memory+0xf7/0x4c0
[   71.938754]  __alloc_pages_nodemask+0xf07/0xfd0
[   71.938757]  ? blk_flush_plug_list+0xd7/0x220
[   71.938759]  pagecache_get_page+0x8c/0x350
[   71.938761]  filemap_fault+0x37e/0x6e0
[   71.938764]  ext4_filemap_fault+0x2c/0x3b
[   71.938766]  __do_fault+0x38/0x170
[   71.938768]  do_fault+0x2eb/0x640
[   71.938769]  __handle_mm_fault+0x621/0xa20
[   71.938772]  ? apic_timer_interrupt+0xa/0x20
[   71.938774]  handle_mm_fault+0x106/0x1c0
[   71.938776]  __do_page_fault+0x1ba/0x480
[   71.938778]  do_page_fault+0x32/0x140
[   71.938780]  ? async_page_fault+0x8/0x30
[   71.938781]  async_page_fault+0x1e/0x30
[   71.938782] RIP: 0033:0x55a1ca49516f
[   71.938786] Code: Bad RIP value.
[   71.938787] RSP: 002b:00007ffcd58b22b0 EFLAGS: 00010246
[   71.938788] RAX: 0000000000000000 RBX: 000055a1cbcc4400 RCX: a1fcdcf819d7e1e5
[   71.938788] RDX: 00007f3d4d72a000 RSI: 000055a1cbcc2060 RDI: 000055a1cbcc4400
[   71.938789] RBP: a1fcdcf819d7e1e5 R08: 00007ffcd58b23b0 R09: 00007ffcd58b23a8
[   71.938790] R10: 000055a1ca49a935 R11: 00000000d1ba4319 R12: 000055a1cbcc4400
[   71.938790] R13: 0000000000000011 R14: 000055a1cbcc2060 R15: a1fcdcf819d7e1e5
[   71.938791] Task in / killed as a result of limit of host
[   71.938792] Mem-Info:
[   71.938795] active_anon:85 inactive_anon:410619 isolated_anon:0
 active_file:150 inactive_file:353 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 slab_reclaimable:6038 slab_unreclaimable:17336
 mapped:98 shmem:403568 pagetables:1793 bounce:0
 free:12881 free_pcp:440 free_cma:0
[   71.938797] Node 0 active_anon:340kB inactive_anon:1642476kB active_file:600kB inactive_file:1412kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:392kB dirty:0kB writeback:0kB shmem:1614272kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 2048kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[   71.938798] Node 0 DMA free:7408kB min:392kB low:488kB high:584kB active_anon:0kB inactive_anon:8312kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15996kB managed:15908kB mlocked:0kB kernel_stack:0kB pagetables:16kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[   71.938800] lowmem_reserve[]: 0 1761 1761 1761 1761
[   71.938801] Node 0 DMA32 free:44116kB min:44660kB low:55824kB high:66988kB active_anon:340kB inactive_anon:1633492kB active_file:688kB inactive_file:1812kB unevictable:0kB writepending:0kB present:1914960kB managed:1826408kB  mlocked:0kB kernel_stack:2208kB pagetables:7156kB bounce:0kB free_pcp:1760kB local_pcp:1396kB free_cma:0kB
[   71.938804] lowmem_reserve[]: 0 0 0 0 0
[   71.938805] Node 0 DMA: 0*4kB 2*8kB (UM) 2*16kB (UE) 0*32kB 1*64kB (E) 3*128kB (UME) 3*256kB (UME) 2*512kB (ME) 3*1024kB (UME) 1*2048kB (E) 0*4096kB = 7408kB
[   71.938810] Node 0 DMA32: 233*4kB (UMEH) 158*8kB (UMEH) 177*16kB (UMEH) 79*32kB (UEH) 34*64kB (UMEH) 16*128kB (UMEH) 6*256kB (E) 3*512kB (UE) 3*1024kB (ME) 3*2048kB (UME) 5*4096kB (M) = 44548kB
[   71.938815] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[   71.938816] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[   71.938816] 404127 total pagecache pages
[   71.938817] 0 pages in swap cache
[   71.938818] Swap cache stats: add 0, delete 0, find 0/0
[   71.938818] Free swap  = 0kB
[   71.938819] Total swap = 0kB
[   71.938819] 482739 pages RAM
[   71.938820] 0 pages HighMem/MovableOnly
[   71.938820] 22160 pages reserved
[   71.938820] 0 pages cma reserved
[   71.938821] 0 pages hwpoisoned
[   71.938821] Tasks state (memory values in pages):
[   71.938822] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[   71.938824] [    415]     0   415    11814       85   147456        0             0 systemd-journal
[   71.938826] [    439]     0   439    11430      228   118784        0         -1000 systemd-udevd
[   71.938827] [    550]     0   550    22654      218   212992        0             0 rngd
[   71.938828] [    554]    81   554    15051      155   167936        0          -900 dbus-daemon
[   71.938829] [    573]     0   573    48803      120   180224        0             0 gssproxy
[   71.938830] [    585]     0   585     6598       91    98304        0             0 systemd-logind
[   71.938831] [    587]     0   587     4456      115    61440        0             0 assist_daemon
[   71.938832] [    597]    32   597    17316      135   188416        0             0 rpcbind
[   71.938833] [    601]     0   601    31598      153   106496        0             0 crond
[   71.938834] [    606]   997   606    29454      129   143360        0             0 chronyd
[   71.938835] [    616]     0   616    27553       33    57344        0             0 agetty
[   71.938836] [    819]     0   819    25740      516   221184        0             0 dhclient
[   71.938837] [    887]     0   887   121900      708   430080        0             0 rsyslogd
[   71.938838] [    953]     0   953    10512      391   102400        0             0 AliYunDunUpdate
[   71.938839] [   1078]     0  1078    32317      732   274432        0             0 AliYunDun
[   71.938840] [   1235]     0  1235    28237      261   266240        0         -1000 sshd
[   71.938841] [   1283]     0  1283    39209      337   348160        0             0 sshd
[   71.938842] [   1292]     0  1292    29086      317    90112        0             0 bash
[   71.938843] [   1310]     0  1310    87597      530   311296        0          -900 abrt-dbus
[   71.938844] [   1397]     0  1397    39209      347   348160        0             0 sshd
[   71.938845] [   1399]     0  1399    29080      279    81920        0             0 bash
[   71.938846] [   1430]     0  1430    27028       25    77824        0             0 dmesg
[   71.938847] [   1431]     0  1431  8392985       92  3219456        0             0 app
[   71.938848] [   1432]     0  1432    39209      339   356352        0             0 sshd
[   71.938849] [   1434]     0  1434    29053      276    81920        0             0 bash
[   71.938850] [   1470]     0  1470     2146       23    57344        0             0 systemd-cgroups
[   71.938851] [   1471]     0  1471     2146       23    57344        0             0 systemd-cgroups
[   71.938852] [   1472]     0  1472     2146       23    53248        0             0 systemd-cgroups
[   71.938853] [   1473]     0  1473     2143       15    57344        0             0 systemd-cgroups
[   71.938854] Out of memory: Kill process 1431 (app) score 1 or sacrifice child
[   71.939026] Killed process 1431 (app) total-vm:33571940kB, anon-rss:320kB, file-rss:48kB, shmem-rss:0kB
[   71.942922] oom_reaper: reaped process 1431 (app), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
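
To look for the same pattern on an instance, the kernel log can be filtered for OOM kill records. The following is a minimal sketch that assumes the "Killed process" line format shown in the output above; the function name `oom_kills` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: reads kernel log text on stdin and prints
# "<pid> <name> <total-vm in kB>" for each "Killed process" record.
oom_kills() {
    sed -n 's/.*Killed process \([0-9]*\) (\(.*\)) total-vm:\([0-9]*\)kB.*/\1 \2 \3/p'
}

# Usage (reading the kernel log may require root privileges):
#   dmesg | oom_kills
```

For the sample output above, the function prints `1431 app 33571940`. Note that in that output the killed process itself has a small resident set (anon-rss:320kB), while shmem accounts for 1614272kB.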

Cause

The sgx_encl_mm_release_deferred function in arch/x86/kernel/cpu/sgx/encl.c does not correctly handle the reference count of the encl structure. When a process that uses Enclave Page Cache (EPC) memory is forked, the reference count of encl remains non-zero, so the encrypted memory (EPC) is never released and leaks. After the physical EPC memory is exhausted, shared memory is used to swap out the encrypted memory, which eventually exhausts the non-encrypted memory.
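
Because the swapped-out encrypted memory lands in shared memory, the leak shows up as a growing Shmem value (the sample output reports shmem:1614272kB on an instance with roughly 1.9 GB of RAM). The following is a minimal sketch for watching that share; the function name `shmem_percent` is hypothetical, and it assumes the standard `MemTotal:` and `Shmem:` fields of /proc/meminfo.

```shell
#!/bin/bash
# Hypothetical helper: prints shared memory as an integer percentage
# of total memory. Reads /proc/meminfo unless another file is given.
shmem_percent() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^MemTotal:/ { total = $2 }
         /^Shmem:/    { shmem = $2 }
         END { if (total > 0) printf "%d\n", shmem * 100 / total }' "$meminfo"
}

# Usage: sample the value every 5 seconds while the SGX application runs
#   while sleep 5; do shmem_percent; done
```

A value that keeps climbing while SGX processes are repeatedly forked is consistent with the leak described above, although other workloads that use tmpfs or shared memory raise the same counter.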

Solutions

Warning

Kernel upgrades may cause compatibility and stability issues. Review the kernel features in release notes for Alibaba Cloud Linux 2 and exercise caution when you upgrade the kernel version.
The restart operation temporarily stops the instance, which may interrupt running services and lead to data loss. Therefore, back up critical instance data and then restart the instance during off-peak hours.

If the instance's kernel version is 4.19.91-23.al7.x86_64 or earlier, perform the following steps:
1. Upgrade the kernel to the latest version.
```
sudo yum update kernel
```
2. Restart the instance for the new kernel version to take effect.
```
sudo reboot
```
Alternatively, if the instance's kernel version is exactly 4.19.91-23.al7.x86_64, you can install the kernel live patch instead of upgrading the kernel.
```
sudo yum install -y kernel-hotfix-5577959-`uname -r | awk -F"-" '{print $NF}'`
```
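
The backquoted expression in the preceding command takes the part of the kernel release after the last hyphen, so on a 4.19.91-23.al7.x86_64 kernel the package name expands to kernel-hotfix-5577959-23.al7.x86_64. The following is a minimal sketch that reproduces the expansion so that the name can be inspected before or after installation; the function name `hotfix_pkg_name` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: builds the live patch package name for a kernel
# release, using the same "text after the last hyphen" rule as the
# awk expression in the install command.
hotfix_pkg_name() {
    local release="${1:-$(uname -r)}"
    printf 'kernel-hotfix-5577959-%s\n' "${release##*-}"
}

# Usage: confirm that the package is installed
#   rpm -q "$(hotfix_pkg_name)"
```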