Alibaba Cloud Linux 2 provides the Transparent Huge Page (THP) reclaim feature in kernel versions 4.19.91-24.al7 and later. You can use the THP reclaim feature to fix memory usage issues caused by THP, such as out-of-memory (OOM) errors. This topic describes the THP reclaim feature and provides examples of how to configure its operations.

Background information

In Linux operating systems, memory is managed in blocks known as pages. THPs are blocks of memory that come in 2 MiB and 1 GiB sizes, while the size of a regular memory page (subpage) is 4 KiB. Applications that consume large amounts of memory incur high address translation overhead. When such applications request memory, the kernel dynamically allocates THPs to reduce Translation Lookaside Buffer (TLB) misses and improve application performance.

However, THPs may cause memory bloat. When the THP feature is enabled, the kernel allocates 2 MiB blocks of memory as THPs, each of which is equivalent to 512 subpages, even if the application uses only a few of those subpages. The unused subpages still count toward memory usage, so the memory consumed can be much larger than what the application actually needs.

Memory bloat may lead to OOM errors. For example, when an application that requests only 8 KiB of memory (2 subpages) is assigned a 2 MiB THP, the remaining 510 subpages are zero subpages. These zero subpages inflate the resident set size (RSS) without holding any data and may eventually result in an OOM error.
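
The arithmetic in this example can be checked with a short shell sketch. The helper name thp_bloat is hypothetical and is not part of any tool. The 4 KiB subpage size and 2 MiB THP size are taken from the description above.

```shell
#!/bin/sh
# Illustrative arithmetic only; thp_bloat is a hypothetical helper name.
SUBPAGE_KIB=4      # size of a regular page (subpage)
THP_KIB=2048       # size of a 2 MiB THP

# thp_bloat USED_KIB: print how many subpages of a single THP stay
# zero-filled when an application touches only USED_KIB of it.
thp_bloat() {
    used_kib=$1
    total=$((THP_KIB / SUBPAGE_KIB))
    # Round the used size up to whole subpages.
    used=$(( (used_kib + SUBPAGE_KIB - 1) / SUBPAGE_KIB ))
    zero=$((total - used))
    echo "subpages=$total used=$used zero=$zero wasted_kib=$((zero * SUBPAGE_KIB))"
}

thp_bloat 8
```

For the 8 KiB request described above, the sketch prints subpages=512 used=2 zero=510 wasted_kib=2040, which means that almost the entire 2 MiB THP holds no data.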

To fix memory issues caused by THPs, Alibaba Cloud Linux 2 provides the THP reclaim feature for memory control groups (memcgs). THP reclaim is a mechanism that splits THPs and reclaims zero subpages to avoid OOM errors caused by memory bloat. However, using this feature may cause memory performance degradation.
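
To make the notion of a zero subpage concrete, the following sketch counts all-zero 4 KiB blocks in an ordinary file and compares the count with a threshold. This is only a user-space analogy: the kernel inspects THPs directly, and the helper names count_zero_pages and should_reclaim are hypothetical.

```shell
#!/bin/sh
# User-space analogy only. The kernel does not scan files; these
# hypothetical helpers just illustrate what "zero subpage" means.
PAGE_SIZE=4096

# count_zero_pages FILE NPAGES: print how many of the first NPAGES
# 4 KiB blocks of FILE contain only zero bytes. Blocks past the end
# of the file produce no data and are therefore also counted as zero.
count_zero_pages() {
    file=$1
    npages=$2
    i=0
    zero=0
    while [ "$i" -lt "$npages" ]; do
        # Delete NUL bytes; if nothing is left, the block was all zeros.
        left=$(dd if="$file" bs="$PAGE_SIZE" skip="$i" count=1 2>/dev/null |
               tr -d '\000' | wc -c)
        if [ "$left" -eq 0 ]; then
            zero=$((zero + 1))
        fi
        i=$((i + 1))
    done
    echo "$zero"
}

# should_reclaim ZERO_PAGES THRESHOLD: succeed only when the number of
# zero pages exceeds the threshold.
should_reclaim() {
    [ "$1" -gt "$2" ]
}
```

For example, count_zero_pages /path/to/file 512 scans one 2 MiB region, and should_reclaim "$count" 16 then mirrors the "exceeds the threshold" rule with a threshold of 16. The file path here is a placeholder.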

Operations

The following entries describe the operations of the THP reclaim feature.

Operation: memory.thp_reclaim
Description: Enables or disables the THP reclaim feature. Valid values:
  • reclaim: enables the THP reclaim feature.
  • swap: reserved for future use.
  • disable: disables the THP reclaim feature.
Default value: disable.

Operation: memory.thp_reclaim_stat
Description: Queries the state of the THP reclaim feature. Description of response parameters:
  • queue_length: the queue length of each node. When the THP reclaim feature is enabled, THPs are added to a reclaim queue.
  • split_hugepage: the total number of THPs of each node that have been split.
  • reclaim_subpage: the total number of zero subpages of each node that have been reclaimed.
The values of these parameters are listed for each NUMA node in ascending order of node IDs (node0, node1) from left to right.

Operation: memory.thp_reclaim_ctrl
Description: Controls how the THP reclaim feature is triggered. Description of request parameters:
  • threshold: The upper limit on the number of zero subpages in a THP. If the number of zero subpages exceeds the specified limit, the THP reclaim feature is triggered immediately. The default value of the threshold parameter is 16.
  • reclaim: manually triggers the THP reclaim feature.

Operation: /sys/kernel/mm/transparent_hugepage/reclaim
Description: The global operation. If you do not want to configure each memory control group, you can use this operation. Valid values:
  • memcg: the default value. This value specifies that each memory control group uses its own configuration that is configured by using memory.thp_reclaim.
  • reclaim: forcibly enables the THP reclaim feature for all memory control groups.
  • swap: reserved for future use.
  • disable: forcibly disables the THP reclaim feature for all memory control groups.
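
The relationship between the global operation and the per-memcg memory.thp_reclaim setting can be summarized as a small decision function. This is a sketch of the rules described above, not kernel code. The function name thp_effective_mode is hypothetical, and the reserved swap value is deliberately not modeled.

```shell
#!/bin/sh
# thp_effective_mode GLOBAL MEMCG: print the setting that applies to a
# memory control group (hypothetical helper).
#   GLOBAL: value of /sys/kernel/mm/transparent_hugepage/reclaim
#   MEMCG:  value of that group's memory.thp_reclaim
thp_effective_mode() {
    global=$1
    memcg=$2
    case $global in
        reclaim|disable)
            # The global value is forced on all memory control groups.
            echo "$global" ;;
        memcg)
            # Default: each group uses its own memory.thp_reclaim value.
            echo "$memcg" ;;
        *)
            # swap is reserved for future use and is not modeled here.
            echo "unsupported global value: $global" >&2
            return 1 ;;
    esac
}

thp_effective_mode memcg reclaim     # prints: reclaim
thp_effective_mode disable reclaim   # prints: disable
```

The second call shows that a memory control group configured with reclaim is still treated as disabled when the global operation is set to disable.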

Configure the operations

This example describes how to configure the operations that implement the THP reclaim feature. In the example, a memory control group named test is used.

  1. Run the following command to create a memory control group named test:
    mkdir /sys/fs/cgroup/memory/test/
  2. Run the following command to enable the THP reclaim feature for test:
    echo reclaim > /sys/fs/cgroup/memory/test/memory.thp_reclaim
  3. Run the following command to check whether the THP reclaim feature is enabled for test:
    cat /sys/fs/cgroup/memory/test/memory.thp_reclaim
    The setting that takes effect is enclosed in a pair of brackets ([]). If [reclaim] is shown in the command output, the THP reclaim feature is enabled for test.
  4. Run the following command to forcibly enable the THP reclaim feature by using the global operation:
    echo reclaim > /sys/kernel/mm/transparent_hugepage/reclaim
    If you want to forcibly disable the THP reclaim feature, run the following command:
    echo disable > /sys/kernel/mm/transparent_hugepage/reclaim
    Note When the value of the global operation is reclaim or disable, the configurations of the global operation take precedence over the memory.thp_reclaim operation. However, the configurations for memory.thp_reclaim in individual memory control groups are not affected.
  5. Run the following command to use the memory.thp_reclaim_ctrl operation to set the upper limit on the number of zero subpages in a THP for test:
    echo "threshold 32" >  /sys/fs/cgroup/memory/test/memory.thp_reclaim_ctrl
    In this case, if the number of zero subpages in a THP exceeds 32, zero subpage reclaim is triggered.
  6. Manually trigger zero subpage reclaim.
    When triggered, the THP reclaim feature reclaims the zero subpages of THPs in which the number of zero subpages exceeds the value of threshold. To trigger zero subpage reclaim manually, write reclaim to the memory.thp_reclaim_ctrl operation.
    Note Writing reclaim into the /sys/fs/cgroup/memory/test/memory.thp_reclaim_ctrl file is the only way to use this file to trigger zero subpage reclaim. You cannot use the cat command to view the memory.thp_reclaim_ctrl operation configurations.
    • Run the following command to trigger the zero subpage reclaim feature for the current memory control group:
      echo "reclaim 1" >  /sys/fs/cgroup/memory/test/memory.thp_reclaim_ctrl
    • Run the following command to recursively trigger the zero subpage reclaim feature for the current memory control group and its sub control groups:
      echo "reclaim 2" >  /sys/fs/cgroup/memory/test/memory.thp_reclaim_ctrl
    In addition to being manually triggered by writing reclaim to memory.thp_reclaim_ctrl, zero subpage reclaim is automatically triggered by memory reclaim in the following cases:
    • When an OOM error occurs, the zero subpage reclaim feature is triggered.
    • When the backend asynchronous reclaim feature is triggered, the zero subpage reclaim feature is triggered. For more information about the backend asynchronous reclaim feature, see Memcg backend asynchronous reclaim.
  7. Run the following command to view the state of the THP reclaim feature for test:
    cat  /sys/fs/cgroup/memory/test/memory.thp_reclaim_stat
    A command output similar to the following one is displayed:
    queue_length        14
    split_hugepage     523
    reclaim_subpage 256207
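
The outputs of steps 3 and 7 can also be read by a script. The following sketch assumes that memory.thp_reclaim prints its valid values on one line with the active one in brackets (the exact layout is an assumption; only the bracket convention is documented here) and that memory.thp_reclaim_stat prints one parameter per line followed by one value per NUMA node. The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for reading the THP reclaim files of a memcg.
CG=/sys/fs/cgroup/memory/test

# thp_active_setting [FILE]: print the value enclosed in brackets.
thp_active_setting() {
    file=${1:-$CG/memory.thp_reclaim}
    sed -n 's/.*\[\([a-z]*\)].*/\1/p' "$file"
}

# thp_stat_total KEY [FILE]: sum the per-node values of KEY, for
# example split_hugepage. Fails if KEY is not present in FILE.
thp_stat_total() {
    key=$1
    file=${2:-$CG/memory.thp_reclaim_stat}
    awk -v key="$key" '
        $1 == key {
            sum = 0
            for (i = 2; i <= NF; i++) sum += $i   # one column per NUMA node
            print sum + 0
            found = 1
        }
        END { exit !found }' "$file"
}
```

For example, thp_stat_total reclaim_subpage prints the number of reclaimed zero subpages summed over all NUMA nodes. With the single-node sample output shown in step 7, the result would be 256207.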

Sample C code

This section provides sample C code in which an application requests THPs. The results differ depending on whether the THP reclaim feature is enabled or disabled.

  1. Run the following command to set the value of memory.limit_in_bytes to 1 GiB:
    echo 1G > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
    Then, run the following command to query the value of memory.limit_in_bytes:
    cat /sys/fs/cgroup/memory/test/memory.limit_in_bytes
    The command output shows the limit in bytes. For a limit of 1 GiB, the value is 1073741824.
  2. Run the following command to disable the memcg backend asynchronous reclaim feature:
    echo 0 > /sys/fs/cgroup/memory/test/memory.wmark_ratio
    For more information about the memcg backend asynchronous reclaim feature, see Memcg backend asynchronous reclaim.
  3. Compile the following C code and run it in the test memory control group, once when the THP reclaim feature is enabled and once when the THP reclaim feature is disabled:
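    // The application requests 1 GiB of memory (512 THPs). Only the first half of
    // each of the first 10 THPs is written, so these 10 THPs contain zero subpages.
    #include <stdlib.h>   // posix_memalign
    #include <string.h>   // memset
    #include <unistd.h>   // pause

    #define HUGEPAGE_SIZE (4096 * 512)   // 2 MiB

    int main(void)
    {
            int i, thp = 512;
            char *addr;

            // Request a 2 MiB-aligned region so that it can be backed by THPs.
            if (posix_memalign((void **)&addr, HUGEPAGE_SIZE, (size_t)HUGEPAGE_SIZE * thp) != 0)
                    return 1;

            // Write only the first half of each of the first 10 THPs.
            for (i = 0; i < 10; i++) {
                    memset(addr, 0xc, HUGEPAGE_SIZE >> 1);
                    addr += HUGEPAGE_SIZE;
            }

            // Write the remaining THPs completely.
            for (; i < thp; i++) {
                    memset(addr, 0xc, HUGEPAGE_SIZE);
                    addr += HUGEPAGE_SIZE;
            }

            // Keep the process alive so that its memory usage can be inspected.
            pause();
            return 0;
    }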
    The results vary based on whether the THP reclaim feature is enabled.
    • When the THP reclaim feature is disabled, OOM errors occur when the last THP is requested.
    • When the THP reclaim feature is enabled, the THPs allocated for applications are split into subpages and zero subpages are reclaimed. In this case, OOM errors do not occur.
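
The amount of memory that zero subpage reclaim can return in this sample follows from the sizes used in the C code. The following sketch only performs that arithmetic. The function name sample_reclaimable is hypothetical, and the calculation assumes that every partially written THP is actually backed by a 2 MiB THP.

```shell
#!/bin/sh
# Arithmetic for the sample program; sample_reclaimable is a
# hypothetical helper name.
SUBPAGES_PER_THP=512
SUBPAGE_BYTES=4096

# sample_reclaimable PARTIAL_THPS WRITTEN_BYTES: print the number of
# zero subpages, and the bytes they occupy, when PARTIAL_THPS huge
# pages each have only their first WRITTEN_BYTES bytes written.
sample_reclaimable() {
    partial_thps=$1
    written_bytes=$2
    # Subpages touched per THP, rounded up to whole subpages.
    used=$(( (written_bytes + SUBPAGE_BYTES - 1) / SUBPAGE_BYTES ))
    zero_each=$((SUBPAGES_PER_THP - used))
    zero_total=$((partial_thps * zero_each))
    echo "zero_subpages=$zero_total reclaimable_bytes=$((zero_total * SUBPAGE_BYTES))"
}

# 10 THPs, each with only the first half (HUGEPAGE_SIZE >> 1) written.
sample_reclaimable 10 $((4096 * 512 / 2))
```

The sketch prints zero_subpages=2560 reclaimable_bytes=10485760. Each of the 10 half-written THPs holds 256 zero subpages, which exceeds the default threshold of 16, so on this arithmetic about 10 MiB can be returned to the memory control group when the THP reclaim feature is enabled.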