Performance optimization for Transparent Huge Pages (THP) in Alibaba Cloud Linux 2

Last Updated:Aug 28, 2020

Disclaimer: this document may contain information about third-party products that are for reference only. Alibaba Cloud does not make any guarantee, express or implied, with respect to the performance and reliability of third-party products, as well as potential impacts of operations on the products.

Overview

This article describes performance optimization methods related to Transparent Huge Pages (THP) in Alibaba Cloud Linux 2. For an introduction to the THP concept, see the More Information section.

Description

Alibaba Cloud reminds you that:

  • If you perform risky operations on an instance or its data, pay attention to the disaster tolerance and fault tolerance capabilities of the instance to ensure data security.
  • If you modify the configuration and data of an instance (including but not limited to ECS and RDS), we recommend that you create snapshots or enable RDS log backup.
  • If you have granted permissions on the Alibaba Cloud platform or submitted security information such as the logon account and password, we recommend that you modify the information as soon as possible.

User notification

In Alibaba Cloud Linux 2 kernel version 4.19.81-17.2 and earlier, the default THP setting is madvise, that is, THP is enabled only for memory regions that request it through the madvise() system call. This setting is inconsistent with the defaults of other mainstream operating systems, such as RHEL 7, CentOS 7, and Amazon Linux 2. To stay compatible with those defaults, starting from kernel version 4.19.91-18, Alibaba Cloud Linux 2 sets the default THP setting to always, that is, THP is enabled globally.

Function switch

In the Alibaba Cloud Linux 2 kernel, the global THP switch is the file /sys/kernel/mm/transparent_hugepage/enabled. The available values are as follows:

  • always
    THP is enabled globally.
    Attention: In Alibaba Cloud Linux 2 kernel version 4.19.91-18 and later, the default value is always.
  • never
    THP is disabled globally.
  • madvise
    THP is enabled only in memory regions that are marked with MADV_HUGEPAGE through the madvise() system call.
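
When this file is read, the kernel prints every available value and wraps the active one in square brackets, for example "always [madvise] never". The following is a minimal sketch for extracting the active value. The helper name thp_active_value is hypothetical and not part of any Alibaba Cloud tool.

```shell
# Print the active value from a THP sysfs line such as "always [madvise] never".
# The kernel marks the active option with square brackets.
thp_active_value() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Read the live setting only if the file exists on this system.
THP_ENABLED=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$THP_ENABLED" ]; then
    echo "THP mode: $(thp_active_value "$(cat "$THP_ENABLED")")"
fi
```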

Defragmentation configuration

In addition to the global switch above, the following two defragmentation settings are related to THP.

THP defragmentation

This setting controls how the kernel performs direct reclaim, background reclaim, direct compaction, and background compaction when a page fault occurs. The configuration file is /sys/kernel/mm/transparent_hugepage/defrag. The available values are as follows:

  • always
    When a transparent huge page cannot be allocated, the allocation stalls while the kernel performs direct reclaim and direct compaction. If enough contiguous free memory is available afterwards, the allocation of the transparent huge page continues.
  • defer
    When a transparent huge page cannot be allocated, the kernel allocates ordinary 4 KB pages instead. At the same time, it wakes the kswapd kernel daemon to reclaim memory in the background and the kcompactd kernel daemon to compact memory in the background. Later, if enough contiguous free memory is available, the khugepaged kernel daemon merges the previously allocated 4 KB pages into 2 MB transparent huge pages.
  • madvise
    For memory regions marked with MADV_HUGEPAGE through the madvise() system call, the allocation behavior is the same as always. For all other memory regions, the kernel falls back to ordinary 4 KB pages when a page fault occurs.
    Attention: In Alibaba Cloud Linux 2 kernel version 4.19.91-18 and later, the default value is madvise.
  • defer+madvise
    For memory regions marked with MADV_HUGEPAGE through the madvise() system call, the allocation behavior is the same as always. For all other memory regions, the behavior is the same as defer.
  • never
    Defragmentation is disabled.
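
Before writing a value to the defrag file, it can help to confirm that the running kernel offers it, because the file lists every value it accepts. The following is a rough sketch under that assumption. The helper name thp_offers_value is hypothetical.

```shell
# Return 0 if the option list printed by a THP sysfs file offers the value.
# $1: file content, e.g. "always defer defer+madvise [madvise] never"
# $2: value to look for, e.g. "defer+madvise"
thp_offers_value() {
    # Strip the brackets around the active option, then compare word by word.
    for opt in $(printf '%s' "$1" | tr -d '[]'); do
        [ "$opt" = "$2" ] && return 0
    done
    return 1
}

# Check the live file only if it exists on this system.
THP_DEFRAG=/sys/kernel/mm/transparent_hugepage/defrag
if [ -r "$THP_DEFRAG" ] && thp_offers_value "$(cat "$THP_DEFRAG")" "defer+madvise"; then
    echo "defer+madvise is available on this kernel"
fi
```

The comparison is done on whole words, so asking for defer does not match defer+madvise.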

khugepaged defragmentation

Important configurations related to this feature are as follows:

  • Function switch
    The configuration file is /sys/kernel/mm/transparent_hugepage/khugepaged/defrag. The available values are as follows:
    • 0
      Disables the khugepaged defragmentation function.
    • 1
      Enables khugepaged defragmentation. The khugepaged kernel daemon wakes up periodically when the system is idle and tries to merge contiguous 4 KB pages into 2 MB transparent huge pages.
      Note:
      • In Alibaba Cloud Linux 2 kernel version 4.19.91-18 and later, the default value is 1.
      • This operation takes locks in the memory path, and the khugepaged kernel daemon may scan and convert huge pages at the wrong time. This may affect application performance.
  • Retry Interval
    The time that the khugepaged kernel daemon waits after a transparent huge page allocation fails before it attempts the next allocation. This avoids consecutive allocation failures within a short period. Default value: 60000, in milliseconds, that is, the default wait time is 60 seconds. The configuration file path is as follows:
    /sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs
  • Wake-up interval
    The interval between wake-ups of the khugepaged kernel daemon. Default value: 10000, in milliseconds, that is, once every 10 seconds by default. The configuration file path is as follows:
    /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs
  • Scanned pages
    The number of pages scanned by the khugepaged kernel daemon each time it wakes up. The default value is 4096 pages. The configuration file path is as follows:
    /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan
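
To get a feel for these defaults, the wake-up interval and the number of scanned pages can be combined into an approximate scan rate. The sketch below assumes 4 KB base pages and ignores the time the scan itself takes, so the result is only an upper-bound estimate. The helper name khugepaged_scan_kb_per_sec is hypothetical.

```shell
# Estimate how much memory khugepaged scans per second, in KB.
# $1: pages_to_scan, $2: scan_sleep_millisecs, $3: base page size in KB
khugepaged_scan_kb_per_sec() {
    if [ "$2" -eq 0 ]; then
        # A sleep time of 0 means khugepaged does not pause between scans.
        echo "unlimited"
        return 0
    fi
    echo $(( $1 * $3 * 1000 / $2 ))
}

# With the defaults described above: 4096 pages every 10000 ms.
khugepaged_scan_kb_per_sec 4096 10000 4   # prints 1638 (about 1.6 MB per second)

# Use the live values if the files exist on this system.
KH=/sys/kernel/mm/transparent_hugepage/khugepaged
if [ -r "$KH/pages_to_scan" ] && [ -r "$KH/scan_sleep_millisecs" ]; then
    khugepaged_scan_kb_per_sec "$(cat "$KH/pages_to_scan")" \
        "$(cat "$KH/scan_sleep_millisecs")" 4
fi
```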

Recommended configuration

THP increases the Translation Lookaside Buffer (TLB) hit rate and reduces the memory overhead of accessing page table entries (PTEs), which improves system performance. THP also aims to reduce the O&M burden: users benefit from huge pages without having to be aware of them. However, huge page resources are limited. When the system reaches a bottleneck in huge page allocation, a series of mechanisms is needed to keep the system running normally. In this case, the default system configuration does not necessarily suit every scenario.

The advantages and disadvantages of THP have long been debated in the industry. The prevailing view is that system O&M personnel often do not understand THP well enough, and that using the default system configuration is risky for many applications. The main risks are as follows:

  • If the THP defragmentation switch is set to always, then under memory pressure the kernel performs direct reclaim or direct compaction, as it does for ordinary 4 KB pages. Both operations wait synchronously, which degrades system performance.
  • If the khugepaged defragmentation switch is set to 1, the khugepaged kernel daemon takes locks in the memory path while it merges pages. If khugepaged defragmentation is triggered at the wrong time, the performance of memory-sensitive applications is affected.
  • If THP is enabled while both defragmentation switches are disabled, memory allocation may consume free pages faster than it would with 4 KB pages. The system then enters memory reclaim and compaction earlier, so performance degrades sooner.

In summary, the impact of THP on system performance depends on the workload. Recommendations for common scenarios are as follows:

Attention: Before you modify any configuration file, we recommend that you back up the relevant configuration files or create ECS snapshots to ensure data security.

  • If you are confident enough in the system kernel, we recommend that you turn on the experimental switch with the following command. With this setting, background memory reclaim (the kswapd kernel daemon), background memory compaction (the kcompactd kernel daemon), and the khugepaged kernel daemon work together to balance memory management overhead against performance.
    echo 'defer+madvise' > /sys/kernel/mm/transparent_hugepage/defrag
  • If the CPU usage of the khugepaged kernel daemon reaches or approaches 100%, you can increase its wake-up interval, for example to 30 seconds. The sample command is as follows:
    echo 30000 > /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs
    Alternatively, you can disable khugepaged defragmentation directly. The sample command is as follows:
    echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
  • In special scenarios, such as database applications that handle a large number of access requests, latency-sensitive applications, or workloads with many short-lived allocations, if system stability matters more than performance, we recommend that you disable THP. The following sample command disables THP while the system is running:
    echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
    Attention: This command takes effect only until the next restart. THP is enabled again after the system restarts. To disable THP permanently, see the More Information section.
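
Because the commands above change the running system, one possible way to follow the backup recommendation is to record the current values before you change them. The following is only a sketch. The helper names thp_backup and thp_restore and the backup file path are hypothetical, and restoring to the real sysfs directory requires root.

```shell
# Save the active values of the THP files discussed in this article.
# $1: THP sysfs directory, $2: backup file to write
thp_backup() {
    for f in enabled defrag khugepaged/defrag \
             khugepaged/alloc_sleep_millisecs \
             khugepaged/scan_sleep_millisecs \
             khugepaged/pages_to_scan; do
        [ -r "$1/$f" ] || continue
        val=$(cat "$1/$f")
        # Option lists mark the active value with brackets; keep only that value.
        case "$val" in
            *\[*) val=$(printf '%s\n' "$val" | sed -n 's/.*\[\([^]]*\)].*/\1/p') ;;
        esac
        printf '%s %s\n' "$f" "$val"
    done > "$2"
}

# Write the saved values back. $1: THP sysfs directory, $2: backup file
thp_restore() {
    while read -r f val; do
        echo "$val" > "$1/$f"
    done < "$2"
}

# Example (as root):
#   thp_backup  /sys/kernel/mm/transparent_hugepage /root/thp-backup.txt
#   thp_restore /sys/kernel/mm/transparent_hugepage /root/thp-backup.txt
```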

View THP usage

THP usage can be viewed at two levels:

  • System level
    Run the following command to view THP usage for the whole system:
    cat /proc/meminfo | grep AnonHugePages
    The following is an example:
    AnonHugePages: 614400 kB
    Description: If the system returns a non-zero value, the system is using transparent huge pages.
  • Process level
    Run the following command to view the THP usage of a specific process:
    cat /proc/[$PID]/smaps | grep AnonHugePages
    Description: [$PID] indicates the PID of the process.
    The following is an example:
    AnonHugePages: 0kB
    AnonHugePages: 0kB
    AnonHugePages: 0kB
    AnonHugePages: 0kB
    AnonHugePages: 0kB
    AnonHugePages: 0kB
    AnonHugePages: 0kB
    AnonHugePages: 0kB
    AnonHugePages: 0kB
    AnonHugePages: 0kB
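
The process-level output contains one AnonHugePages line per memory mapping, so a total is often easier to read. The sketch below sums those lines and converts the result to a count of 2 MB huge pages, assuming the 2 MB huge page size used elsewhere in this article. The helper name sum_anon_hugepages is hypothetical.

```shell
# Sum the AnonHugePages lines read from stdin (smaps or meminfo format) and
# print the total in kB plus the equivalent number of 2 MB huge pages.
sum_anon_hugepages() {
    awk '$1 == "AnonHugePages:" { kb += $2 }
         END { printf "%d kB (%d x 2MB pages)\n", kb, kb / 2048 }'
}

# System-wide total, if /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    sum_anon_hugepages < /proc/meminfo
fi

# Total for the current shell process; replace $$ with another PID as needed.
if [ -r "/proc/$$/smaps" ]; then
    sum_anon_hugepages < "/proc/$$/smaps"
fi
```

For the system-level sample value above, 614400 kB corresponds to 300 huge pages of 2 MB each.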

More Information

THP concepts

THP enables the kernel to allocate huge pages to user processes automatically, without reserving a fixed number of huge pages in advance as HugeTLB requires. This feature can improve application performance. However, in production, an improper THP configuration may cause application performance to fluctuate.

Permanently disable THP

Run the following commands as the root user to add a kernel boot parameter that disables THP.

grubby --args="transparent_hugepage=never" --update-kernel="/boot/vmlinuz-$(uname -r)"
reboot
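
After the restart, you may want to confirm that the parameter reached the kernel. The following is a minimal sketch that inspects the kernel command line in /proc/cmdline. The helper name cmdline_disables_thp is hypothetical.

```shell
# Return 0 if a kernel command line contains transparent_hugepage=never.
# $1: command line string, for example the content of /proc/cmdline
cmdline_disables_thp() {
    # Pad with spaces so the parameter is matched only as a whole word.
    case " $1 " in
        *" transparent_hugepage=never "*) return 0 ;;
        *) return 1 ;;
    esac
}

if [ -r /proc/cmdline ] && cmdline_disables_thp "$(cat /proc/cmdline)"; then
    echo "THP is disabled on the kernel command line"
fi
```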

Application scope

  • ECS