THP-related performance optimization in Alibaba Cloud Linux

Last Updated: Apr 29, 2024

Transparent Huge Pages (THP) is a general-purpose feature of the Linux kernel. THP consolidates small pages (typically 4 KB in size) into huge pages (typically 2 MB or larger in size) to reduce the number of page table entries (PTEs) and the number of memory accesses. This reduces the pressure on the translation lookaside buffer (TLB) and improves application performance. This topic describes how to use THP to improve system performance in Alibaba Cloud Linux.

THP-related configurations

In Alibaba Cloud Linux 2 kernel version 4.19.81-17.2 and earlier, the default THP setting is madvise, which restricts THP to the memory regions that explicitly request it. This setting differs from the default in other major operating systems, such as Red Hat Enterprise Linux 7, CentOS 7, and Amazon Linux 2. For compatibility with the defaults of these operating systems, the default THP setting is always starting from Alibaba Cloud Linux 2 kernel version 4.19.91-18, which enables THP globally in the system.

Global configurations

In the Alibaba Cloud Linux kernel, the path of the THP configuration file is /sys/kernel/mm/transparent_hugepage/enabled. You can use one of the following options:

  • always

    THP is globally enabled.

  • never

    THP is globally disabled.

  • madvise

    THP is enabled only for memory regions that an application marks with the MADV_HUGEPAGE flag by calling the madvise() system call.

    Note

    When a memory region is flagged by MADV_HUGEPAGE, the kernel knows that the application wants transparent huge pages for allocations in that region.
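
The kernel lists every option in this file on one line and wraps the active option in square brackets, for example `always [madvise] never`. The following is a minimal sketch of how a script might read the active option. The helper name `thp_active_option` and its optional file argument are illustrative assumptions, not part of any Alibaba Cloud tooling.

```shell
#!/bin/sh
# Print the option that is currently active in a THP sysfs file.
# The kernel wraps the active option in square brackets, for example:
#   always [madvise] never
# thp_active_option is a hypothetical helper name used for illustration.
thp_active_option() {
    file="${1:-/sys/kernel/mm/transparent_hugepage/enabled}"
    # Keep only the text between the brackets.
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$file"
}

# Query the live system only if the kernel exposes the THP interface.
if [ -r /sys/kernel/mm/transparent_hugepage/enabled ]; then
    echo "THP mode: $(thp_active_option)"
fi
```

The same helper can be pointed at the defrag file, because that file uses the same bracket convention.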

Defragmentation configurations

In addition to global configurations, the system provides the following THP-related defragmentation configurations.

  • THP defragmentation: THP defragmentation can merge small pages in the system into transparent huge pages. This reduces memory fragmentation and improves system performance.

  • khugepaged defragmentation: khugepaged is a kernel thread that manages transparent huge pages in the background. khugepaged periodically scans memory and attempts to merge regular pages into transparent huge pages. This improves memory utilization and performance.

THP defragmentation applies when a page fault occurs. khugepaged defragmentation applies to the background work of the khugepaged kernel thread. Both configurations reduce memory fragmentation and improve memory utilization and performance by merging pages. The following sections describe each configuration:

THP defragmentation

When a page fault occurs, this setting controls how the system uses direct reclaim, background reclaim, direct compaction, and background compaction to obtain memory for transparent huge pages. The path of the configuration file for this feature is /sys/kernel/mm/transparent_hugepage/defrag. You can use one of the following options:

  • always

    When the system cannot allocate a transparent huge page, the allocation stalls and waits for the system to perform direct reclaim and direct compaction. If enough contiguous free memory is available after direct reclaim and direct compaction are complete, the system proceeds to allocate the transparent huge page.

  • defer

    When the system cannot allocate a transparent huge page, the system allocates regular pages (4 KB in size) instead. Meanwhile, the system wakes the kswapd kernel daemon to perform background reclaim and the kcompactd kernel daemon to perform background compaction. If enough contiguous free memory becomes available after these operations run for a period of time, the khugepaged kernel daemon merges the previously allocated regular pages (4 KB in size) into transparent huge pages (2 MB in size).

  • madvise

    For memory regions that are marked with MADV_HUGEPAGE through the madvise() system call, the memory allocation behavior is the same as that of the always option. For other memory regions, the system allocates regular pages (4 KB in size) when a page fault occurs.

    Note

    In Alibaba Cloud Linux 2 kernel version 4.19.81-17.2 and later, the system default option is madvise.

  • defer+madvise

    For memory regions that are marked with MADV_HUGEPAGE through the madvise() system call, the memory allocation behavior is the same as that of the always option. For other memory regions, the memory allocation behavior is the same as that of the defer option.

  • never

    Defragmentation is disabled.
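
A script that runs on more than one kernel version may want to confirm that an option is listed in the defrag file before it writes that option. The following is a minimal sketch under that assumption. The helper name `thp_option_supported` is hypothetical.

```shell
#!/bin/sh
# Return success if OPTION is listed in a THP sysfs file such as
# /sys/kernel/mm/transparent_hugepage/defrag.
# thp_option_supported is a hypothetical helper name used for illustration.
thp_option_supported() {
    option="$1"
    file="${2:-/sys/kernel/mm/transparent_hugepage/defrag}"
    # Strip the brackets around the active option, then compare each
    # whitespace-separated word with the requested option.
    for word in $(sed 's/[][]//g' "$file"); do
        [ "$word" = "$option" ] && return 0
    done
    return 1
}

# Example: report whether the running kernel lists defer+madvise.
if [ -r /sys/kernel/mm/transparent_hugepage/defrag ]; then
    if thp_option_supported 'defer+madvise'; then
        echo "defer+madvise is available"
    else
        echo "defer+madvise is not listed by this kernel"
    fi
fi
```

The comparison matches whole words, so defer and defer+madvise are treated as different options.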

khugepaged defragmentation

See the following configuration items of khugepaged defragmentation:

  • Enable or disable the feature

    The path of the configuration file for enabling or disabling this feature is /sys/kernel/mm/transparent_hugepage/khugepaged/defrag. You can use one of the following options:

    • 0

      khugepaged defragmentation is disabled.

    • 1

      The system periodically runs the khugepaged kernel daemon when the system is idle. The daemon attempts to merge consecutive regular pages (4 KB in size) into transparent huge pages (2 MB in size).

      Note

      • In Alibaba Cloud Linux 2 kernel version 4.19.91-18 and later, the system default option is 1.

      • This operation locks the memory directory, and the khugepaged kernel daemon may scan and convert regular pages to transparent huge pages at an inopportune time, such as during peak application load. As a result, this operation may affect application performance.

  • Retry interval

    The retry interval of the khugepaged kernel daemon. If a THP allocation fails, the khugepaged kernel daemon waits for the specified period of time before it attempts to allocate transparent huge pages again. This helps avoid consecutive THP allocation failures in a short period of time. Default value: 60000. Unit: milliseconds. The default value is equivalent to 60 seconds. The path of the configuration file is /sys/kernel/mm/transparent_hugepage/khugepaged/alloc_sleep_millisecs.

  • Start interval

    The start interval of the khugepaged kernel daemon. The system starts the khugepaged kernel daemon at the specified interval. Default value: 10000. Unit: milliseconds. The default value is equivalent to 10 seconds, which means that the system starts the khugepaged kernel daemon every 10 seconds. The path of the configuration file is /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs.

  • Number of pages to scan

    The number of pages that the khugepaged kernel daemon scans each time it is started. Default value: 4096. The path of the configuration file is /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan.
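
The khugepaged settings are easier to reason about when they are read together. The following sketch prints the four files described in this section and derives a rough upper bound on the scan rate from the number of pages to scan and the start interval. The helper name `khugepaged_report` is hypothetical, and the estimate ignores the time that the scan itself takes.

```shell
#!/bin/sh
# Print the khugepaged tunables and estimate the scan rate.
# khugepaged_report is a hypothetical helper name used for illustration.
khugepaged_report() {
    dir="${1:-/sys/kernel/mm/transparent_hugepage/khugepaged}"
    for name in defrag alloc_sleep_millisecs scan_sleep_millisecs pages_to_scan; do
        printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
    done
    pages=$(cat "$dir/pages_to_scan")
    sleep_ms=$(cat "$dir/scan_sleep_millisecs")
    # Rough upper bound: one batch of pages per start interval. The time
    # spent scanning is ignored, and a zero interval is skipped to avoid
    # dividing by zero.
    if [ "$sleep_ms" -gt 0 ]; then
        echo "max_pages_per_second=$((pages * 1000 / sleep_ms))"
    fi
}

# Query the live system only if the kernel exposes the khugepaged interface.
if [ -d /sys/kernel/mm/transparent_hugepage/khugepaged ]; then
    khugepaged_report
fi
```

With the default values of 4096 pages and 10000 milliseconds, the estimate is 409 pages per second, which is about 1.6 MB of regular 4 KB pages per second.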

THP configuration recommendations

System performance impacts of using THP

THP can increase the TLB hit rate and reduce the number of PTEs and the number of memory accesses, which improves system performance. THP is designed to deliver these improvements transparently, without requiring you to modify or configure applications. However, THP resources are limited. When the system reaches a THP allocation bottleneck, it must rely on mechanisms such as memory reclaim and compaction to keep running as expected. In this case, the default configuration may not suit all user scenarios and may even degrade system performance in the following scenarios:

  • If THP defragmentation is set to always, direct reclaim or direct compaction is performed when memory is insufficient, which is the same as the behavior for regular pages (4 KB in size). The allocation waits for these operations to complete before it is retried, which can increase allocation latency and degrade system performance.

  • If the khugepaged defragmentation is set to 1, the memory directory is locked when the khugepaged kernel daemon merges memory. If khugepaged defragmentation is triggered during peak hours of your business, the performance of memory-sensitive applications may be affected.

  • If you enable THP and disable both THP defragmentation and khugepaged defragmentation, memory allocation may consume free pages faster than it does with regular pages (4 KB in size). This triggers memory reclaim and compaction sooner and causes system performance to degrade earlier.

Configuration recommendations

The impact of THP on system performance varies in different scenarios. You need to adjust the configurations based on the situation of your business, system, and application. See the following examples:

Important

Before you make modifications, back up the configuration files or create disk snapshots to avoid data loss. For more information about how to create a snapshot for a disk, see Create a snapshot for a disk.

  • If your system kernel provides sufficient resources, we recommend that you run the following command to enable the experimental option defer+madvise. This allows the kswapd kernel daemon, kcompactd kernel daemon, and khugepaged kernel daemon to work together as much as possible and achieve a balance between memory management and stable performance.

    sudo bash -c "echo 'defer+madvise' > /sys/kernel/mm/transparent_hugepage/defrag"

  • If the CPU utilization of the khugepaged kernel daemon reaches or approaches 100%, you can increase the start interval of the khugepaged kernel daemon, for example, to 30 seconds. See the following sample command:

    sudo sh -c 'echo 30000 > /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs'

    Alternatively, you can directly stop the khugepaged kernel daemon. See the following sample command:

    sudo sh -c 'echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag'

  • In scenarios that involve a large number of access requests to a database application, a large number of latency-sensitive applications, or a large number of short-lived allocations, we recommend that you disable the THP feature if system stability is more important than performance. Run the following sample command to disable THP while the system is running:

    sudo sh -c 'echo "never" > /sys/kernel/mm/transparent_hugepage/enabled'

    Note

    This command takes effect only until the next restart. After the system is restarted, the THP feature is enabled again. To permanently disable the THP feature, run the following commands in sequence with root privileges to add the parameter that disables THP to the kernel startup parameters.

    sudo grubby --args="transparent_hugepage=never" --update-kernel="/boot/vmlinuz-$(uname -r)"
    sudo reboot
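
After the restart, one way to confirm that the parameter is in effect is to look for it on the kernel command line in /proc/cmdline. The following is a minimal sketch of that check. The helper name `thp_boot_param` is hypothetical.

```shell
#!/bin/sh
# Print the value of the transparent_hugepage= boot parameter, if present.
# thp_boot_param is a hypothetical helper name used for illustration.
thp_boot_param() {
    file="${1:-/proc/cmdline}"
    # Put each boot parameter on its own line, then print the value of
    # the last transparent_hugepage= entry (no output if there is none).
    tr ' ' '\n' < "$file" | sed -n 's/^transparent_hugepage=//p' | tail -n 1
}

# Check the running kernel only if its command line is readable.
if [ -r /proc/cmdline ]; then
    value=$(thp_boot_param)
    if [ -n "$value" ]; then
        echo "Boot parameter: transparent_hugepage=$value"
    else
        echo "transparent_hugepage is not set on the kernel command line"
    fi
fi
```

If the parameter is present, the enabled file under /sys/kernel/mm/transparent_hugepage should also show never in square brackets.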

View THP usage

You can view THP usage at the system level or at the process level.

  • System level

    At the system level, the THP parameters and configuration options affect all processes in the system. Run the following command to view the THP usage of the system.

    cat /proc/meminfo | grep AnonHugePages

    Output similar to the following is returned:

    AnonHugePages:    614400 kB

    Note

    If the returned value is not zero, transparent huge pages are in use in the system.

  • Process level

    At the process level, you can use the madvise() system call and the MADV_HUGEPAGE flag to control the use of THP. This allows an application to enable or disable THP for its own memory without affecting other processes or the entire system. Run the following command to view the THP usage of a specific process.

    sudo cat /proc/<PID>/smaps | grep AnonHugePages

    Note

    Replace <PID> with the PID of the process that you want to view.

    Output similar to the following is returned, with one line for each memory mapping of the process:

    AnonHugePages:         0 kB
    AnonHugePages:         0 kB
    AnonHugePages:         0 kB
    AnonHugePages:         0 kB
    AnonHugePages:         0 kB
    AnonHugePages:         0 kB
    AnonHugePages:         0 kB
    AnonHugePages:         0 kB
    AnonHugePages:         0 kB
    AnonHugePages:         0 kB
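
Because smaps reports AnonHugePages separately for each memory mapping, a per-process total requires adding the lines together. The following is a minimal sketch of that calculation. The helper name `thp_total_kb` is hypothetical, and the page count assumes 2 MB (2048 kB) huge pages.

```shell
#!/bin/sh
# Sum the AnonHugePages values of all mappings in a smaps file.
# thp_total_kb is a hypothetical helper name used for illustration.
thp_total_kb() {
    # $2 is the numeric field, for example 2048 in "AnonHugePages: 2048 kB".
    awk '/^AnonHugePages:/ { sum += $2 } END { print sum + 0 }' "$1"
}

# Example: report the THP usage of the current shell process.
if [ -r "/proc/$$/smaps" ]; then
    total=$(thp_total_kb "/proc/$$/smaps")
    total=${total:-0}
    # Assumes 2 MB (2048 kB) huge pages, as described in this topic.
    echo "PID $$: ${total} kB in about $((total / 2048)) transparent huge pages"
fi
```

The same division applies to the system-level value: 614400 kB corresponds to 300 huge pages of 2 MB each.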

References

  • For more information about the features and risks of THP, see Transparent Hugepage Support.

  • The Huge Page feature works similarly to THP. It consolidates consecutive code fragments into the memory area of a huge page that is usually 2 MB or larger in size, which helps reduce TLB misses. You can also use the Huge Page feature provided by Alibaba Cloud Linux to improve system performance. For more information, see Huge Pages.

  • For more information about how to optimize the system performance in Red Hat Enterprise Linux 7 by using THP, see Configuring Transparent Huge Pages.