Alibaba Cloud Linux: Use KFENCE to detect kernel memory pollution

Last Updated: Mar 14, 2024

Starting with kernel version 5.10.84-10 for the x86 architecture and 5.10.134-16 for the Arm architecture, Alibaba Cloud Linux 3 supports Kernel Electric-Fence (KFENCE). This topic describes the KFENCE feature and how to use this feature.

Usage notes

KFENCE is a built-in Linux kernel tool that can be enabled in an online environment. It detects memory pollution issues in the kernel and kernel modules. KFENCE was introduced in version 5.12 of the upstream Linux kernel. KFENCE detects out-of-bounds accesses and accesses to freed or unallocated memory by placing guard pages (fences) at the boundaries of monitored memory. If memory pollution occurs, KFENCE detects the issue and prints an error message that contains the details of the issue. For more information about KFENCE, see KFENCE documentation and OpenAnolis.

Alibaba Cloud enhances the KFENCE feature in Alibaba Cloud Linux 3. You can flexibly and dynamically enable or disable KFENCE and use it to fully detect memory pollution issues, which facilitates online detection and offline debugging.

Note

If you develop the kernel or kernel modules, you can use KFENCE to check whether memory pollution occurs in the code that you are developing. If you are a regular user and encounter a kernel crash, you can use KFENCE to help Alibaba Cloud or third-party driver developers collect detailed information.

Enable KFENCE

The KFENCE feature is used in the following business scenarios:

Online detection scenario

Scenario 1: Use KFENCE to detect whether memory pollution occurs

Note

KFENCE in this scenario occupies 2 MiB of memory and does not affect performance.

Run the following command to enable KFENCE by adding a parameter to the kernel boot command line:

sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.sample_interval=100"

In this scenario, the configuration automatically takes effect the next time the system restarts.
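After the restart, one way to confirm that the parameter reached the kernel is to look for it on the boot command line. The following is a minimal sketch, not part of the official procedure: the function name kfence_cmdline_interval is hypothetical, and the sketch assumes that the command line of the running kernel is exposed in /proc/cmdline.

```shell
# Print the value of kfence.sample_interval from a kernel command line.
# $1: file that holds the command line (assumed default: /proc/cmdline).
# Prints nothing if the parameter is absent.
kfence_cmdline_interval() {
    cmdline_file="${1:-/proc/cmdline}"
    tr ' ' '\n' < "$cmdline_file" |
        sed -n 's/^kfence\.sample_interval=//p' |
        tail -n 1
}

# Usage on a running system (uncomment to run):
# kfence_cmdline_interval
```

If the function prints 100, the boot parameter from the preceding command is in effect. The live value can also be read from /sys/module/kfence/parameters/sample_interval.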

Scenario 2: Use KFENCE to detect memory pollution issues

Important

In this scenario, a large amount of memory at the GiB level is consumed. Proceed with caution when you use small-memory machines.

  1. Create a script that contains the following content. In the following example, the script name is kfence.sh and the slab type to be monitored is kmalloc-64.

    #!/bin/bash
    # usage: ./kfence.sh kmalloc-64
    
    SLAB_PREFIX=/sys/kernel/slab
    MODULE_PREFIX=/sys/module/kfence/parameters
    
    if [ $# -eq 0 ]; then
    	echo "err: please input slabs"
    	exit 1
    fi
    
    #check whether slab exists
    for i in $@; do
    	slab_path=$SLAB_PREFIX/$i
    	if [ ! -d $slab_path ]; then
    		echo "err: slab $i not exist!"
    		exit 1
    	fi
    done
    
    #calculate num_objects
    sumobj=0
    for i in $@; do
    	objects=($(cat $SLAB_PREFIX/$i/objects))
    	maxobj=1
    	for ((j=1; j<${#objects[@]}; j++)); do
    		nodeobj=$(echo ${objects[$j]} | awk -F= '{print $2}')
    		[ $maxobj -lt $nodeobj ] && maxobj=$nodeobj
    	done
    	((sumobj += maxobj))
    done
    echo "recommend num_objects per node: $sumobj"
    
    #check kfence stats
    if [ $(cat $MODULE_PREFIX/sample_interval) -ne 0 ]; then
    	echo "kfence is running, disable it and wait..."
    	echo 0 > $MODULE_PREFIX/sample_interval
    	sleep 1
    fi
    
    #disable all slabs catching
    for file in $SLAB_PREFIX/*
    do
    	echo 0 > $file/kfence_enable
    done
    
    #disable order0 page catching
    echo 0 > $MODULE_PREFIX/order0_page
    
    #enable setting slabs catching
    for i in $@; do
    	echo 1 > $SLAB_PREFIX/$i/kfence_enable
    done
    
    #setting num_objects and node mode
    echo $sumobj > $MODULE_PREFIX/num_objects
    echo node > $MODULE_PREFIX/pool_mode
    
    #start kfence
    echo -1 > $MODULE_PREFIX/sample_interval
    if [ $? -ne 0 ]; then
    	echo "err: kfence enable fail!"
    	exit 1
    fi
    echo "kfence enabled!"

    The script reads the number of active objects of the specified slabs, estimates an appropriate KFENCE pool size based on that number, and then enables KFENCE in full mode to monitor memory allocations from the specified slabs.

    Note

    Slabs are commonly used in memory management to optimize memory allocation and release operations. This improves system performance and efficiency. KFENCE can monitor slabs and order 0 pages. For more information, see the "Terms" section in this topic.

  2. Run the following command to run the script and start monitoring:

    sudo bash ./kfence.sh kmalloc-64
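The num_objects estimate in kfence.sh is derived from the per-node counts in /sys/kernel/slab/<slab>/objects. The following sketch isolates that step so that it can be tried on a sample line. It assumes that the file holds the total count followed by one N<node>=<count> field per NUMA node, which is the layout that the parsing logic in the script implies. The function name max_node_objects is hypothetical.

```shell
# Print the largest per-node object count from one line of a slab
# "objects" file.
# $1: the line, for example "1200 N0=700 N1=500"
#     (total first, then one N<node>=<count> field per node; assumed layout)
max_node_objects() {
    printf '%s\n' "$1" | awk '{
        max = 1                      # same lower bound as kfence.sh
        for (i = 2; i <= NF; i++) {  # field 1 is the total, skip it
            split($i, kv, "=")
            if (kv[2] + 0 > max) max = kv[2] + 0
        }
        print max
    }'
}

max_node_objects "1200 N0=700 N1=500"   # prints 700
```

The script adds up this per-node maximum for every slab that is passed on the command line and writes the sum to num_objects, because pool_mode is set to node and each node needs room for its own objects.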

Offline debugging scenario

Enable KFENCE by specifying parameters for the x86 architecture

  1. Run the following commands to add the KFENCE parameters to the kernel boot command line:

    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.num_objects=1000000"
    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.sample_interval=-1"
    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.fault=panic"

    • num_objects: the size of the KFENCE pool, expressed as a number of objects. The amount of memory occupied by the KFENCE pool is calculated by using the following formula: (num_objects + 1) × 8 KiB. We recommend that you set num_objects so that the pool occupies about 10% of the maximum available memory. For example, if num_objects is set to 1000000, the pool requires (1000000 + 1) × 8 KiB, which is approximately 7.63 GiB and is rounded up to 8 GiB.

    • sample_interval: the interval at which memory is monitored. Valid values:

      • 0: The KFENCE feature is disabled and does not monitor memory.

      • Positive number: the sampling interval in milliseconds. For example, a value of 100 indicates that KFENCE samples one memory allocation every 100 milliseconds.

      • Negative number: the full mode. KFENCE monitors all memory that meets a specified condition, for example, a specified slab type.

    • fault: the action that KFENCE takes when it detects an issue. This parameter is introduced in kernel version 5.10.134-16. Default value: report. When the fault parameter is set to panic, the kernel panics on the instance on which an issue is detected. This causes downtime but preserves the core dump file that is generated when the issue occurs.

  2. Restart the operating system for the configurations to take effect.

    For more information, see Restart instances.
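The pool size formula can be checked with shell arithmetic before you choose a value. The following is a minimal sketch. The function names are hypothetical, and rounding up to a whole GiB is an assumption taken from the worked example in this topic.

```shell
# Memory needed by the KFENCE pool, in KiB: (num_objects + 1) * 8 KiB.
kfence_pool_kib() {
    echo $(( ($1 + 1) * 8 ))
}

# The same amount rounded up to whole GiB (1 GiB = 1048576 KiB).
# Rounding to a 1 GiB boundary is an assumption based on the example.
kfence_pool_gib() {
    kib=$(kfence_pool_kib "$1")
    echo $(( (kib + 1048575) / 1048576 ))
}

kfence_pool_kib 1000000   # prints 8000008
kfence_pool_gib 1000000   # prints 8
```

Comparing the result with the memory of the instance before you restart helps avoid reserving more memory than a small-memory machine can spare.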

Use a script to enable KFENCE for the x86 or Arm architecture

Note
  • After you run a script to enable KFENCE, KFENCE cannot detect the memory pollution issues that may occur during kernel startup.

  • If you want to change the value of the num_objects or sample_interval parameter after you enable KFENCE, you must first disable KFENCE.

Run the following commands to enable KFENCE:

sudo sh -c 'echo 1000000 > /sys/module/kfence/parameters/num_objects'
sudo sh -c 'echo -1 > /sys/module/kfence/parameters/sample_interval'
sudo sh -c 'echo panic > /sys/module/kfence/parameters/fault'

  • num_objects: the size of the KFENCE pool, expressed as a number of objects. The amount of memory occupied by the KFENCE pool is calculated by using the following formula: (num_objects + 1) × 8 KiB. We recommend that you set num_objects so that the pool occupies about 10% of the maximum available memory. For example, if num_objects is set to 1000000, the pool requires (1000000 + 1) × 8 KiB, which is approximately 7.63 GiB and is rounded up to 8 GiB.

  • sample_interval: the interval at which memory is monitored. Valid values:

    • 0: The KFENCE feature is disabled and does not monitor memory.

    • Positive number: the sampling interval in milliseconds. For example, a value of 100 indicates that KFENCE samples one memory allocation every 100 milliseconds.

    • Negative number: the full mode. KFENCE monitors all memory that meets a specified condition, for example, a specified slab type.

  • fault: the action that KFENCE takes when it detects an issue. This parameter is introduced in kernel version 5.10.134-16. Default value: report. When the fault parameter is set to panic, the kernel panics on the instance on which an issue is detected. This causes downtime but preserves the core dump file that is generated when the issue occurs.

    Note

    If your kernel version is earlier than 5.10.134-16, an error message is reported when you run the preceding command. The error message does not affect KFENCE. You can ignore the error message.
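To pick a num_objects value from a memory budget instead of the other way around, the pool formula can be inverted. The following sketch is an illustration under stated assumptions: the function names are hypothetical, and reading MemTotal from /proc/meminfo as the "maximum available memory" is an interpretation, not something this topic specifies.

```shell
# Largest num_objects whose pool fits in a memory budget given in KiB.
# Inverts the formula: pool = (num_objects + 1) * 8 KiB.
kfence_objects_for_kib() {
    echo $(( $1 / 8 - 1 ))
}

# 10% of total memory in KiB, read from a meminfo-style file.
# Treating MemTotal as the "maximum available memory" is an assumption.
kfence_budget_kib() {
    meminfo="${1:-/proc/meminfo}"
    awk '/^MemTotal:/ { printf "%d\n", $2 / 10 }' "$meminfo"
}

kfence_objects_for_kib 8388608   # 8 GiB budget, prints 1048575

# Usage on a running system (uncomment to run):
# kfence_objects_for_kib "$(kfence_budget_kib)"
```

The result is only a starting point. Because the pool size cannot be changed while KFENCE is running, disable KFENCE before you write a new value to num_objects.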

View results

After KFENCE detects memory pollution issues, you can view the number of issues and detailed error messages.

  • In the example shown in the following figure, the sudo cat /sys/kernel/debug/kfence/stats command output indicates that the total bugs count increases.

    image.png

  • The system prints information in dmesg. To view KFENCE error log information, run the dmesg | grep -i kfence command. In the example shown in the following figure, one error message is returned.

    image.png
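To track the bug counter from a script instead of reading it by eye, the stats file can be parsed. The following is a minimal sketch. It assumes that the file contains one "name: value" pair per line and that the counter line starts with "total bugs", as in the upstream KFENCE stats output. The layout on your kernel may differ, and the function name kfence_total_bugs is hypothetical.

```shell
# Print the "total bugs" counter from a KFENCE stats file.
# $1: stats file (assumed default: /sys/kernel/debug/kfence/stats).
# Assumes one "name: value" pair per line.
kfence_total_bugs() {
    stats_file="${1:-/sys/kernel/debug/kfence/stats}"
    awk -F': *' '/^total bugs/ { print $2 }' "$stats_file"
}

# Usage on a running system (root is required to read debugfs):
# before=$(kfence_total_bugs)
# ... run the workload ...
# after=$(kfence_total_bugs)
# [ "$after" -gt "$before" ] && dmesg | grep -i kfence
```

If the counter increases between the two readings, the dmesg output contains the corresponding error reports.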

Disable KFENCE

Run the following command to disable KFENCE:

sudo bash -c 'echo 0 > /sys/module/kfence/parameters/sample_interval'

When the KFENCE feature is disabled, KFENCE no longer detects memory pollution issues. After all monitored memory in the pool is released, KFENCE returns the memory to the kernel buddy system at a granularity of 1 GiB.

If you enabled KFENCE by adding parameters to the kernel boot command line, run the following command to remove the parameter. KFENCE is then no longer automatically enabled the next time the system restarts.

sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --remove-args="kfence.sample_interval"
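To confirm the state of KFENCE after you disable it, you can map the current sample_interval value to the modes described in this topic. The following is a minimal sketch. The function name kfence_mode is hypothetical.

```shell
# Map a sample_interval value to the mode described in this topic:
# 0 = disabled, positive = sampling mode, negative = full mode.
kfence_mode() {
    if [ "$1" -eq 0 ]; then
        echo "disabled"
    elif [ "$1" -gt 0 ]; then
        echo "sampling"
    else
        echo "full"
    fi
}

# Usage on a running system (uncomment to run):
# kfence_mode "$(cat /sys/module/kfence/parameters/sample_interval)"
```

After the disable command in this section takes effect, the function is expected to print disabled.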

FAQ

  • What are the impacts of the KFENCE feature on memory and performance?

    • Impacts on memory

      KFENCE trades higher memory overhead for lower performance interference, so it consumes a large amount of memory. If you use the sampling mode that is enabled at boot (the mode supported by the upstream Linux community), you can set a smaller num_objects value to conserve memory. If you use the full mode or enable KFENCE dynamically, GiB-level memory is consumed. In this case, proceed with caution on small-memory machines.

    • Impacts on performance

      • In sampling mode, the impact on performance is small.

      • In full mode, the impact on performance is acceptable if only memory that meets a specified condition is monitored, such as memory of a specified slab type.

      Note
      • We recommend that you perform a phased test based on the actual business scenario to observe the impacts of enabling KFENCE on the actual business performance and then determine the subsequent deployment.

      • In the offline debugging scenario, if the full mode is used to monitor the memory of all types of slabs, performance and memory usage are greatly affected. However, in this scenario, KFENCE is used to pinpoint issues, and the impact on performance is not a concern.

  • What is the difference between the KFENCE feature and the Kernel Address Sanitizer (KASAN) feature?

    KFENCE and KASAN are built-in Linux kernel tools that detect memory pollution. Alibaba Cloud enhances the KFENCE feature in kernel version 5.10. KFENCE can be enabled and disabled more flexibly, supports sampling, and can run in an online business environment. The functional differences between KFENCE and KASAN are as follows:

    • KFENCE can monitor slabs whose objects are up to 4 KiB in size, such as kmalloc-4k, and order-0 pages. KASAN can monitor more types of memory, including memory of all types of slabs, pages of memory, stack memory, and global memory.

    • KFENCE has a higher success rate than KASAN in detecting abnormal memory behaviors within the monitoring range.

    • KFENCE has higher memory overhead than KASAN. However, KFENCE has less impact on service performance than KASAN.

    In most cases, we recommend that you do not use KFENCE and KASAN at the same time, because KFENCE takes over objects that KASAN would otherwise monitor.

  • How stable is the KFENCE feature?

    A known issue exists in kernel version 5.10.134-15 and earlier. When KFENCE monitors both order-0 pages and slabs, downtime may occur in specific scenarios. To prevent this issue, run the following command to stop KFENCE from monitoring order-0 pages:

    sudo grubby --update-kernel=/boot/vmlinuz-$(uname -r) --args="kfence.order0_page=0"
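To decide whether the workaround applies to an instance, the kernel release can be compared with 5.10.134-15. The following is a minimal sketch. The function name kfence_order0_affected is hypothetical, the release strings in the usage lines are illustrative, and the sketch assumes that the release string starts with a version in the form A.B.C-D.

```shell
# Succeed (exit status 0) if a kernel release is 5.10.134-15 or earlier.
# $1: release string such as the output of `uname -r`
#     (default: the running kernel).
# Returns 2 if the string does not start with "A.B.C-D".
kfence_order0_affected() {
    release="${1:-$(uname -r)}"
    # Split the leading "A.B.C-D" into four numbers.
    set -- $(printf '%s\n' "$release" |
        sed -n 's/^\([0-9][0-9]*\)\.\([0-9][0-9]*\)\.\([0-9][0-9]*\)-\([0-9][0-9]*\).*/\1 \2 \3 \4/p')
    [ $# -eq 4 ] || return 2
    # Compare component by component against 5.10.134-15.
    for limit in 5 10 134 15; do
        [ "$1" -lt "$limit" ] && return 0
        [ "$1" -gt "$limit" ] && return 1
        shift
    done
    return 0   # exactly 5.10.134-15
}

# Illustrative release strings:
kfence_order0_affected "5.10.134-15.al8.x86_64" && echo "apply workaround"
kfence_order0_affected "5.10.134-16.al8.x86_64" || echo "not affected"
```

The check only reports whether the release falls in the affected range. It does not change any KFENCE setting.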

Terms

The following table describes the basic terms of the KFENCE feature.

Term | Description
memory pollution | The issue that memory areas are incorrectly modified or corrupted during the program runtime, causing the program to become abnormal or crash. Memory pollution can be caused by programming errors, software vulnerabilities, malware, or hardware failures.
slab | Slabs are an efficient memory allocation mechanism in the Linux kernel. The kernel uses slabs to pre-allocate a specific number of memory objects in a memory cache pool for quick memory allocation and release. Slabs can be used to avoid frequent memory allocation and release operations and improve the efficiency of memory allocation.
order-0 page | An order-0 page is a single page frame, typically 4 KiB in size, and is the basic unit of memory allocation in the Linux kernel. The kernel divides memory into fixed-size page frames. When the kernel needs small blocks of memory, for itself or on behalf of an application, it allocates them from order-0 pages.