Softirq (software interrupt request) is a Linux kernel mechanism that handles deferred interrupts. This allows the less important part of an interrupt request (IRQ) to be postponed for softirq handling after the important part of the IRQ is handed by hardirq (hard interrupt request). If the softirq load is heavy due to a large number of tasks or time-consuming processing, the kernel transfers tasks from softirq to a per-CPU ksoftirqd thread and uses a scheduler to balance loads between the ksoftirqd thread and other user-space tasks. This topic uses the Alibaba Cloud Linux 3 operating system as an example to describe how to troubleshoot the task scheduling latency issue of the ksoftirqd thread that causes the softirq latency issue.
Check for the softirq latency issue
Download the bpftrace script to a directory on the Elastic Compute Service (ECS) instance, such as
/tmp
.Replace
<tmp>
with the directory in which thesoftirq_net_latency.bt
script is stored.sudo wget -P <tmp> https://gitee.com/dtcccccc/softirq_net_latency/raw/master/softirq_net_latency.bt
Install bpftrace.
sudo yum install -y bpftrace
Run the following command to use bpftrace to check whether the latency of NET_RX exceeds 100 milliseconds:
Replace
<tmp>
with the actual directory in which thesoftirq_net_latency.bt
script is stored.sudo bpftrace <tmp>/softirq_net_latency.bt 100000
If the command output indicates that the latency exceeds 100 milliseconds, the softirq latency issue occurs. The cause varies based on the scenario:
If the reported process name (comm) is
ksoftirqd/$cpu
, the latency occurs due to improper scheduling of the ksoftirqd thread.Although the softirq tasks were transferred to the ksoftirqd thread that has a moderate priority, a kernel scheduling policy preferentially schedules the ksoftirqd tasks because the thread requires a short period of time to process the tasks and sleeps for most of the time. In this case, if the scheduling latency is still significant, check for possible exceptions, such as whether a real-time task with a higher priority is run on the current CPU or whether a task is time consuming in the kernel, which may cause the tasks in the ksoftirqd thread unable to be scheduled.
Softirq runs in the interrupt context and the hardirq processing time is lengthy. This is a rare case. In this case, check the kernel and drivers.
Use the top tool to view the proportion of hardirq tasks to all tasks on the corresponding CPU. The proportion is high.
Monitor the content changes in the
/proc/interrupts
file on a regular basis. The /proc/interrupts file describes the total number of triggers of each type of interrupts on each CPU since the system was started. Based on the statistics, check which type of interrupts are triggered more frequently on each CPU, and then check the corresponding drivers.
Configure ftrace to record system logs
Download the
softirq_ftrace.patch
file to the directory in which thesoftirq_net_latency.bt
script is stored.Replace
<tmp>
with the directory in which thesoftirq_net_latency.bt
script is stored.sudo wget -P <tmp> https://gitee.com/dtcccccc/softirq_net_latency/raw/master/softirq_ftrace.patch
Configure ftrace to collect the necessary system information.
sudo sh -c 'echo "irq:softirq_raise irq:softirq_entry sched:sched_switch sched:sched_wakeup raw_syscalls:sys_enter raw_syscalls:sys_exit" > /sys/kernel/debug/tracing/set_event' sudo sh -c 'echo 1 > /sys/kernel/debug/tracing/tracing_on'
Enable the ftrace log feature, capture information about the long softirq latency, and then disable the ftrace log feature.
Replace
<tmp>
with the directory in which thesoftirq_net_latency.bt
script is stored.cd <tmp> sudo patch -p1 < <tmp>/softirq_ftrace.patch sudo bpftrace --unsafe <tmp>/softirq_net_latency.bt 100000
Diagnose and analyze exceptions
In this example, the following bpftrace logs are printed and used to analyze the ksoftirqd thread scheduling exception on CPU 11.
High IRQ-to-softirq latency: 132169 usec (132 ms) on CPU:11 comm:ksoftirqd/11
View timestamps (in seconds) recorded in the logs, and check for the longer interval between the softirq_raise time and softirq_entry time. Then, open the trace log file to analyze the scheduling event based on the preceding timestamps.
The vector (
vec
) is set to 3, which indicates softirqs of the network type.sudo su cd /sys/kernel/debug/tracing/per_cpu/cpu11/ grep "vec=3" ./trace
Take note of the following check items:
Content at the leftmost of the log indicates the task that was running at that time when the latency issue occurred. View the tasks that were running within the period of time specified by the preceding timstamps. Run the
chrt -p $pid
command to view the scheduling policies of the tasks. If the SCHED_DEADLINE, SCHED_FIFO, or SCHED_RR scheduling policy is applied to a task, the task has a higher priority than common tasks including ksoftirqd tasks. In this case, ksoftirqd tasks are suspended and cannot be scheduled.If other scheduling policies are applied to tasks, the tasks are of common priority and belong to common processes. The common tasks may be stuck in the kernel for an extended period of time. For example, the log shows that the current process continues processing a task for an extended period of time, and no scheduling occurs for the process during the period of time. In addition, the interval between the
syscall_enter
time and thesyscall_exit
time of the task is long and spans the duration when no scheduling occurs. In this case, to analyze the causes, perform an in-depth analysis in conjunction with the actions of the task and thesyscall
function.Exceptions may exist in the kernel scheduler if the trace log file indicates that the SCHED_OTHER or SCHED_NORMAL scheduling policy was applied to the current process, system calls were rarely made or ended soon, or the kernel scheduled the task to a common process. To further troubleshoot the issue, contact us and provide the preceding information.