A simple explanation of eBPF

Over the past year, ARMS has built Kubernetes monitoring on eBPF, providing non-invasive observability of application, system, and network performance across programming languages. It has also released a Kubernetes troubleshooting panorama, which confirmed the effectiveness of eBPF in practice. The eBPF technology and ecosystem are developing well and have broad prospects. As practitioners of this technology, we aim in this article to introduce eBPF itself by answering 7 core questions, and so lift the veil on eBPF for everyone.


What is eBPF?

eBPF is a technology for running sandboxed programs in the kernel. It provides a mechanism for safely injecting code when kernel or user-program events occur, so that people who are not kernel developers can also control kernel behavior. As the kernel has evolved, eBPF has expanded from its original role in packet filtering to networking, kernel internals, security, tracing, and more, and its feature set is still growing quickly. The early BPF is now called classic BPF, or cBPF for short; it is this expansion of functionality that gives the current BPF the name extended BPF, or eBPF.

What are the application scenarios for eBPF?

Network optimization

eBPF combines high performance with high extensibility, which makes it a preferred choice for packet processing in networking solutions:

• High performance

The JIT compiler gives eBPF code an execution efficiency close to that of native kernel code.

• High extensibility

Protocol parsing and routing policies can be added quickly within the kernel context.

Fault diagnosis

Through tracing mechanisms such as kprobes and tracepoints, eBPF can trace both the kernel and user programs. This end-to-end tracing makes rapid fault diagnosis possible. eBPF also exposes profiling statistics more efficiently: unlike traditional systems, it does not need to export large volumes of sampling data, which makes continuous real-time profiling feasible.

Security control

eBPF can see all system calls, all network packets, and all socket-level network operations. Combining process-context tracing, network-operation-level filtering, and system-call filtering provides better security control.

Performance monitoring

Traditional system monitoring components such as sar can only provide static counters and gauges. eBPF, by contrast, supports programmable, dynamic collection of custom metrics and events and aggregates them at the edge, where the data is produced, which greatly improves both the efficiency and the possibilities of performance monitoring.
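To make the idea of aggregating where the data is produced more concrete, here is a small Python sketch. It does not touch the kernel: it only models the difference between exporting every raw event and keeping a power-of-two histogram next to the event source, a style of aggregation commonly seen in eBPF tools. The function names and the latency samples are invented for illustration.

```python
from collections import Counter


def log2_bucket(value_us: int) -> int:
    """Return the power-of-two bucket index for a latency in microseconds."""
    # bit_length() maps 1 -> 1, 2..3 -> 2, 4..7 -> 3, and so on; 0 stays in bucket 0.
    return value_us.bit_length()


def aggregate(latencies_us):
    """Fold raw samples into a small histogram, as a map kept near the hook would."""
    histogram = Counter()
    for value in latencies_us:
        histogram[log2_bucket(value)] += 1
    return histogram


# Hypothetical latency samples (microseconds) observed at one hook point.
samples = [3, 5, 6, 120, 130, 900, 7, 4, 110, 2]
hist = aggregate(samples)

# Ten raw events collapse into a handful of counters for the consumer to read.
for bucket in sorted(hist):
    low = 0 if bucket == 0 else 1 << (bucket - 1)
    high = (1 << bucket) - 1
    print(f"{low:>4}..{high:<4} us: {hist[bucket]}")
```

The consumer reads five counters instead of ten events. With millions of events per second the histogram stays the same size, which is the property that makes continuous collection affordable.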

Why did eBPF emerge?

eBPF emerged essentially to resolve the tension between the slow pace of kernel iteration and rapidly changing system requirements. A common analogy in the eBPF community is that eBPF is to the Linux kernel what JavaScript is to HTML, which highlights programmability. Support for programmability usually brings new problems of its own. Kernel modules, for example, were designed to address the same need, but they do not provide a good boundary: a kernel module can affect the stability of the kernel itself, has to be adapted to different kernel versions, and so on. eBPF adopts the following strategies to be a safe and efficient way of programming the kernel:

• Safety

An eBPF program must pass the verifier before it can run, and it must not contain unreachable instructions. eBPF programs cannot call arbitrary kernel functions, only the helper functions defined in the API. The stack of an eBPF program is limited to 512 bytes; to store more data, the program must use BPF maps.

• Efficiency

Thanks to the just-in-time (JIT) compiler, and because eBPF instructions run inside the kernel so that data does not have to be copied to user mode, event processing becomes far more efficient.

• Standardization

Through BPF helpers, BTF (BPF Type Format), and PERF MAP, eBPF offers developers standard interfaces and data models.

• Powerful features

eBPF not only increased the number of registers and introduced BPF maps as a new storage mechanism; over the 4.x kernel series it also gradually extended the original single use case, packet filtering, to kernel functions, user-space functions, tracepoints, performance events, and security control.

How do I use eBPF?

5 steps

1. Develop an eBPF program in C.

This is the sandboxed program that is called when an event fires at the instrumentation point; it runs in kernel mode.

2. Compile the eBPF program into BPF bytecode using LLVM.

The program is compiled into BPF bytecode so that it can later be verified and run inside the eBPF virtual machine.

3. Submit the BPF bytecode to the kernel through the bpf system call.

A user-mode program loads the BPF bytecode into the kernel through the bpf system call.

4. The kernel verifies and runs the BPF bytecode and saves the corresponding state in a BPF map.

The kernel verifies that the BPF bytecode is safe and makes sure the correct eBPF program is invoked when the corresponding event occurs. Any state that needs to be kept, such as monitoring data, is written to the corresponding BPF map.

5. The user program queries the running state of the BPF bytecode through the BPF map.

The user-mode program reads the contents of the BPF map to obtain the state of the running bytecode, for example the monitoring data that was captured.

A complete eBPF program usually has two parts: a user-mode part and a kernel-mode part. The user-mode program interacts with the kernel through the bpf system call to load eBPF programs, attach them to events, and create and update maps. In kernel mode, eBPF programs cannot call arbitrary kernel functions; they must do their work through BPF helper functions. In particular, memory addresses must be read with the bpf_probe_read family of functions so that memory access stays safe and efficient. When an eBPF program needs a large block of storage, a BPF map of a type suited to the scenario is also required, and that map is what exposes run-time state to user-space programs.
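The division of labor between the two parts can be walked through with a toy model. The sketch below is plain Python and never calls into the kernel; ToyKernel, count_exec, and the exec_count map are hypothetical names chosen for this illustration, and a real program would be written in C, compiled with LLVM, and loaded through the bpf system call. The only rule it borrows from the text is the 512-byte stack limit.

```python
class ToyKernel:
    """Pure-Python stand-in for the kernel side of the workflow (not real eBPF)."""

    MAX_STACK_BYTES = 512  # stack limit quoted in this article

    def __init__(self):
        self.maps = {}   # map name -> dict visible to both "kernel" and "user" code
        self.hooks = {}  # event name -> programs attached to that event

    def create_map(self, name):
        self.maps[name] = {}
        return self.maps[name]

    def load(self, program, stack_bytes):
        # Steps 3 and 4 (verification): refuse the program before it can ever run.
        if stack_bytes > self.MAX_STACK_BYTES:
            raise ValueError("rejected: stack exceeds 512 bytes")
        return program

    def attach(self, event, program):
        self.hooks.setdefault(event, []).append(program)

    def fire(self, event, ctx):
        # Step 4 (execution): run every program attached to the event.
        for program in self.hooks.get(event, []):
            program(ctx, self.maps)


def count_exec(ctx, maps):
    """Stand-in for steps 1 and 2: the logic that would be written in C and compiled."""
    counts = maps["exec_count"]
    counts[ctx["pid"]] = counts.get(ctx["pid"], 0) + 1


kernel = ToyKernel()
exec_count = kernel.create_map("exec_count")      # user mode creates the map
prog = kernel.load(count_exec, stack_bytes=64)    # step 3: submit the program
kernel.attach("sys_enter_execve", prog)           # user mode attaches it to an event

for pid in (100, 100, 200):                       # three events occur
    kernel.fire("sys_enter_execve", {"pid": pid})

print(exec_count)                                 # step 5: user mode reads the map

try:
    kernel.load(count_exec, stack_bytes=1024)     # too much stack: never loaded
    oversized_accepted = True
except ValueError:
    oversized_accepted = False
print("oversized program accepted:", oversized_accepted)
```

The point of the model is the shape of the interaction: user-mode code never calls the program directly. It only loads it, attaches it, and then reads what the program left behind in the map.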

5 modules

In the kernel, eBPF is mainly composed of five modules:

1. BPF Verifier

Ensures the safety of eBPF programs. The verifier first builds a directed acyclic graph (DAG) from the instructions to be executed, to make sure the program contains no unreachable instructions; it then simulates execution of the instructions to make sure no invalid instruction will be executed. We have also heard from colleagues that the verifier cannot guarantee 100% safety, so all BPF programs still need strict monitoring and review.

2. BPF JIT

Compiles eBPF bytecode into native machine instructions for more efficient execution in the kernel.

3. A storage module consisting of multiple 64-bit registers, a program counter, and a 512-byte stack

Used to control the execution of eBPF programs, hold stack data, and pass input and output parameters.
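The first of these checks, building a graph and rejecting unreachable instructions, can be illustrated with a toy model. The sketch below is plain Python over an invented four-opcode instruction set (mov, jmp, jeq, exit); it is not the kernel's algorithm, and it covers only the control-flow pass, not the simulated execution that follows it.

```python
def successors(program, pc):
    """Return the instruction indexes that can execute right after `pc`."""
    op = program[pc]
    if op[0] == "exit":
        return []
    if op[0] == "jmp":            # unconditional jump
        return [op[1]]
    if op[0] == "jeq":            # conditional jump: fall through or take the branch
        return [pc + 1, op[1]]
    return [pc + 1]               # ordinary instruction falls through


def verify(program):
    """Accept the program only if its control flow is a DAG with no dead code."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    state = [WHITE] * len(program)

    def visit(pc):
        if pc < 0 or pc >= len(program):
            raise ValueError(f"control flow leaves the program at {pc}")
        if state[pc] == GREY:
            raise ValueError(f"back-edge (loop) detected at instruction {pc}")
        if state[pc] == BLACK:
            return
        state[pc] = GREY
        for nxt in successors(program, pc):
            visit(nxt)
        state[pc] = BLACK

    visit(0)
    dead = [pc for pc, s in enumerate(state) if s == WHITE]
    if dead:
        raise ValueError(f"unreachable instructions: {dead}")
    return True


def check(program):
    """Return 'accepted' or the rejection reason, for easy printing."""
    try:
        verify(program)
        return "accepted"
    except ValueError as err:
        return f"rejected: {err}"


ok = [("mov",), ("jeq", 3), ("mov",), ("exit",)]
dead_code = [("mov",), ("jmp", 3), ("mov",), ("exit",)]  # instruction 2 is skipped
loop = [("mov",), ("jeq", 0), ("exit",)]                 # jumps back to instruction 0

for name, prog in [("ok", ok), ("dead_code", dead_code), ("loop", loop)]:
    print(name, "->", check(prog))
```

A depth-first walk with three colors is enough for this toy: meeting a node that is still on the current path means the graph has a cycle, and any node left unvisited at the end is dead code.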

4. BPF Helpers

Provides a set of functions through which eBPF programs interact with other parts of the kernel. Not every eBPF program can call every helper: the set available is determined by the BPF program type. Note that every change to input and output parameters in eBPF must comply with the BPF specification. Apart from changes to local variables, all other changes must go through BPF helpers; anything the helpers do not support cannot be modified.

The BPF helpers that each type of eBPF program can call can be listed on a running system, for example with the bpftool feature probe command.
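The rule that the program type decides which helpers may be called can be modeled as a simple allowlist lookup. The Python sketch below is illustrative only: the helper names are real, but the table is a deliberately tiny, simplified subset assumed for this example, and actual availability depends on the kernel version and must be queried on the target machine.

```python
# Illustrative, heavily simplified allowlist (an assumption for this example).
# Real availability varies by kernel version.
HELPERS_BY_PROG_TYPE = {
    "kprobe": {
        "bpf_map_lookup_elem",
        "bpf_map_update_elem",
        "bpf_get_current_pid_tgid",
        "bpf_probe_read_kernel",
    },
    "xdp": {
        "bpf_map_lookup_elem",
        "bpf_map_update_elem",
        "bpf_xdp_adjust_head",
        "bpf_redirect",
    },
}


def check_helper_calls(prog_type, called_helpers):
    """Return the helpers the program calls that its type does not permit."""
    allowed = HELPERS_BY_PROG_TYPE.get(prog_type)
    if allowed is None:
        raise KeyError(f"unknown program type: {prog_type}")
    return sorted(set(called_helpers) - allowed)


# A program that looks up a map entry and adjusts the packet head.
calls = ["bpf_map_lookup_elem", "bpf_xdp_adjust_head"]

print("xdp:", check_helper_calls("xdp", calls))        # nothing disallowed
print("kprobe:", check_helper_calls("kprobe", calls))  # packet helper is refused
```

Map helpers appear under both types, while the packet-manipulation helper is tied to the networking type. This is the pattern to expect: a small common core plus helpers that only make sense in a particular context.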

5. BPF Map & context

Provides large blocks of storage that user-space programs can also access, so that they can control the running state of eBPF programs.

Closing thoughts

The prerequisite for using eBPF well is an understanding of the software stack

With the introduction above, readers should now have a working understanding of eBPF. What eBPF provides is only a framework and a mechanism. The core work still falls to the people using it: understanding the software stack, finding the right instrumentation points, and relating what those points show to application problems.

eBPF's killer features are full coverage, non-invasiveness, and programmability

1. Full coverage

Instrumentation points cover both the kernel and applications.

2. Non-invasive

No code that is being hooked needs to be modified.

3. Programmable

eBPF programs can be delivered dynamically, their instructions executed dynamically at the edge, and the results aggregated and analyzed dynamically.

Team Information

Alibaba Cloud's observability team covers multiple technical fields and products, including front-end monitoring, application monitoring, container monitoring, Prometheus, distributed tracing, intelligent alerting, and operations visualization, and distills observability solutions and best practices for different industries and technical scenarios.

Alibaba Cloud Kubernetes Monitoring is a one-stop, non-invasive observability product for Kubernetes clusters, built on eBPF. Drawing on the metrics, application traces, logs, and events of a Kubernetes cluster, it aims to give IT development and operations staff a complete observability solution.
