ebpf

The hottest Linux kernel technology in the last two years is none other than eBPF!

Since 2019, alongside the rapid evolution of eBPF itself, observability, security, and networking projects built on eBPF have sprung up. Familiar ones include Cilium (which brought eBPF to the Kubernetes world), Falco (a de facto standard threat detection engine for Kubernetes and cloud-native runtime security), Katran (a high-performance layer 4 load balancer), Pixie (an observability tool for Kubernetes applications), and more.

The eBPF technology is hot, but many people still don’t know what exactly eBPF technology is and what it can do. In this article, I’ll take a brief look at what eBPF kernel technology is and how to develop a Hello World level eBPF program from scratch in C.

Let’s first look at what the hot eBPF technology really is.

I. Introduction to eBPF

I first came across eBPF a few years ago through the blog and book of Brendan Gregg, a performance expert and the inventor of the flame graph.

The predecessor of eBPF is BPF (Berkeley Packet Filter), which originated in late 1992 with a paper called “The BSD Packet Filter: A New Architecture for User-level Packet Capture”. The paper proposed a way to implement network packet filtering in the Unix kernel that was 20 times faster than the most advanced packet filtering technology of the time.

In 1997, BPF was incorporated into the Linux kernel and later used by tcpdump.

At the beginning of 2014, Alexei Starovoitov implemented eBPF, which extends the classic BPF and opens the door to a wider range of BPF technologies.

[Figure: BPF/eBPF]

As the diagram above shows, eBPF programs run in kernel state, yet you do not need to recompile the kernel or build and load kernel modules. An eBPF program can be dynamically injected into the kernel, run there, and be unloaded at any time. Once in the kernel, eBPF has a God’s-eye view and can monitor both the kernel and user-state programs. eBPF also provides a verifier that checks eBPF code for safety before it runs, preventing malicious or unsafe programs from executing in kernel state.

In essence, BPF is an interface that the kernel opens up to user state (the kernel already provides the hook points). You inject an eBPF program and register the events it should watch; when an event fires, the kernel calls back into the injected program; and data is exchanged between kernel state and user state to implement the logic you want.

Today’s eBPF is no longer limited to the networking use cases of classic BPF (cBPF). eBPF has been given a new definition: “a New Generation of Networking, Security, and Observability Tools”. This definition comes from Liz Rice, Chief Open Source Officer of Isovalent, the company behind the Cilium project and a technology startup driving cloud-native networking, security, and observability with eBPF.

eBPF has become a top-level subsystem of the kernel. From here on, unless stated otherwise, “BPF” in this article refers to the new generation of the technology, eBPF.

BPF technology is so awesome, so how do we develop BPF programs?

II. How to develop BPF programs

1. The form of BPF programs

A project aimed at developing BPF programs usually consists of two types of source files, one is the source code file of the BPF program running in the kernel state (e.g., bpf_program.bpf.c in the figure below). The other is the source code file of the user state program used to load the BPF program to the kernel, unload the BPF program from the kernel, interact with the kernel state, and present the logic of the user state program (e.g., bpf_loader.c in the figure below).

Currently, BPF programs running in the kernel state can only be developed in C (corresponding to the first type of source code file, bpf_program.bpf.c in the figure below), or more precisely in restricted C syntax, and the only one that can perfectly compile C source code into BPF target files is the clang compiler (clang is a compiler front-end for C, C++, Objective-C and other programming languages, using LLVM as the back-end).

The following is a diagram of the compilation and loading process of the BPF program into the kernel.

[Figure: compilation and loading process of the BPF program into the kernel]

The BPF target file (bpf_program.o) is essentially an ELF file, and we can read its contents with the readelf command line tool. Here is an example:

$readelf -a bpf_program.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Linux BPF
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          424 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         8
  Section header string table index: 1

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .strtab           STRTAB           0000000000000000  0000012a
       0000000000000079  0000000000000000           0     0     1
  [ 2] .text             PROGBITS         0000000000000000  00000040
       0000000000000000  0000000000000000  AX       0     0     4
  [ 3] tracepoint/syscal PROGBITS         0000000000000000  00000040
       0000000000000070  0000000000000000  AX       0     0     8
  [ 4] .rodata.str1.1    PROGBITS         0000000000000000  000000b0
       0000000000000012  0000000000000001 AMS       0     0     1
  [ 5] license           PROGBITS         0000000000000000  000000c2
       0000000000000004  0000000000000000  WA       0     0     1
  [ 6] .llvm_addrsig     LOOS+0xfff4c03   0000000000000000  00000128
       0000000000000002  0000000000000000   E       7     0     1
  [ 7] .symtab           SYMTAB           0000000000000000  000000c8
       0000000000000060  0000000000000018           1     2     8
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  p (processor specific)

There are no section groups in this file.

There are no program headers in this file.

There is no dynamic section in this file.

There are no relocations in this file.

The decoding of unwind sections for machine type Linux BPF is not currently supported.

Symbol table '.symtab' contains 4 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS bpf_program.c
     2: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    5 _license
     3: 0000000000000000   112 FUNC    GLOBAL DEFAULT    3 bpf_prog

In the symbol table printed by readelf above, there is a symbol bpf_prog of type FUNC, which is the entry point of the BPF program we wrote. Its Ndx value is 3, which refers to section number 3 in the Section Headers listed earlier: tracepoint/syscal… (readelf truncates the name). In other words, the code of bpf_prog lives in that section.
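The ELF header fields that readelf printed above (Class ELF64, little endian, Type REL, Machine Linux BPF) can also be checked programmatically. The following is a minimal sketch, not part of the original project: the function name looks_like_bpf_object is made up for illustration, and the constants are the standard ELF values ET_REL = 1 and EM_BPF = 247, written out by hand so the example does not depend on a particular elf.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* e_type value for a relocatable object (ET_REL). */
#define ELF_TYPE_REL 1
/* e_machine value assigned to BPF (EM_BPF). */
#define ELF_MACHINE_BPF 247

/* Read a little-endian 16-bit value from a byte buffer. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Hypothetical helper: return 1 if buf starts with a 64-bit,
 * little-endian ELF header describing a relocatable BPF object,
 * and 0 otherwise.
 */
static int looks_like_bpf_object(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

    assert(buf != NULL);
    if (len < 64)                      /* an ELF64 header is 64 bytes */
        return 0;
    if (memcmp(buf, magic, sizeof(magic)) != 0)
        return 0;
    if (buf[4] != 2 || buf[5] != 1)    /* ELFCLASS64, ELFDATA2LSB */
        return 0;
    /* e_type is at offset 16, e_machine at offset 18 */
    return read_le16(buf + 16) == ELF_TYPE_REL &&
           read_le16(buf + 18) == ELF_MACHINE_BPF;
}
```

Passing the first 64 bytes of bpf_program.o to this function should return 1, while an ordinary x86-64 object file (machine value 62) is rejected.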

From the readelf output, we can see that bpf_prog (i.e., the section with number 3) has a Size of 112 bytes (0x70), but what does it contain? We use another tool, llvm-objdump, to disassemble bpf_prog.

$llvm-objdump-10 -d bpf_program.o

bpf_program.o:  file format ELF64-BPF

Disassembly of section tracepoint/syscalls/sys_enter_execve:

0000000000000000 bpf_prog:
       0:   b7 01 00 00 21 00 00 00 r1 = 33
       1:   6b 1a f8 ff 00 00 00 00 *(u16 *)(r10 - 8 ) = r1
       2:   18 01 00 00 50 46 20 57 00 00 00 00 6f 72 6c 64 r1 = 7236284523806213712 ll
       4:   7b 1a f0 ff 00 00 00 00 *(u64 *)(r10 - 16) = r1
       5:   18 01 00 00 48 65 6c 6c 00 00 00 00 6f 2c 20 42 r1 = 4764857262830019912 ll
       7:   7b 1a e8 ff 00 00 00 00 *(u64 *)(r10 - 24) = r1
       8:   bf a1 00 00 00 00 00 00 r1 = r10
       9:   07 01 00 00 e8 ff ff ff r1 += -24
      10:   b7 02 00 00 12 00 00 00 r2 = 18
      11:   85 00 00 00 06 00 00 00 call 6
      12:   b7 00 00 00 00 00 00 00 r0 = 0
      13:   95 00 00 00 00 00 00 00 exit

What llvm-objdump prints for bpf_prog is BPF bytecode. Bytecode usually brings the JVM to mind, and the analogy holds: a BPF program is loaded into the kernel not as machine instructions but as bytecode, a layer of indirection that exists for safety. When the BPF program is loaded into the kernel, the verifier checks the bytecode, and a JIT compiler then translates it into machine code.
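To make the hex dump above easier to read, here is a small sketch that decodes one 8-byte instruction slot by hand. It assumes the little-endian encoding shown in the dump: one opcode byte, one byte holding the destination register in the low nibble and the source register in the high nibble, a 16-bit offset, and a 32-bit immediate. The struct and function names are illustrative, not kernel or libbpf APIs. With 8 bytes per slot, the 112-byte section holds 14 slots, which matches indices 0 to 13 in the listing (the two `ll` loads occupy two slots each).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decoded fields of one 8-byte BPF instruction slot (illustrative type). */
struct insn_fields {
    uint8_t opcode;   /* byte 0 */
    uint8_t dst_reg;  /* low 4 bits of byte 1 */
    uint8_t src_reg;  /* high 4 bits of byte 1 */
    int16_t off;      /* bytes 2-3, little-endian */
    int32_t imm;      /* bytes 4-7, little-endian */
};

/* Split the raw bytes of one instruction slot into its fields. */
static struct insn_fields decode_insn(const unsigned char raw[8])
{
    struct insn_fields f;

    f.opcode  = raw[0];
    f.dst_reg = raw[1] & 0x0f;
    f.src_reg = raw[1] >> 4;
    f.off     = (int16_t)(raw[2] | (raw[3] << 8));
    f.imm     = (int32_t)((uint32_t)raw[4] |
                          (uint32_t)raw[5] << 8 |
                          (uint32_t)raw[6] << 16 |
                          (uint32_t)raw[7] << 24);
    return f;
}

/*
 * A 64-bit immediate load (opcode 0x18) spans two slots: the low
 * 32 bits are the imm of the first slot, the high 32 bits the imm of
 * the second. Copy those 8 bytes out as a NUL-terminated string.
 */
static void lddw_imm_text(const unsigned char raw[16], char out[9])
{
    assert(raw[0] == 0x18);
    memcpy(out, raw + 4, 4);
    memcpy(out + 4, raw + 12, 4);
    out[8] = '\0';
}
```

Decoding instruction 0 (`b7 01 00 00 21 00 00 00`) gives dst_reg 1 and imm 33, the ASCII code of `!`. Applying lddw_imm_text to instructions 5 and 2 yields the text "Hello, B" and "PF World", so the program assembles its 18-byte message (including the terminating NUL, hence `r2 = 18`) on the stack before calling helper number 6.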

The user state programs used to load and unload BPF programs can be developed in multiple languages, either C or Python, Go, Rust, etc.

2. How BPF programs are developed

BPF has evolved over the years, and although efforts have been made to improve it, the experience of developing and building BPF programs is still not ideal. For this reason the community created frameworks and libraries such as the BPF Compiler Collection (BCC) to simplify BPF development, and tools such as bpftrace, which provides a high-level language for writing BPF programs (it can be understood as a DSL for BPF).

Often we do not need to develop our own BPF programs, because open source projects like bcc and bpftrace already provide many high-quality ones. But once we do have to write our own, the threshold for developing with bcc and bpftrace is not low: you need to understand the structure of the bcc framework or learn the scripting language provided by bpftrace, which adds to the burden of developing BPF programs yourself.

As BPF becomes more widely used, the issue of portability of BPF gradually emerges. The Linux kernel is evolving rapidly, and the types and data structures in the kernel are constantly changing. Fields of the same structure type in different kernel versions may be rearranged, may be renamed or deleted, may be changed to completely different fields, etc. For BPF programs that do not need to look at the kernel’s internal data structures, there may be no portability issues. However, for those BPF programs that need to rely on certain fields in the kernel data structure, it is important to consider the problems caused to the BPF program by changes in the internal data structure of different Kernel versions.

Initially, the way to solve this problem was to compile the BPF program locally on the target machine where the BPF program was deployed to ensure that the kernel type field layout accessed by the BPF program was consistent with the target host kernel. But this is obviously cumbersome: the various development packages that BPF depends on, the compilers used need to be installed on the target machine, and the compilation process can be time-consuming, making the testing and distribution process of BPF programs very painful, especially if you use bcc and bpftrace to develop BPF programs.

To solve the BPF portability problem, the kernel introduced two new technologies, BTF (BPF Type Format) and CO-RE (Compile Once - Run Everywhere). BTF provides structural information to avoid dependency on Clang and kernel headers, and CO-RE makes the compiled BPF bytecode relocatable, avoiding the need for LLVM recompilation.

BPF programs built with these new techniques work across different Linux kernel versions without having to be recompiled for the specific kernel on the target machine. There is also no need to install hundreds of megabytes of LLVM, Clang and kernel header dependencies on the target machine, as was previously the case.
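The portability problem can be illustrated in plain user-space C. The sketch below is a toy model under stated assumptions: struct task_v1 and struct task_v2 are hypothetical layouts invented for this example (they are not real kernel types), and read_i32_at stands in for a BPF program reading a field at a fixed byte offset. It only demonstrates why an offset baked in at compile time breaks when the layout changes; it does not implement BTF or CO-RE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of a structure in "kernel version A". */
struct task_v1 {
    int32_t pid;
    int32_t state;
};

/* Hypothetical layout in "kernel version B": a new field moved pid. */
struct task_v2 {
    int64_t flags;
    int32_t state;
    int32_t pid;
};

/* Read a 32-bit field located `offset` bytes into an opaque object. */
static int32_t read_i32_at(const void *obj, size_t offset)
{
    int32_t value;

    assert(obj != NULL);
    memcpy(&value, (const unsigned char *)obj + offset, sizeof(value));
    return value;
}
```

A program compiled against version A would use offsetof(struct task_v1, pid), which is 0. Applied to a task_v2 object, that offset lands inside flags and returns the wrong value; only offsetof(struct task_v2, pid) reads the real pid. Conceptually, CO-RE records which field the program wanted, so that the loader can patch in the right offset for the running kernel using its BTF type information.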

Note: The principles behind BTF and CO-RE are not the focus of this article, so they are not covered here; please consult other material if you are interested.

Of course these new technologies are transparent to the BPF program itself. The libbpf user API provided by the Linux kernel source code encapsulates all of the above new technologies, and as long as the user-state loader is developed based on libbpf, then libbpf will quietly help the BPF program relocate to the corresponding fields of the kernel structure it needs in the target host kernel, making libbpf the preferred choice for developing BPF loaders.

3. libbpf-based way to develop BPF programs

Kernel BPF developer Andrii Nakryiko has open sourced libbpf-bootstrap on GitHub (https://github.com/libbpf/libbpf-bootstrap), a bootstrap project for developing BPF programs and loaders directly on top of libbpf. The project contains examples of BPF programs and user-state programs written in C and Rust. It offers the best experience I have seen so far for developing C-based BPF programs and loaders.

Let’s take a hello world level BPF program and its user state loader as an example to see the “way” of implementing a BPF program based on the structure suggested by libbpf-bootstrap, here is a diagram.

[Figure: libbpf-based way to develop BPF programs]

Here is a brief explanation of the above schematic.

We keep talking about libbpf, but what exactly is it? libbpf refers to tools/lib/bpf in the Linux kernel source tree, a C library that the kernel provides to external developers for creating BPF user-state programs. For the convenience of developers, the BPF kernel developers maintain a mirror repository of libbpf on GitHub, https://github.com/libbpf/libbpf, so that BPF developers do not have to download the entire Linux kernel source. The mirror repository also contains some of the kernel headers that tools/lib/bpf depends on. They are mapped to the Linux kernel source paths as shown in the following code (the left side of the equal sign is the source path in the Linux kernel, the right side is the source path in github.com/libbpf/libbpf).

// https://github.com/libbpf/libbpf/blob/master/scripts/sync-kernel.sh

PATH_MAP=(                                  \
    [tools/lib/bpf]=src                         \
    [tools/include/uapi/linux/bpf_common.h]=include/uapi/linux/bpf_common.h \
    [tools/include/uapi/linux/bpf.h]=include/uapi/linux/bpf.h       \
    [tools/include/uapi/linux/btf.h]=include/uapi/linux/btf.h       \
    [tools/include/uapi/linux/if_link.h]=include/uapi/linux/if_link.h   \
    [tools/include/uapi/linux/if_xdp.h]=include/uapi/linux/if_xdp.h     \
    [tools/include/uapi/linux/netlink.h]=include/uapi/linux/netlink.h   \
    [tools/include/uapi/linux/pkt_cls.h]=include/uapi/linux/pkt_cls.h   \
    [tools/include/uapi/linux/pkt_sched.h]=include/uapi/linux/pkt_sched.h   \
    [include/uapi/linux/perf_event.h]=include/uapi/linux/perf_event.h   \
    [Documentation/bpf/libbpf]=docs                     \
)

The bpftool in the figure corresponds to tools/bpf/bpftool in the Linux kernel source tree. It also has a mirror repository on GitHub (https://github.com/libbpf/bpftool), and libbpf-bootstrap uses it as a helper program to generate xx.skel.h. The mirror repository also contains some kernel files that bpftool depends on. The mapping between the bpftool repository and the Linux kernel source paths is shown in the following code (the left side of the equal sign is the source path in the Linux kernel, the right side is the source path in github.com/libbpf/bpftool).

// https://github.com/libbpf/bpftool/blob/master/scripts/sync-kernel.sh

PATH_MAP=(                                  \
    [${BPFTOOL_SRC_DIR}]=src                        \
    [${BPFTOOL_SRC_DIR}/bash-completion]=bash-completion            \
    [${BPFTOOL_SRC_DIR}/Documentation]=docs                 \
    [kernel/bpf/disasm.c]=src/kernel/bpf/disasm.c               \
    [kernel/bpf/disasm.h]=src/kernel/bpf/disasm.h               \
    [tools/include/uapi/asm-generic/bitsperlong.h]=include/uapi/asm-generic/bitsperlong.h   \
    [tools/include/uapi/linux/bpf_common.h]=include/uapi/linux/bpf_common.h \
    [tools/include/uapi/linux/bpf.h]=include/uapi/linux/bpf.h       \
    [tools/include/uapi/linux/btf.h]=include/uapi/linux/btf.h       \
    [tools/include/uapi/linux/const.h]=include/uapi/linux/const.h       \
    [tools/include/uapi/linux/if_link.h]=include/uapi/linux/if_link.h   \
    [tools/include/uapi/linux/netlink.h]=include/uapi/linux/netlink.h   \
    [tools/include/uapi/linux/perf_event.h]=include/uapi/linux/perf_event.h \
    [tools/include/uapi/linux/pkt_cls.h]=include/uapi/linux/pkt_cls.h   \
    [tools/include/uapi/linux/pkt_sched.h]=include/uapi/linux/pkt_sched.h   \
    [tools/include/uapi/linux/tc_act/tc_bpf.h]=include/uapi/linux/tc_act/tc_bpf.h   \
)

helloworld.bpf.c is the source code of the BPF program, which clang -target bpf compiles into the BPF bytecode ELF file helloworld.bpf.o. libbpf-bootstrap does not have the user-state loader load helloworld.bpf.o directly. Instead, it generates the helloworld.skel.h file from helloworld.bpf.o with the bpftool gen skeleton command. The generated helloworld.skel.h contains the bytecode of the BPF program and the functions to load and unload it, which we can call directly from the user-state program.

helloworld.c is the BPF user-state program. It only needs to include helloworld.skel.h and then, following a fixed pattern, load the BPF program and attach it to the corresponding hook point in the kernel. Since the BPF program is embedded in the user-state program, we only need to distribute the user-state program when we distribute the BPF program!

Now that we have a rough idea of the development approach behind libbpf-bootstrap, let us use libbpf-bootstrap and libbpf to develop a hello world level BPF program and its user-state loader in C.

III. Example of developing hello world level eBPF application based on libbpf-bootstrap

Note: My experimental environment is Ubuntu 20.04 (kernel version: 5.4.0-109-generic).

1. Installing dependencies

Installing the dependencies for developing the BPF application on the development machine is an essential first step. First of all, we need to install clang, the compiler for BPF programs. It is recommended to install clang 10 and above, here is an example of clang-10.

$apt-get install clang-10
$clang-10 --version
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

2. Download libbpf-bootstrap

libbpf-bootstrap is a simple development framework for developing BPF applications based on libbpf, we need to download it.

$git clone https://github.com/libbpf/libbpf-bootstrap.git
Cloning into 'libbpf-bootstrap'...
remote: Enumerating objects: 387, done.
remote: Counting objects: 100% (19/19), done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 387 (delta 4), reused 7 (delta 2), pack-reused 368
Receiving objects: 100% (387/387), 2.59 MiB | 5.77 MiB/s, done.
Resolving deltas: 100% (173/173), done.

3. Initialize and update libbpf-bootstrap’s dependencies

libbpf-bootstrap has its dependencies libbpf, bpftool configured in its project as a git submodule.

$cat .gitmodules
[submodule "libbpf"]
    path = libbpf
    url = https://github.com/libbpf/libbpf.git
[submodule "bpftool"]
    path = bpftool
    url = https://github.com/libbpf/bpftool
[submodule "blazesym"]
    path = blazesym
    url = https://github.com/ThinkerYzu1/blazesym.git

blazesym is a Rust-related project, so I will not go into it here.

Therefore, we need to initialize and check out these git submodules before we can use the libbpf-bootstrap project to develop our BPF application. We execute the following command under the libbpf-bootstrap project path.

$git submodule update --init --recursive
Submodule 'blazesym' (https://github.com/ThinkerYzu1/blazesym.git) registered for path 'blazesym'
Submodule 'bpftool' (https://github.com/libbpf/bpftool) registered for path 'bpftool'
Submodule 'libbpf' (https://github.com/libbpf/libbpf.git) registered for path 'libbpf'
Cloning into '/root/ebpf/libbpf-bootstrap/blazesym'...
Cloning into '/root/ebpf/libbpf-bootstrap/bpftool'...
Cloning into '/root/ebpf/libbpf-bootstrap/libbpf'...
Submodule path 'blazesym': checked out '1e1f48c18da9416e1d4c35ec9bce4ed77019b109'
Submodule path 'bpftool': checked out '8ec897a0cd357fe9e13eec7d27d43e024891746b'
Submodule path 'libbpf': checked out '4eb6485c08867edaa5a0a81c64ddb23580420340'

The git command above clones the submodule repositories (libbpf, bpftool and blazesym) and checks out the commits that libbpf-bootstrap pins for them.

4. hello world level BPF application based on libbpf-bootstrap framework

With the libbpf-bootstrap framework, adding a new BPF application is very simple. We go to the libbpf-bootstrap/examples/c directory and create two C source files there, helloworld.bpf.c and helloworld.c (modeled on minimal.bpf.c and minimal.c). The former is the source code of the BPF program that runs in kernel state, and the latter is the user-state program that loads the BPF program into the kernel. Their source code is as follows.

// helloworld.bpf.c 

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("tracepoint/syscalls/sys_enter_execve")

int bpf_prog(void *ctx) {
  char msg[] = "Hello, World!";
  bpf_printk("invoke bpf_prog: %s\n", msg);
  return 0;
}

char LICENSE[] SEC("license") = "Dual BSD/GPL";

// helloworld.c

#include <stdio.h>
#include <unistd.h>
#include <sys/resource.h>
#include <bpf/libbpf.h>
#include "helloworld.skel.h"

static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
{
    return vfprintf(stderr, format, args);
}

int main(int argc, char **argv)
{
    struct helloworld_bpf *skel;
    int err;

    libbpf_set_strict_mode(LIBBPF_STRICT_ALL);
    /* Set up libbpf errors and debug info callback */
    libbpf_set_print(libbpf_print_fn);

    /* Open BPF application */
    skel = helloworld_bpf__open();
    if (!skel) {
        fprintf(stderr, "Failed to open BPF skeleton\n");
        return 1;
    }   

    /* Load & verify BPF programs */
    err = helloworld_bpf__load(skel);
    if (err) {
        fprintf(stderr, "Failed to load and verify BPF skeleton\n");
        goto cleanup;
    }

    /* Attach tracepoint handler */
    err = helloworld_bpf__attach(skel);
    if (err) {
        fprintf(stderr, "Failed to attach BPF skeleton\n");
        goto cleanup;
    }

    printf("Successfully started! Please run `sudo cat /sys/kernel/debug/tracing/trace_pipe` "
           "to see output of the BPF programs.\n");

    for (;;) {
        /* trigger our BPF program */
        fprintf(stderr, ".");
        sleep(1);
    }

cleanup:
    helloworld_bpf__destroy(skel);
    return -err;
}

The logic of the BPF program in helloworld.bpf.c is simple: the SEC macro attaches bpf_prog to the tracepoint for entry to the execve system call, so bpf_prog is called back every time execve is invoked. bpf_prog itself is just as simple: it writes one line to the kernel debug log, which we can read from /sys/kernel/debug/tracing/trace_pipe.

Since the BPF bytecode is encapsulated in helloworld.skel.h, helloworld.c, which includes that header, follows a fixed pattern: open -> load -> attach -> destroy. For a simple BPF program like helloworld, helloworld.c could even serve as a template. User-state programs that exchange data with the kernel-state BPF program, however, will not be this formulaic.
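One detail of the loader is worth noting: the for (;;) loop in helloworld.c never ends, so the destroy step at the cleanup label is only reached on an error path, and pressing Ctrl-C simply kills the process. A common way to reach the cleanup code on shutdown is a stop flag set from a signal handler. The following is a minimal sketch of that idea using only standard C; the names stop_requested, handle_stop_signal and install_stop_handlers are made up for this example and are not part of libbpf or of the generated skeleton.

```c
#include <assert.h>
#include <signal.h>

/* Set by the signal handler, polled by the main loop. */
static volatile sig_atomic_t stop_requested = 0;

/* Signal handler: only records that a stop was requested. */
static void handle_stop_signal(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/*
 * Install the handler for SIGINT (Ctrl-C) and SIGTERM.
 * Expected to be called once, before the main loop starts.
 * Returns 0 on success, -1 on failure.
 */
static int install_stop_handlers(void)
{
    assert(stop_requested == 0);
    if (signal(SIGINT, handle_stop_signal) == SIG_ERR)
        return -1;
    if (signal(SIGTERM, handle_stop_signal) == SIG_ERR)
        return -1;
    return 0;
}
```

In helloworld.c this would mean calling install_stop_handlers() before attaching and replacing for (;;) with while (!stop_requested), so that after Ctrl-C the loop exits and execution falls through to the cleanup label, where helloworld_bpf__destroy(skel) runs.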

Compiling the new helloworld program is also very simple, mainly because the libbpf-bootstrap project has an easily extensible Makefile: we just need to add a helloworld entry to the APPS variable in the Makefile.

# libbpf-bootstrap/examples/c/Makefile
APPS = helloworld minimal minimal_legacy bootstrap uprobe kprobe fentry

Then execute the make command to compile helloworld.

$make
  BPF      .output/helloworld.bpf.o
  GEN-SKEL .output/helloworld.skel.h
  CC       .output/helloworld.o
  BINARY   helloworld

We need to execute helloworld with root privileges.

$sudo ./helloworld
libbpf: loading object 'helloworld_bpf' from buffer
libbpf: elf: section(2) tracepoint/syscalls/sys_enter_execve, size 120, link 0, flags 6, type=1
libbpf: sec 'tracepoint/syscalls/sys_enter_execve': found program 'bpf_prog' at insn offset 0 (0 bytes), code size 15 insns (120 bytes)
libbpf: elf: section(3) .rodata.str1.1, size 14, link 0, flags 32, type=1
libbpf: elf: section(4) .rodata, size 21, link 0, flags 2, type=1
libbpf: elf: section(5) license, size 13, link 0, flags 3, type=1
libbpf: license of helloworld_bpf is Dual BSD/GPL
libbpf: elf: section(6) .BTF, size 560, link 0, flags 0, type=1
libbpf: elf: section(7) .BTF.ext, size 144, link 0, flags 0, type=1
libbpf: elf: section(8) .symtab, size 168, link 13, flags 0, type=2
libbpf: elf: section(9) .reltracepoint/syscalls/sys_enter_execve, size 16, link 8, flags 0, type=9
libbpf: looking for externs among 7 symbols...
libbpf: collected 0 externs total
libbpf: map '.rodata.str1.1' (global data): at sec_idx 3, offset 0, flags 480.
libbpf: map 0 is ".rodata.str1.1"
libbpf: map 'hellowor.rodata' (global data): at sec_idx 4, offset 0, flags 480.
libbpf: map 1 is "hellowor.rodata"
libbpf: sec '.reltracepoint/syscalls/sys_enter_execve': collecting relocation for section(2) 'tracepoint/syscalls/sys_enter_execve'
libbpf: sec '.reltracepoint/syscalls/sys_enter_execve': relo #0: insn #9 against '.rodata'
libbpf: prog 'bpf_prog': found data map 1 (hellowor.rodata, sec 4, off 0) for insn 9
libbpf: map '.rodata.str1.1': created successfully, fd=4
libbpf: map 'hellowor.rodata': created successfully, fd=5
Successfully started! Please run `sudo cat /sys/kernel/debug/tracing/trace_pipe` to see output of the BPF programs.
......

Execute the following command in another window to view the output of the bpf program (when an execve system call occurs).

$sudo cat /sys/kernel/debug/tracing/trace_pipe
             git-325411  [002] .... 4769772.705141: 0: invoke bpf_prog: Hello, World!
             git-325411  [002] .... 4769772.705260: 0: invoke bpf_prog: Hello, World!
            sudo-325745  [005] .... 4772321.191798: 0: invoke bpf_prog: Hello, World!
            sudo-325745  [005] .... 4772321.191818: 0: invoke bpf_prog: Hello, World!
           <...>-325746  [000] .... 4772322.798046: 0: invoke bpf_prog: Hello, World!
           ... ...
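Each line in trace_pipe carries the task name and PID, the CPU number, a flags field, a timestamp, and then the text written by bpf_printk. If you want to post-process this output in a user-state program, a line can be split as in the sketch below. It assumes the exact line shape shown above (as printed by the 5.4 kernel used here); the struct and function names are hypothetical, and other kernel versions may format the line slightly differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Fields extracted from one trace_pipe line (illustrative type). */
struct trace_record {
    char comm[32];   /* task name, e.g. "git" */
    long pid;        /* process ID that follows the last '-' */
    int  cpu;        /* CPU number from the "[002]" field */
    char msg[128];   /* text produced by bpf_printk */
};

/* Parse one trace_pipe line. Returns 0 on success, -1 on failure. */
static int parse_trace_line(const char *line, struct trace_record *out)
{
    char task[64];
    int cpu = 0, consumed = 0;
    const char *rest, *sep;
    char *dash;

    assert(line != NULL && out != NULL);

    /* "<comm>-<pid> [cpu] <flags> <timestamp>: " then the payload */
    if (sscanf(line, " %63s [%d] %*s %*[0-9.]: %n",
               task, &cpu, &consumed) != 2 || consumed == 0)
        return -1;

    /* The task name may itself contain '-', so split at the last one. */
    dash = strrchr(task, '-');
    if (dash == NULL || dash == task)
        return -1;
    *dash = '\0';

    snprintf(out->comm, sizeof(out->comm), "%s", task);
    out->pid = strtol(dash + 1, NULL, 10);
    out->cpu = cpu;

    /* Skip the leading "0: " field that precedes the message text. */
    rest = line + consumed;
    sep = strstr(rest, ": ");
    if (sep != NULL)
        rest = sep + 2;
    snprintf(out->msg, sizeof(out->msg), "%s", rest);
    out->msg[strcspn(out->msg, "\n")] = '\0';
    return 0;
}
```

For the first line of the output above, this yields the task name git, PID 325411, CPU 2, and the message "invoke bpf_prog: Hello, World!".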

IV. Developing a hello world BPF application based on libbpf

Now that we understand the libbpf-bootstrap pattern, we can see that developing a hello world level BPF application on top of libbpf is not difficult. Can we build a standalone BPF project without the libbpf-bootstrap framework? We can, so let’s try it below.

In this way, our only dependency is libbpf/libbpf. Of course we still need the libbpf/bpftool utility to generate the xx.skel.h file. So first we need to download libbpf/libbpf and libbpf/bpftool locally and compile and install them.

1. Compiling libbpf and bpftool

Let’s first download and compile libbpf.

$git clone https://github.com/libbpf/libbpf.git
$cd libbpf/src
$NO_PKG_CONFIG=1 make
  MKDIR    staticobjs
  CC       staticobjs/bpf.o
  CC       staticobjs/btf.o
  CC       staticobjs/libbpf.o
  CC       staticobjs/libbpf_errno.o
  CC       staticobjs/netlink.o
  CC       staticobjs/nlattr.o
  CC       staticobjs/str_error.o
  CC       staticobjs/libbpf_probes.o
  CC       staticobjs/bpf_prog_linfo.o
  CC       staticobjs/xsk.o
  CC       staticobjs/btf_dump.o
  CC       staticobjs/hashmap.o
  CC       staticobjs/ringbuf.o
  CC       staticobjs/strset.o
  CC       staticobjs/linker.o
  CC       staticobjs/gen_loader.o
  CC       staticobjs/relo_core.o
  CC       staticobjs/usdt.o
  AR       libbpf.a
  MKDIR    sharedobjs
  CC       sharedobjs/bpf.o
  CC       sharedobjs/btf.o
  CC       sharedobjs/libbpf.o
  CC       sharedobjs/libbpf_errno.o
  CC       sharedobjs/netlink.o
  CC       sharedobjs/nlattr.o
  CC       sharedobjs/str_error.o
  CC       sharedobjs/libbpf_probes.o
  CC       sharedobjs/bpf_prog_linfo.o
  CC       sharedobjs/xsk.o
  CC       sharedobjs/btf_dump.o
  CC       sharedobjs/hashmap.o
  CC       sharedobjs/ringbuf.o
  CC       sharedobjs/strset.o
  CC       sharedobjs/linker.o
  CC       sharedobjs/gen_loader.o
  CC       sharedobjs/relo_core.o
  CC       sharedobjs/usdt.o
  CC       libbpf.so.0.8.0

Next, download and compile libbpf/bpftool.

$git clone https://github.com/libbpf/bpftool.git
$cd bpftool/src
$make
... ...
  CC       gen.o
  CC       main.o
  CC       json_writer.o
  CC       cfg.o
  CC       map.o
  CC       pids.o
  CC       feature.o
  CC       disasm.o
  LINK     bpftool

2. Install libbpf library and bpftool tool

We will install the compiled libbpf library under /usr/local/bpf for subsequent shared dependencies of all libbpf-based programs.

$cd libbpf/src
$sudo BUILD_STATIC_ONLY=1 NO_PKG_CONFIG=1 PREFIX=/usr/local/bpf make install
  INSTALL  bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h xsk.h bpf_helpers.h bpf_helper_defs.h bpf_tracing.h bpf_endian.h bpf_core_read.h skel_internal.h libbpf_version.h usdt.bpf.h
  INSTALL  ./libbpf.pc
  INSTALL  ./libbpf.a

After installation, the structure under /usr/local/bpf is as follows.

$tree /usr/local/bpf
/usr/local/bpf
|-- include
|   `-- bpf
|       |-- bpf.h
|       |-- bpf_core_read.h
|       |-- bpf_endian.h
|       |-- bpf_helper_defs.h
|       |-- bpf_helpers.h
|       |-- bpf_tracing.h
|       |-- btf.h
|       |-- libbpf.h
|       |-- libbpf_common.h
|       |-- libbpf_legacy.h
|       |-- libbpf_version.h
|       |-- skel_internal.h
|       |-- usdt.bpf.h
|       `-- xsk.h
`-- lib64
    |-- libbpf.a
    `-- pkgconfig
        `-- libbpf.pc

Let’s install bpftool again.

$cd bpftool/src
$sudo NO_PKG_CONFIG=1  make install
...                        libbfd: [ OFF ]
...        disassembler-four-args: [ OFF ]
...                          zlib: [ on  ]
...                        libcap: [ OFF ]
...               clang-bpf-co-re: [ OFF ]
  INSTALL  bpftool

By default, bpftool is installed to /usr/local/sbin, so make sure /usr/local/sbin is in your PATH.

$which bpftool
/usr/local/sbin/bpftool

3. Write the helloworld BPF program

We create a helloworld directory in any path and copy the previous helloworld.bpf.c and helloworld.c to that helloworld directory.

All we are missing is a Makefile, and here is the complete contents of the Makefile.

# helloworld/Makefile

CLANG ?= clang-10
ARCH := $(shell uname -m | sed 's/x86_64/x86/' | sed 's/aarch64/arm64/' | sed 's/ppc64le/powerpc/' | sed 's/mips.*/mips/')
BPFTOOL ?= /usr/local/sbin/bpftool

LIBBPF_TOP = /home/tonybai/test/ebpf/libbpf

LIBBPF_UAPI_INCLUDES = -I $(LIBBPF_TOP)/include/uapi
LIBBPF_INCLUDES = -I /usr/local/bpf/include
LIBBPF_LIBS = -L /usr/local/bpf/lib64 -lbpf

INCLUDES=$(LIBBPF_UAPI_INCLUDES) $(LIBBPF_INCLUDES)

CLANG_BPF_SYS_INCLUDES = $(shell $(CLANG) -v -E - </dev/null 2>&1 | sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }')

all: build

build: helloworld

# Note: recipe lines must start with a tab character, not spaces.
helloworld.bpf.o: helloworld.bpf.c
	$(CLANG) -g -O2 -target bpf -D__TARGET_ARCH_$(ARCH) $(INCLUDES) $(CLANG_BPF_SYS_INCLUDES) -c helloworld.bpf.c -o helloworld.bpf.o

helloworld.skel.h: helloworld.bpf.o
	$(BPFTOOL) gen skeleton helloworld.bpf.o > helloworld.skel.h

# LIBBPF_LIBS already contains -lbpf, so it is not repeated here.
helloworld: helloworld.skel.h helloworld.c
	$(CLANG) -g -O2 -D__TARGET_ARCH_$(ARCH) $(INCLUDES) $(CLANG_BPF_SYS_INCLUDES) -o helloworld helloworld.c $(LIBBPF_LIBS) -lelf -lz

Our Makefile is clearly “borrowed” from libbpf-bootstrap, but it is simpler to understand. Its main job is to tell the compiler where the header files and the library (libbpf.a) that helloworld.bpf.c and helloworld.c depend on are located.
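One line of the Makefile that may look cryptic is the ARCH variable: it rewrites the output of uname -m into the suffix used by the -D__TARGET_ARCH_$(ARCH) define, which libbpf headers such as bpf_tracing.h use to select architecture-specific definitions. The sketch below expresses the same mapping in C for readers less familiar with sed. It is an approximation under one assumption: that the machine name matches one of the patterns as a whole string, which is the case for typical uname -m output. The function name is made up for this example.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper mirroring the sed pipeline in the Makefile:
 * map a `uname -m` machine name to the __TARGET_ARCH_ suffix.
 * Names with no special rule are returned unchanged.
 */
static const char *bpf_target_arch(const char *machine)
{
    assert(machine != NULL);

    if (strcmp(machine, "x86_64") == 0)
        return "x86";
    if (strcmp(machine, "aarch64") == 0)
        return "arm64";
    if (strcmp(machine, "ppc64le") == 0)
        return "powerpc";
    if (strncmp(machine, "mips", 4) == 0)   /* mips, mips64, ... */
        return "mips";
    return machine;
}
```

On the x86_64 machine used in this article, the result is x86, so the compiler is invoked with -D__TARGET_ARCH_x86.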

The only thing to note here is that when installing libbpf/libbpf, the header files under the repository libbpf/include are not installed under /usr/local/bpf, but then helloworld.bpf.c depends on linux/bpf.h, which is essentially libbpf/include/uapi/linux/bpf.h, so in the Makefile, we add LIBBPF_UAPI_INCLUDES for the bpf-related headers in uapi.

The build flow of this Makefile is the same as in libbpf-bootstrap: it first compiles the BPF bytecode, then generates helloworld.skel.h from it, and finally compiles the helloworld program that depends on helloworld.skel.h. Note that here we link libbpf statically (we installed only libbpf.a).

The built helloworld is no different from the one built based on libbpf-bootstrap, so the process of starting and running it is not described here.

Note: The above is only a simplest helloworld level example and does not yet support BTF and CO-RE technologies.

V. Summary

In this article, I very briefly introduced BPF technology, focusing mainly on how to develop a hello world level eBPF program in C. The article presents two approaches: one based on the libbpf-bootstrap framework, and the other a standalone BPF project that relies only on libbpf.

With this foundation, we are in a good position to get started, and subsequent articles will expand on what can be done with eBPF programs. They will also explain how to use Go to develop user-state programs for BPF that load, attach, and unload BPF programs and exchange data between kernel state and user state.

The code for this article can be downloaded here.