In the previous article, I described how to develop and load eBPF code based on the kernel source tree. This article covers developing eBPF programs with Go and the corresponding libraries; all of the code can be found on my GitHub.

Selecting eBPF Libraries

Choosing libraries and tools to interact with eBPF can be confusing. The options include the Python-based BCC framework, the C-based libbpf, and a series of Go-based libraries from Dropbox, Cilium, Aqua, and Calico.

In most cases, the eBPF libraries assist with two main functions.

  • Load the eBPF programs and Maps into the kernel and perform relocation, associating each eBPF program with the correct Map through its file descriptor.
  • Interact with the eBPF Maps, allowing standard CRUD operations on the key/value pairs stored in them.

Some of the libraries can also help you to attach an eBPF program to a specific hook, although for network scenarios this may easily be done using the existing netlink API libraries.

It is still confusing when it comes to the choice of eBPF libraries (see [1], [2]). The truth is that each library has its own scope and limitations.

  • Calico implements a Go wrapper around the bpftool and iproute2 CLI commands.
  • Aqua implements a Go wrapper for the libbpf C library.
  • Dropbox supports a small number of programs, but has a very clean and user-friendly API.
  • IO Visor’s gobpf is the Go language binding for the BCC framework, which is more focused on tracing and performance analysis.
  • Cilium and Cloudflare maintain a library written in pure Go, cilium/ebpf, which abstracts all eBPF system calls behind a native Go interface.

Of these, cilium/ebpf is the more active project, and this article is based on it. Because cilium/ebpf is written purely in Go, it has minimal dependencies. It also provides the bpf2go tool, which compiles eBPF programs and embeds them in Go code. This makes delivery easier, and it becomes more powerful still when combined with CO-RE.

Environment preparation

An eBPF project generally consists of two parts.

  1. A C-based eBPF program, compiled with clang/llvm into an ELF file, which is loaded into the kernel.
  2. A Go program that loads and debugs the eBPF program; it runs in user space and configures the eBPF program or reads the data it generates.

The clang/llvm compiler must be installed first.

# Install the LLVM compiler; clang 9.0 or later is required
$ sudo apt update -y
$ sudo apt install -y llvm
$ sudo apt install -y clang

You can download the code from my GitHub. The directory structure is as follows.

[root@VM-4-27-centos demo]# tree
.
|-- bpf
|   |-- headers
|   |   |-- bpf_core_read.h
|   |   |-- bpf_helper_defs.h
|   |   |-- bpf_helpers.h
|   |   |-- bpf_tracing.h
|   |   |-- update.sh
|   |   `-- vmlinux.h
|   `-- kprobe.c
|-- Dockerfile
|-- go.mod
|-- go.sum
|-- main.go
`-- Makefile

Programming Specifications

BPF code

Take kprobe as an example

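// +build ignore

// Headers from ./bpf/headers (see the directory tree above).
#include "vmlinux.h"
#include "bpf_helpers.h"
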
char __license[] SEC("license") = "Dual MIT/GPL";

struct bpf_map_def SEC("maps") kprobe_map = {
    .type = BPF_MAP_TYPE_ARRAY,
    .key_size = sizeof(u32),
    .value_size = sizeof(u64),
    .max_entries = 1,
};

SEC("kprobe/sys_execve")
int kprobe_execve() {
    u32 key = 0;
    u64 initval = 1, *valp;

    valp = bpf_map_lookup_elem(&kprobe_map, &key);
    if (!valp) {
        bpf_map_update_elem(&kprobe_map, &key, &initval, BPF_ANY);
        return 0;
    }
    __sync_fetch_and_add(valp, 1);

    return 0;
}

headers

libbpf

The libbpf headers under bpf/headers are fetched with the update.sh script:

#!/usr/bin/env bash

# Version of libbpf to fetch headers from
LIBBPF_VERSION=0.5.0

# The headers we want
prefix=libbpf-"$LIBBPF_VERSION"
headers=(
    "$prefix"/src/bpf_core_read.h
    "$prefix"/src/bpf_helper_defs.h
    "$prefix"/src/bpf_helpers.h
    "$prefix"/src/bpf_tracing.h
)

# Fetch libbpf release and extract the desired headers
curl -sL "https://github.com/libbpf/libbpf/archive/refs/tags/v${LIBBPF_VERSION}.tar.gz" | \
    tar -xz --xform='s#.*/##' "${headers[@]}"

vmlinux.h

vmlinux.h is a header file generated by bpftool. It contains all the type definitions used in the source code of the running Linux kernel. Compiling the Linux kernel produces a file called vmlinux, an ELF binary that contains the compiled bootable kernel. Major Linux distributions usually ship the vmlinux file as well.

One of the functions of the bpftool utility, which lives in the kernel source tree, is to read the type information of vmlinux and generate the corresponding vmlinux.h header file. vmlinux.h contains the definition of every type used in the running kernel, so it is a relatively large file.

The command to generate the vmlinux.h file is as follows.

$ bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

The inclusion of this vmlinux.h means that our program can use all the data type definitions used in the kernel, so that the BPF program can map to the corresponding type structure by field when reading the relevant memory.

For example, the task_struct structure in Linux represents a process. If a BPF program needs to inspect a value in task_struct, it first needs to know the exact type definition of the structure.

[Figure: task_struct in Linux]

Because vmlinux.h is generated from the currently running kernel, an eBPF program compiled against it may fail, or read the wrong data, when you run it on another machine with a different kernel version. This is mainly because the definitions of the corresponding data types can change between versions of the Linux source code.

However, CO-RE (Compile Once, Run Everywhere) can be achieved with the features provided by the libbpf library. libbpf defines macros (e.g. BPF_CORE_READ) that record which fields of the types defined in vmlinux.h the eBPF program accesses. If an accessed field has moved within the structure as defined by the target kernel, the macro/helper function finds the corresponding field automatically. Therefore, we can compile the eBPF program against the vmlinux.h generated from the current kernel and then run it on a different kernel, provided that the target kernel was also built with BTF support (the CONFIG_DEBUG_INFO_BTF option).

Code compilation

bpf2go

The go:generate annotation in main.go (shown below) runs the bpf2go program to compile kprobe.c and generate two files, bpfdemo_bpfeb.go and bpfdemo_bpfel.go, for big-endian and little-endian platforms respectively.

The BPFDemo argument is the prefix of the generated identifiers that main.go uses, e.g. objs := BPFDemoObjects{} and LoadBPFDemoObjects(&objs, nil).

// SPDX-License-Identifier: GPL-2.0-only
// Copyright (C) 2021 Authors of Nylon

//go:generate sh -c "echo Generating for amd64"
//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -cc clang BPFDemo ./bpf/kprobe.c -- -DOUTPUT_SKB -D__TARGET_ARCH_x86 -I./bpf/headers

package main
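
To make the big-endian/little-endian split concrete, here is a small sketch that detects the byte order of the host and derives the matching generated file name. The naming rule (lower-cased identifier plus _bpfel.go or _bpfeb.go) is inferred from the file names in this article; the helper names are invented for illustration. In practice no such check is needed at run time, because the generated files carry build constraints and the Go toolchain compiles only the matching one.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// hostIsLittleEndian inspects the first byte of a 16-bit value in memory.
func hostIsLittleEndian() bool {
	var x uint16 = 1
	return *(*byte)(unsafe.Pointer(&x)) == 1
}

// generatedFile returns the Go file name that corresponds to the given
// identifier and byte order, e.g. "BPFDemo" -> "bpfdemo_bpfel.go".
// The rule is inferred from the file names shown in this article.
func generatedFile(ident string, littleEndian bool) string {
	suffix := "bpfeb"
	if littleEndian {
		suffix = "bpfel"
	}
	return fmt.Sprintf("%s_%s.go", strings.ToLower(ident), suffix)
}

func main() {
	fmt.Println("file used on this host:", generatedFile("BPFDemo", hostIsLittleEndian()))
}
```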

Makefile

GO := go
GO_BUILD = CGO_ENABLED=0 $(GO) build
GO_GENERATE = $(GO) generate
GO_TAGS ?=
TARGET=BPFDemo
BINDIR ?= /usr/local/bin
VERSION=$(shell git describe --tags --always)

# Recipe lines must be indented with a tab character, not spaces.
.PHONY: clean

$(TARGET):
	$(GO_GENERATE)
	$(GO_BUILD) $(if $(GO_TAGS),-tags $(GO_TAGS)) \
		-ldflags "-w -s \
		-X 'github.com/SimpCosm/godemo/ebpf/BPFDemo.Version=${VERSION}'"

clean:
	rm -f $(TARGET)
	rm -f bpfdemo_bpf*
	rm -rf ./release

Running make generates the BPF object files bpfdemo_bpfeb.o and bpfdemo_bpfel.o, as well as the corresponding Go files.

[root@VM-4-27-centos demo]# make
go generate
Generating for amd64
Compiled /root/demo/bpfdemo_bpfel.o
Stripped /root/demo/bpfdemo_bpfel.o
Wrote /root/demo/bpfdemo_bpfel.go
Compiled /root/demo/bpfdemo_bpfeb.o
Stripped /root/demo/bpfdemo_bpfeb.o
Wrote /root/demo/bpfdemo_bpfeb.go
CGO_ENABLED=0 go build  \
    -ldflags "-w -s \
    -X 'github.com/SimpCosm/godemo/ebpf/BPFDemo.Version='"
[root@VM-4-27-centos demo]# ls
Dockerfile  bpf               bpfdemo_bpfeb.o   bpfdemo_bpfel.o  go.mod  main.go        main_arm64.go
Makefile    bpfdemo_bpfeb.go  bpfdemo_bpfel.go  demo             go.sum  main_amd64.go

Loading code

In the Go code we write, we first need to load the compiled eBPF code into the kernel by calling LoadBPFDemoObjects.

// Load pre-compiled programs and maps into the kernel.
objs := BPFDemoObjects{}
if err := LoadBPFDemoObjects(&objs, nil); err != nil {
    log.Fatalf("loading objects: %v", err)
}
defer objs.Close()

Here LoadBPFDemoObjects and BPFDemoObjects are both from the automatically generated code of bpf2go.

Taking bpfdemo_bpfeb.go as an example, you can see that many helper functions and structures are generated. Among them:

  • BPFDemoObjects contains the BPF programs and BPF Maps.
  • LoadBPFDemoObjects calls LoadBPFDemo to parse the compiled BPF code in ELF format into memory, and then calls LoadAndAssign, which issues the BPF system calls that load the programs and Maps into the kernel.

// BPFDemoMaps contains all maps after they have been loaded into the kernel.
//
// It can be passed to LoadBPFDemoObjects or ebpf.CollectionSpec.LoadAndAssign.
type BPFDemoMaps struct {
        KprobeMap *ebpf.Map `ebpf:"kprobe_map"`
}

// BPFDemoPrograms contains all programs after they have been loaded into the kernel.
//
// It can be passed to LoadBPFDemoObjects or ebpf.CollectionSpec.LoadAndAssign.
type BPFDemoPrograms struct {
        KprobeExecve *ebpf.Program `ebpf:"kprobe_execve"`
}

// BPFDemoObjects contains all objects after they have been loaded into the kernel.
//
// It can be passed to LoadBPFDemoObjects or ebpf.CollectionSpec.LoadAndAssign.
type BPFDemoObjects struct {
        BPFDemoPrograms
        BPFDemoMaps
}

// LoadBPFDemoObjects loads BPFDemo and converts it into a struct.
//
// The following types are suitable as obj argument:
//
//     *BPFDemoObjects
//     *BPFDemoPrograms
//     *BPFDemoMaps
//
// See ebpf.CollectionSpec.LoadAndAssign documentation for details.
func LoadBPFDemoObjects(obj interface{}, opts *ebpf.CollectionOptions) error {
        spec, err := LoadBPFDemo()
        if err != nil {
                return err
        }

        return spec.LoadAndAssign(obj, opts)
}

Looking at LoadAndAssign itself, you can see that it loads the BPF Programs and BPF Maps into the kernel.

// LoadAndAssign loads Maps and Programs into the kernel and assigns them
// to a struct.
//    struct {
//        Foo     *ebpf.Program `ebpf:"xdp_foo"`
//        Bar     *ebpf.Map     `ebpf:"bar_map"`
//        Ignored int
//    }
func (cs *CollectionSpec) LoadAndAssign(to interface{}, opts *CollectionOptions) error {
    loader := newCollectionLoader(cs, opts)
    defer loader.cleanup()

    // Support assigning Programs and Maps, lazy-loading the required objects.
    assignedMaps := make(map[string]bool)
    getValue := func(typ reflect.Type, name string) (interface{}, error) {
        switch typ {

        case reflect.TypeOf((*Program)(nil)):
            return loader.loadProgram(name)

        case reflect.TypeOf((*Map)(nil)):
            assignedMaps[name] = true
            return loader.loadMap(name)

        default:
            return nil, fmt.Errorf("unsupported type %s", typ)
        }
    }
  //...
}
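
The struct-tag mechanism used by LoadAndAssign can be illustrated without touching the kernel. The sketch below is not the library's implementation: Program and Map are stand-in types, and assign simply fabricates a value named after the tag where the real code would load the object. It only shows how reflection picks fields by their `ebpf:"..."` tag and dispatches on the field type.

```go
package main

import (
	"fmt"
	"reflect"
)

// Program and Map are hypothetical stand-ins for *ebpf.Program and *ebpf.Map.
type Program struct{ Name string }
type Map struct{ Name string }

// assign fills every field of the struct pointed to by `to` that carries an
// `ebpf:"name"` tag, choosing what to create from the field's type.
func assign(to interface{}) error {
	v := reflect.ValueOf(to)
	if v.Kind() != reflect.Ptr || v.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("%T is not a pointer to struct", to)
	}
	v = v.Elem()
	for i := 0; i < v.NumField(); i++ {
		field := v.Type().Field(i)
		name, ok := field.Tag.Lookup("ebpf")
		if !ok {
			continue // untagged fields are ignored
		}
		switch field.Type {
		case reflect.TypeOf((*Program)(nil)):
			v.Field(i).Set(reflect.ValueOf(&Program{Name: name}))
		case reflect.TypeOf((*Map)(nil)):
			v.Field(i).Set(reflect.ValueOf(&Map{Name: name}))
		default:
			return fmt.Errorf("field %s: unsupported type %s", field.Name, field.Type)
		}
	}
	return nil
}

func main() {
	var objs struct {
		Foo     *Program `ebpf:"xdp_foo"`
		Bar     *Map     `ebpf:"bar_map"`
		Ignored int
	}
	if err := assign(&objs); err != nil {
		panic(err)
	}
	fmt.Println(objs.Foo.Name, objs.Bar.Name) // prints "xdp_foo bar_map"
}
```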

Here loadProgram calls newProgramWithOptions, which handles BTF and other details, and finally calls sys.ProgLoad(attr).

func newProgramWithOptions(spec *ProgramSpec, opts ProgramOptions, handles *handleCache) (*Program, error) {
    // ...
    fd, err := sys.ProgLoad(attr)
  
    // ...
}

ProgLoad in turn issues the BPF system call with the BPF_PROG_LOAD command.

func ProgLoad(attr *ProgLoadAttr) (*FD, error) {
    fd, err := BPF(BPF_PROG_LOAD, unsafe.Pointer(attr), unsafe.Sizeof(*attr))
    if err != nil {
        return nil, err
    }
    return NewFD(int(fd))
}

Loading a Map is similar and ends in a call to sys.MapCreate:

func MapCreate(attr *MapCreateAttr) (*FD, error) {
    fd, err := BPF(BPF_MAP_CREATE, unsafe.Pointer(attr), unsafe.Sizeof(*attr))
    if err != nil {
        return nil, err
    }
    return NewFD(int(fd))
}

Kprobe processing

kprobes can dynamically instrument almost any kernel function, and can be enabled on a live production system without rebooting or starting the kernel in a special mode. Three interfaces are currently available for using kprobes.

  • kprobe API: register_kprobe() and related functions.
  • Ftrace-based, via /sys/kernel/debug/tracing/kprobe_events: writing configuration strings to this file enables and disables kprobes.
  • perf_event_open(): the interface used by the perf tool, and now also by BPF tracing tools.
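
For the Ftrace-based interface, the strings written to kprobe_events follow a simple format. The sketch below only builds such strings and does not write them anywhere; the group name demo and the event name execve_entry are made-up examples, and both helper functions are hypothetical.

```go
package main

import "fmt"

// kprobeEventLine formats a probe definition for the kprobe_events file:
//
//	p:GROUP/EVENT SYMBOL   defines a kprobe
//	r:GROUP/EVENT SYMBOL   defines a kretprobe
func kprobeEventLine(group, event, symbol string, ret bool) string {
	kind := "p"
	if ret {
		kind = "r"
	}
	return fmt.Sprintf("%s:%s/%s %s", kind, group, event, symbol)
}

// removeEventLine formats the line that deletes a previously defined probe.
func removeEventLine(group, event string) string {
	return fmt.Sprintf("-:%s/%s", group, event)
}

func main() {
	fmt.Println(kprobeEventLine("demo", "execve_entry", "sys_execve", false))
	fmt.Println(kprobeEventLine("demo", "execve_exit", "sys_execve", true))
	fmt.Println(removeEventLine("demo", "execve_entry"))
}
```

Appending one of the first two lines to kprobe_events (as root) defines the probe, and writing the last line removes it again.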

In main.go, after LoadBPFDemoObjects, we call link.Kprobe to attach the program.

// Open a Kprobe at the entry point of the kernel function and attach the
// pre-compiled program. Each time the kernel function enters, the program
// will increment the execution counter by 1. The read loop below polls this
// map value once per second.
kp, err := link.Kprobe(fn, objs.KprobeExecve)
if err != nil {
    log.Fatalf("opening kprobe: %s", err)
}
defer kp.Close()

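link.Kprobe creates a perf event of type kprobe and attaches the program to it. The library source shown here takes a third opts argument, while the call in main.go above uses an older two-argument signature; with the newer signature, nil can be passed for opts.

  • symbol is the traced kernel function
  • prog is the compiled eBPF program

func Kprobe(symbol string, prog *ebpf.Program, opts *KprobeOptions) (Link, error) {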
   k, err := kprobe(symbol, prog, opts, false)
   if err != nil {
      return nil, err
   }

   lnk, err := attachPerfEvent(k, prog)
   if err != nil {
      k.Close()
      return nil, err
   }

   return lnk, nil
}

Here a Perf Event of type kprobe is created, and the trace address passed in is symbol.

// kprobe opens a perf event on the given symbol and attaches prog to it.
// If ret is true, create a kretprobe.
func kprobe(symbol string, prog *ebpf.Program, opts *KprobeOptions, ret bool) (*perfEvent, error) {
  // ...
  
    args := probeArgs{
        pid:    perfAllThreads,
        symbol: platformPrefix(symbol),
        ret:    ret,
    }

    // Use kprobe PMU if the kernel has it available.
    tp, err := pmuKprobe(args)
    if err == nil {
        return tp, nil
    }

    // ... 

    // Use tracefs if kprobe PMU is missing.
    args.symbol = platformPrefix(symbol)
    tp, err = tracefsKprobe(args)
    // ...

    return tp, nil
}
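
The platformPrefix call above exists because, on recent kernels, system call handlers are wrapped in architecture-specific symbols, so sys_execve is exposed as __x64_sys_execve on x86-64. The following is a rough sketch of such a mapping, not the library's code: the function name syscallSymbol is invented, and only three architectures are covered as an assumption.

```go
package main

import (
	"fmt"
	"runtime"
)

// syscallSymbol prepends the architecture-specific syscall wrapper prefix
// to a symbol such as "sys_execve". Architectures not listed here are
// returned unchanged (an assumption made for this sketch).
func syscallSymbol(goarch, symbol string) string {
	var arch string
	switch goarch {
	case "386":
		arch = "ia32"
	case "amd64":
		arch = "x64"
	case "arm64":
		arch = "arm64"
	default:
		return symbol
	}
	return fmt.Sprintf("__%s_%s", arch, symbol)
}

func main() {
	fmt.Println(syscallSymbol(runtime.GOARCH, "sys_execve"))
}
```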

Finally, pmuProbe calls unix.PerfEventOpen, a wrapper around the perf_event_open system call, to open the perf event.

// pmuProbe opens a perf event based on a Performance Monitoring Unit.
//
// Requires at least a 4.17 kernel.
// e12f03d7031a "perf/core: Implement the 'perf_kprobe' PMU"
// 33ea4b24277b "perf/core: Implement the 'perf_uprobe' PMU"
//
// Returns ErrNotSupported if the kernel doesn't support perf_[k,u]probe PMU
func pmuProbe(typ probeType, args probeArgs) (*perfEvent, error) {
  // ...
    switch typ {
    case kprobeType:
        // Create a pointer to a NUL-terminated string for the kernel.
        sp, err = unsafeStringPtr(args.symbol)

        attr = unix.PerfEventAttr{
            Type:   uint32(et),          // PMU event type read from sysfs
            Ext1:   uint64(uintptr(sp)), // Kernel symbol to trace
            Config: config,              // Retprobe flag
        }
    case uprobeType:
    // ...
    }

    rawFd, err := unix.PerfEventOpen(&attr, args.pid, 0, -1, unix.PERF_FLAG_FD_CLOEXEC)
    fd, err := sys.NewFD(rawFd)

    // ...
    // Kernel has perf_[k,u]probe PMU available, initialize perf event.
    return &perfEvent{
        typ:    typ.PerfEventType(args.ret),
        name:   args.symbol,
        pmuID:  et,
        cookie: args.cookie,
        fd:     fd,
    }, nil
}

Attach the eBPF program to the perf event

The BPF program is attached to the kprobe event through ioctl calls on the perf event fd.

  • PERF_EVENT_IOC_SET_BPF attaches a BPF program to the kprobe event; the third ioctl argument is the fd of the loaded BPF program, as returned by the bpf system call.
  • PERF_EVENT_IOC_ENABLE enables the event.

ioctl(perf_event_fd, PERF_EVENT_IOC_SET_BPF, bpf_prog_fd)
ioctl(perf_event_fd, PERF_EVENT_IOC_ENABLE, 0)

attachPerfEvent attaches the BPF program to the kprobe event, using a BPF link when the kernel supports it and falling back to the perf_event ioctl calls otherwise.

// attach the given eBPF prog to the perf event stored in pe.
// pe must contain a valid perf event fd.
// prog's type must match the program type stored in pe.
func attachPerfEvent(pe *perfEvent, prog *ebpf.Program) (Link, error) {
    if prog == nil {
        return nil, errors.New("cannot attach a nil program")
    }
    if prog.FD() < 0 {
        return nil, fmt.Errorf("invalid program: %w", sys.ErrClosedFd)
    }

    switch pe.typ {
    case kprobeEvent, kretprobeEvent, uprobeEvent, uretprobeEvent:
        if t := prog.Type(); t != ebpf.Kprobe {
            return nil, fmt.Errorf("invalid program type (expected %s): %s", ebpf.Kprobe, t)
        }
    case tracepointEvent:
        if t := prog.Type(); t != ebpf.TracePoint {
            return nil, fmt.Errorf("invalid program type (expected %s): %s", ebpf.TracePoint, t)
        }
    default:
        return nil, fmt.Errorf("unknown perf event type: %d", pe.typ)
    }

    if err := haveBPFLinkPerfEvent(); err == nil {
        lnk, err := attachPerfEventLink(pe, prog)
        if err != nil {
            return nil, err
        }
        return lnk, nil
    }

    lnk, err := attachPerfEventIoctl(pe, prog)
    if err != nil {
        return nil, err
    }

    return lnk, nil
}

The ioctl fallback attaches the BPF program as follows.

func attachPerfEventIoctl(pe *perfEvent, prog *ebpf.Program) (*perfEventIoctl, error) {
    if pe.cookie != 0 {
        return nil, fmt.Errorf("cookies are not supported: %w", ErrNotSupported)
    }

    // Assign the eBPF program to the perf event.
    err := unix.IoctlSetInt(pe.fd.Int(), unix.PERF_EVENT_IOC_SET_BPF, prog.FD())
    if err != nil {
        return nil, fmt.Errorf("setting perf event bpf program: %w", err)
    }

    // PERF_EVENT_IOC_ENABLE and _DISABLE ignore their given values.
    if err := unix.IoctlSetInt(pe.fd.Int(), unix.PERF_EVENT_IOC_ENABLE, 0); err != nil {
        return nil, fmt.Errorf("enable perf event: %s", err)
    }

    pi := &perfEventIoctl{pe}

    // Close the perf event when its reference is lost to avoid leaking system resources.
    runtime.SetFinalizer(pi, (*perfEventIoctl).Close)
    return pi, nil
}

View Map information

Periodically check the eBPF map for updates.

// Read loop reporting the total amount of times the kernel
// function was entered, once per second.
ticker := time.NewTicker(1 * time.Second)
defer ticker.Stop()

log.Println("Waiting for events..")

for range ticker.C {
    var value uint64
    if err := objs.KprobeMap.Lookup(mapKey, &value); err != nil {
        log.Fatalf("reading map: %v", err)
    }
    log.Printf("%s called %d times\n", fn, value)
}

Container image

FROM ubuntu:20.04
RUN apt update -y -q
RUN DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y -q curl build-essential ca-certificates
RUN curl -s https://storage.googleapis.com/golang/go1.16.3.linux-amd64.tar.gz| tar -v -C /usr/local -xz
ENV PATH $PATH:/usr/local/go/bin
RUN apt install -y wget gnupg2
RUN printf "deb http://apt.llvm.org/focal/ llvm-toolchain-focal-12 main" | tee /etc/apt/sources.list.d/llvm-toolchain-focal-12.list
RUN wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
RUN apt -y update
RUN apt install -y llvm clang git
WORKDIR /ebpf
COPY . .
RUN make
RUN chmod a+x /ebpf
ENTRYPOINT ["./ebpf"]