Debugging the Linux kernel with Qemu is convenient, so I recently tried it out and wrote down the general steps and some of the pitfalls.

Environment

Since I am at home on a long vacation, I only have a MacBook Pro running macOS available. Developing and debugging the Linux kernel is easier in a Linux environment, so I created a virtual machine with Ubuntu 18.04 using VMware Fusion. Because compiling the Linux kernel and related software is resource-intensive, I configured the virtual machine with a dual-core CPU, 2GB of memory and 20GB of disk space (the laptop itself has limited resources). In practice this was stretched to the limit, especially physical memory and disk, so I added 3GB of swap and expanded the disk by another 20GB (still a tight fit), which solved the problem.

Compile the Linux kernel

First, compile the kernel. Before compiling, you need to enable the kernel's debugging options through Kconfig.

Download kernel source code

Downloading the kernel source code in China is real manual labor, because the Linux source tree is large and the GFW gets in the way.

The first method is to clone the Linux source code directly from the Git repository, which is currently about 3.7GB. If you clone from GitHub (https://github.com/torvalds/linux), you often run into disconnections, which is very annoying. If you clone from a domestic mirror, such as the Tsinghua kernel Git mirror, it starts out fast but then gets slower and slower. So unless you are as stubborn as I am, I do not recommend getting the kernel source this way.

An easier way is to download a specific version of the source code, which is available as a tarball from the kernel website or a mirror site. The kernel version I used in my experiments was 4.19, and the gz archive is about 150MB.
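The tarball route is easy to script. The sketch below only builds the download URL for a given version. It assumes the directory layout of cdn.kernel.org (v4.x for 4.x releases), and the helper name kernel_tarball_url is made up for this example.

```shell
# Hypothetical helper: print the kernel.org tarball URL for a version.
# Assumes the layout https://cdn.kernel.org/pub/linux/kernel/v<major>.x/
kernel_tarball_url() {
    version="$1"
    major="${version%%.*}"   # "4.19" -> "4"
    echo "https://cdn.kernel.org/pub/linux/kernel/v${major}.x/linux-${version}.tar.gz"
}

# Example usage (commented out: needs network access):
#   wget "$(kernel_tarball_url 4.19)"
#   tar -xzf linux-4.19.tar.gz
```

A mirror site can be substituted by changing the host part of the URL, provided it keeps the same directory layout.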

Configuring the Kernel

If you got the kernel source with git clone, switch the tree to version 4.19 with git checkout v4.19.

Before compiling, install the dependencies (if the build asks for other packages, install them as needed).

sudo apt install libncurses5-dev libssl-dev bison flex libelf-dev gcc make openssl libc6-dev

Before compiling, the kernel build options need to be configured through Kconfig. In the kernel source directory, run make menuconfig (text-based interface) or make gconfig (GTK-based graphical interface). During configuration, the following options need to be turned on.

Kernel hacking -> Kernel debugging
Kernel hacking -> KGDB: kernel debugger
Kernel hacking -> Compile time checks and compiler options -> Provide GDB scripts for kernel debugging

Also make sure that the following option is not turned on.

Kernel hacking -> Compile time checks and compiler options -> Reduce debugging information

After exiting the configuration, you can find a configuration file named .config generated in the kernel directory.
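To double-check the result without reopening the menu, the generated .config can be inspected directly. This is a minimal sketch: the function name is made up, and the CONFIG_ symbol names are my assumption of what the menu entries above map to in a 4.19 tree.

```shell
# Hypothetical helper: check a kernel .config for the debugging options.
# Symbol names are assumed to match the menu entries in a 4.19 tree.
check_kernel_debug_config() {
    config_file="${1:-.config}"
    status=0
    for opt in CONFIG_DEBUG_KERNEL CONFIG_KGDB CONFIG_GDB_SCRIPTS; do
        if ! grep -q "^${opt}=y\$" "$config_file"; then
            echo "missing: ${opt}=y"
            status=1
        fi
    done
    # "Reduce debugging information" must stay disabled.
    if grep -q '^CONFIG_DEBUG_INFO_REDUCED=y$' "$config_file"; then
        echo "must be disabled: CONFIG_DEBUG_INFO_REDUCED"
        status=1
    fi
    return "$status"
}

# Example usage, from the kernel source directory:
#   check_kernel_debug_config .config && echo "debug options look fine"
```

The function prints one line per problem and returns non-zero if anything is off, so it can be chained in front of make.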

Compiling the kernel

Once configured, the kernel can be compiled with make. On multicore CPUs you can run a parallel build with make -jN (N is the number of parallel jobs).
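The job count does not have to be hard-coded. As a small sketch (the helper name make_jobs is made up), it can be derived from the number of available CPUs:

```shell
# Hypothetical helper: pick a job count for make -j.
# Uses nproc when available and falls back to a single job.
make_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        echo 1
    fi
}

# Example usage, from the kernel source directory:
#   make -j"$(make_jobs)"
```

More parallel jobs also need more memory, so on a small virtual machine a lower number may be the safer choice.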

If everything works, the kernel will be compiled after a long wait. The build generates the vmlinux file in the kernel root directory, which is the raw compiled kernel (with debugging information), and the compressed kernel image at arch/x86/boot/bzImage (if the target architecture is x86, of course).

Compile and install GDB and Qemu

Since the versions of GDB and Qemu required for kernel debugging may be newer than the versions in the apt repositories, it is best to compile and install them yourself.

Compile and install GDB

First, download the GDB source code from the official website (http://www.gnu.org/software/gdb/download/) and unpack it (GDB 9.1, the latest release on the official website, is used here). Note that some blog posts say the GDB source code needs to be modified, but this is unnecessary: the error they work around occurs because GDB does not automatically detect the target architecture, so you only need to set it explicitly.

After unpacking, go to the GDB folder and execute the following command to complete the compilation and installation.

mkdir build 
cd build
../configure
make -j4
sudo make install

Finally, check with gdb -v that the gdb version is 9.1. If it is, the installation succeeded.

Compile and install Qemu

First, download the Qemu source code from the official website (https://www.qemu.org/download/#source) and unpack it (Qemu 5.0.0 is used here).

Using Qemu under the Ubuntu GUI also requires the SDL multimedia library, so install SDL with apt first.

sudo apt install libsdl2-2.0-0 libsdl2-dev libsdl2-gfx-1.0-0 libsdl2-gfx-dev libsdl2-image-2.0-0 libsdl2-image-dev 

After entering the Qemu directory, run ./configure to check the system configuration and generate a Makefile. Check that the output reports SDL support; part of the output looks like this.

profiler          no
static build      no
SDL support       yes (2.0.8)
SDL image support yes
GTK support       no 
GTK GL support    no
VTE support       no 
TLS priority      NORMAL

Then run make && sudo make install to complete the compilation and installation of Qemu.

After Qemu is installed, a series of commands named qemu-xxx and qemu-system-xxx become available; they emulate user-mode applications and whole systems for different architectures. You can confirm that Qemu was installed successfully with a command such as qemu-system-x86_64 --version.
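Both version checks can be wrapped in one small preflight helper. This is only a sketch; the function name check_tool is made up for this example.

```shell
# Hypothetical helper: report whether a tool is on PATH and print
# the first line of its version output.
check_tool() {
    tool="$1"
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: not found"
        return 1
    fi
    echo "$tool: $("$tool" --version 2>/dev/null | head -n 1)"
}

# Example usage:
#   check_tool gdb
#   check_tool qemu-system-x86_64
```

If the printed version is still the one from apt, the self-compiled binary is probably not first on PATH.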

Make a rootfs

After it boots, the kernel needs a rootfs that contains an init program, so you need to make a rootfs before debugging the kernel.

Building an initrd-based rootfs

initrd is an in-memory root file system that the system boots from before the hard disk driver is loaded. Here, for convenience, the initrd contains only one simple program, which is used as the init program (i.e. the first user-space process after the system boots). Alternatively, you can use busybox as the init program in the initrd.

Create a simple C program named fakeinit.c.

#include <stdio.h>

int main(void)
{
    printf("hello world!\n");
    printf("hello linux!\n");
    printf("hello world!\n");
    printf("hello linux!\n");
    fflush(stdout);
    /* init (PID 1) must never exit, otherwise the kernel panics */
    while (1)
        ;
    return 0;
}

Then compile this code with gcc. It must be statically linked, because the initrd contains no shared libraries. If the kernel is not configured with 64-bit support (the 64-bit kernel option), compile it as a 32-bit program by adding -m32 to the gcc command line.

The compile command is as follows.

gcc --static -o fakeinit fakeinit.c
gcc --static -o fakeinit fakeinit.c -m32   # compile as a 32-bit executable

After compiling, pack the program into an archive with cpio.

echo fakeinit | cpio -o --format=newc > initrd_rootfs.img

Thus, an initrd-based rootfs is created.
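For the busybox alternative mentioned earlier, the initrd needs an init entry point instead of fakeinit. The sketch below lays out a minimal directory tree with an /init script. Everything in it is an assumption for illustration: the function name, the directory layout, and the expectation that a statically linked busybox binary is copied to bin/busybox afterwards.

```shell
# Hypothetical helper: lay out a minimal busybox-based initrd tree.
# Assumes a statically linked busybox binary will be copied to
# <dir>/bin/busybox before the archive is packed.
make_busybox_initrd_tree() {
    dir="$1"
    mkdir -p "$dir/bin" "$dir/proc" "$dir/sys"
    # /init is a script interpreted by busybox's own shell.
    cat > "$dir/init" <<'EOF'
#!/bin/busybox sh
/bin/busybox mount -t proc proc /proc
/bin/busybox mount -t sysfs sysfs /sys
echo "hello from busybox init"
exec /bin/busybox sh
EOF
    chmod +x "$dir/init"
}

# Example usage (commented out: needs a busybox binary and cpio):
#   make_busybox_initrd_tree ./initrd_tree
#   cp /path/to/busybox ./initrd_tree/bin/busybox
#   (cd ./initrd_tree && find . | cpio -o --format=newc) > initrd_busybox.img
```

With such an image, the kernel parameter would be rdinit=/init rather than rdinit=/fakeinit.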

Building a rootfs based on a hard disk image

Here we use busybox to build a rootfs based on a hard disk image. busybox is a single program that integrates hundreds of common Linux commands and tools, which is very convenient when testing the kernel; it is known as “The Swiss Army Knife of Embedded Linux”.

Download and compile busybox

First, download the busybox source code from the official website (https://busybox.net/downloads/) and unpack it (the latest version, busybox-1.31.1, is used here).

After unpacking, enter the busybox directory and configure it with make gconfig or make menuconfig. The following option must be enabled.

Settings -> Build Options -> Build static binary (no shared libs)

If you need to compile a 32-bit version, enter -m32 in both of the following options.

Settings -> Build Options -> Additional CFLAGS
Settings -> Build Options -> Additional LDFLAGS

As with the kernel, a configuration file named .config will be generated in the directory after exiting.

Then, use the make command to compile busybox.

Creating rootfs with busybox

First, create an empty disk image file, and then format it as follows.

dd if=/dev/zero of=./busybox_rootfs.img bs=1M count=10
mkfs.ext3 ./busybox_rootfs.img

Then, mount the disk image you just created (requires the use of a loop device).

mkdir rootfs_mount
sudo mount -t ext3 -o loop ./busybox_rootfs.img ./rootfs_mount

Next, from the busybox source directory, install the compiled busybox into the mounted rootfs. If the mounted file system is not writable by your user, run this and the following commands with sudo.

make install CONFIG_PREFIX=/path/to/rootfs_mount/

Finally, configure busybox init and unmount the rootfs.

mkdir /path/to/rootfs_mount/proc
mkdir /path/to/rootfs_mount/dev
mkdir /path/to/rootfs_mount/etc
cp -r busybox-source-code/examples/bootfloppy/etc/* /path/to/rootfs_mount/etc/
sudo umount /path/to/rootfs_mount

Now, a busybox-based rootfs disk image has been created.
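Instead of copying the example files, the init configuration can also be written by hand before unmounting. The sketch below shows roughly what busybox init needs. The file contents are my own minimal assumption, not a copy of the busybox examples, and the function name is made up.

```shell
# Hypothetical helper: write a minimal busybox init configuration
# into a rootfs directory. The file contents are an illustrative
# assumption, not a copy of the busybox example files.
write_busybox_etc() {
    root="$1"
    mkdir -p "$root/etc/init.d" "$root/proc" "$root/dev"
    # inittab format: <id>:<runlevels>:<action>:<process>
    # (busybox init ignores the runlevels field)
    cat > "$root/etc/inittab" <<'EOF'
::sysinit:/etc/init.d/rcS
::respawn:-/bin/sh
::ctrlaltdel:/bin/umount -a -r
EOF
    # File systems that "mount -a" should mount at boot.
    cat > "$root/etc/fstab" <<'EOF'
proc /proc proc defaults 0 0
EOF
    # Startup script run once by the sysinit entry above.
    cat > "$root/etc/init.d/rcS" <<'EOF'
#!/bin/sh
/bin/mount -a
EOF
    chmod +x "$root/etc/init.d/rcS"
}

# Example usage (run as root if the mount point is owned by root):
#   write_busybox_etc /path/to/rootfs_mount
```

The sysinit entry runs rcS once at boot, and the respawn entry keeps a shell open on the console.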

Debugging the kernel with Qemu and GDB

Booting the kernel with Qemu

Since the compiled kernel architecture is x86, use the qemu-system-x86_64 program to load and boot the kernel.

If you use the initrd as the rootfs, the command is

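# -kernel: the compiled kernel image
# -initrd: the rootfs
# -serial stdio: use stdio for input and output
# -append: kernel parameters; use the initrd as the rootfs and
#          disable address space layout randomization (nokaslr)
# -s -S: start the gdb server (port 1234 by default) and pause at startup until gdb connects
qemu-system-x86_64 \
  -kernel ./linux/arch/x86/boot/bzImage \
  -initrd ./rootfs/initrd_rootfs.img \
  -serial stdio \
  -append "root=/dev/ram rdinit=/fakeinit console=ttyS0 nokaslr" \
  -s -S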

If you use a disk image as the rootfs, the command is

# -hda: the disk image
# -append: kernel parameters; set the root disk and
#          disable address space layout randomization (nokaslr)
qemu-system-x86_64 \
  -kernel ./linux/arch/x86/boot/bzImage \
  -hda ./rootfs/busybox_rootfs.img \
  -serial stdio \
  -append "root=/dev/sda console=ttyS0 nokaslr" \
  -s -S
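Since the two command lines differ only in the rootfs option and the kernel parameters, they can be folded into one wrapper. This is a sketch under the path layout used in this article. The function name run_qemu_debug and the DRY_RUN variable are made up for this example.

```shell
# Hypothetical wrapper: start Qemu for either rootfs type.
# Pass "initrd" or "disk"; set DRY_RUN=1 to only print the arguments.
run_qemu_debug() {
    mode="$1"
    set -- qemu-system-x86_64 -kernel ./linux/arch/x86/boot/bzImage
    case "$mode" in
        initrd)
            set -- "$@" -initrd ./rootfs/initrd_rootfs.img
            append="root=/dev/ram rdinit=/fakeinit console=ttyS0 nokaslr"
            ;;
        disk)
            set -- "$@" -hda ./rootfs/busybox_rootfs.img
            append="root=/dev/sda console=ttyS0 nokaslr"
            ;;
        *)
            echo "usage: run_qemu_debug initrd|disk" >&2
            return 1
            ;;
    esac
    # -s: gdb server on port 1234; -S: stay paused until gdb continues.
    set -- "$@" -serial stdio -append "$append" -s -S
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$@"    # one argument per line
    else
        "$@"
    fi
}

# Example usage:
#   run_qemu_debug initrd
```

Building the argument list step by step also leaves room for a comment on each option, which a backslash-continued command line does not allow.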

Debugging the kernel with GDB

As the final step, since Qemu has already started its gdb server, you only need to start gdb on the kernel file and connect to it.

gdb ./linux/vmlinux # use the kernel file that contains debugging information

If you connect directly to Qemu’s gdb server with target remote:1234 at this point, you will get the error Remote 'g' packet reply is too long. This happens because gdb does not correctly identify the architecture of the target (some bloggers think the gdb source code must be modified to suppress this error, which is unnecessary), so you just need to set the target architecture with set arch i386:x86-64:intel before attaching.

For example, if you want to set a breakpoint in the start_kernel function for debugging, the gdb command after starting Qemu would be as follows

gdb ~/linux/vmlinux
(gdb) set arch i386:x86-64:intel
(gdb) add-auto-load-safe-path ~/linux
(gdb) target remote:1234
(gdb) b start_kernel
(gdb) c

As you can see, after booting the kernel stops at the breakpoint on the start_kernel function.
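To avoid retyping the session every time, the same commands can be saved in a gdb command file. This is a minimal sketch: the function name and the file name kernel-debug.gdb are made up. It loads vmlinux with the file command after the auto-load path has been added, so that the path is already trusted when the symbols are read.

```shell
# Hypothetical helper: write the gdb session above to a command file.
# $1 is the kernel source directory, $2 the command file to create.
write_kernel_gdb_script() {
    linux_dir="$1"
    script="$2"
    cat > "$script" <<EOF
set architecture i386:x86-64:intel
add-auto-load-safe-path $linux_dir
file $linux_dir/vmlinux
target remote :1234
break start_kernel
continue
EOF
}

# Example usage (commented out: needs Qemu waiting with -s -S):
#   write_kernel_gdb_script "$HOME/linux" kernel-debug.gdb
#   gdb -q -x kernel-debug.gdb
```

The -x option makes gdb execute the commands from the file, and -q only suppresses the startup banner.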

Postscript

Kernel documentation

The kernel documentation includes a detailed explanation of how to debug the kernel using GDB.

The latest version of this document can be found on the kernel’s website: https://www.kernel.org/doc/html/latest/dev-tools/gdb-kernel-debugging.html.

The documentation can also be built locally: make htmldocs generates the HTML version, which can be viewed in a browser after starting an HTTP server, e.g. at http://127.0.0.1:8000/dev-tools/gdb-kernel-debugging.html.