A namespace is a Linux kernel feature that wraps a system resource in an abstraction, so that the processes inside the namespace believe they have their own isolated instance of that resource. Namespaces isolate the processes and resources of a container from the host system and from other containers.

There are several types of namespace, one for each kind of system resource they isolate, such as the cgroup namespace and the mount namespace. We will take the pid namespace as an example and use runC as the container runtime to demonstrate how namespaces behave when we operate on a container.

As we described in the previous article, most container systems use runC as the underlying runtime implementation. If you are using Docker on a Linux distribution, you don't even need to install runC separately to use the runc command.

## Preparation

### Filesystem bundle

runC can only run containers from a filesystem bundle (as the name implies, a directory that follows a specific structure). We can use Docker to prepare a usable bundle.

```shell
# Create the top-level directory of the bundle
$ mkdir /mycontainer
$ cd /mycontainer

# Create the rootfs directory that holds the root filesystem
$ mkdir rootfs

# Use Docker to export the root filesystem of a busybox container
$ docker export $(docker create busybox) | tar -C rootfs -xvf -

# Create config.json as the spec of the whole bundle
$ runc spec
```

At this point, the entire bundle directory structure is as follows.

```shell
$ tree -L 2 /mycontainer
/mycontainer
├── config.json
└── rootfs
    ├── bin
    ├── dev
    ├── etc
    ├── home
    ├── proc
    ├── root
    ├── sys
    ├── tmp
    ├── usr
    └── var
```

### System monitoring tools

To complete the demo, we need a few third-party system monitoring tools.

1. A tool that monitors process startup, so that we can get the host PID of the processes running in the container. On Ubuntu, forkstat can monitor events such as fork(), exec() and exit() in real time. Install it as follows.

   ```shell
   $ apt install forkstat
   ```

2. A tool that shows namespace information, such as cinf, a command-line tool that lists all namespaces on the system or shows detailed information about a single namespace. Install it as follows.

   ```shell
   $ curl -s -L https://github.com/mhausenblas/cinf/releases/latest/download/cinf_linux_amd64.tar.gz \
       -o cinf.tar.gz && \
       tar xvzf cinf.tar.gz cinf && \
       mv cinf /usr/local/bin && \
       rm cinf*
   ```

## Running containers with runc

First, run forkstat in one terminal window.

```shell
$ forkstat -e exec
```

Then open a new terminal window, switch to the /mycontainer directory, and use runC to run the container.

```shell
$ runc run mybox
```

Once the command runs, we are dropped directly into the shell of the newly created container, where we run the ps command.

```text
PID   USER     TIME  COMMAND
    1 root      0:00 sh
    7 root      0:00 ps
```

The forkstat window shows the following output.

```text
Time     Event   PID Info   Duration Process
12:35:22 exec  33040                 runc run mybox
12:35:22 exec  33047                 runc init
12:35:22 exec  33049                 dumpe2fs -h /dev/sdb3
12:35:22 exec  33050                 dumpe2fs -h /dev/sdb3
12:35:22 exec  33047                 runc init
12:35:22 exec  33052                 sh
12:35:37 exec  33062                 ps
```

Comparing the two outputs, the sh and ps processes reported by ps inside the container and by forkstat on the host are the same processes. Because the processes in the container live in a separate pid namespace, they have their own PIDs inside the container. From their point of view they are the only processes on the system, so the PIDs start at 1.

### Find the namespace the process belongs to

To find the pid namespace used by the container, adjust the output format of the ps command on the host.

```shell
$ ps -p 33052 -o pid,pidns
    PID      PIDNS
  33052 4026532395
```

PIDNS is the pid namespace: the command above shows that the sh process with host PID 33052 belongs to pid namespace 4026532395. Since we already have the host PID of the process in the container, we can also get all of the namespaces of the process through the /proc filesystem of the host.

```shell
$ ll /proc/33052/ns
lrwxrwxrwx 1 root root 0 Jul 21 12:37 cgroup -> 'cgroup:[4026531835]'
lrwxrwxrwx 1 root root 0 Jul 21 12:36 ipc -> 'ipc:[4026532394]'
lrwxrwxrwx 1 root root 0 Jul 21 12:36 mnt -> 'mnt:[4026532383]'
lrwxrwxrwx 1 root root 0 Jul 21 12:36 net -> 'net:[4026532397]'
lrwxrwxrwx 1 root root 0 Jul 21 12:36 pid -> 'pid:[4026532395]'
lrwxrwxrwx 1 root root 0 Jul 21 12:37 pid_for_children -> 'pid:[4026532395]'
lrwxrwxrwx 1 root root 0 Jul 21 12:37 time -> 'time:[4026531834]'
lrwxrwxrwx 1 root root 0 Jul 21 12:37 time_for_children -> 'time:[4026531834]'
lrwxrwxrwx 1 root root 0 Jul 21 12:36 user -> 'user:[4026531837]'
lrwxrwxrwx 1 root root 0 Jul 21 12:36 uts -> 'uts:[4026532393]'
```

The output shows the namespaces to which the process belongs.

- Each entry is a soft link, and the name of the link indicates the type of namespace, e.g. cgroup for the cgroup namespace and pid for the pid namespace.
- Each soft link points to the namespace object to which the process belongs. The object is identified by an inode number, and each inode number is unique on the host system.
- If two processes have soft links of the same namespace type pointing to the same inode, they belong to the same namespace.

Every process belongs to one namespace of each type, and Linux creates a default namespace of each type at boot time.

We can also look up the namespaces of sh from inside the container. This time we use its PID inside the container, which is 1.

```shell
$ ls -l /proc/1/ns
lrwxrwxrwx 1 root root 0 Jul 21 04:37 cgroup -> cgroup:[4026531835]
lrwxrwxrwx 1 root root 0 Jul 21 04:37 ipc -> ipc:[4026532394]
lrwxrwxrwx 1 root root 0 Jul 21 04:37 mnt -> mnt:[4026532383]
lrwxrwxrwx 1 root root 0 Jul 21 04:37 net -> net:[4026532397]
lrwxrwxrwx 1 root root 0 Jul 21 04:37 pid -> pid:[4026532395]
lrwxrwxrwx 1 root root 0 Jul 21 04:37 pid_for_children -> pid:[4026532395]
lrwxrwxrwx 1 root root 0 Jul 21 04:37 time -> time:[4026531834]
lrwxrwxrwx 1 root root 0 Jul 21 04:37 time_for_children -> time:[4026531834]
lrwxrwxrwx 1 root root 0 Jul 21 04:37 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Jul 21 04:37 uts -> uts:[4026532393]
```
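Since each entry under /proc/PID/ns is just a soft link whose target carries the inode number, the same information can be pulled out with a few lines of shell. The following is a minimal sketch, not part of the original walkthrough: the function name `list_ns` and the `PROC_DIR` variable (which only exists so the function can be pointed at a different directory) are made up for illustration.

```shell
#!/bin/sh
# PROC_DIR is a hypothetical knob for this sketch; it defaults to the real /proc.
PROC_DIR=${PROC_DIR:-/proc}

# list_ns PID: print "<type> <inode>" for every namespace link of the process.
list_ns() {
    for link in "$PROC_DIR/$1"/ns/*; do
        # The link target looks like "pid:[4026532395]".
        target=$(readlink "$link") || continue
        inode=${target#*"["}   # drop everything up to and including "["
        inode=${inode%"]"}     # drop the trailing "]"
        echo "${link##*/} $inode"
    done
}

# Show the namespaces of the current shell.
list_ns "$$"
```

Run as root on the host, `list_ns 33052` should list the same types and inode numbers as the ll output above. Reading the links of processes owned by other users generally requires root.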

### Watching processes in a namespace

Next we look at the pid namespace from the point of view of the namespace itself and list all the processes in it. The kernel does not expose such a listing directly, so we use the cinf tool installed above.

```shell
$ cinf --namespace 4026532395

PID    PPID   NAME  CMD     NTHREADS  CGROUPS                                                          STATE
33052  33052  sh    sh      1         12:devices:/user.slice/mybox                                     S (sleeping)
                                      11:blkio:/user.slice/mybox
                                      10:rdma:/
                                      9:memory:/user.slice/user-0.slice/session-590.scope/mybox
                                      8:net_cls,net_prio:/mybox
                                      7:freezer:/mybox
                                      6:pids:/user.slice/user-0.slice/session-590.scope/mybox
                                      5:cpu,cpuacct:/user.slice/mybox
                                      4:cpuset:/mybox
                                      3:perf_event:/mybox
                                      2:hugetlb:/mybox
                                      1:name=systemd:/user.slice/user-0.slice/session-590.scope/mybox
                                      0::/user.slice/user-0.slice/session-590.scope
```

Currently there is only one process in this namespace, and it is also the init process of the container we created. When a new container is created, a set of new namespaces is created and the init process of the container is added to them.

For the pid namespace, all processes running in the container can only see other processes in the same pid namespace, pid:[4026532395]. Inside the container, the sh process is considered the first process on the system, with a PID of 1. On the host it is just a normal process with a PID of 33052. The same process has different PIDs in different namespaces, which is exactly the role of the pid namespace.

In a way, a container means a new set of namespaces.

## Create a new process in a container

Open a new terminal window and run a new process in the already running container.

```shell
$ runc exec mybox /bin/top -b
```

From the forkstat window, we can see the PID of the newly created process.

```text
Time     Event   PID Info   Duration Process
12:40:23 exec  33132                 runc exec mybox /bin/top -b
12:40:23 exec  33140                 runc init
12:40:23 exec  33140                 runc init
12:40:23 exec  33142                 /bin/top -b
```

There is actually a more direct way to see the processes running in the container from the host: the ps subcommand provided by runC.

```shell
$ runc ps mybox
UID          PID    PPID  C STIME TTY          TIME CMD
root       33052   33040  0 12:35 pts/0    00:00:00 sh
root       33142   33132  0 12:40 pts/1    00:00:00 /bin/top -b
```

Next, we use cinf again to find out which namespaces the newly created process belongs to.

```shell
$ cinf --pid 33142

NAMESPACE   TYPE

4026532383  mnt
4026532393  uts
4026532394  ipc
4026532395  pid
4026532397  net
4026531837  user
```

The result shows that no new namespace was created: the namespaces of process 33142 are exactly the same as those of sh, the init process of the mybox container. In other words, creating a new process in the container simply adds that process to the namespaces of the init process of the container.
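The "no new namespace" observation can also be checked without cinf by comparing the namespace links of two processes type by type. This is only a sketch under the same assumptions as the /proc listings earlier in the article: `shared_ns` and `PROC_DIR` are hypothetical names introduced here, and reading the links of other users' processes generally requires root.

```shell
#!/bin/sh
# PROC_DIR is a hypothetical knob for this sketch; it defaults to the real /proc.
PROC_DIR=${PROC_DIR:-/proc}

# shared_ns PID_A PID_B: for each namespace type of PID_A, report whether
# PID_B points at the same namespace object. Returns 0 only if all match,
# 1 if at least one differs, 2 if a link could not be read.
shared_ns() {
    rc=0
    for link in "$PROC_DIR/$1"/ns/*; do
        kind=${link##*/}
        a=$(readlink "$link") || return 2
        b=$(readlink "$PROC_DIR/$2/ns/$kind") || return 2
        if [ "$a" = "$b" ]; then
            echo "$kind: shared"
        else
            echo "$kind: different ($a vs $b)"
            rc=1
        fi
    done
    return "$rc"
}
```

With the PIDs from this walkthrough, `shared_ns 33052 33142` on the host would be expected to report every type as shared, while comparing 33052 with the host's PID 1 should show several types as different.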

Here is a list of all the processes owned by the 4026532395 namespace.

```shell
$ cinf --namespace 4026532395

PID    PPID   NAME  CMD     NTHREADS  CGROUPS                                                          STATE
33052  33040  sh    sh      1         12:devices:/user.slice/mybox                                     S (sleeping)
                                      11:blkio:/user.slice/mybox
                                      10:rdma:/
                                      9:memory:/user.slice/user-0.slice/session-590.scope/mybox
                                      8:net_cls,net_prio:/mybox
                                      7:freezer:/mybox
                                      6:pids:/user.slice/user-0.slice/session-590.scope/mybox
                                      5:cpu,cpuacct:/user.slice/mybox
                                      4:cpuset:/mybox
                                      3:perf_event:/mybox
                                      2:hugetlb:/mybox
                                      1:name=systemd:/user.slice/user-0.slice/session-590.scope/mybox
                                      0::/user.slice/user-0.slice/session-590.scope
33142  33132  top   top -b  1         12:devices:/user.slice/mybox                                     S (sleeping)
                                      11:blkio:/user.slice/mybox
                                      10:rdma:/
                                      9:memory:/user.slice/user-0.slice/session-590.scope/mybox
                                      8:net_cls,net_prio:/mybox
                                      7:freezer:/mybox
                                      6:pids:/user.slice/user-0.slice/session-590.scope/mybox
                                      5:cpu,cpuacct:/user.slice/mybox
                                      4:cpuset:/mybox
                                      3:perf_event:/mybox
                                      2:hugetlb:/mybox
                                      1:name=systemd:/user.slice/user-0.slice/session-590.scope/mybox
                                      0::/user.slice/user-0.slice/session-590.scope
```
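A rough equivalent of this listing can be built by scanning /proc and keeping every PID whose namespace link matches the inode we are interested in. The sketch below only prints PIDs, not the cgroup and state columns that cinf shows. `pids_in_ns` and `PROC_DIR` are hypothetical names introduced for this example.

```shell
#!/bin/sh
# PROC_DIR is a hypothetical knob for this sketch; it defaults to the real /proc.
PROC_DIR=${PROC_DIR:-/proc}

# pids_in_ns TYPE INODE: print the PID of every process whose
# /proc/PID/ns/TYPE link points at TYPE:[INODE].
pids_in_ns() {
    want="$1:[$2]"
    for dir in "$PROC_DIR"/[0-9]*; do
        # Skip processes that exited meanwhile or that we may not inspect.
        target=$(readlink "$dir/ns/$1" 2>/dev/null) || continue
        if [ "$target" = "$want" ]; then
            echo "${dir##*/}"
        fi
    done
}
```

On the host from this walkthrough, `pids_in_ns pid 4026532395` run as root should print 33052 and 33142. The PIDs come out in the shell's glob order, which is lexical rather than numeric.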

If we run ps -ef inside the container, we see the same processes, but with different PIDs because of the pid namespace.

```text
PID   USER     TIME  COMMAND
    1 root      0:00 sh
   19 root      0:00 top -b
   20 root      0:00 ps -ef
```
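The mapping between the host PID and the PID inside the container can also be read from the host. On reasonably recent kernels (the field was added in Linux 4.1), /proc/PID/status contains an NSpid line that lists the PID of the process in each nested pid namespace, outermost first. The helper below is a small sketch that extracts that line; the name `ns_pids` is made up for this example.

```shell
#!/bin/sh
# ns_pids: read a /proc/PID/status file on stdin and print the PIDs from
# its NSpid line, outermost pid namespace first.
ns_pids() {
    awk '/^NSpid:/ { $1 = ""; sub(/^ +/, ""); print }'
}

# Show the value for the current shell, if the status file is readable.
if [ -r "/proc/$$/status" ]; then
    ns_pids < "/proc/$$/status"
fi
```

For the sh process of the container, `ns_pids < /proc/33052/status` on the host should print two numbers, the host PID followed by the PID inside the container. A process that lives only in the host pid namespace has a single number.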

Now we know that docker/runc exec actually runs a new process in the namespaces of the already created container.

## Summary

When you run a container, new namespaces are created and the init process is added to them. When you run a new process in a container, the new process is added to the namespaces that were created together with the container.

In fact, the behavior of creating new namespaces when creating a container can be changed: we can specify that the new container uses existing namespaces.