In this article I will demonstrate how to use the ip command to connect processes in network namespaces on different subnets through a pair of veth interfaces.

Network Namespaces

Container runtimes use the kernel's namespace feature to partition system resources and isolate processes, so that changes to resources in one namespace do not affect other namespaces. The resources that can be isolated this way include process IDs, host names, user IDs, file names, network interfaces, and so on.

A network namespace virtualizes the network stack: each one has its own network interfaces, IP addresses, routing tables, tunnels, firewall rules, and so on. For example, iptables rules added in a network namespace only affect traffic entering and leaving that namespace.

ip command

The ip command displays and manipulates routes, network devices, policy routing, and tunnels on Linux hosts. It is a newer and more powerful network configuration tool for Linux.

The usage of this command is shown below.

ip [OPTIONS] OBJECT COMMAND [ARGUMENTS]
# where 
#   OPTIONS are general global options
#   OBJECT := { link | address | addrlabel | route |
#     rule | neigh | ntable | tunnel | tuntap | maddress | 
#     mroute | mrule | monitor | xfrm | netns | l2tp | 
#     tcp_metrics }
#   COMMAND is the action to perform on the object, such as,
#     show, add, del etc.
#   ARGUMENTS are arguments specific to the kind of OBJECT 
#     and COMMAND

For example:

  • To add a new network interface, use the ip link add <interface-name> type <interface-type> <interface-arguments>... command
  • To assign a new IP address range to an interface, use the ip addr add <ip-address-range> dev <device-name> command
  • To remove a route from the routing table, use the ip route del <route-ip-range> dev <device-name> command

The -n option switches the target namespace. For example, to assign the 10.0.1.0/24 IP address range to interface veth0 in the ns1 network namespace, use the ip -n ns1 addr add 10.0.1.0/24 dev veth0 command.

💡 ip -n <namespace> ... is shorthand for ip netns exec <namespace> ip ...
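
To make that shorthand concrete, here is a minimal sketch. long_form is a hypothetical helper (it is not part of iproute2); it only prints the longer command and runs nothing.

```shell
#!/bin/sh
# long_form is a hypothetical helper (not part of iproute2): it prints the
# "ip netns exec" command that "ip -n <namespace> <args...>" abbreviates.
long_form() {
    ns=$1
    shift
    echo "ip netns exec $ns ip $*"
}

# The example from the text, written the long way.
# prints: ip netns exec ns1 ip addr add 10.0.1.0/24 dev veth0
long_form ns1 addr add 10.0.1.0/24 dev veth0
```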

Configure the first network namespace

First we use the ip link add command to create a new pair of veth interfaces: veth0 and veth1.

# Create a pair of veth interfaces named veth0 and veth1.
$ ip link add veth0 type veth peer name veth1

# Confirm that veth0 was created
$ ip link show veth0
289: veth0@veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 5e:87:df:87:af:c7 brd ff:ff:ff:ff:ff:ff

# Confirm that veth1 was created
$ ip link show veth1
288: veth1@veth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether be:0d:a4:8c:9f:2a brd ff:ff:ff:ff:ff:ff

veth interfaces are created in pairs: data transmitted at one end is immediately received at the other end. Container runtimes typically use this type of interface to carry packets between different network namespaces.

Let’s create the first network namespace, ns1, then move the veth0 interface into it and assign the IP address range 10.0.1.0/24 to that interface.

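# Create the ns1 network namespace
$ ip netns add ns1

# Move the veth0 interface into the ns1 network namespace
$ ip link set veth0 netns ns1

# Assign the 10.0.1.0/24 IP address range to the veth0 interface
$ ip -n ns1 addr add 10.0.1.0/24 dev veth0

# Bring the veth0 interface up
$ ip -n ns1 link set veth0 up

# Bring the lo interface up, because local traffic sent to 10.0.1.0/24
# (such as a ping) goes through the local routing table,
# for example when the namespace pings itself
$ ip -n ns1 link set lo up

# Confirm that the interfaces are up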
$ ip -n ns1 addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
289: veth0@if288: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
    link/ether 5e:87:df:87:af:c7 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.1.0/24 scope global veth0
       valid_lft forever preferred_lft forever

Now what happens if we ping the veth0 interface, first from the host and then from the ns1 network namespace?

# veth0 is not in the host's root network namespace
$ ip link show veth0            
Device "veth0" does not exist.

# The ping from the host network namespace fails
$ ping -c10 10.0.1.0
PING 10.0.1.0 (10.0.1.0) 56(84) bytes of data.
^C
--- 10.0.1.0 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 1999ms

We can see that the veth0 interface is not visible in the host's root network namespace, and of course 10.0.1.0 cannot be pinged from the host either, because the interface and its address are bound to the ns1 network namespace. We need to switch to that namespace to work with them.

$ ip netns exec ns1 ping -c10 10.0.1.0
PING 10.0.1.0 (10.0.1.0) 56(84) bytes of data.
64 bytes from 10.0.1.0: icmp_seq=1 ttl=64 time=0.121 ms
64 bytes from 10.0.1.0: icmp_seq=2 ttl=64 time=0.063 ms
64 bytes from 10.0.1.0: icmp_seq=3 ttl=64 time=0.066 ms
64 bytes from 10.0.1.0: icmp_seq=4 ttl=64 time=0.109 ms
^C
--- 10.0.1.0 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3000ms
rtt min/avg/max/mdev = 0.063/0.089/0.121/0.028 ms

Here we use the ip netns exec command, which runs an arbitrary command inside the specified network namespace. From within the ns1 network namespace, 10.0.1.0 now answers pings.

Configure a second network namespace

Following the same steps, let’s create a second network namespace, ns2, move the veth1 interface into it, and assign the IP address range 10.0.2.0/24 to that interface.

# Create a network namespace named ns2
$ ip netns add ns2

# Move the veth1 interface into the ns2 network namespace
$ ip link set veth1 netns ns2

# Assign the 10.0.2.0/24 IP address range to the veth1 interface
$ ip -n ns2 addr add 10.0.2.0/24 dev veth1

# Bring the veth1 interface up
$ ip -n ns2 link set veth1 up

# Bring the lo interface up (so the namespace can ping itself)
$ ip -n ns2 link set lo up

$ ip -n ns2 addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
288: veth1@if289: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether be:0d:a4:8c:9f:2a brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.2.0/24 scope global veth1
       valid_lft forever preferred_lft forever
    inet6 fe80::bc0d:a4ff:fe8c:9f2a/64 scope link
       valid_lft forever preferred_lft forever

To make it easier to set up routes later, we assign the veth1 interface an IP range in a different subnet. Like veth0, the veth1 interface cannot be reached from the host network namespace and only works inside its own namespace, ns2.

$ ip link show veth1
Device "veth1" does not exist.
$ ping -c10 10.0.2.0
PING 10.0.2.0 (10.0.2.0) 56(84) bytes of data.
From 180.149.159.13 icmp_seq=2 Packet filtered
^C
--- 10.0.2.0 ping statistics ---
2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 999
$ ip netns exec ns2 ping -c10 10.0.2.0
PING 10.0.2.0 (10.0.2.0) 56(84) bytes of data.
64 bytes from 10.0.2.0: icmp_seq=1 ttl=64 time=0.100 ms
64 bytes from 10.0.2.0: icmp_seq=2 ttl=64 time=0.096 ms
64 bytes from 10.0.2.0: icmp_seq=3 ttl=64 time=0.068 ms
^C
--- 10.0.2.0 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.068/0.088/0.100/0.014 ms

Configure subnet routing

Although each of the two network namespaces can reach its own address, they cannot ping each other.

$ ip netns exec ns1 ping -c10 10.0.2.0
connect: Network is unreachable
$ ip netns exec ns2 ping -c10 10.0.1.0
connect: Network is unreachable

Both veth0 and veth1 are up, and pinging works inside each network namespace, so the failure to reach the other side is probably related to routing. Let’s use the ip command to debug this. The ip route get command shows which route a packet would take.

$ ip -n ns1 route get 10.0.2.0
RTNETLINK answers: Network is unreachable
$ ip -n ns2 route get 10.0.1.0
RTNETLINK answers: Network is unreachable

Both lookups report that the network is unreachable. Let’s check the routing tables in the two network namespaces.

$ ip -n ns1 route
10.0.1.0/24 dev veth0 proto kernel scope link src 10.0.1.0
$ ip -n ns2 route
10.0.2.0/24 dev veth1 proto kernel scope link src 10.0.2.0

The routing tables make the problem clear: each network namespace only has a route for its own IP range and no route to the other subnet, so of course the two cannot communicate. The fix is simple: use the ip route add command to insert a new entry into each routing table.

# Update the ns1 routing table: add a route to 10.0.2.0/24 via veth0
$ ip -n ns1 route add 10.0.2.0/24 dev veth0

# Confirm that packets sent to 10.0.2.0/24 are routed to veth0
$ ip -n ns1 route get 10.0.2.0
10.0.2.0 dev veth0 src 10.0.1.0
    cache

# Likewise update the ns2 routing table: add a route to 10.0.1.0/24 via veth1
$ ip -n ns2 route add 10.0.1.0/24 dev veth1

# Confirm that packets sent to 10.0.1.0/24 are routed to veth1
$ ip -n ns2 route get 10.0.1.0
10.0.1.0 dev veth1 src 10.0.2.0
    cache
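
When scripting this check, the outgoing device can be pulled out of the ip route get output. The sketch below assumes the output format shown above (a dev keyword followed by the interface name). route_dev is a hypothetical helper, and the sample line is copied from the ns1 output above rather than produced by a live command.

```shell
#!/bin/sh
# route_dev is a hypothetical helper: it reads "ip route get" output on
# stdin and prints the word that follows the first "dev" keyword.
# Assumption: the output looks like the sample shown in the article.
route_dev() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "dev") { print $(i + 1); exit }
        }
    }'
}

# Sample line copied from the ns1 output; prints "veth0".
printf '10.0.2.0 dev veth0 src 10.0.1.0\n    cache\n' | route_dev
```

On a live system, and as root, the same helper could be fed directly: ip -n ns1 route get 10.0.2.0 | route_dev. When no route exists, nothing is printed.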

Now that each network namespace has a route to the other subnet, let’s try to ping each other’s veth interfaces.

$ ip netns exec ns1 ping -c10 10.0.2.0
PING 10.0.2.0 (10.0.2.0) 56(84) bytes of data.
64 bytes from 10.0.2.0: icmp_seq=1 ttl=64 time=0.140 ms
64 bytes from 10.0.2.0: icmp_seq=2 ttl=64 time=0.080 ms
64 bytes from 10.0.2.0: icmp_seq=3 ttl=64 time=0.091 ms
^C
--- 10.0.2.0 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 1999ms
rtt min/avg/max/mdev = 0.080/0.103/0.140/0.028 ms

$ ip netns exec ns2 ping -c10 10.0.1.0
PING 10.0.1.0 (10.0.1.0) 56(84) bytes of data.
64 bytes from 10.0.1.0: icmp_seq=1 ttl=64 time=0.114 ms
64 bytes from 10.0.1.0: icmp_seq=2 ttl=64 time=0.084 ms
64 bytes from 10.0.1.0: icmp_seq=3 ttl=64 time=0.086 ms
^C
--- 10.0.1.0 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.084/0.094/0.114/0.017 ms

The pings now get through! 🎉🎉🎉🎉
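
The steps so far can be collected into one script. This is only a sketch with no error handling: run is a hypothetical dry-run wrapper that prints each command and executes it only when APPLY=1 is set, which requires root. The interface names, namespace names, and address ranges are the ones used in this article.

```shell
#!/bin/sh
# run is a hypothetical dry-run wrapper: it prints each command and only
# executes it when APPLY=1 is set (executing requires root privileges).
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

setup_namespaces() {
    # One veth pair, one end per namespace.
    run ip link add veth0 type veth peer name veth1
    run ip netns add ns1
    run ip netns add ns2
    run ip link set veth0 netns ns1
    run ip link set veth1 netns ns2

    # Assign the addresses, then bring the links and loopback devices up.
    run ip -n ns1 addr add 10.0.1.0/24 dev veth0
    run ip -n ns2 addr add 10.0.2.0/24 dev veth1
    run ip -n ns1 link set veth0 up
    run ip -n ns2 link set veth1 up
    run ip -n ns1 link set lo up
    run ip -n ns2 link set lo up

    # Each namespace needs a route to the other subnet.
    run ip -n ns1 route add 10.0.2.0/24 dev veth0
    run ip -n ns2 route add 10.0.1.0/24 dev veth1
}

setup_namespaces
```

Run as is, the script only prints the thirteen commands; running it as root with APPLY=1 in the environment applies them.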

We can also use tcpdump to capture the packets transmitted between two network namespaces.

$ ip netns exec ns1 tcpdump -i veth0 icmp -l
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:29:22.080392 IP 10.0.2.0 > 10.0.1.0: ICMP echo request, id 7253, seq 1, length 64
11:29:22.080464 IP 10.0.1.0 > 10.0.2.0: ICMP echo reply, id 7253, seq 1, length 64
11:29:23.080409 IP 10.0.2.0 > 10.0.1.0: ICMP echo request, id 7253, seq 2, length 64
11:29:23.080472 IP 10.0.1.0 > 10.0.2.0: ICMP echo reply, id 7253, seq 2, length 64
11:29:24.080357 IP 10.0.2.0 > 10.0.1.0: ICMP echo request, id 7253, seq 3, length 64
11:29:24.080418 IP 10.0.1.0 > 10.0.2.0: ICMP echo reply, id 7253, seq 3, length 64
11:29:25.080346 IP 10.0.2.0 > 10.0.1.0: ICMP echo request, id 7253, seq 4, length 64
11:29:25.080401 IP 10.0.1.0 > 10.0.2.0: ICMP echo reply, id 7253, seq 4, length 64
11:29:26.080417 IP 10.0.2.0 > 10.0.1.0: ICMP echo request, id 7253, seq 5, length 64
11:29:26.080496 IP 10.0.1.0 > 10.0.2.0: ICMP echo reply, id 7253, seq 5, length 64
11:29:27.080454 IP 10.0.2.0 > 10.0.1.0: ICMP echo request, id 7253, seq 6, length 64
11:29:27.080507 IP 10.0.1.0 > 10.0.2.0: ICMP echo reply, id 7253, seq 6, length 64
11:29:28.080398 IP 10.0.2.0 > 10.0.1.0: ICMP echo request, id 7253, seq 7, length 64
11:29:28.080456 IP 10.0.1.0 > 10.0.2.0: ICMP echo reply, id 7253, seq 7, length 64
11:29:29.080390 IP 10.0.2.0 > 10.0.1.0: ICMP echo request, id 7253, seq 8, length 64
11:29:29.080431 IP 10.0.1.0 > 10.0.2.0: ICMP echo reply, id 7253, seq 8, length 64
11:29:30.080524 IP 10.0.2.0 > 10.0.1.0: ICMP echo request, id 7253, seq 9, length 64
11:29:30.080576 IP 10.0.1.0 > 10.0.2.0: ICMP echo reply, id 7253, seq 9, length 64
11:29:31.081895 IP 10.0.2.0 > 10.0.1.0: ICMP echo request, id 7253, seq 10, length 64
11:29:31.081942 IP 10.0.1.0 > 10.0.2.0: ICMP echo reply, id 7253, seq 10, length 64
^C
20 packets captured
20 packets received by filter
0 packets dropped by kernel

TCP connection

Going a step further, let’s test a TCP connection: start a TCP server on port 7096 in the ns1 namespace with the nc command, then initiate a TCP handshake from the ns2 network namespace.

$ ip netns exec ns1 nc -l 10.0.1.0 7096 -v
exec of "nc" failed: No such file or directory

The above command fails because the nc tool is not installed yet. Once it is installed, the command works.

$ yum install -y nc

$ ip netns exec ns1 nc -l 10.0.1.0 7096 -v
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Listening on 10.0.1.0:7096

Then open another terminal to connect.

# Use nc to initiate a TCP handshake from ns2
$ ip netns exec ns2 nc -4t 10.0.1.0 7096 -v
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 10.0.1.0:7096.

# At this point the server started earlier should report the connection
$ ip netns exec ns1 nc -l 10.0.1.0 7096 -v
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Listening on 10.0.1.0:7096
Ncat: Connection from 10.0.2.0.
Ncat: Connection from 10.0.2.0:34090.

Once the TCP connection is established, we can send test messages from ns2 to ns1.

$ ip netns exec ns2 nc -4t 10.0.1.0 7096 -v
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 10.0.1.0:7096.
this is a test message  # type a message here

At this point, the server running in ns1 receives the message as well.

$ ip netns exec ns1 nc -l 10.0.1.0 7096 -v
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Listening on 10.0.1.0:7096
Ncat: Connection from 10.0.2.0.
Ncat: Connection from 10.0.2.0:34090.
this is a test message

We can also use tcpdump to grab all packets transmitted between two network namespaces.

$ ip netns exec ns1 tcpdump -X -i veth0 -n tcp -l
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:42:59.912176 IP 10.0.2.0.34090 > 10.0.1.0.7096: Flags [P.], seq 118819706:118819735, ack 1587208228, win 229, options [nop,nop,TS val 1970393377 ecr 1970365937], length 29
	0x0000:  4500 0051 ad52 4000 4006 7655 0a00 0200  E..Q.R@.@.vU....
	0x0010:  0a00 0100 852a 1bb8 0715 0b7a 5e9a e024  .....*.....z^..$
	0x0020:  8018 00e5 1743 0000 0101 080a 7571 d121  .....C......uq.!
	0x0030:  7571 65f1 7468 6973 2069 7320 616e 6f74  uqe.this.is.anot
	0x0040:  6865 7220 7465 7374 206d 6573 7361 6765  her.test.message
	0x0050:  0a                                       .
11:42:59.912207 IP 10.0.1.0.7096 > 10.0.2.0.34090: Flags [.], ack 29, win 227, options [nop,nop,TS val 1970393377 ecr 1970393377], length 0
	0x0000:  4500 0034 4612 4000 4006 ddb2 0a00 0100  E..4F.@.@.......
	0x0010:  0a00 0200 1bb8 852a 5e9a e024 0715 0b97  .......*^..$....
	0x0020:  8010 00e3 1726 0000 0101 080a 7571 d121  .....&......uq.!
	0x0030:  7571 d121                                uq.!

Of course, you can also save the packet capture and then analyze it in detail with other tools such as Wireshark.
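
One way to save a capture is tcpdump's -w option, which writes raw packets to a file instead of printing them. In the sketch below, capture_cmd is a hypothetical helper that only prints the command to run, and the file path is just an example.

```shell
#!/bin/sh
# capture_cmd is a hypothetical helper that only prints a tcpdump command;
# run the printed command as root to perform the capture.
# Arguments: namespace, interface, output file, capture filter.
capture_cmd() {
    echo "ip netns exec $1 tcpdump -i $2 -n -w $3 $4"
}

# Write TCP packets seen on veth0 in ns1 to a file (the path is an example).
capture_cmd ns1 veth0 /tmp/ns1-veth0.pcap tcp

# The saved file can be read back later with:
#   tcpdump -n -X -r /tmp/ns1-veth0.pcap
```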

Summary

In this article, we used the ip command and its subcommands to create and configure network namespaces, interfaces, and routes. We created a pair of veth interfaces, assigned each end to a different network namespace with an IP address range in a different subnet, and added a route to the other subnet in each namespace's routing table, which enabled communication between the two subnets.

Neither veth interface is directly reachable from the host network namespace because their IP address ranges and routing table changes are also isolated in their own network namespaces.

We can use the ip netns exec command to run tools such as ping, nc, and tcpdump inside a network namespace to debug connectivity issues between network namespaces.
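
To undo the experiment, deleting the two namespaces should be enough, as long as no process is still running inside them. This is a sketch under that assumption; run is the same kind of hypothetical dry-run wrapper, printing commands and executing them only when APPLY=1 is set (as root).

```shell
#!/bin/sh
# run is a hypothetical dry-run wrapper: print the command, and execute it
# only when APPLY=1 is set (executing requires root privileges).
run() {
    echo "+ $*"
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    fi
}

cleanup_namespaces() {
    # Deleting a namespace destroys the veth end inside it, and destroying
    # one end of a veth pair removes its peer as well.
    run ip netns del ns1
    run ip netns del ns2
}

cleanup_namespaces
```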