There are many articles on the web about building flat networks for k8s, mostly aimed at large-scale clusters. These days, however, many people also use Docker to deploy services on a NAS or home server. This article focuses on how to use Docker to build a flat network and let containers communicate across hosts.

After Docker was released in 2013, container technology began to make its way into the major Internet companies. Containers not only serve those companies' online businesses, but also make it much more convenient for developers to set up test environments and third-party dependency services. Beyond the enterprise, containers start with one click and without dependency hassles, so they are also increasingly used to deploy services on home NAS devices and home servers, where Docker has basically become must-have software.

Most containers on a home NAS or home server use the bridge network provided by Docker and expose their services to the LAN through port forwarding. The implementation looks roughly like this:

You can simply think of docker0 as a switch: each container connects to it through a veth pair, and the host connects to it through the docker0 interface. Packets from a container leave through the host's eth0 port after the host has applied NAT.

If eth0 were added directly to this bridge, the result would also be a flat network, but Docker does not provide this functionality. The bridge plugin in CNI can be used to implement it: https://www.cni.dev/plugins/current/main/bridge/

With a bridge network, containers live in their own private network segment because of the layer of NAT forwarding, and container services on other hosts can only be reached through port forwarding. So is there a way to give containers their own IP addresses on our LAN? There are many workable approaches, but this article covers the simplest one: macvlan mode.

macvlan

macvlan, as the name implies, distinguishes virtual LAN interfaces by MAC address. A normal 802.1Q VLAN creates multiple virtual interfaces on top of a single physical interface and adds an 802.1Q tag to each frame, so the traffic of each interface is distinguished by its tag. macvlan also creates multiple virtual interfaces on top of a single physical interface, but it distinguishes their traffic by MAC address.

Three modes control whether macvlan sub-interfaces can communicate with each other directly:

  • Private: sub-interfaces are not allowed to communicate with each other and are completely isolated.
  • VEPA: traffic between sub-interfaces has to be handled by an external switch or router.
  • Bridge: sub-interfaces can communicate with each other directly. macvlan networks created in Docker use bridge mode by default.

Details can be found in https://developers.redhat.com/blog/2018/10/22/introduction-to-linux-interfaces-for-virtual-networking#macvlan
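
As a rough illustration of how these modes are selected outside Docker, the sketch below builds the iproute2 command for a macvlan sub-interface in a given mode. It only prints the command, since actually running it needs root. The function name macvlan_link_cmd and the interface name macvlan-demo are made up for this example; enp3s0 is the parent interface used later in this article.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the command that creates a macvlan
# sub-interface in one of the three modes described above.
macvlan_link_cmd() {
    # $1 = parent interface, $2 = new interface name, $3 = macvlan mode
    case $3 in
        private|vepa|bridge) ;;
        *) echo "unsupported mode: $3" >&2; return 1 ;;
    esac
    echo "ip link add $2 link $1 type macvlan mode $3"
}

macvlan_link_cmd enp3s0 macvlan-demo bridge
# prints: ip link add macvlan-demo link enp3s0 type macvlan mode bridge
```

Printing the command first makes it easy to review before pasting it into a root shell.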

Using macvlan in Docker

Creating a macvlan network

Creating a macvlan network in Docker is simple. As an example, I will create one on the enp3s0 interface of my own network, whose configuration (as reported by ip addr) is:

2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether a8:a1:59:24:d0:61 brd ff:ff:ff:ff:ff:ff
    inet 192.168.88.9/24 metric 1024 brd 192.168.88.255 scope global dynamic enp3s0
       valid_lft 551sec preferred_lft 551sec
    inet6 fe80::aaa1:59ff:fe24:d061/64 scope link
       valid_lft forever preferred_lft forever

The network segment of enp3s0 is 192.168.88.0/24, so the macvlan network is created as follows:

docker network create \
    -d macvlan \
    --subnet=192.168.88.0/24 \
    --ip-range=192.168.88.192/27 \
    --gateway 192.168.88.1 \
    -o parent=enp3s0 \
    --aux-address 'host=192.168.88.192' \
    macvlan1

subnet

The subnet. Here it can simply be the same as the network segment of the parent interface.

ip-range

The IP range available to containers. Containers do not obtain their addresses through DHCP, so the ip-range must not overlap the DHCP pool on the router. My DHCP server, for example, only hands out 192.168.88.64-192.168.88.127, 64 IPs in total, so here the containers use 192.168.88.192-192.168.88.223.

gateway

The gateway address. Here it can simply be the gateway of the parent interface's network segment.

aux-address

An auxiliary address, which works as a reserved address: Docker skips this IP when assigning addresses to containers.
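
The arithmetic behind ip-range can be checked with a small script. The sketch below expands a CIDR block into its first and last address and tests whether it overlaps the DHCP pool mentioned above. The helper names (ip_to_int, int_to_ip, cidr_range, ranges_overlap) are made up for this example and are not part of Docker or iproute2.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Convert an integer back to dotted-quad notation.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print the first and last address covered by a CIDR block such as 192.168.88.192/27.
cidr_range() {
    addr=${1%/*}
    bits=${1#*/}
    size=$(( 1 << (32 - bits) ))
    first=$(( $(ip_to_int "$addr") & ~(size - 1) ))
    last=$(( first + size - 1 ))
    echo "$(int_to_ip "$first") $(int_to_ip "$last")"
}

# Succeed if the closed ranges [$1, $2] and [$3, $4] share at least one address.
ranges_overlap() {
    [ "$(ip_to_int "$1")" -le "$(ip_to_int "$4")" ] &&
        [ "$(ip_to_int "$3")" -le "$(ip_to_int "$2")" ]
}

set -- $(cidr_range 192.168.88.192/27)
echo "container range: $1 - $2"
if ranges_overlap "$1" "$2" 192.168.88.64 192.168.88.127; then
    echo "overlaps the DHCP pool"
else
    echo "no overlap with the DHCP pool"
fi
# prints:
#   container range: 192.168.88.192 - 192.168.88.223
#   no overlap with the DHCP pool
```

A /27 block holds 32 addresses, which is why the range ends at .223.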

Use docker network ls to see the created networks

docker network ls
NETWORK ID     NAME       DRIVER    SCOPE
fbe90109e5f2   bridge     bridge    local
9186dabfe83e   host       host      local
a373956bfe01   macvlan1   macvlan   local
47312fa573ea   none       null      local

Creating a container using macvlan

Once the macvlan network exists, you can attach a container to it by passing its name to --network.

docker run --network macvlan1 -it --rm alpine sh
/ # ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
23: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
    link/ether 02:42:c0:a8:58:c1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.88.193/24 brd 192.168.88.255 scope global eth0
       valid_lft forever preferred_lft forever

At this point, the container has acquired a separate LAN IP, which is equivalent to a separate machine on the LAN. It can be accessed directly from other machines on the LAN.
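
One detail in the ip addr output above is worth a second look: the container's MAC address, 02:42:c0:a8:58:c1, is the prefix 02:42 followed by the four octets of its IP, 192.168.88.193, written in hex. The sketch below reproduces that pattern. The function name ip_to_docker_style_mac is made up, and whether a given Docker version derives MAC addresses this way is an assumption based only on this output, so do not rely on it.

```shell
#!/bin/sh
# Hypothetical helper: rebuild the MAC pattern seen in the output above,
# i.e. 02:42 followed by the IPv4 octets in hexadecimal.
ip_to_docker_style_mac() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    printf '02:42:%02x:%02x:%02x:%02x\n' "$1" "$2" "$3" "$4"
}

ip_to_docker_style_mac 192.168.88.193
# prints: 02:42:c0:a8:58:c1
```

If a container needs a fixed MAC, for example for a firewall rule on the router, docker run also accepts --mac-address to set it explicitly.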

λ ~ ping 192.168.88.193
PING 192.168.88.193 (192.168.88.193): 56 data bytes
64 bytes from 192.168.88.193: icmp_seq=0 ttl=64 time=1.739 ms
64 bytes from 192.168.88.193: icmp_seq=1 ttl=64 time=0.743 ms
^C
--- 192.168.88.193 ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.743/1.241/1.739/0.498 ms

However, the container cannot be reached from its own host (the machine the container runs on). Packets from the host leave directly through the physical interface enp3s0 without passing through macvlan's internal bridge, and once they reach the external network device they cannot come back unless the switch or router supports hairpin mode. So we need to add a macvlan interface on the host for accessing the containers.

# Create a bridge-mode macvlan interface named docker-link
ip link add docker-link link enp3s0 type macvlan mode bridge
# Give docker-link the IP address reserved via aux-address; the /32 prefix avoids
# an automatic connected route that would clash with the explicit route below
ip addr add 192.168.88.192/32 dev docker-link
# Bring the docker-link interface up
ip link set docker-link up
# Route traffic for the 192.168.88.192/27 range through the docker-link interface
ip route add 192.168.88.192/27 dev docker-link
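
The extra route works because the kernel prefers the most specific matching prefix: 192.168.88.192/27 is narrower than the 192.168.88.0/24 route on enp3s0, so container addresses go through docker-link while the rest of the LAN still uses enp3s0. The sketch below is a simplified model of that decision, not the kernel's actual lookup. The helper names ip_to_int, in_cidr and route_for are made up for this example.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed if address $1 lies inside CIDR block $2.
in_cidr() {
    net=${2%/*}
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    a=$(ip_to_int "$1")
    n=$(ip_to_int "$net")
    [ $(( a & mask )) -eq $(( n & mask )) ]
}

# Simplified longest-prefix match over the two routes from this article:
# the narrower /27 is checked before the wider /24.
route_for() {
    if in_cidr "$1" 192.168.88.192/27; then
        echo docker-link
    elif in_cidr "$1" 192.168.88.0/24; then
        echo enp3s0
    else
        echo "default route"
    fi
}

for ip in 192.168.88.193 192.168.88.1; do
    echo "$ip -> $(route_for "$ip")"
done
# prints:
#   192.168.88.193 -> docker-link
#   192.168.88.1 -> enp3s0
```

On the real host, ip route get 192.168.88.193 asks the kernel directly which interface it would pick.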

At this point, the container can be reached from the host without any problems.

tomwei7@my-lab-a300 λ ~ ping 192.168.88.193
PING 192.168.88.193 (192.168.88.193) 56(84) bytes of data.
64 bytes from 192.168.88.193: icmp_seq=1 ttl=64 time=0.038 ms
64 bytes from 192.168.88.193: icmp_seq=2 ttl=64 time=0.070 ms
^C
--- 192.168.88.193 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1005ms
rtt min/avg/max/mdev = 0.038/0.054/0.070/0.016 ms

macvlan with 802.1q VLAN

The parent interface of a macvlan network does not have to be a physical interface; it can also be a VLAN interface. Docker provides a bit of syntactic sugar here: if the name of the parent interface contains a dot (.), Docker creates the corresponding VLAN interface. For example, the following macvlan network is created with these settings:

  • VLAN ID: 200
  • Network segment: 192.168.200.0/24
  • Gateway: 192.168.200.1

docker network create \
    -d macvlan \
    --subnet=192.168.200.0/24 \
    --ip-range=192.168.200.0/24 \
    --gateway 192.168.200.1 \
    -o parent=enp3s0.200 \
    macvlan2

Docker automatically creates the corresponding VLAN interface.

ip -details link show

26: enp3s0.200@enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default
    link/ether a8:a1:59:24:d0:61 brd ff:ff:ff:ff:ff:ff promiscuity 0 minmtu 0 maxmtu 65535
    vlan protocol 802.1Q id 200 <REORDER_HDR> addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 64000 gso_max_segs 64
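
The interface.vlanid naming convention can be taken apart with plain parameter expansion. The sketch below splits a parent name such as enp3s0.200 and prints the iproute2 command that would create the same VLAN interface by hand. It mirrors the naming convention only and is not Docker's own parsing code. The function names split_parent and vlan_link_cmd are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: split "enp3s0.200" into "enp3s0 200".
# A name without a dot is printed unchanged.
split_parent() {
    case $1 in
        *.*) echo "${1%.*} ${1##*.}" ;;
        *)   echo "$1" ;;
    esac
}

# Print (not run) the command that creates the VLAN interface manually.
vlan_link_cmd() {
    set -- $(split_parent "$1")
    if [ -z "${2:-}" ]; then
        echo "no VLAN ID in parent name: $1" >&2
        return 1
    fi
    echo "ip link add link $1 name $1.$2 type vlan id $2"
}

vlan_link_cmd enp3s0.200
# prints: ip link add link enp3s0 name enp3s0.200 type vlan id 200
```

Creating the VLAN interface yourself this way is only needed outside Docker; with -o parent=enp3s0.200 Docker does it for you, as shown above.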

Using a VLAN also requires the corresponding configuration and firewall rules on the router.

Taking RouterOS, which my network runs, as an example, you need to add a VLAN interface and assign it an IP address:

# main-bridge is the bridge the current machine is connected to; the VLAN can also be set directly on the port
/interface/vlan/add interface=main-bridge vlan-id=200 name=vlan-200-docker-demo
/ip/address/add address=192.168.200.1/24 interface=vlan-200-docker-demo

Many of you probably run OpenWrt on your router. In OpenWrt you can add the new interface directly through the UI under Network / Interfaces: select Static address as the protocol, and for the device enter a custom device name in the format interface.vlanid. For example, for the br-lan interface with VLAN 200, the device name is br-lan.200.

Then just set the IP for the interface.

If the host is not connected directly to the router and there are switches in between, the corresponding VLAN settings must be made on them as well; I will not go into that here.

Others

The examples above use Docker, but other tools such as Podman work similarly. Flattening the container network gives each container its own “name” (IP) on the LAN. Compared with a bridge network, we no longer need to worry about whether there are enough ports or which ports should be exposed, and the host’s iptables rules stay much cleaner.