Read and write network packets in batches for high performance

The network protocol stack provides a wealth of features that make it easy to exchange data over the network, but sometimes its performance is not good enough. In previous articles I introduced ways to process network data efficiently with XDP and similar technologies, but XDP is not yet widely used, nor is it simple to use. If we read and write data through the standard library of our programming language, how else can we improve performance? This article introduces one option: reading and writing packets in batches.


Intuitively, sending and receiving network packets in batches should be more efficient than handling them one at a time. In the ordinary one-at-a-time logic, every packet sent or received requires at least one system call (Send/SendTo, Read/RecvFrom), whereas in the batch approach a single system call can handle multiple packets. In theory, then, the batch method is more efficient. This is not limited to network processing: many message queues and data stores also get better performance from batching.
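As a rough illustration of that reasoning, the sketch below counts the minimum number of calls needed to move a given number of packets for a few batch sizes. The function name syscallsNeeded and the chosen packet count are made up for this example, and the count ignores retries and partial batches, so treat it as a lower bound rather than a measurement.

```go
package main

import "fmt"

// syscallsNeeded returns the minimum number of send (or receive) calls
// needed to move n packets when each call can carry up to batch packets.
// It is a hypothetical helper for illustration only.
func syscallsNeeded(n, batch int) int {
    if n <= 0 || batch <= 0 {
        return 0
    }
    return (n + batch - 1) / batch // ceiling division
}

func main() {
    const packets = 1000000
    for _, batch := range []int{1, 8, 64} {
        fmt.Printf("batch=%d -> %d calls\n", batch, syscallsNeeded(packets, batch))
    }
}
```

Fewer calls only translate into higher throughput if the per-call overhead dominates, which is why measuring your own workload still matters.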

I did not benchmark batch processing against single-packet processing for this article. Someone ran a simple test and did not find any benefit from batching, although I suspect that test was too narrow or too simple. Cloudflare also ran a test at a million packets per second, and the batch approach performed very well. When you evaluate this technique, run a performance test for your own scenario to make sure it is worth adopting.

The batch technique discussed here is implemented with the system calls sendmmsg and recvmmsg, which are primarily a Linux feature. As the man pages describe, they are system calls that send and receive multiple messages on a socket.

sendmmsg - send multiple messages on a socket: int sendmmsg(int sockfd, struct mmsghdr *msgvec, unsigned int vlen, int flags);
recvmmsg - receive multiple messages on a socket: int recvmmsg(int sockfd, struct mmsghdr *msgvec, unsigned int vlen, int flags, struct timespec *timeout);

recvmmsg was added in Linux 2.6.33 (glibc support since version 2.12), and sendmmsg was added in Linux 3.0 (glibc support since version 2.14). OpenBSD 7.2 also added these system calls.

Note that on a blocking socket, recvmmsg does not return until it has received vlen messages or the timeout expires; on a nonblocking socket, it returns immediately with whatever messages are available, up to vlen. These two system calls are extensions of sendmsg and recvmsg. If you have studied this area before, you probably know that there are several system calls for writing and reading network data: write, send, sendto, sendmsg, and sendmmsg on the sending side, and read, recv, recvfrom, recvmsg, and recvmmsg on the receiving side.

The man pages (man send and man recv) describe each of them in detail. The main system calls for sending are the following:

send: The send function transmits data on a connected socket, whether a TCP connection or a connected UDP socket. It is similar to write, except that write has no flags argument.
sendto: The sendto function is similar to send, but it can specify the recipient’s address. For a connection-oriented protocol such as TCP, dest_addr and addrlen are ignored; otherwise, for example with UDP on an unconnected socket, both parameters must be specified.
sendmsg: The sendmsg function can send data from multiple buffers. It can also carry one or more pieces of ancillary (control) data. You need to specify
sendmmsg: The sendmmsg function can send multiple messages in one call, and each message can have one or more buffers. This reduces the number of system calls and thus increases efficiency.

Similarly, the main system calls for receiving are the following:

recv: recv is the most basic receive function. It receives data from a socket and returns the number of bytes received, without any additional information (such as the sender’s address). It is similar to read, except that read has no flags argument.
recvfrom: recvfrom also receives data from a socket, but it also returns the sender’s address information, suitable for protocols with address information such as UDP.
recvmsg: recvmsg can receive data along with other related data information (such as whether the received data is truncated, the IP address of the sender, etc.). It supports reception of multiple data buffers, as well as control messages (cmsg).
recvmmsg: recvmmsg is a multi-message version of recvmsg, which can receive multiple messages at the same time and is suitable for high concurrency and high throughput scenarios.

These system calls correspond to the methods of the Go standard library’s connection types. Taking UDPConn as an example:

conn.Write(_): uses the write system call
conn.WriteTo(_, _): uses the sendto system call
conn.WriteToUDP(_, _): uses the sendto system call
conn.WriteMsgUDP(_, _, _): uses the sendmsg system call
conn.WriteMsgUDPAddrPort(_, _, _): uses the sendmsg system call
conn.Read(nil): uses the read system call
conn.ReadFrom(nil): uses the recvfrom system call
conn.ReadMsgUDP(nil, nil): uses the recvmsg system call
conn.ReadMsgUDPAddrPort(nil, nil): uses the recvmsg system call
conn.ReadFromUDP(nil): uses the recvfrom system call
conn.ReadFromUDPAddrPort(nil): uses the recvfrom system call
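To make the one-call-per-packet pattern concrete, here is a minimal sketch that uses only the standard library: it sends a few datagrams over loopback with WriteToUDP and reads them back with ReadFromUDP, so every datagram costs at least one send call and one receive call. The helper name sendOneByOne, the loopback addresses, and the two-second deadline are assumptions made for this demo.

```go
package main

import (
    "fmt"
    "net"
    "time"
)

// sendOneByOne sends n datagrams over loopback, one WriteToUDP call per
// datagram, and reads them back with one ReadFromUDP call per datagram.
// It returns the number of datagrams received.
func sendOneByOne(n int) (int, error) {
    // Port 0 lets the kernel pick any free port (a choice made for this demo).
    server, err := net.ListenUDP("udp4", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1)})
    if err != nil {
        return 0, err
    }
    defer server.Close()

    client, err := net.ListenUDP("udp4", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1)})
    if err != nil {
        return 0, err
    }
    defer client.Close()

    dst := server.LocalAddr().(*net.UDPAddr)
    for i := 0; i < n; i++ {
        // Each iteration costs at least one send system call.
        if _, err := client.WriteToUDP([]byte(fmt.Sprintf("hello %d", i)), dst); err != nil {
            return 0, err
        }
    }

    // Bound the reads so that a lost datagram cannot hang the demo.
    if err := server.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
        return 0, err
    }
    buf := make([]byte, 1024)
    received := 0
    for received < n {
        // Each iteration costs at least one receive system call.
        if _, _, err := server.ReadFromUDP(buf); err != nil {
            return received, err
        }
        received++
    }
    return received, nil
}

func main() {
    got, err := sendOneByOne(10)
    if err != nil {
        panic(err)
    }
    fmt.Printf("received %d datagrams, one call each\n", got)
}
```

This is the baseline that the batch methods discussed in the rest of the article try to improve on.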

Unfortunately, the Go standard library does not provide wrappers for the sendmmsg and recvmmsg system calls, which are not even defined in the syscall package, so types such as net.UDPConn have no corresponding batch methods. In 2021, in Go issue #45886, bradfitz proposed adding batch read/write methods to *UDPConn. Russ Cox accepted the proposal on November 10, 2022, but no one has picked it up to implement it yet.

However, the Go extension library golang.org/x/net provides ReadBatch and WriteBatch methods for reading and writing messages in batches; they wrap the recvmmsg and sendmmsg system calls.

Of course, you can also wrap the system calls yourself with a similar implementation.

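// recvmmsg and sendmmsg wrap the raw system calls. The mmsghdr type, the
// syscall numbers sysRECVMMSG and sysSENDMMSG, and the errnoErr helper are
// platform-specific and must be defined elsewhere.
func recvmmsg(s uintptr, hs []mmsghdr, flags int) (int, error) {
    // The fifth argument is the timeout pointer; 0 (NULL) means no timeout.
    n, _, errno := syscall.Syscall6(sysRECVMMSG, s, uintptr(unsafe.Pointer(&hs[0])), uintptr(len(hs)), uintptr(flags), 0, 0)
    return int(n), errnoErr(errno)
}

func sendmmsg(s uintptr, hs []mmsghdr, flags int) (int, error) {
    n, _, errno := syscall.Syscall6(sysSENDMMSG, s, uintptr(unsafe.Pointer(&hs[0])), uintptr(len(hs)), uintptr(flags), 0, 0)
    return int(n), errnoErr(errno)
}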

However, these approaches currently share a problem: a call such as ReadBatch is blocking. If not enough messages have arrived, the current thread is blocked. That is not a problem with a small number of threads, but with many threads it leads to resource shortages and performance problems. In his proposal, bradfitz expects the batch methods to integrate with the standard library’s net poller, which would prevent threads from being blocked, so hopefully this feature will be implemented soon.

So let’s look at what is currently the best way to read and write UDP messages in batches: the ipv4 package from golang.org/x/net.

Using ipv4.PacketConn

We can make use of ipv4.PacketConn which provides the ReadBatch and WriteBatch methods:

func (c *PacketConn) ReadBatch(ms []Message, flags int) (int, error): reads a batch of messages and returns the number of messages read, up to len(ms)
func (c *PacketConn) WriteBatch(ms []Message, flags int) (int, error): writes a batch of messages and returns the number of messages written successfully

Next, we demonstrate the ability to read and write in batches with an example of a UDP client and server.

Here is the client code. It first creates a *net.UDPConn with the standard library and converts it to an *ipv4.PacketConn with ipv4.NewPacketConn. Next, it prepares 10 messages, which must be of type ipv4.Message. In production code it is better to pool these objects, for example with sync.Pool. This example is kept simple: it ignores performance and only demonstrates batch reading and writing. After preparing the data, it calls WriteBatch to write the batch. It does not handle the case where some messages are not written and assumes that all of them are written successfully.

Next, it reads the replies. The server is assumed to echo every message, so if one batch read does not return all of them, the client keeps reading until every message has been read back.

package main

import (
    "fmt"
    "net"

    "golang.org/x/net/ipv4"
)

func main() {
    remote, err := net.ResolveUDPAddr("udp", "localhost:9999")
    if err != nil {
        panic(err)
    }
    conn, err := net.Dial("udp", "localhost:9999")
    if err != nil {
        panic(err)
    }
    defer conn.Close()
    pconn := ipv4.NewPacketConn(conn.(*net.UDPConn))
    // write with a batch of 10 messages
    batch := 10
    msgs := make([]ipv4.Message, batch)
    for i := 0; i < batch; i++ {
        msgs[i] = ipv4.Message{
            Buffers: [][]byte{[]byte(fmt.Sprintf("hello batch %d", i))},
            Addr:    remote,
        }
    }
    n, err := pconn.WriteBatch(msgs, 0)
    if err != nil {
        panic(err)
    }
    fmt.Printf("sent %d messages\n", n)
    // read 10 messages with batch
    count := 0
    for count < batch {
        n, err := pconn.ReadBatch(msgs, 0)
        if err != nil {
            panic(err)
        }
        count += n
        for i := 0; i < n; i++ {
            // N is the number of bytes actually received for this message.
            fmt.Println(string(msgs[i].Buffers[0][:msgs[i].N]))
        }
    }
}
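The client above assumes that a single WriteBatch call sends every message. A minimal sketch of a retry loop that handles partial writes could look like the following. To keep it free of external dependencies, it uses a stand-in message type and a function value in place of pconn.WriteBatch; the names message, writeBatchFunc, and writeAll are hypothetical and not part of golang.org/x/net.

```go
package main

import (
    "errors"
    "fmt"
)

// message is a stand-in for ipv4.Message so the sketch compiles without
// golang.org/x/net; only the Buffers field is mirrored.
type message struct {
    Buffers [][]byte
}

// writeBatchFunc has the same shape as a batch write: it takes a slice of
// messages and reports how many of them were written.
type writeBatchFunc func(ms []message) (int, error)

// writeAll keeps calling write with the unsent tail of ms until every
// message has been written. It returns the number of calls it made.
func writeAll(write writeBatchFunc, ms []message) (int, error) {
    calls := 0
    for sent := 0; sent < len(ms); {
        n, err := write(ms[sent:])
        calls++
        if err != nil {
            return calls, err
        }
        if n == 0 {
            // Guard against spinning forever if nothing is accepted.
            return calls, errors.New("batch write made no progress")
        }
        sent += n
    }
    return calls, nil
}

func main() {
    // A fake writer that accepts at most 4 messages per call.
    fake := func(ms []message) (int, error) {
        if len(ms) > 4 {
            return 4, nil
        }
        return len(ms), nil
    }
    calls, err := writeAll(fake, make([]message, 10))
    fmt.Println(calls, err)
}
```

With a real connection, the function value would simply forward to the batch write method, for example a closure that calls pconn.WriteBatch(ms, 0) on ipv4.Message values.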

The server-side code is similar, receiving the messages in batches and then writing them back in batches as they are.

package main

import (
    "fmt"
    "net"

    "golang.org/x/net/ipv4"
)

func main() {
    addr, err := net.ResolveUDPAddr("udp", ":9999")
    if err != nil {
        panic(err)
    }
    conn, err := net.ListenUDP("udp", addr)
    if err != nil {
        panic(err)
    }
    defer conn.Close()
    pconn := ipv4.NewPacketConn(conn)
    fmt.Println("server listening on", addr)
    batch := 10
    msgs := make([]ipv4.Message, batch)
    for i := 0; i < batch; i++ {
        msgs[i] = ipv4.Message{
            Buffers: [][]byte{make([]byte, 1024)},
        }
    }
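    for {
        n, err := pconn.ReadBatch(msgs, 0)
        if err != nil {
            panic(err)
        }
        // Trim each buffer to the bytes actually received so that the echo
        // does not send the whole 1024-byte buffer.
        for i := 0; i < n; i++ {
            msgs[i].Buffers[0] = msgs[i].Buffers[0][:msgs[i].N]
        }
        tn := 0
        for tn < n {
            nn, err := pconn.WriteBatch(msgs[tn:n], 0)
            if err != nil {
                panic(err)
            }
            tn += nn
        }
        // Restore the full-length buffers before the next read.
        for i := 0; i < n; i++ {
            msgs[i].Buffers[0] = msgs[i].Buffers[0][:cap(msgs[i].Buffers[0])]
        }
    }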
}

Run the server and then the client; the client output shows that all 10 messages are received.

ubuntu@lab:~/network-programming/ch04/mmsg/sendmmsg/client$ ./client
sent 10 messages
hello batch 0
hello batch 1
hello batch 2
hello batch 3
hello batch 4
hello batch 5
hello batch 6
hello batch 7
hello batch 8
hello batch 9
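For production use, the article suggests pooling the message objects rather than allocating them for every batch. The sketch below shows one possible way to do that with sync.Pool. It uses a stand-in message type instead of ipv4.Message so that it compiles with the standard library alone; the names message, batchPool, getBatch, and putBatch, as well as the batch and buffer sizes, are assumptions made for this example.

```go
package main

import (
    "fmt"
    "sync"
)

// message is a stand-in for ipv4.Message; only the fields used here are
// mirrored so the sketch has no external dependency.
type message struct {
    Buffers [][]byte
    N       int
}

const (
    batchSize = 10   // messages per batch (assumed)
    bufSize   = 1024 // bytes per message buffer (assumed)
)

// batchPool hands out pre-allocated batches so that hot paths do not
// allocate a new slice of messages and buffers for every call.
var batchPool = sync.Pool{
    New: func() interface{} {
        ms := make([]message, batchSize)
        for i := range ms {
            ms[i].Buffers = [][]byte{make([]byte, bufSize)}
        }
        return &ms
    },
}

// getBatch takes a batch from the pool, allocating one if the pool is empty.
func getBatch() *[]message {
    return batchPool.Get().(*[]message)
}

// putBatch resets every buffer to its full length and returns the batch
// to the pool.
func putBatch(ms *[]message) {
    for i := range *ms {
        b := (*ms)[i].Buffers[0]
        (*ms)[i].Buffers[0] = b[:cap(b)]
        (*ms)[i].N = 0
    }
    batchPool.Put(ms)
}

func main() {
    ms := getBatch()
    fmt.Println(len(*ms), len((*ms)[0].Buffers[0]))
    putBatch(ms)
}
```

Storing a pointer to the slice in the pool avoids an extra allocation on each Put. Whether pooling pays off depends on the allocation rate of the workload, so it is worth measuring before adopting it.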

Using ipv4.Conn

You can also use ipv4.Conn to read and write in batches; the underlying implementation is the same. It is built on golang.org/x/net/internal/socket.Conn, which provides the following methods for reading and writing messages in batches:

SendMsgs(ms []Message, flags int) (int, error): batch write message, wrapping system call sendmmsg, returns the number of messages sent successfully
RecvMsgs(ms []Message, flags int) (int, error): batch read message, wrapping system call recvmmsg, returns the number of messages received successfully

The usage is almost the same as above, but with different method names. Here is the client-side code.

package main

import (
    "fmt"
    "net"

    "golang.org/x/net/ipv4"
)

func main() {
    remote, err := net.ResolveUDPAddr("udp", "localhost:9999")
    if err != nil {
        panic(err)
    }
    conn, err := net.Dial("udp", "localhost:9999")
    if err != nil {
        panic(err)
    }
    defer conn.Close()
    pconn := ipv4.NewConn(conn)
    // write with a batch of 10 messages
    batch := 10
    msgs := make([]ipv4.Message, batch)
    for i := 0; i < batch; i++ {
        msgs[i] = ipv4.Message{
            Buffers: [][]byte{[]byte(fmt.Sprintf("hello batch %d", i))},
            Addr:    remote,
        }
    }
    n, err := pconn.SendMsgs(msgs, 0)
    if err != nil {
        panic(err)
    }
    fmt.Printf("sent %d messages\n", n)
    // read 10 messages with batch
    count := 0
    for count < batch {
        n, err := pconn.RecvMsgs(msgs, 0)
        if err != nil {
            panic(err)
        }
        count += n
        for i := 0; i < n; i++ {
            // N is the number of bytes actually received for this message.
            fmt.Println(string(msgs[i].Buffers[0][:msgs[i].N]))
        }
    }
}

Here is the server-side code.

package main

import (
    "fmt"
    "net"

    "golang.org/x/net/ipv4"
)

func main() {
    addr, err := net.ResolveUDPAddr("udp", ":9999")
    if err != nil {
        panic(err)
    }
    conn, err := net.ListenUDP("udp", addr)
    if err != nil {
        panic(err)
    }
    defer conn.Close()
    pconn := ipv4.NewConn(conn)
    fmt.Println("server listening on", addr)
    batch := 10
    msgs := make([]ipv4.Message, batch)
    for i := 0; i < batch; i++ {
        msgs[i] = ipv4.Message{
            Buffers: [][]byte{make([]byte, 1024)},
        }
    }
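    for {
        n, err := pconn.RecvMsgs(msgs, 0)
        if err != nil {
            panic(err)
        }
        // Trim each buffer to the bytes actually received so that the echo
        // does not send the whole 1024-byte buffer.
        for i := 0; i < n; i++ {
            msgs[i].Buffers[0] = msgs[i].Buffers[0][:msgs[i].N]
        }
        tn := 0
        for tn < n {
            nn, err := pconn.SendMsgs(msgs[tn:n], 0)
            if err != nil {
                panic(err)
            }
            tn += nn
        }
        // Restore the full-length buffers before the next read.
        for i := 0; i < n; i++ {
            msgs[i].Buffers[0] = msgs[i].Buffers[0][:cap(msgs[i].Buffers[0])]
        }
    }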
}

Ref

https://colobu.com/2023/04/22/batch-read-and-write-udp-packets-in-Go/
