Our company's internal Go middleware logged a UDP connection error. The cause was obvious: the service on the other side was down, and restarting it fixed the problem.

But that left me wondering: how does UDP detect that the other side is down?

err:  write udp 172.16.44.62:62651->172.16.0.46:29999: write: connection refused

err:  write udp 172.16.44.62:62651->172.16.0.46:29999: write: connection refused

err:  write udp 172.16.44.62:62651->172.16.0.46:29999: write: connection refused

...

UDP has neither a three-way handshake nor TCP-style state control messages, so how can the client tell whether the UDP port on the other side is open?

Capturing packets shows that when the server's port is not open, the server's kernel returns an ICMP Destination Unreachable (Port Unreachable) message to the client, which the client side reports as ECONNREFUSED.

When the client's kernel parses that ICMP message, it looks up the corresponding socket by the five-tuple and records the error on it, so the next call on that socket returns it through errno. If the client is blocked waiting on the socket, it is woken up with the error status set.
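To make the five-tuple lookup concrete: per RFC 792, an ICMP Destination Unreachable message carries the IP header of the rejected datagram plus at least its first 8 bytes, which for UDP is the whole UDP header including both ports. The Go sketch below decodes such a message. It is only an illustration of the idea, not the kernel's code: the names `fiveTuple`, `parsePortUnreachable` and `sampleMessage` are made up, the source port 40000 is invented, and only IPv4 is handled.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
)

// fiveTuple identifies the client socket that sent the rejected datagram.
type fiveTuple struct {
	Proto            uint8
	SrcIP, DstIP     net.IP
	SrcPort, DstPort uint16
}

// parsePortUnreachable decodes an ICMPv4 message (starting at the ICMP header)
// and returns the five-tuple of the UDP datagram embedded in it.
func parsePortUnreachable(icmp []byte) (fiveTuple, bool) {
	// 8 bytes ICMP header + at least 20 bytes IP header + 8 bytes UDP header.
	if len(icmp) < 8+20+8 {
		return fiveTuple{}, false
	}
	// Type 3 = destination unreachable, code 3 = port unreachable.
	if icmp[0] != 3 || icmp[1] != 3 {
		return fiveTuple{}, false
	}
	ip := icmp[8:]
	ihl := int(ip[0]&0x0f) * 4 // embedded IP header length in bytes
	// Require IPv4, a sane header length, room for the UDP header, protocol 17 (UDP).
	if ip[0]>>4 != 4 || ihl < 20 || len(ip) < ihl+8 || ip[9] != 17 {
		return fiveTuple{}, false
	}
	udp := ip[ihl:]
	return fiveTuple{
		Proto:   ip[9],
		SrcIP:   net.IP(ip[12:16]),
		DstIP:   net.IP(ip[16:20]),
		SrcPort: binary.BigEndian.Uint16(udp[0:2]),
		DstPort: binary.BigEndian.Uint16(udp[2:4]),
	}, true
}

// sampleMessage builds a hand-made message for a 1-byte datagram sent from
// 172.16.0.62:40000 to 172.16.0.46:8888 (checksums and other fields left zero).
func sampleMessage() []byte {
	msg := make([]byte, 8+20+8+1)
	msg[0], msg[1] = 3, 3
	ip := msg[8:]
	ip[0] = 0x45 // IPv4, 20-byte header
	ip[9] = 17   // UDP
	copy(ip[12:16], net.IPv4(172, 16, 0, 62).To4())
	copy(ip[16:20], net.IPv4(172, 16, 0, 46).To4())
	binary.BigEndian.PutUint16(ip[20:22], 40000)
	binary.BigEndian.PutUint16(ip[22:24], 8888)
	return msg
}

func main() {
	tuple, ok := parsePortUnreachable(sampleMessage())
	fmt.Println(ok, tuple.SrcIP, tuple.SrcPort, "->", tuple.DstIP, tuple.DstPort)
}
```

The sample message is 8 + 20 + 8 + 1 = 37 bytes long, which would be consistent with the "length 37" that tcpdump prints for the reply to nc's 1-byte probe later in this post.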

(Figures: two packet captures. The first shows the ICMP reply to a failed UDP send, the second shows normal ICMP traffic.)

When a UDP connection misbehaves, you can run tcpdump with an ICMP filter to catch the error message, since the other side signals the refusal through ICMP.

Use tcpdump to capture packets

Request command:

First find a host that responds to ping, then use nc as a UDP client to send to a port that nothing is listening on. nc reports Connection refused.

[root@ocean ~]# nc -vzu 172.16.0.46 8888
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 172.16.0.46:8888.
Ncat: Connection refused.

The packet capture information is as follows:

[root@ocean ~]# tcpdump -i any icmp -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
17:01:14.075617 IP 172.16.0.46 > 172.16.0.62: ICMP 172.16.0.46 udp port 8888 unreachable, length 37
17:01:17.326145 IP 172.16.0.46 > 172.16.0.62: ICMP 172.16.0.46 udp port 8888 unreachable, length 37
17:01:17.927480 IP 172.16.0.46 > 172.16.0.62: ICMP 172.16.0.46 udp port 8888 unreachable, length 37
17:01:18.489560 IP 172.16.0.46 > 172.16.0.62: ICMP 172.16.0.46 udp port 8888 unreachable, length 37

Also note that telnet does not support udp, only tcp, so it is recommended to use nc to probe udp.

Testing of various cases

Summary of cases:

  • When the target IP is unreachable, the UDP client usually reports success.
  • When the UDP server program is stopped but its host is still up, the host's kernel replies with an ICMP Port Unreachable error, which the client sees as ECONNREFUSED.
  • When the other side drops the UDP port with iptables, the client also usually reports success.

If the IP cannot be connected:

[root@host-46 ~ ]# ping 172.16.0.65
PING 172.16.0.65 (172.16.0.65) 56(84) bytes of data.
From 172.16.0.46 icmp_seq=1 Destination Host Unreachable
From 172.16.0.46 icmp_seq=2 Destination Host Unreachable
From 172.16.0.46 icmp_seq=3 Destination Host Unreachable
From 172.16.0.46 icmp_seq=4 Destination Host Unreachable
From 172.16.0.46 icmp_seq=5 Destination Host Unreachable
From 172.16.0.46 icmp_seq=6 Destination Host Unreachable
^C
--- 172.16.0.65 ping statistics ---
6 packets transmitted, 0 received, +6 errors, 100% packet loss, time 4999ms
pipe 4

[root@host-46 ~ ]# nc -vzu 172.16.0.65 8888
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 172.16.0.65:8888.
Ncat: UDP packet sent successfully
Ncat: 1 bytes sent, 0 bytes received in 2.02 seconds.

Also, to be clear again: UDP has no status messages like TCP's, so capturing only UDP packets will not show anything abnormal.

So why does nc in UDP mode report success when the IP is unreachable?

The logic of nc in UDP mode

Why does it report a successful connection when the IP is unreachable or the packet is dropped?

Because nc's default detection logic is simple: as long as no ICMP error (surfaced as ECONNREFUSED) arrives within 2 seconds, it considers the UDP connection successful. 😅

Here is the system call trace of the nc UDP command:

setsockopt(3, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(30000), sin_addr=inet_addr("172.16.0.111")}, 16) = 0
select(4, [3], [3], [3], NULL)          = 1 (out [3])
getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
write(2, "Ncat: ", 6Ncat: )                   = 6
write(2, "Connected to 172.16.0.111:29999."..., 33Connected to 172.16.0.111:29999.
) = 33
sendto(3, "\0", 1, 0, NULL, 0)          = 1

// The select multiplexing call carries the timeout logic.
select(4, [3], [], [], {tv_sec=2, tv_usec=0}) = 0 (Timeout)

write(2, "Ncat: ", 6Ncat: )                   = 6
write(2, "UDP packet sent successfully\n", 29UDP packet sent successfully
) = 29
write(2, "Ncat: ", 6Ncat: )                   = 6
write(2, "1 bytes sent, 0 bytes received i"..., 481 bytes sent, 0 bytes received in 2.02 seconds.
) = 48
close(3)                                = 0

A UDP client written in Go or Python does not actually get an error when it sends a UDP message to an unreachable address; it will usually assume the send succeeded.

Again, UDP lacks TCP's handshake step. With TCP, if a SYN gets no response, the stack retries 6 times (the Linux default) with exponential backoff, and the kernel returns an errno value once all 6 retries go unanswered.

UDP connection information

On the client host, you can see the UDP five-tuple connection information with ss, lsof, or netstat.

[root@host-46 ~ ]$ netstat -tunalp|grep 29999
udp        0      0 172.16.0.46:44136       172.16.0.46:29999       ESTABLISHED 1285966/cccc

The server side usually shows no UDP connection information, only the UDP listening socket!

[root@host-62 ~ ]# netstat -tunalp|grep 29999
udp       0      0 :::29999                :::*                                4038720/ss

Does the client need to be re-instantiated?

When the client and server are connected and the server is restarted manually, the client does not need to re-create the connection and can keep sending data. Once the server is up again, it still receives the client's messages.

UDP has no handshake, and connect() on a UDP socket only records the peer address in the local socket. That is why the socket's five-tuple is not visible with netstat on the server side.
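A quick way to convince yourself that connecting a UDP socket is purely local is to dial a port and print the address pair the socket now holds. This is a minimal sketch; `dialLocalOnly` is a hypothetical helper, and I use the loopback address with this post's port 29999 so that it runs whether or not a server is listening.

```go
package main

import (
	"fmt"
	"net"
)

// dialLocalOnly connects a UDP socket to addr and returns the address pair
// recorded in the local socket. No packet is sent.
func dialLocalOnly(addr string) (local, remote string, err error) {
	conn, err := net.Dial("udp", addr)
	if err != nil {
		return "", "", err
	}
	defer conn.Close()
	return conn.LocalAddr().String(), conn.RemoteAddr().String(), nil
}

func main() {
	local, remote, err := dialLocalOnly("127.0.0.1:29999")
	if err != nil {
		fmt.Println("dial failed:", err)
		return
	}
	// The kernel has picked an ephemeral local port and stored the peer address.
	fmt.Println(local, "->", remote)
}
```

The dial succeeds even with no server running, because nothing is sent until the first Write. The same socket therefore stays usable across server restarts.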

Golang test code

server-side code:

package main

import (
    "fmt"
    "net"
)

// UDP server
func main() {
    listen, err := net.ListenUDP("udp", &net.UDPAddr{
        IP:   net.IPv4(0, 0, 0, 0),
        Port: 29999,
    })

    if err != nil {
        fmt.Println("Listen failed, err: ", err)
        return
    }
    defer listen.Close()

    for {
        var data [1024]byte
        n, addr, err := listen.ReadFromUDP(data[:])
        if err != nil {
            fmt.Println("read udp failed, err: ", err)
            continue
        }
        fmt.Printf("data:%v addr:%v count:%v\n", string(data[:n]), addr, n)
    }
}

Client Code:

package main

import (
    "fmt"
    "net"
    "time"
)

// UDP client
func main() {
    socket, err := net.DialUDP("udp", nil, &net.UDPAddr{
        IP:   net.IPv4(172, 16, 0, 46),
        Port: 29999,
    })
    if err != nil {
        fmt.Println("failed to connect to UDP server, err: ", err)
        return
    }
    defer socket.Close()

    for {
        time.Sleep(2 * time.Second)
        sendData := []byte("Hello Server")
        // On a connected UDP socket, a write can fail with "connection refused"
        // if an earlier datagram triggered an ICMP port unreachable reply.
        _, err = socket.Write(sendData)
        if err != nil {
            fmt.Println("failed to send data, err: ", err)
            continue
        }

        fmt.Println("sent")
    }
}

Summary

When the machine on the UDP server side is reachable and nothing is wrong, the client will usually show success. When something is wrong, the following cases occur:

  • When the IP address is unreachable, the UDP client will usually show success when connecting and sending.
  • When the UDP server program is stopped but its host is still up, the host's kernel returns an ICMP Port Unreachable error, and the client reports ECONNREFUSED.
  • When the other side drops the UDP port with iptables, the client will also show success.
  • When the client and server are exchanging data and the service process dies, the UDP client cannot sense the shutdown immediately. It only learns that the other side is down the next time it sends data and the other side's kernel answers with an ICMP Port Unreachable message (ECONNREFUSED).