A long time ago, I came across an article by Brad Fitzpatrick called "netaddr.IP: a new IP address type for Go". Brad is a core developer of the Go language who works at Tailscale, and in the article he analyzes the problems with Go's net.IP type, the solution his team arrived at, and how that solution evolved. Brad and his team eventually open-sourced it as the inet.af/netaddr package. I skimmed it at the time and was impressed. Today a newsletter told me that Go 1.18 has accepted Brad's proposal to add a new net/netip package, so I dug up Brad's article and read it carefully. I came away not only appreciating the problems with the net.IP type and the ingenuity of the new scheme, but also with a better understanding of memory allocation, garbage collection, and the use of the unsafe package in Go. Today, I'd like to share it with you.

What’s wrong with the net.IP type of Go?

Brad lists the “seven problems” with net.IP in his article.

  1. The contents are mutable. The underlying type of net.IP is []byte, so passing a net.IP copies only the slice header. The callee shares the same backing array, and any function that handles the address can change its contents.
  2. Cannot be compared directly. Because slices are not comparable, two net.IP values cannot be tested for equality with ==, and net.IP cannot be used as a map key.
  3. There are two address types in the standard library, net.IP and net.IPAddr. Ordinary IPv4 and IPv6 addresses are stored in net.IP. IPv6 link-local addresses need net.IPAddr, because the zone (the NIC the link belongs to) has to be stored as well. With two types, every API has to decide whether to accept one, the other, or both.
  4. Takes up a lot of memory. A slice header alone takes 24 bytes (on 64-bit platforms, see Russ’s article for details). So the memory footprint of a net.IP is 24 bytes of header plus 4 bytes (IPv4) or 16 bytes (IPv6) of address data. If the link-local zone (NIC name) needs to be stored, net.IPAddr adds a 16-byte string header plus the NIC name itself.
  5. Memory needs to be allocated from the heap. Each time memory is allocated from the heap, it puts extra pressure on the GC.
  6. Cannot distinguish between IPv4 addresses and IPv4-mapped IPv6 addresses (in the form of ::ffff:192.168.1.1) when parsing IP addresses from strings.
  7. Exposes implementation details to the outside world. The definition of net.IP is type IP []byte, so the underlying []byte is part of the API and cannot be changed.
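
A few of these problems are easy to reproduce with the standard library alone. The sketch below is only an illustration: the helper names (zeroFirstByte, mutatedByCallee, sameAfterParse) are made up for this example and do not come from Brad's article.

```go
package main

import (
	"fmt"
	"net"
)

// zeroFirstByte stands in for any callee that receives a net.IP.
// Because net.IP is a slice, the write is visible to the caller.
func zeroFirstByte(ip net.IP) {
	ip[0] = 0
}

// mutatedByCallee reports whether a callee changed the caller's address
// (problem 1).
func mutatedByCallee() bool {
	ip := net.IP{192, 168, 1, 1}
	zeroFirstByte(ip)
	return ip[0] == 0
}

// sameAfterParse reports whether an IPv4 address and its IPv4-mapped IPv6
// form are indistinguishable after net.ParseIP (problem 6).
func sameAfterParse() bool {
	a := net.ParseIP("192.168.1.1")
	b := net.ParseIP("::ffff:192.168.1.1")
	return a.Equal(b) && len(a) == len(b)
}

func main() {
	fmt.Println(mutatedByCallee())
	fmt.Println(sameAfterParse())
	// Problem 2 cannot be shown at run time: "ip1 == ip2" on two net.IP
	// values and "map[net.IP]bool" are both rejected by the compiler.
	// Callers have to use ip1.Equal(ip2) or key the map by ip.String().
}
```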

So what would the ideal IP type look like?

Brad summarizes the requirements in a table.

| Feature | Go’s net.IP |
| --- | --- |
| Immutable | ❌, slice |
| Comparable | ❌, slice |
| Small | ❌, 28-56 bytes |
| No heap allocation | ❌, slice’s underlying array |
| Supports IPv4 and IPv6 | ✅ |
| Distinguishes IPv4 from IPv6 | ❌, #37921 |
| Supports IPv6 zones | ❌, needs the separate net.IPAddr type |
| Hides implementation details | ❌, exposes the underlying type []byte |
| Interoperates with the standard library | ✅ |

What follows is a series of improvement options.

Option 1: wgcfg.IP

In April 2019, David Crawshaw submitted commit 89476f8cb5, which introduced the following wgcfg.IP type.

// Internally the IPv6 format is used.
// IPv4 addresses use the IPv4-in-IPv6 syntax.
type IP struct {
       Addr [16]byte
}

Not perfect, but it solves some of the problems, as the following table shows.

| Feature | net.IP | wgcfg.IP |
| --- | --- | --- |
| Immutable | ❌, slice | ✅ |
| Comparable | ❌, slice | ✅ |
| Small | ❌, 28-56 bytes | ✅, 16 bytes |
| No heap allocation | ❌ | ✅ |
| Supports IPv4 and IPv6 | ✅ | ✅ |
| Distinguishes IPv4 from IPv6 | ❌ | ❌ |
| Supports IPv6 zones | ❌ | ❌ |
| Hides implementation details | ❌ | ❌ |
| Interoperates with the standard library | ✅ | ❌, requires adaptation |

This solution takes up only 16 bytes and is very compact. The implementation details could be hidden from callers simply by renaming Addr to addr. However, David’s solution still cannot distinguish between IPv4 and IPv4-mapped IPv6 addresses, and it has nowhere to store zone information.
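
To make the trade-off concrete, here is a minimal sketch of a type with the same shape as wgcfg.IP. The constructor fromIPv4 is a hypothetical helper written for this example, not part of the wgcfg package.

```go
package main

import "fmt"

// IP mirrors the shape of wgcfg.IP: always 16 bytes, with IPv4 stored
// in IPv4-mapped IPv6 form.
type IP struct {
	Addr [16]byte
}

// fromIPv4 (hypothetical) stores a.b.c.d as ::ffff:a.b.c.d.
func fromIPv4(a, b, c, d byte) IP {
	var ip IP
	ip.Addr[10], ip.Addr[11] = 0xff, 0xff
	ip.Addr[12], ip.Addr[13], ip.Addr[14], ip.Addr[15] = a, b, c, d
	return ip
}

func main() {
	x := fromIPv4(192, 168, 1, 1)
	y := fromIPv4(192, 168, 1, 1)

	// A struct holding only a byte array is comparable with ==.
	fmt.Println(x == y)

	// For the same reason it works as a map key.
	seen := map[IP]bool{x: true}
	fmt.Println(seen[y])

	// The limitation: parsing "::ffff:192.168.1.1" would yield exactly
	// the same 16 bytes, so the two spellings cannot be told apart, and
	// there is no field that could hold a zone.
}
```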

So there’s a second option.

Option 2: netaddr.IP with embedded interface variables

In Go, interface variables can also be compared to each other (either using == comparisons or as the key of a map). So Brad implemented version 1 of the netaddr.IP scheme.

type IP struct {
     ipImpl
}

type ipImpl interface {
     is4() bool
     is6() bool
     String() string
}

type v4Addr [4]byte
type v6Addr [16]byte
type v6AddrZone struct {
      v6Addr
      zone string
}

This time an interface value is embedded in IP. On 64-bit platforms an interface value takes 16 bytes, so the IP type itself also takes 16 bytes. That is better than the standard library, where net.IP takes a 24-byte header plus the address content. But because an interface stores a pointer to its data, additional memory still has to be allocated for the v4Addr/v6Addr/v6AddrZone value. On the other hand, IPv6 zone support is solved this time.
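
The comparison behavior this design relies on can be sketched as follows. The types are cut down to a single method, and the helpers equalV4 and v4DiffersFromV6 are assumptions made for the example. Two interface values are equal when they hold the same dynamic type and equal values, so an IPv4 address never equals its IPv4-mapped IPv6 form.

```go
package main

import (
	"fmt"
	"unsafe"
)

type ipImpl interface {
	is4() bool
}

type v4Addr [4]byte
type v6Addr [16]byte

func (v4Addr) is4() bool { return true }
func (v6Addr) is4() bool { return false }

// IP embeds the interface, as in the first netaddr.IP design.
type IP struct {
	ipImpl
}

// equalV4 compares two separately built IPv4 addresses.
func equalV4() bool {
	a := IP{v4Addr{192, 168, 1, 1}}
	b := IP{v4Addr{192, 168, 1, 1}}
	return a == b
}

// v4DiffersFromV6 compares an IPv4 address with its IPv4-mapped IPv6 form.
// The dynamic types differ, so the values are unequal.
func v4DiffersFromV6() bool {
	a := IP{v4Addr{192, 168, 1, 1}}
	b := IP{v6Addr{10: 0xff, 11: 0xff, 12: 192, 13: 168, 14: 1, 15: 1}}
	return a != b
}

func main() {
	fmt.Println(equalV4(), v4DiffersFromV6())
	// Size of the interface-holding struct: two machine words.
	fmt.Println(unsafe.Sizeof(IP{}))
}
```

The address bytes live behind the interface's data pointer, which is why this design may need a heap allocation for each address.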

| Feature | net.IP | wgcfg.IP | Option 2 |
| --- | --- | --- | --- |
| Immutable | ❌, slice | ✅ | ✅ |
| Comparable | ❌, slice | ✅ | ✅ |
| Small | ❌, 28-56 bytes | ✅, 16 bytes | 🤷, 20-32 bytes |
| No heap allocation | ❌ | ✅ | ❌ |
| Supports IPv4 and IPv6 | ✅ | ✅ | ✅ |
| Distinguishes IPv4 from IPv6 | ❌ | ❌ | ✅ |
| Supports IPv6 zones | ❌ | ❌ | ✅ |
| Hides implementation details | ❌ | ❌ | ✅ |
| Interoperates with the standard library | ✅ | ❌, requires adaptation | ❌, requires adaptation |

Compared to wgcfg.IP, only the memory allocation problem remains unresolved. Let's keep going!

Option 3: 24-byte representation without heap memory allocation

The slice header of net.IP is 24 bytes long. time.Time is also 24 bytes. So Brad decided that the new address type should be no larger than 24 bytes.

The IPv6 address itself already takes up 16 bytes, which leaves 8 bytes to hold the following information.

  • Address type (v4, v6, null). At least two bits are needed.
  • IPv6 zone information (aka NIC name)

The interface scheme is out because an interface value takes up 16 bytes, which is too big. A string header also takes 16 bytes, so that is out as well.

Brad came up with this solution.

type IP struct {
   addr          [16]byte
   zoneAndFamily uint64
}

The task is then to store both the address type and the zone information in the zoneAndFamily field. The question is how.

If one or two bits hold the address type, that leaves 62 or 63 bits. The following options are available.

  • Use the remaining 62 bits to store 7-bit ASCII characters, which allows at most 8 characters. Too short.
  • Number the NICs and store only the index. But this only works for NICs on the local machine.
  • Use a mapping table that indexes NIC names by number. The Go standard library does this internally. However, such a table only grows and never shrinks, so it is open to attack when zone names come from untrusted input. The Go standard library only tracks local NICs, so it does not have this problem.
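
The arithmetic behind the first option is easy to check: 64 bits minus 2 family bits leaves 62, and at 7 bits per ASCII character that is room for 8 characters. The sketch below packs and unpacks such a value. The layout and the names pack and unpack are assumptions made for illustration, since this option was never adopted.

```go
package main

import "fmt"

const (
	familyBits = 2
	charBits   = 7
	maxChars   = (64 - familyBits) / charBits // 62 / 7 = 8
)

// pack stores family in the low 2 bits and up to maxChars 7-bit ASCII
// characters of zone above them. It reports false if the input does not fit.
func pack(family uint8, zone string) (uint64, bool) {
	if family > 3 || len(zone) > maxChars {
		return 0, false
	}
	v := uint64(family)
	for i := 0; i < len(zone); i++ {
		c := zone[i]
		if c == 0 || c > 0x7f {
			return 0, false // NUL marks the end; non-ASCII does not fit in 7 bits
		}
		v |= uint64(c) << (familyBits + charBits*i)
	}
	return v, true
}

// unpack reverses pack.
func unpack(v uint64) (uint8, string) {
	family := uint8(v & 3)
	var buf []byte
	for i := 0; i < maxChars; i++ {
		c := byte(v>>(familyBits+charBits*i)) & 0x7f
		if c == 0 {
			break
		}
		buf = append(buf, c)
	}
	return family, string(buf)
}

func main() {
	v, ok := pack(2, "eth0")
	fmt.Println(ok)
	fmt.Println(unpack(v))

	// A 9-character interface name such as "enp0s31f6" no longer fits.
	_, ok = pack(2, "enp0s31f6")
	fmt.Println(ok)
}
```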

Brad was not satisfied with any of these options and came up with the pointer option.

type IP struct {
    addr          [16]byte
    zoneAndFamily *T
}

For now, let’s assume that it works regardless of the type of T. Only three sentinel variables need to be declared to identify the address type.

var (
     z0 *T        // nil means the zero value (no address)
     z4 = new(T)  // sentinel for IPv4
     z6 = new(T)  // sentinel for IPv6 (no zone information)
)
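
A minimal sketch of how such sentinels can be told apart is shown below. The concrete T and the helper family are assumptions for this example. One detail worth noting: T should not be a zero-size type, because the Go specification allows pointers to distinct zero-size variables to compare equal, which would make z4 and z6 indistinguishable.

```go
package main

import "fmt"

// T is a placeholder type for this sketch. It has non-zero size so that
// every new(T) is guaranteed to yield a distinct pointer.
type T struct {
	_ byte
}

var (
	z0 *T       // nil: no address
	z4 = new(T) // sentinel for IPv4
	z6 = new(T) // sentinel for IPv6 without a zone
)

type IP struct {
	addr          [16]byte
	zoneAndFamily *T
}

// family (hypothetical helper) classifies an IP by pointer identity alone.
func family(ip IP) string {
	switch ip.zoneAndFamily {
	case z0:
		return "zero"
	case z4:
		return "IPv4"
	default:
		// z6, or any other pointer, which would carry a zone.
		return "IPv6"
	}
}

func main() {
	fmt.Println(family(IP{}))
	fmt.Println(family(IP{zoneAndFamily: z4}))
	fmt.Println(family(IP{zoneAndFamily: z6}))
	// IP still contains only comparable fields, so == keeps working.
	fmt.Println(IP{zoneAndFamily: z4} == IP{zoneAndFamily: z4})
}
```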

The next step is to consider how to save the zone information to achieve the following effect.

   ip1, _ := netaddr.ParseIP("fe80::2%eth0")
   ip2, _ := netaddr.ParseIP("fe80::2%eth0")
   fmt.Println(ip1 == ip2) // true

Simply allocating two identical strings with new returns two different pointers. Brad wanted a way to always get the same pointer back for strings that have the same value. Then two strings can be tested for equality just by comparing pointers.

So a map is needed to hold all the strings. How is this different from the indexed table rejected earlier? The biggest difference is cleanup: if a zone index (an integer) is used as the key, the corresponding map can never be cleaned up and just keeps growing. With pointers, runtime.SetFinalizer can be used to remove entries from the table during garbage collection. The result was the go4.org/intern package, whose core logic is as follows.

var (
	mu      sync.Mutex
	valMap  = map[key]uintptr{} // to uintptr(*Value)
)

// Value holds the underlying comparable value.
type Value struct {
	_      [0]func() // prevents comparing Value structs with ==
	cmpVal interface{}
	// resurrected is guarded by mu.
	// It is set to true whenever cmpVal is handed out again.
	resurrected bool
}

func GetByString(s string) *Value {
	return get(key{s: s, isString: true})
}

// get breaks the unsafe.Pointer usage rules, so it needs the
// nocheckptr directive.
//go:nocheckptr
func get(k key) *Value {
	mu.Lock()
	defer mu.Unlock()

	var v *Value
	if addr, ok := valMap[k]; ok {
		// The value already exists: mark it as resurrected.
		v = (*Value)((unsafe.Pointer)(addr))
		v.resurrected = true
	}
	if v != nil {
		return v
	}
	v = k.Value()
	// Register a finalizer. If the GC finds a finalizer on v before
	// reclaiming it, it clears the finalizer and calls it, and v can
	// only be reclaimed in a later cycle.
	runtime.SetFinalizer(v, finalize)
	// See https://pkg.go.dev/unsafe#Pointer
	// Go requires that an unsafe.Pointer converted to uintptr be
	// converted back to unsafe.Pointer in the same expression,
	// but here it is stored in valMap instead.
	valMap[k] = uintptr(unsafe.Pointer(v))
	return v
}

func finalize(v *Value) {
	mu.Lock()
	defer mu.Unlock()
	if v.resurrected {
		// Reaching this branch means another goroutine obtained v
		// again during garbage collection, so it must not be deleted.
		v.resurrected = false
		runtime.SetFinalizer(v, finalize)
		return
	}
	delete(valMap, keyFor(v.cmpVal))
}

There are two subtleties in the above code.

The first is that it disables Value comparisons. Go supports comparing structs with ==, but only if every field of the struct is comparable. Here, the blank field _ [0]func() has a non-comparable type (functions cannot be compared), so Value structs cannot be compared with each other, while the field itself takes up no space. See this article for a detailed analysis.
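
Comparing two such structs with == is a compile-time error, so it cannot be shown in a running program, but the reflect package reports the same property at run time. In this sketch the types plain and guarded and the helper isComparable are names invented for the example.

```go
package main

import (
	"fmt"
	"reflect"
)

// plain has only comparable fields, so plain values support ==.
type plain struct {
	cmpVal string
}

// guarded adds a zero-length array of funcs. Func values are not
// comparable, so neither is the array, and neither is the struct.
type guarded struct {
	_      [0]func()
	cmpVal string
}

// isComparable reports whether values of v's type may be compared with ==.
func isComparable(v interface{}) bool {
	return reflect.TypeOf(v).Comparable()
}

func main() {
	fmt.Println(isComparable(plain{}))
	fmt.Println(isComparable(guarded{}))

	// Pointers to guarded are still comparable, which is all interning needs.
	a := &guarded{cmpVal: "eth0"}
	b := a
	fmt.Println(a == b)
}
```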

The other is the garbage-collection-aware object pool valMap = map[key]uintptr{}. valMap stores each *Value as a uintptr, which acts as a weak reference: the garbage collector does not treat a uintptr as a pointer, so an entry in valMap does not keep its object alive. Once no ordinary Go pointer refers to a *Value, the GC considers it unreachable. Because every *Value has a finalizer attached, the runtime does not free it right away. It runs finalize first, and the memory can only be reclaimed in a later GC cycle. If get handed the value out again in the meantime (resurrected is true), finalize clears the flag and re-registers itself. Otherwise it deletes the entry from valMap, and the memory is actually reclaimed by a subsequent GC cycle. The full working process can be found in this article.
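
To isolate the interning idea from the weak-reference machinery, here is a deliberately simplified sketch that keeps ordinary (strong) pointers in the table. It is not how go4.org/intern is implemented: entries in this version are never freed, which is exactly the unbounded growth the finalizer trick avoids. The name table is an assumption of this example.

```go
package main

import (
	"fmt"
	"sync"
)

// Value wraps an interned string. The blank field keeps callers from
// comparing Value structs directly; they compare *Value pointers instead.
type Value struct {
	_      [0]func()
	cmpVal string
}

// Get returns the interned string.
func (v *Value) Get() string { return v.cmpVal }

var (
	mu    sync.Mutex
	table = map[string]*Value{} // strong references: never cleaned up
)

// GetByString returns the same *Value for equal strings.
func GetByString(s string) *Value {
	mu.Lock()
	defer mu.Unlock()
	if v, ok := table[s]; ok {
		return v
	}
	v := &Value{cmpVal: s}
	table[s] = v
	return v
}

func main() {
	a := GetByString("eth0")
	b := GetByString("eth0")
	c := GetByString("eth1")
	fmt.Println(a == b) // same string, same pointer
	fmt.Println(a == c) // different string, different pointer
	fmt.Println(a.Get())
}
```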

With the intern package, it is possible to achieve the following.

intern.GetByString("eth0") == intern.GetByString("eth0")

So IP can be expressed as follows.

type IP struct {
    addr [16]byte
    z    *intern.Value // zone and address family
}

var (
     z0    *intern.Value        // nil means the zero value (no address)
     z4    = new(intern.Value)  // sentinel for IPv4
     z6noz = new(intern.Value)  // sentinel for IPv6 (no zone)
)

The accessors to get/set the zone are then:

func (ip IP) Zone() string {
	if ip.z == nil {
		return ""
	}
	zone, _ := ip.z.Get().(string)
	return zone
}

func (ip IP) WithZone(zone string) IP {
	if !ip.Is6() {
		return ip
	}
	if zone == "" {
		ip.z = z6noz
		return ip
	}
	ip.z = intern.GetByString(zone)
	return ip
}

The final result is as follows.

| Feature | net.IP | netaddr.IP |
| --- | --- | --- |
| Immutable | ❌, slice | ✅ |
| Comparable | ❌, slice | ✅ |
| Small | ❌, 28-56 bytes | ✅, 24 bytes, fixed |
| No heap allocation | ❌ | ✅ |
| Supports IPv4 and IPv6 | ✅ | ✅ |
| Distinguishes IPv4 from IPv6 | ❌ | ✅ |
| Supports IPv6 zones | ❌ | ✅ |
| Hides implementation details | ❌ | ✅ |
| Interoperates with the standard library | ✅ | 🤷 |

Option 4: uint64s acceleration

The new scheme does not expose its internals, so the implementation can be changed freely. Dave Anderson took advantage of this and replaced the [16]byte with a pair of uint64s.

type IP struct {
	// hi and lo are the bits of an IPv6 address. If z==z4, hi and lo
	// contain the IPv4-mapped IPv6 address.
	//
	// hi and lo are constructed by interpreting a 16-byte IPv6
	// address as a big-endian 128-bit number. The most significant
	// bits of that number go into hi, the rest into lo.
	//
	// For example, 0011:2233:4455:6677:8899:aabb:ccdd:eeff is stored as:
	//  hi = 0x0011223344556677
	//  lo = 0x8899aabbccddeeff
	//
	// We store IPs like this, rather than as [16]byte, because it
	// turns most operations on IPs into arithmetic and bit-twiddling
	// operations on 64-bit registers, which is much faster than
	// bytewise processing.
	hi, lo uint64

	// z is a combination of the address family and the IPv6 zone.
	z *intern.Value
}
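
The big-endian split described in the comment can be reproduced with encoding/binary. This is a sketch with hypothetical helper names (split, join, isV4Mapped), not the netaddr code. It also shows why the representation is convenient: checking for the ::ffff:0:0/96 prefix becomes two integer comparisons.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// split interprets 16 address bytes as a big-endian 128-bit number and
// returns its upper and lower 64 bits.
func split(b [16]byte) (hi, lo uint64) {
	return binary.BigEndian.Uint64(b[:8]), binary.BigEndian.Uint64(b[8:])
}

// join is the inverse of split.
func join(hi, lo uint64) [16]byte {
	var b [16]byte
	binary.BigEndian.PutUint64(b[:8], hi)
	binary.BigEndian.PutUint64(b[8:], lo)
	return b
}

// isV4Mapped reports whether hi/lo encode an address in ::ffff:0:0/96.
func isV4Mapped(hi, lo uint64) bool {
	return hi == 0 && lo>>32 == 0xffff
}

func main() {
	// 0011:2233:4455:6677:8899:aabb:ccdd:eeff
	addr := [16]byte{
		0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
		0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff,
	}
	hi, lo := split(addr)
	fmt.Printf("hi = %#016x\nlo = %#016x\n", hi, lo)
	fmt.Println(join(hi, lo) == addr)

	// ::ffff:192.168.1.1
	mapped := [16]byte{10: 0xff, 11: 0xff, 12: 192, 13: 168, 14: 1, 15: 1}
	fmt.Println(isV4Mapped(split(mapped)))
}
```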

Option 5: uint128 type

Finally, in commit 318330f177, Brad replaced the uint64 pair with a custom uint128 type.

type uint128 [2]uint64

type IP struct {
	addr uint128
	z    *intern.Value
}

But the Go compiler had allocation problems with this array-based definition, so in commit bf0e22f9f3 Brad changed uint128 to a struct:

type uint128 struct {
	hi uint64
	lo uint64
}
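
Wrapping the pair in its own type gives the 128-bit arithmetic a natural home. The methods below are a sketch of the kind of operations such a type supports. The names and, mask, and addOne are assumptions for this example and are not taken from the netaddr source.

```go
package main

import (
	"fmt"
	"math/bits"
)

type uint128 struct {
	hi uint64
	lo uint64
}

// and returns the bitwise AND of u and m.
func (u uint128) and(m uint128) uint128 {
	return uint128{u.hi & m.hi, u.lo & m.lo}
}

// mask returns a value whose top n bits are set, for 0 <= n <= 128.
// ANDing an address with mask(n) keeps its /n prefix.
func mask(n uint) uint128 {
	switch {
	case n == 0:
		return uint128{}
	case n <= 64:
		return uint128{hi: ^uint64(0) << (64 - n)}
	default:
		return uint128{hi: ^uint64(0), lo: ^uint64(0) << (128 - n)}
	}
}

// addOne returns u+1, carrying from lo into hi on overflow.
func (u uint128) addOne() uint128 {
	lo, carry := bits.Add64(u.lo, 1, 0)
	return uint128{u.hi + carry, lo}
}

func main() {
	// 2001:db8:0:1::5 masked to its /32 prefix.
	addr := uint128{0x20010db800000001, 5}
	p := addr.and(mask(32))
	fmt.Printf("%#016x %#016x\n", p.hi, p.lo)

	// Incrementing across the 64-bit boundary carries into hi.
	n := uint128{0, ^uint64(0)}.addOne()
	fmt.Printf("%#016x %#016x\n", n.hi, n.lo)
}
```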

That covers the whole article. The new net/netip package will ship with the Go 1.18 release. Something to look forward to 😚.