orcaman/concurrent-map is a very efficient thread-safe map library. As its documentation says, the standard library sync.Map is more suitable for append-only scenarios or a scenario where there is a lot less writing and a lot more reading. For more reads and more writes, concurrent-map may be more advantageous. It is a way to reduce the granularity of locks by slicing, thus improving performance.

Earlier this year, this library was revamped and started supporting generic types, but unfortunately, it only supports Value value generic, its key can only be of type string, which limits its application scenarios.

I forked the project and created a new thread-safe library smallnest/safemap, which supports both Key and Value generics, and can be applied to more sh use cases.

I can understand why the key of concurrent-map does not support generics, because it is not easy to implement. Because for a generic Key, we can’t calculate its hash value very well, either convert it to string and then calculate the hash value with low performance, or provide a hash function variable and let the user implement the hash algorithm by himself, which is not convenient for the user to use.

I recently saw a new project alphadose/haxmap, which is also a thread-safe map, based on the Harris lock-free list algorithm, with reference to its method of computing the hash of Key, and implementing a patch to concurrent-map so that both Key and Value support generics, which led to the smallnest/safemap library.

Of course, in the process of replicating haxmap’s hash algorithm, it was also discovered that its hash algorithm had problems with some types, and fixes were made.

You can use safemap to handle both concurrent read and write scenarios. You can also use safemap to learn how to retrofit an existing project with generic support.

Modify Map to support generics for both its Key and Value

The definition of concurrent-map is as follows:

1
2
3
4
5
type ConcurrentMap[V any] []*ConcurrentMapShared[V]
type ConcurrentMapShared[V any] struct {
    items        map[string]V
    sync.RWMutex // Read Write mutex, guards access to internal map.
}

As you can see, it only defines V any type constraint for the type of Value, while its Key type can only be string type: items map[string]V.

The advantage of this is that it can implement only a specific hash algorithm for string types, without considering various key types.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
func fnv32(key string) uint32 {
    hash := uint32(2166136261)
    const prime32 = uint32(16777619)
    keyLength := len(key)
    for i := 0; i < keyLength; i++ {
        hash *= prime32
        hash ^= uint32(key[i])
    }
    return hash
}

Why not use the built-in map hash algorithm to calculate the hash value for all comparable types? This issue was discussed in #21195, but in the end it didn’t work out. Although the internal memhash algorithm was officially exposed in Go 1.19 Go, for the scenario in this article, we would expect a generic hash algorithm to be exposed.

Anyway, let’s assume that there is a way to implement a hash algorithm for generics ourselves. Then we now need to adapt ConcurrentMap.

I started by naming it SafeMap , which is a bit shorter and supports both Key and Value type constraints.

1
2
3
4
5
6
7
8
type SafeMap[K comparable, V any] struct {
    shared []*SafeMapShared[K, V]
    hasher func(K) uintptr
}
type SafeMapShared[K comparable, V any] struct {
    items        map[K]V
    sync.RWMutex 
}

Here we define a generic hasher: hasher func(K) uintptr, which is used to calculate the hash value of a generic Key. Here we go to implement it.

The next step is to fix the program where it is marked in red. That is, where the original hard-coded string is, we have to replace it with K, for example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
func (m *SafeMap[K, V]) GetShard(key K) *SafeMapShared[K, V] {
    k := uint(m.hasher(key))
    return m.shared[uint(k)%uint(ShardCount)]
}
// Sets the given value under the specified key.
func (m *SafeMap[K, V]) Set(key K, value V) {
    // Get map shard.
    shard := m.GetShard(key)
    shard.Lock()
    shard.items[key] = value
    shard.Unlock()
}

Receiver is to be modified because it currently has two type constraints (K, V). Anything involving Key string is to be replaced using Key K.

The hash of this is calculated using the hasher method. Next, take a look at our default hasher algorithm.

hasher

The difficulty in implementing this hasher method is to support generic types. If it is a specific type, such as int64, string, we can directly use its underlying value to calculate the hash value. If it’s interface{}, it’s easy to use type switch to handle the type. If it is a generic type, it is executed at compile time, and we can’t use type switch directly, so what should we do?

First, we go to determine the type of Key at the time of initialization, which can be achieved through reflection (first convert the Key to interface{}, and then use the type switch), and then return a specific hasher algorithm according to the different types, so that because the hasher is determined at the time of initialization, it does not affect the performance of the subsequent Set and Get methods.

Here is the fragment that generates the hasher.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
var 
    // qword hasher, key size -> 8 bytes
    qwordHasher = func(key uint64) uintptr {
        k1 := key * prime2
        k1 = bits.RotateLeft64(k1, 31)
        k1 *= prime1
        h := (prime5 + 8) ^ k1
        h = bits.RotateLeft64(h, 27)*prime1 + prime4
        h ^= h >> 33
        h *= prime2
        h ^= h >> 29
        h *= prime3
        h ^= h >> 32
        return uintptr(h)
    }
        float64Hasher = func(key float64) uintptr {
        k := *(*uint64)(unsafe.Pointer(&key))
        h := prime5 + 4
        h ^= uint64(k) * prime1
        h = bits.RotateLeft64(h, 23)*prime2 + prime3
        h ^= h >> 33
        h *= prime2
        h ^= h >> 29
        h *= prime3
        h ^= h >> 32
        return uintptr(h)
    }

    
func genHasher[K comparable]() func(K) uintptr {
    var hasher func(K) uintptr
    switch any(*new(K)).(type) {
    case int64, uint64:
        hasher = *(*func(K) uintptr)(unsafe.Pointer(&qwordHasher))
    case float64:
        hasher = *(*func(K) uintptr)(unsafe.Pointer(&float64Hasher))

First convert the value of the generic to interface{} so that type switch: switch any(*new(K)). (type) {.

If it is int64, uint64 type, we can use qwordHasher which is a predefined hasher, but its type is func(key uint64) uintptr, for int64 type, you can use unsafe to force the conversion.

This way, the hardest point is overcome, and each type can be handled independently, and it is only necessary to further define the corresponding hasher for each type. You can use various hash algorithms to achieve this.

That’s it, safemap is done. I recommend that you actually test the performance, and preferably according to your actual scenario. Some basic benchmark tests are provided in the project.

Also, I suggest you pay attention to alphadose/haxmap, my initial tests it will perform better, but I need to wait for it to develop for a while and become more stable.

Reference