2021 is almost over, Go 1.18's features have been frozen, and the US is about to enter holiday mode, so let's take this opportunity to review the progress of Go concurrent programming over the last year.

TryLock is finally being released

The proposal to add a TryLock method to Mutex dates back a long way (to 2013, #6123), when it was flatly rejected by the maintainers. The request resurfaced intermittently over the years, and now, in 2021, the Go team has finally relented and added the corresponding methods (#45435).

In a nutshell, Mutex gains a TryLock method that attempts to acquire the lock, and RWMutex gains TryLock and TryRLock methods that attempt to acquire the write lock and the read lock respectively. All of them return a bool: true means the corresponding lock was acquired, false means it was not.

These methods are not much trouble to implement, so let's look at the corresponding code (with the race-detector code removed).

First is the Mutex.TryLock:

func (m *Mutex) TryLock() bool {
	if atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked) {
		return true
	}
	return false
}

That is, it uses atomic.CAS to manipulate the state field: the lock is acquired successfully only if it is not currently locked and there are no waiters. It makes no attempt to spin or compete with the waiters.

Don't complain about the style of the code above; you may think it should be written as follows instead. The reason it isn't is that I removed the race-detector code: those blocks contain race calls, so the real implementation cannot be abbreviated like this.

func (m *Mutex) TryLock() bool {
	return atomic.CompareAndSwapInt32(&m.state, 0, mutexLocked)
}

RWMutex is a bit trickier, because it has both a read lock and a write lock.

First look at RWMutex.TryLock (with the race code removed):

func (rw *RWMutex) TryLock() bool {
	if !rw.w.TryLock() {
		return false
	}
	if !atomic.CompareAndSwapInt32(&rw.readerCount, 0, -rwmutexMaxReaders) {
		rw.w.Unlock()
		return false
	}
	return true
}

TryLock first tries to acquire the lock on the w field. If that succeeds, it checks the current readers: if there are none, the attempt succeeds; if, unfortunately, some reader has not yet released the read lock, then the attempt fails and returns false. Note that the rw.w lock must be released before returning in that case.

TryRLock (with the race code removed):

func (rw *RWMutex) TryRLock() bool {
	for {
		c := atomic.LoadInt32(&rw.readerCount)
		if c < 0 {
			return false
		}
		if atomic.CompareAndSwapInt32(&rw.readerCount, c, c+1) {
			return true
		}
	}
}

This code first checks readerCount; if it is negative, a writer is present, so it returns false directly.

If there is no writer, it uses atomic.CAS to add 1 to the reader count, and returns true on success. If the CAS fails, another reader or a writer may have arrived in the meantime; since it cannot tell which, it loops and tries again.

If a writer arrived, c will be negative on the next iteration and false is returned directly; if it was just another reader, the CAS is simply retried to add 1.

That is all the new code, and it is not particularly complex. The Go team reluctantly added these methods, and attached a very pointed tip (warning) in the documentation:

Note that while correct uses of TryLock do exist, they are rare, and use of TryLock is often a sign of a deeper problem in a particular use of mutexes.

Field changes for WaitGroup

Previously, the WaitGroup type used [3]uint32 as the type of its state1 field, and the meaning of its bytes differed between 64-bit and 32-bit platforms, mainly for alignment purposes. Although this single-field trick was very "clever", it was hard to read. The Go team has now split it into two fields: under the alignment rules, 64-bit compilers align the 64-bit field naturally, and honestly, we are not short of those 4 bytes.

type WaitGroup struct {
	noCopy noCopy
	// 64-bit value: high 32 bits are counter, low 32 bits are waiter count.
	// 64-bit atomic operations require 64-bit alignment, but 32-bit
	// compilers only guarantee that 64-bit fields are 32-bit aligned.
	// For this reason on 32 bit architectures we need to check in state()
	// if state1 is aligned or not, and dynamically "swap" the field order if
	// needed.
	state1 uint64
	state2 uint32
}
// state returns pointers to the state and sema fields stored within wg.state*.
func (wg *WaitGroup) state() (statep *uint64, semap *uint32) {
	if unsafe.Alignof(wg.state1) == 8 || uintptr(unsafe.Pointer(&wg.state1))%8 == 0 {
		// state1 is 64-bit aligned: nothing to do.
		return &wg.state1, &wg.state2
	} else {
		// state1 is 32-bit aligned but not 64-bit aligned: this means that
		// (&state1)+4 is 64-bit aligned.
		state := (*[3]uint32)(unsafe.Pointer(&wg.state1))
		return (*uint64)(unsafe.Pointer(&state[1])), &state[0]
	}
}

The meaning of state1 and state2 is clear when state1 is 64-bit aligned; when it is not (which can happen on 32-bit platforms), there is still a clever conversion that reinterprets the two fields as a [3]uint32 and picks the 64-bit-aligned half.

Use fastrandn instead of fastrand in Pool

The Go runtime provides a fastrandn function, which is much faster than fastrand() % n; the related article is linked in the comment in the code below.

//go:nosplit
func fastrand() uint32 {
	mp := getg().m
	// Implement wyrand: https://github.com/wangyi-fudan/wyhash
	if goarch.IsAmd64|goarch.IsArm64|goarch.IsPpc64|
		goarch.IsPpc64le|goarch.IsMips64|goarch.IsMips64le|
		goarch.IsS390x|goarch.IsRiscv64 == 1 {
		mp.fastrand += 0xa0761d6478bd642f
		hi, lo := math.Mul64(mp.fastrand, mp.fastrand^0xe7037ed1a0b428db)
		return uint32(hi ^ lo)
	}
	// Implement xorshift64+
	t := (*[2]uint32)(unsafe.Pointer(&mp.fastrand))
	s1, s0 := t[0], t[1]
	s1 ^= s1 << 17
	s1 = s1 ^ s0 ^ s1>>7 ^ s0>>16
	t[0], t[1] = s0, s1
	return s0 + s1
}
//go:nosplit
func fastrandn(n uint32) uint32 {
	// This is similar to fastrand() % n, but faster.
	// See https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
	return uint32(uint64(fastrand()) * uint64(n) >> 32)
}

So sync.Pool now uses fastrandn as a small optimization. Impressive dedication: even this tiny bit of performance gets squeezed out, and the kicker is that this code path only runs when the race detector is enabled.
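fastrandn itself is runtime-internal, but Lemire's reduction is easy to reproduce in user code (a minimal sketch of my own): multiply the 32-bit random value by n and keep the high 32 bits, which maps it into [0, n) without the slower % operator.

```go
package main

import "fmt"

// reduce maps a uniform 32-bit value x into [0, n) using Lemire's
// multiply-and-shift trick, the same reduction fastrandn uses.
func reduce(x, n uint32) uint32 {
	return uint32(uint64(x) * uint64(n) >> 32)
}

func main() {
	n := uint32(10)
	// Low, middle, and maximal inputs land evenly across [0, 10).
	for _, x := range []uint32{0, 1 << 31, ^uint32(0)} {
		fmt.Println(reduce(x, n))
	}
	// Output:
	// 0
	// 5
	// 9
}
```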

Value adds two convenience methods, Swap and CompareAndSwap

If you use atomic.Value, you have probably hand-written the logic of these two methods yourself; they have now been added to the standard library.

func (v *Value) Swap(new interface{}) (old interface{}) 
func (v *Value) CompareAndSwap(old, new interface{}) (swapped bool)

Go 1.18 implements generics, but the corresponding library changes are likely to land in future versions. Once generics arrive, atomic support for concrete types will be greatly enhanced, so the Value type may recede into history and be rarely used in the future. (See Russ Cox's article Updating the Go Memory Model.)

Overall, Go’s concurrency-related libraries are relatively stable and have not changed significantly.