Why you shouldn't accept code with data races

Data races are a problem in concurrent programming in any language. Modern languages take two approaches to it: one is to stop the user from writing racy code as far as possible through ownership plus Sync/Send, as Rust does, and the other is to detect data races during testing with a race detector, as Go does.

The design of Go’s race detector makes it impractical to turn on in production, and many companies don’t actually run race tests before going live. This leads some Gophers to think that writing racy code is fine because the result is “eventually consistent”.

This idea is refuted in the official Go document The Go Memory Model, which gives the following example.

package main

var a string
var done bool

func setup() {
	a = "hello, world"
	done = true
}

func main() {
	go setup()
	for !done {
	}
	print(a)
}

In the source code, the write to the global variable a comes before the write to done. But because the code uses no synchronization at all (not even atomic operations), there is no guarantee that it prints “hello, world” after the for loop sees that done has become true.

So far the problem comes only from reordering by the CPU and the compiler. The programmers mentioned above would answer: I don’t care about the order, I only care about eventual consistency. Sooner or later done will definitely become true and a will definitely become hello, world.

This idea is also wrong. CPUs in a multi-core environment have multiple levels of cache, and without any synchronization (not even atomics) nothing obliges the compiler or the hardware to make a write on one core visible to a goroutine running on another. The Go Memory Model says as much: there is no guarantee that the write to done will ever be observed by main, so the loop is not guaranteed to finish. In the worst case, done = true is written on one core and never read on the other, and the failure is hard to reproduce. An optimizing compiler is also free to load done only once and never look at it again.
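For contrast, here is a minimal sketch of one way to synchronize the same program, using a channel in place of the done flag. The function name message is made up for this illustration and is not from The Go Memory Model; a mutex or sync/atomic could be used instead.

```go
package main

import "fmt"

// message is a hypothetical helper: it performs the write to a in another
// goroutine and uses a channel, not a plain bool, to signal completion.
func message() string {
	var a string
	done := make(chan struct{})

	go func() {
		a = "hello, world"
		// Closing the channel is synchronized before the receive below,
		// so the write to a above is visible once the receive returns.
		close(done)
	}()

	<-done // blocks instead of spinning on an unsynchronized flag
	return a
}

func main() {
	fmt.Println(message())
}
```

Because the receive from done cannot return before the channel is closed, and the write to a comes before the close, the read of a is ordered after the write and the program has no data race.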

Building or running the official example with the -race flag is enough to catch the problem right away.

~/test git:master ❯❯❯ go run -race ./r.go
==================
WARNING: DATA RACE
Write at 0x0000011506a1 by goroutine 6:
  main.setup()
      /Users/xargin/test/r.go:8 +0x73

Previous read at 0x0000011506a1 by main goroutine:
  main.main()
      /Users/xargin/test/r.go:13 +0x3e

Goroutine 6 (running) created at:
  main.main()
      /Users/xargin/test/r.go:12 +0x32
==================
==================
WARNING: DATA RACE
Read at 0x000001121bf0 by main goroutine:
  main.main()
      /Users/xargin/test/r.go:15 +0x53

Previous write at 0x000001121bf0 by goroutine 6:
  main.setup()
      /Users/xargin/test/r.go:7 +0x30

Goroutine 6 (finished) created at:
  main.main()
      /Users/xargin/test/r.go:12 +0x32
==================
hello, worldFound 2 data race(s)
exit status 66

If you allow such code into your project, the number of these errors will keep growing. When you later need to track down a bug caused by concurrency, a report listing hundreds of races will be technical debt, and by then it will be too late to pay it back.

If you can, integrate race tests into your CI environment as well. Junior engineers are especially good at producing this kind of code.
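As a rough sketch of what a CI race test can exercise, the program below calls a small mutex-protected counter from several goroutines. Counter and hammer are hypothetical names invented for this example. In a real project the same logic would normally live in a _test.go file and run in CI with go test -race ./..., so that removing the lock would be reported as a data race.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical type used only for illustration.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc increments the counter while holding the mutex.
func (c *Counter) Inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// Value returns the current count while holding the mutex.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// hammer calls Inc from several goroutines at once and returns the total.
// Run under the race detector, this is the access pattern that would be
// flagged if Inc touched c.n without the lock.
func hammer(goroutines, perGoroutine int) int {
	var c Counter
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				c.Inc()
			}
		}()
	}
	wg.Wait() // all increments have completed before the final read
	return c.Value()
}

func main() {
	fmt.Println(hammer(8, 1000))
}
```

The race detector only reports races that actually occur during a run, so tests like this are most useful when they drive the concurrent code paths the project really uses.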

Once code with races is allowed into the master branch, more racy code will enter the project over time.

When you have spent a week failing to locate an intermittent concurrency problem in production, packing up and quitting may start to look like the only way out.

To write concurrency-related code, you still need to learn about concurrency. I recommend two related books: “Shared Memory Synchronization” and “perfbook”.