Go is popular for its simplicity and pleasant concurrency support, but the code we write still often contains data races. Fortunately, Go can usually help us find them, and this article explains how it does so.
What is a Data Race
In concurrent programming, a data race occurs when two or more threads (or goroutines, or processes) access the same shared memory location at the same time without proper synchronization, and at least one of those accesses is a write.
Data races can lead to unpredictable program behavior, including crashes, corrupted state, deadlocks, and infinite loops, so avoiding them is an important problem in concurrent programming. A common solution is to use synchronization mechanisms, such as locks and semaphores, to ensure that accesses to shared variables by different threads are ordered.
How Go detects data races
Go’s own toolchain provides several ways to detect data races, namely:
- Detecting a Data Race when testing:
go test -race mypkg
- Detecting a Data Race when building:
go build -race mycmd
- Detecting a Data Race when installing:
go install -race mypkg
- Detecting a Data Race when running:
go run -race mysrc.go
When it detects a Data Race, Go reports an error. Although the report contains a lot of content, there are usually only two keywords you need to look at:
- Read by goroutine: identifies the code that read the contested block of memory when the Data Race occurred;
- Previous write by goroutine: identifies the code that previously wrote the same block of memory;
With these you can analyze why the Data Race occurs. Once you have identified the cause, the following solutions are commonly used:
- Use a synchronization mechanism: this is the simplest solution. You can directly use synchronization mechanisms such as locks and semaphores, which ensure that accesses to shared variables by different threads are ordered, thus avoiding Data Race problems.
- Use atomic operations: atomic operations are special operations that guarantee access to a shared variable is indivisible, thus avoiding Data Race problems. For example, Go’s sync/atomic package provides atomic types such as atomic.Uint64.
- Use concurrency-safe data structures: some data structures, such as thread-safe queues and hash tables, can avoid Data Race problems. These structures embed a synchronization mechanism abstracted behind the data type, but you should still avoid having multiple threads modify the same node or element at once.
How Go implements detection
Now that we know what a Data Race is and how it is commonly solved, the next step is to understand how Go detects one. From the definition, a Data Race requires two parties accessing the same memory concurrently, with at least one write and no ordering between the accesses, and Go’s detection is built directly on this principle.
Go detects Data Races in a way similar to locking: it marks the memory before and after each access. When race detection (-race) is enabled, the Go compiler instruments the original code so that every memory operation is wrapped with extra calls.
These calls are equivalent to tagging each memory operation before and after it runs. A dedicated component, the Race Detector, then uses the tags to detect conflicting accesses to the same block of memory. The overall process is:
- Record accesses in Shadow Memory: the Race Detector uses a data structure called Shadow Memory to store the metadata of memory accesses. For each memory address, Shadow Memory records the two most recent access operations, including the type of operation (read or write), the goroutine that performed it, and the moment it occurred.
- Check for concurrent accesses: When Race Detector detects a memory access operation, it checks the Shadow Memory records associated with that operation. If one of the following conditions is found, then a data race is considered to exist:
- The current operation is a write operation and occurs concurrently (i.e., there is no happens-before relationship) with one of the two most recent access operations (either read or write), and the two access operations are from different Goroutines.
- The current operation is a read operation and occurs concurrently with one of the most recent write operations (i.e., there is no happens-before relationship), and the two operations are from different Goroutines.
- Report Data Race: When a data race is detected, Race Detector generates a detailed report including information about where the data race occurred, the Goroutine involved, and the stack trace.
Different memory handling
The above is the general idea, but Go’s memory comes in different kinds, such as heap memory and stack memory. Their access scopes differ, and so do the conditions under which a Data Race can occur (heap memory is far more likely to trigger one; stack memory rarely does). The detector therefore handles different kinds of memory differently:
- Global variables and heap memory: for global variables and memory allocated on the heap, the compiler typically inserts data race detection code for all read and write operations, because such memory can be shared among multiple Goroutines and is therefore more prone to data races;
- Stack memory: for stack memory (e.g., local variables), the compiler may adopt a different strategy. Since stack memory is usually local to a Goroutine and has an independent lifecycle between function calls, stack accesses are in many cases not prone to data races. However, when a stack address is shared with other Goroutines (e.g., through pointer passing or closure capture), the compiler needs to insert data race detection code for those accesses.
- Optimization and elimination: the compiler may optimize away data race detection for certain memory accesses. For example, by analyzing the code’s static information and runtime behavior, it can identify accesses that cannot be involved in a data race and avoid inserting detection code for them. This optimization reduces the performance cost, but may also make detection incomplete in some extreme cases.
- Atomic operations and synchronization primitives: the compiler inserts special detection code for atomic operations (e.g. atomic.LoadUint64) and synchronization primitives (e.g. channels). This code helps the Race Detector model the program’s synchronization behavior accurately and thus detect data races more reliably.
The Race Detector’s goroutine model
The Race Detector is tightly integrated with the Go runtime to detect data races in real time during program execution. At runtime, the various functions of Race Detector are distributed across multiple Goroutines and threads. I’ve summarized some of the more commonly used components below:
- Stub Code: As mentioned earlier, the Go compiler inserts additional code at memory access and synchronization operations to communicate with Race Detector. This inserted code is executed during program runtime, directly in the Goroutine where the memory access and synchronization operations occur.
- Shadow Memory: the Race Detector uses Shadow Memory to store metadata for memory accesses. Shadow Memory is managed and updated in the Goroutine where the memory access occurs, keeping the metadata current.
- Data Race Detection: Race Detector’s data race detection logic is usually performed in the Goroutine where the memory access operation occurs. This means that when a Goroutine performs a memory access operation, Race Detector checks for data races in the same Goroutine.
- Reporting and Diagnostics: When Race Detector detects a data race, it generates a detailed report including information about where the data race occurred, the Goroutine involved, and the stack trace. Report generation may involve multiple Goroutines and threads, as it needs to collect and organize various contextual information.
We can expand on the fourth point, reporting and diagnostics, since it involves coordination among multiple goroutines:
- When a Goroutine performs a memory access, the Race Detector checks whether any other Goroutine races with it. This is done by analyzing the metadata in Shadow Memory, which contains the access history and synchronization relationships for each memory address.
- If a data race is detected, Race Detector collects information about the competing Goroutine, including its ID, stack trace, the memory address where the competition occurred, and the associated source code location.
- Race Detector can detect multiple data race events at the same time. For each event, it generates a separate report. The report will contain detailed information about the competing Goroutine to help developers understand and resolve the problem.
- Finally, at the end of program execution or when a data race is detected, Race Detector prints all the reports collected to the standard error output (stderr). Each report is displayed individually and in the order in which it occurred. This allows developers to view and analyze data race events one by one.
In this article, I have described the usage and rationale of Go’s Data Race detection mechanism. It is important to note that while Go’s race detection is easy to enable, and the compiler and Race Detector detect as many data races on memory accesses as they can, they cannot guarantee 100% accuracy and completeness. Developers therefore still need to follow good concurrent-programming practices to ensure that synchronization is done correctly and safely.
And as the preceding description suggests, enabling Data Race detection necessarily increases memory usage and slows execution: according to the official documentation, memory usage may increase by 5-10x and execution time by 2-20x.