Channel Basics

Channels come in two directions: read and write. If you use many channels in a real project, keep in mind that reading from and writing to channels anywhere in the code can lead to behavior that is hard to predict. To keep the team from abusing channels, I usually restrict each piece of code to either only writing or only reading. If the two are mixed, the code becomes very hard to debug and very hard for reviewers to read and understand.

Write(chan<- int)
Read(<-chan int)
ReadWrite(chan int)

Telling read from write is easy: look at where the <- symbol is placed. In chan<- the arrow points into the channel, so it is write-only; in <-chan the arrow leaves the channel, so it is read-only. If a func needs to both read and write, the type carries no arrow at all, but I suggest splitting the read and write logic into separate funcs, which makes the code much easier to read.
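As a minimal sketch of that split, the example below keeps writing and reading in separate funcs. The names produce and sum are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// produce can only send on out; trying to receive from it would not compile.
func produce(out chan<- int, n int) {
	for i := 0; i < n; i++ {
		out <- i
	}
	close(out) // the sending side closes the channel when it is done
}

// sum can only receive from in; trying to send on it would not compile.
func sum(in <-chan int) int {
	total := 0
	for v := range in { // the loop ends once the channel is closed and drained
		total += v
	}
	return total
}

func main() {
	ch := make(chan int) // bidirectional; converted implicitly at each call
	go produce(ch, 5)
	fmt.Println(sum(ch)) // prints 10 (0+1+2+3+4)
}
```

The same chan int value is passed to both funcs; the parameter types alone decide which operations the compiler allows inside each one.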

Communicating by sharing memory

In fact, similar examples can be found in any language; here we use Go.

package main

import (
    "fmt"
    "sync"
)

func addByShareMemory(n int) []int {
    var ints []int
    var wg sync.WaitGroup

    wg.Add(n)
    for i := 0; i < n; i++ {
        go func(i int) {
            defer wg.Done()
            ints = append(ints, i)
        }(i)
    }

    wg.Wait()
    return ints
}

func main() {
    foo := addByShareMemory(10)
    fmt.Println(len(foo))
    fmt.Println(foo)
}

Run this code directly on your computer. ints should end up with 10 values, but you will find that the result is different on every run. The reason is that ints is declared as []int inside the func and shared by all the goroutines, which may read and write the same memory address at the same time (a data race), so some appends overwrite each other. Running the program with go run -race reports the race. When goroutines read and write variables, try not to share them through shared memory; errors of this kind are really hard to debug.

The following are a few solutions, some of which are not suitable for use in the project. The first one is to modify GOMAXPROCS.

package main

import (
    "fmt"
    "runtime"
    "sync"
)

func init() {
    runtime.GOMAXPROCS(1)
}

By default, Go runs goroutines in parallel across the available CPUs. GOMAXPROCS sets how many CPUs may execute Go code at the same time; with a value of 1 the goroutines no longer run in parallel, so in practice the result comes out as expected. But who would set GOMAXPROCS like this in a real application? Another way is to solve it with the sync package.

func addByShareMemory(n int) []int {
    var ints []int
    var wg sync.WaitGroup
    var mux sync.Mutex

    wg.Add(n)
    for i := 0; i < n; i++ {
        go func(i int) {
            defer wg.Done()
            mux.Lock()
            ints = append(ints, i)
            mux.Unlock()
        }(i)
    }

    wg.Wait()
    return ints
}

As long as every read and write of ints is preceded by Lock, concurrent appends cannot overwrite each other; call Unlock once the write is done. To make it simpler, you can use defer to Unlock. Because the work is handed to goroutines that run in parallel, a WaitGroup is needed to make sure all goroutines have finished writing before the func returns.
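The defer variant mentioned above could look like the following sketch. The safeInts type, its add method, and the func name addByShareMemoryDefer are hypothetical; they only wrap the same Lock and append steps so that defer has a small func to run at the end of.

```go
package main

import (
	"fmt"
	"sync"
)

// safeInts is a hypothetical wrapper that keeps the slice and its lock together.
type safeInts struct {
	mux  sync.Mutex
	ints []int
}

// add appends v while holding the lock; defer releases it when add returns.
func (s *safeInts) add(v int) {
	s.mux.Lock()
	defer s.mux.Unlock()
	s.ints = append(s.ints, v)
}

func addByShareMemoryDefer(n int) []int {
	var s safeInts
	var wg sync.WaitGroup

	wg.Add(n)
	for i := 0; i < n; i++ {
		go func(i int) {
			defer wg.Done()
			s.add(i)
		}(i)
	}

	wg.Wait() // every goroutine has finished writing before we read s.ints
	return s.ints
}

func main() {
	foo := addByShareMemoryDefer(10)
	fmt.Println(len(foo)) // always 10; the order of the values still varies
	fmt.Println(foo)
}
```

Putting the lock next to the data it protects makes it harder to append to the slice without holding the lock.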

Share memory by communicating

The example above uses sync.WaitGroup and sync.Mutex so that the goroutines produce the correct data. Can a channel avoid the problem described above?

func addByShareCommunicate(n int) []int {
    var ints []int
    channel := make(chan int, n)

    for i := 0; i < n; i++ {
        go func(channel chan<- int, order int) {
            channel <- order
        }(channel, i)
    }

    for i := range channel {
        ints = append(ints, i)

        if len(ints) == n {
            break
        }
    }

    close(channel)

    return ints
}

In the example above, only a single for loop writes to the ints variable. This is what sharing memory by communicating means: a channel carries the values, instead of a variable being handed to the goroutines for sharing and causing errors.

The goroutines write to the channel first; their first parameter is typed chan<- int so that a goroutine can only write to the channel and cannot read data out of it. A for loop then reads the values out of the channel one by one. With a channel there is no need for the WaitGroup and Lock mechanisms; just make sure to close the channel once it has been fully read.
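Because the number of values is known in advance, the receiving side can also simply count to n instead of ranging over the channel and breaking out. The sketch below is a variation on the func above, not a replacement for it; the name collectN is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// collectN starts n goroutines that each send one value, then receives
// exactly n values. Only this func ever touches the ints slice.
func collectN(n int) []int {
	ch := make(chan int, n)

	for i := 0; i < n; i++ {
		go func(out chan<- int, order int) {
			out <- order
		}(ch, i)
	}

	ints := make([]int, 0, n)
	for i := 0; i < n; i++ {
		ints = append(ints, <-ch) // blocks until a value is available
	}

	close(ch) // all n sends have completed, so closing is safe here
	return ints
}

func main() {
	ints := collectN(5)
	sort.Ints(ints)   // arrival order varies between runs, so sort before printing
	fmt.Println(ints) // prints [0 1 2 3 4]
}
```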

Benchmark

We benchmarked the above two implementations and the results are as follows:

goos: darwin
goarch: amd64
BenchmarkAddByShareMemory-8        31131   38005 ns/op  2098 B/op  11 allocs/op
BenchmarkAddByShareCommunicate-8   22915   51837 ns/op  2936 B/op   9 allocs/op
PASS

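The results show that the first approach, WaitGroup + Lock, is actually faster than the channel version: 38005 ns/op versus 51837 ns/op. It also uses less memory per operation (2098 B/op versus 2936 B/op), despite making two more allocations.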
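The benchmark code itself is not shown here. One way such a comparison could be reproduced is sketched below with testing.Benchmark from the standard library. The input size of 100 and the small harness in main are assumptions, so the numbers it prints will differ from the ones above.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// addByShareMemory is the WaitGroup + Mutex version from this article.
func addByShareMemory(n int) []int {
	var ints []int
	var wg sync.WaitGroup
	var mux sync.Mutex

	wg.Add(n)
	for i := 0; i < n; i++ {
		go func(i int) {
			defer wg.Done()
			mux.Lock()
			ints = append(ints, i)
			mux.Unlock()
		}(i)
	}
	wg.Wait()
	return ints
}

// addByShareCommunicate is the channel version from this article.
func addByShareCommunicate(n int) []int {
	var ints []int
	channel := make(chan int, n)

	for i := 0; i < n; i++ {
		go func(channel chan<- int, order int) {
			channel <- order
		}(channel, i)
	}
	for i := range channel {
		ints = append(ints, i)
		if len(ints) == n {
			break
		}
	}
	close(channel)
	return ints
}

// size is an assumed input size; the article does not state the value it used.
const size = 100

func main() {
	cases := []struct {
		name string
		fn   func(int) []int
	}{
		{"AddByShareMemory", addByShareMemory},
		{"AddByShareCommunicate", addByShareCommunicate},
	}
	for _, c := range cases {
		fn := c.fn
		res := testing.Benchmark(func(b *testing.B) {
			b.ReportAllocs() // include B/op and allocs/op in the result
			for i := 0; i < b.N; i++ {
				fn(size)
			}
		})
		fmt.Printf("%-24s %s %s\n", c.name, res.String(), res.MemString())
	}
}
```

In a real project the two benchmark bodies would more commonly live in a _test.go file as BenchmarkAddByShareMemory and BenchmarkAddByShareCommunicate and be run with go test -bench=. -benchmem.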

Conclusion

My personal opinion comes down to the following two points:

  • Unless you need to exchange messages between goroutines, the usual WaitGroup + Lock is the better choice. Don’t force channels into your project just because they’re trendy. In many cases, a plain slice or callback works better.
  • As mentioned above, if you use a large number of goroutines that need to exchange data along the way, use channels to communicate.