The difference between function return values and pointers in Golang

Variable memory allocation and recycling

Go programs allocate memory for variables in two places: the global heap and the function call stack. Go has a garbage collector, and the compiler decides whether a variable lives on the heap or on the stack, so developers rarely need to think about where a variable is allocated. To write high quality code, however, it helps to understand the implementation behind the language: the stack and the heap allocate variables through completely different mechanisms, and the cost of allocating and reclaiming a variable differs greatly between the two.

Difference between heap and stack

Heap

Memory that is allocated dynamically while the program runs lives in the heap, which is managed by the memory allocator, and the size of this area changes as the program runs. When we request memory and the allocator finds that the heap does not have enough, it asks the operating system kernel to expand the heap toward higher addresses. When we release memory back to the heap and the allocator finds that too much of it is free, it asks the operating system to shrink the heap back toward lower addresses. This request and release cycle shows that memory taken from the heap must be returned once it is no longer needed. Otherwise the allocator keeps asking the operating system to expand the heap, more and more heap memory is used, and the program eventually runs out of memory. This is called a memory leak.

Traditional C/C++ code has to allocate and release memory manually. Go has a garbage collector that reclaims memory on the heap, so the programmer only has to request memory and does not have to release it. This greatly reduces the mental burden on the programmer, which improves productivity and, more importantly, avoids many bugs.
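To make the "request memory, let the collector release it" idea concrete, here is a minimal sketch that counts heap allocations around a piece of code using runtime.ReadMemStats. The helper name mallocsDuring and the package-level sink variable are made up for this illustration; storing the slice in a package-level variable is simply a way to make sure it is heap allocated.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink is a package-level variable; anything stored here must live on the heap.
var sink []byte

// mallocsDuring (hypothetical helper) reports how many heap objects were
// allocated while f ran, based on the cumulative MemStats.Mallocs counter.
func mallocsDuring(f func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	f()
	runtime.ReadMemStats(&after)
	return after.Mallocs - before.Mallocs
}

func main() {
	n := mallocsDuring(func() { sink = make([]byte, 1<<20) })
	fmt.Println("heap allocations observed:", n)

	// There is no free(): dropping the last reference is enough, and the
	// garbage collector reclaims the memory during a later collection.
	sink = nil
	runtime.GC()
}
```

The counter can include allocations made by other goroutines in the meantime, so it is better read as "at least this many" than as an exact figure.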

Stack

The function call stack, referred to simply as the stack, plays a very important role while a program runs, both during the execution of a function and across function calls. It is mainly used to:

store the local variables of a function.
pass parameters to the called function.
return the return value of a function.
hold the return address of a function, that is, the address of the instruction the caller should continue with after the called function returns.

Each function needs a piece of stack memory to store these values during execution, and we call this piece of stack memory the stack frame of the function. When a function call occurs, the caller has not finished executing and the data in its stack frame is still needed, so the called function cannot overwrite the caller's stack frame. Instead, the called function's stack frame is "pushed" onto the stack and "popped" off again once the called function has finished executing. The stack therefore grows as the call depth increases and shrinks as functions return: the deeper the calls are nested, the more stack space is consumed.

The growth and shrinkage of the stack is automatic and is carried out by code the compiler inserts. Memory used by local variables on the stack is allocated when the function is called and released when it returns, so the programmer never has to release it, whether the language has garbage collection or not. This is quite different from memory allocated on the heap.
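As a small illustration of frames being pushed and popped, the sketch below recurses to a chosen depth. Every active call keeps its own copy of n in its own stack frame until the call beneath it returns. The function name sumTo is made up for this example.

```go
package main

import "fmt"

// sumTo adds the numbers 1..n recursively. Each call has to remember its
// own n while the deeper call runs, so n live frames exist at the deepest point.
func sumTo(n int64) int64 {
	if n == 0 {
		return 0
	}
	return n + sumTo(n-1)
}

func main() {
	fmt.Println(sumTo(4))      // 10
	fmt.Println(sumTo(100000)) // 5000050000
}
```

A depth of 100000 works in Go because the runtime grows the goroutine's stack on demand, and all of that stack memory is released again as the calls return, without any explicit cleanup.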

Go stack

The process is the basic unit of resource allocation for the operating system. At startup the operating system gives each process a stack with a fixed maximum size; on Linux the default can be viewed with ulimit -s. Memory allocated on the stack is reclaimed automatically when the function returns, simply by adjusting the stack pointer register. Heap memory, by contrast, is requested from the operating system while the process is running, and the amount available depends on how much memory the operating system currently has free. Note that Go code runs on goroutine stacks managed by the Go runtime, which start small and grow on demand; this is why the assembly listing later in this article contains a call to runtime.morestack_noctxt.

So how does the compiler decide whether to allocate variables on the heap or the stack in Go?

Variable memory allocation escape analysis

As mentioned above, it is up to the compiler to decide whether to allocate variables on the heap or the stack in Go, and the way the compiler decides where to allocate memory is called escape analysis.

When a local variable is declared in a Go function, the compiler allocates it on the stack if it can prove that no reference to the variable outlives (escapes) the function; otherwise the variable is allocated on the heap. Escape analysis is performed by the compiler at compile time.
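The rule can be sketched with two tiny functions. The type point and the function names are invented for this illustration, and the comments describe what escape analysis is expected to conclude when inlining is disabled, not a guaranteed outcome for every compiler version.

```go
package main

import "fmt"

type point struct{ X, Y int }

// sumLocal uses p only inside the function, so nothing refers to p after
// the return and the compiler is free to keep it on the stack.
func sumLocal() int {
	p := point{X: 1, Y: 2}
	return p.X + p.Y
}

// newPoint returns the address of p. The caller still holds that address
// after newPoint returns, so p escapes and has to live on the heap.
func newPoint() *point {
	p := point{X: 3, Y: 4}
	return &p
}

func main() {
	fmt.Println(sumLocal(), *newPoint()) // 3 {3 4}
}
```

Both functions are valid Go: returning the address of a local variable is safe precisely because the compiler moves such variables to the heap.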

Check whether the variable is allocated on the stack or the heap

There are two ways to determine whether a variable allocates memory on the heap or on the stack:

inspect the generated assembly: code that allocates a variable on the heap calls the runtime package's newobject function.
pass compiler options that print optimization decisions at compile time, so that the compiler reports which variables escape.

The variable in the following code example is checked for escape using both methods.

package main

type demo struct {
	Msg string
}

func example() *demo {
	d := &demo{}
	return d
}

func main() {
	example()
}

1. Check for escape through the generated assembly

$ go tool compile -S main.go
"".example STEXT size=72 args=0x8 locals=0x18
	0x0000 00000 (main.go:7)	TEXT	"".example(SB), ABIInternal, $24-8
	0x0000 00000 (main.go:7)	MOVQ	(TLS), CX
	0x0009 00009 (main.go:7)	CMPQ	SP, 16(CX)
	0x000d 00013 (main.go:7)	PCDATA	$0, $-2
	0x000d 00013 (main.go:7)	JLS	65
	0x000f 00015 (main.go:7)	PCDATA	$0, $-1
	0x000f 00015 (main.go:7)	SUBQ	$24, SP
	0x0013 00019 (main.go:7)	MOVQ	BP, 16(SP)
	0x0018 00024 (main.go:7)	LEAQ	16(SP), BP
	0x001d 00029 (main.go:7)	PCDATA	$0, $-2
	0x001d 00029 (main.go:7)	PCDATA	$1, $-2
	0x001d 00029 (main.go:7)	FUNCDATA	$0, gclocals·9fb7f0986f647f17cb53dda1484e0f7a(SB)
	0x001d 00029 (main.go:7)	FUNCDATA	$1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
	0x001d 00029 (main.go:7)	FUNCDATA	$2, gclocals·9fb7f0986f647f17cb53dda1484e0f7a(SB)
	0x001d 00029 (main.go:8)	PCDATA	$0, $1
	0x001d 00029 (main.go:8)	PCDATA	$1, $0
	0x001d 00029 (main.go:8)	LEAQ	type."".demo(SB), AX
	0x0024 00036 (main.go:8)	PCDATA	$0, $0
	0x0024 00036 (main.go:8)	MOVQ	AX, (SP)
	0x0028 00040 (main.go:8)	CALL	runtime.newobject(SB)  // calls the runtime.newobject function
	0x002d 00045 (main.go:8)	PCDATA	$0, $1
	0x002d 00045 (main.go:8)	MOVQ	8(SP), AX
	0x0032 00050 (main.go:9)	PCDATA	$0, $0
	0x0032 00050 (main.go:9)	PCDATA	$1, $1
	0x0032 00050 (main.go:9)	MOVQ	AX, "".~r0+32(SP)
	0x0037 00055 (main.go:9)	MOVQ	16(SP), BP
	0x003c 00060 (main.go:9)	ADDQ	$24, SP
	0x0040 00064 (main.go:9)	RET
	0x0041 00065 (main.go:9)	NOP
	0x0041 00065 (main.go:7)	PCDATA	$1, $-1
	0x0041 00065 (main.go:7)	PCDATA	$0, $-2
	0x0041 00065 (main.go:7)	CALL	runtime.morestack_noctxt(SB)
	0x0046 00070 (main.go:7)	PCDATA	$0, $-1
	0x0046 00070 (main.go:7)	JMP	0

The above is just the compiled assembly code of the example function. You can see that the runtime.newobject function is called in line 8 of the program.

2. Check by compilation options

$ go build -gcflags "-m -l" main.go
# command-line-arguments
./main.go:8:7: &demo literal escapes to heap:
./main.go:8:7:   flow: d = &{storage for &demo literal}:
./main.go:8:7:     from &demo literal (spill) at ./main.go:8:7
./main.go:8:7:     from d := &demo literal (assign) at ./main.go:8:4
./main.go:8:7:   flow: ~r0 = d:
./main.go:8:7:     from return d (return) at ./main.go:9:2
./main.go:8:7: &demo literal escapes to heap

# or

$ go tool compile -l -m -m main.go
main.go:8:7: &demo literal escapes to heap:
main.go:8:7:   flow: d = &{storage for &demo literal}:
main.go:8:7:     from &demo literal (spill) at main.go:8:7
main.go:8:7:     from d := &demo literal (assign) at main.go:8:4
main.go:8:7:   flow: ~r0 = d:
main.go:8:7:     from return d (return) at main.go:9:2
main.go:8:7: &demo literal escapes to heap

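You can use go tool compile --help to see the meaning of each option: -m prints optimization decisions, including escape analysis results (repeat it for more detail), and -l disables inlining.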

The official Go FAQ entry stack_or_heap also describes how to tell whether a variable is allocated on the heap or on the stack; that documentation is relatively brief.
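Beyond the two compile-time checks, heap allocations can also be observed at run time. The sketch below is not one of the two methods described above; it is an additional cross-check that uses testing.AllocsPerRun from the standard library. The names byPointer, byValue, allocsPerCall and sink are invented for the example, and the //go:noinline directives play the same role as -l in the commands above.

```go
package main

import (
	"fmt"
	"testing"
)

type demo struct {
	Msg string
}

// sink keeps the returned pointer reachable so the call cannot be optimized away.
var sink *demo

//go:noinline
func byPointer() *demo {
	d := &demo{} // expected to escape, as in the example above
	return d
}

//go:noinline
func byValue() demo {
	return demo{} // returned by value, no pointer leaves the function
}

// allocsPerCall (hypothetical helper) averages heap allocations over many runs of f.
func allocsPerCall(f func()) float64 {
	return testing.AllocsPerRun(1000, f)
}

func main() {
	fmt.Println("byPointer:", allocsPerCall(func() { sink = byPointer() }))
	fmt.Println("byValue:  ", allocsPerCall(func() { _ = byValue() }))
}
```

With the standard toolchain I would expect this to report one allocation per call for byPointer and none for byValue, which matches what the assembly and the -m output say about the same code.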

Some cases where local variables are allocated on the heap

1. Variables of pointer type, pointer escape

Code example, consistent with the example in the previous section.

package main

type demo struct {
	Msg string
}

func example() *demo {
	d := &demo{}
	return d
}

func main() {
	example()
}

$ go tool compile -l -m main.go
main.go:8:7: &demo literal escapes to heap

2. Insufficient stack space

package main

func generate8191() {
	nums := make([]int, 8191) // < 64KB
	for i := 0; i < 8191; i++ {
		nums[i] = i
	}
}

func generate8192() {
	nums := make([]int, 8192) // = 64KB
	for i := 0; i < 8192; i++ {
		nums[i] = i
	}
}

func generate(n int) {
	nums := make([]int, n) // size not known at compile time
	for i := 0; i < n; i++ {
		nums[i] = i
	}
}

func main() {
	generate8191()
	generate8192()
	generate(1)
}

$ go tool compile -l -m main.go
main.go:4:14: make([]int, 8191) does not escape
main.go:9:14: make([]int, 8192) escapes to heap
main.go:14:14: make([]int, n) escapes to heap

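As the Go compiler source below shows, explicitly declared variables larger than 10MB are allocated on the heap, and so are implicit variables (the storage behind new, &T{}, make, and []byte("...")) once they reach 64KB, which is why the 8192-element slice above escapes while the 8191-element one does not. The slice in generate escapes because its size is not known at compile time.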

var (
    // maximum size variable which we will allocate on the stack.
    // This limit is for explicit variable declarations like "var x T" or "x := ...".
    // Note: the flag smallframes can update this value.
    maxStackVarSize = int64(10 * 1024 * 1024)

    // maximum size of implicit variables that we will allocate on the stack.
    //   p := new(T)          allocating T on the stack
    //   p := &T{}            allocating T on the stack
    //   s := make([]T, n)    allocating [n]T on the stack
    //   s := []byte("...")   allocating [n]byte on the stack
    // Note: the flag smallframes can update this value.
    maxImplicitStackVarSize = int64(64 * 1024)
)
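The arithmetic behind the 8191 and 8192 element examples can be checked directly. This sketch assumes 8-byte elements (int on a 64-bit platform, written as int64 here so the numbers are the same everywhere); the names implicitLimit and backingBytes are made up for the example.

```go
package main

import (
	"fmt"
	"unsafe"
)

// implicitLimit mirrors the maxImplicitStackVarSize value quoted above.
const implicitLimit = 64 * 1024

// backingBytes returns the size of the backing array of a slice of n
// 8-byte integers.
func backingBytes(n int) uintptr {
	return uintptr(n) * unsafe.Sizeof(int64(0))
}

func main() {
	for _, n := range []int{8191, 8192} {
		size := backingBytes(n)
		fmt.Printf("%d elements: %d bytes, below limit: %v\n", n, size, size < implicitLimit)
	}
	// 8191 elements: 65528 bytes, below limit: true
	// 8192 elements: 65536 bytes, below limit: false
}
```

The 8192-element slice lands exactly on 64KB, which is consistent with the escape output shown earlier.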

3. Dynamic types: values converted to interface{} escape

package main

type Demo struct {
	Name string
}

func main() {
	_ = example()
}

func example() interface{} {
	return Demo{}
}

$ go tool compile -l -m main.go
main.go:12:13: Demo literal escapes to heap

4. Variables referenced by a closure

package main

import "fmt"

func increase(x int) func() int {
	return func() int {
		x++
		return x
	}
}

func main() {
	x := 0
	in := increase(x)
	fmt.Println(in())
	fmt.Println(in())
}

$ go tool compile -l -m main.go
main.go:5:15: moved to heap: x
main.go:6:9: func literal escapes to heap
main.go:15:13: ... argument does not escape
main.go:15:16: in() escapes to heap
main.go:16:13: ... argument does not escape
main.go:16:16: in() escapes to heap
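The "moved to heap: x" line explains the behavior of the closure: the captured x has to outlive the call to increase, so each call to increase gets its own heap-allocated x. A small sketch of the consequence, reusing the increase function from above with two independent closures:

```go
package main

import "fmt"

func increase(x int) func() int {
	return func() int {
		x++ // modifies the captured x, which outlives this call to increase
		return x
	}
}

func main() {
	x := 10
	a := increase(x) // a captures its own copy of x
	b := increase(x) // b captures a separate copy

	fmt.Println(a(), a(), b(), x) // 11 12 11 10
}
```

The caller's x stays at 10 because it was passed by value; only the copies captured by the closures change.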

Performance differences when returning from a function using a value and a pointer

The previous sections described how Go allocates memory for variables. From them we know that a variable defined in a function and returned by value can be allocated on the stack, and that the whole object is copied when the function returns.

Returning by value costs a copy, but returning a pointer forces the variable onto the heap, where allocation and garbage collection carry a larger overhead. Which cost dominates depends on the size of the returned object and on the platform, so each platform needs its own benchmark to get an accurate answer.

return_value_or_pointer.go

package main

import "fmt"

const bigSize = 200000

type bigStruct struct {
    nums [bigSize]int
}

func newBigStruct() bigStruct {
    var a bigStruct

    for i := 0; i < bigSize; i++ {
        a.nums[i] = i
    }
    return a
}

func newBigStructPtr() *bigStruct {
    var a bigStruct

    for i := 0; i < bigSize; i++ {
        a.nums[i] = i
    }
    return &a
}

func main() {
    a := newBigStruct()
    b := newBigStructPtr()

    fmt.Println(a, b)
}

benchmark_test.go

package main

import "testing"

func BenchmarkStructReturnValue(b *testing.B) {
    b.ReportAllocs()

    t := 0
    for i := 0; i < b.N; i++ {
        v := newBigStruct()
        t += v.nums[0]
    }
}

func BenchmarkStructReturnPointer(b *testing.B) {
    b.ReportAllocs()

    t := 0
    for i := 0; i < b.N; i++ {
        v := newBigStructPtr()
        t += v.nums[0]
    }
}

$ go test -bench .
goos: darwin
goarch: amd64
BenchmarkStructReturnValue-12      	    4215	    278542 ns/op	       0 B/op	       0 allocs/op
BenchmarkStructReturnPointer-12    	    4556	    267253 ns/op	 1605634 B/op	       1 allocs/op
PASS
ok  	_/Users/tianfeiyu/golang-dev/test	3.670s

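In my local tests, returning by value was faster for structures holding 200000 ints and returning a pointer was faster below that size, although in the single run shown above the two results are within about 4% of each other, so this size is close to the crossover point. If your code has strict performance requirements, you will need to benchmark it on the real platform to reach a conclusion.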

Some other practical experience

Stateful objects must be returned by pointer, for example the standard library's sync.WaitGroup and sync.Pool. In Go, some structures carry an explicit noCopy field as a reminder that they must not be copied by value:

// A WaitGroup must not be copied after first use.
type WaitGroup struct {
	noCopy noCopy
	......
}

Objects with a short life cycle can be returned by value; if the object lives longer or is larger, return a pointer.
Large objects are best returned by pointer; the size threshold has to be derived from benchmarks on the specific platform.
Refer to how large open source projects such as Kubernetes and Docker do it.
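The first tip can be sketched as follows. The worker type, its fields and the newWorker constructor are hypothetical; the point is only that a struct embedding sync.WaitGroup or sync.Mutex is handed out by pointer so every user shares the same state instead of working on a copy.

```go
package main

import (
	"fmt"
	"sync"
)

// worker holds synchronization state, so it must not be copied after first use.
type worker struct {
	wg   sync.WaitGroup
	mu   sync.Mutex
	done int
}

// newWorker returns a pointer so callers and goroutines share one worker.
func newWorker() *worker {
	return &worker{}
}

// run starts n goroutines, waits for all of them, and returns how many finished.
func (w *worker) run(n int) int {
	for i := 0; i < n; i++ {
		w.wg.Add(1)
		go func() {
			defer w.wg.Done()
			w.mu.Lock()
			w.done++
			w.mu.Unlock()
		}()
	}
	w.wg.Wait()
	return w.done
}

func main() {
	fmt.Println(newWorker().run(5)) // 5
}
```

Had newWorker returned a worker by value and run used a value receiver, each call would operate on a copy of the WaitGroup and the mutex, which is exactly the mistake the noCopy field is meant to flag.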

Summary

This article has looked at what happens to variables used in Go functions: the differences between allocating memory on the heap and on the stack, the cases in which a local variable has to be allocated on the heap, and how that affects the choice between returning a value and returning a pointer.
