This article explains the execution rules of defer, introduces the different kinds of defer, and walks through how deferred function calls are implemented, focusing on heap allocation.

Introduction

defer execution rules

Multiple defers are executed in "last in, first out" (LIFO) order

package main

import (  
    "fmt"
)

func main() {  
    name := "Naveen"
    fmt.Printf("Original String: %s\n", string(name))
    fmt.Printf("Reversed String: ")
    for _, v := range []rune(name) {
        defer fmt.Printf("%c", v)
    }
} 

In the above example, the string Naveen is traversed using a for loop and then defer is called. These defer calls act as if they were stacked, and the last defer call pushed onto the stack is pulled out and executed first.

The output is as follows.

$ go run main.go 
Original String: Naveen
Reversed String: neevaN

The arguments of a deferred call are evaluated when the defer statement is executed

func a() {
    i := 0
    defer fmt.Println(i) // 0
    i++
    return
}

In this example, the value of i is evaluated when the defer statement is executed, not when the deferred call actually runs, so the code above prints 0.
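The difference between evaluating an argument at the defer statement and reading a variable when the deferred function runs can be sketched as follows. The helpers byValue and byClosure are hypothetical names introduced only for this illustration, and they use a named result (covered in the next rule) to report what the deferred call observed.

```go
package main

import "fmt"

// byValue passes i as an argument, so its value (0) is fixed
// at the moment the defer statement executes.
func byValue() (seen int) {
	i := 0
	defer func(v int) { seen = v }(i)
	i++
	return
}

// byClosure captures the variable i itself, so the deferred
// function reads the incremented value when it finally runs.
func byClosure() (seen int) {
	i := 0
	defer func() { seen = i }()
	i++
	return
}

func main() {
	fmt.Println(byValue(), byClosure()) // prints "0 1"
}
```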

defer can modify the return value of a function with named results

As the Go language specification states:

For instance, if the deferred function is a function literal and the surrounding function has named result parameters that are in scope within the literal, the deferred function may access and modify the result parameters before they are returned.

An example is as follows.

// f returns 42
func f() (result int) {
    defer func() {
        result *= 7
    }()
    return 6
}

Note, however, that only named return values (named result parameters) can be modified this way; an unnamed return value cannot, as the following example shows.

// f returns 100
func f() int {
    i := 100
    defer func() {
        i++
    }()
    return i
}

With an unnamed result, the return statement copies the value of i into the result slot before the deferred function runs. The deferred closure can only reach the local variable i, not the unnamed result slot, so incrementing i afterwards does not change what the caller receives.
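As a quick check, the two cases can be run side by side. This is a minimal sketch; the function names named and unnamed are chosen here for illustration and simply wrap the two examples above.

```go
package main

import "fmt"

// named has a named result, so the deferred closure modifies the
// result slot itself after "return 6" has stored 6 in it.
func named() (result int) {
	defer func() { result *= 7 }()
	return 6
}

// unnamed copies i into the anonymous result slot at the return
// statement; the deferred closure then increments only the local i.
func unnamed() int {
	i := 100
	defer func() { i++ }()
	return i
}

func main() {
	fmt.Println(named(), unnamed()) // prints "42 100"
}
```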

Types of defer

Go made two optimizations to defer in versions 1.13 and 1.14, which significantly reduced the performance overhead of defer in most scenarios.

Allocation on the heap

Prior to Go 1.13, all defers were allocated on the heap. With this mechanism, the compiler does two things:

  1. It inserts runtime.deferproc at the location of the defer statement; when executed, it saves the deferred call as a runtime._defer structure at the head of the goroutine's _defer chain.
  2. It inserts runtime.deferreturn before the function returns; when executed, it takes the runtime._defer records from the head of the goroutine's _defer chain and executes them one by one.

Allocation on the stack

New in Go 1.13, deferprocStack implements on-stack allocation of defer. Compared to heap allocation, on-stack allocation frees _defer after the function returns, eliminating the performance overhead of memory allocation and requiring only proper maintenance of the chain of _defer. According to the official documentation, this improves performance by about 30%.

Except for the difference in allocation location, there is no fundamental difference between allocating on the stack and allocating on the heap.

It is worth noting that not all defers can be allocated on the stack in version 1.13. A defer in a loop, whether an explicit for loop or an implicit loop formed by goto, can only use heap allocation, even if the loop runs only once.

func A1() {
    for i := 0; i < 1; i++ {
        defer println(i)
    }
}

$ GOOS=linux GOARCH=amd64 go tool compile -S main.go
        ...
        0x004e 00078 (main.go:5)        CALL    runtime.deferproc(SB)
        ...
        0x005a 00090 (main.go:5)        CALL    runtime.deferreturn(SB)
        0x005f 00095 (main.go:5)        MOVQ    32(SP), BP
        0x0064 00100 (main.go:5)        ADDQ    $40, SP
        0x0068 00104 (main.go:5)        RET

Open coding

Go 1.14 added open coding, a mechanism that inserts defer calls directly into functions before they return, eliminating the need for deferproc or deferprocStack operations at runtime. This optimization reduces the overhead of defer calls from ~35ns in version 1.13 to ~6ns or so.

However, the following conditions must all be met for open coding to be used.

  1. Compiler optimizations are not disabled, i.e. -gcflags "-N" is not set.
  2. The function contains no more than 8 defers, and the product of the number of return statements and the number of defer statements does not exceed 15.
  3. No defer statement in the function is executed inside a loop.
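The three conditions can be written down as a small predicate. This is only a sketch of the rules as listed above: the funcInfo type and the canOpenCode function are hypothetical and do not correspond to any compiler API.

```go
package main

import "fmt"

// funcInfo is a hypothetical summary of a function, used only
// to illustrate the conditions listed above.
type funcInfo struct {
	optimized   bool // false when built with -gcflags "-N"
	defers      int  // number of defer statements
	returns     int  // number of return statements
	deferInLoop bool // whether any defer sits inside a loop
}

// canOpenCode reports whether all three conditions hold.
func canOpenCode(f funcInfo) bool {
	return f.optimized &&
		!f.deferInLoop &&
		f.defers <= 8 &&
		f.defers*f.returns <= 15
}

func main() {
	fmt.Println(canOpenCode(funcInfo{optimized: true, defers: 1, returns: 1})) // true
	fmt.Println(canOpenCode(funcInfo{optimized: true, defers: 4, returns: 4})) // false: 4*4 > 15
}
```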

defer structure

type _defer struct {
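    siz     int32       // size of the arguments and results, in bytes
    started bool
    heap    bool        // whether the record is allocated on the heap
    openDefer bool      // whether the defer is open-coded
    sp        uintptr   // stack pointer
    pc        uintptr   // program counter of the caller
    fn        *funcval  // the deferred function
    _panic    *_panic   
    link      *_defer   // next record in the defer linked list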
    fd   unsafe.Pointer  
    varp uintptr        
    framepc uintptr
}

The fields to note above are siz, heap, fn, link, and openDefer, which are covered in the following analysis.

Analysis

The analysis starts with heap allocation and shows why the execution rules of defer are as described at the beginning, and then covers stack allocation and open coding.

The analysis starts with a function call as the entry point.

Allocation on the heap

Named function return value calls

Let's start with the example mentioned above and look at heap allocation through a function call. Note that on Go 1.15 this example is not heap-allocated by default; to force the defer onto the heap, you need to modify the Go source as shown below and rebuild the compiler.

File location: src/cmd/compile/internal/gc/ssa.go

func (s *state) stmt(n *Node) {
    ...
    case ODEFER: 
        if s.hasOpenDefers {
            s.openDeferRecord(n.Left)
        } else {
            d := callDefer
            // comment out the following lines to force heap allocation
            // if n.Esc == EscNever {
            //  d = callDeferStack
            // }
            s.call(n.Left, d)
        }
    ...
}

The example program is as follows.

func main() {
    f()
}

func f() (result int) {
    defer func() {
        result *= 7
    }() 
    return 6
}

Print the assembly with the following command.

$ GOOS=linux GOARCH=amd64 go tool compile -S -N -l main.go

The main function is trivial: it simply calls f.

"".main STEXT size=54 args=0x0 locals=0x10
        0x0000 00000 (main.go:3)        TEXT    "".main(SB), ABIInternal, $16-0
        ...
        0x0020 00032 (main.go:4)        CALL    "".f(SB)
        ...

Next, we look at the assembly of the f function, one part at a time.

"".f STEXT size=126 args=0x8 locals=0x20
        0x0000 00000 (main.go:7)        TEXT    "".f(SB), ABIInternal, $32-8 
        ...
        0x001d 00029 (main.go:7)        MOVQ    $0, "".result+40(SP)        ;; write constant 0 to 40(SP)
        0x0026 00038 (main.go:8)        MOVL    $8, (SP)                    ;; store constant 8 at the top of the stack
        0x002d 00045 (main.go:8)        LEAQ    "".f.func1·f(SB), AX        ;; load the address of f.func1·f into AX
        0x0034 00052 (main.go:8)        MOVQ    AX, 8(SP)                   ;; store the address of f.func1·f at 8(SP)
        0x0039 00057 (main.go:8)        LEAQ    "".result+40(SP), AX        ;; load the address of 40(SP) into AX
        0x003e 00062 (main.go:8)        MOVQ    AX, 16(SP)                  ;; store the address held in AX at 16(SP)
        0x0043 00067 (main.go:8)        PCDATA  $1, $0
        0x0043 00067 (main.go:8)        CALL    runtime.deferproc(SB)       ;; call runtime.deferproc

Since a heap-allocated defer calls runtime.deferproc, this part of the assembly shows the setup before that call, which is easy to follow.

runtime.deferproc takes two arguments, as follows.

func deferproc(siz int32, fn *funcval)

During a function call, arguments are placed on the stack from right to left in the parameter list, so the constant 8 (siz) ends up at the top of the stack, and the second argument, the address of f.func1·f, is stored at 8(SP).

You may wonder why the second argument starts at 8(SP) rather than 4(SP), given that siz is an int32 and occupies only 4 bytes. The reason is memory alignment: the pointer-sized fn must be aligned to 8 bytes.
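The same alignment rule can be observed from ordinary Go code. The sketch below assumes a struct, deferprocArgs, that mirrors the argument list of deferproc; this struct is purely illustrative and does not exist in the runtime.

```go
package main

import (
	"fmt"
	"unsafe"
)

// deferprocArgs mirrors deferproc(siz int32, fn *funcval).
// It is an illustration only; unsafe.Pointer stands in for *funcval.
type deferprocArgs struct {
	siz int32
	fn  unsafe.Pointer
}

// fnOffset returns the byte offset of fn within the struct.
func fnOffset() uintptr {
	var a deferprocArgs
	return unsafe.Offsetof(a.fn)
}

func main() {
	// On a 64-bit platform siz takes 4 bytes, yet fn starts at
	// offset 8 because pointers are 8-byte aligned.
	fmt.Println(unsafe.Sizeof(int32(0)), fnOffset())
}
```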

In addition to the two arguments, note that the address of 40(SP) is stored at 16(SP). So before the call, the stack holds the constant 8 at (SP), the address of f.func1·f at 8(SP), and the address of the return value result (40(SP)) at 16(SP).

Let’s look at runtime.deferproc :

File location: src/runtime/panic.go

func deferproc(siz int32, fn *funcval) {  
    gp := getg()
    if gp.m.curg != gp { 
        throw("defer on system stack")
    }
    // get the caller's stack pointer
    sp := getcallersp()
    // the arguments of fn follow fn on the stack, so argp points just past fn
    argp := uintptr(unsafe.Pointer(&fn)) + unsafe.Sizeof(fn)
    callerpc := getcallerpc()
    // obtain a new _defer record
    d := newdefer(siz)
    if d._panic != nil {
        throw("deferproc: d.panic != nil after newdefer")
    }
    // push the record onto the head of the goroutine's _defer chain
    d.link = gp._defer
    gp._defer = d
    d.fn = fn
    d.pc = callerpc
    d.sp = sp
    // copy the arguments
    switch siz {
    case 0: 
        // no arguments: nothing to copy
    case sys.PtrSize:
        // the arguments are exactly pointer-sized: copy them with a single
        // assignment from argp to the address returned by deferArgs
        *(*uintptr)(deferArgs(d)) = *(*uintptr)(unsafe.Pointer(argp))
    default:
        // any other size: copy the data with memmove
        memmove(deferArgs(d), unsafe.Pointer(argp), uintptr(siz))
    }

    return0() 
}

From the assembly we know that, when deferproc is called, siz is the value at the top of the stack, 8, which is the size of the arguments, and fn is the function address stored at 8(SP).

argp := uintptr(unsafe.Pointer(&fn)) + unsafe.Sizeof(fn)
...
*(*uintptr)(deferArgs(d)) = *(*uintptr)(unsafe.Pointer(argp))

These two statements together copy the address we saved at 16(SP) into the 8-byte block of memory immediately after the _defer record, where it serves as the argument of the deferred function. In other words, the memory right after the _defer record stores the address value saved at 16(SP).

Note that the value at argp is copied here, so the argument is fixed when the defer statement executes, not when the deferred function runs; in this case, however, the value being copied is an address.

We also know that heap-allocated defers are stored on the current goroutine as a linked list, so if three defers are registered one after another, the last one registered sits at the head of the chain.
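The head-insertion behaviour of the chain can be modelled in a few lines. This is a simplified sketch, not the runtime's code: record and goroutine are stand-ins for runtime._defer and runtime.g, and push and runAll are hypothetical names that play the roles of deferproc and repeated deferreturn calls.

```go
package main

import "fmt"

// record is a simplified stand-in for runtime._defer.
type record struct {
	fn   func()
	link *record // next record in the chain
}

// goroutine is a simplified stand-in for runtime.g.
type goroutine struct {
	deferHead *record
}

// push mimics deferproc: the new record becomes the head of the chain.
func (g *goroutine) push(fn func()) {
	g.deferHead = &record{fn: fn, link: g.deferHead}
}

// runAll mimics repeated deferreturn calls: unlink the head, then run it.
func (g *goroutine) runAll() {
	for d := g.deferHead; d != nil; d = g.deferHead {
		g.deferHead = d.link
		d.fn()
	}
}

// order registers three records and returns the order in which they ran.
func order() []int {
	var g goroutine
	var out []int
	for i := 1; i <= 3; i++ {
		i := i // capture the current value
		g.push(func() { out = append(out, i) })
	}
	g.runAll()
	return out
}

func main() {
	fmt.Println(order()) // prints "[3 2 1]"
}
```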

The general idea of newdefer is to take a record from P's local cache pool; if that pool is empty, it moves records from sched's global cache pool into P's local pool until the local pool is half full; and if there is still nothing available, it allocates a new _defer and its args directly from the heap. The allocation itself is much the same as in the memory allocator, so we won't analyze it here; interested readers can look at the source.
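The lookup order described above can be sketched with a toy model. Everything in it is an assumption made for illustration: the pools type, its fields, and the get method are hypothetical, and the real runtime additionally uses size classes and a lock around the global pool.

```go
package main

import "fmt"

// record is a stand-in for runtime._defer.
type record struct{ id int }

// pools is a toy model of the two cache levels.
type pools struct {
	local    []*record // per-P cache
	localCap int       // capacity of the per-P cache
	global   []*record // scheduler-wide cache
	nextID   int
}

// get returns a record and a label saying where it came from.
func (p *pools) get() (*record, string) {
	source := "local"
	if len(p.local) == 0 && len(p.global) > 0 {
		// Refill the local cache from the global one, up to half its capacity.
		for len(p.local) < p.localCap/2 && len(p.global) > 0 {
			last := len(p.global) - 1
			p.local = append(p.local, p.global[last])
			p.global = p.global[:last]
		}
		source = "global"
	}
	if n := len(p.local); n > 0 {
		d := p.local[n-1]
		p.local = p.local[:n-1]
		return d, source
	}
	// Both caches are empty: allocate a fresh record.
	p.nextID++
	return &record{id: p.nextID}, "heap"
}

func main() {
	p := &pools{localCap: 4, global: []*record{{id: 1}, {id: 2}, {id: 3}}}
	for i := 0; i < 4; i++ {
		_, from := p.get()
		fmt.Println(from) // global, local, global, heap
	}
}
```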

Let’s go back to the assembly of the f function.

"".f STEXT size=126 args=0x8 locals=0x20 
        ...
        0x004e 00078 (main.go:11)       MOVQ    $6, "".result+40(SP)        ;; write constant 6 to 40(SP) as the return value
        0x0057 00087 (main.go:11)       XCHGL   AX, AX
        0x0058 00088 (main.go:11)       CALL    runtime.deferreturn(SB)     ;; call runtime.deferreturn
        0x005d 00093 (main.go:11)       MOVQ    24(SP), BP
        0x0062 00098 (main.go:11)       ADDQ    $32, SP
        0x0066 00102 (main.go:11)       RET

This part is straightforward: the constant 6 is written to 40(SP) as the return value, and then runtime.deferreturn is called to run the deferred functions.

Let’s look at runtime.deferreturn :

File location: src/runtime/panic.go

func deferreturn(arg0 uintptr) {
    gp := getg()
    d := gp._defer
    if d == nil {
        return
    }
    // check that the defer was registered by the caller of this deferreturn
    sp := getcallersp()
    if d.sp != sp {
        return
    }

    switch d.siz {
    case 0:
        // Do nothing.
    case sys.PtrSize:
        // copy out the arguments saved in the _defer record
        // &arg0 is the address of the top of the caller's stack, so this copies the argument there
        *(*uintptr)(unsafe.Pointer(&arg0)) = *(*uintptr)(deferArgs(d))
    default:
        // if the argument size is not sys.PtrSize, copy the data with memmove
        memmove(unsafe.Pointer(&arg0), deferArgs(d), uintptr(d.siz))
    }
    fn := d.fn
    d.fn = nil
    gp._defer = d.link
    // return the _defer record to the pool so it can be reused
    freedefer(d)

    _ = fn.fn
    // pass the function to run and the address of its arguments
    jmpdefer(fn, uintptr(unsafe.Pointer(&arg0)))
}

First, note that arg0 occupies the slot at the top of the caller's stack, so the following assignment copies the defer argument to the top of the caller's stack.

*(*uintptr)(unsafe.Pointer(&arg0)) = *(*uintptr)(deferArgs(d))

What *(*uintptr)(deferArgs(d)) holds is the address value that the caller saved at 16(SP). After the copy, the top of the caller's stack frame, (SP), therefore holds the address of the return value result.

Go to runtime.jmpdefer to see how this is done.

Location: src/runtime/asm_amd64.s

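TEXT runtime·jmpdefer(SB), NOSPLIT, $0-16
    MOVQ    fv+0(FP), DX    // address of the fn funcval
    MOVQ    argp+8(FP), BX  // caller SP
    LEAQ    -8(BX), SP  //  caller SP as it was right after the CALL to deferreturn
    MOVQ    -8(SP), BP  //  restore the caller's BP
    SUBQ    $5, (SP)    //  rewind the return address to the CALL runtime.deferreturn instruction
    MOVQ    0(DX), BX   // BX = code pointer stored in the funcval
    JMP BX  // run the deferred function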

This assembly is quite interesting. Since jmpdefer is called by runtime.deferreturn, the call stack at this point consists of the frames of f, runtime.deferreturn, and jmpdefer, from caller to callee.

The arguments passed to jmpdefer are the address of fn at 0(FP) and the SP of f's stack frame at 8(FP).

So the following instruction points SP at the slot that holds the return address of the runtime.deferreturn call.

LEAQ    -8(BX), SP

At that point, -8(SP) holds the base pointer saved on entry to runtime.deferreturn, which the next instruction restores into BP.

MOVQ    -8(SP), BP

The interesting part is why subtracting 5 from the value stored at (SP) yields the address of the call to runtime.deferreturn.

SUBQ    $5, (SP)

We return to the assembly of the f function call.

(dlv) disass
TEXT main.f(SB) /data/gotest/main.go
        ...
        main.go:11      0x45def8        e8a3e2fcff              call $runtime.deferreturn
        main.go:11      0x45defd        488b6c2418              mov rbp, qword ptr [rsp+0x18]
        ...

Since the runtime.deferreturn function needs to return to the 0x45defd address after the call, the return address in the stack frame corresponding to the runtime.deferreturn function is actually 0x45defd.

In jmpdefer, the value at (SP) is the return address of the runtime.deferreturn call, so subtracting 5 from 0x45defd gives 0x45def8, which is the address of the CALL runtime.deferreturn instruction itself (the instruction is 5 bytes long).
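The arithmetic can be checked against the bytes shown in the dlv output. The sketch below assumes the standard x86-64 encoding of a near CALL (opcode e8 followed by a 4-byte little-endian relative offset); the variable and function names are chosen here for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Values taken from the dlv disassembly above.
var (
	callAddr  uint64 = 0x45def8
	callBytes        = []byte{0xe8, 0xa3, 0xe2, 0xfc, 0xff} // e8 + rel32
)

// returnAddr is the address pushed by CALL: the next instruction.
func returnAddr() uint64 {
	return callAddr + uint64(len(callBytes))
}

// callTarget decodes the rel32 operand, which is relative to the
// address of the next instruction.
func callTarget() uint64 {
	rel := int32(binary.LittleEndian.Uint32(callBytes[1:]))
	return uint64(int64(returnAddr()) + int64(rel))
}

func main() {
	fmt.Printf("instruction length: %d\n", len(callBytes))  // 5
	fmt.Printf("return address:     %#x\n", returnAddr())   // 0x45defd
	fmt.Printf("return address - 5: %#x\n", returnAddr()-5) // 0x45def8
	fmt.Printf("call target:        %#x\n", callTarget())
}
```

Because the return address always points just past the 5-byte CALL, subtracting 5 makes the deferred function "return" to the CALL itself, which then invokes runtime.deferreturn again.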

So when execution finally jumps to f.func1, the top of the stack, (SP), holds 0x45def8 as the return address, and the pointer to result sits just above it at 8(SP) as the argument of f.func1.

Since the return address at (SP) now points at the CALL runtime.deferreturn instruction, f.func1 returns into f and deferreturn is called again. This repeats until deferreturn finds no remaining record in the _defer chain.

func deferreturn(arg0 uintptr) {
    gp := getg()
    d := gp._defer
    if d == nil {
        return
    }
    ...
}

Here’s another short look at the f.func1 function call.

"".f.func1 STEXT nosplit size=25 args=0x8 locals=0x0
        0x0000 00000 (main.go:8)        TEXT    "".f.func1(SB), NOSPLIT|ABIInternal, $0-8
        0x0000 00000 (main.go:8)        FUNCDATA        $0, gclocals·1a65e721a2ccc325b382662e7ffee780(SB)
        0x0000 00000 (main.go:8)        FUNCDATA        $1, gclocals·69c1753bd5f81501d95132d08af04464(SB)
        0x0000 00000 (main.go:9)        MOVQ    "".&result+8(SP), AX        ;; load the address that points to 6 into AX
        0x0005 00005 (main.go:9)        MOVQ    (AX), AX                    ;; load 6 into AX
        0x0008 00008 (main.go:9)        LEAQ    (AX)(AX*2), CX              ;; CX = 6*2 + 6 = 18
        0x000c 00012 (main.go:9)        LEAQ    (AX)(CX*2), AX              ;; AX = 18*2 + 6 = 42
        0x0010 00016 (main.go:9)        MOVQ    "".&result+8(SP), CX        ;; load the address that points to 6 into CX
        0x0015 00021 (main.go:9)        MOVQ    AX, (CX)                    ;; store 42 at the address held in CX
        0x0018 00024 (main.go:10)       RET

The call here is very simple: get the data pointed to by the 8(SP) address value and do the arithmetic, then write the result to the stack and return.

At this point we have seen the whole process of calling a deferred function through heap allocation. It also explains why a deferred function can modify a named return value: the argument passed to the deferred function is a pointer to the return value, so the return value is modified when the deferred function finally runs.

Anonymous function return value calls

So what happens with a function whose return value is unnamed? For example, something like the following.

// f returns 100
func f() int {
    i := 100
    defer func() {
        i++
    }()
    return i
}

The assembly output is as follows.

"".f STEXT size=139 args=0x8 locals=0x28
        0x0000 00000 (main.go:7)        TEXT    "".f(SB), ABIInternal, $40-8
        ...
        0x001d 00029 (main.go:7)        MOVQ    $0, "".~r0+48(SP)       ;; initialize the return value
        0x0026 00038 (main.go:8)        MOVQ    $100, "".i+24(SP)       ;; initialize the local variable i
        0x002f 00047 (main.go:9)        MOVL    $8, (SP)
        0x0036 00054 (main.go:9)        LEAQ    "".f.func1·f(SB), AX
        0x003d 00061 (main.go:9)        MOVQ    AX, 8(SP)               ;; store the address of f.func1·f at 8(SP)
        0x0042 00066 (main.go:9)        LEAQ    "".i+24(SP), AX
        0x0047 00071 (main.go:9)        MOVQ    AX, 16(SP)              ;; store the address of 24(SP) at 16(SP)
        0x004c 00076 (main.go:9)        PCDATA  $1, $0
        0x004c 00076 (main.go:9)        CALL    runtime.deferproc(SB)
        0x0051 00081 (main.go:9)        TESTL   AX, AX
        0x0053 00083 (main.go:9)        JNE     113
        0x0055 00085 (main.go:9)        JMP     87
        0x0057 00087 (main.go:12)       MOVQ    "".i+24(SP), AX         ;; load the value 100 at 24(SP) into AX
        0x005c 00092 (main.go:12)       MOVQ    AX, "".~r0+48(SP)       ;; store the value 100 at 48(SP)
        0x0061 00097 (main.go:12)       XCHGL   AX, AX
        0x0062 00098 (main.go:12)       CALL    runtime.deferreturn(SB)
        0x0067 00103 (main.go:12)       MOVQ    32(SP), BP
        0x006c 00108 (main.go:12)       ADDQ    $40, SP
        0x0070 00112 (main.go:12)       RET

In the output above, the function with an unnamed return value first writes the constant 100 to 24(SP) and stores the address of 24(SP) at 16(SP) as the defer argument. At the return statement, it copies the value at 24(SP) into the return slot at 48(SP) with a MOVQ instruction. The value is copied, not a pointer, so the deferred function's later change to i does not affect the return value.

Summary

Comparing the stack frames of the two functions at the point where runtime.deferreturn is called makes the difference clear: the function with a named return value stores the address of the return value itself at 16(SP), while the function with an unnamed return value stores the address of the local variable at 24(SP) at 16(SP).

The above sequence of analysis also answers a few questions in passing.

  1. How does defer pass arguments? The analysis above shows that deferproc first copies the argument values to the memory immediately after the _defer record. If the argument is a pointer, the pointer is copied; if it is a value, the value itself is copied to the defer argument area.

    Then when the deferreturn function is executed, it copies the parameter values to the stack and then calls jmpdefer for execution.

    
    func deferreturn(arg0 uintptr) {
        ...
        switch d.siz {
        case 0:
            // Do nothing.
        case sys.PtrSize:
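            // copy out the arguments saved in the _defer record
            // &arg0 is the address of the top of the caller's stack, so this copies the argument there
            *(*uintptr)(unsafe.Pointer(&arg0)) = *(*uintptr)(deferArgs(d))
        default:
            // if the argument size is not sys.PtrSize, copy the data with memmove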
            memmove(unsafe.Pointer(&arg0), deferArgs(d), uintptr(d.siz))
        }
        ...
    }
    
  2. How are multiple defer statements executed?

    When deferproc is called to register a defer, the new record is inserted at the head of the linked list, and at execution time the records are taken from the head of the list one by one.

  3. What is the order of execution of defer, return, and return value?

    To answer this question, let's look again at the assembly output of the example above.

    
    "".f STEXT size=126 args=0x8 locals=0x20 
       ...
       0x004e 00078 (main.go:11)       MOVQ    $6, "".result+40(SP)        ;; write constant 6 to 40(SP) as the return value
       0x0057 00087 (main.go:11)       XCHGL   AX, AX
       0x0058 00088 (main.go:11)       CALL    runtime.deferreturn(SB)     ;; call runtime.deferreturn
       0x005d 00093 (main.go:11)       MOVQ    24(SP), BP
       0x0062 00098 (main.go:11)       ADDQ    $32, SP
       0x0066 00102 (main.go:11)       RET
    

    From this assembly, we can see that a return in a function with defers proceeds in three steps.

    1. The return value is set first, here to the constant 6.
    2. Then runtime.deferreturn is called to execute the defer chain.
    3. Finally, the RET instruction jumps back to the caller.
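The same three-step order can be observed from Go code without reading assembly. This is a minimal sketch; the trace variable and the helper six are hypothetical names added only to record the sequence of events.

```go
package main

import "fmt"

// trace records the order of events for illustration.
var trace []string

// six stands in for the return expression.
func six() int {
	trace = append(trace, "evaluate return expression")
	return 6
}

func f() (result int) {
	defer func() {
		// By now the return value has already been set to 6.
		trace = append(trace, fmt.Sprintf("deferred call sees result=%d", result))
		result *= 7
	}()
	return six()
}

func main() {
	trace = nil
	got := f()
	trace = append(trace, fmt.Sprintf("caller receives %d", got))
	for _, step := range trace {
		fmt.Println(step)
	}
	// Output:
	// evaluate return expression
	// deferred call sees result=6
	// caller receives 42
}
```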

Stack allocation

As mentioned at the beginning, on-stack allocation of defer was added in Go 1.13. The difference from heap allocation is that the _defer record is created on the stack and registered via deferprocStack.

During the SSA stage of compilation, if the defer is stack-allocated, the compiler initializes the _defer record directly in the function's stack frame and passes it as an argument to deferprocStack. The rest of the execution process is no different from heap allocation.

Let's take a brief look at the deferprocStack function.

File location: src/runtime/panic.go

func deferprocStack(d *_defer) {
    gp := getg()
    if gp.m.curg != gp { 
        throw("defer on system stack")
    } 
    d.started = false
    d.heap = false  // this _defer is allocated on the stack
    d.openDefer = false
    d.sp = getcallersp()
    d.pc = getcallerpc()
    d.framepc = 0
    d.varp = 0 
    *(*uintptr)(unsafe.Pointer(&d._panic)) = 0
    *(*uintptr)(unsafe.Pointer(&d.fd)) = 0
    // link the _defer records together into a linked list
    *(*uintptr)(unsafe.Pointer(&d.link)) = uintptr(unsafe.Pointer(gp._defer))
    *(*uintptr)(unsafe.Pointer(&gp._defer)) = uintptr(unsafe.Pointer(d))

    return0() 
}

Its main job is to initialize the fields of the _defer record and link it into the goroutine's _defer chain.

Open coding

In Go 1.14, defer was further optimized by inlining the deferred calls directly at the end of the function, with little additional overhead. During the SSA build phase, buildssa checks whether the conditions are met and, if so, emits open-coded defers. Since the SSA build code is hard to follow, only the basic principles are given below, without code analysis.

Compiling the earlier example with an unnamed return value, this time with optimizations enabled, prints the following assembly.

$ GOOS=linux GOARCH=amd64 go tool compile -S  main.go
"".f STEXT size=155 args=0x8 locals=0x30
        0x0000 00000 (main.go:7)        TEXT    "".f(SB), ABIInternal, $48-8
        ...
        0x002e 00046 (main.go:7)        MOVQ    $0, "".~r0+56(SP)
        0x0037 00055 (main.go:8)        MOVQ    $100, "".i+16(SP)
        0x0040 00064 (main.go:9)        LEAQ    "".f.func1·f(SB), AX
        0x0047 00071 (main.go:9)        MOVQ    AX, ""..autotmp_4+32(SP)
        0x004c 00076 (main.go:9)        LEAQ    "".i+16(SP), AX
        0x0051 00081 (main.go:9)        MOVQ    AX, ""..autotmp_5+24(SP)
        0x0056 00086 (main.go:9)        MOVB    $1, ""..autotmp_3+15(SP)
        0x005b 00091 (main.go:12)       MOVQ    "".i+16(SP), AX
        0x0060 00096 (main.go:12)       MOVQ    AX, "".~r0+56(SP)
        0x0065 00101 (main.go:12)       MOVB    $0, ""..autotmp_3+15(SP)
        0x006a 00106 (main.go:12)       MOVQ    ""..autotmp_5+24(SP), AX
        0x006f 00111 (main.go:12)       MOVQ    AX, (SP)
        0x0073 00115 (main.go:12)       PCDATA  $1, $1
        0x0073 00115 (main.go:12)       CALL    "".f.func1(SB)    ;; call the deferred function directly
        0x0078 00120 (main.go:12)       MOVQ    40(SP), BP
        0x007d 00125 (main.go:12)       ADDQ    $48, SP
        0x0081 00129 (main.go:12)       RET

The assembly output above shows that the call to the deferred function is inserted directly at the end of the function.

The example above is easy to optimize, but what if a defer sits in a conditional statement whose outcome cannot be determined until runtime?

Open coding also uses a set of defer bits (deferBits) to record whether a defer in a conditional branch was actually reached. deferBits is 8 bits wide, so this optimization supports at most 8 defers per function, including those inside conditionals. When a defer statement is reached at runtime, its bit is set to 1; at function exit, the deferred call is made only if its bit is set.

The design document gives the following example.

https://go.googlesource.com/proposal/+/refs/heads/master/design/34481-opencoded-defers.md

defer f1(a)
if cond {
 defer f2(b)
}
body...

When a defer statement is reached, the corresponding bit of deferBits is set to record that this defer, possibly a conditional one, was triggered.

deferBits := 0           // initial value 00000000
deferBits |= 1 << 0     // first defer reached: set to 00000001
tmpF1 = f1
tmpA = a
if cond {
    // if the second defer is reached, the value becomes 00000011; otherwise it stays 00000001
    deferBits |= 1 << 1
    tmpF2 = f2
    tmpB = b
}

Before the function returns, the exit code checks the defer bits in reverse order:

exit:
// check whether deferBits & 00000010 == 00000010
if deferBits & (1<<1) != 0 {
 deferBits &^= 1<<1
 tmpF2(tmpB)
}
// check whether deferBits & 00000001 == 00000001
if deferBits & (1<<0) != 0 {
 deferBits &^= 1<<0
 tmpF1(tmpA)
}

Before the function exits, each bit of deferBits is tested with a bitwise AND against its mask; if the bit is 1, the corresponding deferred function is executed.
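The pseudocode above can be turned into a runnable simulation. This is only a sketch of the idea under the assumption that f1 and f2 take a string argument; the run function and the way calls are recorded are hypothetical and are not what the compiler literally emits.

```go
package main

import "fmt"

// run simulates the code generated for
//
//	defer f1(a)
//	if cond { defer f2(b) }
//
// and returns the order in which the deferred calls ran.
func run(cond bool) []string {
	var calls []string
	f1 := func(s string) { calls = append(calls, "f1("+s+")") }
	f2 := func(s string) { calls = append(calls, "f2("+s+")") }

	var deferBits uint8

	// defer f1(a): set bit 0 and save the function and argument.
	deferBits |= 1 << 0
	tmpF1, tmpA := f1, "a"

	// if cond { defer f2(b) }: set bit 1 only when reached.
	var tmpF2 func(string)
	var tmpB string
	if cond {
		deferBits |= 1 << 1
		tmpF2, tmpB = f2, "b"
	}

	// exit: test the bits from highest to lowest.
	if deferBits&(1<<1) != 0 {
		deferBits &^= 1 << 1
		tmpF2(tmpB)
	}
	if deferBits&(1<<0) != 0 {
		deferBits &^= 1 << 0
		tmpF1(tmpA)
	}
	return calls
}

func main() {
	fmt.Println(run(true))  // prints "[f2(b) f1(a)]"
	fmt.Println(run(false)) // prints "[f1(a)]"
}
```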

Summary

This article explained the execution rules of defer and introduced the different kinds of defer. Its main purpose was to show how deferred function calls are made through heap allocation, using the function call process to answer questions such as how defer passes arguments, how multiple defer statements are executed, and in what order defer, return, and the return value take effect. We hope this analysis gives you a deeper understanding of defer.