Overview

If you write a lot of Go, you will see panic messages from time to time. The message usually makes it easy to find the offending line of code, and for exactly that reason we tend to overlook the rest of the output. This article explains how to read everything a Go program prints when it panics.

Analysis by example

We start with a very simple example below.

package main

import (
    "fmt"
)

type Person struct {
    name string
    age  int
}

func (person *Person) say(words []string) (ret int) {
    for i := range words {
        ret++
        fmt.Printf("%s say: %s", person.name, words[i])
    }
    return
}

func main() {
    var person *Person
    words := []string{
        "hello",
        "world",
    }
    person.say(words)
}

This example will panic as soon as it runs with the following message.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x493373]

goroutine 1 [running]:
main.(*Person).say(0x0, 0xc000072f58, 0x2, 0x2, 0x0)
    /home/jiangpengfei.jiangpf/projects/panic_demo/main.go:15 +0x43
main.main()
    /home/jiangpengfei.jiangpf/projects/panic_demo/main.go:26 +0x7d

  1. panic: runtime error: invalid memory address or nil pointer dereference. This line says that a panic occurred and gives its cause: the program accessed an invalid memory address or dereferenced a nil pointer.

  2. [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x493373]. SIGSEGV is the signal sent to a process when it makes an invalid memory reference (a segmentation fault). code=0x1 corresponds to SEGV_MAPERR, meaning the address is not mapped to any object. addr=0x8 is the address that was accessed; it is not a valid address, and such a small value is consistent with reading a field at offset 8 through a nil pointer. pc is the program counter at the time of the fault, that is, the address of the instruction that triggered it.

  3. goroutine 1 [running]. 1 is the ID of the goroutine, and running is the state the goroutine was in when the panic occurred.

  4. main.(*Person).say(0x0, 0xc000072f58, 0x2, 0x2, 0x0). main is the package, (*Person) is the receiver type, and say is the method. The output lists five values for the call to say, even though in the source code say is called with a single argument. The five values break down as follows.

    • The 1st parameter is receiver, in this case *Person. The value 0x0 means it is a null pointer.
    • The 2nd to 4th parameter is []string, and as we all know, the slice data structure is composed of three fields: pointer, len, cap. pointer(0xc000072f58) is a pointer to the actual memory address, len(0x2) is the length, and cap(0x2) is the capacity.
    • The 5th parameter is the return value.
  5. /home/jiangpengfei.jiangpf/projects/panic_demo/main.go:15 +0x43. This is the source file and line number of the failing code. The meaning of +0x43 is not widely documented, but the Go runtime code that prints it (go/src/runtime/traceback.go:439) makes it clear. There, frame is the current stack frame, f is the function the frame belongs to, and f.entry is the starting pc of that function. So +0x43 is the offset of the frame's pc from the entry address of the function.

    if frame.pc > f.entry {
        print(" +", hex(frame.pc-f.entry))
    }
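
The same offset can be computed from user code with the public runtime API. The following is a minimal sketch, not the runtime's own implementation; pcOffset and here are hypothetical helper names introduced for illustration. It uses runtime.FuncForPC and Func.Entry to reproduce the frame.pc - f.entry subtraction shown above.

```go
package main

import (
	"fmt"
	"runtime"
)

// pcOffset returns the name of the function containing pc and the
// distance from that function's entry address to pc. It mirrors the
// frame.pc - f.entry computation, using public API only.
func pcOffset(pc uintptr) (name string, offset uintptr) {
	f := runtime.FuncForPC(pc)
	if f == nil {
		return "", 0
	}
	return f.Name(), pc - f.Entry()
}

// here reports the function name and pc offset of its caller.
func here() (string, uintptr) {
	pc, _, _, ok := runtime.Caller(1)
	if !ok {
		return "", 0
	}
	return pcOffset(pc)
}

func main() {
	name, off := here()
	// Prints something shaped like the traceback suffix, e.g. "main.main +0x..".
	fmt.Printf("%s +%#x\n", name, off)
}
```

The exact offset depends on the compiler version and flags, so it is only meaningful relative to one specific binary.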
    

At this point, a simple panic example has been analyzed. But there are still two things we haven’t understood here.

  • How should the arguments shown in the panic output be interpreted? As mentioned above, a slice argument is printed as pointer, len, and cap, which means the output reflects the in-memory layout of Go types. What about other types?
  • How does the Go runtime collect and print all of this information during a panic?

Memory layout of go types

The main references for this section are: Go Data Structures and Go Data Structures: Interfaces. You can also read these two articles directly.

Basic types

The memory layout of basic types is well understood and will not be elaborated.

Structs and pointers

The memory layout of structures is in the order of the member variables. Of course, there are some memory alignment optimizations, but they are out of the scope of this article.

For example, the following Point structure.

type Point struct { X, Y int }

Its memory layout is simply two adjacent int-sized words: X followed by Y.

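For member variables that are not basic types, the layout depends on whether the field holds a value or a pointer.

type Rect1 struct { Min, Max Point }
type Rect2 struct { Min, Max *Point }

Rect1 stores its two Point values inline, so it occupies four consecutive int words (Min.X, Min.Y, Max.X, Max.Y). Rect2 stores only two pointers, each referring to a Point that lives elsewhere in memory.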
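
These layouts can be checked with unsafe.Sizeof and unsafe.Offsetof. The sketch below is only an illustration; sizes, intSize, and ptrSize are names introduced here, not part of the original article.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Point struct{ X, Y int }
type Rect1 struct{ Min, Max Point }
type Rect2 struct{ Min, Max *Point }

// Sizes of an int and of a pointer on the current platform.
const (
	intSize = unsafe.Sizeof(int(0))
	ptrSize = unsafe.Sizeof(uintptr(0))
)

// sizes returns the sizes in bytes of Point, Rect1 and Rect2.
func sizes() (point, rect1, rect2 uintptr) {
	return unsafe.Sizeof(Point{}), unsafe.Sizeof(Rect1{}), unsafe.Sizeof(Rect2{})
}

func main() {
	p, r1, r2 := sizes()
	fmt.Println("Point:", p, "Rect1:", r1, "Rect2:", r2)
	// Fields are laid out in declaration order.
	fmt.Println("offset of Point.Y:", unsafe.Offsetof(Point{}.Y))
	fmt.Println("offset of Rect1.Max:", unsafe.Offsetof(Rect1{}.Max))
	fmt.Println("offset of Rect2.Max:", unsafe.Offsetof(Rect2{}.Max))
}
```

Point is two ints wide, Rect1 is four ints wide because both Points are embedded by value, and Rect2 is only two pointers wide.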

String

A string is represented by two words: a pointer and a len. The pointer refers to the first byte of the underlying byte array. Because strings in Go are immutable, it is safe for several strings (for example, a string and a substring sliced from it) to share the same memory.
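
The two-word layout and the sharing of memory can be observed with the unsafe package. This is a sketch under an assumption: stringHeader is a hypothetical struct written here to mirror the pointer/len layout described above, and header and substringDelta are illustrative helpers.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringHeader mirrors the two-word string layout described above.
// It is an assumption made for illustration, not a type from the runtime API.
type stringHeader struct {
	data unsafe.Pointer
	len  int
}

// header reinterprets a string value as its two words.
func header(s string) stringHeader {
	return *(*stringHeader)(unsafe.Pointer(&s))
}

// substringDelta returns how far the data pointer of s[i:] is from
// the data pointer of s, in bytes.
func substringDelta(s string, i int) uintptr {
	return uintptr(header(s[i:]).data) - uintptr(header(s).data)
}

func main() {
	s := "hello, world"
	sub := s[7:] // "world"
	fmt.Println(header(s).len, header(sub).len) // 12 5
	// The substring points 7 bytes into the same bytes; nothing was copied.
	fmt.Println(substringDelta(s, 7)) // 7
}
```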

Slice

A slice is a reference to a section of an underlying array. It consists of three fields: pointer, len, and cap.
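
The three fields can be read back in the same way. As before, sliceHeader is a hypothetical struct that mirrors the described layout, and header is a helper written for this sketch.

```go
package main

import (
	"fmt"
	"unsafe"
)

// sliceHeader mirrors the three-word slice layout described above.
// It is an assumption made for illustration.
type sliceHeader struct {
	data unsafe.Pointer
	len  int
	cap  int
}

// header reinterprets a slice value as its three words.
func header(s []int) sliceHeader {
	return *(*sliceHeader)(unsafe.Pointer(&s))
}

func main() {
	b := make([]int, 3, 7)
	sub := b[1:3]

	hb, hsub := header(b), header(sub)
	fmt.Println(hb.len, hb.cap)     // 3 7
	fmt.Println(hsub.len, hsub.cap) // 2 6
	// The sub-slice points at b[1] in the same backing array.
	fmt.Println(hsub.data == unsafe.Pointer(&b[1])) // true
}
```

These are the same pointer, len, and cap values that appear as three consecutive words in the panic output for a slice argument.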

interface

(the following is based on 32-bit machines)

import "strconv"

type Stringer interface {
    String() string
}

// ToString converts a few known types to a string and falls back to "???".
func ToString(any interface{}) string {
    if v, ok := any.(Stringer); ok {
        return v.String()
    }
    switch v := any.(type) {
    case int:
        return strconv.Itoa(v)
    case float64:
        return strconv.FormatFloat(v, 'g', -1, 64)
    }
    return "???"
}

type Binary uint64

// String formats the value in base 2.
func (i Binary) String() string {
    return strconv.FormatUint(i.Get(), 2)
}

func (i Binary) Get() uint64 {
    return uint64(i)
}

The Stringer interface is defined above, and Binary implements it.

Suppose b is a Binary holding the value 200. In memory, b is a single 64-bit value, which takes two 32-bit words on this machine. After converting b to s of type Stringer, s is a two-word interface value made up of a tab pointer and a data pointer.

Here, tab points to an itable.

  • The type field of the itable points to the underlying concrete type (Binary), so s.tab->type gives the dynamic type stored in the interface.
  • The func array of the itable holds pointers only to the methods that satisfy Stringer. Get() is therefore not stored there.
  • Calling s.String() is equivalent to calling s.tab->func[0](s.data), with s.data passed as the first argument. Note that s.data has type *Binary, so func[0] is (*Binary).String rather than (Binary).String.

data is a pointer to Binary(200). Note that it points to a copy of b, not to the original b.

This layout is not used in every case, however. For example, if b is converted to an interface with no methods, a small memory optimization is possible.

In this case there is no func list, so there is no need to allocate a separate itable on the heap; any.type can point directly to the type. Similarly, if the value is exactly 32 bits wide (the machine word size here), it can be stored directly in data instead of behind a pointer.

Both optimizations can apply at once: an empty interface holding a word-sized value needs only the type pointer and the value itself.
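
Two of the points above can be checked from ordinary Go code: an interface value is two words wide, and it holds a copy of the value it was created from. The sketch below is an illustration only; convertThenModify and ifaceSizeInWords are helper names introduced here.

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

type Stringer interface{ String() string }

type Binary uint64

func (i Binary) String() string { return strconv.FormatUint(uint64(i), 2) }

// ifaceSizeInWords reports how many machine words a Stringer value occupies.
func ifaceSizeInWords() uintptr {
	var s Stringer
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

// convertThenModify stores b in an interface, then changes b, and
// returns what the interface reports afterwards.
func convertThenModify() string {
	b := Binary(200)
	var s Stringer = b
	b = 7 // changes only the original variable, not the copy held by s
	return s.String()
}

func main() {
	fmt.Println(ifaceSizeInWords())  // 2
	fmt.Println(convertThenModify()) // 11001000, i.e. 200 in base 2
}
```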

More arguments example analysis

With the memory layout of Go types in mind, let's look at the panic output for a function call with more arguments.

package main

//go:noinline
func test1(a int, b []int, c string, d [2]int64) error {
    panic("test1")
}

func main() {
    a := 0
    b := make([]int, 3, 7)
    b[0], b[1], b[2] = 1, 2, 3
    c := "c"
    d := [2]int64{5, 6}

    test1(a, b, c, d)
}

Note that //go:noinline is used here to prevent the Go compiler from inlining the function; if the call were inlined, the argument information would be lost from the panic output. The relevant line of the panic output is as follows.

main.test1(0x0, 0xc00003a740, 0x3, 0x7, 0x475848, 0x1, 0x5, 0x6, 0x4046eb, 0xc000058000)

The explanation is as follows.

main.test1(0x0, // value of a
           0xc00003a740, // b: pointer to the backing array
           0x3, // b: len
           0x7, // b: cap
           0x475848, // c: pointer to the string bytes
           0x1, // c: len
           0x5, // d[0]
           0x6, // d[1]
           0x4046eb, // error return value, first word (type); not meaningful here because it was never set
           0xc000058000) // error return value, second word (data); not meaningful here because it was never set
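
To see why the arguments flatten into exactly these words, the parameters can be placed in a struct and dumped word by word. This is a sketch under stated assumptions: it assumes a 64-bit platform and the stack-based argument layout reflected in the output above, and test1Args, words, and is64Bit are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"unsafe"
)

// test1Args lays out the parameters of test1 in declaration order.
// It is a stand-in for the argument area, not something the runtime defines.
type test1Args struct {
	a int
	b []int
	c string
	d [2]int64
}

// is64Bit reports whether a machine word is 8 bytes wide.
const is64Bit = unsafe.Sizeof(uintptr(0)) == 8

// words reinterprets the arguments as machine words.
// On a 64-bit platform the struct is exactly 8 words long.
func words(args *test1Args) [8]uintptr {
	return *(*[8]uintptr)(unsafe.Pointer(args))
}

func main() {
	b := make([]int, 3, 7)
	args := test1Args{a: 0, b: b, c: "c", d: [2]int64{5, 6}}
	for i, w := range words(&args) {
		fmt.Printf("word %d: %#x\n", i, w)
	}
}
```

On a 64-bit machine, words 0, 2, 3, 5, 6 and 7 are 0x0, 0x3, 0x7, 0x1, 0x5 and 0x6, matching the trace; words 1 and 4 are addresses and differ from run to run.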

To see the error return value without compiler optimizations and inlining, run the program as follows.

go run -gcflags '-N -l' main.go

In this case, the last two values are both 0x0. Here is one more example, this time involving a struct, a pointer, and an interface:

package main

type speaker interface {
    say()
}

type person struct {
    name string
    age  int
}

func (p person) say() {
}

//go:noinline
func test2(p1 person, p2 *person, p3 speaker) error {
    panic("test2")
    return nil
}

func main() {
    p1 := person{"p", 11}
    p2 := new(person)
    p3 := speaker(p1)

    test2(p1, p2, p3)
}

The panic message after running with go run -gcflags '-N -l' main.go is as follows.

main.test2(0x475a48, 0x1, 0xb, 0xc00003a748, 0x487200, 0xc00003a730, 0x0, 0x0)

The explanation is as follows.

main.test2(0x475a48, // p1.name: pointer to the string bytes
           0x1, // p1.name: len
           0xb, // p1.age (11)
           0xc00003a748, // p2: pointer value
           0x487200, // p3: itable pointer (first word of the interface)
           0xc00003a730, // p3: data pointer, pointing to a copy of p1
           0x0, // error return value: type word
           0x0) // error return value: data word
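
The last two words make sense once an error is viewed as a two-word interface value. The following sketch, which uses the hypothetical helper ifaceWords and the sample value errBoom, reinterprets an error as two machine words to show that a nil error is two zero words.

```go
package main

import (
	"errors"
	"fmt"
	"unsafe"
)

// errBoom is a sample non-nil error used for comparison.
var errBoom = errors.New("boom")

// ifaceWords returns the two machine words that make up an error
// interface value: the type (itable) word and the data word.
func ifaceWords(err error) [2]uintptr {
	return *(*[2]uintptr)(unsafe.Pointer(&err))
}

func main() {
	fmt.Printf("%#x\n", ifaceWords(nil))     // both words are zero
	fmt.Printf("%#x\n", ifaceWords(errBoom)) // both words are non-zero addresses
}
```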

Execution process after a go panic

When a panic occurs on a goroutine, the goroutine enters the function exit phase. Take an explicit call to panic as an example.

  • First, the Go runtime puts the new panic at the head of the goroutine's panic chain.

    var p _panic
    p.arg = e       // arg is an argument to panic
    p.link = gp._panic // link points to an earlier panic
    gp._panic = (*_panic)(noescape(unsafe.Pointer(&p))) // gp is the current goroutine
    
  • Then the goroutine's deferred calls are executed one by one.

    for {
        d := gp._defer
        if d == nil {
            break
        }
        ...
    }
    
  • If a deferred call was already started by an earlier panic or Goexit, it is taken off the list and skipped.

    // If defer was started by earlier panic or Goexit (and, since we're back here, that triggered a new panic),
    // take defer off list. The earlier panic or Goexit will not continue running.
    if d.started {
        if d._panic != nil {
            d._panic.aborted = true
        }
        d._panic = nil
        d.fn = nil
        gp._defer = d.link
        freedefer(d)
        continue
    }
    
  • Otherwise, the deferred function is called.

    reflectcall(nil, unsafe.Pointer(d.fn), deferArgs(d), uint32(d.siz), uint32(d.siz))
    
  • If the deferred function calls recover(), the following code runs as well. It only checks whether the recover() call is valid, using three conditions:

    • p != nil. A panic is actually in progress.

    • !p.recovered. The current panic has not already been recovered.

    • argp == uintptr(p.argp). argp is the argument pointer reported by the caller of recover(), and p.argp is the argument pointer of the topmost deferred function call. The two match only when recover() is called directly by the deferred function.

    • If any of these conditions is not met, nil is returned, indicating that this recover() call is invalid.

      func gorecover(argp uintptr) interface{} {
          // Must be in a function running as part of a deferred call during the panic.
          // Must be called from the topmost function of the call
          // (the function used in the defer statement).
          // p.argp is the argument pointer of that topmost deferred function call.
          // Compare against argp reported by caller.
          // If they match, the caller is the one who can recover.
          gp := getg()
          p := gp._panic
          if p != nil && !p.recovered && argp == uintptr(p.argp) {
              p.recovered = true
              return p.arg
          }
          return nil
      }
      
  • The recover() call in the previous step contains no recovery logic of its own; it only sets recovered=true on the current panic so that the check below succeeds once the deferred function returns. The actual recovery is carried out by mcall(recovery).

    if p.recovered {
        atomic.Xadd(&runningPanicDefers, -1)

        gp._panic = p.link
        // Aborted panics are marked but remain on the g.panic list.
        // Remove them from the list.
        for gp._panic != nil && gp._panic.aborted {
            gp._panic = gp._panic.link
        }
        if gp._panic == nil { // must be done with signal
            gp.sig = 0
        }
        // Pass information about recovering frame to recovery.
        gp.sigcode0 = uintptr(sp)
        gp.sigcode1 = pc
        mcall(recovery)
        throw("recovery failed") // mcall should not return
    }
    
  • recovery is implemented as follows.

    // Unwind the stack after a deferred function calls recover
    // after a panic. Then arrange to continue running as though
    // the caller of the deferred function returned normally.
    func recovery(gp *g) {
        // Info about defer passed in G struct.
        sp := gp.sigcode0
        pc := gp.sigcode1

        // d's arguments need to be in the stack.
        if sp != 0 && (sp < gp.stack.lo || gp.stack.hi < sp) {
            print("recover: ", hex(sp), " not in [", hex(gp.stack.lo), ", ", hex(gp.stack.hi), "]\n")
            throw("bad recovery")
        }

        // Make the deferproc for this d return again,
        // this time returning 1.  The calling function will
        // jump to the standard return epilogue.
        gp.sched.sp = sp
        gp.sched.pc = pc
        gp.sched.lr = 0
        gp.sched.ret = 1
        gogo(&gp.sched)
    }
    
  • If no deferred function recovers the panic, execution reaches the code that prints the panic message.

    // ran out of deferred calls - old-school panic now
    // Because it is unsafe to call arbitrary user code after freezing
    // the world, we call preprintpanics to invoke all necessary Error
    // and String methods to prepare the panic strings before startpanic.
    preprintpanics(gp._panic)
    
    fatalpanic(gp._panic) // should not return
    
  • preprintpanics prepares the information to be printed. If the panic argument is an error, the message is obtained from v.Error(). If the argument implements stringer, String() is called to obtain the text.

    // Call all Error and String methods before freezing the world.
    // Used when crashing with panicking.
    func preprintpanics(p *_panic) {
        defer func() {
            if recover() != nil {
                throw("panic while printing panic value")
            }
        }()
        for p != nil {
            switch v := p.arg.(type) {
            case error:
                p.arg = v.Error()
            case stringer:
                p.arg = v.String()
            }
            p = p.link
        }
    }
    
  • fatalpanic then starts the final output. It first calls printpanics, which recurses through the panic chain and prints the argument of each panic, oldest first.

    // Print all currently active panics. Used when crashing.
    // Should only be called after preprintpanics.
    func printpanics(p *_panic) {
        if p.link != nil {
            printpanics(p.link)
            print("\t")
        }
        print("panic: ")
        printany(p.arg)
        if p.recovered {
            print(" [recovered]")
        }
        print("\n")
    }
    
  • Finally, dopanic_m is called to print the call stack, and the program terminates with exit(2).
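
The user-visible effects of the steps above can be observed in a small program. The sketch below is an illustration written for this article, not runtime code; run and indirectRecover are hypothetical names. It shows that deferred calls run in last-in, first-out order during a panic, and that recover() is only valid when called directly by the deferred function, which is what the argp check in gorecover enforces.

```go
package main

import "fmt"

// indirectRecover calls recover one frame below the deferred function,
// so the call is not valid and returns nil.
func indirectRecover() interface{} {
	return recover()
}

// run panics and records what each deferred call observes.
func run() (log []string) {
	defer func() {
		// Called directly by the deferred function: a valid recover.
		if r := recover(); r != nil {
			log = append(log, fmt.Sprintf("recovered: %v", r))
		}
	}()
	defer func() {
		// recover is called by indirectRecover, not by this function.
		if indirectRecover() == nil {
			log = append(log, "indirect recover returned nil")
		}
	}()
	defer func() {
		log = append(log, "deferred last, runs first")
	}()
	panic("boom")
}

func main() {
	for _, line := range run() {
		fmt.Println(line)
	}
}
```

Running it prints "deferred last, runs first", then "indirect recover returned nil", then "recovered: boom". Because the panic is recovered, run returns normally and no traceback is printed.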