The Go language has a built-in runtime (that is, runtime) that abandons the traditional way of allocating memory in favor of autonomous management. This allows for autonomous implementation of better memory usage patterns, such as memory pooling, pre-allocation, etc. This way, you don’t need to make a system call for every memory allocation.
The Golang runtime memory allocation algorithm is derived from Google’s TCMalloc algorithm for C. The core idea is to reduce the granularity of locks by dividing memory into multiple levels of management. It manages the available heap memory in a two-level allocation: each thread maintains a separate memory pool of its own and allocates memory from that pool first, and only applies to the global memory pool when the pool is insufficient to avoid frequent competition between different threads for the global memory pool.
When Go starts a program, it first requests a block of memory from the operating system (note that this is still just a virtual address space and does not actually allocate memory), cuts it into small chunks and then manages it itself.
The requested memory block is allocated three areas, 512MB, 16GB, and 512GB in size on the X64.
arena is what we call the heap area. Go dynamically allocates memory in this area, which divides the memory into
8KB sized pages, some of which are combined and called
bitmap area identifies which addresses in the
arena area hold objects, and uses the
4bit flag bit to indicate whether the object contains pointer,
GC tag information. One
byte size of memory in
bitmap corresponds to 4 pointer size (8B pointer size) of memory in the
arena area, so the size of
bitmap area is
From the above figure, you can actually see that the high address part of the bitmap points to the low address part of the arena area, which means that the address of the bitmap grows from the high address to the low address.
spans area holds the pointers to the
mspan (that is, the basic unit of memory management combined with some pages of the
arena partition, which will be discussed later), and each pointer corresponds to a page, so the size of the
spans area is
512GB/8KB*8B=512MB. Dividing by 8KB is to calculate the number of pages in the
arena area, and multiplying by 8 is to calculate the size of all the pointers in the
spans area. When creating an
mspan, the corresponding
spans area is filled by page, and when recycling the
mspan it belongs to can be easily found based on its address.
Memory management unit
mspan: The basic memory management unit in Go, a large block of memory consisting of a contiguous
8KB page. Note that a page here is not the same thing as an OS page, which is typically several times the size of an OS page. In a nutshell:
mspan is a Doubly linked list containing the starting address, the
mspan specification, the number of pages, etc.
mspan is divided into several
objects according to the size of its own property
Size Class, and each
object can store one object. A bitmap is used to mark the
objects that are not yet used. The property
Size Class determines the size of the
mspan will only be assigned to objects that are close to the size of the
object, but, of course, smaller than the size of the
object. There is another concept:
Span Class, which has a similar meaning to
This is because there are actually two
Size Class, which means there are two
Span Classs. One of them is allocated to objects that contain pointers, and the other one is allocated to objects that do not contain pointers. This will bring benefits to the garbage collection mechanism, which will be discussed in a later article.
As shown below,
mspan consists of a set of consecutive pages divided into
objects of a certain size.
Size Class has 67 kinds, each
mspan partitioned object size is a multiple of 8*2n, this is written dead in the code.
According to the
Size Class of
mspan, you can get the size of the
object it divides. For example, if
Size Class is equal to 3,
object size is 32B. An object of size 32B can store objects in the size range of 17B to 32B. For tiny objects (less than 16B), the allocator will merge them and allocate several objects into the same
The maximum number in the array is 32768, that is, 32KB, more than this size is a large object, it will be treated specially, this will be introduced later. By the way, a type
Size Class of 0 indicates a large object, which is actually allocated directly from heap memory, while small objects are allocated through
For mspan, its
Size Class determines the number of pages it can be allocated, which is also written dead in the code.
For example, when we want to request a
32B, the corresponding index in class_to_size is 3, and the index 3 corresponds to the number of pages in the
class_to_allocnpages array is 1.
mspan structure definition.
mspan into a larger perspective.
The above figure shows that there are two
Ss pointing to the same
mspan, because the
P to which these two
Ss point belongs to the same
mspan. So, the
S pointing to it can be quickly found by the address on the
arena, and the
mspan can be found by the
S, recalling what we said earlier about each pointer to the
mspan area corresponding to a page.
Assuming that the
Size Class of the first
mspan on the far left is equal to 10, according to the
class_to_size array, the
object size of this
msapn partition is 144B, and the number of objects that can be allocated is
8KB/144B=56.89, rounded up to 56, so there will be some memory wasted. Go’s source code has the size of all
mspan wasted memory of
Size Class; then according to the
class_to_allocnpages array, we get this
mspan consists of only 1
page; suppose this
mspan is allocated to pointerless objects, then
spanClass is equal to 20.
startAddr points directly to a location in the
arena region, indicating the starting address of this
allocBits points to a bitmap, each representing whether a block has been allocated or not, and
allocCount indicates the total number of objects allocated.
Thus, the parameters of each field of the first
mspan from the left are as follows.
Memory Management Components
Memory allocation is done by the memory allocator. The allocator consists of 3 components:
mcache : Each worker thread is bound to an mcache, which locally caches available
mspan resources so that they can be allocated directly to Goroutines, as there is no competition from multiple Goroutines, so no lock resources are consumed.
Structure definition of
Span Classes as an index to manage multiple
mspan for allocation, and it contains all sizes of
mspan. It is twice as large as
_NumSizeClasses, which is
67*2=134. Why there is a twofold relationship, as we mentioned earlier: to speed up memory reclamation afterwards, half of the objects allocated in
mspan in the array do not contain pointers, and the other half contain pointers.
mspan of a pointerless object is garbage collected without further scanning for references to other active objects. More on this in a later garbage collection article, but this time we’ll stop here.
mcache is initialized without any
mspan resources and is dynamically requested from
mcentral during use and cached afterwards. When the object is less than or equal to 32KB in size,
mcache is allocated using the corresponding size of
mcentral: Provides a sliced
mspan resource for all
central holds a global
mspan list of a specific size, both allocated and unallocated. Each
mcentral corresponds to one type of
mspan, and the type of
mspan causes it to divide
objects of different sizes. When there is no suitable (i.e., specific size)
mspan in the worker thread’s
mcache, it is fetched from
mcentral is shared by all worker threads, and there is competition from multiple Goroutines, so it consumes lock resources. Structure definition.
empty means that all the
mspan in this linked list are either allocated
mspan that have already been taken by
cache, and this
mspan is exclusive to that worker thread. And
nonempty indicates a list of
mspans that have free objects. Each
central structure is maintained in
To briefly explain the flow of
mcache fetching and returning
- Get Locks; finds an available
nonemptylinked list; removes it from the
nonemptylinked list; adds the retrieved
emptylinked list; returns the
mspanto the worker thread; unlocks it.
- Return lock; remove
emptylinked list; add
nonemptylinked list; unlock.
mheap: Represents all the heap space held by the Go program, which uses a global object
mheap to manage heap memory.
mcentral does not have a free
mspan, it requests it from
mheap. And when
mheap runs out of resources, it requests new memory from the OS.
mheap is mainly used for memory allocation for large objects and for managing uncut
mspan, which is used to cut
mcentral into smaller objects.
Also we see that
mcentral of all sizes, so when an
mcentral, it only needs to use a lock in the separate
mcentral and does not affect requests for
mspan of other sizes.
mheap structure definition.
Above we see that the bitmap and arena_start point to the same address. This is because the address of the bitmap grows from high to low, so they point to the same memory location.
Whether a variable is allocated on the stack or on the heap is determined by the result of the escape analysis. Usually, the compiler prefers to allocate variables on the stack because it has less overhead, and the most extreme is “zero garbage”, where all variables are allocated on the stack so that there is no memory fragmentation, garbage collection, or anything like that.
Go’s memory allocator allocates objects in three categories based on their size: small objects (less than or equal to 16B), general objects (greater than 16B, less than or equal to 32KB), and large objects (greater than 32KB).
The general allocation process.
- 32KB objects are allocated directly from the mheap.
- <=16B objects allocated using the tiny allocator of mcache.
- objects of [16B,32KB], first calculate the specification size of the object, and then allocate it using the mspan of the corresponding specification size in the mcache.
- If mcache does not have mspan of the corresponding specification size, apply to mcentral
- If mcentral does not have an mspan of the corresponding specification size, apply to mheap
- If there is no mspan of the appropriate size in mheap either, apply to the operating system
- Go requests a large chunk of memory from the operating system at program startup and manages it on its own afterwards.
- The basic unit of Go memory management is the mspan, which consists of several pages, each mspan can allocate a specific size of object.
- mcache, mcentral, and mheap are the three major components of Go memory management in a hierarchy. mcache manages the mspan that threads cache locally; mcentral manages the global mspan for use by all threads; and mheap manages all of Go’s dynamically allocated memory.
- Very small objects are allocated in an object to save resources and use tiny allocator to allocate memory; generally small objects are allocated memory by mspan; and large objects are allocated memory directly by mheap.