Whenever we write a Go program or library that goes beyond hello world, we tend to linger in front of three “thresholds” (project structure, code style, and identifier naming) for a long time, often without reaching a satisfactory answer.
This article explains in detail how to cross the project-structure “threshold”, so that you can get to the core of the Go language faster and learn more efficiently.
For anything beyond hello world, we have to decide what project structure to use (a Go project usually corresponds to one repository). The choice matters in Go because the project structure determines the layout of the packages in the project and the dependencies among them, and it also affects how external projects import and depend on those packages.
1. The project structure of the Go project
Let’s take a look at the project structure of the Go language itself, the world’s first Go project.
The project structure of the Go project has been very stable since the release of version 1.0, and its top-level structure has remained largely unchanged. As of Go project commit 1e3ffb0c (2019-05-14), the structure of the Go project is as follows.
As the “founding project” of Go, its layout is an important reference for later Go projects. In particular, the structure under the src directory of the early Go project was widely adopted by the Go community as a template for Go application projects. Let’s take the structure under the src directory of the early Go 1.3 version as an example.
Regarding the structure under the src directory, I have summarized three features.
- The script source files for building the code are placed in the top-level directory under src.
- The second-level directory cmd under src holds one directory per Go toolchain executable (e.g. go, gofmt), each containing the source files of its main package.
- The second-level directory pkg under src holds the source files of the packages that the toolchain programs under cmd depend on, along with the Go runtime and the Go standard library.
Several structural changes have occurred in the src directory of the Go project since Go 1.3.
- Go 1.4 removes the pkg level from src/pkg/xxx in the Go source tree and instead uses src/xxx directly.
- Go 1.4 added the internal directory under src to hold packages that cannot be imported from outside and are used only within the Go project itself.
- Go 1.6 added the vendor directory under src, but the Go project itself only enabled the vendor mechanism in Go 1.7. The vendor directory holds the Go project’s own dependencies on external projects, mainly packages under golang.org/x such as net, text, and crypto. The packages in this directory are updated with each Go release.
- Go 1.13 added go.mod and go.sum under src, completing the Go project’s own migration to go modules. All packages in the Go project were placed in a module named std, whose dependencies are still the various packages under golang.org/x.
Here is the complete layout of the src directory in the Go 1.16 release, the latest at the time of writing.
2. Typical project structure for Go language
(1) Minimum standard layout of Go project structure
The Go team has never published an official reference for what the standard layout of a Go application project should look like. However, Russ Cox, the technical leader of the Go language project, gave his thoughts on the minimum standard layout of a Go project in an issue of an open source project. He believes that the minimum standard layout of a Go project should be as follows.
Directories such as pkg, cmd, and docs should not be part of the standard structure of a Go project, or at least should not be required. I believe the minimum standard layout given by Russ Cox is consistent with Go’s philosophy of “simplicity”, and it is flexible enough to meet the needs of a wide variety of Go projects.
However, before Russ Cox described this minimum standard, the Go community had no standard at all, and the early structure of the Go project itself still has a lasting influence on a large number of existing open source Go projects. For larger Go applications, we are bound to extend the “minimum standard layout”. The extension should not be arbitrary; it still draws on the structure of the Go project itself, which leads to the following unofficial suggested layouts.
(2) Go project structure for the purpose of building binary executables
Based on the early structure of the Go project itself and its subsequent evolution, the Go community has gradually settled on a typical project structure after years of practice. It is compatible with Russ Cox’s minimum standard layout and is shown in Figure 1.
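A layout is easiest to grasp as a concrete tree, so here is a minimal sketch that creates an empty skeleton of such a layout in a temporary directory and prints its entries. The entries cmd/app1, cmd/app2, pkg, vendor, Makefile, go.mod, and go.sum come from the descriptions in this section; the package names lib1 and lib2 under pkg, the file names, and the helper names scaffold, list, and demo are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// layout lists the skeleton entries relative to the project root.
// A trailing slash marks a directory; every other entry is an empty file.
var layout = []string{
	"cmd/app1/main.go",
	"cmd/app2/main.go",
	"pkg/lib1/lib1.go", // hypothetical package name
	"pkg/lib2/lib2.go", // hypothetical package name
	"vendor/",          // optional directory
	"Makefile",
	"go.mod",
	"go.sum",
}

// scaffold creates the skeleton under root.
func scaffold(root string) error {
	for _, entry := range layout {
		path := filepath.Join(root, filepath.FromSlash(entry))
		if strings.HasSuffix(entry, "/") {
			if err := os.MkdirAll(path, 0o755); err != nil {
				return err
			}
			continue
		}
		if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
			return err
		}
		if err := os.WriteFile(path, nil, 0o644); err != nil {
			return err
		}
	}
	return nil
}

// list returns every path under root, slash-separated and relative to root.
func list(root string) ([]string, error) {
	var paths []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		if rel != "." {
			paths = append(paths, filepath.ToSlash(rel))
		}
		return nil
	})
	return paths, err
}

// demo builds the skeleton in a temporary directory, lists it, and cleans up.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "goproj-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	if err := scaffold(root); err != nil {
		return nil, err
	}
	return list(root)
}

func main() {
	paths, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

The sketch needs Go 1.16 or later because it uses os.MkdirTemp, os.WriteFile, and filepath.WalkDir.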
The above is the structure of a typical Go project that builds binary executables (under cmd). Let’s look at the purpose of each important directory in turn.
cmd directory: stores the source files of the main packages for the executables that the project builds. If there are multiple executables, the main package of each one is placed in its own subdirectory, such as app1 and app2 in the figure. The main package of each app under cmd ties the dependencies of the whole project together, and generally speaking it should be very concise: it parses command-line arguments, initializes resources, logging facilities, database connections, and so on, and then hands control of the program over to a higher-level execution control object. Some Go projects rename cmd to app, but its purpose does not change.
pkg directory: stores the library packages that the project itself uses and that the main packages of the executables depend on. Packages in this directory can also be imported by external projects, so pkg acts as the “aggregation” point for the packages the project exports. Some projects rename pkg to lib, but the purpose of the directory remains the same. Because the Go project itself removed the pkg directory in version 1.4, some projects place their packages directly under the project root. For larger projects, however, I think too many packages make the top-level directory “crowded” and less concise, so for complex Go projects I suggest keeping the pkg directory.
Makefile: the Makefile here stands for the scripts of whatever build tool the project uses, including any third-party build tool. For larger projects, a build script of this kind seems to be indispensable. In a typical Go project, the build scripts are usually placed in the top-level project directory, like the Makefile here; for projects with many build scripts, you can also create a build directory and put the rule and property files and the sub-build scripts into it.
go.mod and go.sum: the configuration files used for Go package dependency management. go modules was introduced in Go 1.11, and module mode became the default dependency management and build mechanism in Go 1.16, so for new Go projects we recommend using go modules. For projects that do not use go modules (probably mainly projects that use Go versions earlier than 1.11), these files can be replaced with dep’s Gopkg.toml and Gopkg.lock, glide’s glide.yaml and glide.lock, and so on.
vendor directory (optional): vendor is a mechanism introduced in Go 1.5 for caching specific versions of dependency packages locally in the project. Before go modules existed, vendor made reproducible builds possible, ensuring that executables built from the same source code were equivalent. go modules can produce reproducible builds without vendor, but it still supports the vendor directory (go mod vendor generates the dependencies under vendor; go build -mod=vendor enables vendor-based builds), so the vendor directory is optional here. Generally we keep a vendor directory only in the root of the project; otherwise it causes unnecessary complexity in dependency selection.
Go 1.11 introduced the module: a collection of packages that are versioned together as one unit. Although Go supports multiple modules in one project/repository, managing them this way may introduce more complexity than simply tolerating some code duplication. Therefore, if the project structure contains versioning “divergences”, e.g. the releases of app1 and app2 are not always in sync, I recommend splitting the project into multiple projects (repositories), each of which is a separate module that is versioned and evolves on its own.
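To make the description of the cmd directory above more concrete, here is a minimal sketch of what a concise main package such as cmd/app1/main.go might look like: it parses command-line flags, initializes a logger, and hands control to an execution control object. The flag names, the config and app types, and the helper functions are all hypothetical; in a real project the app type would normally live in a package under pkg or internal rather than in main.go.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"log"
	"os"
)

// config holds the command-line settings (hypothetical fields).
type config struct {
	addr    string
	verbose bool
}

// parseFlags parses args (without the program name) into a config.
func parseFlags(args []string) (config, error) {
	var cfg config
	fs := flag.NewFlagSet("app1", flag.ContinueOnError)
	fs.StringVar(&cfg.addr, "addr", ":8080", "listen address")
	fs.BoolVar(&cfg.verbose, "verbose", false, "enable verbose logging")
	err := fs.Parse(args)
	return cfg, err
}

// app stands in for the "execution control object" that main hands over to.
type app struct {
	cfg    config
	logger *log.Logger
}

// newApp wires up the dependencies that run needs.
func newApp(cfg config, out io.Writer) *app {
	return &app{cfg: cfg, logger: log.New(out, "app1: ", log.LstdFlags)}
}

// run is where the real work of the program would begin.
func (a *app) run() error {
	a.logger.Printf("starting with addr=%s verbose=%t", a.cfg.addr, a.cfg.verbose)
	return nil
}

func main() {
	cfg, err := parseFlags(os.Args[1:])
	if err != nil {
		os.Exit(2) // the flag package has already reported the problem
	}
	if err := newApp(cfg, os.Stderr).run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Keeping main this thin means that almost all of the program's behavior lives in ordinary packages, which can be imported and tested without going through the executable.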
(3) Go project structure for library-only building
When Go 1.4 was released, the Go language project itself removed the pkg layer under src. This structural change has some impact on the structure of Go library-type projects that are built for library-only purposes. Let’s look at the structure of a typical Go library type project, see Figure 2.
We see that the library type project structure is also compatible with the minimum standard layout of a Go project, but is simpler than a Go project that aims to build binary executables.
- The cmd and pkg subdirectories have been removed: since the project is only a component library, there is no need for a cmd directory to hold the main package source files of binaries; and since the whole purpose of a Go library project is to expose its API to the outside world (as open source or within an organization), there is no need to aggregate the packages under a pkg directory.
- The vendor directory is no longer included, even as an option: for library projects, we do not recommend placing a vendor directory in the project to cache the library’s own third-party dependencies. A library project should only declare the modules or packages it depends on, together with their version requirements, through go.mod (or the manifest file of another dependency management tool).
(4) About the internal directory
For either type of Go project above, packages that you do not want to expose to external importers, and that are meant only for internal use, can be hidden with the internal package mechanism introduced in Go 1.4. For library projects, the easiest way to do this is to add an internal directory at the top level and put all packages that should not be exposed into it, such as ilib1 and ilib2 in the project structure below.
According to the rules of the internal mechanism, ilib1 and ilib2 in the internal directory can be imported and used by code anywhere in the tree rooted at the GoLibProj directory (such as lib.go, lib1/lib1.go, etc.), but not by code outside the GoLibProj directory. This allows API packages to be exposed selectively. Of course, internal can be placed at any directory level in the project structure; what matters is that the designer of the project structure is clear about what should be exposed to outside code and what should be used only by sibling directories or subdirectories.
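The visibility rule can be modeled in a few lines of Go. The sketch below is a simplified model for illustration only, not the go command's actual implementation: it works on import paths, takes the last internal element of the imported path, and allows the import only if the importing package lies in the tree rooted at the parent of that internal directory. The module path example.com/GoLibProj and the function names internalParent and canImport are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// internalParent returns the path of the directory that contains the last
// "internal" element of importPath, and whether such an element exists.
func internalParent(importPath string) (string, bool) {
	elems := strings.Split(importPath, "/")
	for i := len(elems) - 1; i >= 0; i-- {
		if elems[i] == "internal" {
			return strings.Join(elems[:i], "/"), true
		}
	}
	return "", false
}

// canImport reports whether the package with import path importer may import
// importPath under this simplified model of the internal rule.
// Top-level "internal/..." paths, as used inside the standard library,
// are not modeled here and are always rejected.
func canImport(importer, importPath string) bool {
	parent, ok := internalParent(importPath)
	if !ok {
		return true // no internal element, so no restriction applies
	}
	return importer == parent || strings.HasPrefix(importer, parent+"/")
}

func main() {
	const ilib1 = "example.com/GoLibProj/internal/ilib1"

	fmt.Println(canImport("example.com/GoLibProj", ilib1))      // true: root package (lib.go)
	fmt.Println(canImport("example.com/GoLibProj/lib1", ilib1)) // true: inside the GoLibProj tree
	fmt.Println(canImport("example.com/OtherProj/app", ilib1))  // false: outside the GoLibProj tree
}
```

The same check explains why internal can sit at any level: an internal directory placed under lib1 would restrict its packages to code in the tree rooted at lib1.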
For projects that aim to build binary executable types, we can also aggregate packages that we don’t want to expose to the outside under internal at the top-level path of the project, echoing the aggregation directory pkg for packages that are exposed to the outside.
This article has traced the history of the Go project’s own structural layout and described the de facto standards for Go project structures. The two reference structures presented here, one for projects that build binary executables and one for library projects, have been recognized and widely used by the Go community after years of practice. They are compatible with the minimum standard layout proposed by Russ Cox and are valuable for slightly larger Go projects. However, they are not mandatory: in the early days of Go, placing all source files in a single package at the root of the project worked equally well for some small projects.
For projects that build binary executables, dropping the pkg level is also the layout of choice for many projects, influenced by the Go 1.4 project structure.
The reference project structures above are similar to the idea of a “minimum viable product” (MVP) in product design and development: developers can start from this minimal “project structure core” and extend it according to their actual needs.