I have run into many problems while designing and implementing an extension-development product built on Go’s native plugin mechanism. Since very little has been written on this subject, I would like to take this opportunity to give a rough summary, and I hope it helps you.

This article covers only the problems and their solutions; it does not walk through the code.

Some background knowledge

2.1 Runtime

In the world of programming languages, the concept of a “runtime” is usually associated with languages that need a virtual machine (VM). Such a program runs as two parts: the target code and the “virtual machine”. The most typical example is Java, i.e. Java class files + JRE. Languages that do not seem to need a “virtual machine” put less emphasis on the “runtime” concept, and a program appears to need only one part, the target code, to run. But in fact even C/C++ has a “runtime”: the OS and libraries of the platform it runs on.

Go looks similar at first glance: running a Go program does not require a JRE-like “runtime” to be deployed up front, so Go may not seem to have anything to do with “virtual machines” or “runtimes”.

But in fact Go does have a “runtime”; the toolchain builds it into every binary as part of the target code.

Figure: Java program, runtime, and OS relationship

Figure: C/C++ program, runtime, and OS relationship

Figure: Go program, runtime, and OS relationship

2.2 Go Native Plugin Mechanism

Because Go appears closely aligned with the C/C++ technology stack, the community has long wanted support for extensions in the style of dynamic link libraries. As shown in Figure 2-5, Go provides the plugin package in the standard library as the language-level programming interface for plugins. Under the hood, src/plugin essentially uses the cgo mechanism to call the standard Unix interfaces dlopen() and dlsym(). As such, it gives programmers with a C/C++ background the illusion that “I can do this”.
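As a rough sketch of that programming interface, the snippet below opens a plugin file and resolves one symbol with plugin.Open and Lookup from the standard library. The file name ./codec.so, the symbol name Encode, and its signature are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"plugin"
)

// lookupSymbol opens the shared object at path (dlopen under the hood) and
// resolves one exported symbol from it (dlsym under the hood).
func lookupSymbol(path, name string) (plugin.Symbol, error) {
	p, err := plugin.Open(path)
	if err != nil {
		return nil, fmt.Errorf("open %s: %w", path, err)
	}
	sym, err := p.Lookup(name)
	if err != nil {
		return nil, fmt.Errorf("lookup %s in %s: %w", name, path, err)
	}
	return sym, nil
}

func main() {
	// Hypothetical plugin file and symbol name, for illustration only.
	sym, err := lookupSymbol("./codec.so", "Encode")
	if err != nil {
		fmt.Println("plugin load failed:", err)
		return
	}
	// An exported function comes back as its func value, so a type
	// assertion to the expected (assumed) signature is required.
	encode, ok := sym.(func(string) string)
	if !ok {
		fmt.Println("Encode does not have the expected signature")
		return
	}
	fmt.Println(encode("hello"))
}
```

The consistency errors discussed in the rest of this article surface as the error returned by plugin.Open, before any symbol can be looked up.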

Figure: C/C++ programs load dynamic link libraries

Figure: Go programs load dynamic link libraries

Typical problems and their solutions

Unfortunately, although a Go plugin is also built as a dynamic link library file, unlike in the C/C++ technology stack it comes with a series of very complex built-in constraints on how plugins are developed and used. Even more frustrating, Go does not document these constraints systematically, and some of the design and implementation choices are rather poor, which makes troubleshooting plugin-related issues very painful. This chapter focuses on the most common problems in developing and using Go plugins, mainly those that arise when compiling and loading them, together with the corresponding solutions. To fully understand them, you have to dig into the Go source code (mainly the compiler, linker, package loader, and runtime).

In short, when a Go main program loads a plugin, the “runtime” performs a series of consistency checks on the two, including but not limited to the following.

  • the Go version is consistent
  • the GOPATH is consistent
  • the dependencies shared by both sides are consistent
    • their code is consistent
    • their paths are consistent
  • certain go build flags are consistent

3.1 Inconsistent standard library versions

When the main application loads the plugin, it fails with the following error.

plugin was built with a different version of package runtime/internal/sys 

From the text of this error, we can tell that the package in question is runtime/internal/sys, which is clearly part of Go’s standard library. Seeing this, you may well be puzzled: I compiled the main program and the plugin in the same local environment, so why does Go report that the standard library versions differ?

The answer is that Go’s error message is inaccurate. The root cause is that some key compilation flags differ between the main program and the plugin; it has nothing to do with the “version”.

For example, suppose you compile the plugin with the following command.

GO111MODULE=on go build --buildmode=plugin -mod readonly -o ./codec.so ./codec.go

But you debug the main program in GoLand’s debug mode, in which case GoLand assembles the build command for you, as in the example below.

/usr/local/go/bin/go test -c -o /private/var/folders/gy/2zv22t710sd7m0x9bcfzq23r0000gp/T/GoLand/___Test_TaskC_in_github_com_fdingiit_mpl_test.test -gcflags all=-N -l github.com/fdingiit/mpl/test #gosetup

Note that the command assembled by GoLand contains the critical -gcflags all=-N -l parameter (-N disables optimizations and -l disables inlining), whereas the command used to compile the plugin does not. In this case, you will get an error about runtime/internal/sys when trying to load the plugin.

Figure: Load failure due to inconsistent compile flags

The solution to this type of “standard library version inconsistency” is relatively simple: align the flags used to compile the main program and the plugins as closely as possible. In fact, some flags do not affect plugin loading, and you can work out which ones through practice.

3.2 Inconsistent third-party library versions

If you use vendor to manage Go dependencies, you are practically certain to hit the error below right after resolving the issue in 3.1.

plugin was built with a different version of package xxxxxxxx

Here xxxxxxxx stands for a specific third-party library, such as github.com/stretchr/testify. There are several typical causes for this error, and some of them can cost a developer a lot of time without prior troubleshooting experience.

3.2.1 Case 1: Version inconsistency

As the error message suggests, the obvious cause is that the main application and the plugin depend on different versions of the same third-party library, and the message tells you exactly which library is at fault. At this point, you can compare the go.mod files of the main program and the plugin and check whether the versions of that library match. If you find that the commit ID or tag really does differ, the solution is simple: align them.

But in many cases you use only part of a third-party library, such as one package or just one interface, and that part of the code does not change at all between versions. Changes to other code you never use still bump the version of the entire library, which makes you an innocent victim of the “version inconsistency”.

At this point, you may immediately run into another problem: which side should serve as the baseline for alignment, the main program or the plugin?

Common sense says that aligning to the main program is the better strategy: plugins are, after all, newly added “dependencies”, and the main program and plugins are usually in a one-to-many relationship. But what if the plugin’s third-party dependencies simply cannot be aligned with the main program, for whatever reason? After trying for a long time, I have not yet found a perfect solution to this problem.

If the versions cannot be aligned, the plugin approach has to be abandoned altogether.

On the one hand, Go’s strong consistency constraint on third-party libraries avoids potential runtime problems caused by version mismatches; on the other hand, this deliberate design gives programmers no flexibility and is very unfriendly to plugin, customization, and extension development.

Figure: Load failure due to inconsistent versions of shared third-party libraries

3.2.2 Case 2: Version number consistent, code inconsistent

Things get complicated when you check the go.mod files along the lines of 3.2.1 and are surprised to find that the versions of the library named in the error are identical. You might pull out the world’s most advanced text comparison tool and spend the morning diffing the commit IDs of all the third-party libraries, only to find that they are exactly the same, as if you were stuck with Schrödinger’s version.

One possible reason for this problem is that someone has directly modified the code in the vendor directory: the Go plugin mechanism checks the consistency of the code content itself, not just the version number.

This is a real headache and very hard to troubleshoot. Nobody knows about the change except the person who made it and those who have been bitten by it before. If the modified vendor code is in the main program, you have almost no reliable way to get things working.

Do not change code directly in vendor. Contribute the change back to the open source community, or fork the library and point to your fork with a replace directive.

The good news is that you do not need to solve this problem, because even if you do, bigger problems are waiting for you.

Figure: Load failure due to in-place modification of shared third-party library code

3.2.3 Case 3: Inconsistent paths

When you have worked through 3.2.1 and 3.2.2 and the loader still reports a different version of the package, you may start cursing Go’s plugin mechanism: the version really is the same and not a single line of code has been touched, so why is it still failing?

The reason is that the plugin mechanism also checks the “path” of the dependency’s source code, so you cannot use vendor to manage dependencies.

For example, suppose your main application’s source code is in the /path/to/main directory, so one of its third-party libraries lives in /path/to/main/vendor/some/third/party/lib. This “file path” data is packaged into the binary executable and used for verification. When the main application loads the plugin, Go’s “runtime” “smartly” concludes from the difference in “file paths” that the two are not using the same code, and reports a different version of the package.

Figure: Load failure caused by managing third-party libraries with the vendor mechanism

The same problem can occur when the main program and the plugins are compiled separately on different machines or by different users: the path to the Go code will differ between usernames.

The solution to this kind of problem is blunt: delete the vendor directories of the main program and the plugins, or use the -mod=readonly compile flag.

By this point, if you compile both the main program and the plugin on the same machine, the common problems should be largely solved and the plugin mechanism should work as intended. In addition, since vendor is no longer used to manage dependencies, the issue in 3.2.2 is forced to a resolution as well: either submit a PR to the community, or fork and replace.

Figure: Successfully loaded

3.3 Inconsistent Go versions

fatal error: runtime: no plugin module data

Besides the issues above, this error is common when the main program and the plugin are compiled separately on different machines. One possible cause is a Go version mismatch, and the fix is to align the versions. (But what if they cannot be aligned at the machine level?)

Figure: Load failure due to inconsistent Go versions

Unified Solutions

From 3.1 to 3.3, we looked at several issues that are difficult to troubleshoot and awkward to handle. There are other issues as well that were not covered here. For an officially supported extension mechanism of a programming language, it is really surprising to see it so unfriendly to users.

My team relies heavily on Go’s plugin mechanism for customization, so I had to come up with a systematic solution to all of these problems. After trying, without success, to modify the Go source code directly (a gripe: the source code of the Go plugin mechanism leaves something to be desired), I focused on the following areas.

  • Unified compilation environment.

    • Provide a standard Docker image for compiling the main application and plugins, which sidesteps problems caused by inconsistent Go versions, GOPATH paths, usernames, and so on.
    • Pre-populate go/pkg/mod in the image so that dependencies are not re-downloaded on every compile now that vendor mode is no longer used.
  • Unified Makefile.

    • Provide one set of Makefiles for compiling the main program and plugins, which sidesteps problems caused by inconsistent go build commands.
  • Unified plugin development scaffolding.

    • Let the scaffolding, rather than individual developers, align the plugin’s dependency versions with the main application and resolve other related issues.
  • ACI-ization.

    • Run the compilation process in ACI to further avoid errors.

Figure: Unified solutions

That concludes, for now, this introduction to the common problems with Go plugins and their solutions. I hope it has helped.

Bonus

If you really want to get to the root of the plugin checksum mechanism, here are some quick entry points for reading the source code. I am using the Go 1.15.2 source.

Locations of the relevant Go source code:

  • compiler

    • go/src/cmd/compile/*
  • linker

    • go/src/cmd/link/internal/ld/*
  • package loader

    • go/src/cmd/go/internal/load/*
  • runtime

    • go/src/runtime/*

5.1 What go build is really doing

You can add the -x parameter to the go build command to print the full process of compiling, linking, and packaging a Go program, for example:

go build -x -buildmode=plugin -o ../calc_plugin.so calc_plugin.go

5.2 Target code generation

  • go/src/cmd/compile/internal/gc/obj.go:55 : note lines 67 and 72; there are two entries here
  • go/src/cmd/compile/internal/gc/iexport.go:244 : note line 280, where path-related data is recorded

5.3 Library Hash Generation Algorithm

  • go/src/cmd/link/internal/ld/lib.go:967 : note lines 995 to 1025, where the hash of the package is calculated

5.4 Library Hash Checks

  • go/src/runtime/symtab.go:392 : key data structures
  • go/src/runtime/plugin.go:52 : checkpoint where the link-time hash and the run-time hash are compared
  • go/src/cmd/link/internal/ld/symtab.go:621 : link-time hash assignment point
  • go/src/cmd/link/internal/ld/symtab.go:521 : run-time hash assignment point
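To make the check easier to picture before opening those files, here is a simplified model of it, written from my reading of the runtime source rather than copied from it. Each package shared by the plugin and the host carries a hash recorded at link time plus a pointer to the hash of the same package in the running program; the first mismatch produces the familiar error text. The type and function names and the hash values are illustrative.

```go
package main

import "fmt"

// moduleHash models the per-package record: the hash the linker stored in
// the plugin, and a pointer to the hash of the same package as it exists
// in the already-running program.
type moduleHash struct {
	moduleName   string
	linktimeHash string
	runtimeHash  *string
}

// checkPackages returns the error text for the first package whose two
// hashes differ, or an empty string when every package matches.
func checkPackages(hashes []moduleHash) string {
	for _, h := range hashes {
		if h.linktimeHash != *h.runtimeHash {
			return "plugin was built with a different version of package " + h.moduleName
		}
	}
	return ""
}

func main() {
	// Hypothetical hash values: the host program and the plugin disagree.
	hostHash := "hash-A"
	records := []moduleHash{
		{moduleName: "github.com/stretchr/testify/assert", linktimeHash: "hash-B", runtimeHash: &hostHash},
	}
	fmt.Println(checkPackages(records))
}
```

Anything that feeds into the hash, such as the package source, its recorded path, or the compile flags, changes the result, which is why the three cases in section 3 all surface as the same message.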