1. Technical background
1.1 Dynamic linking technology for programs
In practice, we often need to update a program's functionality dynamically, or to add or update program modules without changing the main program files.
1.1.1 Dynamic Link Library
The first and most common is the dynamic link library (DLL) supported by the Windows platform, usually with the suffix .dll. Its advantages are clear:
- Multiple programs can share code and data. That is, multiple programs load the same DLL file.
- It is natural to divide the program into several modules. Each module is output as a separate DLL file, which is loaded and executed by the main program.
- Cross-language calling. Since DLL files are language-independent, a DLL file can be loaded and executed by multiple programming languages.
- Easy to update. When the program is updated, only the DLL file of the affected module needs to be replaced; the whole program does not have to be redeployed.
- Makes hot updates technically possible. Dynamic link libraries can be loaded and unloaded programmatically, so a module can be updated without restarting the program.
- Provides programming interfaces to other programs. The calling interface of your program can be packaged as a DLL file for other programs to call.
1.1.2 Dynamic shared objects
On Linux, this technology is called dynamic shared objects (DSO), and the files commonly carry the suffix .so.
In addition to the above-mentioned advantages of “dynamically linked libraries”, dynamic shared objects can also solve the underlying interface compatibility problems caused by the openness of Linux. That is, the dynamic shared object encapsulates the underlying interface of the operating system and provides a unified call interface for upper-level applications to call. It is equivalent to providing a compatibility layer.
1.1.3 Dynamic techniques for non-compiled languages
Non-compiled languages, since they are distributed through source code, implement dynamic loading of program modules or updating modules by directly modifying the source code. The idea is simple and easy to implement.
1.2 Golang’s Dynamic Techniques
Golang, as a compiled development language, does not inherently support dynamic loading and updating via source code. However, Golang officially provides the Plugin technology to achieve dynamic loading.
A Go program is compiled into a plugin by passing the -buildmode=plugin flag to go build.
However, this technique is very limited in the current version (1.19), as its documentation (https://pkg.go.dev/plugin) shows:
- Platform limitation: currently only Linux, FreeBSD and macOS are supported.
- Unloading limitation: a plugin can be loaded dynamically but not unloaded.
- No unified interface: a plugin's exported variables and functions can only be obtained by name lookup, followed by a type assertion or reflection.
The Go team has shown no intention of addressing these limitations.
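The loading side of a plugin can be sketched as follows. This is a minimal illustration rather than a recommended design: the file name greeter.so and the exported function Greet are hypothetical, and the plugin is assumed to have been built separately with -buildmode=plugin.

```go
package main

import (
	"fmt"
	"plugin"
)

// callGreet opens the plugin at path, looks up an exported function named
// "Greet" (a hypothetical symbol chosen for this sketch) and calls it.
// The plugin is assumed to have been built beforehand with:
//
//	go build -buildmode=plugin -o greeter.so greeter.go
func callGreet(path, name string) (string, error) {
	p, err := plugin.Open(path)
	if err != nil {
		return "", fmt.Errorf("open plugin: %w", err)
	}
	// Lookup returns a plugin.Symbol, which is an empty interface value.
	sym, err := p.Lookup("Greet")
	if err != nil {
		return "", fmt.Errorf("lookup Greet: %w", err)
	}
	// The caller has to know the expected type and assert it.
	greet, ok := sym.(func(string) string)
	if !ok {
		return "", fmt.Errorf("Greet has unexpected type %T", sym)
	}
	return greet(name), nil
}

func main() {
	msg, err := callGreet("./greeter.so", "gopher")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(msg)
}
```

Note that nothing in the sketch releases the plugin again: the plugin package offers Open and Lookup, but no call to unload.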
2. Third-party interpreter for Golang (Yaegi)
Interpreters are generally found only in scripting languages, but Traefik has developed an interpreter for Golang in order to implement dynamically loaded plug-in functionality. It provides the ability to execute Golang source code directly at runtime.
Reference project: https://github.com/traefik/yaegi
- Complete support of Go specification
- Written in pure Go, using only the standard library
- Simple interpreter API: New(), Eval(), Use()
- Works everywhere Go works
- All Go & runtime resources accessible from script (with control)
- Security: unsafe and syscall packages neither used nor exported by default
- Support Go 1.18 and Go 1.19 (the latest 2 major releases)
2.1 Usage scenarios
Yaegi recommends three usage scenarios:
- embedded interpreter
- dynamic extension framework
- command-line interpreter
The official documentation gives a corresponding example for each of the three scenarios.
2.1.1 Embedded interpreter
2.1.2 Dynamic extension framework
2.1.3 Command-line interpreter
Yaegi provides a command-line tool that implements a read-eval-print loop (REPL).
2.2 Data Interaction
There are many ways to interact with data, but note that data returned from inside the interpreter is of type reflect.Value, and a type conversion is required to get its actual value.
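The conversion step can be sketched with the standard library alone. In this minimal example the function fakeEval is a hypothetical stand-in for a call into the interpreter; it only wraps an ordinary Go value in a reflect.Value so that the unwrapping can be shown.

```go
package main

import (
	"fmt"
	"reflect"
)

// fakeEval is a hypothetical stand-in for an interpreter call.
// It returns its result wrapped in a reflect.Value.
func fakeEval() reflect.Value {
	return reflect.ValueOf(1 + 2)
}

// asInt unwraps a reflect.Value into an int. It reports failure instead
// of panicking when the value is invalid or holds a different type.
func asInt(v reflect.Value) (int, bool) {
	if !v.IsValid() || !v.CanInterface() {
		return 0, false
	}
	n, ok := v.Interface().(int)
	return n, ok
}

func main() {
	v := fakeEval()

	// Option 1: go through Interface() and a type assertion.
	if n, ok := asInt(v); ok {
		fmt.Println(n + 1)
	}

	// Option 2: for integer kinds, Int() returns the value as an int64.
	if v.Kind() == reflect.Int {
		fmt.Println(v.Int())
	}
}
```

The type assertion is the more general route, since it works for any concrete type the caller expects; the kind-specific accessors such as Int() avoid the intermediate interface value.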
2.2.1 Data input
The following four methods are available, among others:
- importing data via os.Args
- data input via environment variables
- via assignment statements
- via function calls
Here is a code example I wrote myself.
2.2.2 Data output
Getting data out of the interpreter really means reading the value of a global variable, which can be done in the following ways:
- directly through the Eval method
- through function calls
- through the Globals method, which returns all global variables
3. Principles of Implementation
Interpreters are implemented on similar principles across languages.
Golang's standard library can build an abstract syntax tree directly from source code, which makes a script interpreter much easier to implement.
3.1 AST - Abstract Syntax Tree
In computer science, an abstract syntax tree (AST), or syntax tree for short, is an abstract representation of the syntactic structure of source code. It represents that structure as a tree, with each node standing for a construct in the source code.
Golang provides abstract syntax tree support through the go/ast package (https://pkg.go.dev/go/ast).
3.1.1 Abstract syntax tree example
We take a subset of Golang grammar for our example: a simple conditional expression.
The abstract syntax tree looks like this.
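As an illustration, the sketch below parses a conditional expression with go/parser and prints its tree as an indented outline. The expression a > b && (c + 1) < d and the helper names outline and label are assumptions made for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"strings"
)

// label gives a short description of the node types used in this example.
func label(n ast.Node) string {
	switch x := n.(type) {
	case *ast.BinaryExpr:
		return "BinaryExpr " + x.Op.String()
	case *ast.ParenExpr:
		return "ParenExpr"
	case *ast.Ident:
		return "Ident " + x.Name
	case *ast.BasicLit:
		return "BasicLit " + x.Value
	default:
		return fmt.Sprintf("%T", n)
	}
}

// outline parses src as a single Go expression and renders its syntax
// tree with one node per line, indented by depth.
func outline(src string) (string, error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	depth := 0
	ast.Inspect(expr, func(n ast.Node) bool {
		if n == nil {
			// Inspect passes nil after the children of a node are done.
			depth--
			return true
		}
		b.WriteString(strings.Repeat("  ", depth) + label(n) + "\n")
		depth++
		return true
	})
	return b.String(), nil
}

func main() {
	out, err := outline("a > b && (c + 1) < d")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Print(out)
}
```

For this expression the program prints:

```
BinaryExpr &&
  BinaryExpr >
    Ident a
    Ident b
  BinaryExpr <
    ParenExpr
      BinaryExpr +
        Ident c
        BasicLit 1
    Ident d
```

For a full dump with positions and every field, ast.Print from the same package can be used instead of the custom outline.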
3.1.2 Executing an abstract syntax tree
This section briefly explains what is needed to execute an abstract syntax tree.
Execution resembles the way a compiled program runs. First traverse the list of declarations and initialize the declared contents in heap memory (a dictionary can be used instead). Then traverse the abstract syntax tree depth-first, handling each node encountered along the way. For example (this is only an illustration; real implementations may differ):
- Initialize the heap memory and the execution stack.
- Traverse the declaration section, write each declaration to the heap, and wait for it to be called.
- Find the declaration of the main function, push main onto the stack, and traverse the statements in its body, executing each one depth-first.
- If a variable definition is encountered, write it to the top-of-stack cache.
- If a function call is encountered, push the function onto the stack, find its definition in the heap, and recursively execute the statements in its body.
- When a variable is encountered, fetch its value from the following locations in order: top-of-stack cache -> heap memory.
- When an expression is encountered, evaluate it recursively.
- After a function body has been executed, pop the function off the stack and write its return value to the top-of-stack cache.
- When the recursive process above completes, the program ends.
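The heap, the execution stack and the lookup order from the steps above can be sketched as a small data structure. This is a toy model under simplifying assumptions: the type and method names (machine, frame, define, lookup) are hypothetical, the heap is a plain map, and only the topmost frame is consulted before falling back to the heap.

```go
package main

import "fmt"

// frame holds the variables of one function call (the "top-of-stack cache").
type frame map[string]any

// machine is a toy runtime state: a heap for top-level declarations
// and a stack of call frames.
type machine struct {
	heap  map[string]any
	stack []frame
}

func newMachine() *machine {
	return &machine{heap: map[string]any{}}
}

// push enters a function call by adding an empty frame.
func (m *machine) push() {
	m.stack = append(m.stack, frame{})
}

// pop leaves the current function call, if there is one.
func (m *machine) pop() {
	if n := len(m.stack); n > 0 {
		m.stack = m.stack[:n-1]
	}
}

// define writes to the current frame, or to the heap when no call is active.
func (m *machine) define(name string, v any) {
	if n := len(m.stack); n > 0 {
		m.stack[n-1][name] = v
		return
	}
	m.heap[name] = v
}

// lookup resolves a name in order: top-of-stack frame first, then the heap.
func (m *machine) lookup(name string) (any, bool) {
	if n := len(m.stack); n > 0 {
		if v, ok := m.stack[n-1][name]; ok {
			return v, true
		}
	}
	v, ok := m.heap[name]
	return v, ok
}

func main() {
	m := newMachine()
	m.define("x", 1) // top-level declaration, stored in the heap

	m.push()         // enter a function
	m.define("x", 2) // local definition shadows the heap entry
	v, _ := m.lookup("x")
	fmt.Println(v)

	m.pop() // leave the function; the heap value is visible again
	v, _ = m.lookup("x")
	fmt.Println(v)
}
```

Keeping locals in per-call frames is what lets a recursive call see its own variables without overwriting those of its caller.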
The above is a simplified process that ignores special syntax and syntactic sugar, which each language defines differently and which must be handled case by case. For example, the syntax nodes Golang supports are listed at: https://pkg.go.dev/go/ast
If every node type defined there can be handled, a Golang script interpreter can be implemented.
For the simple example above (3.1.1), it can be executed directly by the following code.
(No functions are handled, only parentheses and a limited number of operators. The execution stack is also not defined, and the heap memory is replaced by the global variable Args)
The execution results are as follows.
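A minimal sketch of such an evaluator follows, under the restrictions just listed. The sample expression, the values stored in Args and the helper names evalString, eval and apply are assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// Args stands in for heap memory: the variables visible to the expression.
var Args = map[string]int{"a": 3, "b": 2, "c": 1, "d": 5}

// evalString parses src as a single Go expression and evaluates it.
func evalString(src string) (any, error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return nil, err
	}
	return eval(expr)
}

// eval walks the tree depth-first. The result is an int or a bool.
func eval(e ast.Expr) (any, error) {
	switch n := e.(type) {
	case *ast.ParenExpr:
		return eval(n.X)
	case *ast.BasicLit:
		if n.Kind != token.INT {
			return nil, fmt.Errorf("unsupported literal %s", n.Value)
		}
		v, err := strconv.Atoi(n.Value)
		if err != nil {
			return nil, err
		}
		return v, nil
	case *ast.Ident:
		v, ok := Args[n.Name]
		if !ok {
			return nil, fmt.Errorf("undefined variable %s", n.Name)
		}
		return v, nil
	case *ast.BinaryExpr:
		// Both operands are always evaluated; there is no short-circuiting.
		x, err := eval(n.X)
		if err != nil {
			return nil, err
		}
		y, err := eval(n.Y)
		if err != nil {
			return nil, err
		}
		return apply(n.Op, x, y)
	}
	return nil, fmt.Errorf("unsupported node %T", e)
}

// apply handles a limited set of operators on int and bool operands.
func apply(op token.Token, x, y any) (any, error) {
	switch l := x.(type) {
	case int:
		if r, ok := y.(int); ok {
			switch op {
			case token.ADD:
				return l + r, nil
			case token.SUB:
				return l - r, nil
			case token.GTR:
				return l > r, nil
			case token.LSS:
				return l < r, nil
			case token.EQL:
				return l == r, nil
			}
		}
	case bool:
		if r, ok := y.(bool); ok {
			switch op {
			case token.LAND:
				return l && r, nil
			case token.LOR:
				return l || r, nil
			}
		}
	}
	return nil, fmt.Errorf("unsupported operation %v %s %v", x, op, y)
}

func main() {
	result, err := evalString("a > b && (c + 1) < d")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(result)
}
```

With the sample values in Args (a=3, b=2, c=1, d=5), the program prints true.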