1. Technical background

1.1 Dynamic linking technology for programs

In the actual development process, we often need to dynamically update the functions of the program, or add or update the program modules without changing the main program files.

The first and most common is the Dynamic Link Library (DLL) supported by the Windows platform, usually with the suffix .dll. Its advantages are very obvious:

  1. multiple programs can share code and data. That is, multiple programs load the same DLL file.
  2. It is natural to divide the program into several modules. Each module is output as a separate DLL file, which is loaded and executed by the main program.
  3. Cross-language calling. Since DLL files are language-independent, a DLL file can be loaded and executed by multiple programming languages.
  4. Easy to update. In the process of program update, only the DLL file of the corresponding module can be updated without redeploying the whole program.
  5. Provide technical possibility for hot update. Dynamic link libraries can be programmatically loaded and unloaded, thus supporting the update of modules without restarting the program.
  6. Provide programming interfaces for programs. It is possible to encapsulate the calling interface of your program as a DLL file for other programs to call.

1.1.2 Dynamic shared objects

In Linux, this technology is called dynamic shared objects (dynamic shared objects) and is commonly suffixed with .so.

In addition to the above-mentioned advantages of “dynamically linked libraries”, dynamic shared objects can also solve the underlying interface compatibility problems caused by the openness of Linux. That is, the dynamic shared object encapsulates the underlying interface of the operating system and provides a unified call interface for upper-level applications to call. It is equivalent to providing a compatibility layer.

1.1.3 Dynamic techniques for non-compiled languages

Non-compiled languages, since they are distributed through source code, implement dynamic loading of program modules or updating modules by directly modifying the source code. The idea is simple and easy to implement.

1.2 Golang’s Dynamic Techniques

Golang, as a compiled development language, does not inherently support dynamic loading and updating via source code. However, Golang officially provides the Plugin technology to achieve dynamic loading.

Compile a Go program into a Plugin by adding the following parameters at compile time.

1
go build -buildmode=plugin

However, this technique is very limited in the current version (1.19). As you can see from its documentation https://pkg.go.dev/plugin

  1. platform limitations, currently only supported: Linux, FreeBSD and macOS
  2. uninstallation limitation, only dynamic loading is supported, not dynamic uninstallation.
  3. does not provide a unified interface, can only handle the internal properties and functions of Plugin through reflection.

And the above problems, Golang official does not intend to solve ……

2. Third-party interpreter for Golang (Yaegi)

Interpreters are generally found only in scripting languages, but Traefik has developed an interpreter for Golang in order to implement dynamically loaded plug-in functionality. It provides the ability to execute Golang source code directly at runtime.

Reference project: https://github.com/traefik/yaegi

Features

  • Complete support of Go specification
  • Written in pure Go, using only the standard library
  • Simple interpreter API: New(), Eval(), Use()
  • Works everywhere Go works
  • All Go & runtime resources accessible from script (with control)
  • Security: unsafe and syscall packages neither used nor exported by default
  • Support Go 1.18 and Go 1.19 (the latest 2 major releases)

2.1 Usage scenarios

There are three usage scenarios recommended by yaegi.

  1. inline interpreter
  2. dynamic extension framework
  3. command-line interpreter

And the official scenarios for the above three scenarios, the corresponding examples are given.

2.1.1 Embedded interpreters

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
package main

import (
    "github.com/traefik/yaegi/interp"
    "github.com/traefik/yaegi/stdlib"
)

func main() {
    i := interp.New(interp.Options{})

    i.Use(stdlib.Symbols)

    _, err := i.Eval(`import "fmt"`)
    if err != nil {
        panic(err)
    }

    _, err = i.Eval(`fmt.Println("Hello Yaegi")`)
    if err != nil {
        panic(err)
    }
}

2.1.2 dynamic extension framework

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
package main

import "github.com/traefik/yaegi/interp"

const src = `package foo
func Bar(s string) string { return s + "-Foo" }`

func main() {
    i := interp.New(interp.Options{})

    _, err := i.Eval(src)
    if err != nil {
        panic(err)
    }

    v, err := i.Eval("foo.Bar")
    if err != nil {
        panic(err)
    }

    bar := v.Interface().(func(string) string)

    r := bar("Kung")
    println(r)
}

2.1.3 Command Line Interpreter

Yaegi provides a command line tool that implements a read-execute-display loop.

1
2
3
4
5
6
7
$ yaegi
> 1 + 2
3
> import "fmt"
> fmt.Println("Hello World")
Hello World
>

2.2 Data Interaction

There are many ways to interact with data, but it should be noted that the data returned from inside the interpreter is of type reflect.Value and type conversion is required to get its actual value.

2.2.1 Data input

There are (but not limited to) the following four methods.

  1. importing data via os.Args
  2. data input via environment variables
  3. via assignment statements
  4. via function calls

Here is a code example I wrote myself.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
package main

import (
    "fmt"

    "github.com/traefik/yaegi/interp"
    "github.com/traefik/yaegi/stdlib"
)
func main() {
    { // Pass in data via os.Args
        i := interp.New(interp.Options{
            Args: []string{"666"},
        })
        i.Use(stdlib.Symbols)
        i.Eval(`import "fmt"`)
        i.Eval(`import "os"`)
        i.Eval(`fmt.Printf("os.Args[0] --- %s\n", os.Args[0])`) 
                // os.Args[0] --- 666
    }
    { // Importing data via environment variables
        i := interp.New(interp.Options{
            Env: []string{"inputEnv=666"},
        })
        i.Use(stdlib.Symbols)
        i.Eval(`import "fmt"`)
        i.Eval(`import "os"`)
        i.Eval(`fmt.Printf("os.Getenv(\"inputEnv\") --- %s\n", os.Getenv("inputEnv"))`)
                // os.Getenv("inputEnv") --- 666
    }
    { // Execute assignment statements to pass in data
        i := interp.New(interp.Options{})
        i.Use(stdlib.Symbols)
        i.Eval(`import "fmt"`)
        i.Eval(fmt.Sprintf("inputVar:=\"%s\"", "666"))
        i.Eval(`fmt.Printf("inputVar --- %s\n", inputVar)`)
                // inputVar --- 666
    }
        { // Passing through function calls
        i := interp.New(interp.Options{})
        i.Use(stdlib.Symbols)
        i.Eval(`import "fmt"`)
        i.Eval(`var data map[string]interface{}`)
        i.Eval(`func SetData(d map[string]interface{}){ data = d }`)
        f, _ := i.Eval("SetData")
        fun := f.Interface().(func(map[string]interface{}))
        fun(map[string]interface{}{
            "data01": 666,
        })
        i.Eval(`fmt.Printf("SetData --- %d\n", data["data01"])`)
                // SetData --- 666
    }
}

2.1.2 Data output

Getting data from the interpreter, which is actually getting the value of a global variable, can be done by the following methods.

  1. directly by the Eval method
  2. through function calls
  3. the Global method to get all global variables
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
package main

import (
    "fmt"

    "github.com/traefik/yaegi/interp"
    "github.com/traefik/yaegi/stdlib"
)

func main() {
    { // Get directly through Eval
        i := interp.New(interp.Options{})
        i.Use(stdlib.Symbols)
        i.Eval(`data := 666`)
        v, _ := i.Eval("data")
        value := v.Interface().(int)
        fmt.Printf("data = %d\n", value)
                // data = 666
    }
        { // Get by function return value
        i := interp.New(interp.Options{})
        i.Use(stdlib.Symbols)
        i.Eval(`data := 666`)
        i.Eval(`func GetData() int {return data}`)
        f, _ := i.Eval("GetData")
        fun := f.Interface().(func() int)
        fmt.Printf("data = %d\n", fun())
                // data = 666
    }
    { // Get directly through Eval
        i := interp.New(interp.Options{})
        i.Use(stdlib.Symbols)
        i.Eval(`dataInt := 666`)
        i.Eval(`dataStr := "666"`)
        for name, v := range i.Globals() {
            value := v.Interface()
            switch value.(type) {
            case int:
                fmt.Printf("%s = %d\n", name, value)
                                // dataInt = 666
            case string:
                fmt.Printf("%s = %s\n", name, value)
                                // dataStr = 666
            }
        }

    }
}

3. Principles of Implementation

The implementation principles of the interpreter are similar across languages.

Golang provides the ability to build an abstract syntax tree directly due to its powerful base library. It is much easier to implement a script interpreter based on an abstract syntax tree.

3.1 AST - Abstract Syntax Tree

In computer science, an Abstract Syntax Tree (AST), or Syntax tree for short, is an abstract representation of the syntactic structure of a source code. It represents the syntactic structure of a programming language in a tree-like form, with each node in the tree representing a structure in the source code.

Golang provides abstract syntax tree related capabilities through the go/ast package (https://pkg.go.dev/go/ast).

3.1.1 Abstract syntax tree example

We take a subset of Golang grammar for our example: a simple conditional expression.

1
`A!=1 && (B>1 || (C<1 && A>2))`

The abstract syntax tree looks like this.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
     0  *ast.BinaryExpr {
     1  .  X: *ast.BinaryExpr {
     2  .  .  X: *ast.Ident {
     3  .  .  .  NamePos: -
     4  .  .  .  Name: "A"
     5  .  .  }
     6  .  .  OpPos: -
     7  .  .  Op: !=
     8  .  .  Y: *ast.BasicLit {
     9  .  .  .  ValuePos: -
    10  .  .  .  Kind: INT
    11  .  .  .  Value: "1"
    12  .  .  }
    13  .  }
    14  .  OpPos: -
    15  .  Op: &&
    16  .  Y: *ast.ParenExpr {
    17  .  .  Lparen: -
    18  .  .  X: *ast.BinaryExpr {
    19  .  .  .  X: *ast.BinaryExpr {
    20  .  .  .  .  X: *ast.Ident {
    21  .  .  .  .  .  NamePos: -
    22  .  .  .  .  .  Name: "B"
    23  .  .  .  .  }
    24  .  .  .  .  OpPos: -
    25  .  .  .  .  Op: >
    26  .  .  .  .  Y: *ast.BasicLit {
    27  .  .  .  .  .  ValuePos: -
    28  .  .  .  .  .  Kind: INT
    29  .  .  .  .  .  Value: "1"
    30  .  .  .  .  }
    31  .  .  .  }
    32  .  .  .  OpPos: -
    33  .  .  .  Op: ||
    34  .  .  .  Y: *ast.ParenExpr {
    35  .  .  .  .  Lparen: -
    36  .  .  .  .  X: *ast.BinaryExpr {
    37  .  .  .  .  .  X: *ast.BinaryExpr {
    38  .  .  .  .  .  .  X: *ast.Ident {
    39  .  .  .  .  .  .  .  NamePos: -
    40  .  .  .  .  .  .  .  Name: "C"
    41  .  .  .  .  .  .  }
    42  .  .  .  .  .  .  OpPos: -
    43  .  .  .  .  .  .  Op: <
    44  .  .  .  .  .  .  Y: *ast.BasicLit {
    45  .  .  .  .  .  .  .  ValuePos: -
    46  .  .  .  .  .  .  .  Kind: INT
    47  .  .  .  .  .  .  .  Value: "1"
    48  .  .  .  .  .  .  }
    49  .  .  .  .  .  }
    50  .  .  .  .  .  OpPos: -
    51  .  .  .  .  .  Op: &&
    52  .  .  .  .  .  Y: *ast.BinaryExpr {
    53  .  .  .  .  .  .  X: *ast.Ident {
    54  .  .  .  .  .  .  .  NamePos: -
    55  .  .  .  .  .  .  .  Name: "A"
    56  .  .  .  .  .  .  }
    57  .  .  .  .  .  .  OpPos: -
    58  .  .  .  .  .  .  Op: >
    59  .  .  .  .  .  .  Y: *ast.BasicLit {
    60  .  .  .  .  .  .  .  ValuePos: -
    61  .  .  .  .  .  .  .  Kind: INT
    62  .  .  .  .  .  .  .  Value: "2"
    63  .  .  .  .  .  .  }
    64  .  .  .  .  .  }
    65  .  .  .  .  }
    66  .  .  .  .  Rparen: -
    67  .  .  .  }
    68  .  .  }
    69  .  .  Rparen: -
    70  .  }
    71  }

3.1.2 Executing an abstract syntax tree

Briefly explain what should be done if you want to execute the abstract syntax tree.

The execution process is similar to the program execution process. First traverse the list of declarations and initialize the declared contents to heap memory (a dictionary can be used instead). Deep-first traversal of the abstract syntax tree, handling abstract objects encountered during the traversal, for example (just an example, there may be discrepancies in practice)

  1. initializing the heap memory and execution stack.
  2. traverse the declaration section, write to the heap, and wait for the call.
  3. find the main function declaration, the main function on the stack, traverse its function body statements, and execute a depth-first traversal statement by statement.
    1. Write to the top-of-stack cache if a variable definition is encountered.
    2. If a function call is encountered, the function is put on the stack. Find the function definition from the heap, traverse its function body statement, and execute the statement recursively.
    3. When a variable is encountered, the value is fetched from the following locations in order: top-of-stack cache -> heap memory
    4. When an expression is encountered, the expression is executed recursively.
    5. After the function body is executed, it will exit the stack and write the return value to the top-of-stack cache.
  4. The above recursive process is completed and the program ends.

The above is a simple implementation process that does not deal with special syntax and syntactic sugar, which are defined differently for each language and need to be handled separately. For example, the syntax supported by Golang can be found at: https://pkg.go.dev/go/ast

If all the syntax defined there can be handled, a golang script interpreter can be implemented.

For the simple example above (3.1.1), it can be executed directly by the following code.

(No functions are handled, only parentheses and a limited number of operators. The execution stack is also not defined, and the heap memory is replaced by the global variable Args)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
package main

import (
    "fmt"
    "go/ast"
    "go/parser"
    "go/token"
    "strconv"
)

var Args map[string]int

func main() {
    {
        Args = map[string]int{"A": 1, "B": 2, "C": 3}
        code := `A==1 && (B>1 || C<1)`
        expr, _ := parser.ParseExpr(code)
        result := runExpr(expr)
        fmt.Println(result)
    }
    {
        Args["A"] = 3
        Args = map[string]int{"A": 1, "B": 2, "C": 3}
        code := `A!=1 && (B>1 || (C<1 && A>2))`
        expr, _ := parser.ParseExpr(code)
        result := runExpr(expr)
        fmt.Println(result)
    }
}

// Execution of expressions
// Supported operations: >, <, ==, ! =, &&, ||
// Supports parenthesis nesting
func runExpr(expr ast.Expr) interface{} {
    var result interface{}
    // Binary expressions
    if binaryExpr, ok := expr.(*ast.BinaryExpr); ok {
        switch binaryExpr.Op.String() {
        case "&&":
            x := runExpr(binaryExpr.X)
            y := runExpr(binaryExpr.Y)
            return x.(bool) && y.(bool)
        case "||":
            x := runExpr(binaryExpr.X)
            y := runExpr(binaryExpr.Y)
            return x.(bool) || y.(bool)
        case ">":
            x := runExpr(binaryExpr.X)
            y := runExpr(binaryExpr.Y)
            return x.(int) > y.(int)
        case "<":
            x := runExpr(binaryExpr.X)
            y := runExpr(binaryExpr.Y)
            return x.(int) < y.(int)
        case "==":
            x := runExpr(binaryExpr.X)
            y := runExpr(binaryExpr.Y)
            return x.(int) == y.(int)
        case "!=":
            x := runExpr(binaryExpr.X)
            y := runExpr(binaryExpr.Y)
            return x.(int) != y.(int)
        }
    }
    // Basic type values
    if basicLit, ok := expr.(*ast.BasicLit); ok {
        switch basicLit.Kind {
        case token.INT:
            v, _ := strconv.Atoi(basicLit.Value)
            return v
        }
    }
    // Identifiers
    if ident, ok := expr.(*ast.Ident); ok {
        return Args[ident.Name]
    }
    // Parenthesis expressions
    if parenExpr, ok := expr.(*ast.ParenExpr); ok {
        return runExpr(parenExpr.X)
    }

    return result
}

The execution results are as follows.

1
2
A==1 && (B>1 || C<1) => true
A!=1 && (B>1 || (C<1 && A>2)) => false