The abstract syntax tree is an intermediate product of the compilation process, and is generally just a simple thing to understand. But we can take the whole parser and ast package from Go and use it directly, which can be very powerful in some scenarios.

What is ast? I took an excerpt from Wikipedia.

In computer science, an Abstract Syntax Tree (AST), or Syntax tree for short, is an abstract representation of the syntactic structure of source code. It represents the syntactic structure of a programming language in a tree-like form, with each node in the tree representing a structure in the source code.

At its core, it means that ast can represent the code structure in the form of a tree. With a tree structure, you can do traversals of it and can do a lot of things.

Assume a scenario

Assume a scenario: we can get various characteristics of the driver from an interface of the driver platform, such as: age, number of orders, income, daily driving hours, driving age, average speed, the number of complaints …… data is generally delivered using json.

Driver platform operations often need to engage in some activities, such as the election of.

  • the number of orders more than 10000, and the driving age of more than 5 years of old drivers
  • Efficient drivers with less than 3 hours of driving time per day and more than 500 income
  • “Wild” drivers with age over 40 and average speed over 70
  • ……

These rules are not fixed and change often, but they are always a combination of driver characteristics.

For simplicity, we pick 2 characteristics and represent them in a Driver structure.

1
2
3
4
type Driver struct {
	Orders         int
	DrivingYears   int
}

In order to cooperate with the operation of activities, we need to determine whether a driver meets the requirements according to the rules given by the operation.

If the company has a lot of people, you can arrange a rd dedicated to serve the operations lady, every time you do activities to manually modify the code, it is not impossible. And actually quite simple, let’s write a sample code.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
// 从第三方获取司机特征,json 表示
func getDriverRemote() []byte {
	return []byte(`{"orders":100000,"driving_years":18}`)
}

// 判断是否为老司机
func isOldDriver(d *Driver) bool {
	if d.Orders > 10000 && d.DrivingYears > 5 {
		return true
	}
	return false
}

func main() {
	bs := getDriverRemote()
	var d Driver
	json.Unmarshal(bs, &d)
	fmt.Println(isOldDriver(&d))
}

Looking directly at the main function: getDriverRemote simulates getting a driver’s feature data from a third-party RPC, represented as a json. Then json.Unmarshal to deserialize the Driver structure. Finally, the isOldDriver function is called to determine if the driver matches the rules of the operation.

isOldDriver uses an if statement to determine if the driver is an old driver based on the 2 fields of the Driver structure.

It’s really quite simple.

But each time I update the rules, I have to go through a complete online process, which is also quite troublesome. Is there a simpler way? So that we can directly parse a string rule given to us by the operations team, and directly return a bool value to indicate whether the condition is met.

There is!

Next is the heart of this article, how to use ast to do the same.

Intuitively understand how to parse rules with ast

Using some of the functions provided by the ast package, we can very easily string a rule as follows.

1
orders > 10000 && driving_years > 5

resolves into a binary tree like this.

sobyte

BinaryExprrepresents a binary expression, which consists of three parts, X and Y, and the symbol OP. The top one,BinaryExpr`, represents the left and right halves of the rule together.

Obviously, the left half is: orders > 10000 and the right half is: driving_years > 5. Miraculously, both the left and right halves happen to be binary expressions again.

The left half of orders > 10000 is actually also the smallest leaf node, which can be calculated as a bool value. After splitting it up, it can be divided into X, Y, and OP; X is orders, OP is “>”, and Y is “10000”. Where X denotes an identifier and is of type ast.Ident, Y denotes a literal of a basic type, e.g. int, string type …… is of type ast.BasicLit.

The right half of driving_years > 18 can also be split up as such.

Then, take the value of orders field of this driver from the json to be 100000, which is larger than 10000, so the left half is counted as true. similarly, the right half is counted as true. finally, the outermost && is counted again, and the result is still true.

At this point, directly based on the rule string, we can calculate the result.

If written as a program, it is a traversal process of dfs. If it is not a leaf node, then it is a binary expression node, which must have X, Y, and OP parts. Recursively traverse X. If X is a leaf node, then end the recursion and compute the value of X. ……

Here is another abstract syntax tree printed out with the ast package.

sobyte

In the above diagram, 1, 2, 3 represent the outermost binary expression; 4, 5, 6 represent the left binary expression.

Combining this diagram with the code of the relevant structure of the ast package, it is very clear. For example, the code of ast.BinaryExpr is as follows.

1
2
3
4
5
6
7
// A BinaryExpr node represents a binary expression.
BinaryExpr struct {
	X     Expr        // left operand
	OpPos token.Pos   // position of Op
	Op    token.Token // operator
	Y     Expr        // right operand
}

It has X, Y, OP, and even resolves the position of the Op, which is represented by OpPos.

If you are still interested in the implementation, then continue to the schematic analysis section below, otherwise you can skip directly to the concluding summary section.

Principle analysis

Using the same example as above, let’s write a direct expression.

1
orders > 10000 && driving_years > 5

Next, ast is used to parse the rules and determine whether they are true or false.

1
2
3
4
5
func main() {
	m := map[string]int64{"orders": 100000, "driving_years": 18}
	rule := `orders > 10000 && driving_years > 5`
	fmt.Println(Eval(m, rule))
}

For simplicity, we use map directly instead of json, for the same reason, just for convenience.

The Eval function determines whether rule is true or false.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
// Eval : 计算 expr 的值
func Eval(m map[string]int64, expr string) (bool, error) {
	exprAst, err := parser.ParseExpr(expr)
	if err != nil {
		return false, err
	}

	// 打印 ast
	fset := token.NewFileSet()
	ast.Print(fset, exprAst)

	return judge(exprAst, m), nil
}

The expression is first parsed into Expr, and then the judge function is called to compute the result.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
// dfs
func judge(bop ast.Node, m map[string]int64) bool {
    // 叶子结点
	if isLeaf(bop) {
		// 断言成二元表达式
		expr := bop.(*ast.BinaryExpr)
		x := expr.X.(*ast.Ident) // 左边
		y := expr.Y.(*ast.BasicLit) // 右边

		// 如果是 ">" 符号
		if expr.Op == token.GTR {
			left := m[x.Name]
			right, _ := strconv.ParseInt(y.Value, 10, 64)
			return left > right
		}
		return false
	}

	// 不是叶子节点那么一定是 binary expression(我们目前只处理二元表达式)
	expr, ok := bop.(*ast.BinaryExpr)
	if !ok {
		println("this cannot be true")
		return false
	}

	// 递归地计算左节点和右节点的值
	switch expr.Op {
	case token.LAND:
		return judge(expr.X, m) && judge(expr.Y, m)
	case token.LOR:
		return judge(expr.X, m) || judge(expr.Y, m)
	}

	println("unsupported operator")
	return false
}

judge computes the value of an expression recursively using dfs.

The recursive termination condition is that the leaf node.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
// 判断是否是叶子节点
func isLeaf(bop ast.Node) bool {
	expr, ok := bop.(*ast.BinaryExpr)
	if !ok {
		return false
	}

	// 二元表达式的最小单位,左节点是标识符,右节点是值
	_, okL := expr.X.(*ast.Ident)
	_, okR := expr.Y.(*ast.BasicLit)
	if okL && okR {
		return true
	}

	return false
}

Summary

Today’s article focuses on how to parse a binary expression with the ast package and the parser package, and see how powerful it is to use it to make a very simple rule engine.

In fact, there are more interesting things you can do with the ast package. For example, convert thrift files to proto files in bulk, parse sql statements and do some auditing ……