As the JavaScript language has evolved, the ES6 specification has brought many new features to the table, and one of the key ones is generators. Generators simplify the creation of iterators and, even more excitingly, allow us to pause a function during execution and resume it at some point in the future. This is a change from the old rule that a function must run to completion before it returns, and applying it to asynchronous code can simplify writing asynchronous methods while avoiding callback hell.

In this article, we will briefly introduce generators, and then focus on how they work and how they can be implemented in ES5, drawing on my experience with C#.

Introduction

A simple Generator function example.

function* example() {
  yield 1;
  yield 2;
  yield 3;
}
var iter = example();
iter.next(); // {value:1,done:false}
iter.next(); // {value:2,done:false}
iter.next(); // {value:3,done:false}
iter.next(); // {value:undefined,done:true}

The above code defines a generator function. Calling example() does not execute the function body immediately; instead it returns a generator object. Each time the .next() method of the generator object is called, the function runs to the next yield expression, returns the value of that expression, and pauses itself. When the end of the generator function is reached, done is true and value is undefined. We call example() a generator function; it differs from a normal function as follows:

  • Normal functions are declared with function; generator functions are declared with function*.
  • Normal functions hand back a single value with return; generator functions hand back a sequence of values with yield, pausing after each one (they may still use return to finish).
  • Normal functions run to completion: once started, they execute until all of their statements are done, and no other code runs in the meantime. Generator functions run, pause, and run again: they can be paused one or more times during execution and resumed later, and other code can run while they are paused.
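The run-pause-run behavior described in the list above can be observed with a small experiment. The following is only an illustrative sketch; the names steps and log are made up for this example. It records the order in which statements of the generator and of the caller execute.

```javascript
// Records the order in which statements run, to show that the caller's
// code executes while the generator is paused.
var log = [];

function* steps() {
  log.push('generator: before first yield');
  yield 'a';
  log.push('generator: between yields');
  yield 'b';
  log.push('generator: after last yield');
}

var it = steps();            // nothing in the body has run yet
log.push('caller: created generator');

var first = it.next();       // runs up to the first yield, then pauses
log.push('caller: got ' + first.value);

var second = it.next();      // resumes and runs up to the second yield
log.push('caller: got ' + second.value);

var last = it.next();        // resumes and runs to the end of the body
log.push('caller: done = ' + last.done);

console.log(log.join('\n'));
```

The caller's entries and the generator's entries alternate in the log, which is exactly what a run-to-completion function could not produce.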

Generators in C#

Generators are not a new concept. I was first introduced to them when learning C#, which has had the yield keyword since version 2.0, making it easier to create enumerators and enumerable types. The difference is that C# calls them iterators rather than generators.

This article will not cover the C# enumerable interface IEnumerable and the enumerator interface IEnumerator; for those, we recommend the relevant chapter of “C# 4.0 Illustrated Tutorial”.

C# Iterator Introduction

Let’s start with an example. The following method declaration implements an iterator that produces an enumerable sequence of values.

public IEnumerable<int> Example() {
  yield return 1;
  yield return 2;
  yield return 3;
}

The method definition is very close to an ES6 generator definition: it declares a generic enumerable of int as its return type, and the method body returns each value and suspends itself via a yield return statement.

An iterator can be placed in a class and consumed with foreach, like any other enumerable type.

class YieldClass {
  public IEnumerable<int> Example() { // iterator
    yield return 1;
    yield return 2;
    yield return 3;
  }
}

class Program {
  static void Main() {
    YieldClass yc = new YieldClass();
    foreach(var a in yc.Example())
      Console.WriteLine(a);
  }
}

The above code will produce the following output.

1
2
3
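For comparison, the JavaScript counterpart of the C# foreach loop is for...of, which calls next() on the generator object until done is true. A minimal sketch, reusing the example() generator from the introduction (the array seen is added only to make the result easy to inspect):

```javascript
function* example() {
  yield 1;
  yield 2;
  yield 3;
}

// for...of drives the generator and stops when done becomes true,
// so the final { value: undefined, done: true } result is not visited.
var seen = [];
for (var a of example()) {
  seen.push(a);
  console.log(a);
}
```

This prints the same three values as the C# program above.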

C# Iterator Principle

The C# iterator is not a .NET runtime feature, but rather syntactic sugar that the C# compiler turns into plain IL code at compile time.

Continuing with the above example, the Reflector decompiler shows that the compiler generates an internal class for us with the following declaration.

[CompilerGenerated]
private sealed class YieldEnumerator :
   IEnumerable<int>, IEnumerator<int>
{
    // Fields
    private int state;
    private int current;
    public YieldClass owner;
    private int initialThreadId;

    // Methods
    [DebuggerHidden]
    public YieldEnumerator(int state);
    private bool MoveNext();
    [DebuggerHidden]
    IEnumerator<int> IEnumerable<int>.GetEnumerator();
    [DebuggerHidden]
    IEnumerator IEnumerable.GetEnumerator();
    [DebuggerHidden]
    void IEnumerator.Reset();
    void IDisposable.Dispose();

    // Properties
    int IEnumerator<int>.Current
    { [DebuggerHidden] get; }

    object IEnumerator.Current
    { [DebuggerHidden] get; }
}

The original Example() method now only returns an instance of YieldEnumerator, passing it the initial state -2 and a reference to the owning object. Each iterator instance stores one state value:

  • -2: initialized as an enumerable (IEnumerable)
  • -1: iteration has ended
  • 0: initialized as an enumerator (IEnumerator)
  • 1-n: index of the yield return statement in the original Example() method

The code in the Example() method is moved into YieldEnumerator.MoveNext(); in our example the converted code is as follows.

bool MoveNext() {
  switch (state) {
    case 0:
      state = -1;
      current = 1;
      state = 1;
      return true;
    case 1:
      state = -1;
      current = 2;
      state = 2;
      return true;
    case 2:
      state = -1;
      current = 3;
      state = 3;
      return true;
    case 3:
      state = -1;
      break;
  }
  return false;
}

Through this transformation the compiler generates a state machine for us, and the behavior of the yield keyword is implemented on top of this state machine model.
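To connect this with JavaScript, the generated MoveNext() can be ported by hand to ES5. This is only a sketch: createExampleEnumerator, moveNext and getCurrent are hypothetical names chosen to mirror the C# members, and the intermediate state = -1 assignments of the compiler output are omitted for brevity.

```javascript
// Hand-written ES5 port of the compiler-generated MoveNext() above.
function createExampleEnumerator() {
  var state = 0;   // 0 = not started, 1..3 = paused after that yield, -1 = finished
  var current;     // value produced by the most recent "yield return"

  return {
    moveNext: function () {
      switch (state) {
        case 0: current = 1; state = 1; return true;
        case 1: current = 2; state = 2; return true;
        case 2: current = 3; state = 3; return true;
        case 3: state = -1; break;
      }
      return false;  // finished: no case matches state -1 either
    },
    getCurrent: function () { return current; }
  };
}

// Consume it the way foreach would: call moveNext() until it returns false.
var e = createExampleEnumerator();
var values = [];
while (e.moveNext()) {
  values.push(e.getCurrent());
}
console.log(values);
```

The two closure variables play the role of the state and current fields of the generated class.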

The iterator state machine model can be shown in the following figure.

[Figure: iterator state machine model]

  • Before is the initial state of the iterator.
  • Running is the state entered after calling MoveNext. In this state, the enumerator determines and sets the position of the next item. It leaves this state when it encounters yield return, yield break, or the end of the iteration.
  • Suspended is the state in which the state machine waits for the next call to MoveNext.
  • After is the state in which the iteration has ended.
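The four states listed above can be made visible with a small tracker. The sketch below is not how the C# compiler implements them; makeTrackedIterator and its steps array are hypothetical, and each step function stands in for the code between two yield return statements.

```javascript
// Hypothetical iterator that exposes which of the four states it is in.
function makeTrackedIterator(steps) {
  var state = 'Before';
  var index = 0;
  var current;
  return {
    getState: function () { return state; },
    getCurrent: function () { return current; },
    moveNext: function () {
      if (state === 'After') { return false; }
      if (state === 'Running') { throw new Error('iterator is already running'); }
      state = 'Running';
      if (index < steps.length) {
        current = steps[index]();   // run up to the next "yield return"
        index += 1;
        state = 'Suspended';
        return true;
      }
      state = 'After';              // end of iteration reached
      return false;
    }
  };
}

var trace = [];
// Each step records the state it observes while it is executing.
function step(value) {
  return function () {
    trace.push(tracked.getState());
    return value;
  };
}
var tracked = makeTrackedIterator([step(1), step(2)]);

trace.push(tracked.getState());   // state before the first moveNext()
tracked.moveNext();
trace.push(tracked.getState());
tracked.moveNext();
trace.push(tracked.getState());
tracked.moveNext();               // no steps left
trace.push(tracked.getState());
console.log(trace.join(' -> '));
```

The recorded trace walks through Before, Running, Suspended, Running, Suspended and finally After.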

Generators in JavaScript

From the above, we understand how iterators are used in C#, and by looking at the code generated by the compiler, we know that the compiler generates an internal class to hold the context information and converts the yield return statements into a switch case, implementing the yield keyword through a state machine model.

JavaScript Generators Principle

How is the yield keyword implemented in JavaScript?

First, generators are not threads. In languages with threads, multiple pieces of code can run at the same time, which often leads to contention for resources but can also provide a nice performance boost when used properly. Generators are completely different. The JavaScript execution engine is still a single-threaded environment based on an event loop, and when a generator runs, it runs in the same thread as its caller. The order of execution is sequential and deterministic, never concurrent. Unlike a system thread, a generator is suspended only where its own body uses yield.
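The deterministic ordering can be illustrated by interleaving two generators by hand. In the sketch below, worker and runAll are hypothetical names; runAll is a simple round-robin driver, not a feature of the language.

```javascript
var order = [];

function* worker(name, count) {
  for (var i = 1; i <= count; i++) {
    order.push(name + i);
    yield;   // the only place where this generator can be suspended
  }
}

// Hypothetical round-robin driver: resumes each generator in turn
// until all of them have finished.
function runAll(generators) {
  var pending = generators.slice();
  while (pending.length > 0) {
    var gen = pending.shift();
    if (!gen.next().done) {
      pending.push(gen);
    }
  }
}

runAll([worker('A', 2), worker('B', 3)]);
console.log(order.join(' '));
```

Because every switch between the two workers happens at a yield and is triggered by the driver, the recorded order is the same on every run.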

Since generators do not depend on additional low-level support from the engine, we can apply what we learned from exploring the yield feature in C# above and treat generators as syntactic sugar, using a helper tool to convert generator functions into ordinary JavaScript code. The two key points in the converted code are to preserve the function’s context information and to implement a sound iteration method that executes multiple yield expressions in order, thus realizing the behavior of generators.
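Both key points can be seen in a hand translation of a generator that has a local variable. The sketch below assumes a generator equivalent to function* countTo(limit) { for (var i = 1; i <= limit; i++) yield i; } and rewrites it in ES5; the name countTo and the state numbering are made up for this illustration and are not the output of any tool.

```javascript
// ES5 hand translation: the closure preserves the context (i, limit, state),
// and next() is the iteration method that runs the body piece by piece.
function countTo(limit) {
  var i;           // local variable of the original body, kept alive by the closure
  var state = 0;   // which part of the body to run on the next call

  return {
    next: function () {
      while (true) {
        switch (state) {
          case 0:            // loop initializer: var i = 1
            i = 1;
            state = 1;
            break;
          case 1:            // loop test, then "yield i"
            if (i > limit) { state = 3; break; }
            state = 2;
            return { value: i, done: false };
          case 2:            // resumed after the yield: i++
            i += 1;
            state = 1;
            break;
          default:           // body finished
            return { value: undefined, done: true };
        }
      }
    }
  };
}

var counter = countTo(3);
var produced = [];
var result = counter.next();
while (!result.done) {
  produced.push(result.value);
  result = counter.next();
}
console.log(produced);
```

Each call to next() picks up where the previous one stopped, and i keeps its value between calls even though next() returns every time.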

How Generators work in ES5

The Regenerator tool already implements this idea, and with it we can use generator functions in plain ES5. In this section, we’ll analyze the Regenerator implementation to get a deeper understanding of how generators work.

Regenerator’s online tool makes it easy to view the converted code. We again use the example from the beginning of the article.

function* example() {
  yield 1;
  yield 2;
  yield 3;
}
var iter = example();
iter.next();

After conversion, it is as follows.

var marked0$0 = [example].map(regeneratorRuntime.mark);
function example() {
  return regeneratorRuntime.wrap(
    function example$(context$1$0) {
      while (1)
        switch ((context$1$0.prev = context$1$0.next)) {
          case 0:
            context$1$0.next = 2;
            return 1;

          case 2:
            context$1$0.next = 4;
            return 2;

          case 4:
            context$1$0.next = 6;
            return 3;

          case 6:
          case 'end':
            return context$1$0.stop();
        }
    },
    marked0$0[0],
    this
  );
}
var iter = example();
iter.next();

As you can see from the converted code, similar to the C# compiler’s conversion of yield return statements, Regenerator rewrites the yield expressions in the generator function as switch cases, and uses context$1$0 in each case to preserve the current context state of the function.
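To see how the context object carries the position from one call to the next, the generated inner function can be called by hand. In the sketch below the context is a plain object with only the fields this example needs; Regenerator’s real Context class tracks considerably more, so treat this as an assumption-laden simplification.

```javascript
// Same shape as the generated example$ function, with a shorter parameter name.
function example$(context) {
  while (1)
    switch ((context.prev = context.next)) {
      case 0: context.next = 2; return 1;
      case 2: context.next = 4; return 2;
      case 4: context.next = 6; return 3;
      case 6:
      case 'end':
        return context.stop();
    }
}

// Simplified stand-in for the context (assumption: only these fields matter here).
var context = {
  prev: 0,
  next: 0,
  done: false,
  stop: function () { this.done = true; }
};

// Re-enter the function until it reports completion through the context.
var results = [];
while (!context.done) {
  var value = example$(context);
  results.push({ value: value, done: context.done, resumedAt: context.prev });
}
console.log(results);
```

Every call enters the switch at the case stored in context.next, which is how a plain function that always returns can appear to resume in the middle of its body.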

In addition to the switch case, the generator function example is marked by regeneratorRuntime.mark and returns an iterator object created by regeneratorRuntime.wrap.

runtime.mark = function(genFun) {
  if (Object.setPrototypeOf) {
    Object.setPrototypeOf(genFun, GeneratorFunctionPrototype);
  } else {
    genFun.__proto__ = GeneratorFunctionPrototype;
  }
  genFun.prototype = Object.create(Gp);
  return genFun;
};

After mark is applied, example becomes the function object shown in the following figure.

[Figure: the example function object after mark]
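The effect of mark on the prototype chain can be checked with stand-in objects. In the sketch below, Gp and GeneratorFunctionPrototype are simplified placeholders that only borrow the names from the excerpt above; they are assumptions and not Regenerator’s real objects.

```javascript
// Placeholder for the shared generator prototype that holds next/throw/return.
var Gp = {
  next: function () { return { value: undefined, done: true }; }
};
// Placeholder for the prototype that all marked generator functions share.
var GeneratorFunctionPrototype = Object.create(Function.prototype);

// Same logic as runtime.mark, without the __proto__ fallback.
function mark(genFun) {
  Object.setPrototypeOf(genFun, GeneratorFunctionPrototype);
  genFun.prototype = Object.create(Gp);
  return genFun;
}

function example() {}
mark(example);

// wrap later creates the iterator with Object.create(outerFn.prototype),
// so the iterator inherits the methods defined on Gp.
var iterLike = Object.create(example.prototype);

console.log(Object.getPrototypeOf(example) === GeneratorFunctionPrototype);
console.log(Gp.isPrototypeOf(iterLike));
console.log(typeof iterLike.next);
```

The function itself gets a new prototype, and objects created from example.prototype can reach the shared iterator methods through Gp.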

When the generator function example() is called, it returns an iterator object created by the wrap function.

runtime.wrap = function(innerFn, outerFn, self, tryLocsList) {
  // If outerFn provided, then outerFn.prototype instanceof Generator.
  var generator = Object.create((outerFn || Generator).prototype);
  var context = new Context(tryLocsList || []);

  // The ._invoke method unifies the implementations of the .next,
  // .throw, and .return methods.
  generator._invoke = makeInvokeMethod(innerFn, self, context);

  return generator;
};

The returned iterator object is shown in the following figure.

[Figure: the iterator object returned by wrap]

Calling iter.next() on the iterator object executes the _invoke method because of the following code. According to the wrap code above, _invoke is the function returned by makeInvokeMethod(innerFn, self, context), so that function is what finally runs.

// Helper for defining the .next, .throw, and .return methods of the
// Iterator interface in terms of a single ._invoke method.
function defineIteratorMethods(prototype) {
  ['next', 'throw', 'return'].forEach(function(method) {
    prototype[method] = function(arg) {
      return this._invoke(method, arg);
    };
  });
}
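The dispatch through _invoke can be tried in isolation. The sketch below repeats defineIteratorMethods from the excerpt and attaches a hypothetical _invoke that only records how it was called; proto, fake and calls are names made up for the illustration.

```javascript
function defineIteratorMethods(prototype) {
  ['next', 'throw', 'return'].forEach(function (method) {
    prototype[method] = function (arg) {
      return this._invoke(method, arg);
    };
  });
}

var proto = {};
defineIteratorMethods(proto);

var calls = [];
var fake = Object.create(proto);
// Hypothetical _invoke: records the method name and argument it receives.
fake._invoke = function (method, arg) {
  calls.push(method + ':' + arg);
  return { value: arg, done: method !== 'next' };
};

fake.next(1);
fake.return(2);
fake.throw(3);
console.log(calls);
```

All three public methods end up in the same function, with the method name passed along as the first argument.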

makeInvokeMethod is long, so we analyze only part of it here. First, we see that the generator initializes its state to “Suspended Start”.

function makeInvokeMethod(innerFn, self, context) {
  var state = GenStateSuspendedStart;

  return function invoke(method, arg) {

makeInvokeMethod returns the invoke function, and when we call .next(), the following statement inside invoke is what actually runs.

var record = tryCatch(innerFn, self, context);

In tryCatch, fn is the converted example$ function and arg is the context object. Because the invoke function refers to context through a closure, the same context is preserved throughout the iteration.

function tryCatch(fn, obj, arg) {
  try {
    return { type: 'normal', arg: fn.call(obj, arg) };
  } catch (err) {
    return { type: 'throw', arg: err };
  }
}

tryCatch actually calls example$, which enters the converted switch case and executes the code logic. If the result record has type normal, it is wrapped in the iterator result format ({ value, done }) and the generator state is updated to GenStateCompleted or GenStateSuspendedYield.

var record = tryCatch(innerFn, self, context);
if (record.type === "normal") {
  // If an exception is thrown from innerFn, we leave state ===
  // GenStateExecuting and loop back for another invocation.
  state = context.done
    ? GenStateCompleted
    : GenStateSuspendedYield;

  var info = {
    value: record.arg,
    done: context.done
  };
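The excerpts above can be assembled into a reduced, runnable version of the invoke logic. This is a sketch under stated assumptions: makeInvokeSketch handles only plain next() calls (no sent values, no throw or return handling, no delegation), the state constants are given string values for illustration, and the inner function and ctx object are made up for the demonstration.

```javascript
// State names follow the excerpts; the string values are assumed for this sketch.
var GenStateSuspendedStart = 'suspendedStart';
var GenStateSuspendedYield = 'suspendedYield';
var GenStateExecuting = 'executing';
var GenStateCompleted = 'completed';

function tryCatch(fn, obj, arg) {
  try {
    return { type: 'normal', arg: fn.call(obj, arg) };
  } catch (err) {
    return { type: 'throw', arg: err };
  }
}

// Reduced version of makeInvokeMethod: state and context live in the closure.
function makeInvokeSketch(innerFn, self, context) {
  var state = GenStateSuspendedStart;

  return function invoke(method, arg) {
    if (state === GenStateExecuting) {
      throw new Error('Generator is already running');
    }
    if (state === GenStateCompleted) {
      return { value: undefined, done: true };
    }
    state = GenStateExecuting;
    var record = tryCatch(innerFn, self, context);
    if (record.type === 'normal') {
      state = context.done ? GenStateCompleted : GenStateSuspendedYield;
      return { value: record.arg, done: context.done };
    }
    // Simplification: an exception from innerFn ends the generator.
    state = GenStateCompleted;
    throw record.arg;
  };
}

// Hypothetical inner function: yields once, then finishes.
var ctx = { next: 0, done: false };
var invoke = makeInvokeSketch(function (c) {
  if (c.next === 0) { c.next = 1; return 'first'; }
  c.done = true;
  return undefined;
}, null, ctx);

var r1 = invoke('next');   // suspended at the first yield
var r2 = invoke('next');   // body finished
var r3 = invoke('next');   // already completed

// tryCatch turns a thrown error into a record instead of propagating it.
var failed = tryCatch(function () { throw new Error('boom'); }, null, ctx);
console.log(r1, r2, r3, failed.type);
```

The same ctx object is passed to the inner function on every call, which is the closure-based context preservation described above.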

Summary

By analyzing the generator code produced by Regenerator and the tool’s source code, we have explored how generators operate. Regenerator marks the generator function with helper functions, adding methods such as next and return. It also wraps the returned generator object so that calls to methods such as next end up in a state machine built from a switch case. In addition, closures are used to preserve the generator function’s context information.

This process is essentially the same as the implementation of the yield keyword in C#: both use compile-time conversion, a state machine model, and preserved function context to realize the language features brought by the yield keyword.