If you know Haskell, you are familiar with the following expressions
Several of the above expressions produce infinite lists. For those of you who are used to mainstream programming voices may be confused as to how the concept of infinity can be expressed inside a finite amount of memory. The main reason is that Haskell is a language that uses an inert evaluation strategy by default, and the part that is not used is just an expression inside the memory and does not really do the computation.
If we just look at the above expressions, many of you may say that it doesn’t feel like there’s anything magical about it, and it doesn’t seem to do anything. Let’s look at the following code again.
fibonacci series in Haskell.
fibonacci itself is an inert structure, so in the calculation, it will first count the two
1s in front of the list, to get
1 : 1..., and then how to express the
fib(n) = fib(n - 1) + fib(n - 2) property of
fibonacci? We can notice that
n - 1 and
n - 2 are exactly one place apart in the series, so
n can be seen as the result of the sum of the wrong places in the series.
Let’s look at a sieve method for finding prime numbers. If you are not familiar with the sieve method, you can click
wiki to see how the algorithm works. The following code is a simple implementation of
So, Why Lazy?
Inert lists make the code and structure more flexible for certain list operations of indefinite length. To use the
primes list as an example, in a traditional C or Java implementation, we would have to declare a maximum length or a maximum range of values, such as prime numbers up to 10000. If the later calculation is to use more than this range, we will have to re-call the generation function to regenerate a longer list. The problem here is: first, to actively call this factory function, and second, to manually maintain a cache list if we want to reuse the already computed data, which inevitably increases the complexity of the code. Another possible scenario is that we pre-generate a very long list and only use a drop of data from the head of the list in the later calculations, which is also a great waste.
The use of inert lists increases the expressiveness of our programming, allowing us to focus more on the properties of the data structure itself, rather than wasting time on how to manage the stack. Because the inert value feature ensures that we only compute a value when we need it, we can automatically minimize our computation and save resources.
For example, we can read and write files with lazy byteString, which itself does not load the whole file into our memory, but reads it on demand. Sometimes we can read a large file and filter out only the first few dozens of data we need, but we have to put hundreds of M or even G files into memory.
Here is also an article from more than 10 years ago How to Speed Up Lo-Dash ×100? Introducing Lazy Evaluation, if you are interested, you can click on it.
fetch method itself is executed immediately, but if the condition is met, the
fetch method here will be executed twice. This is not the behavior we expect, and the usual way to make this
fetch action execute when we need it, instead of starting it when it is declared, is to change it to look like the following.
Inspired by this, we can roughly implement the following structure.
List<T> is essentially a single-linked table, and the
tail passed into the constructor is a factory function that builds new
List nodes. Only when we access a node, we evaluate its
head, and when we access its next node, we evaluate
tail, otherwise both
tail are just expressions to be evaluated.
This seems to have solved my problem, but this structure has a lot of unnecessary extra overhead when converting to and from normal
So is there a more natural structure in
Getting Started with Iterable
A new feature of ES6 gave me the answer I was looking for, Iteration Protocols. If the MDN description is too long, you can just look at the equivalent type declaration below.
All objects that implement an Iterable interface can be used by means of an object such as
... .itor and
Array.from. The iteration ends when
true in the return value of the
next method. And only when we access the
next method will we move to the next iteration, the ideal
At this point we look at how our
fibonacci should be written.
At this point, we can already express an inert infinite series. But the above code is too cumbersome, so it’s a good thing that
ES6 also provides us with
Generator, which makes it easy to write
IterableItorator, and in a sense,
Generator is syntactic sugar for the above code.
Generator, the above code can be simplified to look like the following.
Define Infinite List
Moving on from the above code, the following code implements the
range and other methods from the beginning of the article.
As you can see, the code is very intuitive and easy to understand.
Once we have the list, we need to operate on top of it. The following code implements the
map/filter/take/takeWhile methods respectively.
The above code is relatively simple. The more difficult part is to implement the
zip method, that is, how to merge two lists into one?
The difficulty is that if you receive an
Iterable object, you don’t necessarily need to implement the next method, such as
String, etc. Also, not all
Iterable objects can be accessed by
index. In addition, if we want to turn
Array.from into an array first and then operate on the array, we will encounter a situation where the
Iterable object we pass in is infinite, like
fibonacci above, in which case we can’t use
One of my ideas is to elevate an
Iterable object to an
IterableItorator object, and then iterate through it one by one using the next method.
Generator gives us a
yield* operator that allows us to easily define a
lift method, it is easy to write
zip methods and
More methods can be found at the bottom of my
repo, so I won’t list them all here.
Iterator are very powerful language-level capabilities that
ES6 brings to the table, and its own evaluation can be seen as inert.
TJ first came out around 13 years ago, the code was amazingly short and concise. However, in our use, one is limited by browser compatibility, and the other is limited by our use scenario, I think we have not developed its features far enough. Combining
Iterator can do a lot more for us.
On the contrary, the abuse of inert structures will cache a lot of thunk during the execution of the program and increase the overhead in memory.
- GitHub lazyList。
- Slide Lazy List With Generator and Iterator