Lisp is a computer programming language invented by John McCarthy in 1958. The name “Lisp” is short for “List Processing”. That is more or less the accepted, authoritative introduction, and most material on the Internet says much the same. It tells us that Lisp is a programming language in the same category as C and Java, and quite different from natural languages such as Chinese and English, which is probably the impression most people have of it. While learning Lisp, however, I have run into many things that contradict the conventional wisdom about programming languages, and I would like to share some thoughts about them.

1. Lisp language and dialects

Strictly speaking, if “the Lisp language” means one particular language, it is the version John McCarthy invented in 1958. That version, however, now lives mostly in books. Over the following decades Lisp evolved rapidly, and modern Lisps are no longer the same language as the original: they still look similar, with parenthesized S-expressions, but their features and implementations are very different. The term “Lisp language” is therefore better understood as a family of languages that includes the original Lisp and its derived dialects (e.g. Common Lisp, Scheme, Emacs Lisp).

You may never have heard of dialects of a programming language. After all, people rarely speak of a “C++ dialect” or a “Java dialect”, because each version of Java explicitly defines all of its syntax, and developers cannot modify that syntax or add to it directly. Lisp, however, lets users extend both its lexical and its syntactic level (e.g., with Common Lisp macros and reader macros), so Lisp users can add as many language features as they like to their own Lisp. The result is a large number of dialects, some of which do not even have names.

Common Lisp and Scheme are the two mainstream Lisp dialects today, each with many implementations and derived dialects. Emacs Lisp and AutoLISP are still active, but only in specific environments: Emacs Lisp is used only in the Emacs editor, and AutoLISP is embedded in AutoCAD. There are also dialects used in particular scientific fields, which I won’t go into here.

Although Common Lisp and Scheme are the two major Lisp dialects, neither is a concrete implementation; each is only a specification. Concrete implementations provide the language features and APIs the specification requires and usually add extensions of their own. Some common implementations and their official home pages (or project addresses) are listed below. I personally recommend SBCL and Chez Scheme.

Common Lisp implementations:

Name                             Site
Steel Bank Common Lisp (SBCL)    http://www.sbcl.org/
GNU CLISP                        https://clisp.sourceforge.io/
Embeddable Common-Lisp (ECL)     https://ecl.common-lisp.dev/
Armed Bear Common Lisp (ABCL)    https://www.abcl.org/

Scheme implementations:

Name           Site
Chez Scheme    https://scheme.com/ or https://cisco.github.io/ChezScheme/
Racket         https://www.racket-lang.org/
GNU Guile      http://www.gnu.org/software/guile/

2. Lisp programs

Talking about a programming language can be hollow and abstract without program examples. This section shows what a Lisp program might look like.

First, let’s look at a few hello world examples.

  • Example 1. print 5 lines of hello world

    English: Print 5 lines of "hello world".

    Lisp: (print 5 lines of "hello world").

  • Example 2. the string “hello world”

    English: String "hello world".

    Lisp: (string "hello world") or #T(character string "hello world").

  • Example 3. John says, “hello world”

    English: "hello world," says John

    Lisp: (John says "hello world") or #P (John says "hello world")

Some of you may have tried the examples above in a Lisp implementation and got a pile of errors. I am not playing a trick on anyone: the forms above really can be Lisp programs; they merely lack the macro or operator definitions that would give them meaning. Back to business: here are a few simple examples (in Common Lisp) that do run.

(print "hello world") ; print "hello world"

(list 1 2 3 4)        ; create a list containing the elements 1, 2, 3, 4

(defun test-fun (a b) ; define the function test-fun, which performs addition
  (+ a b))

(test-fun 1 2)        ; call test-fun with 1 and 2

(dolist (i (list 1 2 3 4)) ; iterate over the list and print each element
  (print i))

We can see that the Lisp programs above share one feature: they all have the form (xxx ...), a sequence of symbols enclosed in parentheses. This form is called a list. For example, the program (list 1 2 3 4) returns (1 2 3 4), a list containing the elements 1, 2, 3, and 4; the empty list is written (). (Some readers may have noticed expressions such as #P(...) and #Y[...]. These rely on the reader macro mechanism provided by Common Lisp: the reader converts them into Lisp programs made of lists. Reader macros can be used to implement convenient syntactic sugar, or even to embed JSON-like syntax such as #J{ "id" : "XX123" , "type" : 2 }.)

Lisp programs themselves are made of such lists, and almost anything organized as lists can be a Lisp program. If you know something about parsers and interpreters, you can also add other forms of code organization to Lisp through the extension mechanisms an implementation provides. In short, the form of Lisp code is highly flexible, and often it is only the developer’s imagination and habits that limit what a program looks like.

The next two Lisp programs are closer to real applications.

The first is a Common Lisp program: a database query written with SxQL, a macro-based DSL (anyone who knows SQL will understand what it does).

(select (:title :author :year)
  (from :books)
  (where (:and (:>= :year 1995)
               (:< :year 2010)))
  (order-by (:desc :year)))

The following is a reference answer to an exercise from the book Essentials of Programming Languages (the comments are my own additions). It is written in Scheme and builds a parser for the LET language.

(define the-lexical-spec                            ; define the lexical specification
  '((whitespace (whitespace) skip)                  ; whitespace (Scheme's own whitespace characters); skip means discard it
    (comment ("%" (arbno (not #\newline))) skip)    ; comments (from % to end of line), also skipped; arbno is like "*" in a regex: "any number of"
    (identifier                                     ; identifiers (a letter followed by any number of letters, digits, or symbols); symbol means keep as a symbol
      (letter (arbno (or letter digit "_" "-" "?")))
      symbol)
    (number (digit (arbno digit)) number)           ; numbers (one or more digits); number means keep as a number
    (number ("-" digit (arbno digit)) number)))     ; negative numbers (- followed by one or more digits), kept as a number

(define the-grammar                                 ; define the grammar
  '((program (expression) a-program)                ; a program consists of an expression
    (expression (number) const-exp)
    (expression                                     ; an expression can look like "-(1,2)"
      ("-" "(" expression "," expression ")")
      diff-exp)
    (expression                                     ; an expression can look like "zero?(a)"
      ("zero?" "(" expression ")")
      zero?-exp)
    (expression                                     ; an expression can look like "if ... then ... else"
      ("if" expression "then" expression "else" expression)
      if-exp)
    (expression (identifier) var-exp)               ; an expression can be an identifier
    (expression                                     ; an expression can look like "let a = 1 in ..."
      ("let" identifier "=" expression "in" expression)
      let-exp)))

;;;;;;;;;;;;;;;; sllgen boilerplate ;;;;;;;;;;;;;;;;
(sllgen:make-define-datatypes the-lexical-spec the-grammar) ; generate the data types from the lexical spec and grammar

(define scan&parse
  (sllgen:make-string-parser the-lexical-spec the-grammar)) ; generate the parser from the lexical spec and grammar

Lisp’s syntax is extensible and its macro mechanism is highly dynamic, so a Lisp can extend itself, and can even extend its syntax at run time to give birth to new languages. In addition, Lisp code consists of lists (e.g., (+ 1 2 3 4), which sums 1, 2, 3, and 4), and the list is one of Lisp’s basic data structures (e.g., '(+ 1 2 3 4) or (list '+ '1 '2 '3 '4), a list containing the elements +, 1, 2, 3, and 4). Lisp therefore makes it very easy to modify and generate Lisp code. For example, if the variable expr holds a freshly built list (+ 1 2 3 4), then (setf (first expr) '-) replaces the first element with - and turns the program from a running sum into a running subtraction. (Modifying a quoted literal directly has undefined consequences in Common Lisp, so the list should be built with list or copied first.) Because data and code are so uniform in Lisp, many people who write Lisp are able to implement their own Lisp interpreters and compilers.
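
To make the macro mechanism mentioned above more concrete, here is a minimal sketch in Common Lisp. The macro name my-unless is a hypothetical name chosen for illustration (Common Lisp already provides unless); only standard operators are used.

```lisp
;; A macro receives its arguments as unevaluated list data and returns
;; a new list, which is then treated as code.
(defmacro my-unless (condition &body body)
  "Run BODY only when CONDITION is false (illustrative; CL already has UNLESS)."
  `(if ,condition
       nil
       (progn ,@body)))

;; MACROEXPAND-1 returns the generated code as ordinary list data.
(defparameter *expansion*
  (macroexpand-1 '(my-unless (> 1 2) (print "hello") 42)))
;; *expansion* now holds the list (IF (> 1 2) NIL (PROGN (PRINT "hello") 42))

;; Using the macro: the condition is false, so the body runs.
(my-unless (> 1 2)
  (print "hello")
  42)
```

The expansion is an ordinary list, so it can be inspected with first, rest, and length like any other data before it is ever run.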

3. Characteristics of Lisp

When analyzing a language, one inevitably compares its advantages and disadvantages with those of other languages. But such judgments are easily swayed by the application scenario and the developer’s skill, and can be a case of “one man’s honey is another man’s arsenic”. It is therefore better to list some of Lisp’s features and discuss the advantages and disadvantages each may bring in development.

“Features” may be understood as “unique points”. But Lisp is an old language that has existed for more than half a century. During that time many features that were originally unique to Lisp have been adopted by other languages and have even become industry standards (e.g., recursive function calls), while many of Lisp’s modern features were absorbed from other languages (e.g., object-oriented programming). Talking only about the unique points would be one-sided, so I will describe Lisp’s features as a language more broadly. (Common Lisp and Scheme each have many more points worth mentioning, but those deserve a long article of their own.)

3.1 Simple syntax

Few languages have a simpler syntax than Lisp. As SICP notes near its beginning, all of the language’s formal properties can be covered in an hour, after which the syntactic details need no further attention (because there are none). This is especially true of Scheme (Common Lisp is a bit more complex). When I did the SICP exercises, an interpreter for Scheme’s basic syntax took about 300 lines of code or fewer. SICP is only 400 pages long, and by the end it has covered a simple Scheme interpreter and compiler.

For developers, this simplicity greatly reduces learning time and frees more effort for advanced features and for programming itself. For language processors (compilers and interpreters), it simplifies parsing and makes such tools easy to write.
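
As a rough illustration of why language processors are easy to write, the sketch below evaluates a tiny expression language in Common Lisp. The function name tiny-eval and the let1 form are hypothetical and exist only for this example; the point is that the standard reader already delivers the syntax tree as nested lists, so no separate parser is needed.

```lisp
(defun tiny-eval (expression &optional environment)
  "Evaluate EXPRESSION in ENVIRONMENT, an association list of (symbol . value)."
  (cond
    ;; Numbers evaluate to themselves.
    ((numberp expression) expression)
    ;; Symbols are looked up in the environment.
    ((symbolp expression)
     (let ((binding (assoc expression environment)))
       (if binding
           (cdr binding)
           (error "Unbound variable: ~S" expression))))
    ;; (if test then else): in this toy language a test value of 0 is false.
    ((eq (first expression) 'if)
     (destructuring-bind (test then else) (rest expression)
       (if (eql (tiny-eval test environment) 0)
           (tiny-eval else environment)
           (tiny-eval then environment))))
    ;; (let1 name value body): bind one variable, then evaluate BODY.
    ((eq (first expression) 'let1)
     (destructuring-bind (name value body) (rest expression)
       (tiny-eval body
                  (acons name (tiny-eval value environment) environment))))
    ;; Arithmetic: evaluate the arguments, then apply the operator.
    ((member (first expression) '(+ - *))
     (apply (symbol-function (first expression))
            (mapcar (lambda (argument) (tiny-eval argument environment))
                    (rest expression))))
    (t (error "Unknown expression: ~S" expression))))

;; READ-FROM-STRING turns the text into nested lists; TINY-EVAL walks them.
(tiny-eval (read-from-string "(let1 x 5 (if (- x 5) 0 (* x (+ x 1))))")) ; 30
```

Here (- x 5) is 0, which this toy language treats as false, so the else branch (* x (+ x 1)) is evaluated.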

The main drawback is that the code is hard to parse with the naked eye. For example:

fun1(1,2);    //1

if (type == 0){ //2
    console.log("...");//... 3
    return 0;          //... 3
}

You do not need to read every character of the JavaScript code above: a glance is enough to see that 1 is a function call, 2 is an if branch, and 3 is a block of code. Now rewrite the same code in Common Lisp:

(fun1 1 2)

(if (= type 0)
  (progn
    (print "...")
    (return 0)))

Here it is not clear whether fun1 is a function, a macro, or a special operator; it is not obvious that if is a special operator that introduces a conditional branch; nor is it obvious that (progn ...) introduces something like a code block. The overall structure of the program cannot be taken in at a glance; one has to read it almost character by character. Because Lisp code is all S-expressions, it is sometimes impossible to tell at the lexical or syntactic level whether a symbol names a function or a macro. Editors therefore cannot offer syntax highlighting as precise as they do for Java, which makes managing a project somewhat harder.
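
Although the source text does not reveal what kind of operator a symbol names, a running Common Lisp image can be asked. The sketch below uses only the standard functions special-operator-p, macro-function, and fboundp; the helper name operator-kind is hypothetical.

```lisp
(defun operator-kind (symbol)
  "Classify SYMBOL as :special-operator, :macro, :function, or :unbound."
  (cond ((special-operator-p symbol) :special-operator)
        ((macro-function symbol) :macro)
        ((fboundp symbol) :function)
        (t :unbound)))

(operator-kind 'if)     ; :SPECIAL-OPERATOR
(operator-kind 'progn)  ; :SPECIAL-OPERATOR
(operator-kind 'print)  ; :FUNCTION
(operator-kind 'dolist) ; typically :MACRO (implementations have some freedom here)
```

This is the kind of query an editor connected to a live Lisp process can make, which is one way tools work around the ambiguity described above.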

3.2 Diversity of implementations and operating principles

Introducing a programming language inevitably involves how the language is implemented and run. For example, C can be compiled to machine code for a target platform with the GCC compiler; Python can be interpreted by the Python interpreter; and Java can be compiled to Java bytecode by the Java compiler and then executed by the Java virtual machine. Each of these languages also has an official or de facto standard implementation and a fairly detailed specification.

Lisp has no official implementation. Common Lisp and Scheme, for example, are defined only by language specifications, which do not prescribe specific interpreting or compiling behavior (though they do specify some related APIs). In fact, Lisp seems to run in every way a programming language can. CLISP, Guile, and most other implementations can be used interactively at the command line through a REPL; SBCL and Chez Scheme compile Lisp to native machine code; ECL can be embedded in C/C++ programs; CLISP compiles code to bytecode and then interprets that bytecode; ABCL runs on the Java virtual machine and can be embedded in Java programs; and so on. Unlike most languages, which are implemented only in software, Lisp even had dedicated hardware in the 1980s: the Lisp machine, a computer designed to execute Lisp programs directly. Lisp machines were all the rage in those days and can now probably be seen only in museums.

Moreover, Lisp draws no strict line between compile time and run time: a program does not have to be compiled into a standalone file before it runs, nor does it have to be interpreted. In a Common Lisp program you can even do the following:

(setf a 1
      fun-name 'add1)                          ; set the symbol a to 1 and fun-name to the function name add1
(setf exp1 (list '+ 'a 1))                     ; build the expression (+ a 1) and store it in exp1
(eval exp1)                                    ; interpret and evaluate the value of exp1
(setf fun-def-exp `(defun ,fun-name (a) ,exp1)); build the definition (defun add1 (a) (+ a 1))
(eval fun-def-exp)                             ; interpret and evaluate that function definition
(compile fun-name)                             ; compile the function
(funcall fun-name 2)                           ; call the function

Such a variety of implementations lets users choose the one that fits their scenario: SBCL and Chez Scheme, which compile to native machine code, suit cases where performance matters; CLISP, with its good interactive experience, can serve as a command-line REPL; the lightweight ABCL and ECL can be embedded in Java and C/C++ programs, respectively, as built-in scripting engines; and so on. The wealth of execution options also makes it possible to generate, compile, and load code at run time, so developers are no longer constrained by how programs are executed.

However, “diversity” often goes hand in hand with “fragmentation”: implementations include their own extensions and even extended syntax (Racket, for example, has been extended almost to the point of becoming a different language). Programs written strictly to the language specification are portable across implementations and backward compatible, but portability suffers as soon as a program uses features specific to one implementation. For example, the Japanese Common Lisp developer Eitaro Fukamachi wrote an HTTP server called Woo that has excellent performance, but it uses SBCL-specific features and cannot run on other Common Lisp implementations.
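
One common way to contain this kind of fragmentation in Common Lisp is the standard reader conditional syntax #+ and #-, which keeps or skips the next form depending on the keywords in *features*. The sketch below is illustrative only: the function names implementation-note and describe-runtime are hypothetical, and the strings stand in for what would be real implementation-specific code.

```lisp
;; Exactly one of the three strings survives reading, depending on
;; which implementation reads this file.
(defun implementation-note ()
  "Return a short string naming the code path selected at read time."
  #+sbcl "SBCL-specific code path"
  #+(and (not sbcl) ecl) "ECL-specific code path"
  #-(or sbcl ecl) "portable fallback")

(defun describe-runtime ()
  "Report the running implementation using only standard functions."
  (format nil "~A ~A: ~A"
          (lisp-implementation-type)
          (lisp-implementation-version)
          (implementation-note)))

(describe-runtime)
```

A library written this way still has to supply a real fallback for each implementation it claims to support; the reader conditional only makes the split explicit.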

3.3 Automatic memory management (GC)

Even in this era of GC, this is a feature that I don’t want to talk about, but I have to…

Lisp was the first language in history to adopt garbage collection, which was groundbreaking at the time (the “father of Lisp” was also the “father of GC”). This approach to memory management improves memory safety, relieves developers of the mental burden of managing memory, and makes the language simpler; some collection algorithms also compact live objects and so improve the contiguity of memory. Lisp’s expressive power, free style, and automatic memory management let developers concentrate on programming without paying much attention to memory. It is a feature that directly improves developer happiness.

However, the drawbacks of garbage collection became drawbacks of Lisp as well. Worst-case pause times make it hard to use in scenarios with strict real-time requirements, and the less efficient use of the heap makes it hard to run on low-end embedded devices where resources are tight. In addition, although more than half a century has passed since the first garbage collection algorithm was proposed, choosing the right algorithm for an application, tuning GC parameters, and writing programs that cooperate with the collector remain a demanding and rare craft.

3.4 First-class functions and functional programming

Lisp provides first-class functions and therefore supports functional programming. First-class functions have the following properties:

  1. they can be named by variables
  2. they can be passed as arguments to procedures
  3. they can be returned as the results of procedures
  4. they can be included in data structures

Expressed in a JavaScript program, this means:

var f = (a) => a + 1;               //1
(function (a) {
  a();
})(() => console.log("hello"));     //2
function test (x){
  return (a) => a + x;              //3
}
[1, 2, () => console.log("hello")]; //4

A roughly equivalent Scheme program:

(define f (lambda (a) (+ a 1)))      ;1
((lambda (a) (a))
  (lambda () (printf "hello")))      ;2
(define test (lambda (x)
  (lambda (a) (+ a x))))             ;3
(list 1 (lambda () (printf "hello")));4

As shown above, functions in Lisp are used just like numbers and strings, which provides the foundation for functional programming. The experience is much smoother than in Java and Python, which “add” functional features on top of the language.
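
The same four properties can be sketched in Common Lisp. One difference from Scheme is worth noting: in Common Lisp a function held in a variable is called through funcall. All names below (*add-one*, call-twice, make-adder, *steps*, run-steps) are hypothetical and chosen for illustration.

```lisp
;; 1. A function bound to a variable.
(defparameter *add-one* (lambda (a) (+ a 1)))

;; 2. A function passed as an argument.
(defun call-twice (f argument)
  "Apply F to ARGUMENT, then apply F again to the result."
  (funcall f (funcall f argument)))

;; 3. A function returned as a result (a closure that captures X).
(defun make-adder (x)
  (lambda (a) (+ a x)))

;; 4. Functions stored in a data structure.
(defparameter *steps* (list *add-one* (make-adder 10) #'1+))

(defun run-steps (steps value)
  "Feed VALUE through each function in STEPS, left to right."
  (reduce (lambda (accumulator f) (funcall f accumulator))
          steps
          :initial-value value))

(call-twice *add-one* 5) ; 7
(run-steps *steps* 0)    ; 12
```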

However, Lisp recommends and supports functional programming without mandating it, and it places little emphasis on pure functions. It is not so much a “functional language” as a “multi-paradigm language”: besides functional programming, object-oriented programming, language-oriented programming, logic programming, and so on are all possible (and if something is not, you can extend the syntax to make it so). As a result, some advanced or radical functional features and modules (e.g., immutable data structures and lazy evaluation) are missing from the base language and are available only through extensions or open-source libraries.
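
As an example of how such a missing feature can be added as an extension, here is a minimal sketch of lazy evaluation in Common Lisp built from a macro and a closure. The names lazily and force-value are hypothetical; real libraries offer more complete versions of this idea.

```lisp
(defmacro lazily (expression)
  "Return a closure that evaluates EXPRESSION at most once, on first use."
  (let ((done (gensym "DONE"))
        (value (gensym "VALUE")))
    `(let ((,done nil)
           (,value nil))
       (lambda ()
         (unless ,done
           (setf ,value ,expression
                 ,done t))
         ,value))))

(defun force-value (promise)
  "Run PROMISE (a closure made by LAZILY) and return its cached value."
  (funcall promise))

(defparameter *counter* 0)
(defparameter *promise* (lazily (incf *counter*))) ; nothing is evaluated yet

(force-value *promise*) ; evaluates (incf *counter*) and returns 1
(force-value *promise*) ; returns the cached 1; *counter* stays at 1
```

The macro is needed because an ordinary function would evaluate its argument before the call, which is exactly what lazy evaluation must avoid.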

3.5 Code is data

In most popular programming languages, programs can build all kinds of complex data structures and computational procedures, but the code that builds them occupies a special position and may even be read-only and protected, like the access-restricted method area in Java and the code segment in C. Most languages also have a complex syntax, which makes the parsed code structure complex as well. For these reasons it is hard to write code generators and code-processing tools for them. Lisp solves this problem: a Lisp program is itself organized as Lisp’s basic list structure, which is already very close to an abstract syntax tree, and Lisp’s support for symbolic data makes it simple to represent the identifiers and keywords in code. These two points greatly simplify the manipulation of Lisp code.

For example:

(list 1 2 3 4)        ; 1. yields the list (1 2 3 4)
(quote (list 1 2 3 4)); 2. yields the list (LIST 1 2 3 4)
'(list 1 2 3 4)       ; 3. yields the list (LIST 1 2 3 4)

In the Lisp code above, 1 is a function call that builds a list of data; 2 is a quote operation that returns the code itself as list-structured data (the LIST inside it is a symbol); and 3 is shorthand for 2. This makes it easy to do things like the following:

(let ((exp '(list 1 2 3 4)))    ; bind an expression, as list data, to the symbol exp
  (print (length (cdr exp)))    ; get the number of arguments in the expression
  (setf exp (cons '+ (cdr exp))); replace the expression's operator with "+"
  (eval exp))                   ; evaluate the new expression

As John McCarthy says, “LISP programs act as representations of LISP data that can be manipulated by object programs. This prevents a separation between the system programmer and the application programmer.” In other programming languages, those who develop the language and those who use it face very different data and hold very different privileges; in Lisp the gap between the two is much smaller, because everyone is working with the same list structure.
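
As a small sketch of a code-processing tool, the function below walks nested expressions and counts calls to a given operator. The name count-calls is hypothetical, and the walker makes a simplifying assumption: every nested list is treated as a call, which is not true of forms such as let bindings.

```lisp
(defun count-calls (operator expression)
  "Count the sub-forms of EXPRESSION whose first element is OPERATOR.
Simplification: every nested list is assumed to be a call."
  (if (atom expression)
      0
      (+ (if (eq (first expression) operator) 1 0)
         (reduce #'+
                 (mapcar (lambda (sub-form) (count-calls operator sub-form))
                         (rest expression))))))

(defparameter *expression* '(* (+ 1 2) (- (+ 3 4) 5)))

(count-calls '+ *expression*)       ; 2
(eval *expression*)                 ; 6
;; The standard function SUBST builds a new tree with every + replaced by MAX.
(eval (subst 'max '+ *expression*)) ; -2
```

The whole tool is a dozen lines because the code it inspects is already a tree of lists and symbols.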

3.6 Free extensions

To add new functionality to a program, C adds new structs and functions and Java adds new classes; each language has its own approach, and some multi-paradigm languages offer several (e.g., C++ and Scala). But in the pursuit of concise, highly abstract expression, the syntax of the language can become a barrier that confines developers or forces them to take roundabout routes to the functionality they want. Hence the messy reality of “the right language for the right domain”.

Lisp has its own approach too, but instead of confining developers to it, Lisp gives them various mechanisms for extending the language itself (e.g., macros and the metaobject protocol in Common Lisp). For example, the following code uses Common Lisp’s reader macros to add a slice-like syntax for integer ranges to Common Lisp.

(set-macro-character #\] (get-macro-character #\)))
(set-dispatch-macro-character #\# #\[
  #'(lambda (stream char1 char2)
      (declare (ignore char1 char2))
      (let ((accum nil)
            (pair (read-delimited-list #\] stream t)))
        (do ((i (ceiling (car pair)) (1+ i)))
            ((> i (floor (cadr pair)))
             (list 'quote (nreverse accum)))
          (push i accum)))))

#[2 7] ; yields (2 3 4 5 6 7)

If another language’s means of expression do not fit the scenario, you have to look for another language; if Lisp’s do not fit, you make Lisp fit the scenario. This has made Lisp popular with many hackers who value freedom. As John McCarthy says: “Everyone can ‘improve’ their LISP, and many of these ‘improvements’ have evolved into improvements to the language.” Someone who is “proficient” in Java may not be able to implement Java, but someone who is merely “familiar” with Lisp can generally implement a working Lisp.

Freedom fosters diversity, and also fragmentation. Dialects and programming paradigms have been implemented in all sorts of ways; adding object-oriented or logic programming to Lisp is like a homework exercise, solved with a few hundred lines of code in a day. While this makes developers more “independent”, it also makes the community less lively and less connected than those of other languages.

In addition, this freedom creates the potential for projects to get out of control. There is a joke that “a Lisp, once it runs, becomes another language, and a Lisp program, once it runs, becomes another program”. That dynamism is a great advantage when it is kept under control, but it can be fatal when it is not. If Lisp developers lack discipline, it is easy to write highly abstract yet unstructured programs, which are far more painful to debug than programs in other languages.

Finally, some advice for those who have become interested in Lisp (purely my own opinion).

Lisp is well suited to people who:

  1. want to broaden their horizons (I recommend reading SICP or PAIP; it may feel like going from the bottom of a well into outer space).
  2. are interested in computer science, especially programming languages (there are many textbooks and papers; Lisp has fewer books on the engineering side).
  3. would rather generate code than write it (learning Lisp deepens one’s understanding of processing and generating code with programs).
  4. want to implement a programming language (Lisp is a shortcut; many interpreters today are written in C, but learning interpreters through C is a long and bumpy road).

However, I do not recommend spending time on Lisp if you:

  1. expect ready-made project solutions or direct application at work (Lisp lacks third-party libraries, and there is little material in Chinese).
  2. worship a programming language religiously (it is dangerous to overlook the shortcomings of Lisp’s design out of worship).
  3. consider your time and energy too valuable (many good books on Lisp demand a lot of time for study and exercises, or even working through untranslated originals; the road of practice is longer, though the ceiling is higher).