The history of Shen condensed in a few pages. For those of antiquarian interests.
1.5 The Forerunners of Shen: SEQUEL
Up to this point, we have surveyed the historical and theoretical background to declarative programming in the medium of the third person. We now enter the territory of a specific programming language, the one that is the subject of this book, and how Shen evolved within this framework. My personal involvement as creator means that some first person involvement is necessary. Hence the remaining sections of this chapter are something of a scientific memoir.
The history of Shen began in 1989, when I was working as a research assistant at the LFCS in Edinburgh. At that time, I worked in Lisp, and to improve my productivity, I wrote a 700 line Lisp program which translated pattern-directed functional code into reasonably efficient Lisp.
While cycling in the Lake district in the spring of 1990, I had the insight that since type checking was inherently deductive in nature, the type discipline of a language could be specified as a series of deduction rules. Using the techniques of high-performance theorem-proving, these rules could be compiled down into an efficient type checker. The type-discipline would be free from the procedural baggage of low-level encoding and presented in a form that was clear, concise, and accessible to experimentation and extension by researchers in type theory. This was a bold conception that seemed eminently reasonable, but which in fact took a further 15 years of development to realise completely.
The prototype development was the language SEQUEL (SEQUEnt processing Language), written in Lisp and introduced publicly in 1992 and to the International Joint Conference on Artificial Intelligence in 1993. SEQUEL anticipated many of the features of Shen, in particular the use of Gentzen's sequent notation to formulate the type rules. SEQUEL actually contained the type theory for a substantial portion of Common Lisp (over 300 Common Lisp system functions were represented in SEQUEL) necessitating a sizeable source code program of more than 23,000 lines. SEQUEL was not only designed to support type secure Common Lisp programming, it was also supposed to support direct encoding of theorem-provers by giving the user the power to enter logic rules in sequent notation to the SEQUEL system.
SEQUEL sustained Andrew Adams's (1994) reimplementation INDUCT of the Boyer-Moore theorem-prover. His M.Sc. project was written in 6,000 lines of SEQUEL, and generated nearly 30,000 lines of Common Lisp, and gained him a distinction. In 1993, state-of-the-art was a SPARC II, so power was limited. Loading and type checking INDUCT took several minutes.
SEQUEL was a compromise of the ideals of deductive typing for several reasons.
1. The sequent compiler was inefficient.
2. Since SEQUEL was heavily configured to support Common Lisp, the language was not consistent with lambda calculus (since Common Lisp does not support lambda calculus features like currying and partial applications).
3. SEQUEL inherited case-insensitivity from Common Lisp and the use of the empty list NIL to mean false.
4. SEQUEL lacked a formal semantics.
5. SEQUEL also lacked a proof of type correctness.
Over ten years and two countries all these problems were eliminated.
1.6 The Forerunners of Shen: Qi
The first modest change was to introduce case-sensitivity and the use of proper booleans in place of T and NIL. Large parts of the implementation were rewritten, reducing the source code by several thousand lines. In 1998 overloading in SEQUEL was dropped and the entire list of function types was placed on a hash table. This was in a sense a move away from Lisp and towards a cleaner model and a smaller language. The plus was the elimination of 10,000 lines of code.
From 1998 to 1999 a lot of work was expended in providing the new language with a formal type theory. After a false start, the current type theory was evolved in 1999 and the semantics was developed between 1999 and 2000. As a Taoist, feeling the need for a name for the new implementation. I called it Qi – the name for the life-force in Taoism. The implementation went through two complete rewrites. Qi introduced proper partial applications and currying into the language, making it lambda calculus consistent.
In 2001 a robust version of a compiler-compiler (first devised by my colleague Dr Gyuri Lajos) was built and used to encode some of the complex parsing routines needed in Qi. This was the compiler-compiler Qi-YACC which was later used in Qi.
During that same period, a lot of work was done in testing whether deductive typing was really a practical option. SEQUEL was at the border of human tolerance. Andrew could drink a cup of coffee while watching his program load. The key was the development of a fast Prolog compiler, since Shen compiled deduction rules in sequent calculus into an in-house Prolog. The solution came from two directions; from an improvement in the type checking algorithm and from an improvement in the performance of the logic engine.
In 2003, I developed the experimental algorithm T * and type checking gained a factor of 7 speedup over version 1.4. In late 2003 I put the finishing touches on a correctness proof for the type theory and T and T *. In 2003 Qi won me the Promising Inventor Award from the State University of New York. Carl Shapiro, currently of Google, attended my functional programming class at Stony Brook that same year. He was to exert a significant influence on development of Qi.
In 2005 I constructed a new model for compiling Horn clauses called the Abstract Unification Machine. The AUM compiled Prolog into virtual machine instructions for a functional language. AUM technology gave 100 KLIPS under CLisp and 400 KLIPS under the faster CMU Lisp and quadrupled the speed of the type checker. Qi 6.1, which used the AUM, was released in April 2005, along with the web publication of Functional Programming in Qi.
At 6,500 lines of machine-generated Lisp, Qi 6.1 was three times smaller than the SEQUEL of 1993 and, apart from advances in hardware, nearly 30X faster at verifying programs. It was placed online in April 2005. A factorising version of the Qi compiler was introduced which made Qi competitive with the fastest hand-compiled Lisp code.[1] The revised language, Qi II, corresponding to the text Functional Programming in Qi was released in 2008.
1.7 Shen
The chain of events that precipitated Shen began with an invitation to address the European Conference on Lisp in Milan in 2009. The invitation came about because the invited speaker, Kent Pitman, had been himself taken ill with cancer which meant, understandably, that he had to withdraw. I was invited to take his place.
My address proposed a language like Qi but based on a primitive instruction set that was so small that it could be translated onto almost any platform. Qi had been implemented in Common Lisp, which has over 700 system functions. But of that large number, Qi only used 118 system functions in its implementation.
I estimated that Qi could be implemented in an even smaller instruction set no more than 50 primitive functions, and that they should be so carefully chosen as to be easily implemented and widely portable. This instruction set defined a very simple Lisp, closer in spirit to Lisp 1.5, the original Lisp from which the gargantuan Common Lisp descended. This micro-Lisp was later to be called Kl.
In 2010 an appeal was launched to fund this research and happily it succeeded. I returned to designing and building the new implementation. Eventually what emerged in September 2011 was a clean, portable language under a $free license, implemented initially in 43 primitive Kl instructions running under Common Lisp. The new language was named Shen, the highest form of energy in Taoism and the Chinese for spirit.
Shen introduced quite a number of features not found in Qi; including string and vector handling through pattern matching, the capacity to read streams from non-text files through an 8 bit reader and an advanced and powerful macro system. Designed for portability, Shen was slower than Qi[2] which was optimised for Lisp, but vindicated itself rapidly and within 18 months Shen had been ported to Common Lisp, Scheme, Clojure, Javascript, Java, Python, JVM and Ruby. In 2015, Shen went BSD.
During this process Kl acquired four more primitives. In the hot summer of 2018, during a spell of insomnia, I created a version of Shen YACC that generated statically typable code and this led to a new chapter.
From 2019-2021, I completely revised the code for the entire kernel introducing a new Prolog and YACC compiler. The most notable improvement in the Prolog aspect was a garbage collector which cleaned up the binding vector and allowed Shen Prolog to run for hours in a limited amount of memory and not seconds. The old Prolog compiler which produced fast execution, was also potentially capable of generating exponentially large amounts of code. This was tackled in the old compiler by more or less ad hoc techniques. The new compiler behaved gracefully, took up less space, and exhibited linear code growth.
Another major improvement is that Shen assumed all the load of implementing partial applications, shifting this away from Kl. Implementing partial applications in the old Shen had been something of a nightmare for developers – including myself – and hardly anybody had done it properly. The new implementation nailed it precisely.
Another wart that was burnt away was the conflation between a symbol and the function it denoted. This was a an inheritance from the DNA of Common Lisp. In Lisp a symbol can double as both a symbol and a function depending on context. This was rather ugly, so the new Shen rigorously defended the distinction between a symbol and what that symbol can be used to denote. The symbol reverse is a symbol; but (fn reverse) denotes a function.
The rebuild that did all this was absolute; every line of the old kernel was rewritten and several hundred lines of kernel disappeared. This rebuild had been a private pursuit, but in 2021 I released it under BSD. The new kernel marked the beginning of the S series of kernels on which modern implementations of Shen run. .