How Lisp's Nested Notation Limits The Language's Utility
Xah Lee, 2007-05-03
There is a common complaint among programmers that Lisp's notation of nested parentheses is unnatural or difficult to read. Long-time Lisp programmers often counter that it is a matter of conditioning, or blame the use of “inferior” text editors that are not designed to display nested notation. In the following, I describe how Lisp's notation is actually a problem on several levels.
(1) Some 99% of programmers are not used to the nested parenthesis syntax. This is a practical problem. On this aspect alone, Lisp's syntax can be considered a problem.
(2) Arguably, the pure nested syntax is not natural for humans to read. Long-time Lispers may disagree on this point.
(3) Most importantly, a pure nested syntax discourages frequent or advanced use of function sequencing or composition. This aspect is the most devastating.
The first issue, that most programmers are not comfortable with nested notation, is well known. It is not a technical issue. Whether it is considered a problem of the Lisp language is a matter of philosophical disposition.
The second issue, that nested parentheses are not natural for humans to read, may be debatable. I do think that deep nesting is a problem for the programmer. Here's an example of 2 blocks of code that are syntactically equivalent in the Mathematica language:
The latter uses a fully nested form (called FullForm in Mathematica). This form is isomorphic to Lisp's nested parenthesis syntax, token for token (i.e. Lisp's “(f a b)” is Mathematica's “f[a,b]”). As you can see, this form, by the sheer number of nested brackets, is in practice problematic to read and type. In Mathematica, nobody really programs using this syntax. (The FullForm syntax is there for the same language design principle shared with Lisp, “consistency and simplicity”, or the commonly touted Lisp advantage of “data is program; program is data”.)
The third issue, that nested syntax seriously discourages frequent or advanced use of inline function sequencing on the fly, is the most important, and I'll give further explanation below.
One practical way to see this is by considering unix's shell syntax. You all know how convenient and powerful unix pipes are. Here are some practical examples: “ls -al | grep xyz”, or “cat a b c | grep xyz | sort | uniq”.
Now suppose we get rid of unix's pipe notation and replace it with a pure functional notation, e.g. (uniq (sort (grep xyz (cat a b c)))), or enrich it with a composition function and a pure function construct (λ), so that this example can be written as: ((compose (lambda (x) (grep xyz x)) sort uniq) (cat a b c)).
You see how this change, although syntactically equivalent to the pipe “|” (or semantically equivalent in the example using function composition), will, due to the cumbersome nested syntax, force a change in the nature of the language through the code programmers produce. Namely, the frequency of inline sequencing of functions on the fly will probably be reduced; instead, there will be more code that defines functions with temp variables and applies them just once, as in traditional languages.
A language's syntax or notation system has a major impact on the kind of code, style, or thinking pattern of the language's users. This is a well-known fact for those acquainted with the history of math notations.
The sequential notation “f@g@h@x”, or “x//h//g//f”, or unixy “x|h|g|f”, is far more convenient and easier to decipher than “(f (g (h x)))” or “((compose h g f) x)”. In actual code, any of f, g, h might be a complex pure function (aka lambda construct, full of parentheses itself).
Lisp, by sticking with an almost uniform nested parenthesis notation, immediately reduces the pattern of sequencing functions, simply because the syntax does not readily lend itself to it as unix's “x|h|g|f” does. Programmers who are aware of the coding pattern of sequencing functions now either need to think in terms of a separate “composition” construct, or are subjected to the much more problematic typing and deciphering of nested parentheses.
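The three notations compared above can be put side by side in a minimal Python sketch. The names `compose`, `Pipe`, `h`, `g`, and `f` are illustrative assumptions, not anything from the post; `compose` applies its arguments left to right, which is the order the post's “((compose h g f) x)” implies.

```python
from functools import reduce


def compose(*funcs):
    """Apply funcs left to right: compose(h, g, f)(x) == f(g(h(x)))."""
    return lambda x: reduce(lambda acc, fn: fn(acc), funcs, x)


class Pipe:
    """Hypothetical wrapper that borrows `|` for left-to-right sequencing."""

    def __init__(self, value):
        self.value = value

    def __or__(self, fn):
        # Feed the wrapped value to fn and wrap the result again.
        return Pipe(fn(self.value))


# Three small stand-in functions (illustrative only).
def h(x):
    return x + 1


def g(x):
    return x * 2


def f(x):
    return x - 3


nested = f(g(h(5)))                   # like (f (g (h x)))
composed = compose(h, g, f)(5)        # like ((compose h g f) x)
piped = (Pipe(5) | h | g | f).value   # like x | h | g | f
```

All three forms compute the same value; they differ only in how the reading order relates to the evaluation order.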
(Note: Lisp's sexp is actually not that pure. It has ad hoc syntax equivalents such as the “quote” construct “ '(a b c) ”, and also the “`”, “#”, “,@” constructs, precisely for the purpose of reducing parentheses and increasing readability. Scheme's coming standard, R6RS, even proposes the introduction of [] and {} and a few other syntax sugars to break the uniformity of nested parentheses for legibility. Mathematica's FullForm is actually as pure a nested notation as can be.)
-------
The above is part of a 3-part exposition: “The Concepts and Confusions of Prefix, Infix, Postfix and Fully Functional Notations”, “Prefix, Infix, Postfix notations in Mathematica”, “How Lisp's Nested Notation Limits The Language's Utility”, archived at: http://xahlee.org/UnixResource_dir/writ/notations.html
On May 4, 5:11 pm, Xah Lee <x...@xahlee.org> wrote:
> How Lisp's Nested Notation Limits The Language's Utility
> Xah Lee, 2007-05-03
> There is a common complaint by programmers about Lisp's notation, of nested parentheses, being unnatural or difficult to read.
That is false. Among all the complaints raised by the entire population of complaining programmers, this one does not come up frequently.
> (1) Some 99% of programmers are not used to the nested parenthesis syntax. This is a practical problem.
Since 99% of programmers don't use Lisp, it's not a practical problem.
> (2) Arguably, the pure nested syntax is not natural for humans to read. Long-time Lispers may disagree on this point.
Programming language syntax shouldn't be natural for humans to read. Or, rather, this shouldn't be a requirement which creates technical compromises.
> (3) Most importantly, a pure nested syntax discourages frequent or advanced use of function sequencing or composition.
That is an artifact of the way in which function composition is rendered in nested syntax, and not an artifact of that syntax itself.
There exist macros that provide alternate syntax for function composition, such as a left-to-right pipeline.
> This aspect is the most devastating.
Compared to issues like global warming and problems in the Middle East, hardly.
> The first issue, that most programmers are not comfortable with nested notation, is well known.
Many programmers are also not comfortable in social situations. So what?
> shell syntax. You all know how convenient and powerful unix pipes are. Here are some practical examples: “ls -al | grep xyz”, or “cat a b c | grep xyz | sort | uniq”.
> Now suppose we get rid of unix's pipe notation and replace it with a pure functional notation, e.g. (uniq (sort (grep xyz (cat a b c)))),
When you think about this more deeply (i.e. at all) you run into nasty details. These programs pass their output to each other, but they also have a return value (termination status) which isn't passed through the pipeline. And they take arguments, but the arguments of a pipeline element are not derived from the previous pipeline element.
Your call (grep xyz (cat ...)) means to pass the output of cat as the third argument of grep.
In the POSIX shell, this would be coded using guess what, Lisp-like syntax:
grep xyz $(cat ...)
Taking it further:
uniq $(sort $(grep xyz $(cat a b c)))
> or enrich it with a composition function and a pure function construct (λ), so this example can be written as: ((compose (lambda (x) (grep xyz x)) sort uniq) (cat a b c)).
I posted a filter macro in comp.lang.lisp which expresses function chaining in a left-to-right notation. Look for it.
In this notation, you might write:
(filter 3 (expt _ 2) (* 4))
which means, start with 3, raise it to the power of 2 to obtain 9, and then multiply by 4 to get 36.
The underscore indicates the argument position where the output of the previous pipeline element is to be inserted when calling the next pipeline element. The default is to add it as a rightmost argument. The macro has features for dealing with splitting and recombining lists, and handling multiple values.
This is still the same ``pure nested'' syntax; it merely expresses function chaining differently.
What was that about notation limiting language utility?
Yes. The Mathematica language really brings home the fact that non-trivial syntax is good. In particular, it does an excellent job of mimicking conventional mathematical notation. Arguing in favor of Lisp syntax is like advocating the use of cave painting...
Also, note that Mathematica provides strictly more in the way of macros.
Kaz Kylheku wrote:
> On May 4, 5:11 pm, Xah Lee <x...@xahlee.org> wrote:
>> (1) Some 99% of programmers are not used to the nested parenthesis syntax. This is a practical problem.
> Since 99% of programmers don't use Lisp, it's not a practical problem.
If you're being pedantic, you may mean "it is an uncommon practical problem". However, the problem extends beyond Lisp.
Recent discussions have covered the use of pattern matching. In SML, OCaml, Haskell (I believe) and F# you must write pattern matches over the expr type in prefix notation. To borrow from Alan's example, the Lisp code:
    (destructuring-bind (op1 (op2 n x) y) form
      `(* ,n (* ,(simplify x) ,(simplify y)))))
    ((cons (eql *) *)
     (destructuring-bind (op left right) form
       (list op (simplify left) (simplify right))))
    ((cons (eql +) (cons (eql 0) (cons * null)))
could be written:
| n*x*y -> n*(x*y)
but in ML this must be written as a pattern match over a sum type where the type constructors must use prefix notation:
| Mul(Mul(n, x), y) -> Mul(n, Mul(x, y))
While this is clearly much better than the Lisp, it would be preferable to use the mathematical syntax in this case. You can address this in OCaml using macros and there is a chance that F# will support overloaded operators in patterns in the future.
Mathematica lets you do:
n x y -> n (x y)
but I don't know of any other languages that do, without forcing you to reinvent the wheel via macros.
Xah Lee <x...@xahlee.org> writes:
> How Lisp's Nested Notation Limits The Language's Utility
You know, I am all about keeping one's eyes open so as to see the limitations of the tools one chooses or is forced to use. If more people were open to the flaws in their OS, religion, editor, and a myriad of other things the world would surely be a better and more peaceful place. However, I have to say that all I ever see from you, Mr. Lee, is complaints and how you want emacs and lisp to change. For the love of whatever god you choose to pray to, find a frigging tool that makes you happy. If it is emacs that makes you happy, then change the things you don't like and be happy but I fail to see how your whining diatribes serve any useful purpose. If you don't like lisp then by all means use something else.
> If it is emacs that makes you happy, then change the things you don't like and be happy but I fail to see how your whining diatribes serve any useful purpose. If you don't like lisp then by all means use something else.
Xahlee likes to complain. That's the main factor here.
> (1) Some 99% of programmers are not used to the nested parenthesis syntax. This is a practical problem. On this aspect alone, Lisp's syntax can be considered a problem.
The first time I saw graph-theory notation, I thought it was gibberish. I still think most mathematical notation is gibberish, but I deal with it --- I don't turn in proofs written in prose.
> (2) Arguably, the pure nested syntax is not natural for humans to read. Long-time Lispers may disagree on this point.
And mathematical notation is not natural for people to read, at least not anybody who grew up learning a natural language. But people deal with it. They're remarkably adaptable this way.
Speaking of gibberish. I can't believe anybody would hold up Mathematica as an example of good programming. Next you'll be telling me that 99% of Matlab code isn't really crap!
> The third issue, that nested syntax seriously discourages frequent or advanced use of inline function sequencing on the fly, is the most important, and I'll give further explanation below.
I consider this to be a Good Thing (TM). Unix shell commands read like line noise.
Xah Lee <x...@xahlee.org> writes:
> How Lisp's Nested Notation Limits The Language's Utility
> Xah Lee, 2007-05-03
> There is a common complaint among programmers that Lisp's notation of nested parentheses is unnatural or difficult to read. Long-time Lisp programmers often counter that it is a matter of conditioning, or blame the use of “inferior” text editors that are not designed to display nested notation. In the following, I describe how Lisp's notation is actually a problem on several levels.
As a practical matter, most LISPers don't even see the parens, but rather interpret the code according to the indentation.
> (1) Some 99% of programmers are not used to the nested parenthesis syntax. This is a practical problem. On this aspect alone, Lisp's syntax can be considered a problem.
It's certainly different. Whether it is a problem or not depends on personal or subjective criteria.
> (2) Arguably, the pure nested syntax is not natural for humans to read. Long-time Lispers may disagree on this point.
Harder to read for some things, easier to read for others.
E.G.
1 + 2 + 3 + 4 + 6 - 3 + 32
or
(- (+ 1 2 3 4 6 32) 3)
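The same comparison can be checked in Python, where `sum` stands in for the variadic `+`. The variable names are illustrative only.

```python
# Infix form, read left to right with the subtraction buried in the middle.
infix = 1 + 2 + 3 + 4 + 6 - 3 + 32

# Prefix-style form, like (- (+ 1 2 3 4 6 32) 3): one sum, one subtraction.
prefix_style = sum([1, 2, 3, 4, 6, 32]) - 3
```

Both expressions evaluate to 45; the prefix form names each operator once regardless of how many operands it takes.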
> (3) Most importantly, a pure nested syntax discourages frequent or advanced use of function sequencing or composition. This aspect is the most devastating.
This is one reason LISP has macros. Don't like the syntax? Make your own. How many languages make this as easy as LISP does?
> The third issue, that nested syntax seriously discourages frequent or advanced use of inline function sequencing on the fly, is the most important, and I'll give further explanation below.
> One practical way to see this is by considering unix's shell syntax. You all know how convenient and powerful unix pipes are. Here are some practical examples: “ls -al | grep xyz”, or “cat a b c | grep xyz | sort | uniq”.
> Now suppose we get rid of unix's pipe notation and replace it with a pure functional notation, e.g. (uniq (sort (grep xyz (cat a b c)))), or enrich it with a composition function and a pure function construct (λ), so that this example can be written as: ((compose (lambda (x) (grep xyz x)) sort uniq) (cat a b c)).
(pipe (cat a b c) (grep xyz) (sort) (uniq))
'Nuff said.
UNIX pipe `|' is not just a throw-away syntactic separator, anyone with LISP experience would see it for what it is: a function that operates on functions.
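The reading of `|` as a function that operates on functions can be sketched in Python along the lines of the `(pipe (cat a b c) (grep xyz) (sort) (uniq))` form. Everything here is an assumption made for illustration: `pipe` is a plain higher-order function, `cat` takes text contents instead of file names, `grep` does substring matching instead of regular expressions, and the stages work on lists of lines.

```python
def pipe(source, *stages):
    """Feed source through each stage in order, like `a | b | c`."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


def cat(*texts):
    """Concatenate the lines of several text blobs (stand-in for files)."""
    return [line for text in texts for line in text.splitlines()]


def grep(pattern):
    """Return a stage keeping lines that contain pattern as a substring."""
    return lambda lines: [line for line in lines if pattern in line]


def sort():
    """Return a stage that sorts its input lines."""
    return lambda lines: sorted(lines)


def uniq():
    """Return a stage that drops adjacent duplicate lines."""
    def stage(lines):
        out = []
        for line in lines:
            if not out or out[-1] != line:
                out.append(line)
        return out
    return stage


a = "xyz 2\nabc"
b = "xyz 1\nxyz 2"
c = "def\nxyz 1"
result = pipe(cat(a, b, c), grep("xyz"), sort(), uniq())
```

With this sample input, `result` is `["xyz 1", "xyz 2"]`. Each of `grep("xyz")`, `sort()`, and `uniq()` returns a function, and `pipe` is the one place where those functions are chained.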
> Lisp, by sticking with almost uniform nested parenthesis notation, it immediately reduces the pattern of sequencing functions, simply because the syntax does not readily...<snip>
I would contend that LISP did not do this. You did.
<snip>
> (Note: Lisp's sexp is actually not that pure. It has ad hoc syntax equivalents such as the “quote” construct “ '(a b c) ”, and also the “`”, “#”, “,@” constructs, precisely for the purpose of reducing parentheses and increasing readability. Scheme's coming standard, R6RS, even proposes the introduction of [] and {} and a few other syntax sugars to break the uniformity of nested parentheses for legibility. Mathematica's FullForm is actually as pure a nested notation as can be.)
Some functions are used so often, they have shorthand equivalents. This is a feature in many languages. But if for some reason one wanted the sexp to be "pure," nothing's stopping him from using the fully-parenthesized versions.
> -------
> The above is part of a 3-part exposition: “The Concepts and Confusions of Prefix, Infix, Postfix and Fully Functional Notations”, “Prefix, Infix, Postfix notations in Mathematica”, “How Lisp's Nested Notation Limits The Language's Utility”, archived at: http://xahlee.org/UnixResource_dir/writ/notations.html