I hope those of you computer scientists and language designers think about this and avoid list comprehension in your language.
plain text version follows.
-------------------------------------------------- What's List Comprehension and Why is it Harmful?
Xah Lee, 2010-10-14
This page explains what is List Comprehension (LC), with examples from several languages, with my opinion on why the jargon and concept of “list comprehension” are unnecessary, redundant, and harmful to functional programing. What is List Comprehension?
Here's a example of LC in python:
S = [2*n for n in range(0,9) if ( (n % 2) == 0)] print S # prints [0, 4, 8, 12, 16]
It generates a list from 0 to 9, then remove the odd numbers by 「( (n % 2) == 0), then multiply each element by 2 in 「2*n」, then returns a list.
Python's LC syntax has this form:
[myExpression for myVar in myList if myPredicateExpression]
In summary, it is a special syntax for generating a list, and allows programers to also filter and apply a function to the list, but all done using expressions.
In functional notation, list comprehension is doing this:
map( f, filter(list, predicate))
Other languages's LC are similiar. Here are some examples from Wikipedia. In the following, the filter used is 「x^2 > 3」, and the 「2*x」 is applied to the result. javascript
Number.prototype.__iterator__ = function() { for (let i = 0; i < this; i++) yield i} var s = [2*i for (i in 100) if (i*i > 3)]
Haskell
s = [ 2*x | x <- [0..], x^2 > 3 ]
F#
seq { for x in 0..100 do if x*x > 3 then yield 2*x } ;;
(loop for x from 1 to 20 when (> (* x x) 3) collect (* 2 x))
Erlang
S = [2*X || X <- lists:seq(0,100), X*X > 3].
Scala
val s = for (x <- Stream.from(0); if x*x > 3) yield 2*x
Here's how Wikipedia explains List comprehension. Quote:
A list comprehension is a syntactic construct available in some programming languages for creating a list based on existing lists.
The following features makes up LC:
* (1) A flat list generator, with the ability to do filtering and applying a function. * (2) Usually, a special syntax in the language. * (3) The syntax uses expressions, as opposed to using functions as parameters.
Why is List Comprehension Harmful?
• List Comprehension is a bad jargon; thus harmful to functional programing or programing in general. It hampers communication, and encourage mis-understanding.
• List Comprehension is a redundant concept in programing. What List Comprehension does is simply 「map(func, filter(list, predicate))」.
• The special syntax of List Comprehension as it exists in many langs, are not necessary. If a special purpose function for it is preferred, then it can simply be a plain function, e.g 「LC(function, list, predicate)」. Map + Filter = List Comprehension Semantics
The LC's semantics is not necessary. A better way and more in sync with functional lang spirit, is simply to combine plain functions:
map( f, filter(list, predicate))
Here's the python syntax:
map(lambda x: 2*x , filter( lambda x:x%2==0, range(9) ) ) # result is [0, 4, 8, 12, 16]
In Mathematica, this can be written as:
Map[ #*2 &, Select[Range@9, EvenQ]]
In Mathematica, a arithemetic operation can be applied to list directely without using Map explicitly, so the above can be written as:
Select[Range@9, EvenQ] * 2
in my coding style, i usually write it in the following syntactically equivalent forms:
(#*2 &) @ (Select[#, EvenQ]&) @ Range @ 9
or
9 // Range // (Select[#, EvenQ]&) // (#*2 &)
In the above, we sequence functions together, as in unix pipe. We start with 20, then apply “Range” to it to get a list from 1 to 9, then apply a function that filters out all numbers not greater than 3, then we apply a function to multiply each number by 2. The “//” sign is a postfix notation, analogous to bash's “|”, and “@” is a prefix notation that's the reverse of “|”.
(See: Short Intro of Mathematica For Lisp Programers.) List Comprehension Function Without Special Syntax
Suppose we want some “list comprehension” feature in a functional lang. Normally, by default this can be done by
map(func, filter(inputList, Predicate))
but perhaps this usage is so frequent that we want to create a new function for it, to make it more convenient, and perhaps easier to make the compiler to optimize more. e.g.
LC(func, inputList, Predicate)
this is about whether a lang should create a new convenient function that otherwise require 3 function combinations. Common Lisp vs Scheme Lisp are the typical example of extreme opposites.
Suppose we decided that generating list by a filter is so frequently used that it worth it to create a new func for it.
LC(func, inputList, Predicate)
Now, in functional langs, in general a design principle is that you want to reduce the number of function unless you really need it. Because, any combination of list related functions could potentially be a new function in your lang. So, if we really think LC is useful, we might want to generalize it. e.g. in
LC(func, inputList, Predicate)
is it worthwhile say to add a 4th param, that says return just the first n? (here we presume the lang doesn't support list of infinite elements) e.g.
LC(func, inputList, Predicate, n)
what about partition the list to m sublists?
LC(func, inputList, Predicate, n, m)
what about actually more generalized partition, by m sublist then by m1 sublist then by m2 sublist?
LC(func, inputList, Predicate, n, list(m,m1,m2,...))
what about sorting? maybe that's always used together when you need a list?
LC(func, inputList, Predicate, n, list(m,m1,m2,...), sortPredcate)
what if actually frequently we want LC to map parallel to branches? e.g.
LC(func, inputList, Predicate, n, list(m,m1,m2,...), sortPredcate, mapBranch:True)
what if ...
you see, each of these or combination of these can be done by default in the lang by sequencing one or more functions (i.e. composition). But when we create a new function, we really should think a lot about its justification, because otherwise the lang becomes a bag of functions that are non-essential, confusing.
So the question is, is generating a list really that much needed? And, if so, why should we create a special syntax such as 「[ expr for var in list if P]」 than 「 LC(func, list, P)」?
Also note, that LC is not capable of generating arbitrary nested list. For a example of a much powerful list generator that can generate arbitrary nested tree, see:
For those who find imperative lang good, then perhaps “list comprehension” is good, because it adds another idiosyncratic syntax to the lang, but such is with the tradition of imperative langs. The ad hoc syntax aids in reading code by various syntactical forms and hint words such as “[... for ... in ...]”. Bad Jargon and How To Judge a Jargon
Someone wrote:
The term “list comprehension” is intuitive, it's based on math set notation.
The jargon “list comprehension” is opaque. It hampers communication and increases misunderstanding. A better name is simply “list generator”.
What's your basis in saying that “List Comprehension” is intuitive? Any statics, survey, research, references?
To put this in context, are you saying that lambda, is also intuitive? “let” is intuitive? “for” is intuitive? “when” is intuitive? I mean, give your evaluation of some common computer language terminologies, and tell us which you think are good and which are bad, so we have some context to judge your claim.
For example, let us know, in your view, how good are terms: currying, lisp1 lisp2, tail recursion, closure, subroutine, command, object. Or, perhaps expound on the comparative merits and meaning on the terms module vs package vs add-on vs library. I would like to see your view on this with at least few paragraphs of analysis on each. If you, say, write a essay that's at least 1k words on this topic, then we all can make some judgment of your familiarity and understanding in this area.
Also, “being intuitive” is not the only aspect to consider whether a term is good or bad. For example, emacs's uses the term “frame”. It's quite intuitive, because frame is a common english word, everyone understands. You know, door frame, window frame, picture frame, are all analogous to emacs's “frame” on a computer. However, by some turn of history, in computer software we call such as “window” now, and by happenstance the term “window” also has a technical meaning in emacs, what we call “split window” or “frame” today. So, in emacs, the term “frame” and “window” is confusing, because emacs's “frame” is what we call “window”, while emacs's “window” is what we call a frame. So here, is a example, that even when a term is intuitive, it can still be bad.
As another example, common understanding by the target group the term is to be used is also a important aspect. For example, the term “lambda”, which is a name of greek char, does not convey well what we use it for. The word's meaning by itself
> > those are unicode 「」 > > LEFT CORNER BRACKET x300c > > RIGHT CORNER BRACKET x300d
> Is there need for unicode for such trivial things like {[]}?
short story is no but am a unicode geek :)
long story is... i need a way to quote code... the corner bracket was choosen because it's standard chinese punctuation thus almost all computer today bought in the past 5 years can display it, and because it's different than ascii so it avoids ambiguity and complications. e.g. when the quoted text also contains the char that you use for quoting....
thanks for your info. I'll think about removing them...
the other thing is that those markers are very useful when you think about info processing and automation. e.g. in the future i can trivially write a script list all inline code i have on my site, or to htmlize/css all quoted code. e.g.
in that page, does the char show up correctly? The chars are not in the html text, but is added by CSS due to markup. And the code are all colored red.
PS: the most popular example of studious use of quoting in a tech doc is emacs manual or any manual from FSF. e.g.
On Mon, 18 Oct 2010 02:11:07 -0700, Xah Lee wrote: > • List Comprehension is a redundant concept in programing. What List > Comprehension does is simply 「map(func, filter(list, predicate))」.
You're assuming a single generator. Once you have multiple generators, map + filter + concat starts getting ugly rather quick.
> I hope those of you computer scientists and language designers think > about this and avoid list comprehension in your language.
I agree with Xah, but I don't think he goes far enough. :) I think list comprehensions in most languages are too "first order" and constrain thinking into a little box.
Let's see if I can trudge ahead...
The base list comprehension in CL: (map func (filter predicate list))
This is where a language with specialized list comprehensions usually stops. But we can go on.
A slightly better higher order abstraction of it, corresponding to Xah's LC function: (action func predicate list)
How about we make it easier? (defaction action1 (func1 predicate1)) (defaction action3 (func2 predicate2)) (defaction action4 (func3 predicate3)) (defaction action5 (func4 predicate4))
And now make our action function a little smarter: (action action1 list)
Now, lets make it so we can compose a bunch of actions at once, in order. Each later actions may process the results form a possible application an earlier action in the sequence. The composition of the actions can work to reduce the answer to have less elements that the original list due to the filtering.
(compose-actions list action3 action4 action1 ... )
The above would return two values, the first is the list of result (what you expect from the composition of the actions). Exploded example of how the first list is produced: (... (map func1 (filter predicate1 (map func4 (filter predicate4 (map func3 (filter predicate3 list)))))))
The second value is a list which will be the same size as the original input list. What it does is provide a record of what happened to each element in the list as the composition was being applied. Some careful bookeeping must be done to keep track of the original positions of the elements as they evolve through smaller and smaller result lists.
( ;; actions chosen for element 0 going backwards in time steps (0 element[0] action1 (-1 element[0] action4 (-2 element[0] action3 ...)))
;; actions chosen for element 1 going backwards in time steps (0 element[1] action1 (-1 element[1] action4 (-2 element[1] action3 ...)))
;; actions chosen for element 2 going backwards in time ;; Suppose this got wiped out by action 4 and action1 never saw it. ;; We don't have an entry for time step 0 and just keep carrying this ;; record along with us as is as we continue processing the composition. (-1 element[2] (action4 'failed) (-2 element[2] action3 ...)))
;; actions chosen for element 1 going backwards in time steps ;; Notice it slid from index 3 to 2 because element 2 got removed in the ;; previous step, but we still recorded where its original place was. (0 element[2] action1 (-1 element[3] action4 (-2 element[3] action3 ...)))
...)
Using the above, we can likely analyze why certain decisions to act might have failed on a per-element basis. Suppose like this:
(why-failed expert-rules (compose-actions list action3 action4 action1 ...))
Since this is all higher order codes, the implementation of any of these functions can choose a parallel or multi machine implementation. This isn't really available to you (to my 5 minutes of thinking about it) in languages with list comprehension since the syntax and implementation is hidden from you.
> > The jargon "list comprehension" is opaque. It hampers communication > > and increases misunderstanding. A better name is simply "list > > generator".
> +1
thanks a lot wax man. I added your example with credit.
over the past few years, you've posted lots snippets of ruby code, and i find them quite interesting, and those snippets gave me quite a positive view of ruby.
btw, is unicode still a problem with ruby? i really need robust unicode for my use.
also, if you can tell me, what's the main discussion group for ruby? is it just comp.lang.ruby or is it web based forum?
It seems to me your action is simply composition, which is available in many functional langs.
you added a sorta trace functionality to composition. In Mathematica, any lisp function (say named xyz), usually have a xyzList version, which returns all the steps as a list. This is somewhat similar to your trace. Some functional lang has this too i think. But in general, i don't see the utility of your trace action kinda thing. It seems like a domain specific lang's need.
> > I hope those of you computer scientists and language designers think > > about this and avoid list comprehension in your language.
> I agree with Xah, but I don't think he goes far enough. :) I think > list comprehensions in most languages are too "first order" and > constrain thinking into a little box.
> Let's see if I can trudge ahead...
> The base list comprehension in CL: > (map func (filter predicate list))
> This is where a language with specialized list comprehensions usually > stops. But we can go on.
> A slightly better higher order abstraction of it, corresponding to > Xah's LC function: > (action func predicate list)
> How about we make it easier? > (defaction action1 (func1 predicate1)) > (defaction action3 (func2 predicate2)) > (defaction action4 (func3 predicate3)) > (defaction action5 (func4 predicate4))
> And now make our action function a little smarter: > (action action1 list)
> Now, lets make it so we can compose a bunch of actions at once, in > order. Each later actions may process the results form a possible > application an earlier action in the sequence. The composition of the > actions can work to reduce the answer to have less elements that the > original list due to the filtering.
> The above would return two values, the first is the list of > result (what you expect from the composition of the actions). Exploded > example of how the first list is produced: > (... > (map func1 > (filter predicate1 > (map func4 > (filter predicate4 > (map func3 > (filter predicate3 list)))))))
> The second value is a list which will be the same size as the original > input list. What it does is provide a record of what happened to each > element in the list as the composition was being applied. Some careful > bookeeping must be done to keep track of the original positions of > the elements as they evolve through smaller and smaller result lists.
> ( > ;; actions chosen for element 0 going backwards in time steps > (0 element[0] action1 > (-1 element[0] action4 > (-2 element[0] action3 > ...)))
> ;; actions chosen for element 1 going backwards in time steps > (0 element[1] action1 > (-1 element[1] action4 > (-2 element[1] action3 > ...)))
> ;; actions chosen for element 2 going backwards in time > ;; Suppose this got wiped out by action 4 and action1 never saw it. > ;; We don't have an entry for time step 0 and just keep carrying this > ;; record along with us as is as we continue processing the composition. > (-1 element[2] (action4 'failed) > (-2 element[2] action3 > ...)))
> ;; actions chosen for element 1 going backwards in time steps > ;; Notice it slid from index 3 to 2 because element 2 got removed in the > ;; previous step, but we still recorded where its original place was. > (0 element[2] action1 > (-1 element[3] action4 > (-2 element[3] action3 > ...)))
> ...)
> Using the above, we can likely analyze why certain decisions to act > might have failed on a per-element basis. Suppose like this:
> (why-failed expert-rules (compose-actions list action3 action4 action1 ...))
> Since this is all higher order codes, the implementation of any of > these functions can choose a parallel or multi machine implementation. > This isn't really available to you (to my 5 minutes of thinking > about it) in languages with list comprehension since the syntax and > implementation is hidden from you.
> > > The jargon "list comprehension" is opaque. It hampers communication > > > and increases misunderstanding. A better name is simply "list > > > generator".
> > +1
> thanks a lot wax man. I added your example with credit.
> over the past few years, you've posted lots snippets of ruby code, and > i find them quite interesting, and those snippets gave me quite a > positive view of ruby.
> btw, is unicode still a problem with ruby? i really need robust > unicode for my use.
Sorry, but I'm clueless about unicode. (It does seem strange that a language from Japan has not always had great unicode support.)
> also, if you can tell me, what's the main discussion group for ruby? > is it just comp.lang.ruby or is it web based forum?
idiomatic python: [2*n for n in range(9) if n%2==0]
custom syntax is always shorter and more convenient in direct style: that's precisely why they made it into custom syntax in the first place.
Of course, that's far better in a language that allows user-defined custom syntax. waxhead only agreed with you because ruby got no custom list comprehension syntax nor a way to define one: if it was a builtin he'd gladly use it everywhere instead of actually making an algorithm actually describing how it works, as he always does...
In comp.lang.lisp Xah Lee <xah...@gmail.com> wrote:
> hi Peter,
> thanks for the thoughts.
> It seems to me your action is simply composition, which is available > in many functional langs.
That's true, but composition and list comprehensions should go hand in hand, but with a specific syntax, it is uglier.
> you added a sorta trace functionality to composition. In Mathematica, > any lisp function (say named xyz), usually have a xyzList version, > which returns all the steps as a list. This is somewhat similar to > your trace. Some functional lang has this too i think. But in general, > i don't see the utility of your trace action kinda thing. It seems > like a domain specific lang's need.
It is very useful for recording mutations of a tree walker and extendable to recording mutations of a graph walker. The important aspect is that if the actions are invertible, you get a piecewise bijection from the original graph to the resultant one.
sometimes i do find some reddit replies interesting and knowledgable (like slashdot), but am afraid 99% of it is uninformed garbage. In this case, i haven't found a interesting argument. Though, it is informative that many reddit readers didn't find my article interesting. A good feed back that i need to edit or expand it, with more examples, etc.
Xah Lee <xah...@gmail.com> wrote: > long story is... i need a way to quote code... the corner bracket was > choosen because it's standard chinese punctuation thus almost all > computer today bought in the past 5 years can display it, and because > it's different than ascii so it avoids ambiguity and complications. > e.g. when the quoted text also contains the char that you use for > quoting....
Most people just use typewriter-fixed-width fonts for that...
> S =3D [2*n for n in range(0,9) if ( (n % 2) =3D=3D 0)] > print S > # prints [0, 4, 8, 12, 16]
> It generates a list from 0 to 9, then remove the odd numbers by =E3=80=8C( = > (n > % 2) =3D=3D 0), then multiply each element by 2 in =E3=80=8C2*n=E3=80=8D, t= > hen returns a > list.
> > S =3D [2*n for n in range(0,9) if ( (n % 2) =3D=3D 0)] > > print S > > # prints [0, 4, 8, 12, 16]
> > It generates a list from 0 to 9, then remove the odd numbers by =E3=80=8C( = > > (n > > % 2) =3D=3D 0), then multiply each element by 2 in =E3=80=8C2*n=E3=80=8D, t= > > hen returns a > > list.
Xah Lee wrote: > On May 22, 2:30 am, "WJ" <w_a_x_...@yahoo.com> wrote: > > Xah Lee wrote:
> > > Here's a example of LC in python:
> > > S =3D [2*n for n in range(0,9) if ( (n % 2) =3D=3D 0)] > > > print S > > > # prints [0, 4, 8, 12, 16]
> > > It generates a list from 0 to 9, then remove the odd numbers by > > > =E3=80=8C( = (n > > > % 2) =3D=3D 0), then multiply each element by 2 in > > > =E3=80=8C2*n=E3=80=8D, t= hen returns a > > > list.