
why we have cons?


Xah

Jan 3, 1998

From a language design point of view, why do we have the pair notion in Scheme?

As I understand it, LISTs are built up using the pairs primitive CONS in a unique nested fashion, like

(list 1 2 3) == (cons 1 (cons 2 (cons 3 nil)))

Why can't we simply have LIST as primitive? It seems to me that basing list on pairs is ad hoc and creates ugliness. For example, the length of (cons (list 1 2) (list 3 4)) is actually 3, not the more intuitive 2. The existence of car, cdr, and variants cadr, caar...etc. is really ugly.
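
Concretely (a quick check in Scheme):

  (cons (list 1 2) (list 3 4))            ; => ((1 2) 3 4)
  (length (cons (list 1 2) (list 3 4)))   ; => 3
  (length (list (list 1 2) (list 3 4)))   ; => 2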

All in all, the pairs notion is redundant. If a user wants pairs, she can simply do (list 1 2), or nested instances thereof. The car and cdr can be supplanted by first, rest, last, list-ref, etc. I can understand if cons in Lisp is historical, but Scheme too? Please illuminate a Scheme newbie. Thanks.

Xah, x...@best.com
http://www.best.com/~xah/Wallpaper_dir/c0_WallPaper.html
"Perl side effects: make one assignment, the next thing you know your house is deleted."

Bryant Brandon

Jan 3, 1998

In article <xah-ya02408000R...@nntp1.ba.best.com>,
x...@best.com (Xah) wrote:

>From a language design point of view, why do we have the pair notion in
>Scheme?
>
>As I understand it, LISTs are built up using the pairs primitive CONS in a
>unique nested fashion, like
>
>(list 1 2 3) == (cons 1 (cons 2 (cons 3 nil)))
>
>Why can't we simply have LIST as primitive?

Try this with LIST

(cons 'a 'b) => (a . b) NOT (cons 'a (cons 'b nil))

>It seems to me that basing list on pairs is ad hoc and creates ugliness.
>For example, the length of (cons (list 1 2) (list 3 4)) is actually 3, not
>the more intuitive 2.

   Makes perfect sense.  So does

  (cons (list 1 2) (cons (list 3 4) nil))

which is the same as

  (list (list 1 2) (list 3 4))

>The existence of car, cdr, and variants cadr, caar...etc. is really ugly.

In a proper LIST, every CONS's CAR points to data and every CDR points
to the next CONS or NIL. No problem.

>All in all, the pairs notion is redundant. If a user wants pairs, she can
>simply do (list 1 2), or nested instances thereof. The car and cdr can be
>supplanted by first, rest, last, list-ref, etc. I can understand if cons in
>Lisp is historical, but Scheme too? Please illuminate a Scheme newbie.
>Thanks.

   All right.  A list is built up of CONSes.  A CONS (short for
"construct") consists of two pointers.  The first is named CAR, the second
is named CDR.  You hold onto a list by its first CONS.  To get the first
piece of data in a list, use the function CAR, which returns whatever is
pointed to by the CAR pointer of the CONS you pass to it.  CDR is used
likewise.
   Let's say 'a' points to the list (0 1 2 3 4).  Then (car a) => 0,
(cdr a) => (1 2 3 4), and (cadr a) => (car (cdr a)) => 1, etc.
   The function CONS creates a new construct in memory and sets its CAR
and CDR pointers to the arguments you specify.  If you happen to pass a
list to CONS as the second argument, you basically prepend your new CONS
to the existing list.  That means (CONS -1 a) => (-1 0 1 2 3 4).  However,
the CAR of a construct just ends up pointing at whatever you gave it,
whole.  _That_ means (CONS a (LIST 5)) => ((0 1 2 3 4) 5).  Now you know
why (LENGTH (CONS (LIST 1 2) (LIST 3 4))) isn't 2, but 3.  If you want
each sublist to be put into the CAR pointers of the list, use LIST.  LIST
creates a series of conses and sets their CAR pointers to point to each
argument you have provided: (LIST (LIST 1 2) (LIST 3 4)) => ((1 2) (3 4)).
   All in all, this system is very elegant.  Much more than the equivalent
in other languages.
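
   Here is a tiny sketch of how a LENGTH-style traversal sees it; MY-LENGTH
is just a made-up name, and the code is plain Scheme:

  (define (my-length lst)              ; count conses along the cdr chain
    (if (null? lst)
        0
        (+ 1 (my-length (cdr lst)))))

  (my-length (cons (list 1 2) (list 3 4)))   ; => 3, one per cons in the chain
  (my-length (list (list 1 2) (list 3 4)))   ; => 2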

To those who know more about this than I do, please fix any major
screwups in the above. <g>

> Xah, x...@best.com
> http://www.best.com/~xah/Wallpaper_dir/c0_WallPaper.html
> "Perl side effects: make one assignment, the next thing you know your
> house is deleted."

B.B. --I am not a goat!

Erik Naggum

Jan 3, 1998

* x...@best.com (Xah)

| From a language design point of view, why do we have the pair notion in
| Scheme?

well, this list and cons business is an instance of what is commonly
known as an abstract data type (ADT) and one of the more successful ones
at that. you will learn more about ADTs in the basic CS curricula if you
bother to study. successful ADTs have simple interfaces and provably
correct implementations that nobody ever needs to mess with again. in the
case of lists, most languages simply do not have them, and those that do,
apart from the Lisp family, get them wrong. there are two reasons for
this: (1) short-sighted users who think that "list" is a primitive and
can't be bothered to understand why a simple interface doesn't have
simple implementations, (2) stupid language designers who believe too
strongly in strong typing and thus shoot everybody else in the foot when
they want lists of objects and certainly can't be bothered to invent a
useful concept for their short-sighted programmers to use.

imagine that you want a _list_ ADT that (1) can be read and written as
lists would ordinarily be written, (2) can hold any kind of object
without modifying the objects in the list, (3) can be taken apart into a
head and a tail in constant time, and (4) could be extended with a new
object in constant time. as a general rule it takes a real genius to
implement something so that it looks simple, in contrast to the dumb
people who implement something the simple way, so it becomes complex and
generally useless to everybody else. in our case, John McCarthy is
probably to credit for the historic invention of the only slightly
counter-intuitive implementation of the simple "list" concept (ADT).
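
a rough Scheme sketch of that four-point interface over pairs (the names
here are made up, not anybody's actual library), just to show that every
operation is constant time:

  (define (empty)       '())          ; the unique empty list
  (define (extend x l)  (cons x l))   ; requirement 4: O(1) extend
  (define (head l)      (car l))      ; requirement 3: O(1) head
  (define (tail l)      (cdr l))      ; requirement 3: O(1) tail
  ;; requirement 1 is the reader/printer; requirement 2 falls out of the
  ;; payload pointer being allowed to point to any object whatsoever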

instead of parsing and printing lists in a special syntax that did not
itself produce anything useful, such as C's {a, b, ..., z} syntax, useful
only in initializer lists to the compiler, the decision was that lists
should be _something_ in their own right, and that programs should be
able to deal with them internally as well as read and write them. this
is brilliant.

instead of having only one particular type in a list of objects that knew
they were lists (like linked lists with a pointer in the object itself),
it would be real cool to have a list "wrapper" that had at least one
pointer to another object and it would be even cooler if it could point
to any kind of object so the list could be heterogeneous. let's call this
the "payload pointer". of course, the payload pointer and the resulting
heterogeneity introduces complexity to the reader and writer functions
for lists, but fortunately, this complexity is fully contained in the
implementation of the abstract data type. constructing a list would be
equally painless without the programmer having to know the internals.
this is brilliant.

instead of the obvious "hardware" solution using a fixed-size vector of
pointers and the ability to have a pointer to any element of that vector
count as the whole vector, a linked list approach was made in the list
"wrapper". this makes it possible to add elements at the head of the
list in constant time, instead of having to copy the entire vector into a
bigger vector when it outgrew its limits (which you'd have to keep track
of somewhere, since you already use the hardware ability to point to any
slot in the fixed-sized vector). let's use a "link pointer" and put it
in the list "wrapper", too. now, not only do we have a pointer to the
object we want lists of, we have a pointer to another list "wrapper" in a
linked list, but we said that the payload pointer would be heterogeneous
and could point to anything, so let's let the payload pointer and the
link pointer be the same type. the payload pointer can now point to a
list "wrapper" and the link pointer can point to data. this way, we have
a general object that has two general pointers and can be used for more
than just lists, although lists have especially nice read and write
functions and other accessors. this is brilliant design!
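
a throwaway illustration (just Scheme scribbles, nothing from the original
design):

  (cons 3 4)                      ; a bare pair, printed (3 . 4) -- no list at all
  (cons (cons 1 2) (cons 3 4))    ; a tiny binary tree built from the same wrapper
  '((a . 1) (b . 2))              ; a list whose payloads are themselves pairs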

finally, let's recognize the fact that there is only one empty list, but
instead of having an anchor at the end of every chain, let's let the
empty list be represented by a special list "wrapper" that pointed to
itself with both the payload and link pointer. again, this adds some
complexity to the internals, but we find that this complexity actually
reduces overall system complexity a _lot_. (of course, Scheme blew this
design issue big time and forces you to treat the empty list as an anchor
in _all_ cases, not just where it acts as an anchor.)

you see, this list "wrapper" is a profoundly general mechanism.

now, for historical reasons, the list "wrapper" object is called a "cons
cell", and the payload pointer is known as the car of the cons cell, the
link pointer the cdr of the cons cell. this comes from the historic fact
that lists were implemented early on in machine registers that were twice
as large as pointers, quite contrary to modern machines. the left half
of this register was the payload pointer, and the right half was the link
pointer. now, this kind of pointer magic had a special meaning to the
hardware on which it was implemented. unlike today's braindamaged CPU's,
pointers into hardware pointer vectors had a _size_ attached to them, so
you could tell when you had reached the end of the list. this was used
to implement the push-down list (PDL), better known as a "stack". when
you pushed an object on a PDL, the PDL pointer itself would be one closer
to the end of the list, so the size would be decremented, and its payload
pointer would be incremented. (stacks that grow downward are a newer
invention.) in effect, the kind of register that was later reused for
the list "wrapper" had an _address_ part and a _decrement_ part.

on this ancient hardware, it was useful to extract the left half and the
right half into other machine registers. the Contents of the Address
Register was obtained with the "car" instruction. likewise, the Contents
of the Decrement Register was obtained with the "cdr" instruction. since
we were implementing an abstract data type that users would see as lists,
it didn't much matter what they were named, but we can probably be glad
that it was something that simple (they can be combined like "cddadr",
which can be appreciated only when dealing with a complexity that most
newbies take years to discover), not something like HLRZ and HRRZ, as
which they were actually implemented on another CPU (much more beautiful
than the IBM, but direct followups on that topic to alt.sys.pdp10 and/or
alt.folklore.computers :).

however they were named and initially implemented, it didn't much matter
that it could be optimally implemented on one computer or had to be
represented as a pair of full machine-word pointers on other computers.
the concept of a cons as a pair of pointers prevailed and the "car" and
"cdr" instructions became general accessors into the "cons". of course,
the Scheme folks threw the baby out with the bathwater in their incessant
desire for cleanliness and forgot the history, so they now call it "pair?"
although they have yet to understand the usefulness of operators like
`first', `second', ..., `tenth' for individual elements and `rest' for
the tail and stick to `car' and `cdr' always.

in Common Lisp, the "list" abstraction is taken one step further, and we
have the `list' and `list*' operators to construct lists of elements or
elements and another list (tail), respectively. `first' and `rest' refer
to the head and the tail of the list abstraction. `cons' is still around
because the cons cell is so much more general than just lists. `car' and
`cdr' are generally used to access the pointers in the cons cell as such,
not as implementation parts of the list abstraction. this is useful at a
low level and when dealing with trees or complex list structures.

| All in all, the pairs notion is redundant.

I hope you understand and appreciate what I have written above so the
following does not apply to you anymore. you see, I wrote it all because
I _really_ wanted to say that that sentence is the single most ignorant
and shallow-minded line that I have ever seen in any programming language
newsgroup or forum and I hope _never_ to see anybody display such utter
disregard for the brilliance and genius of people who came before them
just because they grew up in an age when "simple interface" is sneered
and laughed at in favor of "simple implementation" so any dumb fsck can
implement it right out of a "for Dummies" book. the "list" concept in
Lisp is easy to use but actually quite hard to implement correctly. it
is in the language because it is hard to implement and because the dumb
fscks who implement it on their own screw up so monumentally. (just look
at what C programmers do when they need lists! *shudder*)

thank you and have a nice new year.

#:Erik
--
The year "98" was new 1900 years ago. | Help fight MULE in GNU Emacs 20!
Be year 2000 compliant, write "1998"! | http://sourcery.naggum.no/emacs/

Ben Goetter

Jan 3, 1998

In article <30928371...@naggum.no>, cle...@naggum.no says...

> you will learn more about ADTs in the basic CS curricula if you
> bother to study. [...]

Erik, will you /please/ use a consistent posting address? You're
slipping through my killfiles.

Thank you. Have a nice new year.

Ray Dillinger

Jan 3, 1998

Xah wrote:
>
> From a language design point of view, why do we have the pair notion
> in Scheme?

Well... in scheme, remember that iteration is expressed by the syntax
of recursion. Thus, when iterating across a list, we usually need to
be able to get the head of the list (the thing we're accessing now)
and the rest of the list (to pass to the next iteration of processing).

Cons cells provide a very handy way to do that, without incurring
any overhead for shifting and re-sizing lists as we make our recursive
calls. Because they are small, cons cells don't usually have to have
mutators applied to them (which makes a number of things handier in
scheme) from the time they are allocated to the time they are destroyed.
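
A minimal sketch of that pattern (plain Scheme):

  (define (sum lst)
    (if (null? lst)
        0                          ; empty list: nothing left to add
        (+ (car lst)               ; the head -- the thing we access now
           (sum (cdr lst)))))      ; the rest -- passed to the next "iteration"

  (sum '(1 2 3 4))   ; => 10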

> As I understand it, LISTs are built up using the pairs primitive CONS
> in a unique nested fashion, like
>
> (list 1 2 3) == (cons 1 (cons 2 (cons 3 nil)))

You understand correctly.

> Why can't we simply have LIST as primitive? It seems to me that
> basing list on pairs is ad hoc and creates ugliness. For example,
> the length of (cons (list 1 2) (list 3 4)) is actually 3, not
> the more intuitive 2. The existence of car, cdr, and variants
> cadr, caar...etc. is really ugly.

'((1 2) 3 4), which is the value your expression evaluates to,
does in fact look like something of length 3 to me.

(list (list 1 2) 3 4), which is how most scheme programmers would
have written the expression, looks like an expression that *ought*
to evaluate to something with a length of three.

Car and cdr are unfortunate names based on ancient hardware --
I'd prefer to call them 'first' and 'rest' -- but the concepts
they stand for are very solid. What they are is the smallest
possible building blocks for arbitrarily complex structured
data.

Because they are uniform in size, they are easier to
garbage-collect, and can be allocated in blocks for greater
efficiency in memory use, and can be re-used once the garbage
collector has reaped them rather than deallocated and reallocated,
which is another efficiency win. Because they are small, they
rarely need to be mutated -- which contributes greatly to the
efficiency of most garbage collection algorithms by keeping to
a minimum the number of pointers that point at things allocated
more recently than the pointers themselves.

Caddr, cddadr, etc, are merely syntactic sugar -- if you don't
like them, you don't need to use them. They are handy for some
things though -- consider caar and cadr as accessors into a list
of dotted pairs for example.
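
For instance, with a list of dotted pairs (a hand-rolled association
list; the names are only for illustration):

  (define table '((a . 1) (b . 2) (c . 3)))
  (caar table)   ; => a        -- key of the first pair
  (cdar table)   ; => 1        -- value of the first pair
  (cadr table)   ; => (b . 2)  -- the second pair itself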

> All in all, the pairs notion is redundant. If user wants pairs,
> she can simply do (list 1 2), or nested instance thereof. The
> car and cdr can be supplanted by first, rest, last, list-ref etc.

The pairs notion is extremely useful. Can you develop a data
structure that has the same "nice" properties in terms of
constant-time accessors for first element and rest, easy
garbage collection, re-usability of cells, etc, that isn't
based on pairs? I can't, I don't think. I'd have to think
about it very hard for an inordinate length of time, anyway.

Scheme does have a "vector" primitive data type, which is a
very flexible implementation of heterogeneous arrays. I use it
when I need random-access to sets of elements, but it is nowhere
near as flexible, general, and clean as the pairs-based notion
of lists.

Bear


---
It takes a long time to understand nothing.
-- Edward Dahlberg

Bill House

Jan 3, 1998

Martin Rodgers

Jan 3, 1998

Ben Goetter wheezed these wise words:

> Erik, will you /please/ use a consistent posting address? You're
> slipping through my killfiles.

[grin] I suspect there's some great reason why Erik posts comments
like <30928371...@naggum.no> that appear to be flamebait, like a
Zen master hitting a monk who is meditating. (See below.) If he did
not post provocative statements, we might be in danger of falling
asleep. A little "stimulating" should encourage us to think.

Erik is, after all, explaining a rare concept in computer languages.
I fully agree with his comments about ADTs, even if I am more
comfortable than he is with strong type checking. The real question is
whether a type system enables or disables you, and to what degree.
So, we can find examples that severely disable the programmer, like
C++, and those that powerfully enable the programmer. I'd place all
forms of Lisp in the latter group of languages.

I'd add late binding to the ADT issue. It's possible to use late
binding via functions to achieve the same ADT effect in a language
like C, but as it's so tedious to do so (see Erik's comments about C's
initialisers), few C programmers bother. There's more code to write,
more effort to visualise what the code does, more "paperwork" to
manage the code. In other words, the C programmer has to do more work
that can be done much easier and more reliably by the computer.

This is, I think, one of the real strengths of the ADT approach.
It can easily be exploited in many languages, but some will make it
easier than others. Erik is saying that this is easier in Common Lisp
than in Scheme, because the empty list can only be represented by '().

To Erik, this is a bug. To others, it may be a feature. This could be
why Erik uses CL and others use Scheme. I'm reminded of the (bitter?)
semicolon issue that divides C and Pascal programmers. Byte gender is
another example. People don't call that last one "Endian" for nothing.

So, in this thread, we could view Erik as a Zen master and Xah as his
student. Perhaps if Erik were to crosspost whilst stating that there's
only One True Programming Language, there could be some cause for
offence. Most of us prefer the diplomatic approach, adding qualifiers
like "IMHO". Perhaps a Zen master has less reason to be humble?



> Thank you. Have a nice new year.

And a Happy New Year to you, too.
--
Please note: my email address is munged; You can never browse enough
"Oh knackers!" - Mark Radcliffe

Xah

Jan 3, 1998

I (Xah) wrote:
>From a language design point of view, why do we have the pair notion in Scheme?
>...

Most people have misinterpreted my question. However, the replies are informative. Thanks.

Perhaps I'd make my original question clearer by using an analogy.

The essential controls of a car are the wheel, the accelerator, and the brake. A stick shift may provide more efficient control, but it is not essential (it is conceptually redundant).

Similarly, a list (abstractly) is simply an ordered sequence of things. A tree is simply a list of lists. A pair notion (like Scheme's cons) is perhaps good in a language for implementation/efficiency reasons, but it is a redundant notion conceptually.

Here's my original question rephrased: Why did the designers of Scheme -- a language that emphasizes conceptual elegance -- include the pairs notion (cons) in the language?

Partial answers can be inferred from the replies: because lists are efficiently implemented using linked pairs (or so), the Scheme designers decided (intentionally or not) to make these pairs explicit as a language primitive (thus we have cons, car, cdr).

By having implementation/efficiency issues showing up in the language, Scheme apparently is less a higher-order (or conceptually elegant) language than it would be. Similarly, the vector notion/primitive in Scheme is also conceptually redundant. A list will do. (again, in the name of efficiency...)

Now having understood some of the background of cons, car, and cdr, my next question is: Is there any good reason to actually have a pairs primitive in a (higher-order) language? The answer is apparently no, as evidenced by a few private emails I got. (ML apparently doesn't have the pairs notion, nor does Mathematica -- a list is simply a list, no fuss attached. Purists don't drive a stick shift (^_^).) Modern compiler/interpreter optimization technology should be advanced enough to generate the linked pairs internally from trees of whatever shape.

All comments and inspirations are welcome. Language flames by email, please. Thanks.

PS: The names cons, car, and cdr are apparently historical (as explained in SICP p. 85).

Xah, x...@best.com
http://www.best.com/~xah/PageTwo_dir/MathPrograms_dir/mathPrograms.html
"New Programing Paradigm: Parallel execution of bugs and C/C++ programers."

Barry Margolin

Jan 3, 1998

In article <34AEA0...@sonic.net>, Ray Dillinger <be...@sonic.net> wrote:
>Car and cdr are unfortunate names based on ancient hardware --
>I'd prefer to call them 'first' and 'rest'

Those names are only appropriate when conses are being used to construct
lists. For a simple pair, there's no inherent reason why the car should be
considered to be "first" (except that it comes first when reading the
printed representation from left to right), and even less reason why the
cdr should be considered "rest".

Perhaps "left" and "right" would be more appropriate, based on the printed
representation. I like to believe that the old names have stuck around
precisely *because* they're meaningless. Conses are completely abstract,
so it's appropriate that the names for the two cells imply no bias.

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Cambridge, MA
Support the anti-spam movement; see <http://www.cauce.org/>
Please don't send technical questions directly to me, post them to newsgroups.

Barry Margolin

Jan 3, 1998

In article <xah-ya02408000R...@nntp1.ba.best.com>,
Xah <x...@best.com> wrote:

>Similarly, a list (abstractly) is simply an ordered sequence of things. A
>tree is simply a list of lists. A pair notion (like Scheme's cons) is
>perhaps good in a language for implementation/efficiency reasons, but it
>is a redundant notion conceptually.

How is a tree "simply a list of lists"? A binary tree is a pair of
subtrees. The cons is a perfect data structure for representing a binary
tree (well, not quite -- you often need to have data at the nodes, not just
the leaves).
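
A quick sketch, purely illustrative:

  (define tree (cons (cons 1 2) (cons 3 4)))   ; a two-level tree of pairs
  (car tree)         ; => (1 . 2)   left subtree
  (cdr tree)         ; => (3 . 4)   right subtree
  (car (car tree))   ; => 1         leftmost leaf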

>Here's my original question rephrased: Why did the designers of Scheme --
>a language that emphasize conceptual elegance -- included the pairs notion
>(cons) in the language?

The philosophy of Scheme is to provide basic building blocks from which
everything else can be derived. Conses are more primitive and conceptually
elegant than lists -- you can construct arbitrary data structures from
them. Sure, you can also do this with lists, but then you have that extra
link to nil that's frequently being wasted.

Bryant Brandon

Jan 3, 1998

In article <68mvuj$ljk$3...@mycroft.westnet.com>, Lawrence Troxler
<l...@westnet.com> wrote:

>In comp.lang.lisp Martin Rodgers
><mcr@this_email_address_intentionally_left_crap_wildcard.demon.co.uk>
>wrote:
>
>: Erik is, after all, explaining a rare concept in computer languages.
>: I fully agree with his comments about ADTs, even if I am more
>                                        ^^^
>
>Please don't use abbreviations without first defining them.

Ditto. OABTW, you DTS. SBL. IMHO, UFUBAR. But, FYI, ICBW.

<translation>
I agree. Oh, and by the way, you did the same. See below. In my
humble opinion, you fouled up beyond all reconciliation. But, for your
information, I could be wrong.
</translation>

P.S. What does TIA mean?

P.P.S. ADT = Abstract Data Type

>TIA
>
>Larry
>--
>-- Larry Troxler -- l...@westnet.com -- Patterson, NY USA --
>

B.B. --I!G

Kent M Pitman

Jan 4, 1998

Sigh. Who is the "we" in the subject line here?
Much though I have strong interest in some of these topics,
I won't participate in discussing them in a cross-posted forum.

Cross-posted conversations are aggravating and I wish not to
contribute to such aggravation. An explanation of my issues
with this practice can be found in
http://world.std.com/~pitman/pfaq/cross-posting.html

I wish that you would stop cross-posting your questions because it
excludes me and others like me from discussion.

Also, I suspect your asking both forums is based on a blurry notion
that the two languages are the same language, brought to you by the
same people, interested in the same issues, etc. As one of the few
people who participates in design meetings for both languages, I can
tell you there is almost no cultural or design overlap between Scheme
and Lisp and any similarity between the two is largely historical in
nature. The forces that drive the design communities and the
processes of review undergone are wholly different. The user
communities are different. The two design communities do not even
agree on what is aesthetic and what is not. The two design
communities do not agree on the same set of priorities (that is,
aesthetics is a dominating criterion in the Scheme community and
explicitly is NOT a dominating criterion in the Common Lisp
community). When you get two communities that are out for entirely
different things and you ask them to talk together as if they were one
without acknowledging the huge gap between them, you just end up
spinning a lot of wheels with no useful outcome in sight. Consensus
is only ever reached within one of these camps, never across them.
They are opposing political parties. It's like asking the Republicans
and the Democrats to have a common debate on what platform to have for
abortion, taxation, or some other hot-button issue that has created
the divide between the two organizations in the first place.

Look around the world. The most heated of wars is fought between the
closest of cousins--the north and south of this or that once-unified
country, protestants and catholics, arabs and jews, ... communities
that to the outsider, not familiar with the subtle history, might
easily be mistaken for one another because in so many ways those
opposing each other are so similar--but superficial similarities must
not be assumed to dictate agreement. They can hide very complex
differences. And the solution is never to say one is right and one is
wrong, and sometimes the solution isn't even as friendly as inviting
everyone over to the same dinner. The solution is the understanding
that the walls are sometimes there for a reason, and that behind each
of our walls we can enjoy the comfort of knowing that we (for any
value of "we") are right and that others (while "obviously wrong") are
entitled to disagree. Tolerance. It's fine for Scheme people to come
to Lisp and as students of The Lisp Way to learn it for what it is in
context. It's fine for Lisp people to come to Scheme and as students
of The Scheme Way to learn it for what it is in context. But it's a
disaster, in my opinion, to just randomly assert that this or that
feature ripped wholesale from Scheme would work if only jammed into
Lisp, or vice versa. Sort of like if I ripped out my heart and handed
it to you and said "use it. it worked for me." It's part of a live,
complex system and it works because the parts of the system have been
designed as a complete whole. You can discuss human physiology or
Klingon physiology, but don't assume that just having studied
physiology in general you're ready to go meddling in anyone's
biosystem.

If you are truly not sure which community you belong to, I suggest
you ask your questions separately of each list and allow the dialogs
to grow separately in each of the communities. You may find the
differences in the way the communities approach the same question
to be instructive.

All of this is just my personal opinion, and is not the official
position of any organization I'm affiliated with.
--Kent

Lawrence Troxler

Jan 4, 1998

: Erik is, after all, explaining a rare concept in computer languages.
: I fully agree with his comments about ADTs, even if I am more
                                        ^^^

Please don't use abbreviations without first defining them.

TIA

Martin Rodgers

Jan 4, 1998

Lawrence Troxler wheezed these wise words:

> Please don't use abbreviations without first defining them.

Didn't Erik define ADT? Abstract Data Type.

> TIA

Now I know this one. ;) Thanks In Advance?

Jrm

Jan 4, 1998

Martin Rodgers wrote in message ...


>I'm reminded of the (bitter?)
>semicolon issue that divides C and Pascal programmers.

I believe that it only divides *Pascal* programmers.


Martin Rodgers

Jan 4, 1998

Jrm wheezed these wise words:

> I believe that it only divides *Pascal* programmers.

I wonder why I've seen C programmers involved in "semicolon" wars. ;)
I think that you need _two_ points of view for such disagreements.

An excellent reason for not crossposting between comp.lang.pascal and
comp.lang.c, or even comp.lang.lisp and comp.lang.scheme.

Ray Dillinger

Jan 4, 1998

Jrm wrote:
>
> Martin Rodgers wrote in message ...
> >I'm reminded of the (bitter?)
> >semicolon issue that divides C and Pascal programmers.
>
> I believe that it only divides *Pascal* programmers.

Perhaps a better phrasing would have been,
"the (bitter?) semicolon issue which is a division between
C programmers and Pascal programmers."

But either way, it has nothing to do with scheme. We could,
perhaps, spend our time better speaking of "the (bitter?)
separate namespace issue which is a division between Scheme
programmers and Common Lisp programmers."

Bear

Axel Schairer

Jan 4, 1998

> I (Xah) wrote:
> >From a language design point of view, why do we have the pair notion
> >in Scheme?

Previous posters have described how this could be explained by reasons
rooted in implementation issues or history. Eric has written about
why cons should be preferred over a list primitive, mentioning
abstract data types (ADTs). I'd like to comment on this second issue:

If you look at modern functional languages, e.g. ML, you invariably
find pairs, or tuples for that matter, built into the type system.
Additionally you find sum types. Using a sum type of product types
you can then design your own lists in an obvious way, i.e. a list is
either an empty list or a pair of an element and another list (see,
e.g., [1: chapters 2 and 3]). Once you play around with these ideas
you find that it is much more appropriate to see lists as a datatype
specified using pairs than the other way round. At least this is how
it occurs to me. Note: I know that these languages all include
built-in support for (basic) lists for various reasons, although to my
knowledge the difference is not visible, i.e. lists behave as if they
were defined as sums of products. (Am I right here?)

Now if you look at how one would (formally) specify the interface to
lists you will find the simple spec along the lines: Given a list l,
MAKE-LIST(a, l) is a list and FIRST of it is a and REST of it is l.
MAKE-EMPTY-LIST is a list different from all lists made by MAKE-LIST.
Nothing else is a list. Everything else can be defined with these
primitives by recursion. Also all properties of lists can be stated
and proved by induction over these terms within a well established
theory (see e.g. [2]). Anyway, you take products (more specifically,
pairs) for granted for defining lists.
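
In Scheme-ish notation the spec might be rendered roughly like this (the
names are the spec's own, so they may shadow built-ins in some systems):

  (define (make-empty-list)  '())
  (define (make-list a l)    (cons a l))   ; pairs taken for granted underneath
  (define (first l)          (car l))
  (define (rest l)           (cdr l))

  (first (make-list 1 (make-list 2 (make-empty-list))))   ; => 1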

What you cannot yet do is write something like a1, ..., an for a list
(provided you do not have a device for handling a variable number of
arguments, in which case you could design a list-creating function).
But then, strictly speaking, you do not need more than a means of
abbreviating MAKE-LIST(a1, ... MAKE-LIST(an, MAKE-EMPTY-LIST)...) by
something like [a1, ..., an] and you are done. This is a syntactic
thing and we do not have to be concerned with it here.

I have tried to argue informally so far that the notion of products
(`pairs') is used in the definition of lists in functional languages,
even when pairs and lists are distinct. If I am correct here then the
original question boils down to the following: Is it a good or a bad
thing for the structure of a lisp-list, i.e. a list being a cons cell
with the cdr containing a list, to be visible to the programmer. If
memory serves me right there is a discussion on this topic (not
related to lisp however) in [1], too.

My own opinions and experiences:

1) It seems to fit best with the overall `feel' of lisp to make this
structure visible and let the user decide whether he wants to use the
information or not. (I personally use first and rest on lists, and
car and cdr on pairs, so my lists are not pairs ;-)

2) However it annoys me every time I make a stupid mistake (like
building up a dotted list instead of a proper list) that is not
trapped by lisp (if I call rest on a dotted pair I made an error, but
lisp won't complain) but would have been illegal in, say, ML. So,
contrary to what Eric said for himself, I would profit in this respect
if a list was not at the same time a pair.

3) Also I find that I do not look at the same object as a list and as
a cons cell too often, except for ad hoc hacks. But this is surely
dependent on the domain one is working with and different for other
users. As an example, walking terms with a rich structure that does
not matter at the moment represented as lists can be much easier if
you just handle non-conses and recurse for cars and cdrs of your
conses instead of explicitly walking the rich structure.

[1] B J MacLennan: Functional Programming, Addison Wesley, 1990
[2] J Loeckx, H-D Ehrich, M Wolf: Specification of Abstract Data
Types, Wiley/Teubner, 1996

Does this make sense? -- Axel

--
=== Axel Schairer, http://www.dfki.de/vse/staff/schairer/ ===

Erik Naggum

Jan 4, 1998

* Axel Schairer

| 2) However it annoys me every time I make a stupid mistake (like building
| up a dotted list instead of a proper list) that is not trapped by lisp
| (if I call rest on a dotted pair I made an error, but lisp won't
| complain) but would have been illegal in, say, ML. So, contrary to what
| Eric said for himself, I would profit in this respect if a list was not
| at the same time a pair.

ah, but this is a valid point I had not considered. I tend to use `endp'
when traversing lists, so I do get very loud complaints. if you check for
the end of the list with any other test, such as identity with nil, I can
see that you would get problems, but I imagine that `endp' would solve them
in practice if not (completely satisfactorily) in theory.

Carl Alphonce

Jan 4, 1998

< SNIP discussion of pairs deleted SNIP >

> ML apparantly doesn't have the pairs notion, nor does Mathematica -- a list is simply a list, no fuzz attached.

:: in ML is more restricted than Scheme's cons function. Restricted to
list structures, cons behaves just like :: does. car behaves like hd,
and cdr like tl.

Standard ML of New Jersey, Version 0.93, February 15, 1993
val it = () : unit
- val foo = [1,2,3];
val foo = [1,2,3] : int list
- hd foo;
val it = 1 : int
- tl foo;
val it = [2,3] : int list
- val bar = 0::foo;
val bar = [0,1,2,3] : int list

Chez Scheme Version 5.0a
Copyright (c) 1994 Cadence Research Systems
> (define foo '(1 2 3))
> (car foo)
1
> (cdr foo)
(2 3)
> (define bar (cons '0 foo))
> bar
(0 1 2 3)

Because Scheme is dynamically typed, cons cells are very flexible. There
is a tradeoff between Scheme's flexibility and the relative security
offered by ML's polymorphic type system. In Scheme cons cells are
used to build all sorts of recursive data structures. In ML you must
declare different data constructors to build different data structures.

Lists in Prolog are modeled after lists in Lisp/Scheme. The list which
is written as
[1,2,3]
is represented by the term
'.'(1,'.'(2,'.'(3,[])))
where '.' is the principal functor of the term, and [] is an atom
interpreted as the empty list. This is equivalent to the dot notation
in Scheme. The list (1 2 3) is just a convenient shorthand for
(1 . (2 . (3 . ())))
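
A quick check in Scheme shows the reader simply normalizes the dotted form:

> (equal? '(1 2 3) '(1 . (2 . (3 . ()))))
#t
> '(1 . (2 . (3 . ())))
(1 2 3)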

Like Scheme, Prolog is dynamically typed. Prolog affords the programmer
similar flexibility in building recursive data structures as does
Scheme. Any structure built using cons in Scheme can be represented
as a Prolog '.' term, not just list structures.

I think the main point here is that languages like Scheme have made
different choices than languages like ML. :: in ML cannot be as
flexible as cons in Scheme, because ML is strongly (though
polymorphically) typed. In return, ML offers the programmer a great
deal of security: there can be no runtime type errors.

Carl
--
Carl Alphonce

Martin Rodgers

Jan 5, 1998

Ray Dillinger wheezed these wise words:

> But either way, it has nothing to do with scheme. We could,
> perhaps, spend our time better speaking of "the (bitter?)
> separate namespace issue which is a division between Scheme
> programmers and Common Lisp programmers."

Most certainly. I was only pointing out that this kind of division
isn't unique to CL/Scheme. I feel that the C/Pascal semicolon issue is
far less significant than the CL/Scheme namespace issue, yet Scheme
and CL programmers seem, at least to me, to be able to discuss this
issue without the same degree of hostility.

As for the cons question, I think that a programmer who is truly
interested in learning new ideas must sometimes ask questions that
will be "obvious" to more experienced programmers. It may not be
obvious to everyone why Lisp lists should be the way they are, and if they
don't ask, then nobody can explain it to them.

Has this question now been answered?
--
<URL:http://www.wildcard.demon.co.uk/> You can never browse enough
"As you read this: Am I dead yet?" - Rudy Rucker
Please note: my email address is gubbish

Xah

Jan 5, 1998

In the course of discussion, I think there is a concept I'm trying to express but has been explicitly indicated as stupidity by Eric. In this message I'll try to bring it up pointedly.

Consider a set. A set is a collection of things. It is not necessary to think of a set as a pair, where one part is another pair ...etc. An n-tuple is simply a set of n things. Similarly for n-dimensional vectors. There is no structural difference between a set of 3 real numbers, a 3-tuple, or a 3D vector. We can think of them as a list of 3 things, or a tree of three branches, and write them as List[a,b,c].

A matrix is simply a set of vectors. Conceptually, it is a set of n things, each containing m things. In other words, it is a tree with n branches, each containing m branches. (You may switch n and m.) We can write them as, e.g., List[List[a1,a2],List[b1,b2],List[c1,c2]].

Structurally, all the above are simply trees of different shapes. In general, we write List[expr, expr,...] where each expr is a List[] or nested instance thereof.

As computer science advances, programming languages become more advanced and abstract, moving away from references to hardware and implementation details. For example, in Scheme, one can represent arbitrary trees by writing (list expr expr...) where the exprs are lists themselves. The programmer need not know about pointers or memory addresses as in "lower-level" languages like C.
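
For example (plain Scheme):

  (define t (list 1 (list 2 3) (list (list 4) 5)))   ; an arbitrary tree
  t            ; => (1 (2 3) ((4) 5))
  (length t)   ; => 3 branches at the top level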

In the path of abstraction, it is a conceivable goal to have a *programming* language that is more or less a concrete embodiment of algorithm specification, where any reference to hardware or implementation does not exist in the language. The programmer is only concerned with writing algorithms, without needing to know computer-science byproducts like string, float, int, double, OR, cons. In this ideal language, it would be a kludge to say that cons (or pairs) is necessary to build up a list, or that there are several ways to represent a tree because of efficiency. Efficiency is an idea in algorithms, but implementation efficiency is not. E.g. Scheme's (vector 1 2 3) vs (list 1 2 3) vs (cons 1 (cons 2 (cons 3 nil))).

Mathematica is a language that represents trees uniformly. All trees are simply List[...]. I came from a Mathematica background, and this is what prompted me to question Scheme's cons. It is conceivable that efficiency/abstraction tradeoffs are unavoidable in the theory of designing/implementing a programming language. Eric's argument that cons is a *necessary good* as a primitive for lists reminds me of the stereotypical argument between C and Lisp programmers. One camp wants control, the other abstraction.

My original question on "why Scheme has the pairs notion" has been answered satisfactorily for me. All replies that address my question have been very informative. Thanks.

PS: this message is posted separately to comp.lang.lisp and comp.lang.scheme in deference to Kent Pitman. I'm neutral on his manifesto on cross-posting. Also, I do not know ML in any way. In Mathematica, there is no cons notion in any form what-so-ever.

Xah, x...@best.com
http://www.best.com/~xah/Wallpaper_dir/c0_WallPaper.html
"(Intelligence (lisp (pigs (perl (unix baggage (C (brain tumor)))))))"

Tim Bradshaw

Jan 5, 1998

* Xah wrote:

> By having implementation/efficiency issues showing up in the
> language, Scheme apparently is less a higher-order (or conceptually
> elegant) language than it would be. Similarly, the vector
> notion/primitive in Scheme is also conceptually redundant. A list
> will do. (again, in the name of efficiency...)

No, this is wrong. What you're missing here is that hiding these
`implementation' issues is hiding issues of computational complexity,
and that is something you do not want to do. There are crucial time
complexity differences between lists and vectors which are reflected
in programs that use them. Specifically the NTH operation is O(1) on
a vector but O(n) on a list, while insertion / deletion at the head is
O(1) for a list but O(n) for a vector.
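
To make that concrete (standard Scheme; the point is growth rate, not
absolute speed):

  (define lst (list 'a 'b 'c 'd))
  (define vec (vector 'a 'b 'c 'd))
  (list-ref lst 3)     ; => d -- found by chasing three cdr pointers: O(n)
  (vector-ref vec 3)   ; => d -- one indexed memory access: O(1)
  (cons 'z lst)        ; => (z a b c d) -- new head in O(1), lst untouched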

If you hide those differences then you'll never be able to work out
the complexity of your programs, and you'll lose horribly as a result.
Indeed this kind of lossage is a common problem for Lisp & scheme
beginners, who haven't understood the properties of lists, assume that
NTH is O(1), and implement array-manipulation code in terms of lists
with very slow results. (Here `slow' means `of worse time
complexity'.)

Functional-language people (or someone) will probably argue that you
should just have one type and allow the compiler to figure out the
representation. Perhaps.

(Of course, you could implement conses & therefore lists in terms of
tiny vectors, and just have vectors, which is kind of what actually
happens at the machine level, but they're so useful as a type that
it's worth having them in the language).

--tim

Daniel R Barlow

Jan 5, 1998

In article <68obf1$eqr$1...@newsie2.cent.net>, Jrm <emer...@cape.com> wrote:
>Martin Rodgers wrote in message ...
>>I'm reminded of the (bitter?)
>>semicolon issue that divides C and Pascal programmers.
>I believe that it only divides *Pascal* programmers.

Yes, I think it terminates C programmers.

-dan

Erik Naggum

Jan 5, 1998

* Xah

| In the course of discussion, I think there is a concept I'm trying to
| express but has been explicitly indicated as stupidity by Eric.

well, arrogant ignorance, actually, from which there is both room and
hope for improvement, provided one is willing to listen.

| In the path of abstraction, it is a conceivable goal to have a
| *programming* language that is more or less a concrete embodiment of
| algorithm specification, where any reference to hardware or
| implementation does not exist in the language.

I don't think this goal is conceivable, simply because, pragmatically,
anything we want to do on a computer has to be implemented somehow, and
philosophically, even your most abstract notions are implementations of
something else. ergo, implementations are ipso facto unavoidable.

| The programmer is only concerned with writing algorithms, without needing
| to know computer-science byproducts like string, float, int, double, OR,
| cons.

no, a _programmer_ is concerned with strings, floats, ints, doubles, and
conses because he is _implementing_ an algorithm on a _computer_. that's
what programming is all about. but somehow, it is still permissible for
computer scientists to sneer at their tools. Dijkstra has reportedly
said that "Computer science is as much about computers as Astronomy is
about telescopes", and we can appreciate this in terms of our discipline
being much _more_ than just hardware and implementations, but we cannot
_deny_ the computer any more than astronomers can deny their telescopes.

that a programmer is often tasked with algorithm design is due to a
shortage of algorithm designers who are clueful enough to present works
that take less time for programmers to implement than to do it all over
on their own would, plus the way we work on computers these days, and
finally, how computers tend to be hostile to handwaving and explanations
that don't take the form of a precise program.

| In this ideal language, it would be a kludge to say that cons (or pairs)
| is necessary to build up a list, or that there are several ways to
| represent a tree because of efficiency. Efficiency is an idea in
| algorithms, but implementation efficiency is not. E.g. Scheme's
| (vector 1 2 3) vs (list 1 2 3) vs (cons 1 (cons 2 (cons 3 nil))).

what's ideal depends on what your idea is. ideals cannot be universal.

however, as I alluded to, Scheme as a language blew it in just the areas
where its proponents make the most noise about being superior to every
other language. that Scheme forces you to be fully aware of the cons
cell through the accessors `car' and `cdr' is just an instance of this.

Common Lisp has the constructors `list' and `list*', the accessors
`first', `second', ..., `tenth' for individual elements in a known list
structure, `nth' and `elt' for a variable index, and `rest' for the tail.
a test for the end of a list is available with `endp'. not a `cons',
`car', `cdr' or `nil' in sight!

of course, you _could_ implement these in Scheme, too.

| Mathematica is a language that represents trees uniformly. All trees are
| simply List[...]. I came from a Mathematica background, and this is what
| prompted me to question on Scheme's cons.

_all_ of your complaint is that you can see the internals of the `list'
abstraction in Scheme and Lisp that you cannot in Mathematica? right?

I think I understand how you argue, and all I can say is that we do have
a point of agreement: it's bad design to force programmers to deal with
an implementation issue when it costs _nothing_ to do the Right Thing and
hide behind names that suggest a more abstract view and communicate to
the programmer (who is much more important than the computer) that one
deals with lists, not with cons cells, regardless of what the thing
actually does (because that's implementation details that you don't want
to deal with _needlessly_).

I believe that you would never have observed the implementation if you
had not _had_ to deal with `car' and `cdr', but you drew the wrong
conclusion from the evidence: that an implementation is redundant because
you're used to the interface. suppose that things were implemented
exactly the same way, but you couldn't get at the individual cons cells.
would you still complain? I honestly don't think you would. which leads
me to believe that if you had learned Common Lisp instead of Scheme and
read source code from the best and most modern CL programmers, you would
never have had reason to discover the implementation of lists.

if you are still bothered by the fact that a list is made up of conses,
that must mean that you _are_ concerned with implementation details, yet
you profess very strongly that you aren't, so I detect a contradiction in
your interests. if this is not so, may I ask what your real concern is?

| It is conceivable that efficiency/abstraction tradeoffs are unavoidable
| in the theory of designing/implementing a programming language.

well, it is not only conceivable, it is obvious to me.

| Eric's argument that cons is a *necessary good* as a primitive for lists
| reminds me of the stereotypical argument between C and Lisp programmers.
| One camp wants control, the other abstraction.

if this is really what you got out of it, then you should go back and
read my article again, unless you don't think this, anymore.

abstractions exist in levels, level N being implemented in level N-1.
your only complaint is that you see one more level than you're used to
from Mathematica. this isn't even close to _any_ stereotypical argument
between Lisp and C programmers. C programmers want easy implementation,
frequently at the cost of prohibiting abstraction. Lisp programmers want
implementations that _support_ abstraction. as I tried to show you,
the implementation of lists with cons cells is foreign to C programmers,
who _much_ prefer fixed-sized vectors of pointers or a "next" pointer
_in_ the particular objects they wish to make lists of. a C programmer
who had invented the cons cell as a struct or a two-element vector of
pointers would probably be laughed at because it's _inefficient_ in the
minds of other C programmers.

| In Mathematica, there is no cons notion in any form what-so-ever.

well, so it would have been redundant in Mathematica. otherwise, they
would have implemented it, right? I criticize your claim as arrogantly
ignorant because you can not, in fact, hold on to needs that Mathematica
solved for its users when you argue that the cons cell is redundant in
_Scheme_ as you are flat out _ignoring_ the needs that Scheme (and other
Lisps) solved. if they didn't need it, it wouldn't be there. you have
to give the language designers this much credit. if not, you are still
arrogant and ignorant, like so many others who find flaws with

Michael Greenwald

Jan 5, 1998

Tim Bradshaw <t...@aiai.ed.ac.uk> writes:
> while insertion / deletion at the head is
>O(1) for a list but O(n) for a vector.

People keep saying this, but it is not strictly true. (Rather, it is
true in only a limited and uninteresting sense.)

If you wanted to support insertion at the head of a vector (as, for
example, CLU supports with a primitive addh (add at head) operator),
then the underlying representation could be a block of W words, with a
header pointing to N words beginning at offset o. To insert-at-head,
decrement o. If o is already 0 you have to allocate a new, larger,
block. If you double W every time you insert-at-head
and there's no room, then a series of K insert-at-heads is not (on
average) noticeably worse than inserting at the head of a list. You
incur O(k) costs Log K times, for k being powers of 2, so the total
cost of K insertions will be approx 2K. The cost of consing K times
to the head of a list is of the same complexity (i.e. K conses).
That's the worst case, of course; if you start removing entries from the
head, then the vector is much more efficient than lists.
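
A sketch of that growth scheme in Scheme, purely illustrative (make-buf and
buf-insert-at-head! are invented names, not CLU's addh):

  ;; buffer = 3-slot record: backing block, offset of first element, count
  (define (make-buf)
    (vector (make-vector 4) 4 0))

  ;; usage: (define b (make-buf)) (buf-insert-at-head! b 'x) ...
  (define (buf-insert-at-head! b x)
    (if (zero? (vector-ref b 1))               ; no room left at the front?
        (let* ((blk (vector-ref b 0))
               (w   (vector-length blk))
               (new (make-vector (* 2 w))))
          (do ((i 0 (+ i 1))) ((= i w))        ; copy old block into the top half
            (vector-set! new (+ w i) (vector-ref blk i)))
          (vector-set! b 0 new)                ; install the bigger block;
          (vector-set! b 1 w)))                ; first element is now at offset w
    (let ((blk (vector-ref b 0))
          (off (vector-ref b 1)))
      (vector-set! blk (- off 1) x)            ; store the new head element
      (vector-set! b 1 (- off 1))
      (vector-set! b 2 (+ 1 (vector-ref b 2)))))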

Space overhead? The vector will be at worst 50% empty in the constant
growth scenario, which is equivalent to the wasted space for CDR's in
non-CDR-coded lists.

There *is* a space advantage to CONS cells as a separate
implementation type, if you think that short lists/queues will be
common: avoiding the overhead of the header by having fixed sized
cells. However, this is not so significant (and you can special case
short vectors with special tricks, only going to separate headers for
"large enough" vectors).

To avoid being misunderstood and to avoid provoking flames:

I didn't say anything about the relative values of CONS, LIST, and/or
VECTOR as primitive data types.

The O(1) vs. O(n) insertion argument at the head also applies to the
tail of a list. Using it to argue for CONS vs. Vector in these cases
is a red-herring. Of course, if you implement a priority-queue with
frequent insertions and deletions from the middle of a list, and you
implement it as a sequential vector or list, then the linked list wins
over the naive vector implementation. But you would only do this for
short queues, anyway, before moving on to a totally different data
structure.


Tim Bradshaw

Jan 5, 1998

* Michael Greenwald wrote:

> If you wanted to support insertion at the head of a vector (as, for
> example, CLU suports with a primitive addh (add at head) operator),
> then the underlying representation could be a block of W words, with a
> header pointing to N words beginning at offset o. To insert-at-head,
> decrement o. If o is already 0 you have to allocate a new, larger,
> block. If you double the length of W every time you insert-at-head
> and there's no room, then a series of K insert-at-heads is not (on
> average) noticeably worse than inserting at the head of a list. You
> incur O(k) costs Log K times, for k being powers of 2, so the total
> cost of K insertions will be approx 2K. The cost of consing K times
> to the head of a list is of the same complexity (i.e. K conses).
> That's worse case, of course, if you start removing entries from the
> head, then the vector is much more efficient than lists.

This (last bit, sorry to quote whole para) can't be right I think.
Removing n entries (one at a time!) has to be O(n), because ... well
because if it isn't you're doing magic somewhere. Or do you mean
efficient as `faster' rather than `less complex'?

But such an implementation obviously has lots of other advantages
anyway like good locality which is a huge win over conses.

Anyway, I should have qualified my original message with `given the
straightforward implementation of things'. I was really trying to
point up the fact that ignoring `implementation details' (ie saying
`everything can be a linked list' as the original poster seemed to be
saying) is something one should not do in general.

--tim

Guillermo 'Bill' J. Rozas

Jan 5, 1998

x...@best.com (Xah) writes:


> I (Xah) wrote:
> >From a language design point of view, why do we have the pair notion in Scheme?

> >...
>
> Most people have misintepreted my question. However, the replies are informative. Thanks.
>
> Perhaps I'd make my original question clearer by using an analogy.
>
> The essential controls to a car is the wheel, acceleration, and break. Stickture gear may provide more efficient control, but it is not essential. (conceptually redundant)
>

> Similarly, a list (abstractly) is simply an ordered sequence of things. A tree is simply a list of lists. A pair notion (like Scheme's cons) is perhaps good in a language for implementation/efficiency reasons, but it is a redundant notion conceptually.
>

> Here's my original question rephrased: Why did the designers of Scheme -- a language that emphasize conceptual elegance -- included the pairs notion (cons) in the language?
>

> Partial answers are inferred from replies: Because lists are efficiently implemented using linked pairs (or so), therefore Scheme designers decided (intentionally or not) to make this pairs explicit as a language primitive. (thus we have cons, car, cdr.)
>

> By having implementation/efficiency issues showing up in the language, Scheme apparantly is less a higher-order (or conceptually elegant) language than it would be. Similarly, the vector notion/primitive in Scheme is also conceptually redundant. A list will do. (again, in the name of efficiency...)
>

> Now having understood some of the background of cons, car, and cdr, my next question is: Is there any good reason to actually have a pairs primitive in a (higher-order) language? The answer is apparently no, as evidenced by a few private emails I got. (ML apparently doesn't have the pairs notion, nor does Mathematica -- a list is simply a list, no fuss attached. Purists don't drive a car with a stick shift (^_^).) Modern compiler/interpreter optimization technology should be advanced enough to generate the linked pairs internally from trees of whatever shape.
>
> All comments and inspirations are welcome. Language flames by email, please. Thanks.
>
> PS: The names cons, car, and cdr are apparently historical (as explained in SICP p.85).
>
> Xah, x...@best.com
> http://www.best.com/~xah/PageTwo_dir/MathPrograms_dir/mathPrograms.html
> "New Programing Paradigm: Parallel execution of bugs and C/C++ programers."

The problem is that you are seeing it all backwards, and missing the
point. :-)

You view lists as the fundamental data structure, and pairs as an
accident of their implementation. As such, you are right that pairs
are not really necessary, may seem confusing, and it is an ugly wart
to show through the implementation of lists in the language.

However, to a true Lisp programmer (the origin of the name Lisp
nonwithstanding) _pairs_ are the fundamental data structure. Lists
are just a conventional use of pairs. No more and no less.

All the abstract properties that you care about and think are a
property of lists are really a property of pairs in Lisp. Since lists
are _just_ a convention, Lisp is a little cavalier about them, and
that's why the abstract properties do not really hold.

However, because lists are such a widespread and useful convention,
Lisp dialects (Scheme included) provide lots of operations that
operate on lists, thereby misleading novices into thinking that lists
are the fundamental data structure, and making them wonder how clean
the language is in the gaps where the "lists are fundamental" and
"lists are a conventional use of pairs" paradigms conflict.

As an analogy, think of the MAXINT or similar values that some
languages/libraries provide. In many programs such a value is used to
represent infinity, but it is not. It is just an often useful
convention. Now, since infinity (a number defined to be larger than
all ordinary numbers) is such a useful convention, you might be
tempted to ask that affine reals or transfinite integers replace the
ordinary numbers of computer arithmetic in order to get the
abstraction right, since otherwise the gaps show sometimes (and can
lead to hard to catch bugs). Most people would probably suggest that
you leave numbers as they are and that if you need a tighter
bulletproof abstraction you could implement it yourself.

The same would hold for lists in Lisp. Pairs are not a wart, but a
very useful ADT in their own right. So useful and convenient, in fact,
that people use them as glue to build other types (such as lists)
without bothering with a tighter definition or implementation.
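A small illustration (mine, not from the post): a bare pair is exactly
what an association-list entry wants to be -- the cdr holds a datum,
not "the rest of a list" -- while the alist itself is still a proper
list of those pairs.

(define prices '((coffee . 3) (tea . 2) (water . 0)))

(cdr (assq 'tea prices))   ; => 2
(length prices)            ; => 3, the outer list is an ordinary proper list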

I don't know much Mathematica, but remember that lists in ML can only
contain uniformly-typed values (since the language is so-called
statically typed), thus "exposing" the underlying pair structure would
not be as useful as it is in Lisp.

An extreme way of viewing the situation would be that since their type
system prevents them from having true Lisp-style pairs and making
lists a convention based on them, they chose a common subset of the
convention that they could fit in their type system and placed it in
the language in the hope that it would make do. :-)

Jon S Anthony

unread,
Jan 5, 1998, 3:00:00 AM1/5/98
to

mcr@this_email_address_intentionally_left_crap_wildcard.demon.co.uk (Martin Rodgers) writes:

> comfortable than he is with strong type checking. The real question is
> whether a type system enables or disables you, and to what degree.

It does both. Which happens at any point depends on where you're
coming from and what it is you want to accomplish. This is not one of
those issues that has a single "right" answer and is very context
dependent.

/Jon

Michael Greenwald

unread,
Jan 5, 1998, 3:00:00 AM1/5/98
to

Tim Bradshaw <t...@aiai.ed.ac.uk> writes:

>* Michael Greenwald wrote:

>> The cost of consing K times
>> to the head of a list is of the same complexity (i.e. K conses).
>> That's the worst case, of course; if you start removing entries from
>> the head, then the vector is much more efficient than lists.

>This (last bit, sorry to quote whole para) can't be right I think.
>Removing n entries (one at a time!) has to be O(n), because ... well
>because if it isn't you're doing magic somewhere. Or do you mean
>efficient as `faster' rather than `less complex'?

I was assuming that copying and storage management costs were
dominant, rather than storing the item and/or
incrementing/decrementing a length field. Given that assumption
(which I think is usually reasonable) then in the case of both
insertions and deletions the dominant term is sub-linear for vectors
but linear for cons/list. To be fair, you're right, I still shouldn't
ignore the small linear term, but given the current (growing)
disparity between cache hits and cache misses it *almost* seems
reasonable to ignore memory references that hit in the cache. Given
the linear term, I guess I switched meaning from "lower complexity
class" to "faster". My apologies for being confusing.

Jon S Anthony

unread,
Jan 5, 1998, 3:00:00 AM1/5/98
to

x...@best.com (Xah) writes:

> Consider a set. A set is a collection of things. It is not necessary
> to think of a set as a pair, where one part is another pair
> ...etc.

Yes, but...

> An n-tuple is simply a set of n things. Similarly for
> n-dimensional vectors. There is no structural difference between a
> set of 3 real numbers, a 3-tuple, or a 3D vector.

no. At least not in the typical formal mathematical sense. An
n-tuple is an _ordered_ set. That's the whole point of it. In CS
land, about the only thing that acts sort of like a "set" is an ADT
known as a "bag". Things like lists, arrays, vectors (aka arrays),
etc. have order as part of their (public) semantics.


> A matrix is simply a set of vectors.

Well, no it's not - since order is quite important in the matrix
concept.

> Conceptually, it is a set of n things containing m things.

Taking just the 2D case, it's an ordered n-set of ordered m-sets.


> Mathematica is a language that represents trees uniformly. All trees
> are simply List[...].

Modulo representational equivalences, I don't believe this is correct.


> Eric's argument that cons is a *necessary good* as a primitive for
> list suggests to me the stereotypical argument between C and Lisp
> programmers. One camp wants control, the other abstraction.

I don't believe that was the point. I believe the point was that the
concept of "list" (in particular, the definition indicated) is
extremely fundamental and should be an intrinsic.

/Jon

--
Jon Anthony
Synquiry Technologies, Ltd., Belmont, MA 02178, 617.484.3383
"Nightmares - Ha! The way my life's been going lately,
Who'd notice?" -- Londo Mollari

Shriram Krishnamurthi

unread,
Jan 5, 1998, 3:00:00 AM1/5/98
to

The recursive definition of a list is the following:

<list> = null
       | (cons <value> <list>)

Scheme provides you the two constructors, null (often written '()) and
cons, and lets you build lists for yourself.
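For instance (an illustrative example, not part of the original post),
here is a function defined by structural recursion on exactly those
two constructors:

(define (my-length lst)
  (if (null? lst)
      0
      (+ 1 (my-length (cdr lst)))))

(my-length (cons 1 (cons 2 (cons 3 '()))))   ; => 3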

Unfortunately, Scheme also has a confusing notion of `pairs', so that
the second argument to cons can be a non-list. This may have made
sense in some land before time, but clearly once you have generative
structures (aka records), there is no need for this, and cons can be
used with the more restrictive interpretation. I certainly believe it
should be. (We put this into practice in DrScheme and, guess what,
the masses haven't been in revolt.)

The `list' procedure is just a convenient optimization over `cons'.
If you view your data as ADTs corresponding to (free-term
constructive) algebras, then you really want to build all your lists
via cons and null. cadr, etc are, again, just conveniences.

'shriram

Richard Mlynarik

unread,
Jan 5, 1998, 3:00:00 AM1/5/98
to

I implemented a very-Scheme-like language which
had no cons or list datatype. (Two-element vectors
were implemented very efficiently, however.)

I view lists as an interesting library type
(possibly built on vectors) but nothing fundamental.
Their central place in Scheme (read syntax, rest-lists,
etc) seems an anachronism.
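Such a library type is easy to sketch on top of two-element vectors
(an illustration of the idea only, not the poster's actual
implementation; the names are invented to avoid the built-ins):

(define (vcons a d)     (vector a d))
(define (vcar p)        (vector-ref p 0))
(define (vcdr p)        (vector-ref p 1))
(define (set-vcar! p a) (vector-set! p 0 a))
(define (set-vcdr! p d) (vector-set! p 1 d))

(vcar (vcons 1 (vcons 2 '())))   ; => 1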

Albert Petrofsky

unread,
Jan 5, 1998, 3:00:00 AM1/5/98
to

mich...@Radon.Stanford.EDU (Michael Greenwald) writes:
> Tim Bradshaw <t...@aiai.ed.ac.uk> writes:
> > while insertion / deletion at the head is
> >O(1) for a list but O(n) for a vector.
>
> People keep saying this, but it is not strictly true. (Rather, it is
> true in only a limited and uninteresting sense.)
>
> If you wanted to support insertion at the head of a vector (as, for
> example, CLU suports with a primitive addh (add at head) operator),
> then the underlying representation could be a block of W words, with a
> header pointing to N words beginning at offset o. To insert-at-head,
> decrement o. If o is already 0 you have to allocate a new, larger,
> block.

There are several different operations that might be called "insertion
at the head". The one that cons gives you for pair-based lists is a
non-mutating insertion with shared structure. The result of the
operation is a new list with one more element than the operand. The
operand remains unchanged and is still usable. Subsequent mutations
of either list can affect the other.

You seem to be describing a mutating insertion operator that has a
side-effect of increasing the length of its operand. This is very
different, and often not as useful:

-- It doesn't allow you to do things like this:

(define l (list <elt1> <elt2> ... <eltN>))

(define l1 (cons <new1> l))
(define l2 (cons <new2> l))
(define l3 (cons <new3> l))
...

where each cons runs in O(1) time.

-- It is unsuitable if you are avoiding the use of mutation and
assignments so as to make it much easier (possible) to prove
properties of your program.

-- If you are using mutation, it doesn't give you the shared
structure between two lists that you might be looking for.
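To make the sharing in the first point concrete (an illustrative
example, not part of the original post):

(define l  (list 1 2 3))
(define l1 (cons 'a l))
(define l2 (cons 'b l))

(eq? (cdr l1) (cdr l2))   ; => #t, both new lists share the old one
l                         ; => (1 2 3), unchanged by either cons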

-al

Shriram Krishnamurthi

unread,
Jan 5, 1998, 3:00:00 AM1/5/98
to

theg...@airmail.net (Bryant Brandon) writes:

> Allright. A List is built up of CONSes. A CONS(short for Construct)
> consists of two pointers. The first is named CAR, the second is named
> CDR. You hold onto a list by the first CONS. [...]
> The function CONS creates a new Construct in memory [...]

What's that you said? Pointers? New Construct in Memory? There are
no pointers in Scheme. Scheme only has values. You must mean C++,
which is down the hallway, second newsgroup on the right.

Even if Scheme had pointers, they would not have names. Variables
have names. Variables are bound to values. But values do not
inherently have names (except in certain implementations for certain
kinds of values ... the kinds of disclaimers you need on c.l.s!).

So, to translate your C++ explanation into Scheme:

A list is built up using CONS. CONS builds a composite value out of
two values. The first value can be accessed by passing the composite
value to the procedure CAR; the second by invoking CDR.

Et cetera.
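Indeed, nothing about pairs requires pointers at all. The classic
exercise (illustrative, and standard textbook fare rather than part of
the original post) builds them out of nothing but procedures:

(define (kons a d) (lambda (pick) (pick a d)))
(define (kar p)    (p (lambda (a d) a)))
(define (kdr p)    (p (lambda (a d) d)))

(kar (kons 1 2))   ; => 1
(kdr (kons 1 2))   ; => 2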

'shriram

Shriram Krishnamurthi

unread,
Jan 5, 1998, 3:00:00 AM1/5/98
to

alph...@cs.ubc.ca (Carl Alphonce) writes:

> I think the main point here is that languages like Scheme have made
> different choices than languages like ML. :: in ML cannot be as
> flexible as cons in Scheme, because ML is strongly (though
> polymorphically) typed.

This depends on what type model you choose to use for Scheme (as I
said in the email message I sent you).

> In return, ML offers the programmer a great
> deal of security: there can be no runtime type errors.

(This should probably go into the FAQ.)

ML programs do have run-time errors. They are called "exceptions".
Just because you rename them does not make them go away. Just because
they are not type errors does not mean they are not real, ugly,
hard-to-debug errors.

If you believe that this points to a real distinction between Scheme
and ML, run a program in MzScheme sometime (an implementation that has
been strongly influenced by ML). In MzScheme, I can wrap every
program I run in

  (with-handlers
      ((exn? (lambda (exn)
               (display "An exception was raised!")
               (display "But I'm going to ignore it!")
               (display "La-di-da!"))))
    <program>)

and I would never see a run-time error either. As for _type_ errors
at run-time, it all depends (again) on what type system you choose to
use. (Machine code has no type system, so there are no type errors,
right? (-:)

'shriram

PS: That said, as I've said before in this forum, I think the single
best thing Scheme programmers should do is learn ML. There is a
kernel of truth behind what Alphonce says, and too many Lispers
and Schemers never get it.

Xah

unread,
Jan 5, 1998, 3:00:00 AM1/5/98
to

(this message is mostly addressing Erik Naggum)

Erik Naggum wrote:
> _all_ of your complaint is that you can see the internals of the `list'
> abstraction in Scheme and Lisp that you cannot in Mathematica? right?

Yes, on the whole. (but perhaps not in the derogatory and slightly mutated way you phrased it)

I don't know much computer science, but my question isn't unthoughtful. If your sensitivity had not caused you to misinterpret it and immediately flame me and other groups of people flamboyantly and insincerely, this large thread verging on a senseless flame war would not have started.

I wasn't complaining, nor was I arrogant. Comparing our posts in this thread, I think it's fair to say you are the one who is complaining and arrogant. Yes, I was ignorant about Lisp, but so what? In my first post, I wrote "Please illuminate a Scheme newbie. Thanks". (And please, no cheap rebuttals. In all honesty, my first post was in the form of asking a question.)

Some people have pointed out the incorrectness of the paragraphs I wrote in my last post on the sameness of sets, n-tuples, vectors, matrices, and trees. I know their differences. I was just trying to illustrate how they are uniformly represented in Mathematica. In my second post in this thread I briefly described them as ordered lists and trees, but apparently the word `list' reminds people too much of `list' in CL or Scheme, with its associated implementation connotations.

It is apparent to me that no one so far has indicated a familiarity with Mathematica. Otherwise, the context of my question would be recognized clearly without doubt. The language of Mathematica is **close** to the ideal language I was talking about, where computer *engineering* concerns like byte, cons, int, float...etc. are as absent as possible from the language, and where a computer *science* oriented programmer can concentrate on work in algorithms, AI, computational geometry...etc. without even knowing what a register or a caddadadar is! (^_^) (now, is there a limit on the caadadadddaadr length?)

In Erik's last message, which argues point-for-point against mine, many points are really pointless, philosophical, and cheap shots. In summary, I think he is more into computer engineering than computer science, and perhaps too enamored with CL. Arrogance is usually proportional to one's (tech) prowess. Well, that's OK with me as long as you don't burn my head with real flame.

At least we are now all acquaintances. Good way to start a new year.

Xah, x...@best.com
http://www.best.com/~xah/

Samuel S. Paik

unread,
Jan 5, 1998, 3:00:00 AM1/5/98
to

In article <xah-ya02408000R...@nntp1.ba.best.com>,

x...@best.com (Xah) wrote:
>Similarly, a list (abstractly) is simply an ordered sequence of things.
>A tree is simply a list of lists. A pair notion (like Scheme's cons) is
>perhaps good in a language for implementation/efficiency reasons, but it
>is a redundant notion conceptually.

Cons cells are more fundamental than lists.

>language? The answer is apparently no, as evidenced by a few private emails
>I got. (ML apparently doesn't have the pairs notion,

SML most certainly DOES have the equivalent of a cons cell. Well, they
improved on it by making it immutable. Lists in SML are merely
chains of "cons" cells terminated by a nil (e.g. []), same as Scheme.

Sam

--
Necessity is the Mother of Improvisation. | Samuel S. Paik
Laziness and Greed are the Parents of Innovation| pa...@webnexus.com
A Rebel against the Small World Order | Speak only for self

William Paul Vrotney

unread,
Jan 6, 1998, 3:00:00 AM1/6/98
to

In article <6dd8i6o...@cobalt.transmeta.com> g...@cobalt.transmeta.com


(Guillermo 'Bill' J. Rozas) writes:
>
> However, to a true Lisp programmer (the origin of the name Lisp
> nonwithstanding) _pairs_ are the fundamental data structure. Lists
> are just a conventional use of pairs. No more and no less.
>
> All the abstract properties that you care about and think are a
> property of lists are really a property of pairs in Lisp. Since lists
> are _just_ a convention, Lisp is a little cavalier about them, and
> that's why the abstract properties do not really hold.
>
> However, because lists are such a widespread and useful convention,
> Lisp dialects (Scheme included) provide lots of operations that
> operate on lists, thereby misleading novices into thinking that lists
> are the fundamental data structure, and making them wonder about how
> clean they are where the gaps between a "lists are fundamental" and
> "lists are a conventional use of pairs" paradigms conflict.
>

Your points are well taken, but let's not go so far as to imply to novices
that lists are just a convention in Lisp. One of the first things that
novices learn (or should learn) is that lists are composed of conses or
derive from s-expressions, but beyond that lists are the far more prevalent
abstraction. Modern Lisp programs are data made up of nested lists and
atoms. Lisp macros build list structure that is subsequently evaluated.
Again, lists are a very important and fundamental notation and abstraction.
Any novice can prove this to themselves by taking a Lisp program and
rewriting it in dot notation form. Also note that most modern texts on Lisp
will avoid the use of the term *cons* as much as possible in favor of
*list* or *dotted list* whenever it makes sense. This is beyond just a
convention. In the fundamental development of the Lisp data type hierarchy
cons is a primitive data type, but the list is the coup de grace.
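An illustrative instance of that exercise (not from the original post):
the same expression written as a list and in pure dot notation reads as
the identical cons structure.

(equal? '(+ 1 (* 2 3))
        '(+ . (1 . ((* . (2 . (3 . ()))) . ()))))
;; => #t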

--

William P. Vrotney - vro...@netcom.com

J. Han

unread,
Jan 6, 1998, 3:00:00 AM1/6/98
to

Xah <x...@best.com> wrote:
[big snip]

> It is apparant to me that no one so far indicated a familarity with
> Mathematica. Otherwise, the context of my question would be
> recognized clearly without doubt. The language of Mathematica is
> **close** to the ideal language I was talking about, where computer

[...]

Then why are you asking Lispers if your question is in the context of
Mathematica? May I suggest the Mathematica newsgroup? Many Mma users do
know Lisp, and maybe they can help you understand Lisp from an Mma orientation.

regards,

J.

[...]

Kent M Pitman

unread,
Jan 6, 1998, 3:00:00 AM1/6/98
to

I concur with Erik's remarks about the unavoidability of taking
the processor into account. You can choose not to, but there is
no denying the possibility of efficient and inefficient processors,
so if you don't spend any time caring, you sometimes suffer.

Then again--it's like any other trade-off you can make--you're
trading design time for execution time. Often the execution time
doesn't matter and the design time does and it's a good trade.
But you never get anything for free--the Three Laws of
Thermodynamics always apply (1. "You can't win", 2. "You can't break
even", and 3. "You can't get out of the game" ... or, as I loosely
summarize them sometimes: "everything is a trade-off, and often
an imperfect one" ... or, as they sometimes say to gamblers,
"If you don't know who the sucker is in the game, it's you.")

I liked Erik's telescope analogy, too.

One reason I think this discussion belongs separately in CL and
Scheme is that CL people mostly don't moralize over these things.
I think CL has lists as conses because historically they've been
there, for better or worse, and these days we try to maintain
linguistic stability. To change something in CL, you'd have to
demonstrate not only that you had a good idea, but that it was worth
breaking all existing code and textbooks to install your good idea.
I think it just wouldn't happen, and I think it's wrong to say
that CONS/LIST as it exists today is there for technical reasons.
It is not. It's there because it's "useful", because people are
familiar with it, because code uses it, it's part of a rich
tradition, etc. It's no longer retained because of any attempt
on the part of the designers to be smarter than the community;
rather, it's not removed because we've learned that little is gained
and much is lost by trying to assume that just because an idea
appears technically superior on paper, it will work out that
way in practice if we force it on a community that has not
asked for it.

Common Lisp itself arose historically because in the days of
Maclisp (a PDP10 Lisp, long-predating the Macintosh) we the
Lisp programmers used to find mail every Monday morning from the
Lisp implementors saying "the language has changed in the following
ways: [...]. Please update your programs" and we (the users)
just got tired of that and said "it's time for the language
to be stable for a while. It's more important that sentences
we say continue to have meaning over time than that those sentences
be written in the perfect language." If it were really true that
the language was more important than the programs written in it,
that would say very little about the value of the language since
it would say no serious work was being invested in it.

And while it may seem "dirty" and "inelegant" to some communities
to have "compatibility" emphasized as a dominant criterion over
"aesthetics", I stand with pride beside such a langauge. It doesn't
mean we never used aesthetics--there are many things that are
aesthetic in CL. But like any successful language (pick your
favorite natural language) it has its odd little places where things
don't line up. It doesn't mean you can't be precise when you want
to be. It doesn't mean you can't write great books or poems in it.
It just means that the writing of poetry is not all that is important
to the Lisp community.

For the most part, I would say very few things in Lisp are there
because of a technical reason. A lot are happenstance. But life
is like that. And I'd rather see life in the language than a
language which is a promise for a future that is never realized.

And, frankly, I think it's a personal choice issue whether the "pun"
of lists and conses being shared is elegant or awful. Some people
really like it, some don't. One reason for this is that the human
mind is one of those possible processors, and that, I think, is often
overlooked. The reason I mention it is that it seems clear from a
variety of empirical evidence (e.g. how we structure natural language)
that the mind is good at handling in parallel a lot of very complex
rules with special cases rather than a few simple ones. As such,
complex language definitions which permit simple sentences rather
than simple language definitions which require complex sentences
would, I think, seem to be more natural. The Scheme community often
tries to deny this, but I think they expect people to just take it on
faith that "simplicity" is uniquely determined. More evidence of
my claim about the three laws of thermodynamics above--I think
simple languages lead to complex programs and complex languages to
simple programs. You get nothing for free--often the Scheme
designers would seem to want you to believe that a small simple
language provides only benefits and no drawbacks, while a larger,
more complex language like CL provides only drawbacks and no benefits.
Once again, I encourage people to be skeptical of any claim on its
face when it appears to give you benefits for free.

I prefer a complex language and simple programs. I am not choosing
"complexity" over "simplicity" in so doing. I am choosing WHERE my
simplicity goes. I think learning a language is an investment that
pays dividends when you program later. I think languages that provide
lots of services pay better dividends and are worth it because I
spend more time programming than learning the language. I also like
language constructs that aren't just there to provide raw power but
are there to provide "commonality of idiom", "focus", and other things
that give more subtle control over expression. Again, I think this
simplifies aspects of program reading and maintenance.

People are welcome to debate the abstract niceties of pairs and
whether we'd be better off with other size trees. If you make them
be arbitrary size, they're trickier to GC in convenient blocks and
you have to pay a penalty for maintaining and accessing a size slot.
Or you can use fixed-size types other than 2--we had them, for
a while: go to deja news (http://www.dejanews.com/) and search
for posts by me about "hunks". Each of these is a trade-off, but
historically they haven't won out.

Yale Scheme (called "T"), conversely, had CHAR/CHDR for moving
down a string and giving the illusion that the string was listlike
even though it wasn't. It was fun but didn't catch on either...

We have what we have because when given a choice, it's what people
tend to use. Maybe that's just a "practice effect" and is bad data
as a way to design a language. But so it goes. Who's to say all
that "practice" should be quickly tossed to the wind and not counted
as a valuable commodity...

Michael Sperber [Mr. Preprocessor]

unread,
Jan 6, 1998, 3:00:00 AM1/6/98
to

>>>>> "WPV" == William Paul Vrotney <vro...@netcom.com> writes:

WPV> [ ... ] Modern Lisp programs are data made up of nested lists and
WPV> atoms. Lisp macros build list structure that is subsequently
WPV> evaluated.

That is, of course, utter nonsense for Scheme (as of R4RS, anyway).
Programs have a representation separate from lists. Which is why, for
example,

(+ . (4 5))

(shown to me by Richard Kelsey) may be a valid representation for the
list (+ 4 5), but is not a valid Scheme expression. Admittedly, most
Scheme implementations simply use the read primitive to parse programs
(and therefore accept the above expression), but some (Gambit,
notably) do not.
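As data, of course, the two spellings are indistinguishable once read
(an illustrative aside, not from the original post); the question above
is purely one of program syntax:

(equal? '(+ . (4 5)) '(+ 4 5))   ; => #t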

The only place where lists really are more fundamental to the language
is in argument rest lists.

--
Cheers =8-} Mike
Peace, international understanding, and all that blah blah

Martin Rodgers

unread,
Jan 6, 1998, 3:00:00 AM1/6/98
to

Jon S Anthony wheezed these wise words:

> It does both. Which happens at any point depends on where you're
> coming from and what it is you want to accomplish. This is not one of
> those issues that has a single "right" answer and is very context
> dependent.

This is how I see it, too. I just phrased my comment badly.

I should've said: "The real question is how a type system enables and
disables you." The important point I was making was that not everyone
will feel the same way about a particular type system, nor should we
expect them to. It's a very subjective question because of, as you
pointed out, its context-dependent nature.
--
Please note: my email address is munged; You can never browse enough
"There are no limits." -- ad copy for Hellraiser

Ketil Z Malde

unread,
Jan 6, 1998, 3:00:00 AM1/6/98