As I understand it, LISTs are built up using the pairs primitive CONS in a unique nested fashion, like
(list 1 2 3) == (cons 1 (cons 2 (cons 3 nil)))
Why can't we simply have LIST as primitive? It seems to me that basing list on pairs is ad hoc and creates ugliness. For example, the length of (cons (list 1 2) (list 3 4)) is actually 3, not the more intuitive 2. The existence of car, cdr, and variants cadr, caar, etc. is really ugly.
All in all, the pairs notion is redundant. If the user wants pairs, she can simply do (list 1 2), or nested instances thereof. The car and cdr can be supplanted by first, rest, last, list-ref etc. I can understand if cons in lisp is historical, but Scheme too? Please illuminate a Scheme newbie. Thanks.
Xah, x...@best.com
http://www.best.com/~xah/Wallpaper_dir/c0_WallPaper.html
"Perl side effects: make one assignment, the next thing you know your house is deleted."
>From a language design point of view, why do we have the pair notion in
>Scheme?
>
>As I understand it, LISTs are built up using the pairs primitive CONS in a
>unique nested fashion, like
>
>(list 1 2 3) == (cons 1 (cons 2 (cons 3 nil)))
>
>Why can't we simply have LIST as primitive?
Try this with LIST
(cons 'a 'b) => (a . b) NOT (cons 'a (cons 'b nil))
>It seems to me that basing list on pairs is ad hoc and creates ugliness.
>For example, the length of (cons (list 1 2) (list 3 4)) is actually 3, not
>the more intuitive 2.
Makes perfect sense. So does

  (cons (list 1 2)              same as   (list (list 1 2)
        (cons (list 3 4)                        (list 3 4))
              nil))
>The existence of car, cdr, and variants cadr, caar...etc. is really ugly.
In a proper LIST, every CONS's CAR points to data and every CDR points
to the next CONS or NIL. No problem.
>All in all, the pairs notion is redundant. If the user wants pairs, she can
>simply do (list 1 2), or nested instances thereof. The car and cdr can be
>supplanted by first, rest, last, list-ref etc. I can understand if cons in
>lisp is historical, but Scheme too? Please illuminate a Scheme newbie.
>Thanks.
All right. A List is built up of CONSes. A CONS (short for Construct)
consists of two pointers. The first is named CAR, the second is named
CDR. You hold onto a list by the first CONS. To get the first piece of
data in a list, use the function CAR which will return what is pointed to
by the CAR pointer of the CONS which you pass to it. CDR is likewise
used.
Let's say 'a' points to a list (0 1 2 3 4).
  (car a)  => 0
  (cdr a)  => (1 2 3 4)
  (cadr a) => (car (cdr a)) => 1
etc.
The function CONS creates a new Construct in memory and sets the CAR
and CDR pointers to the arguments you specify. If you happen to pass a
list to CONS as the second argument, you basically prepend your new CONS
to the existing list. That means (CONS -1 a) => (-1 0 1 2 3 4). However,
setting the CAR of a cons just makes it point to whatever you gave it. _That_
means (CONS a 5) => ((0 1 2 3 4) . 5). Now you know why (LENGTH (CONS (LIST
1 2) (LIST 3 4))) isn't 2, but 3. If you want each sublist to be
put into the CAR pointers of the list, use LIST. LIST creates a series of
conses and sets the CAR pointers to point to each argument which you have
provided. (LIST (LIST 1 2) (LIST 3 4)) => ((1 2) (3 4))
All in all, this system is very elegant. Much more so than the equivalent
in other languages.
To those who know more about this than I do, please fix any major
screwups in the above. <g>
B.B. --I am not a goat!
well, this list and cons business is an instance of what is commonly
known as an abstract data type (ADT) and one of the more successful ones
at that. you will learn more about ADTs in the basic CS curricula if you
bother to study. successful ADTs have simple interfaces and provably
correct implementations that nobody ever needs to mess with again. in the
case of lists, most languages simply do not have them, and those that do,
apart from the Lisp family, get them wrong. there are two reasons for
this: (1) short-sighted users who think that "list" is a primitive and
can't be bothered to understand why a simple interface doesn't have
simple implementations, (2) stupid language designers who believe too
strongly in strong typing and thus shoot everybody else in the foot when
they want lists of objects and certainly can't be bothered to invent a
useful concept for their short-sighted programmers to use.
imagine that you want a _list_ ADT that (1) can be read and written as
lists would ordinarily be written, (2) can hold any kind of object
without modifying the objects in the list, (3) can be taken apart into a
head and a tail in constant time, and (4) could be extended with a new
object in constant time. as a general rule it takes a real genius to
implement something so that it looks simple, in contrast to the dumb
people who implement something the simple way, so it becomes complex and
generally useless to everybody else. in our case, John McCarthy is
probably to credit for the historic invention of the only slightly
counter-intuitive implementation of the simple "list" concept (ADT).
instead of parsing and printing lists in a special syntax that did not
itself produce anything useful, such as C's {a, b, ..., z} syntax, useful
only in initializer lists to the compiler, the decision was that lists
should be _something_ in their own right, and that programs should be
able to deal with them internally as well as read and write them. this
is brilliant.
instead of having only one particular type in a list of objects that knew
they were lists (like linked lists with a pointer in the object itself),
it would be real cool to have a list "wrapper" that had at least one
pointer to another object and it would be even cooler if it could point
to any kind of object so the list could be heterogeneous. let's call this
the "payload pointer". of course, the payload pointer and the resulting
heterogeneity introduces complexity to the reader and writer functions
for lists, but fortunately, this complexity is fully contained in the
implementation of the abstract data type. constructing a list would be
equally painless without the programmer having to know the internals.
this is brilliant.
instead of the obvious "hardware" solution using a fixed-size vector of
pointers and the ability to have a pointer to any element of that vector
count as the whole vector, a linked list approach was made in the list
"wrapper". this makes it possible to add elements at the head of the
list in constant time, instead of having to copy the entire vector into a
bigger vector when it outgrew its limits (which you'd have to keep track
of somewhere, since you already use the hardware ability to point to any
slot in the fixed-sized vector). let's use a "link pointer" and put it
in the list "wrapper", too. now, not only do we have a pointer to the
object we want lists of, we have a pointer to another list "wrapper" in a
linked list, but we said that the payload pointer would be heterogeneous
and could point to anything, so let's let the payload pointer and the
link pointer be the same type. the payload pointer can now point to a
list "wrapper" and the link pointer can point to data. this way, we have
a general object that has two general pointers and can be used for more
than just lists, although lists have especially nice read and write
functions and other accessors. this is brilliant design!
finally, let's recognize the fact that there is only one empty list, but
instead of having an anchor at the end of every chain, let's let the
empty list be represented by a special list "wrapper" that pointed to
itself with both the payload and link pointer. again, this adds some
complexity to the internals, but we find that this complexity actually
reduces overall system complexity a _lot_. (of course, Scheme blew this
design issue big time and forces you to treat the empty list as an anchor
in _all_ cases, not just where it acts as an anchor.)
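A rough Python sketch of that self-pointing empty list (the names are mine; this models the Lisp behaviour where car and cdr of the empty list are the empty list, as opposed to Scheme's anchor-only treatment):

```python
class Cell:
    """A list "wrapper" with a payload pointer (car) and a link pointer (cdr)."""
    def __init__(self, car=None, cdr=None):
        self.car = car
        self.cdr = cdr

# The empty list is one special cell whose pointers both point to itself.
NIL = Cell()
NIL.car = NIL
NIL.cdr = NIL

def car(x): return x.car
def cdr(x): return x.cdr

# Walking off the end of a list keeps yielding NIL instead of an error,
# which is the overall complexity reduction claimed above.
lst = Cell(1, Cell(2, NIL))  # the list (1 2)
```
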
you see, this list "wrapper" is a profoundly general mechanism.
now, for historical reasons, the list "wrapper" object is called a "cons
cell", and the payload pointer is known as the car of the cons cell, the
link pointer the cdr of the cons cell. this comes from the historic fact
that lists were implemented early on in machine registers that were twice
as large as pointers, quite contrary to modern machines. the left half
of this register was the payload pointer, and the right half was the link
pointer. now, this kind of pointer magic had a special meaning to the
hardware on which it was implemented. unlike today's braindamaged CPU's,
pointers into hardware pointer vectors had a _size_ attached to them, so
you could tell when you had reached the end of the list. this was used
to implement the push-down list (PDL), better known as a "stack". when
you pushed an object on a PDL, the PDL pointer itself would be one closer
to the end of the list, so the size would be decremented, and its payload
pointer would be incremented. (stacks that grow downward are a newer
invention.) in effect, the kind of register that was later reused for
the list "wrapper" had an _address_ part and a _decrement_ part.
on this ancient hardware, it was useful to extract the left half and the
right half into other machine registers. the Contents of the Address part
of the Register was obtained with the "car" instruction. likewise, the
Contents of the Decrement part of the Register was obtained with the "cdr"
instruction. since
we were implementing an abstract data type that users would see as lists,
it didn't much matter what they were named, but we can probably be glad
that it was something that simple (they can be combined like "cddadr",
which can be appreciated only when dealing with a complexity that most
newbies take years to discover), not something like HLRZ and HRRZ, as
which they were actually implemented on another CPU (much more beautiful
than the IBM, but direct followups on that topic to alt.sys.pdp10 and/or
alt.folklore.computers :).
however they were named and initially implemented, it didn't much matter
that it could be optimally implemented on one computer or had to be
represented as a pair of full machine-word pointers on other computers.
the concept of a cons as a pair of pointers prevailed and the "car" and
"cdr" instructions became general accessors into the "cons". of course,
the Scheme folks threw the baby out with the bathwater in their incessant
desire for cleanliness and forgot the history, so now call it "pair?"
although they have yet to understand the usefulness of operators like
`first', `second', ..., `tenth' for individual elements and `rest' for
the tail and stick to `car' and `cdr' always.
in Common Lisp, the "list" abstraction is taken one step further, and we
have the `list' and `list*' operators to construct lists of elements or
elements and another list (tail), respectively. `first' and `rest' refer
to the head and the tail of the list abstraction. `cons' is still around
because the cons cell is so much more general than just lists. `car' and
`cdr' are generally used to access the pointers in the cons cell as such,
not as implementation parts of the list abstraction. this is useful at a
low level and when dealing with trees or complex list structures.
| All in all, the pairs notion is redundant.
I hope you understand and appreciate what I have written above so the
following does not apply to you anymore. you see, I wrote it all because
I _really_ wanted to say that that sentence is the single most ignorant
and shallow-minded line that I have ever seen in any programming language
newsgroup or forum and I hope _never_ to see anybody display such utter
disregard for the brilliance and genius of people who came before them
just because they grew up in an age when "simple interface" is sneered
and laughed at in favor of "simple implementation" so any dumb fsck can
implement it right out of a "for Dummies" book. the "list" concept in
Lisp is easy to use but actually quite hard to implement correctly. it
is in the language because it is hard to implement and because the dumb
fscks who implement it on their own screw up so monumentally. (just look
at what C programmers do when they need lists! *shudder*)
thank you and have a nice new year.
#:Erik
--
The year "98" was new 1900 years ago. | Help fight MULE in GNU Emacs 20!
Be year 2000 compliant, write "1998"! | http://sourcery.naggum.no/emacs/
Erik, will you /please/ use a consistent posting address? You're
slipping through my killfiles.
Thank you. Have a nice new year.
Well... in scheme, remember that iteration is expressed by the syntax
of recursion. Thus, when iterating across a list, we usually need to
be able to get the head of the list (the thing we're accessing now)
and the rest of the list (to pass to the next iteration of processing).
Cons cells provide a very handy way to do that, without incurring
any overhead for shifting and re-sizing lists as we make our recursive
calls. Because they are small, cons cells don't usually have to have
mutators applied to them (which makes a number of things handier in
scheme) from the time they are allocated to the time they are destroyed.
> As I understand it, LISTs are built up using the pairs primitive CONS
> in a unique nested fashion, like
>
> (list 1 2 3) == (cons 1 (cons 2 (cons 3 nil)))
You understand correctly.
> Why can't we simply have LIST as primitive? It seems to me that
> basing list on pairs is ad hoc and creates ugliness. For example,
> the length of (cons (list 1 2) (list 3 4)) is actually 3, not
> the more intuitive 2. The existence of car, cdr, and variants
> cadr, caar...etc. is really ugly.
'((1 2) 3 4), which is the value your expression evaluates to,
does in fact look like something of length 3 to me.
(list (list 1 2) 3 4), which is how most scheme programmers would
have written the expression, looks like an expression that *ought*
to evaluate to something with a length of three.
Car and cdr are unfortunate names based on ancient hardware --
I'd prefer to call them 'first' and 'rest' -- but the concepts
they stand for are very solid. What they are is the smallest
possible building blocks for arbitrarily complex structured
data.
Because they are uniform in size, they are easier to
garbage-collect, and can be allocated in blocks for greater
efficiency in memory use, and can be re-used once the garbage
collector has reaped them rather than deallocated and reallocated,
which is another efficiency win. Because they are small, they
rarely need to be mutated -- which contributes greatly to the
efficiency of most garbage collection algorithms by keeping to
a minimum the number of pointers that point at things allocated
more recently than the pointers themselves.
Caddr, cddadr, etc, are merely syntactic sugar -- if you don't
like them, you don't need to use them. They are handy for some
things though -- consider caar and cadr as accessors into a list
of dotted pairs for example.
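For instance (a sketch with Python tuples standing in for conses; the helper names mirror the Lisp accessors): caar and cadr give natural access into a list of dotted pairs, i.e. an association list.

```python
NIL = None
def cons(a, d): return (a, d)
def car(p): return p[0]
def cdr(p): return p[1]
def caar(p): return car(car(p))  # (car (car p)): key of the first pair
def cdar(p): return cdr(car(p))  # (cdr (car p)): value of the first pair
def cadr(p): return car(cdr(p))  # (car (cdr p)): second element of a list

# The association list ((a . 1) (b . 2)):
alist = cons(cons('a', 1), cons(cons('b', 2), NIL))
```
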
> All in all, the pairs notion is redundant. If user wants pairs,
> she can simply do (list 1 2), or nested instance thereof. The
> car and cdr can be supplanted by first, rest, last, list-ref etc.
The pairs notion is extremely useful. Can you develop a data
structure that has the same "nice" properties in terms of
constant-time accessors for first element and rest, easy
garbage collection, re-usability of cells, etc, that isn't
based on pairs? I can't, I don't think. I'd have to think
about it very hard for an inordinate length of time, anyway.
Scheme does have a "vector" primitive data type, which is a
very flexible implementation of heterogeneous arrays. I use it
when I need random-access to sets of elements, but it is nowhere
near as flexible, general, and clean as the pairs-based notion
of lists.
Bear
---
It takes a long time to understand nothing.
-- Edward Dahlberg
> Erik, will you /please/ use a consistent posting address? You're
> slipping through my killfiles.
[grin] I suspect there's some great reason why Erik posts comments
like <30928371...@naggum.no> that appear to be flamebait, like a
Zen master hitting a monk who is meditating. (See below.) If he did
not post provocative statements, we might be in danger of falling
asleep. A little "stimulating" should encourage us to think.
Erik is, after all, explaining a rare concept in computer languages.
I fully agree with his comments about ADTs, even if I am more
comfortable than he is with strong type checking. The real question is
whether a type system enables or disables you, and to what degree.
So, we can find examples that severely disable the programmer, like
C++, and those that powerfully enable the programmer. I'd place all
forms of Lisp in the latter group of languages.
I'd add late binding to the ADT issue. It's possible to use late
binding via functions to achieve the same ADT effect in a language
like C, but as it's so tedious to do so (see Erik's comments about C's
initialisers), few C programmers bother. There's more code to write,
more effort to visualise what the code does, more "paperwork" to
manage the code. In other words, the C programmer has to do more work
that can be done much easier and more reliably by the computer.
This is, I think, one of the real strengths of the ADT approach.
It can easily be exploited in many languages, but some will make it
easier than others. Erik is saying that this is easier in Common Lisp
than in Scheme, because the empty list can only be represented by '().
To Erik, this is a bug. To others, it may be a feature. This could be
why Erik uses CL and others use Scheme. I'm reminded of the (bitter?)
semicolon issue that divides C and Pascal programmers. Byte order is
another example. People don't call that last one "Endian" for nothing.
So, in this thread, we could view Erik as a Zen master and Xah as his
student. Perhaps if Erik were to crosspost whilst stating that there's
only One True Programming Language, there could be some cause for
offence. Most of us prefer the diplomatic approach, adding qualifiers
like "IMHO". Perhaps a Zen master has less reason to be humble?
> Thank you. Have a nice new year.
And a Happy New Year to you, too.
--
Please note: my email address is munged; You can never browse enough
"Oh knackers!" - Mark Radcliffe
Most people have misinterpreted my question. However, the replies are informative. Thanks.
Perhaps I can make my original question clearer by using an analogy.
The essential controls of a car are the wheel, the accelerator, and the brake. A stick-shift gear may provide more efficient control, but it is not essential. (conceptually redundant)
Similarly, a list (abstractly) is simply an ordered sequence of things. A tree is simply a list of lists. A pair notion (like Scheme's cons) is perhaps good in a language for implementation/efficiency reasons, but it is a redundant notion conceptually.
Here's my original question rephrased: Why did the designers of Scheme -- a language that emphasizes conceptual elegance -- include the pairs notion (cons) in the language?
Partial answers can be inferred from the replies: Because lists are efficiently implemented using linked pairs (or so), the Scheme designers decided (intentionally or not) to make these pairs explicit as a language primitive. (thus we have cons, car, cdr.)
By having implementation/efficiency issues showing up in the language, Scheme apparently is less a higher-order (or conceptually elegant) language than it otherwise would be. Similarly, the vector notion/primitive in Scheme is also conceptually redundant. A list will do. (again, in the name of efficiency...)
Now having understood some background of cons, car, and cdr, my next question is: Is there any good reason to actually have a pairs primitive in a (higher-order) language? The answer is apparently no, as evidenced by a few private emails I got. (ML apparently doesn't have the pairs notion, nor does Mathematica -- a list is simply a list, no fuss attached. (purists don't drive a car with stick shifts (^_^)) Modern compiler/interpreter optimization technology should be advanced enough to generate the linked pairs internally from however-shaped trees.
All comments and inspirations are welcome. Language flames by email, please. Thanks.
PS: The namings of cons, car, and cdr are apparently historical (as explained in SICP p.85).
Xah, x...@best.com
http://www.best.com/~xah/PageTwo_dir/MathPrograms_dir/mathPrograms.html
"New Programing Paradigm: Parallel execution of bugs and C/C++ programers."
Those names are only appropriate when conses are being used to construct
lists. For a simple pair, there's no inherent reason why the car should be
considered to be "first" (except that it comes first when reading the
printed representation from left to right), and even less reason why the
cdr should be considered "rest".
Perhaps "left" and "right" would be more appropriate, based on the printed
representation. I like to believe that the old names have stuck around
precisely *because* they're meaningless. Conses are completely abstract,
so it's appropriate that the names for the two cells imply no bias.
--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Cambridge, MA
Support the anti-spam movement; see <http://www.cauce.org/>
Please don't send technical questions directly to me, post them to newsgroups.
>Similarly, a list (abstractly) is simply an ordered sequence of things. A
>tree is simply a list of lists. A pair notion (like Scheme's cons) is
>perhaps good in a language for implementation/efficiency reasons, but it
>is a redundant notion conceptually.
How is a tree "simply a list of lists"? A binary tree is a pair of
subtrees. The cons is a perfect data structure for representing a binary
tree (well, not quite -- you often need to have data at the nodes, not just
the leaves).
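To illustrate (a sketch, with Python 2-tuples in place of conses): a binary tree whose interior nodes are pairs of subtrees and whose leaves carry the data.

```python
def tree_sum(tree):
    """Sum the leaves of a pair-built binary tree by recursion."""
    if isinstance(tree, tuple):   # an interior node: a pair of subtrees
        left, right = tree
        return tree_sum(left) + tree_sum(right)
    return tree                   # a leaf: the datum itself

# The tree ((1 . 2) . (3 . (4 . 5))), built from nothing but pairs:
tree = ((1, 2), (3, (4, 5)))
```
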
>Here's my original question rephrased: Why did the designers of Scheme --
>a language that emphasize conceptual elegance -- included the pairs notion
>(cons) in the language?
The philosophy of Scheme is to provide basic building blocks from which
everything else can be derived. Conses are more primitive and conceptually
elegant than lists -- you can construct arbitrary data structures from
them. Sure, you can also do this with lists, but then you have that extra
link to nil that's frequently being wasted.
>In comp.lang.lisp Martin Rodgers
><mcr@this_email_address_intentionally_left_crap_wildcard.demon.co.uk>
>wrote:
>
>: Erik is, after all, explaining a rare concept in computer languages.
>: I fully agree with his comments about ADTs, even if I am more
> ^^^
>
>Please don't use abbreviations without first defining them.
Ditto. OABTW, you DTS. SBL. IMHO, UFUBAR. But, FYI, ICBW.
<translation>
I agree. Oh, and by the way, you did the same. See below. In my
humble opinion, you fouled up beyond all recognition. But, for your
information, I could be wrong.
</translation>
P.S. What does TIA mean?
P.P.S. ADT = Abstract Data Type
>TIA
>
>Larry
>--
>-- Larry Troxler -- l...@westnet.com -- Patterson, NY USA --
>
B.B. --I!G
Cross-posted conversations are aggravating and I wish not to
contribute to such aggravation. An explanation of my issues
with this practice can be found in
http://world.std.com/~pitman/pfaq/cross-posting.html
I wish that you would stop cross-posting your questions because it
excludes me and others like me from discussion.
Also, I suspect your asking both forums is based on a blurry notion
that the two languages are the same language, brought to you by the
same people, interested in the same issues, etc. As one of the few
people who participates in design meetings for both languages, I can
tell you there is almost no cultural or design overlap between Scheme
and Lisp and any similarity between the two is largely historical in
nature. The forces that drive the design communities and the
processes of review undergone are wholly different. The user
communities are different. The two design communities do not even
agree on what is aesthetic and what is not. The two design
communities do not agree on the same set of priorities (that is,
aesthetics is a dominating criterion in the Scheme community and
explicitly is NOT a dominating criterion in the Common Lisp
community). When you get two communities that are out for entirely
different things and you ask them to talk together as if they were one
without acknowledging the huge gap between them, you just end up
spinning a lot of wheels with no useful outcome in sight. Consensus
is only ever reached within one of these camps, never across them.
They are opposing political parties. It's like asking the Republicans
and the Democrats to have a common debate on what platform to have for
abortion, taxation, or some other hot-button issue that has created
the divide between the two organizations in the first place.
Look around the world. The most heated of wars is fought between the
closest of cousins--the north and south of this or that once-unified
country, protestants and catholics, arabs and jews, ... communities
that to the outsider, not familiar with the subtle history, might
easily be mistaken for one another because in so many ways those
opposing each other are so similar--but superficial similarities must
not be assumed to dictate agreement. They can hide very complex
differences. And the solution is never to say one is right and one is
wrong, and sometimes the solution isn't even as friendly as inviting
everyone over to the same dinner. The solution is the understanding
that the walls are sometimes there for a reason, and that behind each
of our walls we can enjoy the comfort of knowing that we (for any
value of "we") are right and that others (while "obviously wrong") are
entitled to disagree. Tolerance. It's fine for Scheme people to come
to Lisp and as students of The Lisp Way to learn it for what it is in
context. It's fine for Lisp people to come to Scheme and as students
of The Scheme Way to learn it for what it is in context. But it's a
disaster, in my opinion, to just randomly assert that this or that
feature ripped wholesale from Scheme would work if only jammed into
Lisp, or vice versa. Sort of like if I ripped out my heart and handed
it to you and said "use it. it worked for me." It's part of a live,
complex system and it works because the parts of the system have been
designed as a complete whole. You can discuss human physiology or
klingon physiology, but don't assume that just having studied
physiology in general you're ready to go meddling in anyone's
biosystem.
If you are truly not sure which community you belong to, I suggest
you ask your questions separately of each list and allow the dialogs
to grow separately in each of the communities. You may find the
differences in the way the communities approach the same question
to be instructive.
All of this is just my personal opinion, and is not the official
position of any organization I'm affiliated with.
--Kent
: Erik is, after all, explaining a rare concept in computer languages.
: I fully agree with his comments about ADTs, even if I am more
^^^
Please don't use abbreviations without first defining them.
TIA
> Please don't use abbreviations without first defining them.
Didn't Erik define ADT? Abstract Data Type.
> TIA
Now I know this one. ;) Thanks In Advance?
Martin Rodgers wrote in message ...
>I'm reminded of the (bitter?)
>semicolon issue that divides C and Pascal programmers.
I believe that it only divides *Pascal* programmers.
> I believe that it only divides *Pascal* programmers.
I wonder why I've seen C programmers involved in "semicolon" wars. ;)
I think that you need _two_ points of view for such disagreements.
An excellent reason for not crossposting between comp.lang.pascal and
comp.lang.c, or even comp.lang.lisp and comp.lang.scheme.
Perhaps a better phrasing would have been,
"the (bitter?) semicolon issue which is a division between
C programmers and Pascal programmers."
But either way, it has nothing to do with scheme. We could,
perhaps, spend our time better speaking of "the (bitter?)
separate namespace issue which is a division between Scheme
programmers and Common Lisp programmers."
Bear
> I (Xah) wrote: >From a language design point of view, why do we have
> the pair notion in Scheme?
Previous posters have described how this could be explained by reasons
rooted in implementation issues or history. Erik has written about
why cons should be preferred instead of a list primitive, mentioning
abstract data types (ADTs). I'd like to comment on this second issue:
If you look at modern functional languages, e.g. ML, you invariably
find pairs, or tuples for that matter, built into the type system.
Additionally you find sum types. Using a sum type of product types
you can then design your own lists in an obvious way, i.e. a list is
either an empty list or a pair of an element and another list (see,
e.g., [1: chapters 2 and 3]). Once you play around with these ideas
you find that it is much more appropriate to see lists as a datatype
specified using pairs than the other way round. At least this is how
it occurs to me. Note: I know that these languages all include
built-in support for (basic) lists for various reasons, although to my
knowledge the difference is not visible, i.e. lists behave as if they
were defined as sums of products. (Am I right here?)
Now if you look at how one would (formally) specify the interface to
lists you will find the simple spec along the lines: Given a list l,
MAKE-LIST(a, l) is a list and FIRST of it is a and REST of it is l.
MAKE-EMPTY-LIST is a list different from all lists made by MAKE-LIST.
Nothing else is a list. Everything else can be defined with these
primitives by recursion. Also all properties of lists can be stated
and proved by induction over these terms within a well established
theory (see e.g. [2]). Anyway, you take products (more specifically,
pairs) for granted for defining lists.
What you cannot yet do is write something like a1, ..., an for a list
(provided you do not have a device for handling a variable number of
arguments, in which case you could design a list-creating function).
But then, strictly speaking, you do not need more than a means of
abbreviating MAKE-LIST(a1, ... MAKE-LIST(an, MAKE-EMPTY-LIST)...) by
something like [a1, ..., an] and you are done. This is a syntactic
thing and we do not have to be concerned with it here.
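That specification can be transcribed almost literally (a sketch; Python tuples play the role of the product type, and the names come from the spec above):

```python
EMPTY = None                        # MAKE-EMPTY-LIST

def make_list(a, l):                # MAKE-LIST(a, l)
    return (a, l)

def first(l):                       # FIRST(MAKE-LIST(a, l)) == a
    return l[0]

def rest(l):                        # REST(MAKE-LIST(a, l)) == l
    return l[1]

def length(l):
    """Defined by recursion over the two constructors, like everything else."""
    return 0 if l is EMPTY else 1 + length(rest(l))

def brackets(*items):
    """The [a1, ..., an] abbreviation: sugar for nested MAKE-LIST calls."""
    l = EMPTY
    for a in reversed(items):
        l = make_list(a, l)
    return l
```
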
I have tried to argue informally so far that the notion of products
(`pairs') is used in the definition of lists in functional languages,
even when pairs and lists are distinct. If I am correct here then the
original question boils down to the following: Is it a good or a bad
thing for the structure of a lisp-list, i.e. a list being a cons cell
with the cdr containing a list, to be visible to the programmer. If
memory serves me right there is a discussion on this topic (not
related to lisp however) in [1], too.
My own opinions and experiences:
1) It seems to fit best with the overall `feel' of lisp to make this
structure visible and let the user decide whether he wants to use the
information or not. (I personally use first and rest on lists, and
car and cdr on pairs, so my lists are not pairs ;-)
2) However it annoys me every time I make a stupid mistake (like
building up a dotted list instead of a proper list) that is not
trapped by lisp (if I call rest on a dotted pair I made an error, but
lisp won't complain) but would have been illegal in, say, ML. So,
contrary to what Erik said for himself, I would profit in this respect
if a list was not at the same time a pair.
3) Also I find that I do not look at the same object as a list and as
a cons cell too often, except for ad hoc hacks. But this surely
depends on the domain one is working in and will differ for other
users. As an example, walking terms represented as lists, whose rich
structure does not matter at the moment, can be much easier if you
just handle non-conses and recurse on the cars and cdrs of your
conses instead of explicitly walking the rich structure.
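The checked traversal that point 2 wishes for is easy to sketch in Scheme. (The name `checked-rest' is invented for illustration; `error' is not in R5RS but is widely available.)

```scheme
;; A `rest' that complains, ML-style, when it walks off the end of a
;; proper list into a dotted pair.
(define (checked-rest l)
  (if (and (pair? l)
           (or (pair? (cdr l)) (null? (cdr l))))
      (cdr l)
      (error "checked-rest: not a proper list" l)))

(checked-rest '(1 2 3))    ; => (2 3)
;; (checked-rest '(1 . 2)) ; signals an error instead of returning 2
```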
[1] B J MacLennan: Functional Programming, Addison Wesley, 1990
[2] J Loeckx, H-D Ehrich, M Wolf: Specification of Abstract Data
Types, Wiley/Teubner, 1996
Does this make sense? -- Axel
--
=== Axel Schairer, http://www.dfki.de/vse/staff/schairer/ ===
ah, but this is a valid point I had not considered. I tend to use `endp'
when traversing lists, so I do get very loud complaints. if you check for
the end of the list with any other test, such as identity with nil, I can
see that you would get problems, but I imagine that `endp' would solve them
in practice if not (completely satisfactorily) in theory.
< SNIP discussion of pairs deleted SNIP >
> ML apparently doesn't have the pairs notion, nor does Mathematica -- a list is simply a list, no fuss attached.
:: in ML is more restricted than Scheme's cons function. Restricted to
list structures, cons behaves just like :: does. car behaves like hd,
and cdr like tl.
Standard ML of New Jersey, Version 0.93, February 15, 1993
val it = () : unit
- val foo = [1,2,3];
val foo = [1,2,3] : int list
- hd foo;
val it = 1 : int
- tl foo;
val it = [2,3] : int list
- val bar = 0::foo;
val bar = [0,1,2,3] : int list
Chez Scheme Version 5.0a
Copyright (c) 1994 Cadence Research Systems
> (define foo '(1 2 3))
> (car foo)
1
> (cdr foo)
(2 3)
> (define bar (cons '0 foo))
> bar
(0 1 2 3)
Because Scheme is weakly typed, cons cells are very flexible. There
is a tradeoff between Scheme's flexibility and the relative security
offered by ML's polymorphic type system. In Scheme cons cells are
used to build all sorts of recursive data structures. In ML you must
declare different data constructors to build different data structures.
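As a sketch of that flexibility, the same cons cells can carry, say, a binary tree, with no datatype declaration in sight (where ML would need something like `datatype tree = Leaf of int | Node of tree * tree`). The names `node', `leaf?' and `tree-sum' are invented for illustration:

```scheme
;; Interior nodes are pairs; leaves are any non-pair value.
(define (node left right) (cons left right))
(define (leaf? t) (not (pair? t)))

(define (tree-sum t)
  (if (leaf? t)
      t
      (+ (tree-sum (car t)) (tree-sum (cdr t)))))

(tree-sum (node (node 1 2) (node 3 (node 4 5))))  ; => 15
```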
Lists in Prolog are modeled after lists in Lisp/Scheme. The list which
is written as
[1,2,3]
is represented by the term
'.'(1,'.'(2,'.'(3,[])))
where '.' is the principal functor of the term, and [] is an atom
interpreted as the empty list. This is equivalent to the dot notation
in Scheme. The list (1 2 3) is just a convenient shorthand for
(1 . (2 . (3 . ())))
Like Scheme, Prolog is weakly typed, and it affords the programmer
much the same flexibility in building recursive data structures as
Scheme does. Any structure built using cons in Scheme can be represented
as a Prolog '.' term, not just list structures.
I think the main point here is that languages like Scheme have made
different choices than languages like ML. :: in ML cannot be as
flexible as cons in Scheme, because ML is strongly (though
polymorphically) typed. In return, ML offers the programmer a great
deal of security: there can be no runtime type errors.
Carl
--
Carl Alphonce
> But either way, it has nothing to do with scheme. We could,
> perhaps, spend our time better speaking of "the (bitter?)
> separate namespace issue which is a division between Scheme
> programmers and Common Lisp programmers."
Most certainly. I was only pointing out that this kind of division
isn't unique to CL/Scheme. I feel that the C/Pascal semicolon issue is
far less significant than the CL/Scheme namespace issue, yet Scheme
and CL programmers seem, at least to me, to be able to discuss this
issue without the same degree of hostility.
As for the cons question, I think that a programmer who is truly
interested in learning new ideas must sometimes ask questions that
will be "obvious" to more experienced programmers. It may not be
obvious to everyone why Lisp lists should be the way they are, and if
they don't ask, then nobody can explain it to them.
Has this question now been answered?
--
<URL:http://www.wildcard.demon.co.uk/> You can never browse enough
"As you read this: Am I dead yet?" - Rudy Rucker
Please note: my email address is gubbish
Consider a set. A set is a collection of things. It is not necessary to think of a set as a pair, where one part is another pair ...etc. An n-tuple is simply a set of n things. Similarly for n-dimensional vectors. There is no structural difference between a set of 3 real numbers, a 3-tuple, or a 3D vector. We can think of them as a list of 3 things, or a tree of three branches, and write them as List[a,b,c].
A matrix is simply a set of vectors. Conceptually, it is a set of n things, each containing m things. In other words, it is a tree with n branches, each containing m branches. (you may switch n and m) We can write them as, e.g., List[List[a1,a2],List[b1,b2],List[c1,c2]].
Structurally, all the above are simply trees of different shapes. In general, we write List[expr, expr,...] where each expr is a List[] or nested instance thereof.
As computer science advances, programming languages become more advanced and abstract, moving away from references to hardware and implementation details. For example, in Scheme, one can represent arbitrary trees by writing (list expr expr...) where the exprs are themselves lists. The programmer need not know about pointers or memory addresses as in "lower-level" languages like C.
In the path of abstraction, it is a conceivable goal to have a *programming* language that is more or less a concrete embodiment of algorithm specification, where any reference to hardware or implementation is nonexistent in the language. The programmer is concerned only with writing algorithms, without needing to know computer-science byproducts like string, float, int, double, OR, cons. In this ideal language, it would be a kludge to say that cons (or pairs) is necessary to build up a list, or that there are several ways to represent a tree because of efficiency. Efficiency is an idea in algorithms, but implementation efficiency is not. e.g. Scheme's (vector 1 2 3) vs (list 1 2 3) vs (cons 1 (cons 2 (cons 3 nil))).
Mathematica is a language that represents trees uniformly. All trees are simply List[...]. I came from a Mathematica background, and this is what prompted me to question Scheme's cons. It is conceivable that efficiency/abstraction tradeoffs are unavoidable, in theory, in designing/implementing a programming language. Eric's argument that cons is a *necessary good* as a primitive for list suggests to me the stereotypical argument between C and Lisp programmers. One camp wants control, the other abstraction.
My original question on "why Scheme has the pairs notion" has been answered satisfactorily. All replies that address my question have been very informative. Thanks.
PS this message is posted separately to comp.lang.lisp and comp.lang.scheme in deference to Kent Pitman. I'm neutral on his manifesto on cross-posting. Also, I do not know ML in any way. In Mathematica, there is no cons notion in any form whatsoever.
Xah, x...@best.com
http://www.best.com/~xah/Wallpaper_dir/c0_WallPaper.html
"(Intelligence (lisp (pigs (perl (unix baggage (C (brain tumor)))))))"
> By having implementation/efficiency issues showing up in the
> language, Scheme apparently is less a higher-order (or conceptually
> elegant) language than it would be. Similarly, the vector
> notion/primitive in Scheme is also conceptually redundant. A list
> will do. (again, in the name of efficiency...)
No, this is wrong. What you're missing here is that hiding these
`implementation' issues is hiding issues of computational complexity,
and that is something you do not want to do. There are crucial time
complexity differences between lists and vectors which are reflected
in programs that use them. Specifically the NTH operation is O(1) on
a vector but O(n) on a list, while insertion / deletion at the head is
O(1) for a list but O(n) for a vector.
If you hide those differences then you'll never be able to work out
the complexity of your programs, and you'll lose horribly as a result.
Indeed this kind of lossage is a common problem for Lisp & scheme
beginners, who haven't understood the properties of lists, assume that
NTH is O(1), and implement array-manipulation code in terms of lists
with very slow results. (Here `slow' means `of worse time
complexity'.)
Functional-language people (or someone) will probably argue that you
should just have one type and allow the compiler to figure out the
representation. Perhaps.
(Of course, you could implement conses & therefore lists in terms of
tiny vectors, and just have vectors, which is kind of what actually
happens at the machine level, but they're so useful as a type that
it's worth having them in the language).
--tim
Yes, I think it terminates C programmers.
-dan
well, arrogant ignorance, actually, from which there is both room and
hope for improvement, provided one is willing to listen.
| In the path of abstraction, it is conceivable a goal to have a
| *programming* language that is more or less a concrete embodiment of
| algorithm specification, where any references to hardware or
| implementation is not existent in the language.
I don't think this goal is conceivable, simply because, pragmatically,
anything we want to do on a computer has to be implemented somehow, and
philosophically, even your most abstract notions are implementations of
something else. ergo, implementations are ipso facto unavoidable.
| The programmer is only concerned of writing algorithms without needing to
| know computer science byproducts like string, float, int, double, OR,
| cons.
no, a _programmer_ is concerned with strings, floats, ints, doubles, and
conses because he is _implementing_ an algorithm on a _computer_. that's
what programming is all about. but somehow, it is still permissible for
computer scientists to sneer at their tools. Dijkstra has reportedly
said that "Computer science is as much about computers as Astronomy is
about telescopes", and we can appreciate this in terms of our discipline
being much _more_ than just hardware and implementations, but we cannot
_deny_ the computer any more than astronomers can deny their telescopes.
that a programmer is often tasked with algorithm design is due to a
shortage of algorithm designers who are clueful enough to present works
that take less time for programmers to implement than to do it all over
on their own would, plus the way we work on computers these days, and
finally, how computers tend to be hostile to handwaving and explanations
that don't take the form of a precise program.
| In this ideal language, it would be a kludge to say cons (or pairs) is
| necessary to build up a list, or that there are several ways to represent
| a tree because of efficiency. Efficiency is an idea in algorithm, but
| implementation efficiency is not. e.g. Scheme's (vector 1 2 3) vs (list 1
| 2 3) vs (cons 1 (cons 2 (cons 3 (cons nil)))).
what's ideal depends on what your idea is. ideals cannot be universal.
however, as I alluded to, Scheme as a language blew it in just the areas
where its proponents make the most noise about being superior to every
other language. that Scheme forces you to be fully aware of the cons
cell through the accessors `car' and `cdr' is just an instance of this.
Common Lisp has the constructors `list' and `list*', the accessors
`first', `second', ..., `tenth' for individual elements in a known list
structure, `nth' and `elt' for a variable index, and `rest' for the tail.
a test for the end of a list is available with `endp'. not a `cons',
`car', `cdr' or `nil' in sight!
of course, you _could_ implement these in Scheme, too.
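indeed; a sketch of that cons-free interface in Scheme might look like the following (none of these names are standard Scheme, and `error' is non-standard but widely available):

```scheme
;; A Common Lisp-flavoured list interface: no `car', `cdr', or `nil'
;; visible to the user.
(define (first l)  (car l))
(define (second l) (car (cdr l)))
(define (rest l)   (cdr l))
(define (endp l)
  (cond ((null? l) #t)
        ((pair? l) #f)
        (else (error "endp: not a proper list" l))))

;; Client code then reads abstractly:
(define (my-length l)
  (if (endp l) 0 (+ 1 (my-length (rest l)))))

(my-length (list 'a 'b 'c))  ; => 3
```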
| Mathematica is a language that represents trees uniformly. All trees are
| simply List[...]. I came from a Mathematica background, and this is what
| prompted me to question on Scheme's cons.
_all_ of your complaint is that you can see the internals of the `list'
abstraction in Scheme and Lisp that you cannot in Mathematica? right?
I think I understand how you argue, and all I can say is that we do have
a point of agreement: it's bad design to force programmers to deal with
an implementation issue when it costs _nothing_ to do the Right Thing and
hide behind names that suggest a more abstract view and communicate to
the programmer (who is much more important than the computer) that one
deals with lists, not with cons cells, regardless of what the thing
actually does (because that's implementation details that you don't want
to deal with _needlessly_).
I believe that you would never have observed the implementation if you
had not _had_ to deal with `car' and `cdr', but you drew the wrong
conclusion from the evidence: that an implementation is redundant because
you're used to the interface. suppose that things were implemented
exactly the same way, but you couldn't get at the individual cons cells.
would you still complain? I honestly don't think you would. which leads
me to believe that if you had learned Common Lisp instead of Scheme and
read source code from the best and most modern CL programmers, you would
never have had reason to discover the implementation of lists.
if you are still bothered by the fact that a list is made up of conses,
that must mean that you _are_ concerned with implementation details, yet
you profess very strongly that you aren't, so I detect a contradiction in
your interests. if this is not so, may I ask what your real concern is?
| It is conceivable that efficiency/abstraction tradeoffs are unavoidable,
| in theory, in designing/implementing a programming language.
well, it is not only conceivable, it is obvious to me.
| Eric's argument that cons is a *necessary good* as a primitive for list
| suggests to me the stereotypical argument between C and Lisp programmers.
| One camp wants control, the other abstraction.
if this is really what you got out of it, then you should go back and
read my article again, unless you don't think this, anymore.
abstractions exist in levels, level N being implemented in level N-1.
your only complaint is that you see one more level than you're used to
from Mathematica. this isn't even close to _any_ stereotypical argument
between Lisp and C programmers. C programmers want easy implementation,
frequently at the cost of prohibiting abstraction. Lisp programmers want
implementations that _support_ abstraction.  as I tried to show you,
the implementation of lists with cons cells is foreign to C programmers,
who _much_ prefer fixed-sized vectors of pointers or a "next" pointer
_in_ the particular objects they wish to make lists of. a C programmer
who had invented the cons cell as a struct or a two-element vector of
pointers would probably be laughed at because it's _inefficient_ in the
minds of other C programmers.
| In Mathematica, there is no cons notion in any form what-so-ever.
well, so it would have been redundant in Mathematica. otherwise, they
would have implemented it, right? I criticize your claim as arrogantly
ignorant because you can not, in fact, hold on to needs that Mathematica
solved for its users when you argue that the cons cell is redundant in
_Scheme_ as you are flat out _ignoring_ the needs that Scheme (and other
Lisps) solved. if they didn't need it, it wouldn't be there. you have
to give the language designers this much credit. if not, you are still
arrogant and ignorant, like so many others who find flaws with
People keep saying this, but it is not strictly true. (Rather, it is
true in only a limited and uninteresting sense.)
If you wanted to support insertion at the head of a vector (as, for
example, CLU supports with a primitive addh (add at head) operator),
then the underlying representation could be a block of W words, with a
header pointing to N words beginning at offset o. To insert-at-head,
decrement o. If o is already 0 you have to allocate a new, larger,
block.  If you double the length W every time you insert-at-head
and there's no room, then a series of K insert-at-heads is not (on
average) noticeably worse than inserting at the head of a list. You
incur O(k) costs Log K times, for k being powers of 2, so the total
cost of K insertions will be approx 2K. The cost of consing K times
to the head of a list is of the same complexity (i.e. K conses).
That's worst case, of course; if you start removing entries from the
head, then the vector is much more efficient than lists.
Space overhead? The vector will be at worst 50% empty in the constant
growth scenario, which is equivalent to the wasted space for CDR's in
non-CDR-coded lists.
There *is* a space advantage to CONS cells as a separate
implementation type, if you think that short lists/queues will be
common: avoiding the overhead of the header by having fixed sized
cells. However, this is not so significant (and you can special case
short vectors with special tricks, only going to separate headers for
"large enough" vectors).
To avoid being misunderstood and to avoid provoking flames:
I didn't say anything about the relative values of CONS, LIST, and/or
VECTOR as primitive data types.
The O(1) vs. O(n) insertion argument at the head also applies to the
tail of a list. Using it to argue for CONS vs. Vector in these cases
is a red-herring. Of course, if you implement a priority-queue with
frequent insertions and deletions from the middle of a list, and you
implement it as a sequential vector or list, then the linked list wins
over the naive vector implementation. But you would only do this for
short queues, anyway, before moving on to a totally different data
structure.
> If you wanted to support insertion at the head of a vector (as, for
> example, CLU supports with a primitive addh (add at head) operator),
> then the underlying representation could be a block of W words, with a
> header pointing to N words beginning at offset o. To insert-at-head,
> decrement o. If o is already 0 you have to allocate a new, larger,
> block.  If you double the length W every time you insert-at-head
> and there's no room, then a series of K insert-at-heads is not (on
> average) noticeably worse than inserting at the head of a list. You
> incur O(k) costs Log K times, for k being powers of 2, so the total
> cost of K insertions will be approx 2K. The cost of consing K times
> to the head of a list is of the same complexity (i.e. K conses).
> That's worst case, of course; if you start removing entries from the
> head, then the vector is much more efficient than lists.
This (last bit, sorry to quote the whole para) can't be right, I think.
Removing n entries (one at a time!) has to be O(n), because ... well
because if it isn't you're doing magic somewhere. Or do you mean
efficient as `faster' rather than `less complex'?
But such an implementation obviously has lots of other advantages
anyway like good locality which is a huge win over conses.
Anyway, I should have qualified my original message with `given the
straightforward implementation of things'. I was really trying to
point up the fact that ignoring `implementation details' (ie saying
`everything can be a linked list' as the original poster seemed to be
saying) is something one should not do in general.
--tim
> I (Xah) wrote:
> >From a language design point of view, why do we have the pair notion in Scheme?
> >...
>
> Most people have misinterpreted my question. However, the replies are informative. Thanks.
>
> Perhaps I'd make my original question clearer by using an analogy.
>
> The essential controls of a car are the wheel, the accelerator, and the brake. A stick-shift gear may provide more efficient control, but it is not essential. (conceptually redundant)
>
> Similarly, a list (abstractly) is simply an ordered sequence of things. A tree is simply a list of lists. A pair notion (like Scheme's cons) is perhaps good in a language for implementation/efficiency reasons, but it is a redundant notion conceptually.
>
> Here's my original question rephrased: Why did the designers of Scheme -- a language that emphasizes conceptual elegance -- include the pairs notion (cons) in the language?
>
> Partial answers are inferred from replies: Because lists are efficiently implemented using linked pairs (or so), the Scheme designers decided (intentionally or not) to make these pairs explicit as a language primitive. (thus we have cons, car, cdr.)
>
> By having implementation/efficiency issues showing up in the language, Scheme apparently is less a higher-order (or conceptually elegant) language than it would be. Similarly, the vector notion/primitive in Scheme is also conceptually redundant. A list will do. (again, in the name of efficiency...)
>
> Now having understood some background of cons, car, cdr, my next question is: Is there any good reason to actually have a pairs primitive in a (higher-order) language? The answer is apparently no, as evidenced by a few private emails I got. (ML apparently doesn't have the pairs notion, nor does Mathematica -- a list is simply a list, no fuss attached. (purists don't drive a car with stick shifts (^_^)) Modern compiler/interpreter optimization technology should be advanced enough to generate the linked pairs internally from however-shaped trees.
>
> All comments and inspirations are welcome. Language flames by email, please. Thanks.
>
> PS: The names cons, car, and cdr are apparently historical (as explained in SICP p.85).
>
> Xah, x...@best.com
> http://www.best.com/~xah/PageTwo_dir/MathPrograms_dir/mathPrograms.html
> "New Programing Paradigm: Parallel execution of bugs and C/C++ programers."
The problem is that you are seeing it all backwards, and missing the
point. :-)
You view lists as the fundamental data structure, and pairs as an
accident of their implementation. As such, you are right that pairs
are not really necessary, may seem confusing, and it is an ugly wart
to show through the implementation of lists in the language.
However, to a true Lisp programmer (the origin of the name Lisp
notwithstanding) _pairs_ are the fundamental data structure.  Lists
are just a conventional use of pairs. No more and no less.
All the abstract properties that you care about and think are a
property of lists are really a property of pairs in Lisp. Since lists
are _just_ a convention, Lisp is a little cavalier about them, and
that's why the abstract properties do not really hold.
However, because lists are such a widespread and useful convention,
Lisp dialects (Scheme included) provide lots of operations that
operate on lists, thereby misleading novices into thinking that lists
are the fundamental data structure, and making them wonder about how
clean things are where the "lists are fundamental" and "lists are a
conventional use of pairs" paradigms conflict.
As an analogy, think of the MAXINT or similar values that some
languages/libraries provide. In many programs such a value is used to
represent infinity, but it is not. It is just an often useful
convention. Now, since infinity (a number defined to be larger than
all ordinary numbers) is such a useful convention, you might be
tempted to ask that affine reals or transfinite integers replace the
ordinary numbers of computer arithmetic in order to get the
abstraction right, since otherwise the gaps show sometimes (and can
lead to hard to catch bugs). Most people would probably suggest that
you leave numbers as they are and that if you need a tighter
bulletproof abstraction you could implement it yourself.
The same would hold for lists in Lisp. Pairs are not a wart, but a
very useful ADT in their own right.  So useful and convenient, in fact,
that people use them as glue to build other types (such as lists)
without bothering with a tighter definition or implementation.
I don't know much Mathematica, but remember that lists in ML can only
contain uniformly-typed values (since the language is so-called
statically typed), thus "exposing" the underlying pair structure would
not be as useful as it is in Lisp.
An extreme way of viewing the situation would be that since their type
system prevents them from having true Lisp-style pairs and making
lists a convention based on them, they chose a common subset of the
convention that they could fit in their type system and placed it in
the language in the hope that it would make do. :-)
> comfortable than he is with strong type checking. The real question is
> whether a type system enables or disables you, and to what degree.
It does both. Which happens at any point depends on where you're
coming from and what it is you want to accomplish. This is not one of
those issues that has a single "right" answer and is very context
dependent.
/Jon
>* Michael Greenwald wrote:
>> The cost of consing K times
>> to the head of a list is of the same complexity (i.e. K conses).
>> That's worst case, of course; if you start removing entries from the
>> head, then the vector is much more efficient than lists.
>This (last bit, sorry to quote whole para) can't be right I think.
>Removing n entries (one at a time!) has to be O(n), because ... well
>because if it isn't you're doing magic somewhere. Or do you mean
>efficient as `faster' rather than `less complex'?
I was assuming that copying and storage management costs were
dominant, rather than storing the item and/or
incrementing/decrementing a length field. Given that assumption
(which I think is usually reasonable) then in the case of both
insertions and deletions the dominant term is sub-linear for vectors
but linear for cons/list. To be fair, you're right, I still shouldn't
ignore the small linear term, but given the current (growing)
disparity between cache hits and cache misses it *almost* seems
reasonable to ignore memory references that hit in the cache. Given
the linear term, I guess I switched meaning from "lower complexity
class" to "faster". My apologies for being confusing.
> Consider a set. A set is a collection of things. It is not necessary
> to think of a set as a pair, where one part is another pair
> ...etc.
Yes, but...
> An n-tuple is simply a set of n things. Similarly for
> n-dimensional vectors. There is no structural difference between a
> set of 3 real numbers, a 3-tuple, or a 3D vector.
no. At least not in the typical formal mathematical sense. An
n-tuple is an _ordered_ set. That's the whole point of it. In CS
land, about the only thing that acts sort of like a "set" is an ADT
known as a "bag". Things like lists, arrays, vectors (aka arrays),
etc. have order as part of their (public) semantics.
> A matrix is simply a set of vectors.
Well, no it's not - since order is quite important in the matrix
concept.
> Conceptually, it is a set of n things containing m things.
Taking just the 2D case, it's an ordered n-set of ordered m-sets.
> Mathematica is a language that represents trees uniformly. All trees
> are simply List[...].
Modulo representational equivalences, I don't believe this is correct.
> Eric's argument that cons is a *necessary good* as a primitive for
> list suggests to me the stereotypical argument between C and Lisp
> programmers. One camp wants control, the other abstraction.
I don't believe that was the point. I believe the point was that the
concept of "list" (in particular, the definition indicated) is
extremely fundamental and should be an intrinsic.
/Jon
--
Jon Anthony
Synquiry Technologies, Ltd., Belmont, MA 02178, 617.484.3383
"Nightmares - Ha! The way my life's been going lately,
Who'd notice?" -- Londo Mollari
<list> = null
| (cons <value> <list>)
Scheme provides you the two constructors, null (often written '()) and
cons, and lets you build lists for yourself.
Unfortunately, Scheme also has a confusing notion of `pairs', so that
the second argument to cons can be a non-list. This may have made
sense in some land before time, but clearly once you have generative
structures (aka records), there is no need for this, and cons can be
used with the more restrictive interpretation. I certainly believe it
should be. (We put this into practice in DrScheme and, guess what,
the masses haven't been in revolt.)
The `list' procedure is just a convenient optimization over `cons'.
If you view your data as ADTs corresponding to (free-term
constructive) algebras, then you really want to build all your lists
via cons and null. cadr, etc are, again, just conveniences.
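And once you view the type that way, every list function follows the grammar's two constructors mechanically; a sketch (the name `my-map' is invented, mirroring the standard `map'):

```scheme
;; One clause per constructor of the grammar:
;;   <list> = null | (cons <value> <list>)
(define (my-map f l)
  (cond ((null? l) '())                          ; the null case
        (else (cons (f (car l))                  ; the cons case
                    (my-map f (cdr l))))))

(my-map (lambda (x) (* x x)) (list 1 2 3))  ; => (1 4 9)
```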
'shriram
I view lists as an interesting library type
(possibly built on vectors) but nothing fundamental.
Their central place in Scheme (read syntax, rest-lists,
etc) seems an anachronism.
There are several different operations that might be called "insertion
at the head". The one that cons gives you for pair-based lists is a
non-mutating insertion with shared structure. The result of the
operation is a new list with one more element than the operand. The
operand remains unchanged and is still usable. Subsequent mutations
of either list can affect the other.
You seem to be describing a mutating insertion operator that has a
side-effect of increasing the length of its operand. This is very
different, and often not as useful:
-- It doesn't allow you to do things like this:
(define l (list <elt1> <elt2> ... <eltN>))
(define l1 (cons <new1> l))
(define l2 (cons <new2> l))
(define l3 (cons <new3> l))
...
where each cons runs in O(1) time.
-- It is unsuitable if you are avoiding the use of mutation and
assignments so as to make it much easier (possible) to prove
properties of your program.
-- If you are using mutation, it doesn't give you the shared
structure between two lists that you might be looking for.
-al
> Allright. A List is built up of CONSes. A CONS(short for Construct)
> consists of two pointers. The first is named CAR, the second is named
> CDR. You hold onto a list by the first CONS. [...]
> The function CONS creates a new Construct in memory [...]
What's that you said? Pointers? New Construct in Memory? There are
no pointers in Scheme. Scheme only has values. You must mean C++,
which is down the hallway, second newsgroup on the right.
Even if Scheme had pointers, they would not have names. Variables
have names. Variables are bound to values. But values do not
inherently have names (except in certain implementations for certain
kinds of values ... the kinds of disclaimers you need on c.l.s!).
So, to translate your C++ explanation into Scheme:
A list is built up using CONS. CONS builds a composite value out of
two values. The first value can be accessed by passing the composite
value to the procedure CAR; the second by invoking CDR.
Et cetera.
'shriram
> I think the main point here is that languages like Scheme have made
> different choices than languages like ML. :: in ML cannot be as
> flexible as cons in Scheme, because ML strongly (though
> polymorphically) typed.
This depends on what type model you choose to use for Scheme (as I
said in the email message I sent you).
In return, ML offers the programmer a great
> deal of security: there can be no runtime type errors.
(This should probably go into the FAQ.)
ML programs do have run-time errors. They are called "exceptions".
Just because you rename them does not make them go away. Just because
they are not type errors does not mean they are not real, ugly,
hard-to-debug errors.
If you believe that this points to a real distinction between Scheme
and ML, run a program in MzScheme sometime (an implementation that has
been strongly influenced by ML). In MzScheme, I can wrap every
program I run in
(with-handlers
    ((exn? (lambda (exn)
             (display "An exception was raised!")
             (display "But I'm going to ignore it!")
             (display "La-di-da!"))))
  <program>)
and I would never see a run-time error either. As for _type_ errors
at run-time, it all depends (again) on what type system you choose to
use. (Machine code has no type system, so there are no type errors,
right? (-:)
'shriram
PS: That said, as I've said before in this forum, I think the single
best thing Scheme programmers should do is learn ML. There is a
kernel of truth behind what Alphonce says, and too many Lispers
and Schemers never get it.
Erik Naggum wrote:
> _all_ of your complaint is that you can see the internals of the `list'
> abstraction in Scheme and Lisp that you cannot in Mathematica? right?
Yes, on the whole. (but perhaps not in the derogatory and slightly mutated way you phrased it)
I don't know much computer science, but my question isn't thoughtless. If your sensitivity had not caused you to misinterpret it and immediately, severely flame me and other groups of people, flamboyantly and insincerely, this large thread verging on a senseless flame war would not have started.
I wasn't complaining, nor was I arrogant. Comparing our posts in this thread, I think it's fair to say you are the one complaining and arrogant. Yes, I was ignorant about lisp, but so? In my first post, I wrote "Please illuminate a Scheme newbie. Thanks". (and please no cheap rebuttals. In all honesty, my first post is in the form of asking a question.)
Some people have indicated the incorrectness of the paragraphs I wrote in my last post on the sameness of sets, n-tuple, vectors, matrixes, and trees. I know their differences. I was just trying to illustrate how they are uniformly represented in Mathematica. In my second post in this thread I briefed described them as ordered lists and trees but apparantly the word `list' too much reminds people of `list' in CL or Scheme with associated implementation connotations.
It is apparant to me that no one so far indicated a familarity with Mathematica. Otherwise, the context of my question would be recognized clearly without doubt. The language of Mathematica is **close** to the ideal language I was talking about, where computer *engineering* concerns like byte, cons, int, float...etc. are as absent as possible in the language, and where a computer *science* oriented programer can concentrate on works in algorithms, AI, computational geometry...etc. without even knowing what is a register or caddadadar! (^_^) (now, is there a limitation to the caadadadddaadr length?)
In Erik's last message arguing point-for-point style to my message, many points are really pointless, philosiphical, and cheap shots. In summary, I think he is too deep into computer engineering than computer science, and too enamored with perhaps CL. Arrogance is usually proportional to one's (tech) prowess. Well, that's ok with me as long as you don't burn my head with real flame.
At least we are now all acquaintances. Good way to start a new year.
Xah, x...@best.com
http://www.best.com/~xah/
Cons cells are more fundamental than lists.
>language? The answer is apparently no, as evidenced by the few private emails
>I got. (ML apparently doesn't have the pairs notion,
SML most certainly DOES have the equivalent of a cons cell: the `::`
constructor. Well, they improved on it by making it immutable. Lists in
SML are merely chains of "cons" cells terminated by a nil (e.g. []), same
as in Scheme.
Sam
--
Necessity is the Mother of Improvisation. | Samuel S. Paik
Laziness and Greed are the Parents of Innovation| pa...@webnexus.com
A Rebel against the Small World Order | Speak only for self
In article <6dd8i6o...@cobalt.transmeta.com> g...@cobalt.transmeta.com
(Guillermo 'Bill' J. Rozas) writes:
>
> However, to a true Lisp programmer (the origin of the name Lisp
> nonwithstanding) _pairs_ are the fundamental data structure. Lists
> are just a conventional use of pairs. No more and no less.
>
> All the abstract properties that you care about and think are a
> property of lists are really a property of pairs in Lisp. Since lists
> are _just_ a convention, Lisp is a little cavalier about them, and
> that's why the abstract properties do not really hold.
>
> However, because lists are such a widespread and useful convention,
> Lisp dialects (Scheme included) provide lots of operations that
> operate on lists, thereby misleading novices into thinking that lists
> are the fundamental data structure, and making them wonder how
> clean things are in the gaps where the "lists are fundamental" and
> "lists are a conventional use of pairs" paradigms conflict.
>
Your points are well taken, but let's not go so far as to imply to novices
that lists are just a convention in Lisp. One of the first things that
novices learn (or should learn) is that lists are composed of conses or
derive from s-expressions, but beyond that lists are the far more prevalent
abstraction. Modern Lisp programs are data made up of nested lists and
atoms. Lisp macros build list structure that is subsequently evaluated.
Again, lists are a very important and fundamental notation and abstraction.
Any novice can prove this to themselves by taking a Lisp program and
rewriting it in dot notation form. Also note that most modern texts on Lisp
will avoid the use of the term *cons* as much as possible in favor of
*list* or *dotted list* whenever it makes sense. This is beyond just a
convention. In the fundamental development of the Lisp data type hierarchy
cons is a primitive data type, but the list is the crowning achievement.
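To make the cons/list relationship concrete, here is a sketch at the REPL of an R4RS-style Scheme (the `=>` comments show the values I'd expect):

```scheme
;; A proper list is a chain of pairs whose last cdr is '().
(define xs (cons 1 (cons 2 (cons 3 '()))))  ; same shape as (list 1 2 3)
(list? xs)                                  ; => #t

;; A lone pair is not a list; the chain ends at 2, not at '().
(define p (cons 1 2))
(pair? p)                                   ; => #t
(list? p)                                   ; => #f

;; The "surprising" length from the original question: cons
;; *prepends* (1 2) onto the two-element list (3 4).
(cons (list 1 2) (list 3 4))                ; => ((1 2) 3 4)
(length (cons (list 1 2) (list 3 4)))       ; => 3
```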
--
William P. Vrotney - vro...@netcom.com
> It is apparent to me that no one so far has indicated a familiarity with
> Mathematica. Otherwise, the context of my question would be
> recognized clearly without doubt. The language of Mathematica is
> **close** to the ideal language I was talking about, where computer
[...]
Then why are you asking Lispers if your question is in the context of
Mathematica? May I suggest the Mathematica newsgroup? Many Mma users do
know Lisp, and maybe they can help you understand Lisp from an Mma orientation.
regards,
J.
[...]
Then again--it's like any other trade-off you can make--you're
trading design time for execution time. Often the execution time
doesn't matter and the design time does and it's a good trade.
But you never get anything for free--the Three Laws of
Thermodynamics always apply (1. "You can't win", 2. "You can't break
even", and 3. "You can't get out of the game" ... or, as I loosely
summarize them sometimes: "everything is a trade-off, and often
an imperfect one" ... or, as they sometimes say to gamblers,
"If you don't know who the sucker is in the game, it's you.")
I liked Erik's telescape analogy, too.
One reason I think this discussion belongs separate in CL and
Scheme is that CL people mostly don't moralize over these things.
I think CL has lists as conses because historically they've been
there, for better or worse, and these days we try to maintain
linguistic stability. To change something in CL, you'd have to
demonstrate not only that you had a good idea, but that it was worth
breaking all existing code and textbooks to install your good idea.
I think it just wouldn't happen, and I think it's wrong to say
that CONS/LIST as it exists today is there for technical reasons.
It is not. It's there because it's "useful", because people are
familiar with it, because code uses it, it's part of a rich
tradition, etc. It's no longer retained because of any attempt
on the part of the designers to be smarter than the community;
rather, it's not removed because we've learned that little is gained
and much is lost by trying to assume that just because an idea
appears technically superior on paper, it will work out that
way in practice if we force it on a community that has not
asked for it.
Common Lisp itself arose historically because in the days of
Maclisp (a PDP10 Lisp, long predating the Macintosh) we the
Lisp programmers used to find mail every Monday morning from the
Lisp implementors saying "the language has changed in the following
ways: [...]. Please update your programs" and we (the users)
just got tired of that and said "it's time for the language
to be stable for a while. It's more important that the sentences
we say continue to have meaning over time than that those sentences
be written in the perfect language." If it were really true that
the language was more important than the programs written in it,
that would say very little about the value of the language, since
it would say no serious work was being invested in it.
And while it may seem "dirty" and "inelegant" to some communities
to have "compatibility" emphasized as a dominant criterion over
"aesthetics", I stand with pride beside such a language. It doesn't
mean we never used aesthetics--there are many things that are
aesthetic in CL. But like any successful language (pick your
favorite natural language) it has its odd little places where things
don't line up. It doesn't mean you can't be precise when you want
to be. It doesn't mean you can't write great books or poems in it.
It just means that the writing of poetry is not all that is important
to the Lisp community.
For the most part, I would say very few things in Lisp are there
because of a technical reason. A lot are happenstance. But life
is like that. And I'd rather see life in the language than a
language which is a promise for a future that is never realized.
And, frankly, I think it's a personal choice issue whether the "pun"
of lists and conses being shared is elegant or awful. Some people
really like it, some don't. One reason for this is that the human
mind is one of those possible processors and I think is often
overlooked. The reason I mention it is that it seems clear from a
variety of empirical evidence (e.g. how we structure natural language)
that the mind is good at handling in parallel a lot of very complex
rules with special cases rather than a few simple ones. As such,
complex language definitions which permit simple sentences rather
than simple language definitions which require complex sentences
would, I think, seem to be more natural. The Scheme community often
tries to deny this, but I think they expect people to just take it on
faith that "simplicity" is uniquely determined. More evidence of
my claim about the three laws of thermodynamics above--I think
simple languages lead to complex programs and complex languages to
simple programs. You get nothing for free--often the Scheme
designers would seem to want you to believe that a small simple
language provides only benefits and no drawbacks, while a larger,
more complex language like CL provides only drawbacks and no benefits.
Once again, I encourage people to be skeptical of any claim on its
face when it appears to give you benefits for free.
I prefer complex language and simple programs. I am not choosing
"complexity" over "simplicity" in so doing. I am choosing WHERE my
simplicity goes. I think learning a language is an investment that
pays dividends when you program later. I think languages that provide
lots of services pay better dividends and are worth it because I
spend more time programming than learning the language. I also like
language constructs that aren't just there to provide raw power but
are there to provide "commonality of idiom", "focus", and other things
that are more subtle control of expression. Again, I think this
simplifies aspects of program reading and maintenance.
People are welcome to debate the abstract niceties of pairs and
whether we'd be better off with other size trees. If you make them
be arbitrary size, they're trickier to GC in convenient blocks and
you have to pay a penalty for maintaining and accessing a size slot.
Or you can use fixed-size types other than 2--we had them, for
a while: go to deja news (http://www.dejanews.com/) and search
for posts by me about "hunks". Each of these is a trade-off, but
historically they haven't won out.
Yale Scheme (called "T"), conversely, had CHAR/CHDR for moving
down a string and giving the illusion that the string was listlike
even though it wasn't. It was fun but didn't catch on either...
We have what we have because when given a choice, it's what people
tend to use. Maybe that's just a "practice effect" and is bad data
as a way to design a language. But so it goes. Who's to say all
that "practice" should be quickly tossed to the wind and not counted
as a valuable commodity...
WPV> [ ... ] Modern Lisp programs are data made up of nested lists and
WPV> atoms. Lisp macros build list structure that is subsequently
WPV> evaluated.
That is, of course, utter nonsense for Scheme (as of R4RS, anyway).
Programs have a representation separate from lists. Which is why,
for example,
(+ . (4 5))
(shown to me by Richard Kelsey) may be a valid representation for the
list (+ 4 5), but is not a valid Scheme expression. Admittedly, most
Scheme implementations simply use the read primitive to parse programs
(and therefore accept the above expression), but some (Gambit,
notably) do not.
The only place where lists are really more fundamental to the language
is in argument rest lists.
--
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla
> It does both. Which happens at any point depends on where you're
> coming from and what it is you want to accomplish. This is not one of
> those issues that has a single "right" answer and is very context
> dependent.
This is how I see it, too. I just phrased my comment badly.
I should've said: "The real question is how a type system enables and
disables you." The important point I was making was that not everyone
will feel the same way about a particular type system, nor should we
expect them to. It's a very subjective question because of, as you
pointed out, its context-dependent nature.
--
Please note: my email address is munged; You can never browse enough
"There are no limits." -- ad copy for Hellraiser
Rather offtopic, but since we're ruthlessly dedicated to infinite
precision...
>> A matrix is simply a set of vectors.
> Well, no it's not - since order is quite important in the matrix
> concept.
A matrix can be a lot of things, depending on your point of view
(i.e. your choice of abstraction). It could be a linear operator on
vectors, for instance.
And similarly, if you want to view it as a set of vectors, make sure it
is an ordered set of *same length* vectors. :-)
~kzm
--
If I haven't seen further, it is by standing in the footprints of giants
> How is a tree "simply a list of lists"? A binary tree is a pair of
> subtrees. The cons is a perfect data structure for representing a binary
> tree (well, not quite -- you often need to have data at the nodes, not just
> the leaves).
I think a reasonable definition of a tree is a node and a set of
subtrees. The cons cell implies an ordering of the subtrees, which may
be ignored, of course, and which anyway would seem to be necessary for
navigating the tree.
But I think you could argue that "perfect" is a strong word, and that
"suitable" might be more fitting.
This is not Scheme's notion of a "list", i.e., the last node of each
of these sublists is not necessarily the empty list. But it is a
valid, if suboptimal, notion of one way a list might be represented.
Bear
William Paul Vrotney:
> [ ... ] Modern Lisp programs are data made up of nested
> lists and atoms. Lisp macros build list structure that
> is subsequently evaluated.
Michael Sperber:
> That is, of course, utter nonsense for Scheme (as of R4RS,
> anyway). Programs have a representation separate from lists.
> Which is why, for example,
>
> (+ . (4 5))
>
> (shown to me by Richard Kelsey) may be a valid representation
> for the list (+ 4 5), but is not a valid Scheme expression.
> Admittedly, most Scheme implementations simply use the read
> primitive to parse programs (and therefore accept the above
> expression), but some (Gambit, notably) do not. [...]
If "eval" is included in the language (as proposed in the not-
yet-released R5RS), will the above be required to represent
a valid expression?
-thant
--
thant at acm dot org
OK. if you say so.
| Comparing our posts in this thread, I think it's fair to say you are the
| one complaining and arrogant.
I think it's fair to say that you don't listen to what people tell you.
if there is a definition of arrogance, it must include such behavior.
| I was just trying to illustrate how they are uniformly represented in
| Mathematica.
it seems that you don't differentiate between what something looks like
and what it is. how something internal is shown to the user is (often)
called the "presentation". how something external is implemented
internally is (often) called the "representation" since we're talking
about using concepts and means different from the ones we're trying to
_represent_ and making them take on the meanings we want, as something
essentially very different from what they _are_ per se.
e.g., a machine word is just a number of bits. we _represent_ an integer
with them by multiplying each bit by a weight according to its position
in the word and summing the results. we _present_ this machine word as a
string of digits to the user when the user has decided that a particular
machine word is to be used as an integer. then we ignore the number of
bits in a machine word and talk of integers that may require any number
of machine words (as many as are necessary to maintain the bit with the
highest weight) as an "integer". this is a successful abstraction. is
it inconceivable to you that somebody might want to know about the value
of a particular bit and thus would have to know about the weight if he
could not view the integer as a number of bits? now, ask yourself what
you would do in a language that hid the implementation of integers as a
number of bits: you would have to _recover_ the implementation. Common
Lisp has a function `logbitp' which reports on a bit value by position.
is this as much a no-no as the cons cell to you, or can you think in
terms of integers as values even though `logbitp' exposes internals? I
know I can think in terms of both integer value and bits at different
times even though there is no difference to the computer. that's why I
talk about using `first' instead of `car' when the object is a list, and
using `car' instead of `first' when the object is a cons. do you get it?
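to make the name switch concrete, a sketch in an R4RS-style Scheme (note that `first' and `rest' are not standard R4RS procedures, so we define them ourselves):

```scheme
;; the whole "abstraction barrier" costs three definitions.
(define (first xs)  (car xs))
(define (rest xs)   (cdr xs))
(define (empty? xs) (null? xs))

;; list code can now be written with no visible cons cells:
(define (sum xs)
  (if (empty? xs)
      0
      (+ (first xs) (sum (rest xs)))))

(sum (list 1 2 3 4))                        ; => 10
```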
however, since you keep clamoring about how Mathematica _represents_ this
and that, I must assume that you don't listen to what you're being told.
you don't _know_ how Mathematica _represents_ anything at all, since you
aren't allowed to look inside it, right? so you confuse user interface
with implementation. this is sheer ignorance, and it has not abated one
millimeter since you asked to be illuminated. if you were blinded by the
strong light, you could still have been illuminated, and there are other
ways to express this than those you have chosen. I do get both irritated
and sometimes downright hostile to "newbies" who don't want the answers
they get because they didn't understand the questions they asked. I did
try to be helpful and set the stage for your illumination, but _noooo_,
you rejected all helpful offers and were instead _only_ offended. in
contrast, I do something useful when I'm offended, but I don't turn off
the irritation, because I think people who are irritating should be told.
I very rarely flame people without solid technical contents alongside it
that also indicate that flaming will cease if they get the ideas. if you
can't see this, don't blame me for it. you choose what you react to just
as much as I do.
| It is apparent to me that no one so far has indicated a familiarity with
| Mathematica. Otherwise, the context of my question would be recognized
| clearly without doubt.
this is another clear indication of your arrogance. you came to us to
tell us that some notion of ours was redundant. you're wrong. then you
tell us that it's our fault that we don't agree with you because you are
ignorant of Lisp and you accuse us of being ignorant of whatever you
think is ideal. (but don't you listen? ideals are _never_ universal,
because they depend on the ideas they are embodiments of, and those ideas
differ according to context and individual preferences.)
if somebody had come to your ideal Mathematica world and made as much
fuss about the implementation of lists as you do in Lisp, what would you
have thought about them? could you please try to listen to what you're
saying, as best you can imagine it would be heard by those you talk to?
| (now, is there a limitation to the caadadadddaadr length?)
in the predefined functions, c[ad][ad][ad][ad]r is the longest it gets.
you are free to define your own, longer names, if you so desire. the
_usefulness_ of this is highly debatable, however, since you should name
them according to your needs, not according to the implementation.
but why _does_ this implementation issue bother you so much? I get the
distinct impression that you cannot think in higher-level terms unless
the lower-level concepts are removed from out of your sight. you even
accuse _me_ of such a shallow-minded attitude, when it is instead obvious
that you don't make _any_ effort to understand what I'm talking about. I
have shown you that by a switch of names, your problem would go away, but
you still continue to hang on to your confusion and insistence that we
should listen to your misguided "ideals". _why_ don't you listen? do
you suffer from prestige?
| In Erik's last message, arguing point-for-point against my message, many
| points are really pointless, philosophical, and cheap shots.
I'm really sorry, but you have to realize that the questions you asked
just don't have the answers you want them to have. so, some of them will
_have_ to be answered philosophically because you don't understand the
philosophical underpinnings of your questions, only the implementation
issues that you are used to from Mathematica.
consider this: it's an artifact of the implementation of the Mathematica
language that causes you to believe that there is not a cons-cell-based
implementation underneath, and an artifact of the implementation of lists
in Scheme (and CL) that causes you to complain about the implementation.
whether to provide access to lower-level aspects to an abstraction is
_not_ a language issue, but an implementation issue. whether to focus on
the implementation issue instead of the abstractions they implement is a
user attitude issue, not a language issue. when you ignore the fact that
there are other artifacts of the implementation that any reasonably good
Lisp programmer would employ that would not expose the implementation,
you argue very strongly against yourself -- you really want to deal with
an implementation issue: "how much of the internals can an implementation
show before it bothers Xah to death?" I don't ask that question. I ask
the question: "how little of the implementation do I have to worry about
if I want to worry as little about the implementation as possible?"
you claim that you don't want to think or talk about the implementation
issues, but you talk of nothing else! now, quit that. talk about the
abstractions you want to talk about. _I_ have written (at length) about
the fact that you _can_ think in terms of the abstractions regardless of
how the list abstraction is implemented. you have _ignored_ this and
focus only on whatever you don't like, so I have revised my opinion of
you. I think you are beyond hope and that your arrogant ignorance is
really due to stupidity. I also think it was stupid of you to force me
to revise my opinion of you.
| In summary, I think he is too deep into computer engineering and not
| enough into computer science, and perhaps too enamored with CL.
well, you have demonstrated that you can't think, so it's immaterial what
you say you think, but let's hope that you will take the time to listen
to what I have said, instead of what your narrow-minded reactions tell
you to ignore. you may find that I did lend _you_ a hand while I slapped
your ignorant comments (did you notice how I differentiated between your
state of mind before and after reading my explanation?). you stupidly
decided to behave as if _you_ were slapped, and now you return with
moronic accusations that portray your own _obsession_ with implementation
issues, quite contrary to your stated interests. whatever did you expect
to achieve from all this?
if you want lists, you have them. if you want to be bothered about cons
cells, that's your choice. I suggest you don't get bothered by them.
I hope this is one last post by me. I have some clarifications to make
regarding Erik's comments on cons in general.
Let me make it clear one last time that all I'm concerned with is the
interface. I don't care about how something is implemented. Be it cons
cells or whatever. I learned from Erik that things internal are often
termed "representation" while things external are termed "presentation".
I correct my previous statement about using the term "representation" when
I meant "presentation". Here's the correct statement: "trees are
_presented_ in Mma as List[...]".
It is true that in Mathematica (mma) one cannot look at internals. Only
people at Wolfram Research Inc. (the maker of Mma) have the privilege to
look at the internals.
On the whole, the only thing I object to in Erik's posts is his tone. After
some misunderstandings of his first message, I now have no objection to his
technical information about cons provided to me. And on the whole, I think
we have common views of what is the "right thing". But let me add that this
is irrelevant since I don't really have any computer science background to
make my concurrence with his views meaningful.
Erik expressed that he is uncertain about whether I care about
implementation, because he has explained that `cons' need not be used,
and if I agree to this, I shouldn't have any problems. The reason for
his uncertainty is that I have not yet expressed whether I agree that
cons, car, and cdr need not be used if one does not want to use them. I
have good general experience programming Mma. On the whole, I do not
doubt Erik's words, but because of my limited knowledge of Scheme, I'm as
yet unable to confirm whether I could write code without ever using cons,
car, and cdr and still have normal-looking Scheme code. I have reasons to
say the contrary: in the SICP book, very early on, cons, car, cdr, pair?
and null? are used for several purposes for making or traversing _flat_
lists. (list as in "ordered set"; no language connotations.) This
bothers me because the appearance of cons now interferes with what would
normally be coded if cons had not surfaced in the language as a
primitive. Now, since this classic Scheme book is filled with cons and
its associated primitives (car, cdr, pair?, null?), one could imagine
that writing code without using them is probably out-of-style with the
community. Also, as Erik himself expressed, only good or experienced
Lispers avoid using cons and associated primitives.
Basically, I think if you have something in a language, people will use it.
Isn't this true by experience?
Again, let me emphasize I don't care about implementations. The case about
cons is that, the implementation interfered with the language (interface),
and that's what I dislike. (note: I'm not now making fun of languages. I'm
just expressing myself.) It all could just be that I'm not used to Scheme
yet. I'm just bothered by the fact that _on paper_, a nested thing is used
to represent a flat thing, and afraid the confusion will really begin when I
want a real nested thing -- trees. I could elaborate on the cons business
and give sample code to illustrate my point if anyone wishes. (it'll take
time, especially because I'm not skilled at Scheme. I hope I've made my
point clear.)
PS for few non-technical reasons, I'm looking for a second language other
than Mma. That is the reason I'm learning Scheme.
Xah, x...@best.com
http://www.best.com/~xah/Wallpaper_dir/c0_WallPaper.html
"Unix + C = human resource hog of the century."
I think people would understand your concerns a little better if you
were able to explain why you don't have the same worries about
mathematica. After all, First is the same operator as CAR, Rest is
the same operator as CDR, and Prepend is the same operator as CONS
(when you are dealing with lists). If you're not worried about the
implementation, then why doesn't the presence of these operators in
Mathematica also "interfere with what would normally be coded"?
Now, it sounds like you're worried about "good style" in the language,
rather than "just interface".
> The case about cons is that, the implementation interfered with the
> language (interface), and that's what I dislike.
I don't quite see what you mean by interfering here.
> I'm just bothered by the fact that _on paper_, a nested thing is used
> to represent a flat thing,
Nesting is in the eye of the beholder, I suppose :-) You are probably
confused by the following (quoted from Revised(4) Report on the Algorithmic
Language Scheme, Clinger & Rees (ed.)):
A more streamlined notation can be used for lists: the elements of the list
are simply enclosed in parentheses and separated by spaces. The empty list
is written (). For example,
(a b c d e)
and
(a . (b . (c . (d . (e . ())))))
are equivalent notations for a list of symbols.
Now, nobody would think of using the second notation for lists; it's just
there to explain conses, and for educational purposes it may do more harm
than good. Please forget about the second notation for lists altogether;
it's an implementation detail. (You will only run into the dot notation
for improper lists, which are a more advanced topic, so leave it.)
> and afraid the confusion will really begin when I
> want a real nested thing -- trees.
OK, to recap: a list like (a b c d e) is *not* nested. It's just a list of
atoms. Now a nested list looks just like a simple list, except that one or
more of the elements may be (possibly nested!) lists themselves. That's all
there is to it; no dots.
Nested lists are useful for representing trees. E.g., the following tree
(which has data only on the leaves, not on the nodes):
       /\
      /  \
    /\    /\
   a  b  c  /\
           d  e
could be (re)presented as ((a b) (c (d e))), which is nested all right.
(for the masochists insisting on using the dot notation, this could be
written '((a . (b)) . ((c . ((d . (e)))))). I hope this demonstrates why the
dot notation is to be avoided).
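For completeness, here is a sketch (assuming an R4RS-style Scheme) that traverses such a nested-list tree without dot notation ever appearing:

```scheme
;; The tree above, as a nested list.
(define tree '((a b) (c (d e))))

;; Count the leaves: a leaf is anything that is not a pair.
(define (count-leaves t)
  (cond ((null? t) 0)
        ((pair? t) (+ (count-leaves (car t))
                      (count-leaves (cdr t))))
        (else 1)))

(count-leaves tree)                         ; => 5, for a b c d e
```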
> PS for few non-technical reasons, I'm looking for a second language other
> than Mma. That is the reason I'm learning Scheme.
Good luck!
Philip
--
Philip Lijnzaad, lijn...@ebi.ac.uk | European Bioinformatics Institute
+44 (0)1223 49 4639 | Wellcome Trust Genome Campus, Hinxton
+44 (0)1223 49 4468 (fax) | Cambridge CB10 1SD, GREAT BRITAIN
PGP fingerprint: E1 03 BF 80 94 61 B6 FC 50 3D 1F 64 40 75 FB 53
> Let me make it clear one last time that all I'm concerned with is the
> interface. I don't care about how something is implemented.
To reiterate my previous post (hopefully without repeating the
previous mistakes!), I think that you can't do this and expect your
programs to perform reasonably, and I think that that's at the heart
of what people are trying to say (but I may be wrong).
For instance if your language gives you an array abstraction, you
*really need to know* what the performance of a[i,j,...,k] is, and you
really need it to be close to constant time. If it's not (for
instance if it's O(i+j+...+k), which it would be for a naive
linked-list-based implementation of arrays), then your array-bashing
code is going to be very slow indeed. For instance the following code
to compute the squared Euclidean length of a vector:

    for i from 0 below length(a)
      sum a[i] * a[i]

will be O(length(a)^2), not O(length(a)). Similarly for many other
algorithms. If you want to deal with big arrays this is going to be
very bad news.
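The same trap exists with Scheme lists: list-ref restarts from the head on
every call, so index-style loops over lists are quadratic. A sketch
(assuming an R4RS-style Scheme):

```scheme
;; O(n^2): list-ref walks from the head on every call
;; (and this version even recomputes length each iteration).
(define (sum-squares/indexed v)
  (let loop ((i 0) (acc 0))
    (if (= i (length v))
        acc
        (loop (+ i 1)
              (+ acc (* (list-ref v i) (list-ref v i)))))))

;; O(n): walk the pair structure directly instead of indexing.
(define (sum-squares v)
  (if (null? v)
      0
      (+ (* (car v) (car v)) (sum-squares (cdr v)))))

(sum-squares '(3 4))                        ; => 25
```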
Similarly, if we were to use the clever array-based list
implementation someone showed earlier in this thread, then in many
cases you'd have copying going on when you might not expect it (or
very funny semantics). For instance:
a = (1 2 3 4 5)
b = tail(a)
c = cons(3,b)
now either head(a) is 3 (bad) or the list got copied somewhere (or
there's an even cleverer representation!). Again this kind of thing
can cause your programs to have different orders of time & space
complexity than you expected. Note this is *not* trying to be rude
about the array-based list stuff, it's just a property of that
implementation (I hope). People who've used the hp48 calculator will
have come across behaviour like this.
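In cons-based Scheme the same experiment is well defined: cons only
allocates a new pair in front, so nothing is copied and nothing is
clobbered; the sharing only becomes visible if you mutate. A sketch:

```scheme
(define a (list 1 2 3 4 5))
(define b (cdr a))             ; b shares structure with a
(define c (cons 3 b))          ; allocates exactly one new pair

(car a)                        ; => 1, a is untouched
c                              ; => (3 2 3 4 5)

;; Mutation, by contrast, shows through every shared reference:
(set-car! b 99)
a                              ; => (1 99 3 4 5)
(cdr c)                        ; => (99 3 4 5)
```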
All of these issues are algorithm time/space complexity issues, and
those are `theoretical computer science' issues. But in order to talk
about them you have to know about the implementation, or at least
about the complexity of the operations it supports, which is
reasonably close to the same thing in many (most?) cases.
--tim
I'm glad that my first impression was so much better than my revised
opinion -- this is not a stupid concern, and I can also sympathize with
it, since I also think Scheme is at fault for not providing a standard
view of lists that hides the implementation. (I've already said this,
but just to be sure you don't ignore it, again.) however, such _is_
Scheme, and I guess the design rationale is "if you can do it yourself,
regardless of cost, we shouldn't do it for you". when you bring up SICP
and its use of the implementation-oriented functions instead of making an
"abstraction barrier" by using a few simple definitions, I'm somewhat
disappointed that your criticism is accurate.
| On the whole, I do not doubt Erik's words, but because my limited
| knowledge of Scheme, I'm yet unable to confirm whether I could write code
| without ever using cons, car, cdr and still have a normal looking Scheme
| code.
this is also a valid concern. when everybody has to invent their own
simple definitions of a list abstraction, they will naturally do it
subtly differently, and then we're left with designing a new language on
top of Scheme in every project or library or what have you. (this is one
of the reasons I don't like Scheme.)
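a sketch of the kind of small list abstraction every project ends up redefining for itself (the names first/rest/second here are merely conventional, not standard Scheme):

```scheme
;; each project writes some variation of this thin layer over pairs
(define (first lst)  (car lst))
(define (rest lst)   (cdr lst))
(define (second lst) (car (cdr lst)))
(define (empty? lst) (null? lst))
```

and each one does it subtly differently, which is exactly the problem.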
| Now since this classic Scheme book is filled with cons and its associated
| primitives (car, cdr, pair?, null?), one could imagine writing codes
| without using them is probably out-of-style with the community.
I don't know whether you would.
| Also, as Erik himself expressed, only good or experiences lisper avoids
| using cons and associated primitives.
well, not _only_ good ones, but good Lisp programmers communicate their
intent to other programmers (and themselves) as much as they communicate
with the compiler and Lisp system. (I think this is true of any good
programmer, regardless of language, and that bad programmers prefer
"idiom" and arcana to clarity of purpose, but that's immaterial now.)
| Basically, I think if you have something in a language, people will use
| it. Isn't this true by experience?
yes, this is unfortunately true. however, there is this notion of a good
user of a language, and I don't think we should optimize away the good
user by forbidding the bad ones. there _is_ a place for the kind of code
grinders who make things "work most of the time". we just need to make
sure they stay in New Jersey. (sorry.)
| PS for few non-technical reasons, I'm looking for a second language other
| than Mma. That is the reason I'm learning Scheme.
I think Common Lisp would be a good choice for you. Scheme is like
coming to a construction site and being thrown a hammer and a box of
nails and off you go to make something or other with whatever raw
materials you find. Common Lisp is like moving into a big house that is
inhabitable right away but which sort of humbles you for the first month
or so while you're trying to figure out what and where everything is,
then becomes so comfortable you don't want to leave it. where Scheme is
for people who are really arrogant and want only the most basic building
blocks because they think they can craft everything they need better than
anybody else, Common Lisp is for people who are able and willing to
accept that others are a lot better at a lot of hard stuff that it would
only be nice if they could learn someday when the hard stuff that they
are themselves good at has been solved to satisfaction. I see Common
Lisp as a language for interchange between experts (and those who want to
become experts) and Scheme as a language for students. I fear, as you
also seem to do, that Scheme will force you to stay on the lower levels
if you want to talk to anybody else about your code.
in conclusion, I don't think you're the Scheme type. Scheme types tend
to _want_ to play with the implementation in order to build abstractions
from first principles. of course, people do lots of neat stuff in
Scheme, too, but it isn't "bare Scheme", anymore. with lots and lots of
code that gives them the abstraction level they really want, frequently
so much that the code only maintains a superficial Scheme flavor, it's an
open question whether it was indeed written in Scheme. my background is
in C and Unix and I have come to dislike the need to build everything
from scratch every time, and so I embrace the "gifts" in Common Lisp, but
I still find that I need to know the internals in order to _successfully_
ignore them. somebody said: "know yourself. forget yourself". I think
the order is as important in that saying as it is with internals in a
programming language implementation.
The Scheme designers have tried to avoid providing multiple ways to do the
same thing. Since the functions that operate on pairs can be used to
implement list operations, they haven't provided many higher-level
functions. But they have provided some, such as (list x y z) so you don't
have to write (cons x (cons y (cons z '()))).
The Scheme philosophy is that programmers should use the primitives to
implement application-specific abstractions. For instance, if you're
implementing the representation of a moving object as pair of location and
velocity, and location and velocity as lists of three numbers representing
the x, y, and z components, you would define:
(define (make-position-vector x y z)
  (list x y z))
(define (position-x pv)
  (car pv))
(define (position-y pv)
  (cadr pv))
(define (position-z pv)
  (caddr pv))
(define (make-velocity-vector dx dy dz)
  (list dx dy dz))
(define (velocity-dx vv)
  (car vv))
(define (velocity-dy vv)
  (cadr vv))
(define (velocity-dz vv)
  (caddr vv))
(define (make-moving-object position velocity)
  (cons position velocity))
(define (moving-object-position mo)
  (car mo))
(define (moving-object-velocity mo)
  (cdr mo))
My choice to use a list for the position and velocity vectors, versus a
cons for the moving object, was essentially arbitrary. But it's also
unimportant, since all other functions in this application should use the
11 functions I defined above. If I decide to change from a cons to a list
in the implementation of moving objects, I only need to change three
functions. (Actually, more functions are needed if you want to support
updating the objects, but the idea is the same.)
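A sketch of how client code would use only these eleven functions, never touching car/cdr directly (the numeric values are made up for illustration):

```scheme
(define mo (make-moving-object (make-position-vector 1 2 3)
                               (make-velocity-vector 0 0 -9)))
(position-x (moving-object-position mo))    ; => 1
(velocity-dz (moving-object-velocity mo))   ; => -9
```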
Code like the above is what Lisp's DEFSTRUCT was created to replace; rather
than defining all these abstraction functions manually, you simply give a
type name and slots, and the macro defines all the necessary functions. And
in fact, early MacLisp DEFSTRUCT expanded into code very much like the
above (it generated macros rather than functions, for performance reasons).
Why did I use a cons for moving objects rather than a list? My personal
philosophy is that lists should be for arbitrary-length, uniform
collections (whether ordered or not). For fixed-size collections of data,
I use the most compact data type that's convenient. When the size is 2, a
cons fits the bill; for the position and velocity vectors, I could have
used (cons x (cons y z)) instead of (list x y z), but in this case the
convenience of the list function overcame the compactness issue. However,
sometimes people make the choice on other esthetic grounds, such as printed
representation. If you print my moving object, it will print as ((x y z)
dx dy dz) rather than ((x y z) . (dx dy dz)), because the printer doesn't
know about the abstraction choice; if moving objects were (list position
velocity) they would print as ((x y z) (dx dy dz)), which looks more
"right"
Of course, in Common Lisp none of this would be an issue, as I would just
use DEFSTRUCT or DEFCLASS for these things. When I was doing lots of CL
programming (it's been about 4 years), I rarely used conses directly, used
lists mostly for uniform collections, and used DEFSTRUCT for most
"structured" data.
As you can see, there are many issues that go into deciding what primitive
data structures should be in a language, and which should be used in
application programming.
--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Cambridge, MA
Support the anti-spam movement; see <http://www.cauce.org/>
Please don't send technical questions directly to me, post them to newsgroups.
In article <uf1zym7...@synquiry.com> Jon S Anthony <j...@synquiry.com>
writes:
>
> mcr@this_email_address_intentionally_left_crap_wildcard.demon.co.uk (Martin Rodgers) writes:
>
> > comfortable than he is with strong type checking. The real question is
> > whether a type system enables or disables you, and to what degree.
>
> It does both. Which happens at any point depends on where you're
> coming from and what it is you want to accomplish. This is not one of
> those issues that has a single "right" answer and is very context
> dependent.
>
Agreed with both. But just wanted to insert an opinion: Although popular
and commonly used, the terms "strong" and "weak" type checking by themselves
are misnomers. More precise would be "compile" and "run" type checking. Or
"strong compiler" and "weak compiler" type checking. For example C++ has by
default strong compile type checking but weak run type checking. And Lisp
usually has by default less compile type checking and more run type
checking.
Also "strong" and "weak" can apply to other aspects of object types.
Another classification is "static" and "dynamic". In C++ objects are
statically typed which allows the strong compile type checking. The term
"strong" could even be considered a misnomer here since *every* unvoid C++
object is type checked by the compiler. In Lisp, objects are dynamically
typed. In C++, without RTTI (Run Time Type Information), object types are
unknown to code that is not dispatched on that specific object type. Whereas in
Lisp, object types are always knowable by the code. One could say that C++
has *weak* dynamic typing capabilities and that Lisp has *strong* dynamic
typing capabilities.
Type dispatching is yet another finer classification. For example in C++
objects are typically dynamically dispatched on by having a virtual function
table slot pointing to a table of all functions that dispatch on the given
object class. This limits the type dispatching capabilities since one must
define a virtual function for all such objects for this to be an advantage
and multiple object dispatching is awkward. In Lisp there is typically a
type slot in the object that points to a type meta object that contains type
dispatching information or code will dispatch on the type pointer itself.
Neither of these imposes such constraints. Since virtual function
dispatching has such limited capabilities one could say that C++ has *weak*
type dispatching capabilities and Lisp has *strong* type dispatching
capabilities.
As to right and wrong, there are also all sorts of problems about what kinds
of type data manipulation is better in which circumstances. For example a
single dynamic type check and branch in general is more efficient than a
virtual function dispatch. Common in a lot of Lisp code is the branch
(cond ((consp object) ...)
      (t ...))
which can always be more efficient than a version that uses a virtual
function dispatch on the object.
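The Scheme analogue of that branch, sketched with the standard type predicates (the name describe is made up for illustration):

```scheme
;; a single dynamic type check and branch: one predicate test
;; per clause, no virtual function table needed
(define (describe obj)
  (cond ((pair? obj)   'pair)
        ((number? obj) 'number)
        (else          'something-else)))
```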
Another problem relates to the idea that as the programmer decides to use
dynamic type information more heavily in his programs the weaker dynamic
type language starts to lose more and more. C++ advocates frequently fight
this vehemently even though dynamic type programming is a legitimate and
advanced programming technique. Furthermore they will convolute the
algorithm preposterously so as to use virtual functions to solve the
problem. To me it feels like we are adjusting the algorithm to suit the
programming language which is a kind of violation of principles since the
algorithm is a more general abstraction than the program.
Agreed that there is no clear right or wrong here. For example in the
previous paragraph the C++ programmer who convolutes the algorithm will
maintain that the convoluted program is the way the algorithm should have
been expressed in the first place. However, given that the algorithm is the
more general thing and given that complex programs must evolve and be
maintainable and flexible some question this thesis.
What exactly is the rationale behind this? I'd think that it would be much
cleaner if all expressions were just data structures.. after all, isn't that
what's usually meant by "uniform notation of code and data"?
It would seem more sensible if an expression was just a representation for
some object or structure (prolly a list) which was then read, and then,
well, evalled. But apparently syntactic keywords, for one thing, cannot be
held to be symbols in the list that forms the expression..
So I'd be interested in hearing why in scheme programs are written in a
language that's almost, but not quite, the same as the printed
representations of its data structures?
Lauri Alanko
l...@iki.fi
I don't think Scheme ever makes this claim. It's one of the differences
between Scheme and more traditional members of the Lisp family.
>It would seem more sensible if an expression was just a representation for
>some object or structure (prolly a list) which was then read, and then,
>well, evalled. But apparently syntactic keywords, for one, cannot be held to
>be symbols in the list that forms the expression, for one thing..
This is how things are defined in Lisp. But standard Scheme doesn't
provide any programmer interface to evaluation and compiling, so it doesn't
need to define any relationship between data and programs. So it can
simply provide a traditional syntax specification for the language.
Admittedly, that syntax specification could have included everything that
READ accepts, but it would have complicated the description.
The definitions of most other flavors of Lisp, such as Maclisp and Common
Lisp, virtually require that the compiler be written in its own language.
They require code to be run in the compiler's environment in order to
implement macros and reader macros. Scheme generally tries to avoid this
trap (it doesn't have reader macros or eval-when, and macros are in an
extension rather than the base language).
>So I'd be interested in hearing why in scheme programs are written in a
>language that's almost, but not quite, the same as the printed
>representations of its data structures?
Actually, I believe it's essentially the same as the printed
representation, but not the same as the readable representation. I.e. if
you print the representation of a valid program to a file and then load
that file, it will always work because the printer produces the format that
the language accepts. For example, if you (write '(+ . (4 5))) to a file,
it will write "(+ 4 5)", so it's OK.
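A short demonstration that the two spellings denote the same datum:

```scheme
(equal? '(+ . (4 5)) '(+ 4 5))   ; => #t, the reader builds identical structure
(write '(+ . (4 5)))             ; prints (+ 4 5)
```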
I used to do research in issues like this. Being by nature a somewhat
obsessive perfectionist, I wanted to find the most elegant, general,
expressive primitives for a programming language. My conclusion after
some 10 years is, concisely put: there's no such thing. Your concerns
about list structure provide a simple example of why this is.
Your view (if I may rephrase) is that the cons is a special case of the
more general list structure; that is, every cons is a two element list.
Now although two element lists are _formally_ adequate, they are not
adequately _expressive_. They only allow one to _implement_ all list
operations, not to _express_ them. Your preference is to make the more
general list concept the fundamental one, and to make pairs simply a
special case.
An opposing view is that since pairs are a more primitive concept, they
are more appropriate for primitive expression in the language. Just as
you would say that a cons is nothing more than a two element list, an
advocate of this position would say that a list is nothing more than a
chain of conses. People have different fundamental instincts about
these things. To one the most fundamental structure is the most general
one; to another the most fundamental structure is the most primitive
one.
I think there is probably a psychological generality to be found here.
Some people tend to think synthetically: "Given this collection of
things, how can I view them all as a single sort of thing?"
while others think analytically: "Given this collection of things, what
is the simplest sort of thing that I can decompose them into?" Put
another way, some people ask "what properties do all of these things
have in common?" while others ask "what parts do all of these things
have in common?"
Analogous differences crop up in other areas:
A relation is a set of tuples vs. a set is a one place relation.
A multi-dimensional array is a nested list of lists vs. a list is a
1-dimensional array.
A function is a many-one relation vs. a relation is a function ranging
over {true,false}.
Javascript has an interesting example of an extreme synthetic view. One
kind of object in the language has the properties of a record, an array,
and a (finite) function from strings to X. SQL makes use of both sorts
of thought; the only real aggregate data structure in SQL is the
relation. It synthetically views sets like one place relations and
records like relations with one element. But sequences are analytically
implemented with relations (meaning that there is no simple way to view
a sequence as simply a special case of a relation).
Lauri> What exactly is the rationale behind this? I'd think that it
Lauri> would be much cleaner if all expressions were just data
Lauri> structures.. after all, isn't that what's usually meant by
Lauri> "uniform notation of code and data"?
Sure. But Scheme doesn't have it. However, the language of Scheme
expressions is a sublanguage of that of data representations. Isn't
this good enough?
Lauri> It would seem more sensible if an expression was just a
Lauri> representation for some object or structure (prolly a list)
Lauri> which was then read, and then, well, evalled.
However, Scheme systems don't always work that way for good reasons.
As soon as you want to provide (say), textual location information for
a Scheme program (for example for error reporting), vanilla read is
useless anyway.
Lauri> But apparently syntactic keywords, for one, cannot be held to
Lauri> be symbols in the list that forms the expression, for one
Lauri> thing..
I don't understand what you're saying.
In article <34B69861...@mci2000.com> Dave Gudeman
<dave.g...@mci2000.com> writes:
>
> Xah wrote:
> > ...
> > Similarly, a list (abstractly) is simply an ordered sequence of things. A tree is simply a list of lists. A pair notion (like Scheme's cons) is perhaps good in a language for implementation/efficiency reasons, but it is a redundant notion conceptually.
> >
> > Here's my original question rephrased: Why did the designers of Scheme -- a language that emphasize conceptual elegance -- included the pairs notion (cons) in the language?
>
> I used to do research in issues like this. Being by nature a somewhat
> obsessive perfectionist, I wanted to find the most elegant, general,
> expressive primitives for a programming language. My conclusion after
> some 10 years is, concisely put: there's no such thing. Your concerns
> about list structure provide a simple example of why this is.
>
> Your view (if I may rephrase) is that the cons is a special case of the
> more general list structure; that is, every cons is a two element list.
Once again, to be more precise, we don't want to confuse novices even more:
Every cons (pair) with a non-list cdr is a two element *dotted* list. If the
cons (pair) has NIL in its cdr then it is a *one* element true list. A
dotted list is defined as one whose last cons (pair) does not have NIL ('())
in its cdr. Non-dotted lists are referred to as *true* lists. It is common
to use the term *list* to refer to the composite set of true lists and
dotted lists. This is why we can say that a Lisp program is a nested list
of lists and atoms since dotted lists can be included.
> Once again, to be more precise, we don't want to confuse novices even more:
>
> Every cons (pair) with a non-list cdr is a two element *dotted* list. If the
> cons (pair) has NIL in its cdr then it is a *one* element true list. A
> dotted list is defined as one whose last cons (pair) does not have NIL ('())
> in its cdr. Non-dotted lists are referred to as *true* lists. It is common
> to use the term *list* to refer to the composite set of true lists and
> dotted lists.
And you claim you don't want to confuse novices? (-:
'shriram
In article <j7v67np...@new-world.cs.rice.edu> Shriram Krishnamurthi
<shr...@cs.rice.edu> writes:
You quoted me out of context. This was in response to a claim that "A cons
was a two element list" something that might be contradictory to what a
novice might find in a textbook. Surely confusing.
The concept of a dotted list should not be confusing. Here are some
elementary examples for any confused novices (Wherever you see the term
"cons" you can substitute "pair" and for "nil" you can substitute "the empty
list". To see the cons depictions you need a fixed width font.):
The cons:
(a . nil)
which is depicted by
---------
| . |nil|
--|------
  |
  V
  a
where the rectangle is a cons cell. The "car" of the cons cell points to
the symbol "a" and the "cdr" of the cons cell has nil. And this is
equivalent to the one element list
(a)
The cons
(a . (b . nil))
is depicted by a cons cell with its cdr pointing to another cons cell
---------     ---------
| . | .-|---> | . |nil|
--|------     --|------
  |             |
  V             V
  a             b
and is equivalent to the two element list
(a b)
The cons
(a . (b . c))
is similarly depicted by two cons cells whose last cdr points to the symbol
"c" instead of nil
---------     ---------
| . | .-|---> | . | .-|---> c
--|------     --|------
  |             |
  V             V
  a             b
and is equivalent to the dotted list
(a b . c)
So in summary we have
(a b) is a true list
(a b . c) is a dotted list
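A sketch of a predicate that makes the distinction executable:

```scheme
;; walk the cdr chain: a true list ends in '(), a dotted list doesn't
(define (true-list? x)
  (cond ((null? x) #t)
        ((pair? x) (true-list? (cdr x)))
        (else #f)))
;; (true-list? '(a b))     => #t
;; (true-list? '(a b . c)) => #f
```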
I think the other problem is the way Lisp prints values: a value is not
always printed the way it is really represented internally.
for example you could have
(defvar x (cons 1 2))   ; x = (1 . 2)
(defvar y (cons 3 4))   ; y = (3 . 4)
(defvar z (cons x y))
when you do this you would think z would print out like this:
z = ((1 . 2) . (3 . 4))
but what it will actually print out is...
((1 . 2) 3 . 4)
so you get this notation. The algorithm used is: if a dot is immediately
followed by a left parenthesis, then the dot, that left parenthesis, and
its matching right parenthesis are removed. For example:
(defvar x (cons 1 ()))
(defvar y (cons 2 ()))
so x = (1 . ()) but the dot is beside a left parenthesis, so this is removed
and we have
x = (1)
same goes for y. But now let's define z:
(defvar z (cons x y))
whose internal representation is ((1 . ()) . (2 . ()))
using our algorithm we have ((1) . (2))
which is then reduced to ((1) 2)
This is at least much closer to the list interpretation, and understanding
these mechanisms will allow a novice to see more clearly what is actually
going on.
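The same walk-through at a REPL, sketched in Scheme (define standing in for defvar):

```scheme
(define x (cons 1 '()))   ; internally (1 . ()), prints as (1)
(define y (cons 2 '()))   ; internally (2 . ()), prints as (2)
(define z (cons x y))     ; internally ((1 . ()) . (2 . ())), prints as ((1) 2)
```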