Could CDR-coding be on the way back?


Frank A. Adrian

Dec 6, 2000, 3:00:00 AM
I just finished reading an article in the Dec. 2000 issue of Computer
Magazine called "Making Pointer-based Data Structures Cache Conscious", by
Chilimbi, et al. In it, the authors suggest that because of increasing
memory-CPU speed mismatches, modifying data structures to use less space and
be more cache friendly, even at the cost of raw instruction count, might
well be a win for today's machine architectures.

Although the article was based on structures in the current language of the
day (i.e., Java), the suggestions that the authors make would probably be
even more appropriate for a language like Lisp, due to the smaller size of
objects and greater frequency of pointer traversals. One of their main
points was that data structures should be compacted by using offsets instead
of full pointers and via data compaction so that more of them would stay in
cache. In addition, they believe that objects should be allocated together
so that objects likely to be traversed to would be brought into cache, as
well.

Oddly enough, this seems suspiciously close to some of the optimizations
that were made in Lisp machines. CDR-coding reduced the size of cons cells
and the garbage collectors of the day tried to place items thought to be
used together on the same memory page. Back then, the reason was to try to
reduce disk to memory paging, but a mismatch is a mismatch, eh?

If the data in this article (as well as the corresponding articles from
SIGPLAN '99) is correct, spending a few cycles on decompressing a CDR might
make sense again today. The last analysis of CDR-coding I saw was from
about 10 years ago, when memory and CPU speeds were much more closely
matched. The analysis then was that CDR-coding took more cycles to execute
and caused branch stalls, so it wasn't as good as directly coded addresses.
Given (a) superscalar execution, (b) 6-to-10 cycle cache memory latencies,
and (c) larger instruction caches and branch tables, CDR-coding might well
make sense again.

Even more importantly, perhaps one CDR-codes only the nursery of a
generational scavenger. This allows more objects to be created before GC is
necessary and, as aging objects are pushed out to older tranches, those
objects that are going to be around for a while can be translated into the
full-pointer representation. If the cache is large enough, the entire
nursery might stay resident in it, leading to very substantial speed
increases.

So, again we see that the world simply re-invents Lisp but, more
importantly, one of the neatest hacks from the olden days might be
reasonable again. Opinions?

faa

Kai Harrekilde-Petersen

Dec 7, 2000, 2:22:58 AM
"Frank A. Adrian" <fad...@uswest.net> writes:

> I just finished reading an article in the Dec. 2000 issue of Computer
> Magazine called "Making Pointer-based Data Structures Cache Conscious", by
> Chilimbi, et al.

[snip]

> Oddly enough, this seems suspiciously close to some of the optimizations

> that were made in Lisp machines. CDR-coding [...]

What does CDR-coding mean in this context? - To me, CDR is Clock Data
Recovery, and that's sure wrong here.


Kai

Stefan Monnier <foo@acm.com>

Dec 7, 2000, 3:00:00 AM
>>>>> "Bruce" == Bruce Hoult <br...@hoult.org> writes:
> It's perhaps only slightly less common to want to store characters,
> booleans, the empty list, empty array, zero length string etc directly
> as well. This requires stealing another bit from the pointer
> representation, forcing all pointers to be 4-byte aligned (on a byte
> addressed machine).

You can also encode those as small values and make sure that pointers
are always larger than those values, so you don't need extra bits.

> "CDR-coding" consists of stealing *another* two bits from the pointer
> representation. There are three possible values:
> - the next word of memory is the CDR, as usual
> - the next word of memory would be a pointer to NULL (the
> empty list), if it was there. i.e. the current element is
> the last in the list.
> - the next word of memory is the CAR of the next cons cell in
> the current list

The "next word is NULL" case is not strictly necessary, so you
can reduce the number of required bits to just one.

> memory for contiguously-allocated lists. The cost in extra instructions
> to check for special-cases is quite high, especially when such a list is
> modified by inserting or deleting elements in the middle.

But such `setcdr' operations are not very frequent, or might
not even be allowed (in languages like ML).


Stefan

Ian Kemmish

Dec 7, 2000, 3:00:00 AM
In article <CzFX5.6952$i32.4...@news.uswest.net>, fad...@uswest.net says...

>
>I just finished reading an article in the Dec. 2000 issue of Computer
>Magazine called "Making Pointer-based Data Structures Cache Conscious", by
>Chilimbi, et al. In it, the authors suggest that because of increasing
>memory-CPU speed mismatches, modifying data structures to use less space and
>be more cache friendly, even at the cost of raw instruction count, might
>well be a win for today's machine architectures.

A less labour-intensive rewrite is to make your code smaller (if you end up
with fewer cache misses, do you really care whether they're due to data or
instructions?). The latest major rewrite of Jaws (well, a few years old now)
was undertaken with just this aim in mind, and did show a noticeable
performance increase, despite having fewer unrolled loops etc.

Of course, writing compact code might mean having to eschew fashionable
techniques like ubiquitous object-orientation:-)

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Ian Kemmish 18 Durham Close, Biggleswade, Beds SG18 8HZ, UK
i...@jawssytems.com Tel: +44 1767 601 361
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Behind every successful organisation stands one person who knows the secret
of how to keep the managers away from anything truly important.


Barry Margolin

Dec 7, 2000, 3:00:00 AM
In article <BOQX5.399$3V2....@news.dircon.co.uk>,

Ian Kemmish <i...@jawssystems.com> wrote:
>A less labour intensive rewrite is to make your code smaller (if you end up
>with fewer cache misses, do you really care whether they're due to data or
>instructions?).

But instructions are mostly sequential, so the hardware can prefetch
the next block into the cache, and even larger code may have few cache
misses. With data, even if the hardware does prefetching, it will only be
helpful if the data exhibits high locality. This is true for array
structures, but when there are lots of pointers and indirection, as there
are in Lisp, it's only true if the GC keeps pointers together with the
things they point to, or if the hardware has a pointer-aware prefetcher (as
someone mentioned existed on LMI Lisp Machines).

--
Barry Margolin, bar...@genuity.net
Genuity, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

Bruce Hoult

Dec 7, 2000, 6:49:15 AM
In article <80bsuoa...@orthanc.exbit-technology.com>, Kai
Harrekilde-Petersen <k...@exbit.dk> wrote:

A Lisp "cons cell" is a data record that contains two pointers called
(for wierd historical reasons) CAR and CDR. The most common use of them
is to build linked lists in which the CDR of each cell points to the
next element in the list and the CAR points to the data item at that
position in the list.

In fact for efficiency the "pointers" aren't always actually pointers.
In particular it's useful to be able to store an integer directly in the
CAR and/or CDR positions, rather than storing a pointer to some record
containing an integer. Therefore most Lisps arrange to reserve e.g. the
even values for actual pointers and the odd values for integers with
integers represented as e.g. 2N+1. On machines with compulsory
word-alignment for pointers this doesn't affect pointers at all, and
gives you half the number of integer values you'd otherwise have (i.e. 31
bits on a 32-bit machine).
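In C-ish terms, that tagging scheme looks something like this (a minimal sketch; the names are mine, and real implementations differ in detail):

```c
#include <assert.h>
#include <stdint.h>

/* Low-bit tagging as described above: even words are pointers,
 * odd words hold a fixnum N encoded as 2N+1.  On a 32-bit machine
 * this leaves 31 bits of integer range, with no boxing. */
typedef uintptr_t lispval;

static int is_fixnum(lispval v)         { return (v & 1) != 0; }
static lispval make_fixnum(intptr_t n)  { return ((uintptr_t)n << 1) | 1; }
/* Arithmetic right shift recovers N, including negative fixnums
 * on the (universal in practice) two's-complement machines. */
static intptr_t fixnum_value(lispval v) { return (intptr_t)v >> 1; }
```

Word-aligned pointers are untouched by this: their low bit is already zero.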

It's perhaps only slightly less common to want to store characters,
booleans, the empty list, empty array, zero length string etc directly
as well. This requires stealing another bit from the pointer
representation, forcing all pointers to be 4-byte aligned (on a byte
addressed machine).

Cons cells can be used to build any data structure, but simple
single-linked lists are very common, and it's fairly common to allocate
them all at once, which means that each cons cell ends up pointing at a
cell right next to it in memory.

"CDR-coding" consists of stealing *another* two bits from the pointer
representation. There are three possible values:

- the next word of memory is the CDR, as usual
- the next word of memory would be a pointer to NULL (the
empty list), if it was there. i.e. the current element is
the last in the list.
- the next word of memory is the CAR of the next cons cell in
the current list

You now have a total of four bits stolen from the pointer
representation, but as a result you might be able to save quite a lot of
memory for contiguously-allocated lists. The cost in extra instructions
to check for special-cases is quite high, especially when such a list is
modified by inserting or deleting elements in the middle.
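To make the three cases concrete, here is a C sketch of taking the CDR of a cdr-coded cell; the struct layout is purely illustrative, not any real Lisp machine's word format:

```c
#include <assert.h>
#include <stddef.h>

/* The three cdr codes described above. */
enum cdr_code {
    CDR_NORMAL,  /* the next field holds an explicit CDR, as usual */
    CDR_NIL,     /* this element is the last one in the list       */
    CDR_NEXT     /* the next word is the CAR of the next cell      */
};

#define NIL (-1)

/* One heap word: its CAR value, its cdr code, and (used only when
 * the code is CDR_NORMAL) an explicit CDR index. */
typedef struct {
    int car;
    enum cdr_code code;
    int cdr_index;
} cell;

/* Index of the cell that is the CDR of cells[i], or NIL. */
static int cdr_of(const cell *cells, int i) {
    switch (cells[i].code) {
    case CDR_NIL:  return NIL;                /* end of list         */
    case CDR_NEXT: return i + 1;              /* compressed: next word */
    default:       return cells[i].cdr_index; /* full CDR pointer    */
    }
}
```

The branch in cdr_of is exactly the special-case check whose cost is at issue.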

-- Bruce

Duane Rettig

Dec 7, 2000, 10:55:03 AM
Bruce Hoult <br...@hoult.org> writes:

> You now have a total of four bits stolen from the pointer
> representation, but as a result you might be able to save quite a lot of
> memory for contiguously-allocated lists. The cost in extra instructions
> to check for special-cases is quite high, especially when such a list is
> modified by inserting or deleting elements in the middle.

Note that one such test that becomes nontrivial on General Purpose
hardware (i.e. not lisp machines) is EQ. Instead of one instruction,
those extra bits must be masked off before the comparison. I don't
know if anyone has ever considered placing those extra bits "elsewhere",
e.g. in a parallel location to the actual pointer/tag word.
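A rough C sketch of that cost, assuming purely for illustration that two cdr-code bits sit at the top of each pointer word: EQ becomes mask-then-compare instead of a single compare.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the top two bits of a word hold the cdr
 * code, the rest identify the object.  Two words can then name
 * the same object while differing in their code bits. */
#define CDR_CODE_MASK (((uintptr_t)3) << (sizeof(uintptr_t) * 8 - 2))

/* EQ is no longer one compare instruction: both operands must be
 * stripped of their code bits before comparing. */
static int eq(uintptr_t a, uintptr_t b) {
    return (a & ~CDR_CODE_MASK) == (b & ~CDR_CODE_MASK);
}
```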

--
Duane Rettig Franz Inc. http://www.franz.com/ (www)
1995 University Ave Suite 275 Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253 du...@Franz.COM (internet)

Joe Marshall

Dec 7, 2000, 12:43:28 PM

At LMI, we abandoned CDR-coding for our last machine for several
reasons:
1. It took bits out of the pointers, which, being word pointers,
meant a reduction in address space.

2. It would complicate the hardware. We had hardware that
`understood' lists and could do `prefetches' on them.

3. Only about 10% of the allocated storage in a running Lisp
machine was compressed cdr-coded lists.



Chuck Fry

Dec 8, 2000, 10:41:58 PM
Like Frank, I've been wondering about this for a couple of years now. I
am starting to think current superscalar and/or VLIW technology might
encourage clever programmers to reintroduce the concept of
'microcoding', but under a New Millennium-style buzzword (e.g. VLIW!).
Shades of the B5000....

And with 64 bits to play with, who's to say we can't spare a couple for
CDR-coding?

In article <4k89cn...@beta.franz.com>,


Duane Rettig <du...@franz.com> wrote:
>Bruce Hoult <br...@hoult.org> writes:
>> You now have a total of four bits stolen from the pointer
>> representation, but as a result you might be able to save quite a lot of
>> memory for contiguously-allocated lists. The cost in extra instructions
>> to check for special-cases is quite high, especially when such a list is
>> modified by inserting or deleting elements in the middle.
>
>Note that one such test that becomes nontrivial on General Purpose
>hardware (i.e. not lisp machines) is EQ. Instead of one instruction,
>those extra bits must be masked off before the comparison. I don't
>know if anyone has ever considered placing those extra bits "elsewhere",
>e.g. in a parallel location to the actual pointer/tag word.

Getting back to the original point of this thread, given superscalar
CPUs running in the GHz region, register-to-register or
immediate-to-register instructions are cheap (as long as they're in the
I-cache), and dereferencing pointers is *extremely* expensive (if the
targets are not in the D-cache).

Duane, I figure you would know this as well as anyone, so I have to ask:
What is the true cost of doing this masking compared to that of chasing
pointers into uncached RAM on a couple of well-known architectures,
e.g. x86, SPARC, or PowerPC? If it's done all the time I imagine the
instruction fetches would be damned near free.

-- Chuck, used-to-wannabe computer architect
--
Chuck Fry -- Jack of all trades, master of none
chu...@chucko.com (text only please) chu...@tsoft.com (MIME enabled)
Lisp bigot, car nut, photographer, sometime guitarist and mountain biker
The addresses above are real. All spammers will be reported to their ISPs.

Per Bothner

Dec 9, 2000, 2:53:01 PM
"Frank A. Adrian" <fad...@uswest.net> writes:

> So, again we see that the world simply re-invents Lisp but, more
> importantly, one of the neatest hacks from the olden days might be
> reasonable again. Opinions?

I have an even more efficient and radical idea: Don't use lists.
They are almost always the wrong data structure. Use arrays.
Arrays are available and efficient in Lisp, Scheme, Java, C++,
Fortran, etc etc.

If you need variable-sized data structures, you should still use
arrays. Just double the size of the array when it becomes too small.
You double the array to make sure that amortized (average) cost of
insertions is constant. Even just after you've doubled the array and
so almost half the space is unused, you are not using any more space
than a linked list would take.
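The doubling strategy is easy to sketch in C (a minimal version, without error checking): the amortized argument is visible in vec_push, which copies only on the rare doublings.

```c
#include <assert.h>
#include <stdlib.h>

/* Growable array: capacity doubles when full, so n appends cost
 * O(n) element moves in total (1 + 2 + 4 + ... + n < 2n), i.e.
 * O(1) each on average.  Right after a doubling, the unused half
 * is still no more than the two words per element that a
 * cons-cell list spends. */
typedef struct { int *data; size_t len, cap; } vec;

static void vec_push(vec *v, int x) {
    if (v->len == v->cap) {
        v->cap = v->cap ? 2 * v->cap : 4;
        v->data = realloc(v->data, v->cap * sizeof *v->data);
    }
    v->data[v->len++] = x;
}
```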

If it is important to be able to insert/delete in the middle of
a sequence, you might consider using a list. But consider a
buffer-gap array first. Instead of leaving the unused space at
the end of the array, allow it to be in the middle, wherever
insertions/deletions are done. You need to "move the gap" (i.e.
copy things around in the data structure) when the gap is not where
you need to insert/delete, but most of the time insertions/deletions
will tend to be clustered (near each other).
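A buffer-gap array can be sketched in C like this (fixed capacity, for brevity): the free space travels to the edit point, so clustered insertions cost O(1) each and only distant edits pay for a copy.

```c
#include <assert.h>
#include <string.h>

#define CAP 64
/* Logical text = buf[0..gap_start) followed by buf[gap_end..CAP). */
typedef struct { char buf[CAP]; size_t gap_start, gap_end; } gapbuf;

static void gb_init(gapbuf *g) { g->gap_start = 0; g->gap_end = CAP; }

/* Slide the gap so it starts at logical position pos; cost is
 * proportional to the distance moved. */
static void gb_move_gap(gapbuf *g, size_t pos) {
    if (pos < g->gap_start) {
        size_t n = g->gap_start - pos;
        memmove(g->buf + g->gap_end - n, g->buf + pos, n);
        g->gap_start = pos; g->gap_end -= n;
    } else if (pos > g->gap_start) {
        size_t n = pos - g->gap_start;
        memmove(g->buf + g->gap_start, g->buf + g->gap_end, n);
        g->gap_start = pos; g->gap_end += n;
    }
}

static void gb_insert(gapbuf *g, size_t pos, char c) {
    gb_move_gap(g, pos);
    g->buf[g->gap_start++] = c;
}

/* Read the i-th logical character, skipping over the gap. */
static char gb_at(const gapbuf *g, size_t i) {
    return i < g->gap_start
        ? g->buf[i]
        : g->buf[i + (g->gap_end - g->gap_start)];
}
```

This is the classic text-editor buffer representation (Emacs buffers work this way).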

An argument could be made that lists are easy to use in functional
(side-effect-free) programs - just use recursion. But I don't think
that is a strong argument. You can write clean, efficient, side-effect-free
code using arrays: just look at APL and its successors. Modern
functional languages have "list comprehensions" - these can just as
easily be used for "array comprehensions".

I am talking about standard Lisp-style lists implemented using pairs.
There is a better case to be made for chaining objects together using
a link slot in more complex objects. At least in that case you're not
allocating a separate cons cell for each link.
--
--Per Bothner
p...@bothner.com http://www.bothner.com/~per/

Erik Naggum

Dec 9, 2000, 5:00:21 PM
* Per Bothner <p...@bothner.com>

| I have an even more efficient and radical idea: Don't use lists.

Oh, Christ! We've been there, done that, about a million times.

| They are almost always the wrong data structure. Use arrays. Arrays
| are available and efficient in Lisp, Scheme, Java, C++, Fortran, etc
| etc.

Yes, so they are. Available. Arrays are also extremely inefficient
where lists are not the wrong data structure. Hence, use lists _and_
arrays, as appropriate. Lisp programmers are not the idiots you think
they are, Per. They figure out when it makes sense to use arrays even
if you don't and need to reduce the problem to silliness.

| If it is important to be able to insert/delete in the middle of
| a sequence, you might consider using a list.

If your data structures nest, use lists.

If your data structures have unpredictable structure, use lists.

If you need to maintain uniformity of type for the "rest" after
processing an initial portion of the data structure, use lists. (If
you think this only applies to recursion, which it seems you do, you
really must think more carefully about algorithm design with arrays
and why displaced arrays or passing _pairs_ (sorry, data structures in
small arrays, of course) of the array and the index won't cut it.)

If you need to manipulate the structure of the data structure in any
way, not just insert/delete, use lists. (Flattening, pairing, etc.)

If after much headache and painfully idiotic programming mistakes, you
discover that your array implementation really must use two elements
per "array", one data and one "next", you have reinvented the cons.
Congratulations! Welcome to the wisdom of 40 years of language design!

| I am talking about standard Lisp-style lists implemented using pairs.
| There is a better case to be made for chaining objects together using
| link slot in a more complex objects. At least in that case you're not
| allocating a separate cons cell for each link.

It is just insane to add "link" slots to more complex objects when you
can separate and abstract out the container concept from the containee.

Who cares about allocating separate cons cells? Simple arrays have to
be more than four elements long to take less space than the equivalent
list using cons cells. Specialized arrays may have to be much longer.
The allocation for cons cells is extremely efficiently implemented in
all Lisp implementations. Using your stupid arrays-for-lists would of
course force some super-clever implementation of exponent-of-2-sized
arrays that double in size as they are used _and_ move around all the
time (destroying object identity unless you waste even _more_ space on
their already inefficient implementation), but people have been there,
many times over, and they have _returned_ to lists, implemented via
cons cells, when that has been the most intelligent solution.

What we have left after so many years of using lists _and_ arrays
_and_ structures _and_ clos objects _and_ databases, etc, is that
there is not a single argument left for using anything _but_ lists
where lists are used. If you want to teach novices to use arrays, at
least don't make the misguided Scheme-style argument that alternatives
to the obvious and the most natural must be chosen because it is somehow
more divine to use recursion, oops, I meant arrays.

You've done so much good work. How come you don't know simple things?

#:Erik
--
"When you are having a bad day and it seems like everybody is trying
to piss you off, remember that it takes 42 muscles to produce a
frown, but only 4 muscles to work the trigger of a good sniper rifle."
-- Unknown

Kai Henningsen

Dec 9, 2000, 10:14:00 AM
i...@jawssystems.com (Ian Kemmish) wrote on 07.12.00 in <BOQX5.399$3V2....@news.dircon.co.uk>:

> now) was undertaken with just this aim in mind, and did show a noticeable
> performance increase, despite having fewer unrolled loops etc.

Maybe not despite, but because of, fewer unrolled loops. For exactly the
cache-footprint reasons mentioned, I've heard people argue that these
days, unrolled loops are actually counterproductive.

Kai
--
http://www.westfalen.de/private/khms/
"... by God I *KNOW* what this network is for, and you can't have it."
- Russ Allbery (r...@stanford.edu)

t...@larva.flyingcroc.no.spam.please.net

Dec 10, 2000, 12:29:13 AM
>
> Re: Could CDR-coding be on the way back?

>
> From: Per Bothner <p...@bothner.com>
> Date: 09 Dec 2000 11:53:01 -0800
>
>"Frank A. Adrian" <fad...@uswest.net> writes:
>
>> So, again we see that the world simply re-invents Lisp but, more
>> importantly, one of the neatest hacks from the olden days might be
>> reasonable again. Opinions?
>
>I have an even more efficient and radical idea: Don't use lists.
>They are almost always the wrong data structure. Use arrays.
>Arrays are available and efficient in Lisp, Scheme, Java, C++,
>Fortran, etc etc.
>
>If you need variable-sized data structures, you should still use
>arrays. Just double the size of the array when it becomes too small.
>You double the array to make sure that amortized (average) cost of
>insertions is constant. Even just after you've doubled the array and
>so almost half the space is unused, you are not using any more space
>than a linked list would take.

You mean like this?:
http://www.andansbutt.org/ttk/extant
There's a statement about efficiency of doubling array sizes
in the documentation that could have almost been a quote from
your paragraph above :-)

-- TTK

--
"Live and Learn; Die and Learn Faster"
(motto of the Lynbrook Historical Simulations club)
Technical and RKBA documents at: http://www.ciar.org/~ttk/

Alberto Moreira

Dec 10, 2000, 10:24:40 AM
t...@larva.flyingcroc.NO.SPAM.PLEASE.net wrote:
>
> >
> > Re: Could CDR-coding be on the way back?
> >
> > From: Per Bothner <p...@bothner.com>
> > Date: 09 Dec 2000 11:53:01 -0800
> >
> >"Frank A. Adrian" <fad...@uswest.net> writes:

> >If you need variable-sized data structures, you should still use
> >arrays. Just double the size of the array when it becomes too small.
> >You double the array to make sure that amortized (average) cost of
> >insertions is constant. Even just after you've doubled the array and
> >so almost half the space is unused, you are not using any more space
> >than a linked list would take.
>
> You mean like this?:
> http://www.andansbutt.org/ttk/extant
> There's a statement about efficiency of doubling array sizes
> in the documentation that could have almost been a quote from
> your paragraph above :-)

Except for a few CPU-intensive applications, the flexibility of
lists more than compensates for the loss of speed. STL vectors, for
example, are implemented with this doubling strategy, but they
also provide the resize() member function: if you have a
2,000,000-element vector you're going to need 8Mb just to add
one element to it!

And who would sacrifice the elegance of object orientation and
the expedience and breadth of STL to program by hand in plain
vanilla C using arrays? Besides running benchmarks and the odd
go-for-broke scientific app, I see no reason why.

At some point in the past we evolved from writing in assembler
to writing in block-structured languages; we're in the process
of leaving BS for Object Orientation. Dynamic memory allocation
is of fundamental importance now, and we use it even when we
must give away some performance.

Alberto.

Jonathan Thornburg

Dec 10, 2000, 10:55:42 AM
In article <m2hf4dl...@kelso.bothner.com>,

Per Bothner <p...@bothner.com> wrote:
>I have an even more efficient and radical idea: Don't use lists.
>They are almost always the wrong data structure. Use arrays.
>Arrays are available and efficient in Lisp, Scheme, Java, C++,
>Fortran, etc etc.
>
>If you need variable-sized data structures, you should still use
>arrays. Just double the size of the array when it becomes too small.
>You double the array to make sure that amortized (average) cost of
>insertions is constant. Even just after you've doubled the array and
>so almost half the space is unused, you are not using any more space
>than a linked list would take.

An interesting case study is symbolic algebra systems.
Of the 4 major systems:
Reduce = based on lisp ==> uses lists
Macsyma = based on lisp ==> uses lists
Mathematica = ?? written in its own list-based programming language ??
Maple = written in its own array-based programming language

The 3 "M" systems all offer roughly comparable functionality (Reduce
is a bit behind). But Maple is generally regarded as a factor of 3-5
faster, and anywhere from a factor of 2 to a factor of 10 smaller in
memory usage than the other two.

Maple uses contiguous arrays, or trees whose nodes are contiguous arrays,
for all its data structures. The authors state that this is one reason
(though certainly not the only one) for their performance advantages.

--
-- Jonathan Thornburg <jth...@thp.univie.ac.at>
http://www.thp.univie.ac.at/~jthorn/home.html
Universitaet Wien (Vienna, Austria) / Institut fuer Theoretische Physik
Q: Only 7 countries have the death penalty for children. Which are they?
A: Congo, Iran, Nigeria, Pakistan[*], Saudi Arabia, United States, Yemen
[*] Pakistan moved to end this in July 2000. -- Amnesty International,
http://www.web.amnesty.org/ai.nsf/index/AMR511392000

Parag Patel

Dec 10, 2000, 5:45:53 PM
In article <3A33A038...@moreira.mv.com>, Alberto Moreira wrote:
>t...@larva.flyingcroc.NO.SPAM.PLEASE.net wrote:
>
> [...] if you have a

>2,000,000 element vector you're going to need 8Mb just to add
>one element to it!

I implemented a similar idea for dynamic arrays but with the addition of
a "bump-size" parameter. The idea is to double the size of the array
until it hits the bump-size, say 4Kb, and thereafter only grow the array
in multiples of bump-size.

Of course the downside is to grow an 8Mb array still requires *another*
8Mb plus 4Kb to grow, assuming that realloc() will be unable to simply
resize the original array. But the original 8Mb will be freed.

But note that if the data is stored in a singly-linked list, it still
eats up 16Mb of memory - 8Mb for the data plus another 8Mb for the
pointers.

For really large arrays such as the 8Mb example, a combined approach of
a linked-list of smaller (say 16Kb) arrays would seem to be a more
manageable solution, for a modest increase in code-size.
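That hybrid, sometimes called an unrolled list, can be sketched in C along these lines (chunk size illustrative): growth never copies existing data, and link overhead drops to one pointer per CHUNK elements.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK 4096  /* elements per chunk; pick to suit the cache */

/* Fixed-size chunks on a singly-linked chain. */
typedef struct chunk {
    struct chunk *next;
    size_t len;
    int data[CHUNK];
} chunk;

static chunk *chunk_new(void) {
    return calloc(1, sizeof(chunk));
}

/* Append to the chain, extending it when the tail chunk fills. */
static void chunk_push(chunk **tail, int x) {
    if ((*tail)->len == CHUNK) {
        (*tail)->next = chunk_new();
        *tail = (*tail)->next;
    }
    (*tail)->data[(*tail)->len++] = x;
}
```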

But I digress, and no doubt bore you to tears.


-- Parag Patel

Peter da Silva

Dec 10, 2000, 6:27:51 PM
In article <slrn9381t1...@pinhead.parag.codegen.com>,

Parag Patel <pa...@pinhead.parag.codegen.com> wrote:
> For really large arrays such as the 8Mb example, a combined approach of
> a linked-list of smaller (say 16Kb) arrays would seem to be a more
> manageable solution, for a modest increase in code-size.

Sounds like UNIX 'clists'.

--
`-_-' In hoc signo hack, Peter da Silva.
'U` "Milloin halasit viimeksi suttasi?"

Disclaimer: WWFD?

Alberto Moreira

Dec 10, 2000, 9:25:32 PM
Parag Patel wrote:

> For really large arrays such as the 8Mb example, a combined approach of
> a linked-list of smaller (say 16Kb) arrays would seem to be a more
> manageable solution, for a modest increase in code-size.

You know, Lisp lists aren't really lists but more like trees.
And a tree can be represented with an array, so if push comes to
shove it is probably possible to represent lists as arrays too,
with no pointers. But then, the memory allocation granularity
issue comes back.


Alberto.

Brian Inglis

Dec 10, 2000, 11:55:21 PM
On 09 Dec 2000 11:53:01 -0800, Per Bothner <p...@bothner.com>
wrote:

One technique I've used for threaded transactions with variable
numbers of entries of the same fixed length is to use dynamic
array sizing (with doubling) for allocation, with the list
entries as array entries. You can then use indices into the array
instead of pointers, reducing the allocation overhead from
thousands of small chunks of memory to a dozen or so
reallocations of a single chunk, plus a very simple custom
subpool allocator that never reuses memory until it can all be
freed as one chunk.
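Such an index-linked scheme might look like this in C (a minimal sketch; the names are mine): the cells live in one growable arena, links are indices rather than machine pointers, and teardown is a single free of the arena.

```c
#include <assert.h>
#include <stdlib.h>

#define NIL (-1)

/* A "cons cell" whose link is an index into the arena rather
 * than a machine pointer. */
typedef struct { int value; int next; } node;
typedef struct { node *arena; int len, cap; } pool;

/* Bump-allocate a cell, doubling the single backing chunk when
 * full; nothing is reused or freed until the whole pool goes. */
static int pool_cons(pool *p, int value, int next) {
    if (p->len == p->cap) {
        p->cap = p->cap ? 2 * p->cap : 16;
        p->arena = realloc(p->arena, p->cap * sizeof *p->arena);
    }
    p->arena[p->len].value = value;
    p->arena[p->len].next = next;
    return p->len++;
}
```

Indices also survive the realloc moves, which raw pointers into the arena would not.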

Of course, it's always better if you can compute an upper bound
on all the memory that may be required at the very start, and
allocate it all at one time, but that can only be done where you
know all the parameter values required to size the space at the
start of fairly simple transactions.

Thanks. Take care, Brian Inglis Calgary, Alberta, Canada
--
Brian_...@CSi.com (Brian dot Inglis at SystematicSw dot ab dot ca)
use address above to reply

Terje Mathisen

Dec 11, 2000, 2:36:59 AM
Peter da Silva wrote:
>
> In article <slrn9381t1...@pinhead.parag.codegen.com>,
> Parag Patel <pa...@pinhead.parag.codegen.com> wrote:
> > For really large arrays such as the 8Mb example, a combined approach of
> > a linked-list of smaller (say 16Kb) arrays would seem to be a more
> > manageable solution, for a modest increase in code-size.
>
> Sounds like UNIX 'clists'.

The fun part is that this is exactly the approach which x86 forced us to
use back in the 16-bit days:

A list (or array) of arrays, with each subarray less than 64K in size.

I usually set the number of elements in each subarray such as to make it
between 32 and 64K in total size, and still a power of two, to simplify
addressing.

Since all accesses within the same subarray would be to the same
segment, the overhead of segment reloads could often/mostly be avoided,
at least for (semi-)sequential accesses.

Terje

--
- <Terje.M...@hda.hydro.com>
Using self-discipline, see http://www.eiffel.com/discipline
"almost all programming can be viewed as an exercise in caching"

Duane Rettig

Dec 11, 2000, 2:23:10 PM
chu...@best.com (Chuck Fry) writes:

> And with 64 bits to play with, who's to say we can't spare a couple for
> CDR-coding?

Let's not make this mistake again. When IBM (and Amdahl) created their XA
architectures for the 370 compatibles (extending the address range from 24
bits to 31 bits), several programs suddenly broke. These programs, which
used the upper 8 bits of address words for tags and flags, assumed (correctly,
but without foresight) that these bits were available for use. However,
whether these programs were justified in the original designs or not, it
would be safe for me to venture a guess that _none_ of them exist in their
previous design form, and that all have been either rewritten or scrapped.

But now, we're dealing with a different issue - one that makes it even harder
to justify using more address bits: 64-bit architectures tend to define their
entire virtual space. For example, the PA-RISC with the W bit set defines
text segments starting at 0x4000000000000000, and data segments starting at
0x8000000000000000. Which bits would you steal? Some bits in the middle?

Of course, lisps on GP hardware already use bits within the address space for
tags, but they tend to be only the least significant bits. There is indeed
a cost to this usage of bits, and that is the granularity of lisp object
addressing, which becomes increased to a two-word "allocation-unit" boundary
in the major lisps, which is the minimum size needed for a cons cell. This
increased granularity presents no problem for lisp implementations, and it
implies 3 bits of tag for 32-bit lisps, and 4 bits of tag for 64-bit lisps.
This means, all other things being equal, that there is only one extra bit
left for a decision to add cdr-coding. I don't think that one bit is enough;
I believe that you need two bits.

> In article <4k89cn...@beta.franz.com>,
> Duane Rettig <du...@franz.com> wrote:
> >Bruce Hoult <br...@hoult.org> writes:
> >> You now have a total of four bits stolen from the pointer
> >> representation, but as a result you might be able to save quite a lot of
> >> memory for contiguously-allocated lists. The cost in extra instructions
> >> to check for special-cases is quite high, especially when such a list is
> >> modified by inserting or deleting elements in the middle.
> >
> >Note that one such test that becomes nontrivial on General Purpose
> >hardware (i.e. not lisp machines) is EQ. Instead of one instruction,
> >those extra bits must be masked off before the comparison. I don't
> >know if anyone has ever considered placing those extra bits "elsewhere",
> >e.g. in a parallel location to the actual pointer/tag word.
>
> Getting back to the original point of this thread, given superscalar
> CPUs running in the GHz region, register-to-register or
> immediate-to-register instructions are cheap (as long as they're in the
> I-cache), and dereferencing pointers is *extremely* expensive (if the
> targets are not in the D-cache).

I believe that EQ testing does speak directly to the point of cdr-coding;
if you measure the number of times CDR is taken vs the number of times
EQ is performed in a typical program, you find that EQ usage overwhelms
CDR usage. Thus, there is a strong impetus to optimize EQ testing
in preference to CDR operations.

Remember that any boolean test, e.g. "(when x ...)", is equivalent to
"(unless (eq x nil) ...)", just as in C "if (x) {...}" is equivalent to
"if (x != 0) {...}", so there are many more EQ tests than meet the
eye in lisp code. If you use a bit which changes the value
of a pointer without changing its object's identity, then tests for
identity can no longer be performed in one instruction, and lisp then
loses to other languages...

> Duane, I figure you would know this as well as anyone, so I have to ask:
> What is the true cost of doing this masking compared to that of chasing
> pointers into uncached RAM on a couple of well-known architectures,
> e.g. x86, SPARC, or PowerPC? If it's done all the time I imagine the
> instruction fetches would be damned near free.

Well, obviously bit manipulation is faster than memory access, but why
are you asking this question? What do you have in mind?

And why uncached RAM, as opposed to cached data? If you're talking
about normal cons-cell lists vs cdr coding, shouldn't we assume a garbage
collector that can/does tend to keep list structures local with respect
to each other?

Lieven Marchand

Dec 11, 2000, 3:57:05 PM

Duane Rettig <du...@franz.com> writes:

> Let's not make this mistake again. When IBM (and Amdahl) created their XA
> architectures for the 370 compatibles (extending the address range from 24
> bits to 32 bits), several programs suddenly broke. These programs, which
> used the upper 8 bits of address words for tags and flags, assumed (correctly,
> but without foresight) that these bits were available for use. However,
> whether these programs were justified in the original designs or not, it
> would be safe for me to venture a guess that _none_ of them exist in their
> previous design form, and that all have been either rewritten or scrapped.

Thanks for reminding me. Most of my stuff broke. But the mess was even
worse: they had to put in mode bits to allow older programs to run
unchanged. In a system like VM you ended up with two mode bits (EC/BC
and XA or something like that? I've forgotten most of this stuff and I
can't find my yellow booklet) and programs that only ran with one
combination of them. It was very nice to have eight bits to work with
in each pointer. Perhaps architecture designers should reconsider
giving user programs some bits in the address space.

--
Lieven Marchand <m...@bewoner.dma.be>
Lambda calculus - Call us a mad club

Duane Rettig

Dec 11, 2000, 6:57:52 PM

Lieven Marchand <m...@bewoner.dma.be> writes:

> It was very nice to have eight bits to work with
> in each pointer. Perhaps architecture designers should reconsider
> giving user programs some bits in the address space.

OK, let's play this request out: suppose that general purpose computer
manufacturer A decides to put in two extra bits besides the address, so
that instead of 64 bits of address, there are now 64 bits plus two extra
for tags/flags (two bits, eight bits, the story is still the same). Lisp
vendor L loves it, and designs some cdr-coding (or whatever else it wants
to do).

Now, in order to use those bits, they must be available per word, so
they must be included in the data path and memory. Thus, each data
word now has 64 bits for regular data plus these two extra bits for
L to use.

Now, C vendor (C, of course) and Java vendor J are getting on the
bandwagon of the newly popular A machines, but they notice that they
are not getting all they could be getting out of the architecture;
they notice that there are 66 bits of address width, 66 bits in the
data path, and a disconnect between the data and address in two of
the bits, and thus the address space is 1/4 the size that it could
be if those bits could be made use of. So C and J complain to A.

The A developers then set to work on the new A' (or should we call
it XA? :-) architecture, which includes full functionality for the 65th
and 66th bits; it comes out and leaves L to reimplement its lisp in more
general terms or else be lost in the dust.

The moral of this is that whatever we ask of hardware vendors had better
be general-purpose enough to be useful in any language, or else it will
eventually fail.

hack

Dec 11, 2000, 7:13:38 PM

In article <4aea2i...@beta.franz.com>,

Duane Rettig <du...@franz.com> wrote:
>chu...@best.com (Chuck Fry) writes:
>
>> And with 64 bits to play with, who's to say we can't spare a couple for
>> CDR-coding?
>
>Let's not make this mistake again. When IBM (and Amdahl) created their XA
>architectures for the 370 compatibles (extending the address range from 24
>bits to 32 bits), several programs suddenly broke. These programs, which
>used the upper 8 bits of address words for tags and flags, assumed (correctly,
>but without foresight) that these bits were available for use. However,
>whether these programs were justified in the original designs or not, it
>would be safe for me to venture a guess that _none_ of them exist in their
>previous design form, and that all have been either rewritten or scrapped.
>[snip]

>--
>Duane Rettig Franz Inc. http://www.franz.com/ (www)
>1995 University Ave Suite 275 Berkeley, CA 94704
>Phone: (510) 548-3600; FAX: (510) 548-8253 du...@Franz.COM (internet)

A few of those 24-bit programs still exist, and they still run! Only those
programs that wanted to exploit the wonders of 31-bit mode were exposed, and
indeed those who coded without foresight were doomed to continue to play in
their little 24-bit corner. But they didn't break -- IBM takes backwards
compatibility seriously (for user-level programs, that is). They simply
couldn't automatically exploit new features.

There was the following problem, however. For the most part 31-bit mode
was not very different, so some programmers were lulled into a false
sense of security and thought they could simply run in 31-bit mode.
Those are the programs that broke when subtle differences became fatal.

Technically, the architecture supports both 24-bit (almost 360-compatible)
and 31-bit mode, with user-level instructions to switch mode. The default
settings for loaders are such that programs which don't declare any special
properties will run in 24-bit mode, so those dusty old object decks will
still run. Modern compilers (and assembler programmers) flag their code
as 31-bit, of course.

The newly-announced z-series (64-bit S/390) adds a third mode (24-31-64),
and old 24-bit programs are *still* supported. Current 31-bit programs
will run just fine on z-machines. Those that know better can use new
instructions to access the full 64-bit registers, or enter 64-bit mode
and explore as large an address space as the control program will give
them; those that don't will not gain from the new capabilities, but they
will not lose anything either. THAT's compatibility.

Michel.

Joe Marshall

Dec 11, 2000, 7:30:50 PM

Duane Rettig <du...@franz.com> writes:

>
> OK, let's play this request out: suppose that general purpose computer

> manufacturer A decides to put in two extra bits besides the address....

[long description elided]

> The moral of this is that whatever we ask of hardware vendors had better
> be general-purpose enough to be useful in any language, or else it will
> eventually fail.

I don't buy this line of reasoning. There are far too many `unusual'
machines that are commercially viable for this to be the case (think
about some of the IBM architectures).

Even `standard' computer architectures have their quirks: In many
processors there are constraints on the use of the topmost address
bits, and in many standard architectures there are extra bits on each
word for ECC (all that wasted RAM).


Of course, as it stands right now, new hardware that doesn't
faithfully mimic the Pentium instruction set at better than about
500MHz would be at an extreme disadvantage.

Duane Rettig

Dec 11, 2000, 8:05:46 PM

ha...@watson.ibm.com (hack) writes:

> In article <4aea2i...@beta.franz.com>,
> Duane Rettig <du...@franz.com> wrote:
> >chu...@best.com (Chuck Fry) writes:
> >
> >> And with 64 bits to play with, who's to say we can't spare a couple for
> >> CDR-coding?
> >
> >Let's not make this mistake again. When IBM (and Amdahl) created their XA
> >architectures for the 370 compatibles (extending the address range from 24
> >bits to 32 bits), several programs suddenly broke. These programs, which
> >used the upper 8 bits of address words for tags and flags, assumed (correctly,
> >but without foresight) that these bits were available for use. However,
> >whether these programs were justified in the original designs or not, it
> >would be safe for me to venture a guess that _none_ of them exist in their
> >previous design form, and that all have been either rewritten or scrapped.
> >[snip]
>

> A few of those 24-bit programs still exist, and they still run! Only those
> programs that wanted to exploit the wonders of 31-bit mode were exposed, and
> indeed those who coded without foresight were doomed to continue to play in
> their little 24-bit corner. But they didn't break -- IBM takes backwards
> compatibility seriously (for user-level programs, that is). They simply
> couldn't automatically exploit new features.

I stand corrected, if there are 24-bit programs out there. (and I also
stand corrected for talking about 32 bits instead of 31 bits - forgot
about that little difference :-) I am also all for backward
compatibility. However, I can say with certainty (almost by definition)
that the only reason why those 24-bit programs could stay 24-bit was
because they were not _forced_ to go to 31 bits. This may seem like
circular reasoning, but it sets the stage for pointing out
something you missed below:

> There was the following problem, however. For the most part 31-bit mode
> was not very different, so some programmers were lulled into a false
> sense of security and thought they could simply run in 31-bit mode.
> Those are the programs that broke when subtle differences became fatal.

What you call a subtle difference is certainly not so subtle. One
criticism that Lisp as a language has had to bear is its tendency to
be large (though nowadays it tends to be one of the smaller
languages, comparatively). However, the very fact that lisp is able
to handle large problem spaces well has always allowed it to be used
for increasingly larger projects. Currently, we have customers,
especially in the CAD realm, whose data spaces push beyond the normal
2 to 3 Gb address spaces allowed by 32 bit systems, thus causing a
need to move toward usage of 64-bit addressing. Certainly it was
long ago that these problem sets broke through the now seemingly
pitiful virtual address range of 16 Mb, which is what 24 bits
of addressing provides.

So it was not necessarily a false sense of security in an optional
move, but instead a non-scalable design in an ever-growing need for
space, which caused the need for at least some programs to move to
31-bit mode, and which thus broke. I suspect that there were at
least some programs besides Lisps which broke in this way, but I
don't myself know that.

Duane Rettig

Dec 11, 2000, 8:26:08 PM

Joe Marshall <j...@content-integrity.com> writes:

> Duane Rettig <du...@franz.com> writes:
>
> >
> > OK, let's play this request out: suppose that general purpose computer
> > manufacturer A decides to put in two extra bits besides the address....
>
> [long description elided]
>
> > The moral of this is that whatever we ask of hardware vendors had better
> > be general-purpose enough to be useful in any language, or else it will
> > eventually fail.
>
> I don't buy this line of reasoning. There are far too many `unusual'
> machines that are commercially viable for this to be the case (think
> about some of the IBM architectures).

Which machines, and how so `unusual'? Are you really still talking
about general-purpose machines?

> Even `standard' computer architectures have their quirks: In many
> processors there are constraints on the use of the topmost address
> bits, and in many standard architectures there are extra bits on each
> word for ECC (all that wasted RAM).

True enough, but ECC and parity tend to be transparent to the user
level programmer. They aren't "seen".

As for segmented architectures, even HP's PA-RISC, which forced the
assembly programmer to load segment registers and perform external
branches in its 32-bit architecture, got its instruction set fixed
to allow continuous addressing in the 64-bit architecture.
Now it is also transparent to the user-level programmer.

> Of course, as it stands right now, new hardware that doesn't
> faithfully mimic the Pentium instruction set at better than about
> 500MHz would be at an extreme disadvantage.

Well, as a software vendor to many different architectures, I disagree,
but that is really another discussion.

Stephen Fuld

Dec 11, 2000, 11:40:53 PM

"Alberto Moreira" <alb...@moreira.mv.com> wrote in message
news:3A343B1C...@moreira.mv.com...

> Parag Patel wrote:
>
> > For really large arrays such as the 8Mb example, a combined approach of
> > a linked-list of smaller (say 16Kb) arrays would seem to be a more
> > manageable solution, for a modest increase in code-size.
>
> You know, Lisp lists aren't really lists but more like trees.
> And a tree can be represented with an array, so if push comes to
> shove it is probably possible to represent lists as arrays too,
> and no pointers.


But if push comes to pop, then it is probably a stack :-) (I'm sorry,
I couldn't resist. I'll go back in my hole now.)

--
- Stephen Fuld

Stephen Fuld

Dec 11, 2000, 11:40:54 PM


"Duane Rettig" <du...@franz.com> wrote in message
news:4aea2i...@beta.franz.com...

> chu...@best.com (Chuck Fry) writes:
>
> > And with 64 bits to play with, who's to say we can't spare a couple for
> > CDR-coding?
>
> Let's not make this mistake again. When IBM (and Amdahl) created their XA
> architectures for the 370 compatibles (extending the address range from 24
> bits to 32 bits), several programs suddenly broke. These programs, which
> used the upper 8 bits of address words for tags and flags, assumed
(correctly,
> but without foresight) that these bits were available for use. However,
> whether these programs were justified in the original designs or not, it
> would be safe for me to venture a guess that _none_ of them exist in their
> previous design form, and that all have been either rewritten or scrapped.

Unfortunately, your guess would be wrong! In MVS and its successors, there
is still the concept of below/above the line, where the line is the address
(16MB) that fits within 24 bits. Some code and data structures must still
run "below the line". It is truly amazing how long a bad decision can
haunt one :-(.

--
- Stephen Fuld

Snip

Terje Mathisen

Dec 12, 2000, 4:11:32 AM

Duane Rettig wrote:
[huge snip, lots of interesting stuff about CDR vs EQ]

> I believe that EQ testing does speak directly to the point of cdr-coding;
> if you measure the number of times CDR is taken vs the number of times
> EQ is performed in a typical program, you find that EQ usage overwhelms
> CDR usage. Thus, there is a strong impetus to optimize EQ testing
> in preference to CDR operations.
>
> Remember that any boolean test, e.g. "(when x ...)" is equivalent to
> "(unless (eq x nil) ...)", in a similar manner that in C "if (x) {...}"
> is equivalent to "if (x!=0) {...}", so there are many more EQ tests than
> meets the eye in lisp code. If you use a bit which changes the value
> of a pointer without changing its object's identity, then tests for
> identity can no longer be performed in one instruction, and lisp then
> loses to other languages...

At least on x86, this is much less costly than it would seem:

if (x & SIGNIFICANT_BITS)

takes exactly (modulo slightly increased opcode size) the same time as

if (x)

The relevant asm code would be:

test eax,eax
jnz skip

vs

test eax,SIGNIFICANT_BITS ; 32-bit immediate
jnz skip

On other architectures you would probably (semi-)permanently allocate a
register to contain that 'SIGNIFICANT_BITS' mask, which does carry some
cost.

Duane Rettig

Dec 12, 2000, 5:23:21 AM

Terje Mathisen <terje.m...@hda.hydro.com> writes:

> Duane Rettig wrote:
> [huge snip, lots of interesting stuff about CDR vs EQ]
> > I believe that EQ testing does speak directly to the point of cdr-coding;
> > if you measure the number of times CDR is taken vs the number of times
> > EQ is performed in a typical program, you find that EQ usage overwhelms
> > CDR usage. Thus, there is a strong impetus to optimize EQ testing
> > in preference to CDR operations.
> >
> > Remember that any boolean test, e.g. "(when x ...)" is equivalent to
> > "(unless (eq x nil) ...)", in a similar manner that in C "if (x) {...}"
> > is equivalent to "if (x!=0) {...}", so there are many more EQ tests than
> > meets the eye in lisp code. If you use a bit which changes the value
> > of a pointer without changing its object's identity, then tests for
> > identity can no longer be performed in one instruction, and lisp then
> > loses to other languages...
>
> At least on x86, this is much less costly than it would seem:
>
> if (x & SIGNIFICANT_BITS)
>
> takes exactly (modulo slightly increased opcode size) the same time as

Not a real issue, but can't resist this: 5 bytes instead of 2 bytes
is slight? :-)

> if (x)
>
> The relevant asm code would be:
>
> test eax,eax
> jnz skip
>
> vs
>
> test eax,SIGNIFICANT_BITS ; 32-bit immediate
> jnz skip

This magic works only when you are testing against integer 0. C
happens to pun zero as the null or false value, but in Common Lisp,
zero is a non-null integer and NIL is the false value (Scheme is even
more "pure", with nil being the empty list and a special false value,
both distinct from 0).

So such a test would necessarily require mask and compare instructions,
and could not use test.

Our lisp implementation keeps NIL in edi, and tests against it with
a long compare instruction, which is definitely faster than a mask
followed by a compare.

My second paragraph was intended as a reminder to lisp programmers
in case they forget how many times they do EQ comparisons. However,
EQ is a general comparison test, and is the most primitive operation
in lisp. It is, in effect, a single instruction to compare
two longwords (besides, of course, the jump instruction, although some
architectures have compare-and-jump instructions, thus eliding the final
jump at the end). It is the general comparison "instruction" that would
have to be enhanced to first mask off any bits that are "noise" with
respect to what lispers know to be two identical lisp objects. It is
this enhancement required by cdr-coding that I object to.

Terje Mathisen

Dec 12, 2000, 8:33:39 AM

Duane Rettig wrote:

>
> Terje Mathisen <terje.m...@hda.hydro.com> writes:
> > At least on x86, this is much less costly than it would seem:
> >
> > if (x & SIGNIFICANT_BITS)
> >
> > takes exactly (modulo slightly increased opcode size) the same time as
>
> Not a real issue, but can't resist this: 5 bytes instead of 2 bytes
> is slight? :-)

Yes, as long as this isn't the only instruction used. I.e. even if 10%
of all code were tests for C NULL values, it would only increase the
total code size by 7.5%.

> > test eax,SIGNIFICANT_BITS ; 32-bit immediate
> > jnz skip
>
> This magic works only when you are testing against integer 0. C
> happens to pun zero as the null or false value, but in Common Lisp,
> zero is a non-null integer and NIL is the false value (Scheme is even
> more "pure", with nil being the empty list and a special false value,
> both distinct from 0).
>
> So such a test would necessarily require mask and compare instructions,
> and could not use test.

OK, this is much more serious!

John F Carr

Dec 12, 2000, 8:53:04 AM

In article <4aea2i...@beta.franz.com>,

Duane Rettig <du...@franz.com> wrote:
>chu...@best.com (Chuck Fry) writes:
>
>> And with 64 bits to play with, who's to say we can't spare a couple for
>> CDR-coding?
>
>Let's not make this mistake again. When IBM (and Amdahl) created their XA
>architectures for the 370 compatibles (extending the address range from 24
>bits to 32 bits), several programs suddenly broke.

Can anyone come up with a plausible program in the next two decades, to
run on conventional architectures, that will require an address space
between 60 and 64 bits? If you say "let's map everything into a flat
space" you will need more than 64 bits. If you stick to conventional
addressing, including memory mapping individual objects, you use a lot
less than 64 bits.

--
John Carr (j...@mit.edu)

Jan Ingvoldstad

Dec 12, 2000, 9:51:20 AM

On 12 Dec 2000 13:53:04 GMT, j...@mit.edu (John F Carr) said:

> Can anyone come up with a plausible program in the next two decades, to
> run on conventional architectures, that will require an address space
> between 60 and 64 bits? If you say "let's map everything into a flat
> space" you will need more than 64 bits. If you stick to conventional
> addressing, including memory mapping individual objects, you use a lot
> less than 64 bits.


A 64-bit address space holds:

18 446 744 073 709 551 616 bytes
        17 179 869 184 gigabytes

Today, a "full" Usenet feed is around 250 GB/day, and the yearly
growth is around 200%. With two decades of continued growth at the
same rate, that'll be 250*3^20 gigabytes, or:

871 696 100 250 gigabytes

So the answer is, yes, there is a plausible program that will require
far more, if it is expected to use a system with flat addressing.

Since EMC delivers systems with storage capacity well into the
terabyte range today (enough for storing a full Usenet feed for well
over a week, perhaps as much as a month, I haven't checked their
products lately), it doesn't seem unlikely that they will have
something that can store more than 17 billion gigabytes (or 16 exabytes)
in 2020. It isn't as if it's _that_ much data. Perhaps, if you think
of it on a block level, and let each block be 16 KB, then you'll sort
of postpone the necessity of moving beyond 64 bit for a while.


Assumptions:

- Usenet continues to grow exponentially at approximately the same
rate (the growth rate has been slightly more than exponential for
the past 7 years or so, but was higher again before 1993)
- Some people will still want to store all of Usenet for more than
one day (i.e. a week or so)
- Storage technology can keep up with this growth
- Network technology can keep up
- Other computer technology can keep up

Sources:

- Comparative studies in my own ongoing thesis on Usenet
distribution
- Henry E. Hardy's "The Usenet System", ITCA Yearbook, 1993
- Jeremy Nixon's old news statistics on SuperNews (now unavailable)
- http://newsfeed.mesh.ad.jp/flow/

--
In the beginning was the Bit, and the Bit was Zero. Then Someone
said, Let there be One, and there was One. And Someone blessed them,
and Someone said unto them, Be fruitful, and multiply, and replenish
the Word and subdue it: and have dominion over every thing that is.

Peter da Silva

Dec 12, 2000, 10:10:15 AM

In article <4aea2i...@beta.franz.com>,

Duane Rettig <du...@franz.com> wrote:
> Let's not make this mistake again. When IBM (and Amdahl) created their XA
> architectures for the 370 compatibles (extending the address range from 24
> bits to 32 bits), several programs suddenly broke.

And Microsoft made the same mistake with their Basic, *twice* on the Macintosh
and then *again* on the Amiga.

John F Carr

Dec 12, 2000, 11:21:12 AM

In article <oas8zpl...@skidbladnir.ifi.uio.no>,

Jan Ingvoldstad <ja...@ifi.uio.no> wrote:
>On 12 Dec 2000 13:53:04 GMT, j...@mit.edu (John F Carr) said:
>
>> Can anyone come up with a plausible program in the next two decades, to
>> run on conventional architectures, that will require an address space
>> between 60 and 64 bits? If you say "let's map everything into a flat
>> space" you will need more than 64 bits. If you stick to conventional
>> addressing, including memory mapping individual objects, you use a lot
>> less than 64 bits.
>
>
>18 446 744 073 709 551 616 bits
> 2 305 843 009 213 693 952 bytes
> 2 147 483 648 gigabytes
>
>Today, a "full" Usenet feed is around 250 GB/day, and the yearly
>growth is around 200%. With two decades of continued growth at the
>same rate, that'll be 250*3^20 gigabytes, or:
>
> 871 696 100 250 gigabytes
>
>So the answer is, yes, there is a plausible program that will require
>far more, if it is expected to use a system with flat addressing.

I don't believe usenet will grow a million times over the
next 20 years or be mapped into a flat address space.
There's a limit to the amount of data a human can read or
write, in any medium.

When the company I worked for was bought the new owners told us
to expect their stock to continue growing at the rate of 80% per
year for many years to come. A few moments of thought made it
clear that the stock value would exceed the US GNP, or even the
world GNP, in only a few years. (We've actually lost about 30%
over 7 months.)

--
John Carr (j...@mit.edu)

Jan Ingvoldstad

Dec 12, 2000, 11:53:45 AM

On 12 Dec 2000 16:21:12 GMT, j...@mit.edu (John F Carr) said:

> I don't believe usenet will grow a million times over the
> next 20 years or be mapped into a flat address space.
> There's a limit to the amount of data a human can read or
> write, in any medium.

You clearly don't know what Usenet is being used for these days. If
you thought it was plain text articles like those we're blessed with
seeing in comp.arch, you're dead wrong.

Have a looksie at alt.binaries, and come back when you've uploaded a
handful of CD images, like so many others seem to do.

Then watch out for those DVD images.

In the future, expect other storage formats to be uuencoded or
base64-coded just as successfully as today's formats are.


> When the company I worked for was bought the new owners told us
> to expect their stock to continue growing at the rate of 80% per
> year for many years to come. A few moments of thought made it
> clear that the stock value would exceed the US GNP, or even the
> world GNP, in only a few years. (We've actually lost about 30%
> over 7 months.)

This isn't stock. Usenet feeds will keep on growing until technology
can't let it grow anymore; the users will see to that.

(BTW, the text-only feed for Usenet is probably far less than 5 GB
today, and it isn't growing nearly as fast as the full feed.)

Anne & Lynn Wheeler

Dec 12, 2000, 11:58:31 AM

Jan Ingvoldstad <ja...@ifi.uio.no> writes:

> Today, a "full" Usenet feed is around 250 GB/day, and the yearly
> growth is around 200%. With two decades of continued growth at the
> same rate, that'll be 250*3^20 gigabytes, or:
>
> 871 696 100 250 gigabytes

lets say that there is currently 6billion people ... then 250gb/day
amounts to about 40bytes/person/day. say in 20 years there are twice
the number of people or 12 billion people on the planet ... then the
871 696 100 250 gigabytes ... works out to around 70 gigabytes
per person/day.

already the majority of the bytes are not text ... but binary of one
form or another. lets say that text is only 10% of the total ... and
something is worked out to handle the binary in some other way. That
would still say in 20 years that every person on the planet would have
to be generating around 7 billion text characters per day; say 6
characters per word ... around 1.2 billion words/day ... say 300
words/page ... around 4 million pages/person/day.

majority of the usenet growth has been binary with some increase due
to more people involved. however, once every person in the world is
spending all their time only generating material for usenet ... i
would think that the growth would start to level off.

--
Anne & Lynn Wheeler | ly...@garlic.com - http://www.garlic.com/~lynn/

hrum...@cc.hut.fi

Dec 12, 2000, 1:02:12 PM

jth...@mach.thp.univie.ac.at (Jonathan Thornburg) writes:

> Mathematica = ?? written in its own list-based programming language ??

IIRC what Mathematica calls lists are really arrays. I would expect its
(lack of) performance to be more related to the weird evaluation model.

Hannu Rummukainen

Duane Rettig

Dec 12, 2000, 12:47:59 PM


It won't take two decades.

This kind of thinking comes dangerously close to the "it'll never happen"
mantra that intelligent programmers and engineers have been (incorrectly)
reciting for at least the last 6 decades.

The mantra comes in two forms: "it can't be done", and "you'll never
use it all".

The first form is probably the most familiar, and usually occurs when
engineers see a wall in the technology they work with, past which the
technology doesn't work. It happened with vacuum tubes, magnetic core
memory, transistors, and even various forms of integrated circuits.
It has also happened in the mechanical world; I am amazed myself to see
gears and motors only a few molecules across. Are we at the limit there?
I wouldn't count on it...

The second form is more fun, because instead of picturing limits on
one's technology, we are picturing limits on one's own imagination.
My favorite example was in college, when I was in the university
computer lab, and talking to the resident guru about wanting to wait
to buy my own computer because I wanted one of those new machines with
64 kbytes of memory. His response was "what do you want with one of
those? You'll never use all that memory". Did I know myself precisely
what I would do with all that memory? No, but I knew I would find a
use...

Duane Rettig

Dec 12, 2000, 12:57:42 PM

Anne & Lynn Wheeler <ly...@garlic.com> writes:

> Jan Ingvoldstad <ja...@ifi.uio.no> writes:
>
> > Today, a "full" Usenet feed is around 250 GB/day, and the yearly
> > growth is around 200%. With two decades of continued growth at the
> > same rate, that'll be 250*3^20 gigabytes, or:
> >
> > 871 696 100 250 gigabytes
>
> lets say that there is currently 6billion people ... then 250gb/day
> amounts to about 40bytes/person/day. say in 20 years there are twice
> the number of people or 12 billion people on the planet ... then the
> 871 696 100 250 gigabytes ... works out to around 70 gigabytes
> per person/day.
>
> already the majority of the bytes are not text ... but binary of one
> form or another. lets say that text is only 10% of the total ... and

==============================================^^


> something is worked out to handle the binary in some other way. That
> would still say in 20 years that every person on the planet would have
> to be generating around 7 billion text characters per day; say 6
> characters per word ... around 1.2 billion words/day ... say 300
> words/page ... around 4 million pages/person/day.

Why 10%? Why not 1%, or .00001%? How many characters in a picture?
Or a movie, or a symphony, or...

My boss' wife just had a baby. Almost immediately, he had four pictures
available for viewing. Very few words, the pictures said it all.
Much less than 10% text, I would say.

> majority of the usenet growth has been binary with some increase due
> to more people involved. however, once every person in the world is
> spending all their time only generating material for usenet ... i
> would think that the growth would start to level off.

The assumption that people will spend all of their time generating
material for usenet is based on a presumption that the rate of
material generation itself will remain constant. A dubious
presumption, I think.

John F Carr

Dec 12, 2000, 1:52:50 PM
In article <oasr93d...@skidbladnir.ifi.uio.no>,

Jan Ingvoldstad <ja...@ifi.uio.no> wrote:
>On 12 Dec 2000 16:21:12 GMT, j...@mit.edu (John F Carr) said:
>
>> I don't believe usenet will grow a million times over the
>> next 20 years or be mapped into a flat address space.
>> There's a limit to the amount of data a human can read or
>> write, in any medium.
>
>You clearly don't know what Usenet is being used for these days. If
>you thought it was plain text articles like those we're blessed with
>seeing in comp.arch, you're dead wrong.

There is a limit regardless of medium -- even video can't usefully go
over tens of megabytes per second. But let's assume that usenet will
grow 2x per year forever, and the world will come to an end if a day of
usenet can't fit into memory. (Or a week, month, or all of time.)

Continued exponential growth actually makes my case stronger. Because
no number of bits will be adequate in the long term, reducing the
effective address space by 1 bit only advances the transition to a
bigger address by one doubling period.

--
John Carr (j...@mit.edu)

Joe Marshall

Dec 12, 2000, 2:27:20 PM
j...@mit.edu (John F Carr) writes:

> In article <oasr93d...@skidbladnir.ifi.uio.no>,
> Jan Ingvoldstad <ja...@ifi.uio.no> wrote:
> >On 12 Dec 2000 16:21:12 GMT, j...@mit.edu (John F Carr) said:
> >
> >> I don't believe usenet will grow a million times over the
> >> next 20 years or be mapped into a flat address space.
> >> There's a limit to the amount of data a human can read or
> >> write, in any medium.

You are making the unwarranted assumption that address space is only
used for addressing memory. There are a number of good reasons that
you might want 64 address bits, but not use (expt 2 64) bytes.

Paul Wallich

Dec 12, 2000, 2:45:18 PM
In article <3a367402$0$29...@senator-bedfellow.mit.edu>, j...@mit.edu (John
F Carr) wrote:

Meanwhile, there's another limit that might be harder to deal with:
250 GB/day is about 23 megabits/sec average, i.e. half a t-3 for the
news backbone. The expanded version of usenet will require about
80 petabits/sec for the backbone; to give you a rough idea of what
that means, using straightforward modulation techniques that would
be well over a hundred signal transitions per cycle of yellow light.
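The growth arithmetic is easy to recompute; this Python sketch (an illustration, not part of the original thread) rederives both rates from the 250 GB/day feed and the 3x-per-year growth figure quoted above:

```python
# Recompute the backbone rates: 250 GB/day today, tripling
# (200% yearly growth) each year for 20 years.
GB = 10**9                                   # bytes
today_bytes = 250 * GB
today_bps = today_bytes * 8 / 86400          # average bits/sec over a day
print(today_bps / 1e6)                       # ~23 megabits/sec

future_bytes = today_bytes * 3**20           # ~8.7e11 GB/day
future_bps = future_bytes * 8 / 86400
print(future_bps)                            # ~8.1e16 bits/sec
```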

I expect that doesn't make it impossible; instead, if anything, it
makes it likely that Usenet-N will be some kind of horrific distributed
database that absolutely requires a flat address space to work...

I was rather struck by someone else's comparison to ECC, which is
transparent to user-level programs. What kind of smart memory
architecture would you want so that CDR-coding was similarly
transparent, and would you want it? (Would it be just one of a
host of possibilities for memory subsystems that didn't store your
data as written but simply promised to return that value when you
asked for it?)

paul

Russell Crook

Dec 12, 2000, 2:43:21 PM
John F Carr wrote:
>
> In article <oasr93d...@skidbladnir.ifi.uio.no>,
> Jan Ingvoldstad <ja...@ifi.uio.no> wrote:
> >On 12 Dec 2000 16:21:12 GMT, j...@mit.edu (John F Carr) said:
> >
> >> I don't believe usenet will grow a million times over the
> >> next 20 years or be mapped into a flat address space.
> >> There's a limit to the amount of data a human can read or
> >> write, in any medium.

Why should the data be human written? (current examples:
satellite weather pictures, executables)

Or for that matter human read? (current examples: mirror engines,
executables again)

Arbitrarily large bandwidth will have arbitrarily astounding (and
unpredictable) impact.

> >
> >You clearly don't know what Usenet is being used for these days. If
> >you thought it was plain text articles like those we're blessed with
> >seeing in comp.arch, you're dead wrong.
>
> There is a limit regardless of medium -- even video can't usefully go
> over tens of megabytes per second.

You obviously don't play video intensive computer games :->
Even at 60 fps (still discernable flicker) at 1280x1024x24 bit
you'll be nudging 200 MB/sec ... and that's still way below
(e.g.) IMAX film bandwidth requirements for real-time playback.
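That figure checks out directly (a Python sketch, assuming 24 bits = 3 bytes per pixel):

```python
# Uncompressed video bandwidth at 1280x1024, 24-bit color, 60 fps.
width, height, bytes_per_pixel, fps = 1280, 1024, 3, 60
frame_bytes = width * height * bytes_per_pixel   # ~3.9 MB per frame
rate = frame_bytes * fps                          # bytes/sec
print(rate / 1e6)                                 # ~236 MB/sec
```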

<snip>

--
Russell Crook
Not speaking officially for Sun (or anyone else, for that matter).

[about the 70 MB harddisk of the MicroVAX]
"Why waste a huge disk on a single user Workstation?"
- Ken Olson, ex-President of ex-company Digital Equipment Corporation

John F Carr

Dec 12, 2000, 3:15:13 PM
In article <3A367FD9...@canada.Sun.COM>,
Russell Crook <russel...@canada.Sun.COM> wrote:

>You obviously don't play video intensive computer games :->
>Even at 60 fps (still discernable flicker) at 1280x1024x24 bit
>you'll be nudging 200 MB/sec ... and that's still way below
>(e.g.) IMAX film bandwidth requirements for real-time playback.

Video compresses a lot. To within a factor of a few, I see the maximum
useful video rate for human viewing at 10 megapixels times 1 bit per
pixel per frame times 100 fps, or tens of MB/sec.

--
John Carr (j...@mit.edu)

Erik Naggum

Dec 12, 2000, 4:01:17 PM
* Paul Wallich

| I expect that doesn't make it impossible; instead, if anything, it
| makes it likely that Usenet-N will be some kind of horrific distributed
| database that absolutely requires a flat address space to work...

I expect that only some headers will be passed around very shortly,
with really clever cache propagation protocols that make clients
retrieve an article by message-id on demand, ultimately from the
originating server, cutting traffic and disk requirements down to what
is actually used, and killing half the stupid spam problem, too. News
traffic will follow the readers, not always flood the whole Net. And
since we're all fully connected (in N months, anyway), the off-line
generation of news readers and the desire to keep a full copy of
today's cache/catch of USENET locally will simply go away. Nobody in
their right mind has a desire to keep a "local copy" of the whole
World Wide Web just to read an infinitesimally small fragment of it,
yet that is what people do with USENET because the way it has been
done up to now is hard to get away from, but it really is no
worse than some large ISP doing it the right way. The bonus of such a
design is that most originating servers would be able to hold on to an
article _much_ longer than the receiving servers of today are, and
that, too, would relieve most people of the need to keep local copies
of everything.

#:Erik
--
"When you are having a bad day and it seems like everybody is trying
to piss you off, remember that it takes 42 muscles to produce a
frown, but only 4 muscles to work the trigger of a good sniper rifle."
-- Unknown

Lieven Marchand

Dec 12, 2000, 11:44:17 AM
Duane Rettig <du...@franz.com> writes:

> Lieven Marchand <m...@bewoner.dma.be> writes:
>
> > It was very nice to have eight bits to work with
> > with each pointer. Perhaps architecture designers should reconsider
> > giving user programs some bits in the address space.
>

> Now, in order to use those bits, they must be available per word, so
> they must be included in the data path and memory. Thus, each data
> word now has 64 bits for regular data plus these two extra bits for
> L to use.
>
> Now, C vendor (C, of course) and Java vendor J are getting on the
> bandwagon of the newly popular A machines, but they notice that they
> are not getting all they could be getting out of the architecture;
> they notice that there are 66 bits of address width, 66 bits in the
> data path, and a disconnect between the data and address in two of
> the bits, and thus the address space is 1/4 the size that it could
> be if those bits could be made use of. So C and L complain to A.
>
> The A developers then set to work on the new A' (or should we call
> it XA? :-) architecture which includes full functionality for the 65th
> and 66th bit, it comes out and leaves L to reimplement its lisp in more
> general terms or else be lost in the dust.
>
> The moral of this is that whatever we ask of hardware vendors had better
> be general-purpose enough to be useful in any language, or else it will
> eventually fail.

Maybe. Would A really do a new architecture to add 2 bits? They would
probably go for 128 data bits and 4 extra bits. A lot of architectures
have some hidden bits associated with each page entry in the paging
table for operating systems to use. Those haven't been subjected to
the kind of pressure from vendors like you describe either, although
some OS'es use them and some don't. Likewise for security rings. I
don't know offhand of an OS in popular use on Intel that uses all four
rings, but they're not going away either.

--
Lieven Marchand <m...@bewoner.dma.be>
Lambda calculus - Call us a mad club

Peter da Silva

Dec 12, 2000, 4:17:51 PM
In article <r93dsa...@content-integrity.com>,

Joe Marshall <j...@content-integrity.com> wrote:
> You are making the unwarranted assumption that address space is only
> used for addressing memory. There are a number of good reasons that
> you might want 64 address bits, but not use (expt 2 64) bytes.

Oooh, it's a trick question! I know the answer!

"And CDR-coding is one of those reasons."

:->

Peter da Silva

Dec 12, 2000, 4:31:46 PM
In article <uzoi16...@earthlink.net>,

Anne & Lynn Wheeler <ly...@garlic.com> wrote:
> lets say that there is currently 6billion people ... then 250gb/day
> amounts to about 40bytes/person/day. say in 20 years there are twice
> the number of people or 12 billion people on the planet ... then the
> 871 696 100 250 gigabytes ... works out to around 7-8 million bytes
> per person/day.

That's only 5-6 frames per day at a modest 1280x1024 resolution.

(http://www.eecg.toronto.edu/~mann/)

Duane Rettig

Dec 12, 2000, 4:59:16 PM
Lieven Marchand <m...@bewoner.dma.be> writes:

> Duane Rettig <du...@franz.com> writes:
>
> > The moral of this is that whatever we ask of hardware vendors had better
> > be general-purpose enough to be useful in any language, or else it will
> > eventually fail.
>
> Maybe. Would A really do a new architecture to add 2 bits?

Of course not. My whole straw proposal was an illustration to show
how it is dangerous to ask general-purpose hardware manufacturers
for those features that are not likely to be used by other languages,
but it does not speak to the whole problem of convincing them to do
the job in the first place...

> They would
> probably go for 128 data bits and 4 extra bits. A lot of architectures
> have some hidden bits associated with each page entry in the paging
> table for operating systems to use. Those haven't been subjected to
> the kind of pressure from vendors like you describe either, although
> some OS'es use them and some don't. Likewise for security rings. I
> don't know offhand of an OS in popular use on Intel that uses all four
> rings, but they're not going away either.

Correct. Hardware designers will put in things that they feel are
important.

However, sometimes something gets in that benefits lisp and similar
languages: take the taddcc instruction on the Sparc, for example
(and this is a perfect example of how something like this eventually
fails) - The taddcc instruction is perfect for those software
architectures which view a small integer (e.g., "fixnum" in lisp) as
a signed integer shifted left two bits where the least significant bits
are 0. The taddcc/tsubcc instructions make addition and subtraction
of these numbers very efficient on a 32-bit lisp, and allow overflows
or non-fixnums to be software-trapped to a more generic add/subtract
function.

However, take a look at what has happened:

1. A more recent Sparc architecture release (either V9 or V10, I
forget which) has deprecated taddcc/tsubcc. This is presumably
because there was not enough use for it (though the lisp community,
at least, use it).

2. When we go to the 64-bit architecture, two changes occur:
a. The best representation for a fixnum is a shifted-by-three-bits
number, with three ls-bits 0.
b. Overflow occurs into the sign bit of a 64-bit machine integer,
rather than into the sign bit of a 32-bit machine integer.
The taddcc/tsubcc instructions look at two ls-bits, and check
overflow into the 32nd bit. There are no equivalent instructions
which could be considered "64-bit txxxcc", which would check the
3 ls-bits and a 64-bit signed overflow.

So we have a case where a used-(mostly/only)-by-lisp feature on GP
hardware will eventually go away.
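The fixnum representation Rettig describes can be sketched in software; the Python below simulates (only roughly) the tag check and overflow test that taddcc/tsubcc perform in one instruction. The function names are illustrative, not from any actual lisp runtime:

```python
# Sketch of the tagged-fixnum addition that taddcc/tsubcc accelerate.
# A 32-bit lisp fixnum here is the integer shifted left 2 bits, with
# the two low (tag) bits zero; a 64-bit lisp uses a 3-bit shift instead.

TAG_BITS = 2            # 3 on a 64-bit lisp
WORD_BITS = 32          # 64 on a 64-bit lisp

def fixnum(n):
    """Encode a small integer as a tagged fixnum word."""
    return n << TAG_BITS

def tagged_add(a, b):
    """Add two tagged words; fall back to a generic (trapped) path if
    either operand is not a fixnum or the result overflows the word."""
    tag_mask = (1 << TAG_BITS) - 1
    if (a | b) & tag_mask:          # some low tag bit set: not fixnums
        return "trap: generic add"
    result = a + b
    lo, hi = -(1 << (WORD_BITS - 1)), (1 << (WORD_BITS - 1)) - 1
    if not (lo <= result <= hi):    # overflow into the sign bit
        return "trap: bignum overflow"
    return result

print(tagged_add(fixnum(5), fixnum(7)) >> TAG_BITS)   # 12
print(tagged_add(fixnum(5), 0b01))                    # trap: generic add
```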

I still believe that the wisest thing that the lisp community can ask
of hardware and operating system vendors is fast user-level trap
capability; with fewer registers saved and fewer cycles between the
trap and the return from the trap. Such a feature would be much more
likely to be widely used than cdr-coding, even in lisp systems.

John F Carr

Dec 12, 2000, 5:33:25 PM
In article <4k895q...@beta.franz.com>,
Duane Rettig <du...@franz.com> wrote:

>I still believe that the wisest thing that the lisp community can ask
>of hardware and operating system vendors is fast user-level trap
>capability; with fewer registers saved and fewer cycles between the
>trap and the return from the trap. Such a feature would be much more
>likely to be widely used than cdr-coding, even in lisp systems.

I've wished for that feature several times over the years,
and I rarely use lisp.

Linux microbenchmarks have at least made people pay a little
attention to trap performance. ("My cheap Linux system does
something useless 10 times faster than your expensive commercial
Unix system.") So far I think the effects have been all in
software.

--
John Carr (j...@mit.edu)

Tim McCaffrey

Dec 12, 2000, 7:08:50 PM
In article <4k895q...@beta.franz.com>, Duane Rettig says...

>I still believe that the wisest thing that the lisp community can ask
>of hardware and operating system vendors is fast user-level trap
>capability; with fewer registers saved and fewer cycles between the
>trap and the return from the trap. Such a feature would be much more
>likely to be widely used than cdr-coding, even in lisp systems.
>

I will admit I haven't thought this through, but it seems like you
are almost asking for the ValueCall (VALC) operation on the Unisys
A-series (aka Clearpath NX). The ValueCall opcode will load the value from a
memory location: if the memory location is a value, you're done; if it is a
pointer to data, it dereferences (loads the data pointed to); and if it is a
pointer to code, it calls it, expecting the code to return the relevant data
on the stack (the A-series is a stack machine).
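The dispatch McCaffrey describes can be mimicked in software; this Python sketch illustrates only the idea (a cell holds a plain value, a data pointer, or code), not the actual A-series tag encoding:

```python
# Software sketch of a VALC-style "load": plain values come back
# directly, data pointers are dereferenced, and code is called to
# produce the value (call-by-name, roughly).

def valc(cell, memory):
    word = memory[cell]
    if callable(word):                            # "pointer to code"
        return word()
    if isinstance(word, str) and word.startswith("@"):
        return valc(int(word[1:]), memory)        # "pointer to data"
    return word                                   # plain value

memory = {0: 42, 1: "@0", 2: lambda: 41 + 1}
print([valc(i, memory) for i in (0, 1, 2)])       # [42, 42, 42]
```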

And who said Call-by-Name was useless? :)

- Tim McCaffrey

pla...@nospam-cs.wisc.edu

Dec 12, 2000, 7:12:41 PM
In comp.arch Duane Rettig <du...@franz.com> wrote:
> languages, comparatively). However, the very fact that lisp is able
> to handle large problem spaces well has always allowed it to be used
> for increasingly larger projects. Currently, we have customers,
> especially in the CAD realm, whose data spaces push beyond the normal
> 2 to 3 Gb address spaces allowed by 32 bit systems, thus causing a
> need to move toward usage of 64-bit addressing. Certainly it was

Out of curiosity: what kinds of large commercial
applications use LISP? Do they use it as an embedded
scripting engine, or to write the application proper?

I guess you could ask the same question about functional
languages in general. All I am aware of is Ericsson's
use of Erlang. Phil Wadler has a web page talking
about some others:
http://www.cs.bell-labs.com/who/wadler/realworld/

Manoj

Duane Rettig

Dec 12, 2000, 8:01:57 PM
pla...@nospam-cs.wisc.edu writes:

> In comp.arch Duane Rettig <du...@franz.com> wrote:
> > languages, comparatively). However, the very fact that lisp is able
> > to handle large problem spaces well has always allowed it to be used
> > for increasingly larger projects. Currently, we have customers,
> > especially in the CAD realm, whose data spaces push beyond the normal
> > 2 to 3 Gb address spaces allowed by 32 bit systems, thus causing a
> > need to move toward usage of 64-bit addressing. Certainly it was
>
> Out of curiosity: what kinds of large commercial
> applications use LISP? Do they use it as an embedded
> scripting engine, or to write the application proper?

Both are possible, as well as using lisp as glue to hold
together systems with subsystems written in different languages.
However, the applications I refer to, which _must_ use lisp for
its ability to handle large problem sets, tend to be largely
written in lisp.

> I guess you could ask the same question about functional
> languages in general. All I am aware about is Ericcson's
> use of Erlang. Phil Wadler has a web page talking
> about some others:
> http://www.cs.bell-labs.com/who/wadler/realworld/

Looks like a good list, and I notice that Lisp is a major
contributor in that list. However, lisp is not a purely
functional language, per se; I would characterize it as very
general purpose, with good support for functional, object-oriented,
dynamic, etc. programming.

You can also check out our website (below) under "Success Stories",
and also http://www.lisp.org, for more info.

Duane Rettig

Dec 12, 2000, 8:47:52 PM
timothy....@spam2filter.unisys.com.takethisoff (Tim McCaffrey) writes:

Touché. However, I would call it Call-by-deReference :-)

I don't know that architecture, but you have the right idea. Some
of the features that would have to be present in order for such an
operation to be useful to us are:

- The memory address would have to be passed into the code when
called, so that new locations for the forwarded data could be
calculated.
- The code would need to also be called in a store operation.
Whether the operation is read or write would have to be passed to
the code or else calculable.
- Access to simple data would have to be as fast as a simple load
instruction.

This scheme seems to require a lot of hardware to implement
(again, I don't know this architecture, but am guessing), and I'd also
guess that there are a couple of extra bits involved, there, to
determine whether the other bits are data, indirect, or code.

A more general purpose and portable read/write barrier would involve
setting memory page attributes in the MMU to "no access" and when the
location is read from or written to, the appropriate actions are
taken by the garbage collector's trap handler to perform the proper
forwarding and accessing of the forwarded object. This would require
a very fast trap handler, at the same order as a first level page
fault handler, perhaps partially implemented in hardware.
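The page-protection barrier can be illustrated with a purely software simulation; everything below (the "pages", the handler, the forwarding table) stands in for real MMU state and the mprotect()/signal machinery a real collector would use:

```python
# Software simulation of a GC read barrier: "pages" marked no-access
# trigger a trap handler that forwards the object before the access
# proceeds. Real implementations use mprotect()/SIGSEGV; everything
# here is simulated, and all names are illustrative.

protected = {1}                                   # pages marked no-access
forwarded = {("page1", "obj"): ("page2", "obj")}  # old -> new location

def trap_handler(page, addr):
    """Stand-in for the GC's fault handler: forward the object and
    unprotect the page so the access can be retried."""
    protected.discard(page)
    return forwarded.get(addr, addr)

def barrier_read(page, addr, heap):
    if page in protected:        # would be a hardware fault
        addr = trap_handler(page, addr)
    return heap[addr]

heap = {("page2", "obj"): "forwarded value"}
print(barrier_read(1, ("page1", "obj"), heap))   # forwarded value
```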

Per Bothner

Dec 12, 2000, 9:08:09 PM
Erik Naggum <er...@naggum.net> writes:

> Arrays are also extremely inefficient
> where lists are not the wrong data structure.

Yes - but I think that is rare.

> Hence, use lists _and_ arrays, as appropriate.

Of course.

> Lisp programmers are not the idiots you think they are, Per. They
> figure out when it makes sense to use arrays even if you don't and
> need to reduce the problem to silliness.

Did I accuse anyone of being idiots? (I leave that to you.) Even
when arrays are *in principle* a better data structure, it may be
perfectly reasonable to use lists. For example it may be easier in
Lisp to program with lists than array. While Lisps do support arrays,
they are traditionally based on lists as the fundamental data
structure (for example &rest binds to a list, not an array), so it
just might be easier to use lists, even if they are less efficient.

Note the context of this discussion: Is cdr-coding a good idea?
My answer is that it might be useful for existing code
that uses lists heavily. However, it is reasonable to believe that
any code that would gain significantly from cdr-coding would gain
as much or more from using arrays instead of lists. This
takes away much of the point of implementing cdr-coding.

> If your data structures nest, use lists.

Or nested arrays. Or you can use more clever data structures, such
as the "document table model" that the Xalan XSL processor uses
to represent XML documents.

> If your data structures have unpredictable structure, use lists.

If your data structures have unpredictable structure, you may be
in trouble ...

> If you need to maintain uniformity of type for the "rest" after
> processing an initial portion of the data structure, use lists.

Or use a more abstract sequence model, which would have arrays and
sub-ranges of arrays *and* lists as special cases.

Or use a pair of an array and an integer.

> (If you think this only applies to recursion, which it seems you do, you
> really must think more carefully about algorithm design with arrays
> and why displaced arrays or passing _pairs_ (sorry, data structures in
> small arrays, of course) of the array and the index won't cut it.)

This is too vague for me to answer.

> If you need to manipulate the structure of the data structure in any
> way, not just insert/delete, use lists. (Flattening, pairing, etc.)

It is of course quite reasonable and efficient to do flattening
into an array.

> If after much headache and painfully idiotic programming mistakes, you
> discover that your array implementation really must use two elements
> per "array", one data and one "next", you have reinvented the cons.
> Congratulations! Welcome to the wisdom of 40 years of language design!

Erik, you know that I am older than you, and have probably been
programming at least as long as you. Don't you imagine I might have
quite a bit of experience with both arrays *and* lists? (I have less
experience with Common Lisp than you, I'm sure though.) I think my
share of "painfully idiotic programming mistakes" is modest, yet I
have not found that my "implementation really must use two elements
per "array"".

> Who cares about allocating separate cons cells? Simple arrays have to
> be more than four elements long to take less space than the equivalent
> list using cons cells.

Assume a modest one word of overhead per object. Then an extensible
array with three elements takes 3 words for the array header (header +
current length + pointer to data array) plus 4 words for the data
array (1 word of header plus three words of data) or 7 words total. A
non-extensible array would only need 4 words. If you need a separate
header word for the array allocated length, add one more word yielding
8 or 5 words, respectively. A list of three cons cells take 9 words.
Already for length three, the array wins.
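The accounting tabulates directly (a Python sketch, assuming the same one header word per heap object):

```python
# Words of storage for an n-element sequence, using Bothner's
# one-header-word-per-object accounting.

def extensible_array(n):
    # header + length + data-pointer, plus (header + n) for the data
    return 3 + (1 + n)

def simple_array(n):
    return 1 + n                 # header + n data words

def cons_list(n):
    return 3 * n                 # each cons: header + car + cdr

for n in (3, 10):
    print(n, extensible_array(n), simple_array(n), cons_list(n))
# at n=3: 7, 4, and 9 words -- the array already wins
```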

If you use a memory allocation scheme not requiring any header words,
then the extensible 3-element array takes 6 words, the non-extensible
array takes 4 words, and the list version takes 6 words. The array
is equal to the list in space.

> The allocation for cons cells is extremely efficiently implemented in
> all Lisp implementations.

Unless you use cdr-coding, you will need 2 pointers plus any per-object
overhead (which may be zero) for each list element. This is twice as
much space as using an array. The list only wins for short lists, or
against extensible arrays whose size is just past the doubling threshold.

> Using your stupid arrays-for-lists would of
> course force some super-clever implementation of exponent-of-2-sized
> arrays that double in size as they are used _and_ move around all the
> time (destroying object identity unless you waste even _more_ space on
> their already inefficient implementation),

One obvious way to solve the problem of unused space in extensible arrays
(which will still almost always take less space than using lists
except for very short sequences) is to have the garbage collector trim
off the excess space. (You double the space allocation to deal with
arrays that are actively growing, to prevent quadratic effects.
If they have survived to garbage collection, they are probably not
actively growing.)

An alternative to doubling the size may be to use the Fibonacci sequence
for the array sizes: 3 words, 5 words, 8 words, etc. This reduces the
maximum wasted space to asymptotically about 38%. I think this
preserves linear complexity (i.e. each array element has been copied
an *average* of a small constant number of times), but I'm not sure
enough of my math to say for sure. Does anyone know? (Knuth mentions
Fibonacci memory allocation, but I don't know if he says anything
relevant to this problem.)
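The linearity question can at least be probed empirically; this simulation (an illustration, not a proof of the asymptotic claim) counts element copies when capacities follow the Fibonacci sequence:

```python
# Append n elements to an array grown through Fibonacci capacities
# (3, 5, 8, 13, ...) and count element copies per append.

def copies_per_append(n):
    cap, nxt = 3, 5          # current and next Fibonacci capacity
    size = copies = 0
    for _ in range(n):
        if size == cap:      # array full: reallocate, copy all elements
            copies += size
            cap, nxt = nxt, cap + nxt
        size += 1
    return copies / n

print(copies_per_append(10**5))   # ~2: a small constant, i.e. linear overall
```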

> but people have been there,
> many times over, and they have _returned_ to lists, implemented via
> cons cells, when that has been the most intelligent solution.

Because it is convenient, not necessarily because it is efficient.

> You've done so much good work. How come you don't know simple things?

Likewise, I'm sure.
--
--Per Bothner
p...@bothner.com http://www.bothner.com/~per/

Jan Ingvoldstad

Dec 12, 2000, 10:30:06 PM