Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Could CDR-coding be on the way back?
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  Messages 76 - 100 of 246 - Collapse all  -  Translate all to Translated (View all originals) < Older  Newer >
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Duane Rettig  
View profile  
 More options Dec 13 2000, 2:05 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Duane Rettig <du...@franz.com>
Date: 13 Dec 2000 10:54:29 -0800
Local: Wed, Dec 13 2000 1:54 pm
Subject: Re: Could CDR-coding be on the way back?
pe...@harlequin.co.uk (Pekka P. Pirinen) writes:

> Duane Rettig <du...@franz.com> writes:
> > 64-bit architectures tend to define their
> > entire virtual space.  For example, the PA-RISC with the W bit set defines
> > text segments starting at 0x4000000000000000, and data segments starting at
> > 0x8000000000000000.  Which bits would you steal?  Some bits in the middle?

> Sure.  The allocator needs to know about it, but as long as there's
> plenty of virtual space, it's not that hard to do.  If there isn't --
> and I do believe it would be rash to assume somebody won't find a use
> for all that space -- you need another version of your system with no
> cdr-coding.  It's wasteful not to put all that virtual space to
> creative use, if the application doesn't need it all.

Right, but at Franz, we are not making such assumptions of non-use; this
is precisely the kind of assumption leading to a rewrite that I think
we should try to avoid.

> > > >Note that one such test that becomes nontrivial on General Purpose
> > > >hardware (i.e. not lisp machines) is EQ.  Instead of one instruction,
> > > >those extra bits must be masked off before the comparison.  I don't
> > > >know if anyone has ever considered placing those extra bits "elsewhere",
> > > >e.g. in a parallel location to the actual pointer/tag word.

> I can't figure out what kind of an implementation you envision that
> would have a special problem with EQ.  I would place the extra bits in
> the CAR cell to code the representation of the CDR, and mask them out
> when reading the CAR; if you don't mask them out then, there's lots of
> other operations that would have to know about them as well, not just
> EQ.

Yes, exactly!  Without cdr-coding, EQ objects have identical tagged
pointer values (which I call LispVals) and thus no masking is necessary
at all to compare identity.  And my example was by no means
exhaustive, but was instead only the simplest example.

--
Duane Rettig          Franz Inc.            http://www.franz.com/ (www)
1995 University Ave Suite 275  Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253   du...@Franz.COM (internet)


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Paul Wallich  
View profile  
 More options Dec 13 2000, 3:59 pm
Newsgroups: comp.lang.lisp, comp.arch
From: p...@panix.com (Paul Wallich)
Date: Wed, 13 Dec 2000 15:58:51 -0500
Local: Wed, Dec 13 2000 3:58 pm
Subject: Re: Could CDR-coding be on the way back?
In article <oaslmtlj8w1....@skidbladnir.ifi.uio.no>, Jan Ingvoldstad

<j...@ifi.uio.no> wrote:
>On Tue, 12 Dec 2000 14:45:18 -0500, p...@panix.com (Paul Wallich) said:

>> Meanwhile, there's another limit that might be harder to deal with:
>> 250 GB/day is about 23 megabits/sec average, i.e. half a t-3 for the
>> news backbone. The expanded version of usenet will require about
>> 80 exabits/sec for the backbone;

>Eh, no, 811 exabytes/day is not 80 exabits/sec, but roughtly 80
>petabits/sec, which is much easier to handle.  ;)

Oops, got my prefixes swapped.

>> to give you a rough idea of what
>> that means, using straightforward modulation techniques that would
>> be about 10 signal transitions per cycle of yellow light.

>Does this mean that it's only (roughly) 0.01 signal transitions per
>cycle of yellow light, or is that what you get for petabits?

The latter. Exponent was right, name was wrong.

>But still, the time for this problem to arise might still come.
>Usenet news isn't the only thing in the world requiring larger amounts
>of storage and bandwidth.  Video-on-demand has (so far) largely been a
>failure for the mass market because of the lack of bandwidth for the
>necessary quality (although MPEG-4 and later MPEG-7 may make subtle
>changes to this), and sometime, people may actually have those nice
>HDTV (or better) displays in their homes.  So yes, I think we'll reach
>that limit one day.  Maybe it won't be in 20 years, but I'll be around
>when it happens.

I don't think so, mostly because I don't think that the notion of "backbone"
will make any sense at that point. There still will be
places where enormous amount of data get shipped around, but the idea of
big, fat, long pipes connecting a bunch of narrow ends seems less likely.

Btw, video-on-demand requires less bandwidth into each house than does
regular cable TV. It's the aggregate bandwidth but more important the
decentralization that gets you.

paul


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Per Bothner  
View profile  
 More options Dec 13 2000, 6:44 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Per Bothner <p...@bothner.com>
Date: 13 Dec 2000 15:44:46 -0800
Local: Wed, Dec 13 2000 6:44 pm
Subject: Re: Could CDR-coding be on the way back?

Erik Naggum <e...@naggum.net> writes:
>   Really?  Reading Lisp data from input streams is well-defined today,

Perhaps, but does it make sense?  Presumably you application has some
idea of the structure of the data it is expecting?  (Even trees have
structure.)

>   but arrays are very inefficient because you don't know their size
>   a priori, and copying elements like crazy as the list grows is
>   much less efficient than grabbing cons cells from a pool of them.

But that is just my argument - it is *not* much less efficient.  If
you double the array as it grows, the "average" element is only copied
*a single time*.  (Some elements may be copied n*log(n) times, but at
least half the elements have been copied at most once, at least
three-quarter of the elements have been copied at most twice, etc.)

> The way Lisp readers usually deal with arrays is in fact to build a
> list from the input and then convert the list to an array.

A reasonable thing to do, at least if you have a good GC.

>   You recommend arrays to other people, not just to yourself, unless I'm
>   _gravely_ mistaken about the purpose of your article.  Other people
>   who have much less experience than you and rely on your experience in
>   order to form their own, with the least possible pain.  My experience
>   with people who try to implement things with arrays is that they end
>   up with very cons-like concepts once they outgrow the C "memory is
>   really an array of bytes" idea.  The reason is simple: Arrays with the
>   properties we discuss are very, very hard to implement correctly.

It is not that hard.  Still, I agree it is is best if arrays are well
supported by the programming language or at least a good library.

>   If we _are_ talking about what a compiler designer would do for a
>   language that natively supports these things in such a way that
>   programmers would never _see_ the array, the kind of reasoning you
>   have provided is completely misplaced.  I have therefore assumed that
>   you do _not_ argue about primarily what a compiler writer would do,
>   although CDR-coding _is_ at that level.

My argument is a little of all of:

(1) I think high-level programming languages should be built around
array support rather list support.  (Actually, I prefer language
primitives to work on generalized sequences, with arrays just being
the default implementation.)

(2) I think programmers should (in general) use arrays rather than
lists, even in Lisp-like lanuages, at least when the number of
elements is large and performance is important.

(3) As a consequence of (1) and (2), I don't see a lot of justification
for cdr-coding, at least as something one would do in hardware.

>   I think you need to explain how you ended up with 9 words for the
>   list.  There is of course no header word for cons cells if you are
>   going to implement a very basic idea in a language with this.

I discussed this case in the following paragraph.

>   Type information is stored much more efficiently if you care about
>   these things, or you allocate them in blocks that only hold cons
>   cells.  That is, arrays of cons cells, if you want.  Likewise,
>   your arrays would have type information in a similarly more
>   efficient way.

Agreed, at least for a Lisp-like language.  For a object-oriented
language like Java, with many different types where fewer of them
predominate, it may make more sense to always have a header word.  (Of
course CL is object-oriented.  However, it differs from Java in that
certain pre-defined types are used very heavily so it makes sense to
optimize for those types in a way that makes less sense for Java.)

>   Here's how I count.  First, the data array.  It has one word per
>   element, but it is aligned at and always allocate in some useful
>   chunk, like 4 words.   This is to make allocation significantly
>   cheaper.  Since the data array may need to be moved when grown, there
>   is a meta-object that contains the allocation size, the displacement
>   offset, the active size, and a pointer to the data array, or 4 words,
>   one allocation unit.

The displacement offset is not needed for adjustable arrays.  It is only
needed for displaced arrays.  And you can place the allocation size
at the start of the data array.  (It would be needed there anyway if
memory management code needs to scan memory consequtively.)  That gives
only 2 words for the header meta-object.

> | If you use a memory allocation scheme not requiring any header words,
> | then the extensible 3-element array takes 6 words, the non-extensible
> | array takes 4 words, and the list version takes 6 words.  The array
> | is equal to the list in space.

>   Yes, but only at enormous expense in allocation costs with such a fine
>   granularity.  Also, you really do not want your array to move around
>   if it grows, or you lose the idea of identity of lists.

But the array object doesn't move if you use a header meta-word, only
the data array moves.

Of course if you want to preserve identity of lists for linked links,
you will also need an initial header cons.

>   The overhead is twice as much _after_ the overhead of arrays have been
>   dealt with, including that copying-when-doubling scheme, which out of
>   necessity must allocate memory for the same element many times over as
>   the list grows.

As I said:  Only twice total, on average.

>   By cleverly allocating allocation units for the data
>   array from a single pool that is not used for anything else, the same
>   way cons cells are allocated, you can grow your list painlessly and
>   linearly without copying until you allocate another array.

No such tricks need to painlessly and *linearly* grow the array,
even *with* copying.  Not to say that your scheme is a bad one
- I agree it a very reasonable thing to do.

>   The interesting thing is, however, what the languages look like after
>   you decide to represent lists as arrays that way.  E.g., cdr-ing down
>   a list can easily be emulated with displaced arrays, i.e., arrays that
>   really use another array's data for its elements, but starting from an
>   offset for elemen t0.

Agreed.  However, my preference is for languages where the programmer
does not usually explicitly "cdr" down a list - or even index them.
I think languages should encourage programmers to work with arrays
as a unit, and language primitive should encourage that.  I'm in favor
of "array languages" or what we might loosely call APL-style languages.

In the context of Common Lisp, I prefer the "series" design rather than
the "loop" design.
--
        --Per Bothner
p...@bothner.com   http://www.bothner.com/~per/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Peter da Silva  
View profile  
 More options Dec 13 2000, 6:47 pm
Newsgroups: comp.lang.lisp, comp.arch
From: pe...@abbnm.com (Peter da Silva)
Date: 13 Dec 2000 23:38:26 GMT
Local: Wed, Dec 13 2000 6:38 pm
Subject: Re: Could CDR-coding be on the way back?
In article <joswig-54D876.13154213122...@news.is-europe.net>,
Rainer Joswig  <jos...@corporate-world.lisp.de> wrote:

> In article <917vho$...@web.nmti.com>, pe...@abbnm.com (Peter da Silva)
> wrote:
> > In article <m24s09hy46....@kelso.bothner.com>,
> > Per Bothner  <p...@bothner.com> wrote:
> > > Note the context of this discussion:  Is cdr-coding a good idea?
> > CDR-coding is an implementation technique for hiding an array structure
> > behind a list structure.
> CDR-coding is an implementation technique for optimizing
> the space usage and the locality of lists.

I see these two statements as being equivalent in the context of this
discussion.

--
 `-_-'   In hoc signo hack, Peter da Silva.
  'U`    "Milloin halasit viimeksi suttasi?"

         Disclaimer: WWFD?


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
joswig  
View profile  
 More options Dec 13 2000, 7:30 pm
Newsgroups: comp.lang.lisp, comp.arch
From: jos...@corporate-world.lisp.de
Date: Thu, 14 Dec 2000 00:16:09 GMT
Local: Wed, Dec 13 2000 7:16 pm
Subject: Re: Could CDR-coding be on the way back?
In article <91919i$...@web.nmti.com>,
  pe...@abbnm.com (Peter da Silva) wrote:

Symbolics was concerned with space usage. You don't get
constant time access of arrays. Accessing the nth
element of a cdr-coded list is not equivalent in terms of speed
to accessing the nth element of an array.

Sent via Deja.com
http://www.deja.com/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
cbbrowne  
View profile  
 More options Dec 13 2000, 9:16 pm
Newsgroups: comp.lang.lisp
From: cbbro...@hex.net
Date: Thu, 14 Dec 2000 02:16:16 GMT
Local: Wed, Dec 13 2000 9:16 pm
Subject: Re: Could CDR-coding be on the way back?
>>>>> "Paul" == Paul Repacholi <p...@prep.synonet.com> writes:

Paul> j...@mit.edu (John F Carr) writes:

>> In article <4aea2iwyp....@beta.franz.com>, Duane Rettig
>> <du...@franz.com> wrote: >chu...@best.com (Chuck Fry) writes: >
>> >> And with 64 bits to play with, who's to say we can't spare a
>> couple for >> CDR-coding?  > >Let's not make this mistake
>> again.  When IBM (and Amdahl) created their XA >architectures
>> for the 370 compatibles (extending the address range from 24
>> >bits to 32 bits), several programs suddenly broke.
>> Can anyone come up with a plausible program in the next two
>> decades, to run on conventional architectures, that will require an
>> address space between 60 and 64 bits?  If you say "let's map
>> everything into a flat space" you will need more than 64 bits.  If
>> you stick to conventional addressing, including memory mapping
>> individual objects, you use a lot less than 64 bits.

Paul> Take a plastic ball and cover the bottom half with microwave
Paul> detecters and A/Ds. Hand wave that some horrid reasearcher will
Paul> come up with a way to get 8 bits per sample, even at, say,
Paul> 500GHz. Hand wave that the packages is about 5cm. So that is
Paul> about 200 per ball, with out getting too thing about cramming
Paul> them all in. So that comes to 1TB per sec.

Paul> That per 'ball'. You have a square Km of them, plus extras. Say
Paul> a spacing of 2M, thats 500x500, 250,000. So that comes to 250
Paul> EB/sec... Now run it 24 hrs/days for a decade... Or, just take
Paul> 10 hrs data for analysis. 64 bits does not go far here, just in
Paul> data size.

Paul> Plus, the accounting software may need to be 64+ bits as
Paul> well :)

The jump from 16 bit architectures to 32 was a _pretty_ big deal, as
the typical "dynamic scope" of 32 bits, +/-2^31-1, is a whole lot
bigger than +/-32767, and suffices to characterize a lot more
problems.

Jumping from there to 64 bits is a _whopping_ big jump; while 32 bits
isn't quite enough to represent national debts, it's not _far_ off,
and whether we speak of 36 bits, 48, 60, or 64, it's all "quite large
enough" to represent values that required compound data structures
with only 32 bits to play with.

Long and short being:
  "No, the accounting software probably does _not_ need 64+ bits;
   they'll probably revalue the currency before that becomes
   relevant..."

There will indeed always be some "bleeding edge" applications that
could use the maximum capabilities that any technology can ever offer;
that doesn't mean that such technologies will be economically
feasible, or that they will offer enough to "more mundane"
applications to economically justify their development.

But where 8 bits sufficed for some applications, and having more is
gravy, the same is true for successive generations of microprocessors,
and 60 or 64 bits will "suffice" for many classes of applications
regardless of whether we are thinking of numeric ranges or the sizes
of memory addresses.  

It should be eminently useful to take 64 bit architectures and waste a
few bits on tagging; cutting ranges from 64 to 60 still leaves a
useful range for _any_ application that has been deployable on a 32
bit architecture.
--
(concatenate 'string "cbbrowne" "@hex.net")
<http://www.ntlug.org/~cbbrowne/>
Referring to undocumented private communications allows one to claim
virtually anything: "we discussed this idea in our working group last
year, and concluded that it was totally brain-damaged".
-- from the Symbolics Guidelines for Sending Mail


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Joe Pfeiffer  
View profile  
 More options Dec 13 2000, 10:15 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Joe Pfeiffer <pfeif...@cs.nmsu.edu>
Date: 13 Dec 2000 19:44:42 -0700
Local: Wed, Dec 13 2000 9:44 pm
Subject: Re: Could CDR-coding be on the way back?
pe...@abbnm.com (Peter da Silva) writes:

The post points out that the discussion may be fundamentally flawed:
lists and vectors are abstract data types, linked lists and arrays are
implementations.  Up until that post, the data type and implementation
were being confused.
--
Joseph J. Pfeiffer, Jr., Ph.D.       Phone -- (505) 646-1605
Department of Computer Science       FAX   -- (505) 646-1002
New Mexico State University          http://www.cs.nmsu.edu/~pfeiffer
VL 2000 Homepage:  http://www.cs.orst.edu/~burnett/vl2000/

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erik Naggum  
View profile  
 More options Dec 13 2000, 11:17 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Erik Naggum <e...@naggum.net>
Date: 14 Dec 2000 03:17:39 +0000
Local: Wed, Dec 13 2000 10:17 pm
Subject: Re: Could CDR-coding be on the way back?
* Per Bothner <p...@bothner.com>
| Perhaps, but does it make sense?

  Excuse me?  Yes, it makes sense to read Lisp data that the application
  does not know the structure of a priori.  For one thing, source _code_
  matches that description.  Data whose structure is only evident after
  having been read in as a complete expression also matches very well.

| Presumably you application has some idea of the structure of the data
| it is expecting?  (Even trees have structure.)

  Yes, it knows they will all be Lisp objects.  That's it.  This is part
  of the significant charm with Lisp's lists.

| But that is just my argument - it is *not* much less efficient.  If
| you double the array as it grows, the "average" element is only copied
| *a single time*.  (Some elements may be copied n*log(n) times, but at
| least half the elements have been copied at most once, at least
| three-quarter of the elements have been copied at most twice, etc.)

  There's something wrong with your math.  The average number of times
  an element is copied is 1.0 if you have (- (1+ (expt 2 n)) m) elements
  in the final array with m elements in the initial allocation and
  approaches 2.0 asymptotically for (1+ (expt 2 n)) elements in the
  final array, since adding the last element will copy everything and
  end up with half the allocated array empty (short that one element).

| My argument is a little of all of:
|
| (1) I think high-level programming languages should ...
| (2) I think programmers should ...
| (3) As a consequence of (1) and (2), ...

  I'd be interested in how you _arrived_ at your preferences, instead of
  just having something follow from some random personal favorites.

| (Of course CL is object-oriented.  However, it differs from Java in
| that certain pre-defined types are used very heavily so it makes sense
| to optimize for those types in a way that makes less sense for Java.)

  This is where you're losing me.  I don't see a huge need for making a
  basic type more efficient in languages that _don't_ show a predominant
  use of that basic type.  And why are we discussing Java?  I thought it
  was already as array-friendly as you wanted languages to be?  And Lisp
  already has adjustable arrays, so what's the point in redesigning them?

| The displacement offset is not needed for adjustable arrays.  It is only
| needed for displaced arrays.

  Sigh.  Why am I wasting any time on this?  The displacement is needed
  for the "rest" (cdr) function that people who work on lists, however
  they are actually implemented, need very often.  If you want to offer
  Lisp people something in terms of efficiency, you can't take away
  their primitives without changing the entire language under them.
  That's what I was trying to get at in the last parapraph I wrote.

| And you can place the allocation size at the start of the data array.

  Yes, but that is a truly boneheaded decision as it means that the data
  array can no longer be specialized to contain a uniform data type --
  say double-floats instead of pointers to double-floats (and you really
  want that if you go the way of arrays in the first place -- people
  even ask for specialized list types, and they won't stop if they know
  the underlying implementation is array-based), but must somehow have a
  special initial element or consist of a single word that destroys
  alignment of the rest of the array.  This is just stupid.  It's just
  like the old string representation that used the first byte to hold
  the length and couldn't hold more than 255 characters per string.

| (It would be needed there anyway if memory management code needs to
| scan memory consequtively.)

  Really?  I thought memory management code followed pointers and chains
  from a root set if it were any smart.  Randomly walking over memory it
  has no idea what contains just seems like a recipe for major disaster.
  Some conservative garbage collectors that are bolted onto languages
  that were never properly designed to deal with abandoned pointers seem
  to make do with interpreting machine words as pointers on a hunch, but
  I have always been exceedingly skeptical of the entire approach.  If
  you keep the size of a vector in a well-known structure with the
  pointer to the data array, there is only one way to get at that data
  array, anyway.

| That gives only 2 words for the header meta-object.

  And much more overhead and increased complexity and general madness.
  You have obviously never actually experimented with this in _Lisp_-
  like languages.  I don't care about optimizing other languages for
  adjustable arrays, as we already have efficient adjustable arrays in
  Common Lisp, so there's no point in reinventing the wheel, and using
  adjustable arrays to represent lists requires concern for the usage of
  lists so optimized.  I'm sure you can optimize this stuff to death in
  Java, but Java doesn't have any need for cdr-coding or more efficient
  storage of lists in particular, either, so I must admit to completely
  failing to see the whole point of this exercise, except that you get
  to argue for your preference for array-based languages, which I do not
  share, and am unlikely to begin to share just because of this.

| Of course if you want to preserve identity of lists for linked links,
| you will also need an initial header cons.

  I'm sorry, but is this a real argument?  You're talking about data
  arrays that move when they grow at the tail end, right?  Lists can
  grow at the tail end without any "header cons" simply because we just
  overwrite the cdr that is nil in the final cons with a pointer to the
  cons cell that holds the new element (and a new nil).  Lists thus
  grown retain their first cons cell, which does not move, and thus
  retains identity at the pointer level.  You do need a meta-object for
  the header of arrays because the data array moves the address of the
  first element when it is moved.

| As I said:  Only twice total, on average.

  Well, you actually said *a single time*, but if you keep doubling it
  every time you say it, I'm happy.  (Sorry, but I found that fuuny. :)

| However, my preference is for languages where the programmer does not
| usually explicitly "cdr" down a list - or even index them.

  Well, I think it is preferable to ask people who prefer the languages
  they are trying to optimize for their ideas on how to optimize them
  instead of optimizing the programmers to use some language _they_ do
  not prefer simply because the optimizers do.

  I'm confused about your purposes, I guess.  Promote APL if you like,
  but then I don't see why you're even offering suggestions for how to
  implement lists efficiently in Lisp.  This is mystifying to me.

#:Erik
--
  The United States of America, soon a Bush league world power.  Yeee-haw!


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Frank A. Adrian  
View profile  
 More options Dec 14 2000, 1:27 am
Newsgroups: comp.lang.lisp, comp.arch
From: "Frank A. Adrian" <fadr...@uswest.net>
Date: Wed, 13 Dec 2000 22:27:32 -0800
Local: Thurs, Dec 14 2000 1:27 am
Subject: Re: Could CDR-coding be on the way back?
"Per Bothner" <p...@bothner.com> wrote in message

news:m2wvd3ga35.fsf@kelso.bothner.com...

> (2) I think programmers should (in general) use arrays rather than
> lists, even in Lisp-like lanuages, at least when the number of
> elements is large and performance is important.

I think you are correct if (a) the language supports variable length arrays
or
(b) the user knows how many elements he will have or (c) random access of
the sequence data is important and (d) no large amountof sharing of sequence
data is needed and (e) no insertion of elements into the sequence happens.
In other
words, it depends.  Also, note that what is supported by the language is
often what
gets coded by a user.  If every sequence looks like an array, the user will
avoid
insertion and sharing of substructure and start assuming that O(1) access of
any
element is present.  This, of course, makes some code very ugly, but since
the
user doesn't see any obvious alternative, he's stuck.

> (3) As a consequence of (1) and (2), I don't see a lot of justification
> for cdr-coding, at least as something one would do in hardware.

If you read my post carefully, you would have seen that nowhere did I ask
whether
or not there should be hardware support for such a feature.  In fact,
history has shown
that language specific features for hardware are usually a bad idea.  What I
asked was
given the mismatch of cache speed and CPU speed, would a software
implementation
of CDR-coding be a good implementation technique.  I was hoping that someone
might
pop up with some discussion that actually involved hardware and cycle count.
Instead
I get straw men about hardware support for the feature.

To me the only valid argument I have seen so far is the one that says that
there are
better uses for the tag bits.  This I can believe.  Show me cycle counts
that demon-
strate that CDR-coding would add time to the "straight-line" path for list
traversal and
I will believe that CDR-coding is still a bad idea.  Arguments that simply
assert that
arrays are "generally" better than lists are not convincing.  Lists AND
arrays have been
around since the  advent of computers.  They are both useful under different
circumstances
and I don't think that you've given enough evidence to show that an array is
the best choice
for the default sequence type for a language.  Certainly, it seems to be the
"most natural"
choice  for Fortran, C, and Java language users, given that construction,
traversal, and
manipulation of lists is clunky in those languages (Strangely enough C++
with the STL
and operator overloading sidesteps this objection to a certain extent, but I
attribute that
to luck more than any conscious design effort) but that's hardly a
convincing argument
overall.

faa


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Per Bothner  
View profile  
 More options Dec 14 2000, 3:39 am
Newsgroups: comp.lang.lisp, comp.arch
From: Per Bothner <p...@bothner.com>
Date: 14 Dec 2000 00:40:16 -0800
Local: Thurs, Dec 14 2000 3:40 am
Subject: Re: Could CDR-coding be on the way back?

Erik Naggum <e...@naggum.net> writes:
>   Excuse me?  Yes, it makes sense to read Lisp data that the application
>   does not know the structure of a priori.  For one thing, source _code_
>   matches that description.

Your point seems to be that it is a bad idea to use array to represent
complex nested structure, such as Lisp-style source code.  I agree
this arrays are at a disadvantage here, since Lisp source code uses
many variable-sized and short sequences.  However, the disadvantage is
minor (at most a small constant factor), and it doesn't really matter,
since source code is relatively small and only kept around during input
and compilation.  It is more interesting what happens with large files
and large sequences.  That is where arrays win and (I think) lists
have the disadvantage.

>   Yes, [the application] knows they will all be Lisp objects.
>   That's it.  This is part of the significant charm with Lisp's lists.

An application has to know more.  Presumably it intends to *do*
something with those Lisp objects.

>   There's something wrong with your math.  The average number of times
>   an element is copied is 1.0 if you have (- (1+ (expt 2 n)) m) elements
>   in the final array with m elements in the initial allocation and
>   approaches 2.0 asymptotically for (1+ (expt 2 n)) elements in the
>   final array, since adding the last element will copy everything and
>   end up with half the allocated array empty (short that one element).

I believe your're right.

> | My argument is a little of all of:
> |
> | (1) I think high-level programming languages should ...
> | (2) I think programmers should ...
> | (3) As a consequence of (1) and (2), ...

>   I'd be interested in how you _arrived_ at your preferences, instead of
>   just having something follow from some random personal favorites.

First, I've tried to argue that arrays are a better general-purpose
data structure than lists: You can do things (indexing) that you can't
to efficiently with lists.  Sequential traversal (including merging
and appending to the end) is perhaps most common use of sequences, and
boths arrays and lists handle that well, but arrays make it easy
to traverse from either end.  Arrays have better locality of reference,
and they take less space (in general - short arrays and extensible
arrays just after resizing might lose out spacewise to lists, but not
by much).  Lists have some advantages.  Some of the advantages are
historic (i.e. Lisp is based on lists); other may be inherent.  For
example some algorithms (tree-sort? - merge-sort? I forget the name)
need linked lists.  Other applications may be more convenient to
code using lists, especially in languages like Lisp which historically
are oriented towards Lisp.  So, generally arrays make a lots of sense,
and if you are going to represent a sequences of values in your
program, in general arrays are more likely to be the better answer
than lists.  Now if you are going to be using Lisp, this may be a
little different, just because of Lisp's support for lists.  But
from an abstract data-structure point of view (and note this is comp.arch,
not just comp.lang.lisp), arrays usually are to be preferred over lists.

Secondly, if I was designing a programming langauge from scratch, I
would use arrays rather than lists as the fundamental data type for
implementing sequences, partly for the reasons above.  The other
reason is that I like languages that work on collections as a unit,
and provide operations than work on collections (sequences, sets,
relations, trees).  Such languages can be very powerful and compact.
I also think they may be easier to use than languages that encourage
recursion to work on lists.  I like side-effect free languages, or at
least languages that encourage a side-effect free style.  In such a
language, the benefit of using lists instead of arrays is that it is
easy to use recursion to define your algorithms.  But using recursion to
scan through a list is, I think, a bad idea, because it is harder to
understand than iteration.  Recursion should be used for recursive
data structures.  But defining a sequence as a recursive data
structure (a chain of conses) is wrong - you're creating recursion
where you don't need it.

>   This is where you're losing me.  I don't see a huge need for making a
>   basic type more efficient in languages that _don't_ show a predominant
>   use of that basic type.

Neither do I.  My point was the different trade-off of Java and Lisp.

>.  And why are we discussing Java?  I thought it
>   was already as array-friendly as you wanted languages to be?

Absolutely not.  But that is a different discussion.

>  And Lisp
>   already has adjustable arrays, so what's the point in redesigning them?

I'm not.  You're the one who insists on adding a displacement pointer
to all adjustable arrays.

>   Sigh.  Why am I wasting any time on this?  The displacement is needed
>   for the "rest" (cdr) function that people who work on lists, however
>   they are actually implemented, need very often.

I guess the problem is we're discussing two different things here:
Arrays vs lists in abstract, and how they are used in Lisp.  When
I work on arrays, I don't find the need for a rest function a lot.
If you need the rest/cdr function a lot, then perhaps you shouldn't
be using arrays.  Or re-think your algorithms.

>   Yes, but that is a truly boneheaded decision as it means that the data
>   array can no longer be specialized to contain a uniform data type --

Of course it can.  This is what Java primitive arrays are:  A non-adjustable
array of some primtive type (for exampe double-floats).  These are
comonly implemented as an object header followed by a length
followed by the data (words, bytes, or whatever).

>   This is just stupid.

There are all sorts of tradeoffs, in memory management especially.

> | (It would be needed there anyway if memory management code needs to
> | scan memory consequtively.)

>   Really?  I thought memory management code followed pointers and chains
>   from a root set if it were any smart.

Well, for example a compacting phase might be easier if you could
scan memory in order.

>  Randomly walking over memory it has no idea what contains

If you have the appropriate header words, you would know what the memory
contains.

> | That gives only 2 words for the header meta-object.

>   And much more overhead and increased complexity and general madness.
>   You have obviously never actually experimented with this in _Lisp_-
>   like languages.

Of course I have.  However, I have done most of my work in the context
of the Java VM, which has other tradeoffs than if you can custom-design
your own memory-management.  There are other Lisp implementations
that use the Boehm conservative collector (or similar), perhaps
because interoperability with C is desired, or because they would
rather used an existing well-engineered GC instead of writing their own.

>   Lists can grow at the tail end without any "header cons" simply
>   because we just overwrite the cdr that is nil in the final cons
>   with a pointer to the cons cell that holds the new element (and a
>   new nil).

Not if the list can start out empty.

> | As I said:  Only twice total, on average.

>   Well, you actually said *a single time*, but if you keep doubling it
>   every time you say it, I'm happy.  (Sorry, but I found that fuuny. :)

Twice referred to the number of memory accesses to write the array
(move once on average plus initialization).  I'll bump that up to thrice,
if you like.

>   I'm confused about your purposes, I guess.  Promote APL if you like,
>   but then I don't see why you're even offering suggestions for how to
>   implement lists efficiently in Lisp.  This is mystifying to me.

I am not talking about how to implement lists efficiently in Lisp, if
by "lists" you mean the "list" type of Common Lisp, except that I
don't think there is much point in implementing cdr-coding, at least
in hardware.  What I am talking about how to implement sequences or
lists in the more generic sense.
--
        --Per Bothner
p...@bothner.com   http://www.bothner.com/~per/

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Tim Bradshaw  
View profile  
 More options Dec 14 2000, 5:59 am
Newsgroups: comp.lang.lisp, comp.arch
From: Tim Bradshaw <t...@tfeb.org>
Date: 14 Dec 2000 10:59:08 +0000
Local: Thurs, Dec 14 2000 5:59 am
Subject: Re: Could CDR-coding be on the way back?

Per Bothner <p...@bothner.com> writes:

> Your point seems to be that it is a bad idea to use array to represent
> complex nested structure, such as Lisp-style source code.  I agree
> this arrays are at a disadvantage here, since Lisp source code uses
> many variable-sized and short sequences.  However, the disadvantage is
> minor (at most a small constant factor), and it doesn't really matter,
> since source code is relatively small and only kept around during input
> and compilation.  It is more interesting what happens with large files
> and large sequences.  That is where arrays win and (I think) lists
> have the disadvantage.

I think this paragraph sums up what is wrong with your argument for
Lisp. You seem to be assuming that people typicially use huge,
shallowly-nested lists which would `obviously' be better represented
in some array form.  Well, I just don't think that's true for
well-written Lisp programs.  I don't have general evidence, but in
programs I've written I've done some measurement of this kind of
thing, because I was concerned about things like search times, and I
found mean lengths of about 6 elements and mean nesting depths about
the same.

I suspect this isn't the case for Java, but Java programs perhaps
don't deal with the same kind of data that Lisp ones do.

--tim


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Terje Mathisen  
View profile  
 More options Dec 14 2000, 6:20 am
Newsgroups: comp.lang.lisp, comp.arch
From: Terje Mathisen <terje.mathi...@hda.hydro.com>
Date: Thu, 14 Dec 2000 09:59:07 +0100
Local: Thurs, Dec 14 2000 3:59 am
Subject: Re: Could CDR-coding be on the way back?

Per Bothner wrote:

> Erik Naggum <e...@naggum.net> writes:
> >   You recommend arrays to other people, not just to yourself, unless I'm
> >   _gravely_ mistaken about the purpose of your article.  Other people
> >   who have much less experience than you and rely on your experience in
> >   order to form their own, with the least possible pain.  My experience
> >   with people who try to implement things with arrays is that they end
> >   up with very cons-like concepts once they outgrow the C "memory is
> >   really an array of bytes" idea.  The reason is simple: Arrays with the
> >   properties we discuss are very, very hard to implement correctly.

> It is not that hard.  Still, I agree it is is best if arrays are well
> supported by the programming language or at least a good library.

What's wrong with STL?

If you have to use C++ anyway, STL has a lot of very useful ideas.

> My argument is a little of all of:

> (1) I think high-level programming languages should be built around
> array support rather list support.  (Actually, I prefer language
> primitives to work on generalized sequences, with arrays just being
> the default implementation.)

I.e. STL <vector> etc.

Terje

--
- <Terje.Mathi...@hda.hydro.com>
Using self-discipline, see http://www.eiffel.com/discipline
"almost all programming can be viewed as an exercise in caching"


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erik Naggum  
View profile  
 More options Dec 14 2000, 8:17 am
Newsgroups: comp.lang.lisp, comp.arch
From: Erik Naggum <e...@naggum.net>
Date: 14 Dec 2000 12:35:59 +0000
Local: Thurs, Dec 14 2000 7:35 am
Subject: Re: Could CDR-coding be on the way back?
* Per Bothner <p...@bothner.com>
| I am not talking about how to implement lists efficiently in Lisp, if
| by "lists" you mean the "list" type of Common Lisp, except that I
| don't think there is much point in implementing cdr-coding, at least
| in hardware.  What I am talking about how to implement sequences or
| lists in the more generic sense.

  Well, OK.  cdr-coding is all about efficient implementation of lists in
  Lisp, and the reason I insist on that displacement offset that you reject
  out of hand even with the explanation is that the context of the whole
  thread has been how to store lists more efficiently, while obviously
  keeping all the listness qualities.  I'm sorry you have missed this.

  I don't have a problem with your desire to use arrays.  I use them all
  the time myself, and so do other Lispers.  I don't see a conflict, and I
  don't see a need to optimize Lisp any more towards arrays, as the support
  for the abstract type "sequence" already there, too.

  It seems to me that you think "Lisp" is the Scheme branch of things,
  which I barely think is a Lisp at all -- I think Scheme is an Algol with
  Lispy syntax and some functional properties.  There are many reasons I
  loath Scheme, but one of them is certainly that it exposes the choice of
  underlying types so much in the interest of giving you that Algol feel of
  efficiency and typeness.

#:Erik
--
  The United States of America, soon a Bush league world power.  Yeee-haw!


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jan Ingvoldstad  
View profile  
 More options Dec 14 2000, 9:12 am
Newsgroups: comp.lang.lisp, comp.arch
From: Jan Ingvoldstad <j...@ifi.uio.no>
Date: 14 Dec 2000 15:12:13 +0100
Local: Thurs, Dec 14 2000 9:12 am
Subject: Re: Could CDR-coding be on the way back?
On Wed, 13 Dec 2000 06:57:46 GMT, Ketil Z Malde <ke...@ii.uib.no>
said:

> You're probably right, but a good argument against it is that USENET
> actually works in a responsive and reliable manner.  WWW seems always
> to be too slow and produce too many 40x's.  

One of the reasons for that is that not all news servers get all of
Usenet.  Thank goodness.

Another argument against changing it to fetch-by-demand is that the
way things are feeded now, it works like crude pre-caching.

(Fetch-by-demand can be implemented in a smart way, too, of course,
lessening the problem, and even combined with pre-caching.  Isn't that
nice.)

--
In the beginning was the Bit, and the Bit was Zero.  Then Someone
said, Let there be One, and there was One.  And Someone blessed them,
and Someone said unto them, Be fruitful, and multiply, and replenish
the Word and subdue it: and have dominion over every thing that is.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Jan Ingvoldstad  
View profile  
 More options Dec 14 2000, 9:27 am
Newsgroups: comp.lang.lisp, comp.arch
From: Jan Ingvoldstad <j...@ifi.uio.no>
Date: 14 Dec 2000 15:27:32 +0100
Local: Thurs, Dec 14 2000 9:27 am
Subject: Re: Could CDR-coding be on the way back?
On 13 Dec 2000 10:48:19 +0100, Espen Vestre
<espen@*do-not-spam-me*.vestre.net> said:

> Erik Naggum <e...@naggum.net> writes:
>> I expect that only some headers will be passed around very shortly,
>> with really clever cache propagation protocols that makes clients
>> retrieve an article by message-id on demand, ultimately from the
>> originating server, cutting traffic and disk requirements down to what
>> is actually used, and killing half the stupid spam problem, too.  

This is a bit too crude, unless it's your local server doing the
fetching for you, not the client (user agent).  But that may be what
you mean with "clever cache propagation".

However, the originating/injecting server must have the bandwidth or
other capacity for dealing with external requests for articles or even
storing articles for as long as necessary (that is, until some
reasonable amount of time has passed).  If those requirements aren't
met, the current model seems much better to me.

Considering that there are great variances in how long articles are
stored from news provider to news provider, it seems likely that there
is a significant amount of users who want to read older articles.  It
isn't unreasonable to assume there will be a good amount of arbitrary
requests for older articles at the originating server, say up to a
month after the article was posted.  Someone with a large user/poster
base will have to upgrade their injecting servers.  :)

Another issue is that if the injecting server is somewhere remote in
Australia and your client is in Norway, response will be slow,
reducing the usefulness of Usenet compared to the web.

Ketil Z Malde has a point when he talks about the responsiveness of
today's Usenet; it's very important for the user that the articles
requested appear "immediately".  (There hasn't been much research on
Usenet, but I believe it's safe to apply relevant aspects of usability
studies of the web.)

> I've been hoping for such a thing for about 5 years now, but is there
> any work goin gon...?

From what I understand, it isn't uncommon to deal with header-only
feeds, and Diablo supports fetching Message-IDs from other servers by
demand (automatic redirecting to the other news server).  The latter
seemed to work well when I tested it when the news servers in the
chain were topologically close.  I didn't test with servers on the
other side of the globe, though.

--
In the beginning was the Bit, and the Bit was Zero.  Then Someone
said, Let there be One, and there was One.  And Someone blessed them,
and Someone said unto them, Be fruitful, and multiply, and replenish
the Word and subdue it: and have dominion over every thing that is.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Peter da Silva  
View profile  
 More options Dec 14 2000, 12:33 pm
Newsgroups: comp.lang.lisp, comp.arch
From: pe...@abbnm.com (Peter da Silva)
Date: 14 Dec 2000 17:16:53 GMT
Local: Thurs, Dec 14 2000 12:16 pm
Subject: Re: Could CDR-coding be on the way back?
In article <1b7l53pvqd....@viper.cs.nmsu.edu>,
Joe Pfeiffer  <pfeif...@cs.nmsu.edu> wrote:

> The post points out that the discussion may be fundamentally flawed:
> lists and vectors are abstract data types, linked lists and arrays are
> implementations.  Up until that post, the data type and implementation
> were being confused.

Good call, that man. Thanks for putting what I meant into clearer terms.

--
 `-_-'   In hoc signo hack, Peter da Silva.
  'U`    "Milloin halasit viimeksi suttasi?"

         Disclaimer: WWFD?


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Miha Peternel  
View profile  
 More options Dec 14 2000, 1:20 pm
Newsgroups: comp.lang.lisp, comp.arch
From: mXrY...@email.com (Miha Peternel)
Date: Thu, 14 Dec 2000 19:20:06 +0100
Local: Thurs, Dec 14 2000 1:20 pm
Subject: Re: Could CDR-coding be on the way back?
In article <m3elzehe1q....@localhost.localdomain>, m...@bewoner.dma.be
says...

> Thanks for reminding me. Most of my stuff broke. But the mess was even
> worse: they had to put in mode bits to allow older programs to run
> unchanged. In a system like VM you ended up with two mode bits (EC/BC
> and XA or something like that? I've forgotten most of this stuff and I
> can't find my yellow booklet) and programs that only ran with one
> combination of them. It was very nice to have eight bits to work with
> with each pointer. Perhaps architecture designers should reconsider
> giving user programs some bits in the address space.

This is not an issue for architecture designers anymore, because
you can use paging to remap the addresses the way you wish. This
will only work with mid to top bits of course.

You can take linux and do the remapping the way you wish if it's
really that important.

Miha


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Nicholas Geovanis  
View profile  
 More options Dec 14 2000, 4:50 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Nicholas Geovanis <nick...@merle.acns.nwu.edu>
Date: Thu, 14 Dec 2000 15:32:11 -0600
Local: Thurs, Dec 14 2000 4:32 pm
Subject: Re: Could CDR-coding be on the way back?
Can't resist a discussion about old IBM 'frames and lisp. I did my lisp
homework assignments on line-at-a-time TSO :-)

If I remember correctly, back around 1980 or so, if you asked an IBM
engineer why they "needed" to move past 24-bit addressing to 31-bit, the
reason given was unrelated to application programming. That is, noone
needed to write programs which required that much memory, except for one
very special group of programmers.....the programmers who wrote IBM's
database management systems and operating systems.

Along with the big, new address spaces (actually just before them) came
additions to the instruction-set which permitted and protected the ability
to switch virtual storage context from one virtual address-space to
another (note that this is orthogonal to the ability to switch from
"kernel-mode" to "user-mode"). In practical terms, this permitted the OS
itself to spread ITS OWN data-structures across multiple address spaces in
order to circumvent the 16MB architectural limit. DBMS's did the same
thing.

The design and performance "win" in both cases was to permit the OS's
paging routines and the firmware's virtual-storage enhancements, which
were highly-optimized, to manage the buffering of data and data-structures
in real-memory. The "applications", ie. the bulk of the OS and DBMS, were
thereby relieved of these duties: coding and design were simplified.

I should mention that due to previous architecture/design choices, not all
of the older 16MB address space was available to the OS and/or DBMS
(and/or application) for exclusive, private use. Thus the effective
available virtual storage was less than 16MB, sometimes considerably less,
dependent on local OS changes as well. Even more reason to enlarge it and
to permit access to multiple virtual address spaces.

Any IBM folks want to chime-in here? Are all of the System/370 hands
extinct already? I haven't touched a 'frame in 9 years (don't miss
them either :-))

* Nick Geovanis        The data were not anything that would lead anybody
| IT Computing Svcs     to conclude the Fed would do anything different
| Northwestern Univ     than what they already believed they were likely
| n-geova...@nwu.edu    to do. - Chief Stock Market Analyst, 12/14/2000
+------------------->            Cantor Fitzgerald and Co.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Anne & Lynn Wheeler  
View profile  
 More options Dec 14 2000, 6:56 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Anne & Lynn Wheeler <l...@garlic.com>
Date: Thu, 14 Dec 2000 23:56:53 GMT
Local: Thurs, Dec 14 2000 6:56 pm
Subject: Re: Could CDR-coding be on the way back?

Nicholas Geovanis <nick...@merle.acns.nwu.edu> writes:
> I should mention that due to previous architecture/design choices, not all
> of the older 16MB address space was available to the OS and/or DBMS
> (and/or application) for exclusive, private use. Thus the effective
> available virtual storage was less than 16MB, sometimes considerably less,
> dependent on local OS changes as well. Even more reason to enlarge it and
> to permit access to multiple virtual address spaces.

> Any IBM folks want to chime-in here? Are all of the System/370 hands
> extinct already? I haven't touched a 'frame in 9 years (don't miss
> them either :-))

a major MVS issue (in 24bit addressing land) was that it had inherited
from SVS, MVT, PCP (back to early '60s), etc a paradigm/design where
the supervisor occupied the same address space as the application
programs. This was slightly mitigated in the SVS->MVS (in the early
'70s) where the restriction that all applications programs and all
supervisor functions occupy the same 24-bit address space was slightly
lifted (single virtual storage ... became multiple virtual storage ...
where there was a 16mbyte address space for each application ... with
the MVS supervisor and various subsystem components residing in each
one).

However, by the late-70s having all supervisor functions in the same
address space as the application along with various supervisor
subsystem requirements were again starting to serverely strees the
24bit address limit.

while some of the MVS gurus might have believed that they were the
biggest address space hogs in the world, some MVS installations were
having difficulty leaving even 6mbytes (of the 16mbytes) available to
application program. There were actual applications in the '70s that
demonstrated large address space appetites. Some of these were large
database transaction subsystems that had to exist in the tiny space
left in the 16mbytes space after the MVS supervisor and subsystem
requirements were met.

In the initial port of apl/360 to cms/apl ... the apl workspace
limited was opened up from typically 32k-64k bytes to just under
16mbytes. There were a number of applications that actually took
advanage of the increased workspace size.

One of those were the HONE service. This was the service in the '70s
and '80s that supported world-wide sales, marketing, hdqtrs, and field
support operations. One example, starting sometime in the mid '70s,
IBM mainframe orders became so complex that it couldn't be done
manually, a HONE application was needed to fill-in the order. Another
big use of HONE was for economic planning & modeling ... much of the
"what-if" processes done today on PC spreadsheets were performed in
APL.

In '77, the US HONE operations were consolidated in a single location
in california with, what was at the time believed to be the largest
single-system image operation in the world (cluster of SMP processors
sharing large disk farm). In '78/'79, the single-system image was
replicated in dallas and boulder providing disaster survivability
support (in case of national disaster, like an earthquake in
cal.). This was in addition to various HONE clones that resided in
numerous countries around the world.

Almost the entire HONE "experience" was delivered first on cms/apl and
then later on apl/cms.

random refs:

http://www.garlic.com/~lynn/2000f.html#62
http://www.garlic.com/~lynn/2000f.html#30
http://www.garlic.com/~lynn/2000.html#75
http://www.garlic.com/~lynn/99.html#112

--
Anne & Lynn Wheeler   | l...@garlic.com -  http://www.garlic.com/~lynn/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erik Naggum  
View profile  
 More options Dec 14 2000, 7:17 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Erik Naggum <e...@naggum.net>
Date: 14 Dec 2000 23:23:33 +0000
Local: Thurs, Dec 14 2000 6:23 pm
Subject: Re: Could CDR-coding be on the way back?
* Jan Ingvoldstad <j...@ifi.uio.no>
| This is a bit too crude, unless it's your local server doing the
| fetching for you, not the client (user agent).  But that may be what
| you mean with "clever cache propagation".

  I consider today's USENET distribution a crude cache propagation
  mechanism and propose something that uses much less bandwidth and disk
  space while it maintains the principle of nearby caches.  A few years
  ago, for instance, the five largest USENET sites in Oslo, Norway, were
  all located within a square mile, duplicating each other with great
  precision and speed because they had different user bases and owners.
  The extant news software could not propagate news articles by any
  other means than flooding every server with everything, but if the
  news protocols were only slightly smarter, they could have waited
  until they needed the article body before they requested it, if they
  had the headers that they could show to users.  Users would not have
  noticed those delays, and if there were any millisecond delays, it
  would be only for the first reader that day/week/whatever.  Now, scale
  this up and consider large (network-topological) regions cooperating
  to avoid duplicating news articles and traffic needlessly.  How many
  USENET servers are there around the core interchange points these
  days?  Distributed, redundant load balancing is not exactly news to
  people who care about distributed systems, but we still have people
  who worry needlessly if they cannot have their local USENET feed.  If
  you get upset because you cannot read news because a remote server is
  down, you are a USENET addict and need treatment, not better servers.

  The problem with the whole current design is, however, that you do not
  know who the originator (initial injector) and owner is and you cannot
  request the article except from another nearby cache, so if you have a
  mesasge-id that some article is a response to, you are basically hosed
  if you have not been so lucky as to have it flood by you.  That does
  not mean there _is_ no originator and owner, so I consider it useful
  to think of the injecting user as _virtually_ reachable.

| However, the originating/injecting server must have the bandwidth or
| other capacity for dealing with external requests for articles or even
| storing articles for as long as necessary (that is, until some
| reasonable amount of time has passed).  If those requirements aren't
| met, the current model seems much better to me.

  The current in-flow rate is a problem.  I have never heard of any
  USENET site that has out-flow problems for articles originating at
  their site.  Keeping your "own" articles around for months should be
  the smallest issue compared to keeping everybody else's articles
  around for weeks and days.  You can increase your out-flow a hundred-
  fold and still be _far_ short of the current in-flow, and this makes
  it possible for leaf sites to have (relatively speaking) _very_ small
  caches while their preferred nearest cache (akin to the site they get
  their feed from today) holds on to articles as long as one of their
  many leaf sites request it.  The great thing with a design like this
  is that you can upgrade from the leaf sites "inward" to the core
  servers, because as long as there are major core servers who still
  operate in the "old way", there will be much less pressure on the
  injecting site.  Given how slowly changes occur in the USENET core
  technology, the chances are very, very good that there will remain a
  number of huge servers who can and will act as proxies for the many
  injecting sites that will never see any significant network load.

| Considering that there are great variances in how long articles are
| stored from news provider to news provider, it seems likely that there
| is a significant amount of users who want to read older articles.

  Yes, that's the idea, to keep the original article available longer.

| It isn't unreasonable to assume there will be a good amount of
| arbitrary requests for older articles at the originating server, say
| up to a month after the article was posted.  Someone with a large
| user/poster base will have to upgrade their injecting servers.  :)

  Methinks you have lost all sense of proportion and need to back up and
  look at the numbers you give for the current situation and its growth
  and consider who the people are who contribute the volume of data that
  you refer to.  Yes, there are some large sites, perhaps responsible
  for as much as 1/1000th of the total volume on USENET each, but they
  _already_ have 1000 times the capacity to handle "injections" and if
  you ask them a 100 times for every article they have published, they
  are still at 1/10th their old bandwidth, and they aren't going to fill
  the remaining 9/10th with requests from other servers, either.

| Another issue is that if the injecting server is somewhere remote in
| Australia and your client is in Norway, response will be slow,
| reducing the usefulness of Usenet compared to the web.

  Really?  How come I can pick up the article from any number of very
  nearby caches today?  Hmmm.  Mystifying!  How _did_ they get there?

| Ketil Z Malde has a point when he talks about the responsiveness of
| today's Usenet; it's very important for the user that the articles
| requested appear "immediately".  (There hasn't been much research on
| Usenet, but I believe it's safe to apply relevant aspects of usability
| studies of the web.)

  Yeah, I think you guys have got it _exactly_ right.  Of course I'm out
  to destroy USENET and its usability when I suggest that we invent a
  better way to propagate articles.  Of course I'm trying my very best
  to screw with the minds of people who implement the software so _they_
  also think "Hey, this USENET thing really was a bad idea to begin
  with, so let's just do something utterly braindamaged that will kill
  it and hopefully everybody using it, too".  Get a _grip_, guys!

  If you don't understand cache propagation mechanisms, say so.  If you
  cannot even _imagine_ that somebody out there have tried to think
  about the way USENET propagates _while_ trying to keep it working for
  its users, I suggest you try to insult people openly instead of just
  assuming they have extra spicy garlic meatballs for brains.  Sheesh.

  Today's USENET propagation model does not try to keep track of where
  articles are read, that is the task of local news admins, most of whom
  take some pride in providing "everything" to their users.  If we did
  not ship the articles until they were read, only enough header stuff
  that people could see newsgroups listings and stuff, the traffic would
  change patterns to adapt to the readership instead of the global flood
  algorithm used today.  This would cause an increased ability to read
  "fringe" newsgroups (it's always a hassle to get a news admin to get
  another weird newsgroup hierarchy, because only one group is too much
  work and the whole hierarchy is too much data), with actual _user_
  input to the distribution of news articles.

| From what I understand, it isn't uncommon to deal with header-only
| feeds, and Diablo supports fetching Message-IDs from other servers by
| demand (automatic redirecting to the other news server).  The latter
| seemed to work well when I tested it when the news servers in the
| chain were topologically close.  I didn't test with servers on the
| other side of the globe, though.

  I'm aware of Diablo and strongly encourage further development along
  those lines, but as the network of servers picks up messages, they
  will naturally be cached many places along the way, not very much
  different from today's system.  Having to follow a chain of forwarding
  servers to get a particular article is therefore very unlikely, unless
  you read "fringe" newsgroups that nobody else in your vicinity reads.
  When you do that, you might also well tolerate longer access times.

#:Erik
--
  The United States of America, soon a Bush league world power.  Yeee-haw!


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Per Bothner  
View profile  
 More options Dec 14 2000, 7:22 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Per Bothner <p...@bothner.com>
Date: 14 Dec 2000 16:23:19 -0800
Local: Thurs, Dec 14 2000 7:23 pm
Subject: Re: Could CDR-coding be on the way back?
"Frank A. Adrian" <fadr...@uswest.net> writes:

> What I asked was given the mismatch of cache speed and CPU speed,
> would a software implementation of CDR-coding be a good
> implementation technique.

This is a reasonable question.  CDR-coding could conceivably be a
useful implementation techique in certain environments optimized for
large Lisp code bases.  That assumes the effort to implement it is
reasonable; since I'm not convinced you could not get a better payoff
by using arrays in place of lists, at least in those place where
cdr-coding would make difference.  But that might require re-writing
working code, which I can see people would rather not do!

> [List and arrays] are both useful under different circumstances

Agreed.

> and I don't think that you've given enough evidence to show that an
> array is the best choice for the default sequence type for a language.

Well, reasonable people can differ on this one.

> Certainly, it seems to be the "most natural" choice for Fortran, C,
> and Java language users, given that construction, traversal, and
> manipulation of lists is clunky in those languages

Using Lisp-style lists is quite straight-forward in Java.  What it lacks
is a standard pre-defined list class based on conses.  Using lists in
languages that lack a garbage collector is likely to be more tedious.
--
        --Per Bothner
p...@bothner.com   http://www.bothner.com/~per/

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Per Bothner  
View profile  
 More options Dec 14 2000, 7:26 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Per Bothner <p...@bothner.com>
Date: 14 Dec 2000 16:27:51 -0800
Local: Thurs, Dec 14 2000 7:27 pm
Subject: Re: Could CDR-coding be on the way back?

Tim Bradshaw <t...@tfeb.org> writes:
> You seem to be assuming that people typicially use huge,
> shallowly-nested lists which would `obviously' be better represented
> in some array form.  Well, I just don't think that's true for
> well-written Lisp programs.  I don't have general evidence, but in
> programs I've written I've done some measurement of this kind of
> thing, because I was concerned about things like search times, and I
> found mean lengths of about 6 elements and mean nesting depths about
> the same.

I think this actually supports my point, since arrays of length 6
are more space-efficient than lists of length 6.  (With some quibbles
about implementation techniques for adjustable vs non-adjustable arrays.)
I agree it wouldn't make that much difference either way for short lists.

One might also use data structures differently in an array language.
The classic example is a matrix, which is more efficiently represented
using a single vector in row-major order, rather than a nested vector
of vectors.  Another example is using two arrays to represent a tree.

> I suspect this isn't the case for Java, but Java programs perhaps
> don't deal with the same kind of data that Lisp ones do.

Probably not - and good Java programmers would be likely to structure
a solution differently than good Lisp programmers would.
--
        --Per Bothner
p...@bothner.com   http://www.bothner.com/~per/

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Anne & Lynn Wheeler  
View profile  
 More options Dec 14 2000, 7:44 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Anne & Lynn Wheeler <l...@garlic.com>
Date: Fri, 15 Dec 2000 00:44:38 GMT
Local: Thurs, Dec 14 2000 7:44 pm
Subject: Re: Could CDR-coding be on the way back?

... oh yes, and a somewhat separate issue for the 370 ... in addition
to virtual 24bit address was the issue of real 24bit address. With
sufficient concurrent applications it was possible to start to stress
the 16mbyte real storage limits (and possibly precipitate page
thrashing).

many 3033 shops in late '70s (basically a souped up 370/168) started
seeing these real-storage constraint problems.

the issue was how to address the problem.

CCW (i.e. i/o transfer) already supported 31bit real address with
IDALs.

To get some slight relief, the 3033 introduced a hack for getting more
than 16mbyte real storage. The 370 pagetable entry is 16bits, 12bits
for specifying real page number (when combined with 12bit, 4k pages,
yields 24bit addressing), an invalid bit, a protection bit, and two
unused bits ... something like

           NNNNNNNNNNNNPxxI

where "xx" are the unused/undefined bits. The 3033 hack was to use the
two "xx" bits and prefix them to the 12bit page number to allow
addressing up to 2**14 real pages ... or 2**26, 64mbytes of real
storage. Executable code was still limited to 24bit virtual addresses
but it was possible to allocate virtual pages in real storage above
the 24bit line by setting the appropriate bits in the page table
entry. And of course, the standard 370 CCW IDALs already had 31bits
available for addressing real storage in i/o operations.

cross-memory services was also introduced with the 3033. in an attempt
to help get some of the supervisor subsystem functions out of the same
address space as the application (at least get things to the point
where maybe a whole virtual 8mbytes was available to applications)
... and not to have a significant downside impact on the MVS
"address-pointer" parameter paradigm, these supervisor subsystem
functions had to reside in their own address space while still
directly supporting services requiring addressing of the application
virtual address space. cross-memory services introduced new addressing
modes allowing instructions to address virtual storage different than
the virtual address space that they were executing in.

random refs:

http://www.garlic.com/~lynn/99.html#99
http://www.garlic.com/~lynn/99.html#7
http://www.garlic.com/~lynn/2000e.html#57
http://www.garlic.com/~lynn/2000g.html#11

--
Anne & Lynn Wheeler   | l...@garlic.com -  http://www.garlic.com/~lynn/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Anne & Lynn Wheeler  
View profile  
 More options Dec 14 2000, 8:01 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Anne & Lynn Wheeler <l...@garlic.com>
Date: Fri, 15 Dec 2000 01:01:48 GMT
Local: Thurs, Dec 14 2000 8:01 pm
Subject: Re: Could CDR-coding be on the way back?
Anne & Lynn Wheeler <l...@garlic.com> writes:

oops finger slip, that should have been:

http://www.garlic.com/~lynn/99.html#190

giving the 3033 announce & ship dates

--
Anne & Lynn Wheeler   | l...@garlic.com -  http://www.garlic.com/~lynn/


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Paul DeMone  
View profile  
 More options Dec 14 2000, 8:15 pm
Newsgroups: comp.lang.lisp, comp.arch
From: Paul DeMone <pdem...@igs.net>
Date: Thu, 14 Dec 2000 20:12:18 -0500
Local: Thurs, Dec 14 2000 8:12 pm
Subject: Re: Could CDR-coding be on the way back?

Per Bothner wrote:

[...]

> Secondly, if I was designing a programming langauge from scratch, I
> would use arrays rather than lists as the fundamental data type for
> implementing sequences, partly for the reasons above.  The other
> reason is that I like languages that work on collections as a unit,
> and provide operations than work on collections (sequences, sets,
> relations, trees).  Such languages can be very powerful and compact.

 You might be interested in NIAL (Nested Interactive Array Language)

 http://www.nial.com

--
Paul W. DeMone       The 801 experiment SPARCed an ARMs race of EPIC
Kanata, Ontario      proportions to put more PRECISION and POWER into
dem...@mosaid.com    architectures with MIPSed results but ALPHA's well
pdem...@igs.net      that ends well.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Messages 76 - 100 of 246 < Older  Newer >
« Back to Discussions « Newer topic     Older topic »