Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Lisp io performance.

58 views
Skip to first unread message

David B. Lamkins

unread,
Jan 18, 1999, 3:00:00 AM1/18/99
to
In article <7810mk$1ik$1...@nnrp1.dejanews.com> , robert_...@yahoo.com
wrote:

> What's the fastest way to write a copy-file routine? It seems like there's no
> way to do block reads and writes, and the fastest copy-file routine I can
> write is about 4-5 times slower than the unix `cp', because it reads a byte,
> checks for the eof condition, then writes the byte for each and every byte.

Portably? I think the best you can do is to allocate a large buffer and use
read-sequence and write-sequence. Use file-length to determine the size of
the buffer (and to decide whether you need to copy the file in chunks).

Here's a start. This code does not do chunking -- you must be able to
allocate a buffer large enough to hold the file.

The speed of this will depend upon your Lisp vendor's implementation of
read/write-sequence.

See the CLHS for additional hints on efficiency of the read/write-sequence
functions.

(defun fast-copy-file (src-pathname dst-pathname)
(with-open-file (instream src-pathname :direction :input)
(with-open-file (outstream dst-pathname :direction :output)
(let* ((size (file-length instream))
(buffer (make-string size)))
(read-sequence buffer instream)
(write-sequence buffer outstream))))
t)


--
David B. Lamkins <http://www.teleport.com/~dlamkins/>

((lambda (x) (list x (list 'quote x)))
'(lambda (x) (list x (list 'quote x))))


robert_...@yahoo.com

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
What's the fastest way to write a copy-file routine? It seems like there's no
way to do block reads and writes, and the fastest copy-file routine I can
write is about 4-5 times slower than the unix `cp', because it reads a byte,
checks for the eof condition, then writes the byte for each and every byte.

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/ Search, Read, Discuss, or Start Your Own

Erik Naggum

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
* robert_...@yahoo.com

| What's the fastest way to write a copy-file routine? It seems like
| there's no way to do block reads and writes...

there is. READ-SEQUENCE and WRITE-SEQUENCE. apply it to vectors of some
reasonable machine size (8192 is frequently large enough to hold at least
one complete disk block) having an element-type of either CHARACTER or
(UNSIGNED-BYTE 8), and make the streams match that type.

#:Erik
--
SIGTHTBABW: a signal sent from Unix to its programmers at random
intervals to make them remember that There Has To Be A Better Way.

robert_...@yahoo.com

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
In article <eAUo2.19577$W_.11...@news1.teleport.com>,

"David B. Lamkins" <dlam...@teleport.com> wrote:
> In article <7810mk$1ik$1...@nnrp1.dejanews.com> , robert_...@yahoo.com
> wrote:
>
> > What's the fastest way to write a copy-file routine? It seems like there's
no
> > way to do block reads and writes, and the fastest copy-file routine I can
> > write is about 4-5 times slower than the unix `cp', because it reads a byte,
> > checks for the eof condition, then writes the byte for each and every byte.
>
> Portably? I think the best you can do is to allocate a large buffer and use
> read-sequence and write-sequence. Use file-length to determine the size of
> the buffer (and to decide whether you need to copy the file in chunks).
>
> Here's a start. This code does not do chunking -- you must be able to
> allocate a buffer large enough to hold the file.
>
> The speed of this will depend upon your Lisp vendor's implementation of
> read/write-sequence.
>
> See the CLHS for additional hints on efficiency of the read/write-sequence
> functions.
>
> (defun fast-copy-file (src-pathname dst-pathname)
> (with-open-file (instream src-pathname :direction :input)
> (with-open-file (outstream dst-pathname :direction :output)
> (let* ((size (file-length instream))
> (buffer (make-string size)))
> (read-sequence buffer instream)
> (write-sequence buffer outstream))))
> t)

This works great. I tried writing a version with chunking, and it's just as
fast as the Linux `cp' on ACL, but it hangs CMUCL for some reason. Since it's
so basic, I'm sure I screwed up somewhere. Have a look:

(defun fast-copy-file (src-pathname dst-pathname)
(with-open-file (instream src-pathname :direction :input)
(with-open-file (outstream dst-pathname :direction :output

:if-exists :supersede)
(let ((size (file-length instream))
(buffer (make-string 8192))
(pos 0))
(loop
(incf pos (read-sequence buffer instream))
(write-sequence buffer outstream)
(when (< (- size pos) 8192)


(read-sequence buffer instream)
(write-sequence buffer outstream

:end (- size pos))
(return))))))
t)

Any ideas?

Will Fitzgerald

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
How about:

(defun fast-copy-file (src-pathname dst-pathname &optional (blocksize 8192))
"copies a file; returns number of bytes copied"
(with-open-file (instream src-pathname :direction :input :element-type
'CHARACTER)
(with-open-file (outstream dst-pathname :direction :output :element-type
'CHARACTER)
(let ((buffer (make-array blocksize :element-type 'CHARACTER)))
(do ((pos (read-sequence buffer instream)
(read-sequence buffer instream))
(total 0 (+ total pos)))
((zerop pos) total)
(write-sequence buffer outstream :end (mod pos blocksize)))))))

[Combining Naggum's comments and Lamkins' suggestions]

Will Fitzgerald

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
Sorry, it should be:

(defun fast-copy-file (src-pathname dst-pathname &optional (blocksize 8192))
"copies a file; returns number of bytes copied"
(with-open-file (instream src-pathname
:direction :input :element-type 'CHARACTER)
(with-open-file (outstream dst-pathname
:direction :output :element-type 'CHARACTER)
(let ((buffer (make-array blocksize :element-type 'CHARACTER)))
(do ((pos (read-sequence buffer instream)
(read-sequence buffer instream))
(total 0 (+ total pos)))
((zerop pos) total)

(write-sequence buffer outstream :end pos))))))

Will Fitzgerald wrote in message <7825u0$5...@news.net-link.net>...

Jeff Sandys

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
Why not just issue a shell command?
With this expension in Allegro you can specify wait till done.

(defun copy-file (source destination)
(excl:run-shell-command (format nil "cp ~A ~A :wait t" source
destination))

thanks,
Jeff Sandys
Jeffrey....@boeing.com

robert_...@yahoo.com wrote:
>
> What's the fastest way to write a copy-file routine? It seems like there's no
> way to do block reads and writes, and the fastest copy-file routine I can
> write is about 4-5 times slower than the unix `cp', because it reads a byte,
> checks for the eof condition, then writes the byte for each and every byte.
>

Rainer Joswig

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
In article <7828av$n...@news.net-link.net>, "Will Fitzgerald" <fitzg...@neodesic.com> wrote:

> Sorry, it should be:
>
> (defun fast-copy-file (src-pathname dst-pathname &optional (blocksize 8192))
> "copies a file; returns number of bytes copied"
> (with-open-file (instream src-pathname
> :direction :input :element-type 'CHARACTER)
> (with-open-file (outstream dst-pathname
> :direction :output :element-type 'CHARACTER)
> (let ((buffer (make-array blocksize :element-type 'CHARACTER)))
> (do ((pos (read-sequence buffer instream)
> (read-sequence buffer instream))
> (total 0 (+ total pos)))
> ((zerop pos) total)
> (write-sequence buffer outstream :end pos))))))

Do you really want character as the element type?

--
http://www.lavielle.com/~joswig

Barry Margolin

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
In article <781nss$kmg$1...@nnrp1.dejanews.com>,

<robert_...@yahoo.com> wrote:
> (loop
> (incf pos (read-sequence buffer instream))
> (write-sequence buffer outstream)
> (when (< (- size pos) 8192)
> (read-sequence buffer instream)
> (write-sequence buffer outstream
> :end (- size pos))
> (return))))))
> t)

Think about what happens if SIZE is less than 8192.

BTW, stylistic note: any time a number outside the range -1:1, e.g. 8192,
appears multiple times in a piece of code, it's probably a candidate for a
DEFCONSTANT. In the case of this function, DEFPARAMETER might be more
appropriate, so you can experiment with different size buffers simply by
changing the value (for those of you who have wondered, this is where
DEFPARAMETER gets its name).

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Don't bother cc'ing followups to me.

Pierre Mai

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
robert_...@yahoo.com writes:

> This works great. I tried writing a version with chunking, and it's just as
> fast as the Linux `cp' on ACL, but it hangs CMUCL for some reason. Since it's
> so basic, I'm sure I screwed up somewhere. Have a look:
>
> (defun fast-copy-file (src-pathname dst-pathname)
> (with-open-file (instream src-pathname :direction :input)

> (with-open-file (outstream dst-pathname :direction :output

> :if-exists :supersede)
> (let ((size (file-length instream))
> (buffer (make-string 8192))
> (pos 0))

> (loop
> (incf pos (read-sequence buffer instream))
> (write-sequence buffer outstream)
> (when (< (- size pos) 8192)
> (read-sequence buffer instream)
> (write-sequence buffer outstream
> :end (- size pos))
> (return))))))
> t)
>

> Any ideas?

This is a bug in CMUCL. Funnily, this bug is dependant on the size of
the buffer. If the size exceeds a certain limit, read-sequence
suddenly starts to return 0, although it reads the sequence
correctly. I'll forward this post to the CMUCL bug mailing-list.

CMUCL's read-sequence has already a serious bug in current versions,
which was exposed by the get-file-as-string I recently posted, and
reported to the cmucl-imp mailing-list by Christopher R. Barry. This
bug is a flaw in the handling of read-sequence by the compiler, which
incorrectly flushes the call, because read-sequence is set-up as a
flushable (i.e. non side-effecting) function. Douglas Thomas Crosher
located the bug and produced a patch, so future versions of CMUCL will
not be affected by this.

Regs, Pierre.

--
Pierre Mai <pm...@acm.org> http://home.pages.de/~trillian/
"One smaller motivation which, in part, stems from altruism is Microsoft-
bashing." [Microsoft memo, see http://www.opensource.org/halloween1.html]

Erik Naggum

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
* Barry Margolin <bar...@bbnplanet.com>

| BTW, stylistic note: any time a number outside the range -1:1, e.g. 8192,
| appears multiple times in a piece of code, it's probably a candidate for
| a DEFCONSTANT. In the case of this function, DEFPARAMETER might be more
| appropriate, so you can experiment with different size buffers simply by
| changing the value (for those of you who have wondered, this is where
| DEFPARAMETER gets its name).

in this case, I think it's also worth pointing out that the size of the
string is still available. (length buffer) will yield 8192.

FWIW, here's my cut at it:

(defconstant disk-block-size 8192)

(defun copy-file (from to)
(with-open-file (in from :direction :input)
(with-open-file (out to :direction :output :if-exists :supersde)
(let ((buffer (make-array disk-block-size
:element-type (stream-element-type in))))
(loop for read = (read-sequence buffer in) until (zerop read) do
(write-sequence buffer out :end read))))))

Erik Naggum

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to
* Jeff Sandys <Jeffrey....@NOboeingSPAM.com>

| Why not just issue a shell command?

calling out to a Unix process suspends the entire Allegro CL process.
this is not good for multiprocessing.

Steven M. Haflich

unread,
Jan 19, 1999, 3:00:00 AM1/19/99
to Rainer Joswig
Rainer Joswig wrote:

> Do you really want character as the element type?

No. None of the proposed solutions (all of which have tried to copy
files as characters) has been portably sufficient to the task of
copying a file efficiently. Surprise: It cannot be done portably! It
is necessary to make some assumptions about the filesystems etc.

First, a character stream won't suffice because it isn't necessary on
every operating system and every file system that any file can
smoothly be processed as a stream of characters. Three example
glitches are worthy of mention: First, the ANS requires that line
breaks in an external character file be translated to and from the
single Lisp character #\newline. This isn't a problem on most Unix
systems, but on (losing) Windows this requires the Lisp IO routines
look at every character coming in and every character going out.
Second, some characters (e.g. ^Z on Windows, which is taken as EOF
unless special care is taken) won't read on all systems. Third, the
existence of the pretty printer suggests that character output streams
should try to keep track of the current column position. Consider a
program that does:

(write-sequence mumbl-string stream) (write foo stream)

If mumble-string doesn't end with a #\newline then the pretty printer
will need to know the column it ends in to determine if and where to
insert line breaks while printing foo.

Now, it might be DWIM for write-sequence to imply that I'm not
concerned about tracking the line position, or about line break
processing, but that isn't implied by anything in the language
standard -- and is sometimes definitely _not_ the case!

X3J13 added read-sequence and write-sequence with good reason and the
need they were intended to fill was very real. But they were not
based on sufficient real use and were therefore not carefully
considered in light of the range of problems they were intended to
solve. I know: I was there, I voted for them, and I've been trying to
live with them ever since. I have some thoughts on how the IO
specification for this industrial, real-world language might be
brought into the real world in a future revision of the standard...

Meanwhile, back to the original problem. If one is willing to assume
that any file can be copied as a stream of (UNSIGNED-BYTE 8), which is
true on all current systems on my aquaintance, then this would be my
approach:

(defun copy-file (from to)
(with-open-file (out to
:element-type '(unsigned-byte 8)
:direction :output
:if-exists :error :if-does-not-exist :create)
(let ((buf (make-array 2048 :element-type '(unsigned-byte 8))))
(with-open-file (in from :element-type '(unsigned-byte 8))
(loop as x = (read-sequence buf in)
until (= x 0)
do (write-sequence buf out :end x))))
out))

I originally wrote this as part of ACL and casual tests showed it to
be essentially as fast as cat on Unix. I think there is a heftier,
exported version with lots of options in the 5.0 release.

Will Fitzgerald

unread,
Jan 20, 1999, 3:00:00 AM1/20/99
to
Rainer Joswig politely and correctly asks:

>
>Do you really want character as the element type?
>


And the answer is, of course, no, as Steven Haflich post states. The
following works on ACL for Windows, and copies Windows binaries, which the
previous version did not):

(defun fast-copy-file (src-pathname dst-pathname &optional (blocksize 8192))
"copies a file; returns number of bytes copied"
(with-open-file (instream src-pathname

:direction :input :element-type '(unsigned-byte
8))
(with-open-file (outstream dst-pathname
:direction :output :element-type
'(unsigned-byte 8))
(let ((buffer (make-array blocksize :element-type '(unsigned-byte
8))))

Rainer Joswig

unread,
Jan 20, 1999, 3:00:00 AM1/20/99
to
In article <784o1d$2...@news.net-link.net>, "Will Fitzgerald" <fitzg...@neodesic.com> wrote:

>
>
> (defun fast-copy-file (src-pathname dst-pathname &optional (blocksize 8192))

Just to mention it, MCL has COPY-FILE :

COPY-FILE old-pathname new-pathname &key :if-exists :fork
[Function]
copies old-pathname to new-pathname and returns new-pathname. If
new-pathname already exists and the value of :if-exists is :error,
Macintosh Common Lisp signals an error. If its value is nil, Macintosh
Common Lisp returns nil; if it is :overwrite, the previous file is
overwritten and new-pathname is returned. :fork can be :data,
:resource, or :both. The default is :both.


CL-HTTP has:

HTTP:STREAM-COPY-UNTIL-EOF from-stream to-stream &optional copy-mode
[generic-function]
Copies all input from FROM-STREAM to TO-STREAM.
COPY-MODE can be any of :TEXT, :BINARY, or :CRLF, but defaults to :TEXT.
Ports and applications may specialize this method to optimize data transfer rates.


And CL-HTTP also has:

HTTP:COPY-FILE from-pathname to-pathname &key copy-mode &allow-other-keys
[generic-function]
A portable file copy.
COPY-MODE is one of :TEXT, CRLF, or binary.


Genera has:


copy-file from-pathname to-pathname &key (:element-type ':default)
(:characters ':default)
:byte-size
(:copy-creation-date t)
(:copy-author t)
:report-stream
(:create-directories ':query)
Function

This function copies one file to another. from-pathname identifies the
source file and must refer to a single file and contain no wild com-
ponents. to-pathname identifies the destination file and can contain
wild components, which are eliminated after merging the defaults by
means of :translate-wild-pathname.

copy-file first attempts to open from-path. When that has happened
successfully, it parses to-pathname and merges it (using merge-
pathnames) against the link-opaque truename of from-pathname and a
version of :newest. The output file specified by to-pathname is
opened with :if-exists :supersede. The processing of to-pathname
has the following result for version numbers.

Source Target Result
>foo>a.b.newest >bar> Retains the version number
>foo>a.b.newest >bar>x Makes a new version of >bar>x.b

The defaults for to-pathname come from the link-opaque truename of
from-pathname. For systems without links, this is indistinguishable
from the truename. Otherwise, the link-opaque truename depends on
whether from-pathname contains an :oldest or :newest version. If it
does not and if it is fully defaulted, with no wild components, the
pathname is its own link-opaque truename. If a pathname x contains
an :oldest or :newest version, the link-opaque truename is the
pathname of the file or link that corresponds to x, with the version
number filled in. For example, copying the LMFS file >a>p1.lisp to
>b> results in >b>p1.lisp, with the version of >a>p1.lisp.newest in-
herited. This is so whether >a>p1.lisp.newest is a real file, a link, or
a rename-through link.

By default, copy-file copies the creation date and author of the file.

Following is a description of the other options:

:element-type This argument specifies the type of Lisp object
transferred by the stream. Anything that can be rec-
ognized as being a finite subtype of character or
integer is acceptable. In particular, the following
types are recognized:

:characters Possible values:
:default copy-file decides whether this is
a binary or character transfer ac-
cording to the canonical type of
from-pathname. You do not need to
supply this argument for standard
file types. For types that are not
known canonical types, it opens
from-pathname in :default mode.
In that case, the server for the file
system containing from-pathname
makes the character-or-binary de-
cision.

t Specifies that the transfer must be
in character mode.

nil Specifies that the transfer must be
binary mode (in this case, you must
supply byte-size if using a byte
size other than 16).

:byte-size Specifies the byte size with which both files are
opened for binary transfers. You must supply
:byte-size when :characters is nil and the byte
size is other than 16. Otherwise, copy-file deter-
mines the byte size from the file type for from-
pathname. When from-pathname is a binary file with
a known canonical type, it determines the byte size
from the :binary-file-byte-size property of the
type. When the file does not have a known type, it
requests the byte size for from-pathname from the
file server. When the server for the file system con-
taining from-pathname cannot supply the byte size, it
assumes that the byte size is 16.

:report-stream
When :report-stream is nil (the default), the
copying takes place with no messages. Otherwise,
the value must be a stream for reporting the start
and successful completion of the copying. The com-
pletion message contains the truename of to-
pathname.

:create-directories
Determines whether directories should be created, if
needed, for the target of the copy. Permissible val-
ues are as follows:

t Try to create the target directory
of the copy and all superiors. Re-
port directory creation to
*standard-output*.

nil Do not try to create directories. If
the directory does not exist, handle
this condition like any other error.

:query If the directory does not exist, ask
whether or not to create it. This is
the default.

--
http://www.lavielle.com/~joswig

Rainer Joswig

unread,
Jan 20, 1999, 3:00:00 AM1/20/99
to
In article <mn7luhe...@rainbow.studorg.tuwien.ac.at>, chei...@ag.or.at wrote:

> Erik Naggum <er...@naggum.no> writes:
>
> > FWIW, here's my cut at it:

> > =
>
> > (defconstant disk-block-size 8192)
> > =
>
> > (defun copy-file (from to)


> > (with-open-file (in from :direction :input)
> > (with-open-file (out to :direction :output :if-exists :supersde)
> > (let ((buffer (make-array disk-block-size
> > :element-type (stream-element-type in))))

> (declare (dynamic-extent buffer))
> > (loop for read =3D (read-sequence buffer in) until (zerop read) do


> > (write-sequence buffer out :end read))))))
>

> Since this thread is about fast copying, declaring buffer
> DYNAMIC-EXTENT would help (especially when you copy lots of files),
> wouldn't it?

How about making it a resource and preallocating a pool of buffers?

--
http://www.lavielle.com/~joswig

eric dahlman

unread,
Jan 20, 1999, 3:00:00 AM1/20/99
to
Erik Naggum <er...@naggum.no> writes:

> * Jeff Sandys <Jeffrey....@NOboeingSPAM.com>
> | Why not just issue a shell command?
>
> calling out to a Unix process suspends the entire Allegro CL process.
> this is not good for multiprocessing.

I thought that they promised that version 5.0 would fix that on most
Unix platforms via os level threading. I only have 5.0 on Linux which
still uses user level threading so I cannot verify that until they
send us our Solaris version...

-Eric

Erik Naggum

unread,
Jan 20, 1999, 3:00:00 AM1/20/99
to
* eric dahlman <dah...@cs.colostate.edu>

| I thought that they promised that version 5.0 would fix that on most
| Unix platforms via os level threading.

Allegro CL 5.0 uses OS-level threading only under Bill Gates' regime.

| I only have 5.0 on Linux which still uses user level threading so I
| cannot verify that until they send us our Solaris version...

no difference between Linux and Solaris in this regard in Allegro CL 5.0.

BTW, I don't _want_ OS-level threading. it has far more problems than it
could ever hope to solve. OS-level threading is something you do when
you can't get it right in the language, IMNSHO. C needs it. that's
almost enough reason to avoid it.

Robert Monfera

unread,
Jan 20, 1999, 3:00:00 AM1/20/99
to
One reason for using OS level threads is to allow multiprocessing.
Probably there is an alternative way of doing that without having to
create processes?
Also, could you tell what major problems you see with OS level threads
so I can avoid them (I am using LWW) and how you would design
applications that scale well on multiprocessor machines. I am
interested in others' experiences with multithreading or multiprocessing
with LWW or ACL for B.G.R.

Thanks,
Robert

Erik Naggum

unread,
Jan 21, 1999, 3:00:00 AM1/21/99
to
* Robert Monfera <mon...@fisec.com>

| One reason for using OS level threads is to allow multiprocessing.

obviously, Allegro CL has multiprocessing within the same Unix process.

| Probably there is an alternative way of doing that without having to
| create processes?

a call to MAKE-PROCESS does indeed create a process, but not in the Unix
sense with a separate process idea and memory and all that.

| Also, could you tell what major problems you see with OS level threads so
| I can avoid them (I am using LWW) and how you would design applications
| that scale well on multiprocessor machines.

the problem with scaling on multiprocessor machines is not the OS-thread
or not OS-thread issue, but how Lisp wants a single address space and you
are changing the semantics of the language as a whole when you make each
thread a separate address space or almost a separate address space.

BTW, I have struggled with cooperating processes on two machines and I
really thought I could cut corners relative to the full theory, but it
turns out I can't, so I had to settle for one master and multiple slaves.
whether the slaves are Lisp processes in the same Unix process, Lisp
processes in separate Unix processes on the same machine, or Lisp
processes in Unix process elsewhere is not an issue: the interaction
between them is based on a low-bandwidth exchange of s-expressions.
where I have used run-reasons in the Allegro CL multiprocessing model, I
send forms for evaluations to the slaves.

| I am interested in others' experiences with multithreading or
| multiprocessing with LWW or ACL for B.G.R.

well, it would be nice if we had a journal in which to publish such
things, instead of _just_ the newsgroup and the Web. perhaps at the next
Lisp Users and Vendors or Lisp Users Group Meeting.

Christopher R. Barry

unread,
Jan 21, 1999, 3:00:00 AM1/21/99
to
Erik Naggum <er...@naggum.no> writes:

> the problem with scaling on multiprocessor machines is not the OS-thread
> or not OS-thread issue, but how Lisp wants a single address space and you
> are changing the semantics of the language as a whole when you make each
> thread a separate address space or almost a separate address space.

Kinda OT...

Awhile back I was at one of my friend's places and in his bookshelf he
had some cheesy "Time Life Illustrated Guide to Artificial
Intelligence" or something like that, and I was bored so I cracked it
open and looked through it a bit. It mentioned this computer called
the Connection Machine that had 65,000 processors or so (I've never
heard of any computer having even 1/8th that many processors, so I
think they might have got their info messed up, but maybe not). In
CLTL2 Guy Steele mentioned Connection Machine Lisp when talking about
the XAPPING format control string for FORMAT, and on Rainier's page
there's a picture of one (I can't really make it out though).

I tried doing some web searches for it, but I couldn't find any
pictures or documentation of the hardware or software.

Not that this has anything to do with the immediate problem at hand,
but it's cool that Lisp has been implemented before to do such
massively parallel processing.

Christopher

Erik Naggum

unread,
Jan 21, 1999, 3:00:00 AM1/21/99
to
* cba...@2xtreme.net (Christopher R. Barry)

| Not that this has anything to do with the immediate problem at hand, but
| it's cool that Lisp has been implemented before to do such massively
| parallel processing.

massively parallel computing really isn't like a multiprocessor: there
aren't any "threads" in them. MPC is about breaking down a sequential
problem into time-independent parts and has a granularity that is
miniscule compared to the very bulky "thread" thingies. e.g., an MPC
conceptually executes the several computations in a LET binding, time-
independent assignments, and the like on separate processors. some of
the benefits of this is actually achieved in modern processors for a
small number of parallel computations that use independent registers.

OS-level threads are really the wrong solution to the wrong problem:
because the OS likes to suspend the whole process and wake it up when the
kernel has serviced a request (which is bad design), you "need" a means
to make the kernel let some other, independent part of the process run,
but the right way is to send requests to the kernel for execution and
pass a continuation along, so you can go on with other things. so this
OS-level thread stuff really is about blocking system calls, not so much
about exploiting symmetrical multiprocessors. however, it wouldn't be
very hard to create a Common Lisp with SMP support -- it just had to work
its way down to the machine level in an unprecedented fashion, which the
OS-level threads themselves make it very hard for us to do, because they
solve the problem "well enough" that people don't bother to do all the
extra work it takes to get it _right_.

Sashank Varma

unread,
Jan 21, 1999, 3:00:00 AM1/21/99
to
In article <87ogntr...@2xtreme.net>, cba...@2xtreme.net (Christopher
R. Barry) wrote:
[snip]

> It mentioned this computer called
> the Connection Machine that had 65,000 processors or so (I've never
> heard of any computer having even 1/8th that many processors, so I
> think they might have got their info messed up, but maybe not).
[snip]

> I tried doing some web searches for it, but I couldn't find any
> pictures or documentation of the hardware or software.
[snip]

65k processors is correct.

read hillis' dissertation, probably available as an MIT tech report.
it was also published under the title 'the connection machine'. it
makes interesting reading.

there's also the following reference:

Wholey, Skef. An experimental implementation of connection machine
lisp. in Peter Lee (Ed.), "Topics in Advanced Language Implementation".

i don't know much about traditional approaches to multiprocessing but
enjoyed these readings.

sashank

Barry Margolin

unread,
Jan 21, 1999, 3:00:00 AM1/21/99
to
In article <87ogntr...@2xtreme.net>,

Christopher R. Barry <cba...@2xtreme.net> wrote:
>Awhile back I was at one of my friend's places and in his bookshelf he
>had some cheesy "Time Life Illustrated Guide to Artificial
>Intelligence" or something like that, and I was bored so I cracked it
>open and looked through it a bit. It mentioned this computer called

>the Connection Machine that had 65,000 processors or so (I've never
>heard of any computer having even 1/8th that many processors, so I
>think they might have got their info messed up, but maybe not). In
>CLTL2 Guy Steele mentioned Connection Machine Lisp when talking about
>the XAPPING format control string for FORMAT, and on Rainier's page
>there's a picture of one (I can't really make it out though).

I worked for Thinking Machines, and the CM-1 and CM-2 did support up to 64K
processors. However, these processors are extremely simple, 1-bit
processors; there were something like 16 or 32 of them on a chip. The idea
was modeled on the brain -- a complex network of extremely simple
components.

The followon system, the CM-5, used SPARC processors. I think the
architecture was designed to scale potentially to 16K, but the largest we
could make with the technology at the time was somewhere between 1K and 4K
(I don't remember).

>I tried doing some web searches for it, but I couldn't find any
>pictures or documentation of the hardware or software.

W. Daniel Hillis, the founder of TMC, wrote a book on the Connection
Machine, published by MIT Press.

The company filed for bankruptcy in 1994; when they came out, the only
thing left was the division working on data warehousing, and that's what
TMC currently does. The hardware is no longer sold (another company is
still providing maintenance on the ones that are still in operation).

Larry Hunter

unread,
Jan 21, 1999, 3:00:00 AM1/21/99
to

We had a 16k processor CM-1 at Yale when I was a grad student. I recall
having great fun (and miserable headaches) trying to program in *LISP (the
name of the lisp for the CM). Poking around on my bookshelf, I found I still
have a *LISP manual from 1990 (version 5.2). I (and many others) wrote a
fairly fast backprop/neural network learner in *LISP, but the most popular
thing we did was a version of Conway's game of life that used the LEDs on
the computer itself for a display. Very high coolness factor.

Since Hillis and the CM came from the AI lab, there was an initial bias
toward Lisp. Unfortunately, according to my friends who were at TMC, it
became clear fairly quickly that most of the people actually putting up
money to buy the machine wanted *FORTRAN, so *LISP languished.

Larry


--
Lawrence Hunter, PhD.
National Library of Medicine phone: +1 (301) 496-9303
Bldg. 38A, 9th fl, MS-54 fax: +1 (301) 496-0673
Bethesda. MD 20894 USA email: hun...@nlm.nih.gov

Rainer Joswig

unread,
Jan 21, 1999, 3:00:00 AM1/21/99
to
In article <yJJp2.410$oD6....@burlma1-snr1.gtei.net>, Barry Margolin <bar...@bbnplanet.com> wrote:

> >the XAPPING format control string for FORMAT, and on Rainier's page
> >there's a picture of one (I can't really make it out though).

It is the large black cube on the left with the red blinkenlights.
Well, the image is not that sharp. Sigh.

On the MCL CDROM there is old code to display images on the
LEDs. ;-)

--
http://www.lavielle.com/~joswig

Matt Wette

unread,
Jan 21, 1999, 3:00:00 AM1/21/99
to
cba...@2xtreme.net (Christopher R. Barry) writes:

> Erik Naggum <er...@naggum.no> writes:
>
> > the problem with scaling on multiprocessor machines is not the OS-thread
> > or not OS-thread issue, but how Lisp wants a single address space and you
> > are changing the semantics of the language as a whole when you make each
> > thread a separate address space or almost a separate address space.
>
> Kinda OT...
>

> Awhile back I was at one of my friend's places and in his bookshelf he
> had some cheesy "Time Life Illustrated Guide to Artificial
> Intelligence" or something like that, and I was bored so I cracked it
> open and looked through it a bit. It mentioned this computer called
> the Connection Machine that had 65,000 processors or so (I've never
> heard of any computer having even 1/8th that many processors, so I
> think they might have got their info messed up, but maybe not). In
> CLTL2 Guy Steele mentioned Connection Machine Lisp when talking about

> the XAPPING format control string for FORMAT, and on Rainier's page
> there's a picture of one (I can't really make it out though).
>

> I tried doing some web searches for it, but I couldn't find any
> pictures or documentation of the hardware or software.
>

> Not that this has anything to do with the immediate problem at hand,
> but it's cool that Lisp has been implemented before to do such
> massively parallel processing.

The company is Thinking Machines. The CM was SIMD. *The*
language to run on this machine was, of course, Fortran 90.

--
matthew.r.wette at jpl.nasa.gov -- I speak for myself, not for JPL.

Barry Margolin

unread,
Jan 21, 1999, 3:00:00 AM1/21/99
to
In article <7khftki...@jpl.nasa.gov>,

Matt Wette <Matthew.R.W...@jpl.nasa.gov> wrote:
>The company is Thinking Machines. The CM was SIMD. *The*
>language to run on this machine was, of course, Fortran 90.

Actually, the high-level language that mapped most naturally would probably
be APL. There was a guy at TMC who was into APL and I think had an
implementation, but it was never made part of the product.

Meanwhile, someone at the MIT AI Lab also implemented *Logo, so that
kiddies could have zillions of turtles on their screens. :)

Rainer Joswig

unread,
Jan 22, 1999, 3:00:00 AM1/22/99
to
In article <varmas-2101...@129.59.192.40>, var...@ctrvax.vanderbilt.edu (Sashank Varma) wrote:

> Wholey, Skef. An experimental implementation of connection machine
> lisp. in Peter Lee (Ed.), "Topics in Advanced Language Implementation".
>
> i don't know much about traditional approaches to multiprocessing but
> enjoyed these readings.
>
> sashank

There is also a book about "Paralation Lisp" written by G. Sabot.
- wasn't Paralation Lisp also for the CM - I can't remember.

--
http://www.lavielle.com/~joswig

Peter Van Eynde

unread,
Jan 22, 1999, 3:00:00 AM1/22/99
to
On 21 Jan 1999 14:32:47 -0500, Larry Hunter wrote:
>
>We had a 16k processor CM-1 at Yale when I was a grad student. I recall
>having great fun (and miserable headaches) trying to program in *LISP (the
>name of the lisp for the CM). Poking around on my bookshelf, I found I still
>have a *LISP manual from 1990 (version 5.2). I (and many others) wrote a
>fairly fast backprop/neural network learner in *LISP, but the most popular
>thing we did was a version of Conway's game of life that used the LEDs on
>the computer itself for a display. Very high coolness factor.

I still have a copy of a *LISP emulator (in Common Lisp) at home. I've been
planning to try it out...

>Since Hillis and the CM came from the AI lab, there was an initial bias
>toward Lisp. Unfortunately, according to my friends who were at TMC, it
>became clear fairly quickly that most of the people actually putting up
>money to buy the machine wanted *FORTRAN, so *LISP languished.

Hmm. Even seen the proposals for fortran-2000? Fortran seems to become more
and more Lisp-like with the years... (Disclaimer: I only took a quick look
and don't know recent developments anymore, but the fact that it then had
a GC and mapping functions was... disturbing)

Groetjes, Peter

--
It's logic Jim, but not as we know it. | pvan...@debian.org for pleasure,
"God, root, what is difference?",Pitr | pvan...@inthan.be for more pleasure!

Paolo Amoroso

unread,
Jan 26, 1999, 3:00:00 AM1/26/99
to
The popular textbook "Computer Organization & Design - The
Hardware/Software Interface" (by David A. Patterson and John L. Hennessy;
Morgan Kaufmann, 1994 - ISBN 1-55860-282-8 [*]) includes short but
interesting descriptions of the CM-2 (pages 599-601) and CM-5 (pages
626-630) architectures.

There are also pictures of both machines. The CM-2 is impressive and
reminds me of the WOPR(?) computer of the "War Games" movie. The CM-5 is
nice too, certainly much nicer than the Sequent Symmetry "refrigerator"
shown in another picture of the book :) Was CM-1 similar to CM-2?


On 21 Jan 1999 14:32:47 -0500, Larry Hunter <hun...@nlm.nih.gov> wrote:

> We had a 16k processor CM-1 at Yale when I was a grad student. I recall
> having great fun (and miserable headaches) trying to program in *LISP (the
> name of the lisp for the CM). Poking around on my bookshelf, I found I still

Was *LISP similar to any major contemporary Lisp implementation? How did it
exploit the system's parallelism? By means of appropriate language
constructs? Maybe the compiler did most of the hard work? Something
in-between?


> fairly fast backprop/neural network learner in *LISP, but the most popular
> thing we did was a version of Conway's game of life that used the LEDs on
> the computer itself for a display. Very high coolness factor.

From what I can see from the above mentioned pictures, the prominent LED
panels seem to contain hundreds of items. Your game of life must have been
really cool. What was the purpose of all those LEDs, by the way? The
pictures are too small to see whether there are any labels associated to
them.


Paolo

[*] This info refers to the first edition, but there's a more recent one.
--
Paolo Amoroso <amo...@mclink.it>

Rainer Joswig

unread,
Jan 26, 1999, 3:00:00 AM1/26/99
to
In article <36adb5d...@news.mclink.it>, amo...@mclink.it (Paolo Amoroso) wrote:

> From what I can see from the above mentioned pictures, the prominent LED
> panels seem to contain hundreds of items. Your game of life must have been
> really cool. What was the purpose of all those LEDs, by the way?

Processor state/activity, I'd guess.

--
http://www.lavielle.com/~joswig

George Neuner

unread,
Jan 26, 1999, 3:00:00 AM1/26/99
to
On Tue, 26 Jan 1999 12:36:19 GMT, amo...@mclink.it (Paolo Amoroso)
wrote:

>From what I can see from the above mentioned pictures, the prominent LED
>panels seem to contain hundreds of items. Your game of life must have been

>really cool. What was the purpose of all those LEDs, by the way? The
>pictures are too small to see whether there are any labels associated to
>them.

Every processor had two LEDs, one to indicate that it was active and
the other to indicate a hardware failure (you didn't want see those
lit).

As several people have said, the CM was a SIMD machine: every
processor simultaneously executed the same program. Because each
processor was working on a separate data set, it was necessary to have
conditional instruction execution. The solution used in the CM was to
partition the processors into "active" and "inactive" sets and to
provide ways to move processors from one set to the other. In
actuality there was one unconditional activation instruction and all
conditional tests could inactivate the CPU. CPUs that were inactive
would simply ignore all further instructions except the unconditional
activation.

This approached proved interesting the first time debugging a program
on the CM (especially in a high level language) because conditional
branching constructs didn't work like on serial machines. Rather than
skip the instructions in the unexecuted branch, every instruction in
both branches was executed - the conditional test(s) would mark some
set of processors inactive and those processors would idle until
control rejoined the main thread.


George Neuner
Dynamic Resolutions, Inc.

===================================================
The opinions expressed herein are my own and do not
reflect the opinions or policies of my employer.
===================================================

Barry Margolin

unread,
Jan 26, 1999, 3:00:00 AM1/26/99
to
In article <36adb5d...@news.mclink.it>,

Paolo Amoroso <amo...@mclink.it> wrote:
>Was *LISP similar to any major contemporary Lisp implementation? How did it
>exploit the system's parallelism? By means of appropriate language
>constructs? Maybe the compiler did most of the hard work? Something
>in-between?

*LISP was Common Lisp augmented with a bunch of extensions, mostly in the
form of macros. The main data type it adds is a "pvar", a parallel
variable; it's essentially an array whose elements are spread across the
processors. The main macros are *DEFUN, which defines a function that can
be run on the parallel processors, and *LET, which binds parallel
variables. There were also things like *IF, which invoked the CM's
conditional execution mechanism (on a pure SIMD system, like the CM-1 and
CM-2, conditionals are implemented by having a flag in each processor
saying whether it should run or not at any particular time, so an IF is
executed by copying the boolean into the flag, executing the THEN code,
toggling the flag in all processors, executing the ELSE code, then enabling
all the processors). The developers reverse-engineered a number of hooks
into the compilers (Symbolics and Lucid) to make this work reasonably --
they needed to tie into the code walker so that references to pvars would
be noticed and the containing expressions executed in parallel.

William Clodius

unread,
Jan 26, 1999, 3:00:00 AM1/26/99
to
Peter Van Eynde wrote:
> <snip>

> Hmm. Even seen the proposals for fortran-2000? Fortran seems to become more
> and more Lisp-like with the years... (Disclaimer: I only took a quick look
> and don't know recent developments anymore, but the fact that it then had
> a GC and mapping functions was... disturbing)
> <snip>

??

While Fortran 90 had some involvement with people very familiar with
Lisp, mostly from Thinking Machines, that standard was far more
influenced by Algol 68, Modula 2, Ada, and maybe APL. The mapping
functions were all array oriented. Some of the Lisp people continued on
with High Performance Fortran (HPF) 1.0, only part of which was included
in Fortran 95, notably FORALL and a more general WHERE. There seems to
be no significant effort to include more of HPF 2.0 in Fortran 2000
(which will probably end up as Fortran 2002).

As to GC, that usually refers to garbage collection. While the language
was designed to allow garbage collection for standard conforming code,
(and implementations from NAG and NASOFT have optional garbage
collectors) the current plans are to require the collection of only the
simplest forms of garbage, ("heap" variables with local scope), and not
garbage collection in general.

http://www.etrc.ox.ac.uk/SC22WG5/

http://www.ionet.net/~jwagener/j3/

--

William B. Clodius Phone: (505)-665-9370
Los Alamos Nat. Lab., NIS-2 FAX: (505)-667-3815
PO Box 1663, MS-C323 Group office: (505)-667-5776
Los Alamos, NM 87545 Email: wclo...@lanl.gov

0 new messages