
CMUCL18d on Alpha?


Rolf Wester
Apr 15, 2002, 3:42:42 AM
Hi,

I'm running CMUCL18b (binary distribution) on a Compaq Alpha. I would like
to upgrade to 18d. Is there any binary version of this release for the Alpha
architecture?
I couldn't find any. And if not how difficult would it be to build it myself?


Rolf Wester

P.S.: I have much more experience in using C++/Java etc., but I'm
increasingly using CMUCL for my daily work. I really appreciate the efforts
of all the volunteers who contribute to maintaining CMUCL.


Eric Marsden
Apr 15, 2002, 5:05:17 AM
>>>>> "rw" == Rolf Wester <wes...@ilt.fhg.de> writes:

rw> I'm running CMUCL18b (binary distribution) on a Compaq Alpha. I
rw> would like to upgrade to 18d. Is there any binary version of
rw> this release for the Alpha architecture? I couldn't find any.

none of the CMUCL developers are interested in/have access to this
platform, which is why there are no current binaries.

rw> And if not how difficult would it be to build it myself?

it would be tricky, since a cross-compile from a currently supported
architecture would be required, given the number of changes between
18b and 18d. Ask on the cmucl-imp mailing list if you are interested
in helping.

rw> I have much more experience in using C++/Java etc., but I'm
rw> increasingly using CMUCL for my daily work.

much more fun, isn't it!

--
Eric Marsden <URL:http://www.laas.fr/~emarsden/>

Christophe Rhodes
Apr 15, 2002, 5:44:03 AM
Rolf Wester <wes...@ilt.fhg.de> writes:

> I'm running CMUCL18b (binary distribution) on a Compaq Alpha. I would like
> to upgrade to 18d. Is there any binary version of this release for the Alpha
> architecture?
> I couldn't find any. And if not how difficult would it be to build it myself?

As well as Eric's advice, there's another avenue that may or may not
be of use to you: SBCL is supported on the Alpha, though currently
only on Linux; if you're using another flavour of Unix, it probably
isn't too hard to port the CMUCL runtime for that flavour over to SBCL
(somewhere between a day and a week, I should think).

Cheers,

Christophe
--
Jesus College, Cambridge, CB5 8BL +44 1223 510 299
http://www-jcsu.jesus.cam.ac.uk/~csr21/ (defun pling-dollar
(str schar arg) (first (last +))) (make-dispatch-macro-character #\! t)
(set-dispatch-macro-character #\! #\$ #'pling-dollar)

Rolf Wester
Apr 16, 2002, 4:33:42 AM

Eric Marsden wrote:

> >>>>> "rw" == Rolf Wester <wes...@ilt.fhg.de> writes:
>
> rw> I'm running CMUCL18b (binary distribution) on a Compaq Alpha. I
> rw> would like to upgrade to 18d. Is there any binary version of
> rw> this release for the Alpha architecture? I couldn't find any.
>
> none of the CMUCL developers are interested in/have access to this
> platform, which is why there are no current binaries.
>
> rw> And if not how difficult would it be to build it myself?
>
> it would be tricky, since a cross-compile from a currently supported
> architecture would be required, given the number of changes between
> 18b and 18d. Ask on the cmucl-imp mailing list if you are interested
> in helping.

I doubt that my skills would be sufficient for this. It would probably be a
waste of time. I'm going to switch to Linux on x86.

> rw> I have much more experience in using C++/Java etc., but I'm
> rw> increasingly using CMUCL for my daily work.
>
> much more fun, isn't it!

It depends on the problem at hand. If performance isn't the point it's much
more fun indeed. But right now I have the problem of reading a 27 MB file of
floats. Maybe I'm doing something wrong, but as it is, it's too slow to be
useful.

Thanks for your reply.

Regards

Rolf Wester

Rolf Wester
Apr 16, 2002, 4:38:24 AM

Christophe Rhodes wrote:

> Rolf Wester <wes...@ilt.fhg.de> writes:
>
> > I'm running CMUCL18b (binary distribution) on a Compaq Alpha. I would like
> > to upgrade to 18d. Is there any binary version of this release for the Alpha
> > architecture?
> > I couldn't find any. And if not how difficult would it be to build it myself?
>
> As well as Eric's advice, there's another avenue that may or may not
> be of use to you: SBCL is supported on the Alpha, though currently
> only on Linux; if you're using another flavour of Unix, it probably
> isn't too hard to port the CMUCL runtime for that flavour over to SBCL
> (somewhere between a day and a week, I should think).
>

Thanks for your reply. But I think under these circumstances it's better for me
to switch to Linux on x86. A high-end PC isn't slower than the Alpha.

Regards

Rolf Wester

Siegfried Gonzi
Apr 16, 2002, 5:53:48 AM
Rolf Wester wrote:

> It depends on the problem at hand. If performance isn't the point it's much more
> fun indeed.
> But right now I have the problem of reading a 27 MB file of floats. Maybe I'm
> doing something wrong but as it is it's too slow to be useful.

Maybe you should post the code here. I had some conversation in private
emails where my opponent always insisted that Common Lisp -- and the
right Lisp technique -- is comparable in speed to C.

I am quite sure you will get some good tips (not from me; it is not my
intention to become a proficient Lisp programmer) and it will be
interesting to see how Common Lisp performs.

For example, I saw (due to the private email conversation) that it is
possible to read in from a file into a Lisp list and you can then
initialize an array with this list: I am not sure whether Lisp
internally just sets a pointer or whether it copies some stuff. But this
was quite interesting (and not really obvious to me as an untrained
Lisp programmer).

S. Gonzi

Joe Marshall
Apr 16, 2002, 8:02:11 AM

"Siegfried Gonzi" <siegfri...@kfunigraz.ac.at> wrote in message
news:3CBBF4AC...@kfunigraz.ac.at...

>
> For example, I saw (due to the private email conversation) that it is
> possible to read in from a file into a Lisp list and you can then
> initialize an array with this list: I am not sure whether Lisp
> internally just sets a pointer or whether it copies some stuff. But this
> was quite interesting (and not really obvious to me as an untrained
> Lisp programmer).

There is a Lisp function called READ-SEQUENCE that is the
standard interface for a high-performance read. It is designed
to read elements from a stream into a pre-allocated sequence.
The main use for something like this is to read ASCII text into
a string or a byte-stream into an array. Depending on how it is
implemented it could be quite a bit faster than just calling
read-char (or read-byte) however many times.
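
For concreteness, a minimal sketch of the READ-SEQUENCE approach (the
function name is illustrative; the return value of READ-SEQUENCE is used
because it may stop short of filling the whole buffer):

```lisp
;; Slurp an entire text file into one string with a single
;; READ-SEQUENCE call, instead of one READ-CHAR per character.
(defun slurp-file (pathname)
  (with-open-file (in pathname :direction :input)
    (let* ((len (file-length in))
           (buffer (make-string len)))
      ;; READ-SEQUENCE returns the index of the first element it did
      ;; not fill, which can be less than LEN.
      (subseq buffer 0 (read-sequence buffer in)))))
```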

If you need something even faster, you can just make a
foreign call to MMAP. In Lucid CL I did this and then constructed
a displaced character array to the mmap'd memory.

The real question is what the original poster intends to
*do* with 27MB of floats.

Siegfried Gonzi
Apr 16, 2002, 8:21:46 AM

Joe Marshall wrote:

>
> The real question is what the original poster intends to
> *do* with 27MB of floats.

This is not really uncommon. I once had to deal with 50 MB files
(ASCII files with floating-point values). Clean handled the files in
seconds.

The problem lies here: how would you read a binary file in Common Lisp
(yes, yes, yes, it is maybe possible, but only if one is an expert and
invests some time in functions which convert 4 bytes into something
like a floating-point representation)?


S. Gonzi

Joe Marshall
Apr 16, 2002, 9:48:43 AM

"Siegfried Gonzi" <siegfri...@kfunigraz.ac.at> wrote in message
news:3CBC175A...@kfunigraz.ac.at...

>
>
> Joe Marshall wrote:
>
> >
> > The real question is what the original poster intends to
> > *do* with 27MB of floats.
>
> This is not really uncommon. I once had to deal with 50MB files
> (ASCII files with floating-point values). Clean handled the files in
> seconds.

I wasn't doubting that it was common. The question is what sort of
processing does he intend to do. Reading 27MB of floats into Common
Lisp quickly requires a bit of work on the part of the programmer,
and manipulating 27MB of floats will, too. Since lisp wants everything
to be tagged, and since floats don't have room for a tag, you
can take a severe hit if your code is constantly boxing and unboxing
the floats. You'd want to put them into an array specialized for
floats, and ensure the code that manipulates them does not
attempt to create boxed objects. This could be a challenge even
for someone moderately familiar with Lisp.
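
As a sketch of the specialized-array idea described above (the size and
the summing function are only illustrative):

```lisp
;; A vector specialized to DOUBLE-FLOAT stores its elements unboxed.
(defparameter *floats*
  (make-array 1000000 :element-type 'double-float
                      :initial-element 0d0))

;; With type declarations the compiler can keep intermediate results
;; unboxed as well, instead of allocating a fresh boxed float per
;; operation.
(defun sum-floats (v)
  (declare (type (simple-array double-float (*)) v)
           (optimize (speed 3) (safety 0)))
  (let ((sum 0d0))
    (declare (type double-float sum))
    (dotimes (i (length v) sum)
      (incf sum (aref v i)))))
```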

On the other hand, Lisp might be ideally suited for what he
intends to do with the results, and the C code that manipulates
the numbers might be hairier than the Lisp code that reads them.
If you take a look on the web for `Supercomputer Toolkit' you'll
find some research done at MIT and HP where they used partial
evaluation to compile high-level Scheme code into high-performance
floating point code (over 98% of the dynamically executed
instructions were floating point ops).

Marco Antoniotti
Apr 16, 2002, 10:36:24 AM

Siegfried Gonzi <siegfri...@kfunigraz.ac.at> writes:

Why not read directly into the (1D) vector?

(defvar *v* (make-array <some-size>))

(with-open-file (f "1d-data.dat" :direction :input)
  (read-sequence *v* f))

Or are you referring to 2D tabular data?

Could you post some C code that reads data into a C 2D array?

Thanks

--
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group tel. +1 - 212 - 998 3488
719 Broadway 12th Floor fax +1 - 212 - 995 4122
New York, NY 10003, USA http://bioinformatics.cat.nyu.edu
"Hello New York! We'll do what we can!"
Bill Murray in `Ghostbusters'.

Dr. Edmund Weitz
Apr 16, 2002, 10:55:45 AM
Marco Antoniotti <mar...@cs.nyu.edu> writes:

> Why not read directly into the (1D) vector?
>
> (defvar *v* (make-array <some-size>))
>
> (with-open-file (f "1d-data.dat" :direction :input)
>   (read-sequence *v* f))

Wouldn't that mean that you read one character (or rather stream
element) per array element? I think the OP wanted to have one float
per array element.

Edi.

--

Dr. Edmund Weitz
Hamburg
Germany

The Common Lisp Cookbook
<http://cl-cookbook.sourceforge.net/>

Rolf Wester
Apr 18, 2002, 4:50:25 AM

Siegfried Gonzi wrote:

> Rolf Wester wrote:
>
> > It depends on the problem at hand. If performance isn't the point it's much more
> > fun indeed.
> > But right now I have the problem of reading a 27 MB file of floats. Maybe I'm
> > doing something wrong but as it is it's too slow to be useful.
>
> Maybe you should post the code here. I had some conversation in private
> emails where my opponent always insisted that Common Lisp -- and the
> right Lisp technique -- is comparable in speed to C.
>
> I am quite sure you will get some good tips (not from me; it is not my
> intention to become a proficient Lisp programmer) and it will be
> interesting to see how Common Lisp performs.
>

The code for reading the ascii file is:

(defun read-lines (file)
  (with-open-file (in file :direction :input)
    (loop for zeile = (read-line in nil 'eof)
          until (eql zeile 'eof)
          collect zeile)))

or:

(defun read-file (file)
  (with-open-file (in file :direction :input)
    (let* ((n (file-length in))
           (str (make-string n)))
      (read-sequence str in :start 0 :end n)
      (values str))))

The next step would be to parse floats. I got a parse-float
function from the CMU repository. But the point is that
CMUCL doesn't even read the ascii file. I guess that the
problem has something to do with memory allocation
and/or the gc. I got messages like:

set_auto_gc_trigger: tried to set gc trigger too high! (67638792)

and then CMUCL hung. I posted to cmucl-help but haven't received an
answer so far.

Concerning Lisp performance I have done some tests. The relative
performance of the same program written in different languages
strongly depends on the problem at hand. I just made some tests
with a matrix-matrix multiplication. I wrote a C++ version using
TNT::Matrix<double> with and without bounds checking, an OCaml
version natively compiled with and without -unsafe (no bounds checks),
and a CMUCL version with type declarations. The result for the multiplication
of 400*400 matrices is (all programs run on a Compaq Alpha):

              bounds check    no bounds check
  Compaq/CXX  5.5 s           2.8 s
  OCaml       5 s             3.9 s
  CMUCL       5.3 s           ?

The C++ code could probably be made somewhat faster using raw pointers.
But my experience with this is quite bad, so I avoid it whenever possible.
I think CMUCL is quite fast, especially compared to the bounds-checked versions
of OCaml and CXX. And the great advantage of CMUCL is that I can use it
interactively (the interactive version of OCaml is only bytecode-compiled and much,
much slower: 64 s). So if I could get CMUCL to read my big file and parse
the floats in a reasonable time I would use CMUCL for the problem at hand. The above
numbers are valid for the special problem of matrix-matrix multiplication. I made
some other tests some time ago where the differences between CXX, OCaml and
CMUCL were greater. But for quite a lot of problems, especially those that would
benefit from interactivity, CMUCL's performance is good enough (nobody would want
to interact with a C++ program that runs an hour or so).


Rolf Wester

Siegfried Gonzi
Apr 18, 2002, 9:47:41 AM
Rolf Wester wrote:

> The result for the multiplication of
> 400*400 matrices is (all programs run on a Compaq Alpha):
>
>               bounds check    no bounds check
>   Compaq/CXX  5.5 s           2.8 s
>   OCaml       5 s             3.9 s
>   CMUCL       5.3 s           ?

Is the Lisp code without declarations?

> But for quite a lot of problems, especially those that would
> benefit from interactivity, CMUCL's performance is good enough (nobody would want
> to interact with a C++ program that runs an hour or so).

How much is the performance difference? People are often angry when the
difference to C is about a factor of 2. They even say that everything
which is slower than 90% of C's speed is garbage.

I now mainly work with Python and I have never had a dire need for a faster
Python. I agree my opinion would likely change if I had to wait 1 hour
whereas the C function would take 1 min.


I remember you mentioned that you are also an experienced
Python programmer. Did you read in the floats with Python too? I only
ask this in order to gather some information comparing Python and
CMUCL here.


S. Gonzi

Marco Antoniotti
Apr 18, 2002, 10:37:44 AM

Siegfried Gonzi <siegfri...@kfunigraz.ac.at> writes:

> Rolf Wester wrote:
>
> > The result for the multiplication of
> > 400*400 matrices is (all programs run on a Compaq Alpha):
> >
> >               bounds check    no bounds check
> >   Compaq/CXX  5.5 s           2.8 s
> >   OCaml       5 s             3.9 s
> >   CMUCL       5.3 s           ?
>
> Is the Lisp code without declarations?

Why are you asking this?


>
> > But for quite a lot of problems, especially those that would
> > benefit from interactivity, CMUCL's performance is good enough (nobody would want
> > to interact with a C++ program that runs an hour or so).
>
> How much is the performance difference? People often are angry when the
> difference to C is about a factor of 2. They even say that everything
> which is slower than 90% of C is garbage.
>
> I now mainly work with Python and I had never the dire need for a faster
> Python. I agree my opinion would likely change if I had to wait 1 hour
> whereas the C function would take 1 min.
>
>
> I can remember that you mentioned that you are also an experienced
> Python programmer. Did you read in the floats with Python too? I only
> ask this in order to retrieve some information concerning Python and
> CMUCL here.

CMUCL is compiled to native machine code. Last I checked Python
isn't.

Cheers

Raymond Toy
Apr 18, 2002, 11:07:38 AM
>>>>> "Rolf" == Rolf Wester <wes...@ilt.fhg.de> writes:

Rolf> Siegfried Gonzi wrote:

>> Rolf Wester wrote:
>>
>> > It depends on the problem at hand. If performance isn't the point it's much more
>> > fun indeed.
>> > But right now I have the problem of reading a 27 MB file of floats. Maybe I'm
>> > doing something wrong but as it is it's too slow to be useful.
>>
>> Maybe you should post the code here. I had some conversation in private
>> emails where my opponent always insisted that Common Lisp -- and the
>> right Lisp technique -- is comparable in speed to C.
>>
>> I am quite sure you will get some good tips (not from me; it is not my
>> intention to become a proficient Lisp programmer) and it will be
>> interesting to see how Common Lisp performs.
>>

Rolf> The code for reading the ascii file is:

Rolf> (defun read-lines (file)
Rolf>   (with-open-file (in file :direction :input)
Rolf>     (loop for zeile = (read-line in nil 'eof)
Rolf>           until (eql zeile 'eof)
Rolf>           collect zeile)))
Rolf> or:

This would seem to read in each line of your file and return a list of
strings.

Rolf> (defun read-file (file)
Rolf>   (with-open-file (in file :direction :input)
Rolf>     (let* ((n (file-length in))
Rolf>            (str (make-string n)))
Rolf>       (read-sequence str in :start 0 :end n)
Rolf>       (values str))))

This just sucks your entire file into a single string.

Rolf> The next step would be to parse floats. I got a parse-float
Rolf> function from the CMU repository. But the point is that
Rolf> CMUCL doesn't even read the ascii file. I guess that the
Rolf> problem has something to do with memory allocation
Rolf> and/or the gc. I got messages like:

Can you say what the actual format of your file is? Is it just one
float per line?

If it really is just a file containing floats, wouldn't something like
(not tested):

(loop for zeile = (read in nil 'eof)
      until (eql zeile 'eof)
      collect zeile)

work better? Here you get a list of floats. Isn't that what you
really wanted to begin with?


Ray

Geoff Summerhayes
Apr 18, 2002, 11:40:34 AM

"Rolf Wester" <wes...@ilt.fhg.de> wrote in message
news:3CBE88D1...@ilt.fhg.de...

>
>
> The code for reading the ascii file is:
>
> (defun read-lines (file)
>   (with-open-file (in file :direction :input)
>     (loop for zeile = (read-line in nil 'eof)
>           until (eql zeile 'eof)
>           collect zeile)))
> or:
>
> (defun read-file (file)
>   (with-open-file (in file :direction :input)
>     (let* ((n (file-length in))
>            (str (make-string n)))
>       (read-sequence str in :start 0 :end n)
>       (values str))))
>

If you had written this in C, what would it look like?
Would you have used the same algorithm to read in the file?

/* You just have to love the C way of doing things :-P */
char *read_file(char *filename)
{
    FILE *fh;
    char *buffer = NULL;
    size_t filesize;
    if ((fh = fopen(filename, "rt")) != NULL)
    {
        if ((filesize = get_file_length(fh)) > 0)
        {
            if ((buffer = (char *)malloc(filesize + 1)) != NULL)
            {
                if (filesize != fread(buffer, 1, filesize, fh))
                {
                    free(buffer);
                    buffer = NULL;
                }
                else buffer[filesize] = '\0';
            }
        }
        fclose(fh);
    }
    return buffer;
}

If not, perhaps you should be looking at writing Lisp
code that more closely reflects the generic algorithm.

Another suggestion (warning: wiser heads may correct this):

(defun read-floats (filename)
  (let ((floats '())
        (*read-eval* nil))
    (with-open-file (stream filename :direction :input)
      (loop (multiple-value-bind (f error)
                (ignore-errors (read stream nil 'eof))
              (cond ((or error (eq f 'eof))
                     (return (nreverse floats)))
                    ((typep f 'float) (push f floats))))))))

-------
Geoff

Joe Marshall
Apr 18, 2002, 11:02:43 AM

"Rolf Wester" <wes...@ilt.fhg.de> wrote

> But right now I have the problem of reading a 27 MB file of floats.
>

> The code for reading the ascii file is:
>
> (defun read-lines (file)
>   (with-open-file (in file :direction :input)
>     (loop for zeile = (read-line in nil 'eof)
>           until (eql zeile 'eof)
>           collect zeile)))
> or:
>
> (defun read-file (file)
>   (with-open-file (in file :direction :input)
>     (let* ((n (file-length in))
>            (str (make-string n)))
>       (read-sequence str in :start 0 :end n)
>       (values str))))

This latter would try to create a string 27 million characters long.
You might want to check to see if that exceeds the array-dimension-limit.
It also might be larger than the default size of the heap, or
the size of the space in which it would be consed.

I'd suggest looking at the CMUCL `alien' interface. If you map
the file into memory and access via the alien interface, you can
avoid some of the problems with consing a 27MB string.


Rolf Wester
Apr 18, 2002, 12:52:17 PM

Siegfried Gonzi wrote:

> Rolf Wester wrote:
>
> > The result for the multiplication of
> > 400*400 matrices is (all programs run on a Compaq Alpha):
> >
> >               bounds check    no bounds check
> >   Compaq/CXX  5.5 s           2.8 s
> >   OCaml       5 s             3.9 s
> >   CMUCL       5.3 s           ?
>
> Is the Lisp code without declarations?
>

Of course!

>
> > But for quite a lot of problems, especially those that would
> > benefit from interactivity, CMUCL's performance is good enough (nobody would want
> > to interact with a C++ program that runs an hour or so).
>
> How much is the performance difference?

As you can see from the table, a factor of two.


> People often are angry when the
> difference to C is about a factor of 2. They even say that everything
> which is slower than 90% of C is garbage.

Everybody is free to think what he/she wants. It's not my problem.

>
> I now mainly work with Python and I had never the dire need for a faster
> Python. I agree my opinion would likely change if I had to wait 1 hour
> whereas the C function would take 1 min.
>
> I can remember that you mentioned that you are also an experienced
> Python programmer. Did you read in the floats with Python too? I only
> ask this in order to retrieve some information concerning Python and
> CMUCL here.
>

No, I didn't. Even in the case that Python would read the floats faster, the
computations that I want to do with the floats would be much slower.

>
> S. Gonzi

Rolf Wester

Rolf Wester
Apr 18, 2002, 12:56:20 PM

Rolf Wester wrote:

> Siegfried Gonzi wrote:
>
> >
> > Is the Lisp code without declarations?
> >
>
> Of course!
>

Sorry. It has to be: "Of course there are declarations"

>
> Rolf Wester

Siegfried Gonzi
Apr 18, 2002, 1:52:38 PM
From: "Rolf Wester" <wes...@ilt.fhg.de>

> > How much is the performance difference?
>
> As you can see from the table a factor of two.


I think you mean the matrix-matrix multiplication? My question was more
aimed at: how much is the performance in Lisp when you read in your floats?

In case you meant a factor of 2 for reading floats from a file: I would not
care. A factor of 2 is absolutely fine. Surely, between 10 hours and 5 hours there
is a difference. But you did not say what regime you are in: the minutes or
the hours regime?

> > People often are angry when the
> > difference to C is about a factor of 2. They even say that everything
> > which is slower than 90% of C is garbage.
>
> Everbody is free to think what he/she wants. It's not my problem.


Yeah, right, but everybody has a different perception, hasn't he? I start
complaining about speed when there is a factor of 10; but there are a
great many people who complain about a factor of 2.

People should come together and contemplate: What do we want? Do we want
the power of things like Lisp, Python, Smalltalk, ... or do we need brute
number-crunching force? *Most* computations are not done on
super-big workstations where really every penny counts.

It is your right to expect whatever you want (and still a computer is a tool
for number crunching; this even holds after 50 years of "modern computer
science"), but always comparing the performance to C will impede you in the
long run.

S. Gonzi


Marco Antoniotti
Apr 18, 2002, 3:19:16 PM

"Siegfried Gonzi" <siegfri...@kfunigraz.ac.at> writes:

> From: "Rolf Wester" <wes...@ilt.fhg.de>
>
> > > How much is the performance difference?
> >
> > As you can see from the table a factor of two.
>
>
> I think you mean the matrix-matrix multiplication? My question was more
> aimed at: how much is the performance in Lisp when you read in your floats?
>
> The case you meant a factor of 2 for reading floats from a file: I would not
> care. A factor of 2 is absolutely good? Surely: 10 hours and 5 hours there
> is a difference. But you did not say where do you roam: in the minutes or
> hours regime?
>
> > > People often are angry when the
> > > difference to C is about a factor of 2. They even say that everything
> > > which is slower than 90% of C is garbage.
> >
> > Everbody is free to think what he/she wants. It's not my problem.
>
>
> Yeah, right, but everybody has a different perception, or hasn't it? I start
> complaining about speed when there is a factor of 10; but there are such a
> great many of people who complain due to a factor of 2.
>
> People should come together and contemplate: What do we want? Do we want
> the power of things like Lisp, Python, Smalltalk,... or do we need brute
> number crunching force? *Most* of the computations are not done on
> super-big-workstations, where really every penny counts.

Well. The point is that Common Lisp gives you (at this time in
history) a better trade-off in "expressive power" and "speed" than
Python (i.e. more "expressive power" *and* "more speed" than
Python). (I won't comment on Smalltalk).

Rolf Wester
Apr 19, 2002, 2:53:55 AM

Joe Marshall wrote:

> "Rolf Wester" <wes...@ilt.fhg.de> wrote
>
> > But right now I have the problem of reading a 27 MB file of floats.
> >
> > The code for reading the ascii file is:
> >
> > (defun read-lines (file)
> >   (with-open-file (in file :direction :input)
> >     (loop for zeile = (read-line in nil 'eof)
> >           until (eql zeile 'eof)
> >           collect zeile)))
> > or:
> >
> > (defun read-file (file)
> >   (with-open-file (in file :direction :input)
> >     (let* ((n (file-length in))
> >            (str (make-string n)))
> >       (read-sequence str in :start 0 :end n)
> >       (values str))))
>
> This latter would try to create a string 27 million characters long.
> You might want to check to see if that exceeds the array-dimension-limit.

I checked this:
* array-dimension-limit
536870911

>
> It also might be larger than the default size of the heap, or
> the size of the space in which it would be consed.
>

If I remember right, some time ago I read that older versions of CMUCL
(I use CMUCL18b) only support 100 MB of heap size. I don't know exactly,
but I guess that the problem might be linked to this limit.

>
> I'd suggest looking at the CMUCL `alien' interface. If you map
> the file into memory and access via the alien interface, you can
> avoid some of the problems with consing a 27MB string.

Unfortunately I'm running the CMUCL18b binary distribution on a Compaq Alpha,
and this doesn't support "load-foreign". There is no more recent CMUCL binary
version for the Alpha and it would be quite tedious to build it from sources.
I'm going to switch to an x86 PC running Linux.

Thanks for your reply.

Regards

Rolf Wester

Rolf Wester
Apr 19, 2002, 2:58:02 AM

Raymond Toy wrote:

There are 3 floats per line, x y z.

>
> If it really is just a file containing floats, wouldn't something like
> (not tested):
>
> (loop for zeile = (read in nil 'eof)
>       until (eql zeile 'eof)
>       collect zeile)
>
> work better? Here you get a list of floats. Isn't that what you
> really wanted to begin with?

That's right. But the problem is that read has to parse the input stream without
having the information that the next object is a float. So this is the time
bottleneck, not the reading itself. That's why I looked for alternatives.
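
One way around the generic reader is READ-LINE plus a dedicated float
parser. A sketch, under the assumption that PARSE-FLOAT (like the CMU
repository function mentioned earlier) takes a string and a :start
position and returns the parsed value plus the position after it; that
signature is an assumption, so adapt it to the version actually in use:

```lisp
;; Read X Y Z triples line by line and hand each line to a dedicated
;; float parser instead of the generic READ.  PARSE-FLOAT is passed in
;; because its exact signature is an assumption here.
(defun read-xyz (pathname parse-float)
  (with-open-file (in pathname :direction :input)
    (loop for line = (read-line in nil nil)
          while line
          collect (multiple-value-bind (x p1) (funcall parse-float line :start 0)
                    (multiple-value-bind (y p2) (funcall parse-float line :start p1)
                      (list x y (funcall parse-float line :start p2)))))))
```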

Rolf Wester
Apr 19, 2002, 3:24:33 AM

Geoff Summerhayes wrote:

>
> If you had written this in C, what would it look like?
> Would you have used the same algorithm to read in the file?
>
> /* You just have to love the C way of doing things :-P */
> char *read_file(char *filename)
> {
>     FILE *fh;
>     char *buffer = NULL;
>     size_t filesize;
>     if ((fh = fopen(filename, "rt")) != NULL)
>     {
>         if ((filesize = get_file_length(fh)) > 0)
>         {
>             if ((buffer = (char *)malloc(filesize + 1)) != NULL)
>             {
>                 if (filesize != fread(buffer, 1, filesize, fh))
>                 {
>                     free(buffer);
>                     buffer = NULL;
>                 }
>                 else buffer[filesize] = '\0';
>             }
>         }
>         fclose(fh);
>     }
>     return buffer;
> }
>

My C++ version is either:

#include <vector>
#include <fstream>
#include <iostream>

using namespace std;

int read2()
{
    ifstream in("gs.txt");
    if (!in)
    {
        cout << "Error opening gs.txt" << endl;
        return 1;
    }
    const int n = 1046529;
    vector<double> x(n);
    vector<double> y(n);
    vector<double> z(n);
    int i;
    for (i = 0; i < n; i++)
    {
        in >> x[i] >> y[i] >> z[i];
    }
    cout << x[n-1] << " " << y[n-1] << " " << z[n-1] << endl;
    return 0;
}
which takes 4.5 s.

or

int read1()
{
    ifstream in("gs.txt");
    if (!in)
    {
        cout << "Error opening gs.txt" << endl;
        return 1;
    }
    vector<double> x, y, z;
    double xx, yy, zz;
    while (1)
    {
        in >> xx >> yy >> zz;
        if (in.eof() == 0)
        {
            x.push_back(xx);
            y.push_back(yy);
            z.push_back(zz);
        }
        else
        {
            int n = x.size();
            cout << x[n-1] << " " << y[n-1] << " " << z[n-1] << endl;
            return 0;
        }
    }
}
which takes 6.5 s.

>
> If not, perhaps you should be looking at writing Lisp
> code that more closely reflects the generic algorithm.
>
> Another suggestion (warning: wiser heads may correct this):
>
> (defun read-floats (filename)
>   (let ((floats '())
>         (*read-eval* nil))
>     (with-open-file (stream filename :direction :input)
>       (loop (multiple-value-bind (f error)
>                 (ignore-errors (read stream nil 'eof))
>               (cond ((or error (eq f 'eof))
>                      (return (nreverse floats)))
>                     ((typep f 'float) (push f floats))))))))
>

I will try this.

Thanks for your reply.

Regards

Rolf Wester

Rolf Wester
Apr 19, 2002, 4:16:51 AM

Siegfried Gonzi wrote:

> From: "Rolf Wester" <wes...@ilt.fhg.de>
>
> > > How much is the performance difference?
> >
> > As you can see from the table a factor of two.
>
> I think you mean the matrix-matrix multiplication? My question was more
> aimed at: how much is the performance in Lisp when you read in your floats?
>

With:

(defun read-file (file)
  (with-open-file (in file :direction :input)
    (loop for z = (read in nil 'eof)
          until (eql z 'eof)
          collect z)))

(defparameter *file* "gs.txt")
(defparameter *all* nil)

(time
 (setf *all* (read-file *file*)))

Evaluation took:
  261.77 seconds of real time
  243.39977 seconds of user run time
  14.983552 seconds of system run time
  [Run times include 202.83 seconds GC run time]
  0 page faults and
  402732904 bytes consed.

This is very slow compared to 5 s with C++ and 14 s with OCaml (24 s
if byte-compiled).

>
> The case you meant a factor of 2 for reading floats from a file: I would not
> care. A factor of 2 is absolutely good? Surely: 10 hours and 5 hours there
> is a difference. But you did not say where do you roam: in the minutes or
> hours regime?
>
> > > People often are angry when the
> > > difference to C is about a factor of 2. They even say that everything
> > > which is slower than 90% of C is garbage.
> >
> > Everbody is free to think what he/she wants. It's not my problem.
>
> Yeah, right, but everybody has a different perception, or hasn't it? I start
> complaining about speed when there is a factor of 10; but there are such a
> great many of people who complain due to a factor of 2.

Nobody is forced to program in Lisp. Everybody is free to do it in C or assembler.

> People should comme together and contemplate: What do we want? Do we want
> the power of things like Lisp, Python, Smalltalk,... or do we need brute
> number crunching force? *Most* of the computations are not done on
> super-big-workstations, where really every penny counts.
>
> It is your right to expect whatever you want (and still a computer is a tool
> for number crunching; this even holds after 50 years of "modern computer
> science"), but always to compare the performance to C will impede you in the
> long run.
>

It always depends on the problem at hand. I have programs that run for hours
or days. In those cases performance is a very important point. Other problems
may be quite tedious but don't need much performance. These are the problems I
use OCaml or Lisp for. Both are more powerful and faster than Python.

Regards

Rolf Wester


Rahul Jain

Apr 19, 2002, 5:22:02 AM
to
Rolf Wester <wes...@ilt.fhg.de> writes:

> (defun read-file (file)
>   (with-open-file (in file :direction :input)
>     (loop for z = (read in nil 'eof)
>           until (eql z 'eof)
>           collect z)))

> Evaluation took:
[...]


> [Run times include 202.83 seconds GC run time]

[...]
> 402732904 bytes consed.

Do you know before-hand the exact dimensions of the array you'll be
putting the floats into?

If you put them into a specialized array, you might be able to use the
commonly available parse-float code, tweaked so that it operates only
on unboxed floats and stores the result directly into the destination array.
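A minimal sketch of that idea, assuming one float per line and a known count N. PARSE-FLOAT here stands for any of the commonly circulated parse-float implementations; it is not part of ANSI CL, so treat it as a hypothetical helper:

```lisp
;; Sketch: fill a specialized double-float vector of known size N,
;; avoiding the list and boxing overhead of COLLECT.
;; PARSE-FLOAT is assumed to come from one of the commonly
;; available parse-float libraries (hypothetical here).
(defun read-floats-into-vector (file n)
  (let ((result (make-array n :element-type 'double-float))
        (i 0))
    (with-open-file (in file :direction :input)
      (loop for line = (read-line in nil nil)
            while (and line (< i n))
            do (setf (aref result i)
                     (float (parse-float line) 1d0))
               (incf i)))
    result))
```

The SETF into a (SIMPLE-ARRAY DOUBLE-FLOAT (*)) keeps the stored values unboxed; whether the intermediate float returned by PARSE-FLOAT is boxed depends on the implementation and on declarations.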

--
-> -/ - Rahul Jain - \- <-
-> -\ http://linux.rice.edu/~rahul -=- mailto:rj...@techie.com /- <-
-> -/ "Structure is nothing if it is all you got. Skeletons spook \- <-
-> -\ people if [they] try to walk around on their own. I really /- <-
-> -/ wonder why XML does not." -- Erik Naggum, comp.lang.lisp \- <-
|--|--------|--------------|----|-------------|------|---------|-----|-|
(c)1996-2002, All rights reserved. Disclaimer available upon request.

Rolf Wester

Apr 19, 2002, 6:27:47 AM
to

Rahul Jain wrote:

> Rolf Wester <wes...@ilt.fhg.de> writes:
>
> > (defun read-file (file)
> >   (with-open-file (in file :direction :input)
> >     (loop for z = (read in nil 'eof)
> >           until (eql z 'eof)
> >           collect z)))
>
> > Evaluation took:
> [...]
> > [Run times include 202.83 seconds GC run time]
> [...]
> > 402732904 bytes consed.
>
> Do you know before-hand the exact dimensions of the array you'll be
> putting the floats into?
>
> If you put them into a specialized array, you might be able to use the
> commonly available parse-float code tweaked such that it operates only
> on unboxed floats and the float in the destination array.
>

Thank you for your hint. I'm going to try it.

Regards

Rolf Wester

Pierre R. Mai

Apr 19, 2002, 10:52:02 AM
to
Rolf Wester <wes...@ilt.fhg.de> writes:

> > If it really is just a file containing floats, wouldn't something like
> > (not tested):
> >
> > (loop for zeile = (read in nil 'eof)
> >       until (eql zeile 'eof)
> >       collect zeile)
> >
> > work better? Here you get a list of floats. Isn't that what you
> > really wanted to begin with?

Actually, since you've got millions of floats, I think that they
should be read directly into a specialized vector, otherwise the list
and boxing overhead is going to be huge, and might be the reason you
are running into the 18b heap limit. On the x86 using a fairly
current CMUCL (post 18c, but pre 18d), I can read 2 million random
single floats (3 per line, making a 20MB file) in around 16-18MB of
heap space, doubles should be double that amount (not tested), with:

(defun read-floats-slow (stream)
  (do ((result (make-array 1000 :element-type 'single-float :adjustable t
                           :fill-pointer 0))
       (float (read stream nil stream) (read stream nil stream)))
      ((eq stream float)
       result)
    (vector-push-extend float result)))

Reading takes 80s for the file on an AMD K6-2/550.

> That's right. But the problem is that read has to parse the input
> stream without having the information that the next object is a
> float. So this is the time bottle neck, not the reading itself. So
> thats why I looked for alternatives.

A cursory test of that hypothesis turns out to invalidate it. Even if
you strip out all error checking, you can really not squeeze out much
of the tokenization step, since all you need to do in the inner loop
of the finite-state-machine that the reader is, is one table lookup
and a case for each char, and you have to do that even if you only
need to check for a whitespace delimiter. A check with a hacked
version of the CMUCL reader seems to confirm this expectation (time is
more or less unchanged). The expensive part is making the float from
the token, it seems to me. I'll have to investigate if that step can
be made any faster in CMU CL.

Regs, Pierre.

--
Pierre R. Mai <pm...@acm.org> http://www.pmsf.de/pmai/
The most likely way for the world to be destroyed, most experts agree,
is by accident. That's where we come in; we're computer professionals.
We cause accidents. -- Nathaniel Borenstein

Joe Marshall

Apr 19, 2002, 11:54:57 AM
to
Just for kicks I ran a profile reading a file of
30000 double floats with 3 per line. The profile
showed that the vast majority of time was spent in
converting the tokens into floating point numbers.
(This is in AllegroCL, not CMUCL.)

I did a little research and found that it is quite
hard to efficiently and correctly parse floating
numbers. (By correctly, I mean finding the closest
float to the decimal representation. For details
see http://citeseer.nj.nec.com/william90how.html
and http://citeseer.nj.nec.com/gay90correctly.html)

It may be the case that your C library has a faster
mechanism for parsing floating point numbers than
your lisp.


"Rolf Wester" <wes...@ilt.fhg.de> wrote in message

news:3CBFBF03...@ilt.fhg.de...

Duane Rettig

Apr 19, 2002, 1:00:01 PM
to
"Pierre R. Mai" <pm...@acm.org> writes:

> Rolf Wester <wes...@ilt.fhg.de> writes:
>
> > > If it really is just a file containing floats, wouldn't something like
> > > (not tested):

[Actually, from context it looks like it is not a file of floats at
all, but a file of characters with the print-representation of floats]

> > > (loop for zeile = (read in nil 'eof)
> > >       until (eql zeile 'eof)
> > >       collect zeile)
> > >
> > > work better? Here you get a list of floats. Isn't that what you
> > > really wanted to begin with?
>
> Actually, since you've got millions of floats, I think that they
> should be read directly into a specialized vector, otherwise the list
> and boxing overhead is going to be huge, and might be the reason you
> are running into the 18b heap limit. On the x86 using a fairly
> current CMUCL (post 18c, but pre 18d), I can read 2 million random
> single floats (3 per line, making a 20MB file) in around 16-18MB of
> heap space, doubles should be double that amount (not tested), with:
>
> (defun read-floats-slow (stream)
>   (do ((result (make-array 1000 :element-type 'single-float :adjustable t
>                            :fill-pointer 0))
>        (float (read stream nil stream) (read stream nil stream)))
>       ((eq stream float)
>        result)
>     (vector-push-extend float result)))
>
> Reading takes 80s for the file on an AMD K6-2/550.

80 seconds is a _long_ time. One thing that wasn't mentioned by the OP
was why these floats had to be in their print-representation. Now, of
course, there are good reasons why the file might contain character
representations instead of just the bits - it may be required to be
human-readable, or endian-agnostic, or readable by IEEE-754 and
non-IEEE-754 architectures alike (otherwise, the bit patterns would be
identical).

But if none of these reasons apply, then it might be worthwhile to
change the representation of the floats in the file, and read them
in _directly_ with no parsing. This technique would also tend to be
more accurate, as well, since the bit patterns would be preserved.

I haven't tried it, but I can't imagine such a 20Mb read taking more
than 1 or 2 seconds...
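A hedged sketch of that raw-bits approach: ANSI CL has no standard bits-to-float operator, so this leans on an implementation-specific constructor. CMUCL's KERNEL:MAKE-SINGLE-FLOAT is assumed here, and assumed to take a (SIGNED-BYTE 32); other Lisps spell this differently.

```lisp
;; Read N single-floats stored as raw 32-bit words in native byte
;; order, rebuilding the floats with no text parsing at all.
(defun read-raw-single-floats (file n)
  (let ((bits (make-array n :element-type '(unsigned-byte 32)))
        (result (make-array n :element-type 'single-float)))
    (with-open-file (in file :direction :input
                        :element-type '(unsigned-byte 32))
      (read-sequence bits in))
    (dotimes (i n result)
      (let ((w (aref bits i)))
        (setf (aref result i)
              ;; convert the unsigned word to the signed 32-bit
              ;; value MAKE-SINGLE-FLOAT is assumed to expect
              (kernel:make-single-float
               (if (logbitp 31 w) (- w #x100000000) w)))))))
```

As the post notes, this only works when writer and reader agree on byte order and both use IEEE-754 representations.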

--
Duane Rettig Franz Inc. http://www.franz.com/ (www)
1995 University Ave Suite 275 Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253 du...@Franz.COM (internet)

Erik Naggum

Apr 19, 2002, 1:11:49 PM
to
* "Pierre R. Mai" <pm...@acm.org>

| Actually, since you've got millions of floats, I think that they
| should be read directly into a specialized vector, otherwise the list
| and boxing overhead is going to be huge, and might be the reason you
| are running into the 18b heap limit. On the x86 using a fairly
| current CMUCL (post 18c, but pre 18d), I can read 2 million random
| single floats (3 per line, making a 20MB file) in around 16-18MB of
| heap space, doubles should be double that amount (not tested), with:
|
| (defun read-floats-slow (stream)
|   (do ((result (make-array 1000 :element-type 'single-float :adjustable t
|                            :fill-pointer 0))
|        (float (read stream nil stream) (read stream nil stream)))
|       ((eq stream float)
|        result)
|     (vector-push-extend float result)))
|
| Reading takes 80s for the file on an AMD K6-2/550.

If you start off with a vector that is pretty close in size to what you
expect, you will also avoid all the overhead of copying the vector you
have just extended several times. Depending on the implementation, you
may either extend the vector a million times (Allegro CL, with a default
extension of 20) or about 15-20 times (CMUCL doubles the vector's size),
or something entirely different; neither CLISP nor LWL provide reasonably
readable disassembly to understand what they are doing. In any case, if
you have any means of waiting until you know how many floats to allocate
space for, you will be better off. One way to do this is actually to
allocate, fill, and collect some fairly large "chunks" of the total, like
1024-element vectors. At the end of the read phase, you know how much
space you need, and you can write a specialized accessor that uses the
1024-element vectors directly in a two-level vector (note the simple
10-bit shift instead of an expensive division), or you can copy them all
into one vector of known size. This _should_ have a significant effect
on the time it takes to read all the data, but my guess is that it will
all pale in comparison to what you will do with all this stuff in memory.
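The two-level accessor mentioned above might look like this (a sketch, with 1024-element single-float chunks so the index splits into a 10-bit shift and a mask):

```lisp
;; Sketch of the chunked storage idea: the data lives in
;; 1024-element single-float vectors collected while reading;
;; CHUNKED-REF then addresses element I with a shift and a mask
;; instead of an expensive division.
(defconstant +chunk-bits+ 10)
(defconstant +chunk-size+ (ash 1 +chunk-bits+)) ; 1024

(defun chunked-ref (chunks i)
  ;; CHUNKS is a simple vector whose elements are the chunk
  ;; vectors, in order.
  (aref (svref chunks (ash i (- +chunk-bits+)))
        (logand i (1- +chunk-size+))))
```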

///
--
In a fight against something, the fight has value, victory has none.
In a fight for something, the fight is a loss, victory merely relief.

Post with compassion: http://home.chello.no/~xyzzy/kitten.jpg

Bulent Murtezaoglu

Apr 19, 2002, 1:34:10 PM
to
>>>>> "JM" == Joe Marshall <prunes...@attbi.com> writes:
[...]
JM> I did a little research and found that it is quite hard to
JM> efficiently and correctly parse floating numbers. (By
JM> correctly, I mean finding the closest float to the decimal
JM> representation. [...]

Yes, but if you assume that the numbers read have been printed by some
other program that uses the same internal float representation, then this
complication disappears, does it not?

That is, given A.BdEE, printed from an IEEE double in decimal radix with B l
digits long, it seems we should be able to assume that (* (+ Ad0 Bd-l) 1dEE)
will retrieve the original IEEE double. At least I don't see how this would
fail.

cheers,

BM

Duane Rettig

Apr 19, 2002, 2:00:01 PM
to
"Joe Marshall" <prunes...@attbi.com> writes:

> Just for kicks I ran a profile reading a file of
> 30000 double floats with 3 per line. The profile
> showed that the vast majority of time was spent in
> converting the tokens into floating point numbers.
> (This is in AllegroCL, not CMUCL.)
>
> I did a little research and found that it is quite
> hard to efficiently and correctly parse floating
> numbers. (By correctly, I mean finding the closest
> float to the decimal representation. For details
> see http://citeseer.nj.nec.com/william90how.html
> and http://citeseer.nj.nec.com/gay90correctly.html)

If you are on Allegro CL and are running at least 5.0.1 with
the "more exact floats" patch (or a later version of Allegro CL),
you can set either excl::*maintain-float-write-read-consistency*
or excl::*print-floats-exactly* to nil to likely get faster
(though less accurate) operation. Since the text of the patch
log entry is apropos, I am including it below.

> It may be the case that your C library has a faster
> mechanism for parsing floating point numbers than
> your lisp.

It may be that the former is less accurate as well.

I still maintain that the fastest and most accurate way to treat
floats is to _stop_ converting between the bit representations
and the string notations altogether.


Patch Log entry:

Mon Jan 24 17:28:47 PST 2000
Patch: acl5016.dll (libacl5016.* for unix platforms)
This is called the "more exact floats" patch. It was inspired by
Jon L White's paper at LUGM 1999. It increases accuracy
of floating-point reading and writing, as well as the speed. Installed
along with p2a007.001 (above), it also redefines the following constants:
most-positive-double-float most-negative-double-float
short-float-epsilon single-float-epsilon
double-float-epsilon long-float-epsilon
short-float-negative-epsilon single-float-negative-epsilon
double-float-negative-epsilon long-float-negative-epsilon
to be their more correct values. Write/read consistency is
achieved with the default settings (but see the Impact statement).
Print accuracy is guaranteed according to "How to Print Floating
Point Numbers Accurately" (Steele/White, ACM, 1990), and read
accuracy is accurate to the nearest bit in most cases, but never
more than one bit off.
Impact: This is a major patch, and should not be loaded when
getting ready for production. There is a speed impact, some good
and some bad (but only floating point read/write is affected). The
printer conses about half as much as before when floats with small
exponents are printed, and about twice as much when the exponents
are large. The reader conses much, much more than before, since a
brute force method is used to generate floats that maintain perfect
write/read consistency with the printer. This situation can be
mitigated dynamically, by setting the internal variable
excl::*maintain-float-write-read-consistency* to nil. This causes
the reader to be much faster and to cons less, but with a possible
loss of accuracy of up to three bit values in the last bit position
in the double-float. The entire patch can be disabled by setting the
internal variable excl::*print-floats-exactly* to nil.

Siegfried Gonzi

Apr 19, 2002, 3:07:30 PM
to
From: "Duane Rettig" <du...@franz.com>

> >
> > Reading takes 80s for the file on an AMD K6-2/550.
>
> 80 seconds is a _long_ time. One thing that wasn't mentioned by the OP
> was why these floats had to be in their print-representation. Now, of
> course, there are good reasons why the file might contain character
> representations instead of just the bits - it may be required to be
> human-readable, or endian-agnostic, or readable by IEEE-754 and
> non-IEEE-754 architectures alike (otherwise, the bit patterns would be
> identical).


I once wrote a program in Common Lisp in order to retrieve floating point
values from a file with the following scheme:

header
12.23343,0.000223,0.2233,122.23333,....
12.2333,NIL,0.2223,2233.22333,...
...

I wrote at the same time also a Python version. I then did a test on
moderately small files, and the Common Lisp version (on ACL 6.1 and
LispWorks) was about as fast as the Python version. The Python version did
the reading and extracting exclusively with Python functions (my
function converts NIL or N/A to -1.0), except that the values were stored
in a Numeric array.

I got then some improvement hints for my Lisp version (but the performance
boost was only 40 or 50%). The Lisp and Python version were about the same
when it comes to programming style.

Today I tried it again: a 7.5 MB file with 34 columns delimited by ",":

Celeron 1000MHz, 256MB RAM:

ACL 6.1: 90 sec

And to my surprise, Python 2.2: 30 sec

the same 7.5 MB file size but now with only 3 floats per row (hence more lines):

ACL 6.1: 60 sec

Python 2.2: 20 sec

[There is no difference in ACL 6.1 between normal and fastest compiler
settings; at least in this specific case]

The Lisp code is without any declarations and can surely be made faster
(though: experts only). But it was a surprise that Python performed very
well.

I still do not buy the alleged debunking of the so-called Lisp myth: "Common
Lisp is slow". For every case where Common Lisp performs well, there are 100
other situations where it does not perform that well.

I personally do not have any problems with that performance situation (I
would even say people in general do not care; how would you otherwise
explain Python's success?); but people should be more honest or otherwise
one should show the original poster how to improve his performance
bottleneck by a factor of 10 or even 100.

S. Gonzi


Erik Naggum

Apr 19, 2002, 3:16:44 PM
to
* Bulent Murtezaoglu

| Yes, but if you assume that the numbers read have been printed by some
| other program that uses the same internal float representation, then this
| complication disappears, does it not?

No. The problem lies in the rounding errors that can easily be
introduced by the division and multiplication by (powers of) 10. You
have to be _very_ careful with these operations to avoid introducing
errors. If done incorrectly, they may even be relatively large and lead
to propagating problems.

| That is, given A.BdEE, printed from an IEEE double in decimal radix with
| B l digits long, it seems we should be able to assume that (* (+ Ad0 Bd-l)
| 1dEE) will retrieve the original IEEE double. At least I don't see how
| this would fail.

Then try it out. This is easy as long as you have an ample supply of
precision bits, but that is not true for the least significant bits.
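One way to try it out at the REPL (a sketch; PRIN1 output is required by the standard to round-trip, so the interesting case is decimal text produced by some other, fixed-format printer):

```lisp
;; Does X survive a print/read round trip? With PRIN1 a conforming
;; implementation must answer T; with truncated fixed-format output
;; (e.g. ~,6F) the round trip can fail.
(defun round-trip-p (x)
  (= x (read-from-string (prin1-to-string x))))

;; Reading back a short fixed-format rendering of 1/3 as a double
;; loses the low-order bits:
(let ((*read-default-float-format* 'double-float))
  (= (/ 1d0 3d0)
     (read-from-string (format nil "~,6F" (/ 1d0 3d0)))))
```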

Bulent Murtezaoglu

Apr 19, 2002, 3:34:18 PM
to
>>>>> "SG" == Siegfried Gonzi <siegfri...@kfunigraz.ac.at> writes:
[...]
SG> I got then some improvement hints for my Lisp version (but the
SG> performance boost was only 40 or 50%). The Lisp and Python
SG> version were about the same when it comes to programming
SG> style.

If you show us the code we'll have a better idea how exactly you are doing it.

[...]
SG> I still do not buy the alleged debunking of a so called Lisp
SG> myth: "Common Lisp is slow". Every time where Common Lisp
SG> performs well, there are 100 other situations where Common
SG> Lisp performs not that good. [...]

Surely you are making these numbers up. I have a messy late-night hack that
did something similar for Doug's shootout at

http://www.bagley.org/~doug/shootout/bench/moments/

which performs at more than 5X Python's speed. (It is buggy because if you
have actual doubles in the input file, you will lose precision. I'd fix it
if Doug was maintaining the programs but you can do the fix yourself by
making sure the partial results are kept in doubles not fixnums).

Now that "Common Lisp is Slow" FUD is being spread again, I might do this
"read doubles from a file" problem later this weekend and post the results.
Most likely the slowness you guys are seeing is coming from (1) _not_ doing
what atof variants do, that is using a robust and felxible parser like read
(2) generating too much garbage by using read-line.

cheers,

BM

Marco Antoniotti

Apr 19, 2002, 3:37:40 PM
to

"Siegfried Gonzi" <siegfri...@kfunigraz.ac.at> writes:

Well, let's see them.

...

> I still do not buy the alleged debunking of a so called Lisp myth: "Common
> Lisp is slow". Every time where Common Lisp performs well, there are 100
> other situations where Common Lisp performs not that good.

You may be right that CL performs bad in certain situations depending
on the implementation you choose. But let's see which ones they are.

> I personally do not have any problems with that performance situation (I
> would even say people in general do not care; how would you otherwise
> explain Python's success?); but people should be more honest or otherwise
> one should show the original poster how to improve his performance
> bottleneck with a factor of 10 or even 100.

Well, again we have not seen your code.

Having said that, I do agree that there is room for improvement.
However, in CL "improvement" (quotes mandatory) is more difficult than
in single implementation languages (I know of JPython and other stuff;
they essentially refer to the "reference" implementation).

Bulent Murtezaoglu

Apr 19, 2002, 4:03:40 PM
to
>>>>> "EN" == Erik Naggum <er...@naggum.net> writes:
[...]

BM> That is, given A.BdEE, printed from an IEEE double in
BM> decimal radix with B l digits long, it seems we should be
BM> able to assume that (* (+ Ad0 Bd-l) 1dEE) will retrieve the
BM> original IEEE double. At least I don't see how this would
BM> fail.

EN> Then try it out. This is easy as long as you have an ample
EN> supply of precision bits, but that is not true for the least
EN> significant bits.

Yup, I think you are correct. Thank you. It is very easy to
introduce bugs by not working things out; I just discovered and fessed
up to one such bug elsewhere. Now I am not even sure the 53 bits IEEE
double gives you can be used to hold the conversions of A and B above
and not cause you to lose precision in the addition.

I wonder whether people who use atof for this purpose realize how the
obvious solutions lose precision. Harbison && Steele (4th ed.) do not
provide guidance on whether printf->strtod loses precision. Dejanews
does point to some discussions, though.

cheers,

BM


Nils Goesche

Apr 19, 2002, 4:17:12 PM
to
In article <a9pptl$pul$1...@newsreader1.netway.at>, Siegfried Gonzi wrote:
> I once wrote a program in Common Lisp in order to retrieve floating point
> values from a file with the following scheme:
>
> header
> 12.23343,0.000223,0.2233,122.23333,....
> 12.2333,NIL,0.2223,2233.22333,...
> ...
> The Lisp code is without any declarations and can surely be made faster
> (though: experts only). But it was a surprise that Python performed very
> well.
>
> I still do not buy the alleged debunking of a so called Lisp myth: "Common
> Lisp is slow". Every time where Common Lisp performs well, there are 100
> other situations where Common Lisp performs not that good.

I am not sure if this is a meaningful test at all. Whenever I wrote
something that did something interesting with megabytes of floats,
the computations took hours to complete (even in C). Just timing
how long it takes to read them in seems pointless to me.
And the code that does the computations should be really fast, if
written correctly.

Regards,
--
Nils Goesche
"Don't ask for whom the <CTRL-G> tolls."

PGP key ID 0x42B32FC9

Erik Naggum

Apr 19, 2002, 5:32:32 PM
to
* "Siegfried Gonzi"

| I still do not buy the alleged debunking of a so called Lisp myth:
| "Common Lisp is slow". Every time where Common Lisp performs well, there
| are 100 other situations where Common Lisp performs not that good.

You have previously made the point very strongly that you do not wish to
become a competent programmer. It is therefore not surprising if you do
not yet understand that some programming languages and environments are
optimized for writing short programs that are fast and some programming
languages and environments are optimized for writing short programs that
are correct. To make a fast program correct takes effort that requires a
good programmer. To make a correct program fast also takes effort that
requires a good programmer. You are not, by your own admission, a good
programmer, nor do you wish to become one. That means that what you
think or believe about programming languages affects and applies to short
programs that take no effort to write and which accept _all_ the premises
of the language developers. Such programs are rarely useful, but for
some reason, incompetent programmers who do not even know how to _think_
about programming always believe that short programs test the language
the most. What short programs show is what the language has been
optimized for in terms of expressivity, and what one language can do in a
short program is definitely not a good standard to apply to any other.

Example: When a C++ programmer has discovered that "foo" + "bar" yields
"foobar", he will be very disappointed that (concatenate 'string "foo"
"bar") is the Common Lisp equivalent, but that is because he has bought
into the C++ mindset and all the premises underlying its choices. When a
Common Lisp programmer discovers that a matrix represented with lists may
be transposed with (apply #'mapcar #'list <matrix>), he will be highly
disappointed by the effort it takes to both represent and transpose a
matrix in C++.
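For concreteness, the transpose idiom mentioned above, tried at the REPL:

```lisp
;; Transposing a list-of-lists matrix with APPLY + MAPCAR + LIST.
(apply #'mapcar #'list '((1 2 3)
                         (4 5 6)))
;; => ((1 4) (2 5) (3 6))
```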

It takes both education and experience to realize how your understanding
of any programming language has been influenced by the mindset and the
tacit premises acquired by default from your first language, which is why
I so strongly recommend that people approach every programming language
as the first, without all the mental baggage from a previous language.
This is particularly important when one accepts without question the
short programs of one language and only seek to reimplement them in each
new language, instead of trying to discover short programs in each on
their own terms.

It is not Common Lisp that does not perform well, it is the programmer
who does not know how to make a Common Lisp program perform well, but
without skills, a Common Lisp program can perform abysmally, yet produce
correct results. Likewise, if a C++ program is buggy and crashes, it is
the programmer who does not know how to make a C++ program work
correctly, but without skills, a C++ program can crash very quickly.

| I personally do not have any problems with that performance situation (I
| would even say people in general do not care; how would you otherwise
| explain Python's success?); but people should be more honest or otherwise
| one should show the original poster how to improve his performance
| bottleneck with a factor of 10 or even 100.

Perhaps you should leave this issue to (those who would like to become)
real programmers? We have very little use for the kind of "alternative
programming" that is like "alternative medicine" where people of no clue
and little intelligence believe all sorts of things based on random
coincidences and a massive lack of understanding of anything medical.

Joe Marshall

Apr 19, 2002, 3:41:36 PM
to

"Duane Rettig" <du...@franz.com> wrote in message
news:4n0vz6...@beta.franz.com...

> "Joe Marshall" <prunes...@attbi.com> writes:
>
> > It may be the case that your C library has a faster
> > mechanism for parsing floating point numbers than
> > your lisp.
>
> It may be that the former is less accurate as well.

I suspect it is. Write/read consistency isn't a high
priority for C.

> I still maintain that the fastest and most accurate way to treat
> floats is to _stop_ converting between the bit representations
> and the string notations altogether.

Clearly.

For the sake of curiosity, does the Franz patch use the
techniques described in David Gay's paper?

Joe Marshall

Apr 19, 2002, 3:31:01 PM
to

"Bulent Murtezaoglu" <b...@acm.org> wrote in message
news:87g01r3...@nkapi.internal...

> >>>>> "JM" == Joe Marshall <prunes...@attbi.com> writes:
> [...]
> JM> I did a little research and found that it is quite hard to
> JM> efficiently and correctly parse floating numbers. (By
> JM> correctly, I mean finding the closest float to the decimal
> JM> representation. [...]
>
> Yes, but if you assume that the numbers read have been printed by some
> other program that uses the same internal float representation, then this
> complication disappears, does it not?

The surprising answer is no.

>
> That is, given A.BdEE, printed from an IEEE double in decimal radix with B
> l digits long, it seems we should be able to assume that (* (+ Ad0 Bd-l)
> 1dEE) will retrieve the original IEEE double. At least I don't see how this
> would fail.

Take a look at the papers I cited.

Duane Rettig

Apr 19, 2002, 5:00:00 PM
to
"Siegfried Gonzi" <siegfri...@kfunigraz.ac.at> writes:

> From: "Duane Rettig" <du...@franz.com>
>
> > >
> > > Reading takes 80s for the file on an AMD K6-2/550.
> >
> > 80 seconds is a _long_ time. One thing that wasn't mentioned by the OP
> > was why these floats had to be in their print-representation. Now, of
> > course, there are good reasons why the file might contain character
> > representations instead of just the bits - it may be required to be
> > human-readable, or endian-agnostic, or readable by IEEE-754 and
> > non-IEEE-754 architectures alike (otherwise, the bit patterns would be
> > identical).
>
>
> I once wrote a program in Common Lisp in order to retrieve floating point
> values from a file with the following scheme:
>
> header
> 12.23343,0.000223,0.2233,122.23333,....
> 12.2333,NIL,0.2223,2233.22333,...
> ...
>
> I wrote at the same time also a Python version.

[ ... ]

I ignored your numbers because unless you give me code and data files,
the numbers are unreproducible and represent FUD only (even though I am
not at all surprised that our floating point reader is slower than
Python's, while potentially being more accurate as well). Besides,
what I was getting to was precisely what you did not do. My suggestion
(since the OP was interested in speed) was to _not_ represent the floats
in the file as ASCII, but as bits. See below.

> I personally do not have any problems with that performance situation (I
> would even say people in general do not care; how would you otherwise
> explain Python's success?); but people should be more honest or otherwise
> one should show the original poster how to improve his performance
> bottleneck with a factor of 10 or even 100.

OK, Easy enough. Note the following:

- If you inspect a single-float, it has a particular bit pattern:

CL-USER(9): :i 12.23343
single-float = 12.23343 [#x4143bc21] @ #x71513732
CL-USER(10):

This knowledge can be used to check that what is retrieved is
exactly the same float value as what had been written.

- I used our simple-streams implementation to create and read the
file. But it would make just as much sense to use read-sequence
to read the file, if desired. A standard or Gray-stream file
would, however, have to be opened with a single-float element-type.

- To create the small file, I used the fact that write-vector will
write a single-float in the same way it writes a one-element
specialized single-float vector.

- This file was written in Allegro CL 6.1 on an x86 platform running
linux. This explains the different byte-ordering, though all the
bytes are present and consistent for similar-byte-ordered machines.
Users of big-endian machines will see slightly different results.
An endian-swap keyword on both write-vector and read-vector allows
the file to be either read or written in the desired endian order.

Finally, note that although I took what data I could, I did not time
it, because what you offered was nowhere near the 7.5 Mb file you used.
So unless/until you provide a file or algorithm for creating the same
file, the timing comparison will have to wait.

CL-USER(1): (setq xxx (open "xxx" :direction :output :if-exists :supersede))
#<FILE-SIMPLE-STREAM #p"xxx" for output pos 0 @ #x7150dcba>
CL-USER(2): (dolist (f '(12.23343 0.000223 0.2233 122.23333
                         12.2333 -1.0 0.2223 2233.22333))
              (write-vector f xxx))
NIL
CL-USER(3): (close xxx)
T
CL-USER(4): (shell "od -x xxx")
0000000 bc21 4143 d51b 3969 a8c1 3e64 7777 42f4
0000020 bb99 4143 0000 bf80 a29c 3e63 9393 450b
0000040
0
CL-USER(5): (setq xxx (open "xxx"))
#<FILE-SIMPLE-STREAM #p"xxx" for input pos 0 @ #x715115da>
CL-USER(6): (setq vec (make-array (/ (file-length xxx) 4)
                                  :element-type 'single-float))
#(0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0)
CL-USER(7): (read-vector vec xxx)
32
CL-USER(8): vec
#(12.23343 2.23e-4 0.2233 122.23333 12.2333 -1.0 0.2223 2233.2234)
CL-USER(9):


One final comment. The last single-float you provided, 2233.22333,
is not representable as a single-float. Its closest representation
is 2233.2234, and it is bounded on the low side by

CL-USER(15): 2233.22325
2233.2231
CL-USER(16): 2233.22326
2233.2234
CL-USER(17):

and on the high side by

CL-USER(25): 2233.22354
2233.2234
CL-USER(26): 2233.22355
2233.2236
CL-USER(27):

The bit patterns for each of these floats are
#x450b9393 for the closest float (2233.2234)
#x450b9392 for the lower value (2233.2231)
#x450b9394 for the higher value (2233.2236)
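These bit patterns are easy to check mechanically in any language with IEEE single-float support. A sketch in Python (struct rounds to nearest-even, just as the Lisp reader does):

```python
import struct

def single_float_bits(x):
    """Round x to IEEE single precision and return the 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

# 2233.22333 is not representable; it rounds to the same single-float
# as its printed closest representation, 2233.2234
assert single_float_bits(2233.22333) == 0x450b9393
assert single_float_bits(2233.2234) == 0x450b9393
```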

Bulent Murtezaoglu

Apr 19, 2002, 5:41:22 PM
>>>>> "BM" == Bulent Murtezaoglu <b...@acm.org> writes:
[...]
BM> Now that "Common Lisp is Slow" FUD is being spread again, I
BM> might do this "read doubles from a file" problem later this
BM> weekend and post the results. Most likely the slowness you
BM> guys are seeing is coming from (1) _not_ doing what atof
BM> variants do, that is, using a robust and flexible parser like
BM> read (2) generating too much garbage by using read-line.

Hehe, just to talk to myself some: turns out I am wrong, in that I don't
even know how to do things better than atof (or even as well, probably)!
Betcha I can still make things go fast, but that's easy when you don't
produce the correct result. I overlooked many of the issues when I wrote
this (leave it to EN to point that out in 2 minutes). Sorry.

Even googling for print-read consistency proves enlightening, as this has
been talked about many times in c.l.l and elsewhere.

B<thanking God I don't do numerical stuff for a living>M

Erik Naggum

Apr 19, 2002, 5:55:19 PM
* Bulent Murtezaoglu

| Now I am not even sure the 53 bits IEEE double gives you can be used to
| hold the conversions of A and B above and not cause you to lose precision
| in the addition.

This is why most reasonable implementations of floating-point hardware
offer much more internal precision, so that intermediate results do not
introduce lots of precision-related errors. If, as a compiler writer,
you are very good at this stuff, you use _only_ the longer internal forms
for as long as you possibly can, storing only single- or double-float
numbers when you have to. Common Lisp supports a long-float format that
should be mapped to this longer form, which is usually an 80-bit format
instead of the 64-bit double-float format used by default. (_Computing_
with single-float these days is just plain nuts, even though it may make
sense to store values of that type.) It appears reasonable to expect
that double-floats will read and print correctly when the underlying
representation used for the computation is such a long-float, but then it
would be wasteful to believe the same for the (internal) long-floats.
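Erik's aside that computing in single-float is "just plain nuts" is easy to make concrete. The sketch below (Python, simulating single precision by re-rounding through a 32-bit float after every addition, since Python's own floats are doubles) shows how quickly a running sum in the narrow format loses digits:

```python
import struct

def f32(x):
    """Round a Python double to the nearest IEEE single-float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Sum 0.1 ten thousand times, once in simulated single precision
# and once in double precision; the exact answer is 1000.
s_single = 0.0
for _ in range(10000):
    s_single = f32(s_single + f32(0.1))
s_double = sum(0.1 for _ in range(10000))

# The single-precision sum drifts from 1000 by a visible amount;
# the double-precision sum is off only far beyond the sixth decimal.
```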

Siegfried Gonzi

Apr 19, 2002, 4:53:45 PM
From: "Marco Antoniotti" <mar...@cs.nyu.edu>

> Well, let's see them.

Before commencing, some remarks: I have been reprimanded by a Lisp expert (I
believe) for my Common Lisp code. I personally take the stance that there is
no single right way in a language which describes itself as: "Lisp supports
whatever language style you prefer".

First the Python code:


from Numeric import zeros, Float   # the function relies on the Numeric library

def readInArray(file, nHeader=0,
                whatDel=',', NaN='NIL', NaN_NumPy=-1.0, row_ano='nein'):
    """
    Read a delimited file.

    Header 1
    Header 2
    Header n
    gerlitzen,12.233,2.3344,1.22333,23.233,...
    davos,343.455,67.8788,NIL,23.445,...
    ...

    The function retrieves floating point values and NaN becomes -1.0.

    Function:
    readInArray(file,nHeader=0,whatDel=',',NaN='NIL',NaN_NumPy=-1.0,row_ano='nein')
    Parameter: file...file
        nHeader...number of header lines
        whatDel...delimiter, e.g. ','
        NaN...how is NaN represented in the file, e.g. NIL
        NaN_NumPy...what value should NaN get?
        row_ano...should the function preserve the first rows (which are
        possibly strings; see above)?
    Return: an array with the floating point values, or a list with the
        following scheme: list[0] = list of first rows,
        e.g. ['gerlitzen','davos',...], list[1] = array with the floating
        point values. row_ano decides whether the array alone or a list of
        annotations and array is the result; default is the array only.

    Note: The function relies on the Numeric Python library!

    (C) Siegfried Gonzi 2002
    """
    f = open(file, 'r')
    s = f.readlines()
    #
    cols = 1 + s[nHeader].count(whatDel)
    rows = len(s)
    if row_ano == 'nein':
        ergArray = zeros((rows, cols), Float)
    else:
        ergArray = zeros((rows, (cols - 1)), Float)
    ergAno = []
    count_rows = nHeader
    for x in range(nHeader, rows):
        start = 0
        floatString = s[x]
        if not floatString.isspace():
            count_rows = count_rows + 1
            for y in range(cols):
                if y < (cols - 1):
                    indx = floatString[start:].find(whatDel)
                    dummy = floatString[start:(start + indx)]
                    if dummy == NaN:
                        if row_ano == 'nein':
                            ergArray[x, y] = NaN_NumPy
                        else:
                            ergArray[x, (y - 1)] = NaN_NumPy
                    else:
                        if row_ano == 'nein':
                            ergArray[x, y] = float(dummy)
                        else:
                            if y < 1:
                                ergAno.append(dummy)
                            else:
                                ergArray[x, (y - 1)] = float(dummy)
                    start = start + indx + 1
                else:
                    dummy = floatString[start:-1]
                    if dummy == NaN:
                        if row_ano == 'nein':
                            ergArray[x, y] = NaN_NumPy
                        else:
                            ergArray[x, (y - 1)] = NaN_NumPy
                    else:
                        if row_ano == 'nein':
                            ergArray[x, y] = float(dummy)
                        else:
                            ergArray[x, (y - 1)] = float(dummy)
    # Result: either the array alone or a list [annotations, array],
    # depending on row_ano
    if row_ano == 'nein':
        return ergArray[nHeader:count_rows, ]
    else:
        return [ergAno, ergArray[nHeader:count_rows, ]]


Use it for example for a file without any header lines and ',' as delimiter
and NIL as NaN:

s = readInArray('C:/Wissenschaft/dummy.txt')
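For comparison, most of the index bookkeeping above disappears if each line is tokenized with split. A minimal sketch (it returns plain lists instead of a Numeric array and omits the row-annotation option, so it is not a drop-in replacement):

```python
def read_floats(path, n_header=0, delim=',', nan_token='NIL', nan_value=-1.0):
    """Read a delimited file of floats, mapping nan_token to nan_value."""
    rows = []
    with open(path) as f:
        for _ in range(n_header):      # skip header lines
            next(f)
        for line in f:
            line = line.strip()
            if not line:               # ignore blank (e.g. trailing) lines
                continue
            rows.append([nan_value if tok == nan_token else float(tok)
                         for tok in line.split(delim)])
    return rows
```

Each float(tok) call is still a full string-to-float conversion, of course; the win is only in avoiding the manual find/slice loop.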


Next the Lisp code (the original version and the suggested improvement; but
I did not test the improvement; I had only been told that my original
version is plagued by too much "consing").

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Count lines in a file:
;;
;; Function: (count-lines dateiFile :nHeader 0 :break-point "999")
;; Parameter: dateiFile...file
;;            :nHeader...number of header lines
;;            :break-point...is there a breaking point?
;;
;; Return: number of lines
;;
;; (C) Siegfried Gonzi 2002
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun count-lines (datei &key (nHeader 0) (break-point "999"))
  (let ((n 0)
        (next-line t))
    (with-open-file (infile datei)
      (read-header-lines infile nHeader)
      (loop while (not (eq next-line nil)) do
        (setf next-line (read-line infile nil))
        (if (equal next-line break-point)
            (setf next-line nil))
        (setf n (+ n 1))))
    n))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Skip n header lines:
;;
;; Function: (read-header-lines dateiFile (nHeader 0))
;; Parameter: dateiFile...file
;;            nHeader (optional)...number of header lines
;;
;; Output: file pointer right after the last header line
;;
;; (C) Siegfried Gonzi 2001
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun read-header-lines (infile &optional (nHeader 0))
  (loop :for i :from 0 :below nHeader :do
    (read-line infile nil)))


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Number of columns in a data string:
;;
;; 12.23 12.23 23.34 2.34
;;
;; Function: (get-cols string)
;; Parameter: string...string
;;
;; Return: number of columns (e.g. 4 in the above example)
;;
;; (C) Siegfried Gonzi 2001
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun get-cols (string)
  (let ((cols 0)
        (s-el t))
    (with-input-from-string (s string)
      (loop while (not (equal s-el nil)) do
        (setf s-el (read s nil))
        (setf cols (1+ cols))))
    (1- cols)))


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Evaluate the offset of a specific character:
;;
;; e.g.: 1.23,2.233,23.34,45.56
;;
;; returns: (4 10 16)
;;
;; Function: (get-del-list numberString del)
;; Parameter: numberString...string
;;            del...delimiter (e.g. ",")
;;
;; Output: list of positions
;;
;; (C) Siegfried Gonzi 2002
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun get-del-list (numberString del)
  (let* ((n (length numberString))
         (el 0)
         (el-list nil))
    (loop :for i :from 0 :below n :do
      (setf el (aref numberString i))
      (if (string= el del)
          (setf el-list (append el-list (list i)))))
    el-list))


;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Read a file of the following scheme:
;;
;; Header line 1
;; Header line 2
;; Header line n
;; 12.23,23.34,1.234,34.456,2.345,56.67,...
;; 1.23,233.44,3.45,2.34,45.5677,6.778,...
;; ...
;;
;; Function: (read-in-array-del dateiFile :nHeader 0 :break-point "999" :del ",")
;; Parameter: dateiFile...file;
;;            e.g.: C:/Wissenschaft/R/D/....
;;            :nHeader...header lines which should be skipped
;;            :break-point...should the function stop before end of file?
;;                           If so, then set a break point, e.g. "999"
;;            :del...delimiter (e.g. ',')
;; External functions: count-lines, read-header-lines, get-del-list
;;
;; Return: array, which holds the retrieved values
;;
;; (C) Siegfried Gonzi 2002
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
(defun read-in-array-del (datei &key (nHeader 0) (break-point "999") (del ","))
  (let* ((rows (- (count-lines datei :nHeader nHeader :break-point break-point) 1))
         (cols nil)
         (val nil)
         (next-line nil)
         (indx-list ())
         (indx-start nil)
         (indx-end nil)
         (array nil))
    (with-open-file (infile datei)
      (read-header-lines infile nHeader)
      (loop :for i :from 0 :below rows :do
        (setf next-line (read-line infile nil))
        (setf indx-list (get-del-list next-line del))
        (setf indx-list (append indx-list (list (length next-line))))
        (if (< i 1)
            (progn
              (setf cols (length indx-list))
              (setf array (make-array (list rows cols)))))
        (setf indx-start 0)
        (setf indx-end -1)
        (loop :for j :from 0 :below cols :do
          (setf indx-start (+ indx-end 1))
          (setf indx-end (nth j indx-list))
          (with-input-from-string (s next-line :index k
                                     :start indx-start :end indx-end)
            (setf val (read s)))
          (setf (aref array i j) val))))
    array))

Use it (after compiling) as follows (same file as the Python version):

(time (setf g (read-in-array-del "C:/Wissenschaft/dummy.txt")))

Note: The Python timer is always wrong; therefore do the Python timing with
a watch.

The Lisp version is not very safe, because often the last line in a file is
only a blank line. As opposed to my Python version, the Lisp version does not
test for that. I mainly work with Python and therefore did not improve my
Lisp version in this respect.

So, my plan had been to post the improved Lisp version here. I got it some
weeks ago in the form of a private email. I do not want to name the person here.
Sorry, but I can remember that the version improved my weak version by up to
40% or so. The version was cryptic, and hence I would fail if I tried to
reconstruct it from memory. By accident I deleted the email.


S. Gonzi



Siegfried Gonzi

Apr 19, 2002, 5:02:27 PM
"Nils Goesche" <car...@cartan.de> schrieb im Newsbeitrag
news:a9pu08$5gcr4$1...@ID-125440.news.dfncis.de...

> I am not sure if this is a meaningful test at all. Whenever I wrote
> something that did something interesting with megabytes of floats,
> the computations took hours to complete (even in C). Just timing
> how long it takes to read them in seems pointless to me.
> And the code that does the computations should be really fast, if
> written correctly.

I did not test it for the sake of testing. I use my Python function nearly
on a daily basis.

My Python cycle is as follows: Reading the floats into any array ->
calculations in Python -> graphical representation with the *high quality
plotting* library DISLIN (for Python: it is free).

But I have to admit that my calculations, mostly, are not very time
consuming. For example, in the last few days I wrote a wrapper function in
Python with the assistance of SWIG in order to call a Mie calculation
function (written in C). I then use this function to calculate some
relevant parameters for radiative forcing (climatology). The dimensions of my
vectors are not above 1000 elements. Maybe vectors with greater dimensions
will be more of a hindrance in Python.


S. Gonzi


Duane Rettig

Apr 19, 2002, 7:00:01 PM
"Joe Marshall" <prunes...@attbi.com> writes:

> "Duane Rettig" <du...@franz.com> wrote in message
> news:4n0vz6...@beta.franz.com...
> > "Joe Marshall" <prunes...@attbi.com> writes:
> >
> > > It may be the case that your C library has a faster
> > > mechanism for parsing floating point numbers than
> > > your lisp.
> >
> > It may be that the former is less accurate as well.
>
> I suspect it is. Write/read consistency isn't a high
> priority for C.
>
> > I still maintain that the fastest and most accurate way to treat
> > floats is to _stop_ converting between the bit representations
> > and the string notations altogether.
>
> Clearly.
>
> For the sake of curiosity, does the Franz patch use the
> techniques described in David Gay's paper?

No, but scanning through his paper (and noting that we consulted
the same Steele/White paper as ref [12], as well as getting JonL
White's help in the patch), most of it looks straightforward to me,
and probably has the same roots as our original implementation
(which preceded Gay's paper) as well as the patch (which postdates
Gay's paper). The main goal, however, was accuracy, not speed, so
I wouldn't be surprised if the technique we used couldn't be sped
up some more...

Joe Marshall

Apr 19, 2002, 10:07:33 PM

"Bulent Murtezaoglu" <b...@acm.org> wrote in message
news:87it6nz...@nkapi.internal...

> Betcha I can still make things go fast, but that's easy when you don't
> produce the correct result.

Just return 0.0. Really fast and a reasonable approximation to
many real numbers (in the grand scale of things).

Christopher C. Stacy

Apr 20, 2002, 4:34:16 AM
>>>>> On Fri, 19 Apr 2002 21:52:40 +0100, Siegfried Gonzi ("Siegfried") writes:
Siegfried> Before commencing, some remarks: I have been reprimanded
Siegfried> by a Lisp expert (I believe) for my Common Lisp code. I
Siegfried> personally take the stance that there is no one way in a
Siegfried> language which attributes itself as: "Lisp supports
Siegfried> whatever language style you prefer".

Siegfried,

I don't think you have understood what is meant by the idea
that Lisp supports various different styles of programming.

It certainly does not mean that you can write your program in
exactly the same way as you would write it in another language
and expect it to somehow magically perform well. Lisp does
not read your mind, nor is it a 22d century AI program that
can figure out what you wanted, invent an efficient way to
do it, and rewrite your program for you.

On the other hand, it is very easy to write bad Lisp code, and
that's what often results when people don't have a thorough
understanding of the language or try to transliterate from
some other language.

To write high performance Lisp code, you have to understand Lisp
very well, and you also have to be familiar with the particular
system (compiler) that you are using. Lisp makes no guarantees
that anything in particular is efficient -- that is a quality
issue that depends on your vendor. Algorithms that you might
choose to implement have certain characteristics regardless of
what programming language is used, but beyond that you have to
understand what you are asking Lisp to do for you and write your
program accordingly.

When people say that Lisp supports different styles, they are saying
that, for example, Lisp has a way of doing the class-instance-method
kind of programming, but it also supports plain old imperative style.
Another example is that Lisp supports both recursion and iteration.
(The recursion is not guaranteed by the language to be efficient,
although a given vendor might provide that guarantee for you).
Another example is that Lisp allows a functional-style of programming,
but you are not restricted to that. Finally, Lisp allows you to
extend the language (through its powerful macro facility), so that
you can create programming constructs that reflect the way that you
like to think about your program. Lisp has many times been extended
to support declarative (eg. logic-solvers) features.

On the subject of efficiency: Lisp is about expressiveness first,
and about efficiency second. If efficiency is the most important
thing to you, then Lisp is a poor choice of language. Programming
languages that are about efficiency, such as C, force you to make
decisions early in the process of designing and writing your program.
By contrast, Lisp is about making things very malleable and allowing
you to defer those kinds of decisions. This is why in C you must
declare variable data types before you can do anything, but in Lisp
you don't ever have to declare them. Lisp is good for projects that are
complex and that may have changing or poorly understood requirements.

In Lisp, one worries about efficiency when choosing overall algorithms.
For example: whether a linked-list or an array will be used; when it is
acceptable to allocate storage, and how to partition the computations.
But you don't worry about optimizing the program until after it is
already working and after there's obviously a problem. In most languages,
that strategy would be a disaster, because you would have already cast
so many decisions into stone. But in Lisp you never made those decisions,
and things are flexible. You carefully identify the bottlenecks and
then rewrite them, but only when it turns out to be an actual problem.

Another consideration is that Lisp is a good language when you want
the right result more than you want the fast result. For example,
Lisp arithmetic and transcendental functions will give more correct
results than programs written in other languages. The language is
well-defined in that area. Data types will not overflow or underflow
without errors, and data types will not be accidentally confused or data
misrepresented (as in C casting). Programs will also not crash from
bugs in manual memory management or reference invalid pointers
(assuming you're doing pure Lisp rather than also calling out to
external languages).

But there's no reason that Lisp has to be inefficient, and some Lisp
compilers produce code that is more efficient than others. When you
find something lacking in performance, you should consult your vendor
to see if they can improve the situation for you.

>>>>> On Fri, 19 Apr 2002 20:07:30 +0100, Siegfried Gonzi ("Siegfried") writes:
Siegfried> I still do not buy the alleged debunking of a so called Lisp myth: "Common
Siegfried> Lisp is slow". Every time where Common Lisp performs well, there are 100
Siegfried> other situations where Common Lisp performs not that good.

"Common Lisp" makes no such claim -- it could not be a meaningful
thing to say that it is fast or slow, because it is not a computer
program and cannot be measured. "Common Lisp" is the name of a
language, not the name of any particular compiler or interpreter.

The claim is that is that Common Lisp does not require programs written
in that language to necessarily be slow, and that particular areas such
as arithmetic and function calling need not be slow. (And this has been
demonstrated in reality plenty of times, whether you are ignorant of that or not.)
Whether your program turns out to run slow or fast depends on how well
you designed the program, and the quality of the compiler with respect
to your choice of implementation techniques.

Siegfried> I personally do not have any problems with that performance situation (I
Siegfried> would even say people in general do not care; how would you otherwise
Siegfried> explain Python's success?); but people should be more honest or otherwise
Siegfried> one should show the original poster how to improve his performance
Siegfried> bottleneck with a factor of 10 or even 100.

People did offer some advice to the original poster, and are now also
providing programs to test against your Python program. Since we don't
have your data set, nobody has been able to even replicate your claimed
benchmarks or offer their own. We'll have to wait to see what happens.

Meanwhile, if you're going to throw around language calling people "dishonest",
please cite the people that you are accusing of dishonesty and specify their lies.
(Otherwise, you might consider tempering your remarks until you better understand
what you're talking about and are able to support such accusations.)

Siegfried Gonzi

Apr 20, 2002, 8:16:09 AM
----- Original Message -----
From: "Christopher C. Stacy" <cst...@dtpq.com>

> People did offer some advice to the original poster, and are now also
> providing programs to test against your Python program.

My newsreader does not show it. I haven't seen any solution (except that of
Duane R.; but the starting point is a file which is as it is!).

You should not concentrate on the comparison to Python (there is one
situation where Python performed well: so what?); rather, test it against C.
I think this was the motive of the original poster.

> Meanwhile, if you're going to throw around language calling people
"dishonest",
> please cite the people that are you accusing of dishonesty and specify
their lies.
> (Otherwise, you might consider tempering your remarks until you better
understand
> what you're talking about and are able to support such accusations.)

Good try, but I will cut it here. I did post my code for the sake of
completeness. As I wrote: the code was not intended as a benchmark; I really
wrote it in order to use it. I only wrote down my experience and nothing
more. I wrote it also in Clean (even there I had to deal with the same
structure of files) and it is 3 times faster than the Python version (and
the Clean developers even pointed to some improvements, with which the code
should run 10 times as fast as the Python version).

My intention has been to point out that there are situations where Python is
not necessarily slow (isn't there a study where Python is compared to Common
Lisp and where that magic "80% of C speed" occurs again?); and at the same
time there are 1000 situations where Python does not fulfill people's
expectations (see the Python newsgroup). But there are situations where Python
is fast enough (I can read satellite binary data in Python in a second and
plot the earth data measurements with DISLIN very quickly). But I do not go
around and say Python is as fast as C or Fortran; and I do also not say
Common Lisp is slow. I know the arguments from the Forth people: only 10% of
mankind is capable of good programming and most of the people are just
stupid and do not understand how to program well in Forth and they always
try to program C in Forth and so on... (consult your deja news deposit).


I do not see why people are always that upset when it comes to Common Lisp
and C: If I want C then I must use C (it is hard to circumvent this) and if
I want Lisp then I must use Lisp.

I hope you do not see it as an act of impoliteness, but I will close the
discussion here. I am not interested in going into a heated debate; but I
would have no problem quarrelling face to face. Don't get me wrong: for me
Usenet is more or less entertainment.

Regards,
S. Gonzi


Erik Naggum

Apr 20, 2002, 9:12:05 AM
* "Siegfried Gonzi"

| My nesreader does not show it. I haven't seen any solution (except that of
| Duane R.; but the starting point is a file which is as it is!).

What nonsense! If you are so hung up on performance, a simple solution
is actually to read the file with your favorite speed-freak "language"
and have it produce something that you can use either directly or read as
binary floats. On your own hardware, it is fantastically unlikely that
two different programming languages use different floating-point formats
of the same width, so it is safe to use binary floats between programs.
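Erik's point about exchanging raw binary floats between programs is easy to illustrate. A sketch in Python (the file name is made up; any language with an IEEE double type could sit on either end of the pipe):

```python
import struct

values = [12.23343, 0.000223, 0.2233, 122.23333]

# Writer side: dump the doubles as raw little-endian bytes
with open("floats.bin", "wb") as out:
    out.write(struct.pack("<%dd" % len(values), *values))

# Reader side: no parsing, no re-rounding; the bits come back unchanged
with open("floats.bin", "rb") as f:
    raw = f.read()
back = list(struct.unpack("<%dd" % (len(raw) // 8), raw))

assert back == values   # bit-for-bit identical, unlike a text round-trip
```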

| I just only wrote my experience and nothing more.

But then people need to be aware of how much you do not want to become
good at what you do. You have made this point yourself several times,
even thinking that being a professional programmer is a bad thing (for
you, obviously, but phrased so it would appear to be a general insult to
all professional programmers). Your experience is that of an unwilling,
uninterested permanent newbie. You will therefore never venture beyond
what the language offers and has optimized for short programs.

Your grievances are based in ignorance and an unwillingness to understand
that what Common Lisp offers without much effort is the ability to write
short, correct programs, _not_ short, fast programs. From what you
write, it is impossible _not_ to conclude that you have failed to grasp
how to write software to begin with, you are only able to make certain
things "work" to your satisfaction and other things not, where the
criteria appear to be quite randomly adopted as opposed to acquired
through a thoughtful process with a particular goal in mind.

| Don't get me wrong: for me Usenet is more or less entertainment.

Of course it is. People need to keep this in mind, however, when they
decide to waste their time trying to help you with anything at all, or
when they decide to waste their time to correct your incorrect and
ignorant sputtering of unfounded opinions. Of course, there is much more
"entertainment value" in annoying people than in trying to think on your
own, so people should know what you expect to get out of your postings.

Duane Rettig

Apr 20, 2002, 12:00:01 PM

I agreed with your post in general, but

Christopher C. Stacy <cst...@dtpq.com> writes:

> If efficiency is the most important
> thing to you, then Lisp is a poor choice of language.

This sentence, taken as a sound-bite out of context, is completely
wrong. You can have a complement of requirements, with efficiency
being at the top of the list, and Lisp can still be the best choice
for you. What Lisp provides over and above the more efficient
solutions is the ability to also not worry about efficiency at all,
or to come back to it later. This is only a Bad Thing to those who
are not willing to come back to look at efficiency later.

I would rewrite this sound bite to say:

If efficiency is the most important thing to you, and if you're
not willing to do any more than your language requires you to do
in order to get that efficiency, then Lisp is not for you.

Raymond Toy

Apr 22, 2002, 9:23:17 AM
>>>>> "Erik" == Erik Naggum <er...@naggum.net> writes:

Erik> * Bulent Murtezaoglu
Erik> | Now I am not even sure the 53 bits IEEE double gives you can be used to
Erik> | hold the conversions of A and B above and not cause you to lose precision
Erik> | in the addition.

Erik> This is why most reasonable implementations of floating-point hardware
Erik> offer much more internal precision, so that intermediate results do not
Erik> introduce lots of precision-related errors. If, as a compiler writer,
Erik> you are very good at this stuff, you use _only_ the longer internal forms
Erik> for as long as you possibly can, storing only single- or double-float
Erik> numbers when you have to. Common Lisp supports a long-float format that
Erik> should be mapped to this longer form, which is usually an 80-bit format
Erik> instead of the 64-bit double-float format used by default. (_Computing_
Erik> with single-float these days is just plain nuts, even though it may make
Erik> sense to store values of that type.) It appears reasonable to expect
Erik> that double-floats will read and print correctly when the underlying
Erik> representation used for the computation is such a long-float, but then it
Erik> would be wasteful to believe the same for the (internal) long-floats.

But doing this has its own problems. If all operations were done
with, say, 80-bit floats, rounding of the final result when stored in
a 64-bit float can produce different results than if all the
computations were done with 64-bit floats.

This shows up on x86 implementations of Lisp where sometimes
double-float-epsilon isn't.
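The double-rounding effect Ray describes can be reproduced in miniature. The sketch below uses NumPy's float16/float32 pair as a stand-in for the 64-bit/80-bit pair (the mechanism is identical): rounding first to the wider format can land exactly on a tie, which ties-to-even then resolves differently than the original value would have rounded.

```python
import numpy as np

# Just above the midpoint between the float16 values 1.0 and 1 + 2**-10
x = 1.0 + 2.0**-11 + 2.0**-30

direct = np.float16(x)                # one rounding: above the tie, rounds up
via_wide = np.float16(np.float32(x))  # float32 drops the 2**-30 term, landing
                                      # exactly on the tie; ties-to-even then
                                      # rounds DOWN to 1.0

assert float(direct) == 1.0009765625  # 1 + 2**-10
assert float(via_wide) == 1.0         # double rounding changed the answer
```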

Fortunately, most of my numeric work doesn't really care about these
rounding issues. :-)

Ray

Joe Marshall

Apr 22, 2002, 11:29:00 AM

"Raymond Toy" <t...@rtp.ericsson.se> wrote in message
news:4nlmbfk...@rtp.ericsson.se...

>
> But doing this has its own problems. If all operations were done
> with, say, 80-bit floats, rounding of the final result when stored in
> a 64-bit float can produce different results than if all the
> computations were done with 64-bit floats.

Yes, but it won't be less accurate. If all the operations done on
the 64-bit floats are exact, then the 80-bit floats will be exact, too.
Some inexact operations on 64-bit floats will be exact on 80-bit
floats, and this can only improve the precision. Operations that
are inexact on 80-bit floats will be inexact on 64-bit floats, but
the resulting error will be smaller.


There is another benefit to using the extra precision: the regions
of instability in numeric algorithms will be smaller. Naive users
may be far less likely to get bogus answers.

>
> Fortunately, most of my numeric work doesn't really care about these
> rounding issues. :-)

It is an extraordinarily rare case that one would want an answer
that is the same or *worse* than what you could have gotten.

Raymond Toy

Apr 22, 2002, 12:47:56 PM
>>>>> "Joe" == Joe Marshall <prunes...@attbi.com> writes:

Joe> "Raymond Toy" <t...@rtp.ericsson.se> wrote in message
Joe> news:4nlmbfk...@rtp.ericsson.se...


>>
>> But doing this has its own problems. If all operations were done
>> with, say, 80-bit floats, rounding of the final result when stored in
>> a 64-bit float can produce different results than if all the
>> computations were done with 64-bit floats.

Joe> Yes, but it won't be less accurate. If all the operations done on
Joe> the 64-bit floats are exact, then the 80-bit floats will be exact, too.
Joe> Some inexact operations on 64-bit floats will be exact on 80-bit
Joe> floats, and this can only improve the precision. Operations that
Joe> are inexact on 80-bit floats will be inexact on 64-bit floats, but
Joe> the resulting error will be smaller.


Joe> There is another benefit to using the extra precision: the regions
Joe> of instability in numeric algorithms will be smaller. Naive users
Joe> may be far less likely to get bogus answers.

I'm not a numerical analyst and don't pretend to be, but I disagree
with this statement. If the extra precision was always available, I'd
agree. But usually you'll have to convert that 80-bit float to a
64-bit float and that's where you lose.

Somewhere on the web there's an article by Kahan about fused
multiply/add operations that some machines provide. Not exactly the
same, but I think it hints at the subtleties. Instead of rounding
twice, it rounds once. He gives some examples of the use of the
fused mac in finding the roots of a quadratic. In some cases, the
fused mac gives the square root of a negative number instead of the
positive number that would have resulted with IEEE rounding.

Ray

Roland Kaufmann

Apr 22, 2002, 2:35:55 PM
>>>>> "Ray" == Raymond Toy <t...@rtp.ericsson.se> writes:

>>>>> "Joe" == Joe Marshall <prunes...@attbi.com> writes:
Joe> "Raymond Toy" <t...@rtp.ericsson.se> wrote in message
Joe> news:4nlmbfk...@rtp.ericsson.se...
>>>
>>> But doing this has its own problems. If all operations were
>>> done with, say, 80-bit floats, rounding of the final result
>>> when stored in a 64-bit float can produce different results
>>> than if all the computations were done with 64-bit floats.

Joe> Yes, but it won't be less accurate. If all the operations

Joe> done on the 64-bit floats are exact, then the 80-bit floats
Joe> will be exact, too. Some inexact operations on 64-bit floats
Joe> will be exact on 80-bit floats, and this can only improve the
Joe> precision. Operations that are inexact on 80-bit floats will
Joe> be inexact on 64-bit floats, but the resulting error will be
Joe> smaller.

Joe> There is another benefit to using the extra precision: the

Joe> regions of instability in numeric algorithms will be smaller.
Joe> Naive users may be far less likely to get bogus answers.

Ray> I'm not a numerical analyst and don't pretend to be, but I
Ray> disagree with this statement. If the extra precision was
Ray> always available, I'd agree. But usually you'll have to
Ray> convert that 80-bit float to a 64-bit float and that's where
Ray> you lose.

Ray> Somewhere on the web there's an article by Kahan about fused
Ray> multiply/add operations that some machines provide.

I think you mean
http://www.cs.berkeley.edu/~wkahan/MxMulEps.pdf
IIRC, Kahan also argues against Java requiring *all* operations to be
performed in 64-bit IEEE arithmetic.

Ray> Not exactly the same, but I think it hints at the subtleties.
Ray> Instead of rounding twice, it rounds once. He gives some
Ray> examples of the use of the fused mac in finding the roots
Ray> of a quadratic. In some cases, the fused mac gives the
Ray> square root of a negative number instead of the positive
Ray> number that would have happened with IEEE rounding.

HTH
Roland

Joe Marshall

unread,
Apr 22, 2002, 3:15:17 PM4/22/02
to

"Roland Kaufmann" <roland....@space.at> wrote in message
news:tl2lmbf...@space.at...

>
> IIRC, Kahan also argues against Java requiring *all* operations to be
> performed in 64-bit IEEE arithmetic.
>

Actually, Kahan argues the opposite. He suggests that *all* operations
in Java be performed in the most precise format feasible (considering
hardware constraints).

Kahan argues that since Java does not conform to IEEE 754 (it omits
exception flags and directed rounding) it is dangerously inadequate
for floating point.

On page 30 of the paper
http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf
Kahan explicitly recommends (in boldface type, no less)
``Use Double Precision. When the naive program A(x) is run in
arithmetic twice as precise as the data X and the desired result,
it cannot be harmed by roundoff.''

On page 43, Kahan recommends that except for certain unusual situations
such as gargantuan arrays, all floating point arithmetic should be
performed at the widest finite precision that is not too slow or too
narrow.

On page 53, Kahan states that `Java gets us into trouble because it
rounds all subexpressions involving exclusively float operands to
float precision.' Whereas `old fashioned K&R C avoided [this] by
rounding everything by default to double unless an explicit cast
specified otherwise'.

Joe Marshall

unread,
Apr 22, 2002, 2:48:19 PM4/22/02
to

"Raymond Toy" <t...@rtp.ericsson.se> wrote in message
news:4n8z7fk...@rtp.ericsson.se...

>
> Joe> There is another benefit to using the extra precision: the regions
> Joe> of instability in numeric algorithms will be smaller. Naive users
> Joe> may be far less likely to get bogus answers.
>
> I'm not a numerical analyst and don't pretend to be, but I disagree
> with this statement. If the extra precision was always available, I'd
> agree. But usually you'll have to convert that 80-bit float to a
> 64-bit float and that's where you lose.
>
> Somewhere on the web there's an article by Kahan about fused
> multiply/add operations that some machines provide. Not exactly the
> same, but I think it hints at the subtleties.

The article to which you refer,
``Matlab's Loss is Nobody's Gain''
is available at
http://www.cs.berkeley.edu/~wkahan/MxMulEps.pdf

Matlab is a numerical computation package that is quite popular.
Matlab versions prior to 5.0 used the full precision available
in the hardware (64-bit mantissa) and rounded to double precision
when storing the results in memory. As of version 5 on Windows,
Matlab changed to using double-precision (53-bit mantissa) in
the hardware.

Kahan argues that this is a mistake and the extra precision should
be turned back on. To quote Kahan, ``Why not get different results
that are better results whenever the hardware affords the opportunity?''

This paper also talks about the `Fused Multiply Accumulate' instruction
(FMAC) which computes
(fp-round (+ s (* r q)))
rather than
(fp-round (+ s (fp-round (* r q))))

(assume fp-round is a function that maps the exact result into the nearest
result that can be represented in a floating point number).

Kahan notes that FMAC in general produces more accurate answers, but that
there are subtle problems with using it indiscriminately. In particular,
when evaluating an expression like this: (- (* u v) (* r q)), do you
want to round the (* u v) or the (* r q) prior to using the FMAC
instruction? It almost never matters, but if u = r and v = q, FMAC can
produce a non-zero answer whereas rounding both produces a zero.

However, this does not argue against using as much precision as possible.
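The (- (* u v) (* r q)) case above can be reproduced exactly with rational arithmetic. The sketch below is mine, not from the thread: it models IEEE round-to-nearest-even at 53 significand bits in software, and shows that with u = r and v = q, rounding both products gives zero while the once-rounded FMAC-style evaluation leaves a small negative residue:

```python
from fractions import Fraction

def fp_round(x, p=53):
    """Round exact rational x to the nearest p-bit-significand value,
    ties to even -- a software model of IEEE round-to-nearest."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Find e with 2^e <= x < 2^(e+1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    scaled = x / Fraction(2) ** (e - p + 1)  # significand scaled to p integer bits
    n = scaled.numerator // scaled.denominator
    r = scaled - n
    if r > Fraction(1, 2) or (r == Fraction(1, 2) and n % 2 == 1):
        n += 1
    return sign * n * Fraction(2) ** (e - p + 1)

# u = v = r = q = 1 + 2^-52, the double just above 1.0; its square is inexact.
u = 1 + Fraction(1, 2 ** 52)
exact = u * u               # 1 + 2^-51 + 2^-104: needs 105 significand bits
rounded = fp_round(exact)   # rounds down to 1 + 2^-51

# Rounding both products before subtracting gives the expected zero:
print(fp_round(rounded - rounded))   # 0

# A fused multiply-accumulate rounds only once, round(round(u*v) - r*q),
# and yields a small *negative* residue (-2^-104) instead:
print(fp_round(rounded - exact))
```

This is the sign surprise behind the square-root-of-a-negative-number anecdote: the once-rounded difference of two equal products comes out below zero.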

Raymond Toy

unread,
Apr 23, 2002, 10:11:23 AM4/23/02
to
>>>>> "Joe" == Joe Marshall <prunes...@attbi.com> writes:

Joe> "Raymond Toy" <t...@rtp.ericsson.se> wrote in message
Joe> news:4n8z7fk...@rtp.ericsson.se...


>>
Joe> There is another benefit to using the extra precision: the
Joe> regions of instability in numeric algorithms will be smaller.
Joe> Naive users may be far less likely to get bogus answers.
>>
>> I'm not a numerical analyst and don't pretend to be, but I disagree
>> with this statement. If the extra precision was always available, I'd
>> agree. But usually you'll have to convert that 80-bit float to a
>> 64-bit float and that's where you lose.
>>
>> Somewhere on the web there's an article by Kahan about fused
>> multiply/add operations that some machines provide. Not exactly the
>> same, but I think it hints at the subtleties.

Joe> The article to which you refer,
Joe> ``Matlab's Loss is Nobody's Gain''
Joe> is available at
Joe> http://www.cs.berkeley.edu/~wkahan/MxMulEps.pdf

This is new to me. I haven't looked at Kahan papers in several years.

Joe> Kahan argues that this is a mistake and the extra precision should
Joe> be turned back on. To quote Kahan, ``Why not get different results
Joe> that are better results whenever the hardware affords the opportunity?''

I have no problem with using as much precision as possible and I
always use doubles when possible.

I do think it is a small problem in lisp where

(= (1+ double-float-epsilon) 1)

may or may not be T depending on whether there was a store between the
computation and the comparison. And if you change double-float-epsilon
to be some other number, there will be cases where a number smaller than
double-float-epsilon, when added to 1, isn't equal to 1.

Ray

Erik Naggum

unread,
Apr 23, 2002, 1:36:33 PM4/23/02
to
* Raymond Toy

| I do think it is a small problem in lisp where
|
| (= (1+ double-float-epsilon) 1)
|
| may or may not be T depending on whether there was a store between the
| comparison or not.

Hm. Neither Allegro CL, CMUCL, nor CLISP manages to distinguish 1d0 from
(1+ double-float-epsilon), but with an identical epsilon, LispWorks does.

(scale-float double-float-epsilon (float-precision double-float-epsilon))
should in theory be identical to (1+ double-float-epsilon), but curiously,
(1- (scale-float ...)) yields a value twice as big as the epsilon. This
is really weird.

Duane Rettig

unread,
Apr 23, 2002, 3:00:01 PM4/23/02
to
Erik Naggum <er...@naggum.net> writes:

> * Raymond Toy
> | I do think it is a small problem in lisp where
> |
> | (= (1+ double-float-epsilon) 1)
> |
> | may or may not be T depending on whether there was a store between the
> | comparison or not.
>
> Hm. Neither Allegro CL, CMUCL, nor CLISP manages to distinguish 1d0 from
> (1+ double-float-epsilon), but with an identical epsilon, LispWorks does.
>
> (scale-float double-float-epsilon (float-precision double-float-epsilon))
> should in theory be identical to (1+ double-float-epsilon), but curiously,
> (1- (scale-float ...)) yields a value twice as big as the epsilon. This
> is really weird.

I assume you are trying this on an Intel processor. You might want to try
it on a processor which implements true IEEE floats. I tried the
following form (derived from the ANSI spec):

(let ((epsilon double-float-epsilon))
  (not (= (float 1 epsilon) (+ (float 1 epsilon) epsilon))))

on a Sparc, HP, SGI, and Powerpc, and they all returned true, as required.

I think that this is a case where too much precision gets you incorrect
accuracy. I'm fully prepared to take this as a "bug" in our software, where
we don't "dumb down" enough for the architecture, but I do place the blame
on the Intel architecture itself for not following IEEE-754 fp specs.
The workaround for such a bug is to define a new, larger double-float-epsilon
for x86 architectures only.

Historically, the epsilons have been a problem for us. We were called
on their values being too large (i.e. not being "the smallest value for
which ...") and in the same 5.0.1 patch in which we gave more exact float
printing and reading, we also changed the float epsilons:

5.0.1:

USER(1): (inspect double-float-epsilon)
double-float = 2.220446049250313d-16 [#x3cb00000 00000000] @ #x400cfca
[1i] USER(2): (inspect single-float-epsilon)
single-float = 1.1920929e-7 [#x34000000] @ #x400e192
[2i] USER(3):

6.x:

CL-USER(1): (inspect double-float-epsilon)
double-float = 1.1102230246251568d-16 [#x3ca00000 00000001] @ #x4025fc2
[1i] CL-USER(2): (inspect single-float-epsilon)
single-float = 5.960465e-8 [#x33800001] @ #x4024862
[2i] CL-USER(3):

Unfortunately, though the single-float-epsilon is good for x86,
the double-float-epsilon is not right for x86 because of the 80-bit
internal representation, which messes up the rounding.

Joe Marshall

unread,
Apr 23, 2002, 2:52:06 PM4/23/02
to

"Erik Naggum" <er...@naggum.net> wrote in message
news:32285721...@naggum.net...

> * Raymond Toy
> | I do think it is a small problem in lisp where
> |
> | (= (1+ double-float-epsilon) 1)
> |
> | may or may not be T depending on whether there was a store between the
> | comparison or not.
>
> Hm. Neither Allegro CL, CMUCL, nor CLISP manages to distinguish 1d0 from
> (1+ double-float-epsilon), but with an identical epsilon, LispWorks does.

Allegro CL 6.1 on Windows with the latest patches seems to have no problems.
(= 1.0d0 (+ 1.0d0 double-float-epsilon)) gives T

The next smaller float, (double-float (/ 4503599627370497 (expt 2 105))),
gives NIL


>
> (scale-float double-float-epsilon (float-precision double-float-epsilon))
> should in theory be identical to (1+ double-float-epsilon), but curiously,
> (1- (scale-float ...)) yields a value twice as big as the epsilon. This
> is really weird.

Did you use double-float-negative-epsilon for the subtractive case?
Since 1 is a power of 2 (expt 2 0), the float scale changes there, so the
epsilons are asymmetric.

Duane Rettig

unread,
Apr 23, 2002, 4:00:01 PM4/23/02
to
"Joe Marshall" <prunes...@attbi.com> writes:

> "Erik Naggum" <er...@naggum.net> wrote in message
> news:32285721...@naggum.net...
> > * Raymond Toy
> > | I do think it is a small problem in lisp where
> > |
> > | (= (1+ double-float-epsilon) 1)
> > |
> > | may or may not be T depending on whether there was a store between the
> > | comparison or not.
> >
> > Hm. Neither Allegro CL, CMUCL, nor CLISP manages to distinguish 1d0 from
> > (1+ double-float-epsilon), but with an identical epsilon, LispWorks does.
>
> Allegro CL 6.1 on Windows with the latest patches seems to have no problems.
> (= 1.0d0 (+ 1.0d0 double-float-epsilon)) gives T

That it returns T is a problem. Did you forget the NOT in your call,
as per spec?

Erik Naggum

unread,
Apr 23, 2002, 4:07:33 PM4/23/02
to
* "Joe Marshall"

| Allegro CL 6.1 on Windows with the latest patches seems to have no problems.
| (= 1.0d0 (+ 1.0d0 double-float-epsilon)) gives T

That _is_ the problem. It should return false.

| Did you use double-float-negative-epsilon for the subtractive case?

Look, I subtract 1 from the equivalent of (1+ double-float-epsilon).

Joe Marshall

unread,
Apr 23, 2002, 4:30:40 PM4/23/02
to
Apologies to both Duane and Erik, I cut and pasted the
wrong thing. It appears that double-float-epsilon is too
small on Allegro CL 6.1 for Windows.


(defun check-epsilon (putative-epsilon)
  (not (= (float 1 putative-epsilon)
          (+ (float 1 putative-epsilon) putative-epsilon))))

(check-epsilon double-float-epsilon) => NIL

You can compute double-float-epsilon as follows:

(defun compute-double-float-epsilon ()
  (- 1.0d0 (* (- (/ 4.0d0 3.0d0) 1.0d0) 3.0d0)))

(check-epsilon (compute-double-float-epsilon)) => T

(integer-decode-float double-float-epsilon)
4503599627370497
-105
1

(integer-decode-float (compute-double-float-epsilon))
4503599627370496
-104
1

The formula for computing double-float-epsilon is from Kahan.
It works as follows:
4/3 = 1.0101010101...
The rounded quotient differs from the exact answer by 1/3 ULP
(unit in the last place stored).
Subtracting 1 from the rounded quotient is exact.
Multiplying this result by 3 is exact, so the result of the
multiplication differs from 1 by 1 ULP.
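Kahan's trick can be checked directly on any hardware that rounds each operation once to IEEE double (e.g. SSE2; an x87 with 80-bit intermediates may behave differently, as this thread shows). A sketch of mine, not from the post — note that the trick yields exactly one ulp of 1.0, i.e. 2^-52, which passes the Common Lisp epsilon test but, as the follow-ups in this thread show, is not the smallest value that does:

```python
import sys

# Kahan's trick: 4/3 rounds down with an error of 1/3 ulp; subtracting 1
# and multiplying by 3 are then exact, leaving exactly one ulp of 1.0.
computed = 1.0 - (4.0 / 3.0 - 1.0) * 3.0

print(computed)                            # 2.220446049250313e-16
print(computed == 2.0 ** -52)              # True
print(computed == sys.float_info.epsilon)  # True: Python's "epsilon" is one ulp of 1.0
```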

"Duane Rettig" <du...@franz.com> wrote in message

news:4d6wqt...@beta.franz.com...

Duane Rettig

unread,
Apr 23, 2002, 7:00:04 PM4/23/02
to
"Joe Marshall" <prunes...@attbi.com> writes:

> Apologies to both Duane and Erik, I cut and pasted the
> wrong thing. It appears that double-float-epsilon is too
> small on Allegro CL 6.1 for windows.

Yes, this is true. It is x86 specific, so the same applies for Linux
and FreeBSD.

> (defun check-epsilon (putative-epsilon)
>   (not (= (float 1 putative-epsilon)
>           (+ (float 1 putative-epsilon) putative-epsilon))))
>
> (check-epsilon double-float-epsilon) => NIL
>
> You can compute double-float-epsilon as follows:
>
> (defun compute-double-float-epsilon ()
>   (- 1.0d0 (* (- (/ 4.0d0 3.0d0) 1.0d0) 3.0d0)))
>
> (check-epsilon (compute-double-float-epsilon)) => T
>
> (integer-decode-float double-float-epsilon)
> 4503599627370497
> -105
> 1
>
> (integer-decode-float (compute-double-float-epsilon))
> 4503599627370496
> -104
> 1

This may be _an_ epsilon, but it is not the CL epsilon - it is
not the _smallest_ value for which the test holds. To show this,
I only have to find a smaller float for which the test is true:

CL-USER(1): 2.1510571102112408d-16
2.1510571102112408d-16
CL-USER(2): (integer-decode-float *)
8725724278030336
-105
1
CL-USER(3): (not (= (float 1 **) (+ (float 1 **) **)))
T
CL-USER(4):

> The formula for computing double-float-epsilon is from Kahan.
> It works as follows:
> 4/3 = 1.0101010101...
> The rounded quotient differs from the exact answer by 1/3 ULP
> (unit in the last place stored).
> Subtracting 1 from the rounded quotient is exact.
> Multiplying this result by 3 is exact, so the result of the
> multiplication differs from 1 by 1 ULP.

[I started to write up why Kahan's algorithm might be different,
but then I realized what the problem really was - your
implementation of the algorithm counts on the epsilon calculation
taking place in the 80-bit environment. But all of the operations
in your calculations are being rounded and boxed up into 64-bit
double-floats. So the actual epsilon is smaller than what you
calculated, because the correct calculations carry more precision.
To get the real values, one would have to create a function compiled
for high speed and pass the three values in as variables. Unfortunately,
when I tried this, I must have made a mistake, because I got a
negative number. Oh, well.]

I suspect that the correct epsilon for x86 is only a couple of bits
different than the current one, to counteract any too-eager rounding by
the 80-bit hardware.

Joe Marshall

unread,
Apr 23, 2002, 9:41:10 PM4/23/02
to

"Duane Rettig" <du...@franz.com> wrote in message
news:47kmyt...@beta.franz.com...

>
> This may be _an_ epsilon, but it is not the CL epsilon - it is
> not the _smallest_ value for which the test holds. To show this,
> I only have to find a smaller float for which the test is true:
>
> CL-USER(1): 2.1510571102112408d-16
> 2.1510571102112408d-16
> CL-USER(2): (integer-decode-float *)
> 8725724278030336
> -105
> 1
> CL-USER(3): (not (= (float 1 **) (+ (float 1 **) **)))
> T
> CL-USER(4):

You are right of course.

I wrote a program that searches for the epsilon on my machine.
I get the answer

1.1107651257113995d-16

(integer-decode-float *)
4505798650626049
-105
1

(setq foo 1.1107651257113995d-16)

(not (= (float 1 foo) (+ (float 1 foo) foo)))
T

The next float down is:
(* 4505798650626048 (expt 2 -105))

(setq bar (double-float (* 4505798650626048 (expt 2 -105))))
1.1107651257113993d-16

(not (= (float 1 bar) (+ (float 1 bar) bar)))
NIL

So unless I've blown it again (today's typo-braino factor is rather high),
double-float-epsilon on my machine is
1.1107651257113995d-16

whereas the binding of double-float-epsilon is
1.1102230246251568d-16

> > The formula for computing double-float-epsilon is from Kahan.
> > It works as follows:
> > 4/3 = 1.0101010101...
> > The rounded quotient differs from the exact answer by 1/3 ULP
> > (unit in the last place stored).
> > Subtracting 1 from the rounded quotient is exact.
> > Multiplying this result by 3 is exact, so the result of the
> > multiplication differs from 1 by 1 ULP.
>
> [I started to write up why Kahan's algorithm might be different,
> but then I realized what the problem really was - your
> implementation of the algorithm counts on the epsilon calculation
> taking place in the 80-bit environment.

The problem is actually this: The above computes a difference of
1 in the last place stored. However, you don't need a whole digit
difference in the last place stored, you only need one half to get
it rounded up to one. So the computed result ought to be twice
the value of double-float-epsilon.

Unfortunately, it isn't! (This is interesting.)

So let me check my work again, in case I'm being dumb.
(setq computed-epsilon 1.1107651257113995d-16)
(setq computed-epsilon-x 1.1107651257113993d-16)

(not (= (float 1 computed-epsilon)
        (+ (float 1 computed-epsilon) computed-epsilon)))
T

(not (= (float 1 computed-epsilon-x)
        (+ (float 1 computed-epsilon-x) computed-epsilon-x)))
NIL

Ok, so far so good. Let's look at the binary value of computed-epsilon.
(format t "~b" (integer-decode-float computed-epsilon))
10000000000100000000000000000000000000000000000000001

This is just bizarre! Mathematically, when we add 1 to
computed epsilon, we are performing this operation:

1 + 4505798650626049/40564819207303340847894502572032

Now in order to add these, they need a common denominator,
so converting we get

40564819207303340847894502572032 + 4505798650626049
---------------------------------------------------
40564819207303340847894502572032

This would be the exact answer if we could represent it.
However, we can't (the numerator has too many digits).
In binary, the numerator is:
10000000000000000000000000000000000000000000000000000
10000000000100000000000000000000000000000000000000001

but we can only keep 52 digits beyond the most-significant bit.
The 53rd bit is a 1, so we can't tell if we need to round up
or down (1 is half-way) unless we look at more bits. Since we
round to even, we ought to round down if *all* the remaining
bits are zero, but round up if *any* of them are 1.

Ignoring for just a moment that 1 bit just a few digits down
from the 53rd bit, we actually have what we'd expect. The
very last bit is 1, so we round up giving us a rounded
numerator of
10000000000000000000000000000000000000000000000000001

If that very last bit were a zero, we'd round down giving
us a numerator of
10000000000000000000000000000000000000000000000000000

which would give us a result of 1.0

We'd expect that any precision float epsilon would have
a mantissa of 10000....0001 (with the appropriate number
of zeroes) because the next smaller mantissa ought to be
rounded down.

But what's with that bit we were ignoring?! Apparently
the rounding algorithm is ignoring it too!

> But all of the operations
> in your calculations are being rounded and boxed up into 64-bit
> double-floats. So the actual epsilon is smaller than what you
> calculated, because the correct calculations carry more precision.

It's pretty clear that in this case I'm getting answers *less*
precise rather than more. Comparing Franz Allegro to other
floating point operations in various other software on my machine,
I'm seeing only Franz giving me this weird result.

I think there may be a bug in how Franz is handling double floats.

Erik Naggum

unread,
Apr 23, 2002, 10:23:19 PM4/23/02
to
* "Joe Marshall"

| (setq bar (double-float (* 4505798650626048 (expt 2 -105))))
| 1.1107651257113993d-16

The canonical way to get back to a floating point number from the
integer-decode-float values is a lot simpler and less computationally
intensive: (scale-float (float <mantissa> 1d0) <exponent>). In all
likelihood, it is implemented with a simple hardware-supported conversion
from large integer mantissa to large floating point value and adjusting
the exponent rather than building a rational number out of two bignums on
my way to the double-float.

Interesting analysis. I doubt that the above has any influence, so I
have not worked out whether it had.

Joe Marshall

unread,
Apr 23, 2002, 10:31:58 PM4/23/02
to

"Erik Naggum" <er...@naggum.net> wrote in message
news:32286037...@naggum.net...

> * "Joe Marshall"
> | (setq bar (double-float (* 4505798650626048 (expt 2 -105))))
> | 1.1107651257113993d-16
>
> The canonical way to get back to a floating point number from the
> integer-decode-float values is a lot simpler and less computationally
> intensive: (scale-float (float <mantissa> 1d0) <exponent>). In all
> likelihood, it is implemented with a simple hardware-supported conversion
> from large integer mantissa to large floating point value and adjusting
> the exponent rather than building a rational number out of two bignums on
> my way to the double-float.

Thanks for the tip.

Duane Rettig

unread,
Apr 24, 2002, 12:00:01 AM4/24/02
to
"Joe Marshall" <prunes...@attbi.com> writes:

> > But all of the operations
> > in your calculations are being rounded and boxed up into 64-bit
> > double-floats. So the actual epsilon is smaller than what you
> > calculated, because the correct calculations carry more precision.
>
> It's pretty clear that in this case I'm getting answers *less*
> precise rather than more.

I would argue that the "more precision" is yielding "less accuracy".
It is just like your example earlier where you mention Kahan's argument
that a formula (- (* u v) (* r q)) which should be zero will turn out not
to be zero (i.e., inaccurate) if one of the terms is carried out to more
precision than the other.

> Comparing Franz Allegro to other
> floating point operations in various other software on my machine,
> I'm seeing only Franz giving me this weird result.

> I think there may be a bug in how Franz is handling double floats.

I can understand this, and it is likely that the reason for such
inaccuracies is due to the mixing of in-register (i.e. 80-bit)
operations with memory (64-bit) operations. As I said, I bear
the responsibility for such bugs, but I place the blame squarely on
the shoulders of Intel designers for not following the IEEE 754 specs.
But it seems like they have also seen the same problem, because
their new SSE and SSE2 units are 32 and 64 bits, instead of the
80-bit internal representations of the fp unit.

Lieven Marchand

unread,
Apr 24, 2002, 2:05:23 AM4/24/02
to
Duane Rettig <du...@franz.com> writes:

> I can understand this, and it is likely that the reason for such
> inaccuracies is due to the mixing of in-register (i.e. 80-bit)
> operations with memory (64-bit) operations. As I said, I bear
> the responsibility for such bugs, but I place the blame squarely on
> the shoulders of Intel designers for not following the IEEE 754 specs.
> But it seems like they have also seen the same problem, because
> their new SSE and SSE2 units are 32 and 64 bits, instead of the
> 80-bit internal representations of the fp unit.

I don't know how feasible this is, but one way to keep numerical
analysts happy would be to have an option to save the full 80-bit
internal representations whenever fp registers have to be spilled into
memory.

--
Honest praise, this stony part insisted, was what the bunglers of the world
heaped on the heads of the barely competent. -- Stephen King

Raymond Toy

unread,
Apr 24, 2002, 8:52:11 AM4/24/02
to
>>>>> "Lieven" == Lieven Marchand <m...@wyrd.be> writes:

Lieven> Duane Rettig <du...@franz.com> writes:
>> I can understand this, and it is likely that the reason for such
>> inaccuracies is due to the mixing of in-register (i.e. 80-bit)
>> operations with memory (64-bit) operations. As I said, I bear
>> the responsibility for such bugs, but I place the blame squarely on
>> the shoulders of Intel designers for not following the IEEE 754 specs.
>> But it seems like they have also seen the same problem, because
>> their new SSE and SSE2 units are 32 and 64 bits, instead of the
>> 80-bit internal representations of the fp unit.

Lieven> I don't know how feasible this is, but one way to keep numerical
Lieven> analysts happy would be to have an option to save the full 80-bit
Lieven> internal representations whenever fp registers have to be spilled into
Lieven> memory.

CMUCL used to have an 80-bit long-float.

Another option, implemented in CMUCL, is to allow the user to set the
precision bits in the FP control register. I think this tells the
hardware to perform rounding at the desired point. So if you said
round at 53 bits (double-float), all basic ops are rounded to
double-float precision, even if the numbers are 80-bit floats.

Ray

Joe Marshall

unread,
Apr 24, 2002, 2:43:06 PM4/24/02
to
Apologies to those of you with narrow email buffers, but if I'm
going to be typing 80-bit floats, I have little choice. I have
attempted to persuade my email client to wrap at 110 chars, but
who knows what kind of treatment my email will get at the hands
of various and sundry SMTP mail agents? (Do they call it email
because they treat your letter no better than the post office
would?)

I've seen a number of messages that suggest that extra precision
carried by my floating point hardware may be to blame. For
example:

"Raymond Toy" <t...@rtp.ericsson.se> wrote in message

news:4nlmbfk...@rtp.ericsson.se...


>
> If all operations were done with, say, 80-bit floats, rounding
> of the final result when stored in a 64-bit float can produce
> different results than if all the computations were done with
> 64-bit floats.

* Raymond Toy


| I do think it is a small problem in lisp where
|
| (= (1+ double-float-epsilon) 1)
|
| may or may not be T depending on whether there was a store between the
| comparison or not.

"Duane Rettig" <du...@franz.com> wrote


>
> I think that this is a case where too much precision gets you incorrect
> accuracy. I'm fully prepared to take this as a "bug" in our software, where
> we don't "dumb down" enough for the architecture, but I do place the blame
> on the Intel architecture itself for not following IEEE-754 fp specs.

"Duane Rettig" <du...@franz.com> wrote in message news:4ofg9z...@beta.franz.com...
>
> It is likely that the reason for such inaccuracies is due to the
> mixing of in-register (i.e. 80-bit) operations with memory
> (64-bit) operations.

What's puzzling to me is that Kahan recommends almost always using the widest
floating point precision available (provided it's fast enough), and he has
nothing bad to say about the Intel floating point architecture. In fact,
in ``The Baleful Effect of Computer Benchmarks upon Applied Mathematics,
Physics and Chemistry'' available at:
http://www.cs.berkeley.edu/~wkahan/ieee754status/baleful.ps

he includes the Intel 386, 486, Pentium and P6 among the conforming
implementations of IEEE 754.

My understanding is that the Intel chips perform all calculations in
IEEE Double Extended format, and that the result may be rounded to
IEEE Double format if stored in memory. Kahan defends this practice:

`The idea of an Extended format has been amply vindicated by its
use in Hewlett-Packard's financial calculators, which all perform
all arithmetic and financial functions to three more sig. decimals
than they store or display.'

Kahan cautions against the `stopped clock paradox' where `inferior
computers might get exactly correct results when superior ones get
merely excellent approximations'.

"Duane Rettig" <du...@franz.com> wrote in message news:4ofg9z...@beta.franz.com...


> But it seems like they have also seen the same problem, because
> their new SSE and SSE2 units are 32 and 64 bits, instead of the
> 80-bit internal representations of the fp unit.

Kahan says ``Without an Extended format some ostensibly straightforward
double computations are prone to malfunction unless carried out in devious
ways known only to experts; and matrix computations upon vast arrays of
double data degrade too rapidly as increasing dimensions engender
worsened roundoff.'' Narrowing the internal precision of the floating
point unit is never going to make things better. You can emulate the
narrowing on a higher-precision fp unit by storing to memory between
operations (or using a mode bit if available). You cannot easily
emulate higher precision on a narrow fp unit.

Now back to double-float-epsilon....
Double-float-epsilon as defined by the hyperspec is `the smallest
positive [double] float ..., such that the following expression is
true when evaluated:


(not (= (float 1 epsilon) (+ (float 1 epsilon) epsilon)))

The hyperspec does not require any floating point to conform to IEEE
specifications, but given any floating point specification, there
is a corresponding `float-epsilon' for it. In particular,
double-float-epsilon for IEEE 754 Double format is:

10000000000000000000000000000000000000000000000000001
times 2^-105

IEEE 754 Single epsilon is
100000000000000000000001
times 2^-47

and Intel's 80-bit internal floats (conforming to IEEE 754
double-extended) epsilon is:
1000000000000000000000000000000000000000000000000000000000000001
times 2^-127
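The double-format value can be verified numerically on any machine whose Python arithmetic rounds once to IEEE double (a sketch of mine, not from the post; math.nextafter needs Python 3.9+):

```python
import math

# CL's double-float-epsilon for IEEE double: significand 2^52 + 1, scale 2^-105.
eps = (2 ** 52 + 1) * 2.0 ** -105    # both factors and the product are exact

print(eps)                           # 1.1102230246251568e-16
print(1.0 + eps != 1.0)              # True: epsilon passes the CL test
print(1.0 + math.nextafter(eps, 0.0) == 1.0)   # True: the next double down
                                               # (exactly 2^-53) fails it,
                                               # a tie rounded back to even
```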

Now what happens if we add 1.0d0 and double-float-epsilon?
There are 8 cases corresponding to whether the operands are
in double format (memory) or extended format (registers),
and whether the answer is left in a register or saved to memory.
When the float is brought in to the floating point unit it
is extended by appending 11 zeroes to the mantissa and adjusting
the exponent appropriately. (I'll be ignoring the exponents)
So double-float-epsilon in a register looks like this:
1000000000000000000000000000000000000000000000000000100000000000

When we add this to 1.0, the correct sum is
100000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000100000000000

which, when rounded to fit in the registers is:
1000000000000000000000000000000000000000000000000000010000000000

(so we lose that bit out there at position 105)
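This two-step rounding can also be checked mechanically with exact rational arithmetic. A sketch (mine, not from the post) that models round-to-nearest-even at 64 significand bits (x87 extended) and 53 bits (IEEE double):

```python
from fractions import Fraction

def fp_round(x, p):
    """Round exact rational x in [1, 2) to the nearest value with a
    p-bit significand, ties to even."""
    scaled = x * 2 ** (p - 1)
    n = scaled.numerator // scaled.denominator
    r = scaled - n
    if r > Fraction(1, 2) or (r == Fraction(1, 2) and n % 2 == 1):
        n += 1
    return Fraction(n, 2 ** (p - 1))

# double-float-epsilon for IEEE double: (2^52 + 1) * 2^-105
eps = Fraction(2 ** 52 + 1, 2 ** 105)
x = 1 + eps                     # the exact sum 1.0 + epsilon

# Rounded directly to double: fraction bit 53 is 1 and the sticky bit
# at position 105 is 1, so the sum rounds *up*, away from 1.0.
print(fp_round(x, 53) == 1 + Fraction(1, 2 ** 52))    # True

# Rounded first to extended: position 105 lies beyond the 64-bit round
# position, the round bit there is 0, so the sticky information is lost:
extended = fp_round(x, 64)
print(extended == 1 + Fraction(1, 2 ** 53))           # True

# Storing that to memory rounds again: now an exact halfway case, and
# ties-to-even sends it back down to 1.0.
print(fp_round(extended, 53) == 1)                    # True
```

So the register-then-memory path yields 1.0 where a single rounding to double would not, which is exactly why the x86's epsilon test misbehaves.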

> "Joe Marshall" <prunes...@attbi.com> writes:
>
> > > But all of the operations
> > > in your calculations are being rounded and boxed up into 64-bit
> > > double-floats. So the actual epsilon is smaller than what you
> > > calculated, because the correct calculations carry more precision.
> >
> > It's pretty clear that in this case I'm getting answers *less*
> > precise rather than more.
>
> I would argue that the "more precision" is yielding "less accuracy".
> It it just like your example earlier where you mention Kahan's argument
> that a formula (- (* u v) (* r q)) which should be zero will turn out not
> to be zero (i.e., inaccurate) if one of the terms is carried out to more
> precision than the other.
>
> > Comparing Franz Allegro to other
> > floating point operations in various other software on my machine,
> > I'm seeing only Franz giving me this weird result.
>
> > I think there may be a bug in how Franz is handling double floats.
>

> I can understand this. As I said, I bear
> the responsibility for such bugs, but I place the blame squarely on
> the shoulders of Intel designers for not following the IEEE 754 specs.
>

Lieven Marchand

unread,
Apr 24, 2002, 5:02:23 PM4/24/02
to
"Joe Marshall" <prunes...@attbi.com> writes:

> What's puzzling to me is that Kahan recommends almost always using the widest
> floating point precision available (provided it's fast enough), and he has
> nothing bad to say about the Intel floating point architecture.

He helped design it, so he isn't a totally unbiased source. I'm not
either because I got trained in numerical analysis by people who were
on the losing side of that debate.

> My understanding is that the Intel chips perform all calculations in
> IEEE Double Extended format, and that the result may be rounded to
> IEEE Double format if stored in memory. Kahan defends this practice:
>
> `The idea of an Extended format has been amply vindicated by its
> use in Hewlett-Packard's financial calculators, which all perform
> all arithmetic and financial functions to three more sig. decimals
> than they store or display.'
>
> Kahan cautions against the `stopped clock paradox' where `inferior
> computers might get exactly correct results when superior ones get
> merely excellent approximations'.

One of the things against it, is that it makes calculations dependent
on the state of the fpu. Consider (+ (foo x) (bar z)) versus (foo x)
with all appropriate declarations to allow unboxed floats. This can
give different results for (foo x) in the two expressions dependent on
the number of fp variables live at that point. If that number exceeds
the available number of registers (or in the case of the x86
monstrosity, places on the stack), the value can be spilled to memory
in shortened format and reloaded later. There are algorithms that run
provably correctly in a conforming IEEE 754 implementation yet fail
under such spilling, because they rely on identical behaviour and
identical precision for every operation.
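This can be made concrete by simulating the two rounding regimes with exact rationals. The sketch below (my own illustration, not from the thread) rounds 1 + double-float-epsilon once straight to 53 bits, versus first to a 64-bit significand (as in an x87 register) and then to 53 bits on the spill to memory:

```python
from fractions import Fraction

def round_p(x, p):
    """Round a positive rational x to p significant bits, ties to even."""
    e = 0
    # Scale the value so its significand is an integer in [2^(p-1), 2^p).
    while x >= 2**p:
        x /= 2
        e += 1
    while x < 2**(p - 1):
        x *= 2
        e -= 1
    n = int(x)              # floor of the scaled significand
    frac = x - n
    if frac > Fraction(1, 2) or (frac == Fraction(1, 2) and n % 2 == 1):
        n += 1              # round up, or break an exact tie toward even
    return Fraction(n) * Fraction(2)**e

# 1 + double-float-epsilon, held exactly:
x = Fraction(1) + Fraction(1, 2**53) + Fraction(1, 2**105)

once  = round_p(x, 53)               # rounded straight to a double
twice = round_p(round_p(x, 64), 53)  # x87 register, then spilled to memory

assert once  == 1 + Fraction(1, 2**52)  # correct: the next double up
assert twice == 1                       # double rounding loses the bit
```

The first 64-bit rounding throws away the bit at position 105, leaving an exact tie that the second rounding resolves back down to 1.0.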

--
Lieven Marchand <m...@wyrd.be>
She says, "Honey, you're a Bastard of great proportion."
He says, "Darling, I plead guilty to that sin."
Cowboy Junkies -- A few simple words

Raymond Toy

Apr 25, 2002, 8:56:12 AM
>>>>> "Joe" == Joe Marshall <prunes...@attbi.com> writes:

[snip nice example of what's happening]
Joe> When we add this to 1.0, the correct sum is
Joe> 100000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000100000000000

Joe> which, when rounded to fit in the registers is:
Joe> 1000000000000000000000000000000000000000000000000000010000000000

Joe> (so we lose that bit out there at position 105)

This just ends here, without even a signature. Was there supposed to
be more?

Ray

Joe Marshall

Apr 25, 2002, 10:10:26 AM

"Raymond Toy" <t...@rtp.ericsson.se> wrote in message news:4nit6ge...@rtp.ericsson.se...


No. It wasn't supposed to go out. I was in the middle of composing
it. I interrupted the connection to my ISP, but apparently that
isn't enough. Correctly composed mail often gets dropped on the
floor, but truncated usenet posts are delivered.

I am gratified to see that someone noticed it was truncated!

The reason I stopped composing it was that I was going back over
what I was doing and re-testing things. An odd thing happened.
I had rebooted between the day before yesterday and yesterday,
and the floating point math changed behavior.

Some experimentation has shown that some of the time it works
with 53-bit precision, but other times it is in a very
odd state (that isn't 53, 80, or any reasonable `mixture').
It may be that the signal handler is not saving and restoring
the floating point state correctly. The output below shows
the weirdness.

CL-USER(1): (+ 1.0d0 double-float-epsilon)
1.0000000000000002d0
CL-USER(2): (loop)
Error: Received signal number 2 (Keyboard interrupt)
[condition type: INTERRUPT-SIGNAL]

Restart actions (select using :continue):
0: continue computation
1: Return to Top Level (an "abort" restart).
2: Abort entirely from this process.
[1c] CL-USER(3): ,cont 1
CL-USER(4): (+ 1.0d0 double-float-epsilon)
1.0d0

If I control-c out of anything and return to top level,
the floating point math no longer works.

ObLisp:
Just for kicks, here is some of the code I was using:

(defun next-float (f)
  (multiple-value-bind (mantissa exponent sign)
      (integer-decode-float f)
    (scale-float (float (+ (* sign mantissa) 1) f) exponent)))

(defun previous-float (f)
  (multiple-value-bind (mantissa exponent sign)
      (integer-decode-float f)
    (scale-float (float (- (* sign mantissa) 1) f) exponent)))

(defun test-epsilon (epsilon)
  (not (= (float 1 epsilon) (+ (float 1 epsilon) epsilon))))

(defun check-epsilon (putative-epsilon)
  (and (test-epsilon putative-epsilon)
       (not (test-epsilon (previous-float putative-epsilon)))))

(defun kahan-epsilon (f)
  "Given an arbitrary float f, return the difference
between 1 and 1 + epsilon in the same precision
format as f."
  (abs (- 1 (* (- (/ (float 4 f) (float 3 f)) 1) 3))))

(defun cl-epsilon (f)
  "Given an arbitrary float f, return the Common Lisp
epsilon for that float format.

Kahan's epsilon is (- 1 (+ 1 eps)), but the cl-epsilon is
the smallest float such that (= 1 (+ 1 eps)) is false.

Thus the cl-epsilon, when added to 1, should round up to
(+ 1 kahan-eps). In other words, something incrementally
bigger than one-half of Kahan's epsilon."
  (next-float (/ (kahan-epsilon f) 2)))
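For readers without a Lisp handy, here is a rough Python equivalent of the above (my own translation, assuming strict IEEE double arithmetic; `math.nextafter`, available from Python 3.9, plays the role of NEXT-FLOAT):

```python
import math

def next_float(f):
    """The adjacent double toward +infinity (cf. NEXT-FLOAT above)."""
    return math.nextafter(f, math.inf)

def kahan_epsilon():
    """Kahan's 4/3 trick: the rounding error of 4/3 surfaces in (4/3 - 1)*3."""
    return abs(1.0 - (4.0 / 3.0 - 1.0) * 3.0)

def cl_epsilon():
    """Smallest double e with 1 + e /= 1: one step past half Kahan's epsilon."""
    return next_float(kahan_epsilon() / 2.0)

assert kahan_epsilon() == 2**-52              # one ulp of 1.0
assert cl_epsilon() == 2**-53 + 2**-105
assert 1.0 + cl_epsilon() != 1.0              # visible when added to 1
assert 1.0 + math.nextafter(cl_epsilon(), 0.0) == 1.0  # one step down: invisible
```

On an x87 left in 80-bit mode, the `1.0 + cl_epsilon() != 1.0` check is exactly the one that fails, which is the anomaly discussed in this thread.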

Raymond Toy

Apr 25, 2002, 11:15:36 AM
>>>>> "Joe" == Joe Marshall <prunes...@attbi.com> writes:

Joe> "Raymond Toy" <t...@rtp.ericsson.se> wrote in message news:4nit6ge...@rtp.ericsson.se...


>> >>>>> "Joe" == Joe Marshall <prunes...@attbi.com> writes:
>>
>> [snip nice example of what's happening]
Joe> When we add this to 1.0, the correct sum is
Joe> 100000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000000000100000000000
>>
Joe> which, when rounded to fit in the registers is:
Joe> 1000000000000000000000000000000000000000000000000000010000000000
>>
>>
>>
Joe> (so we lose that bit out there at position 105)
>>
>> This just ends here, without even a signature. Was there supposed to
>> be more?


Joe> No. It wasn't supposed to go out. I was in the middle of composing
Joe> it. I interrupted the connection to my ISP, but apparently that
Joe> isn't enough. Correctly composed mail often gets dropped on the
Joe> floor, but truncated usenet posts are delivered.

Joe> I am gratified to see that someone noticed it was truncated!

I wanted to see where you were heading, and it just stopped. It was
like a joke without the punchline!

Joe> The reason I stopped composing it was that I was going back over
Joe> what I was doing and re-testing things. An odd thing happened.
Joe> I had rebooted between the day before yesterday and yesterday,
Joe> and the floating point math changed behavior.

Joe> Some experimentation has shown that some of the time it works
Joe> with 53-bits precision, but other times it is in a very
Joe> odd state (that isn't 53, 80, or any reasonable `mixture').
Joe> It may be that the signal handler is not saving and restoring
Joe> the floating point state correctly. The output below shows
Joe> the weirdness.

Joe> CL-USER(1): (+ 1.0d0 double-float-epsilon)
Joe> 1.0000000000000002d0
Joe> CL-USER(2): (loop)
Joe> Error: Received signal number 2 (Keyboard interrupt)
Joe> [condition type: INTERRUPT-SIGNAL]

Yeah, I think CMUCL used to have this problem too. I believe it's
fixed now for the x86/linux version.

Ray

Duane Rettig

Apr 25, 2002, 5:00:00 PM
Raymond Toy <t...@rtp.ericsson.se> writes:

> >>>>> "Lieven" == Lieven Marchand <m...@wyrd.be> writes:
>
> Lieven> Duane Rettig <du...@franz.com> writes:
> >> I can understand this, and it is likely that the reason for such
> >> inaccuracies is due to the mixing of in-register (i.e. 80-bit)
> >> operations with memory (64-bit) operations. As I said, I bear
> >> the responsibility for such bugs, but I place the blame squarely on
> >> the shoulders of Intel designers for not following the IEEE 754 specs.
> >> But it seems like they have also seen the same problem, because
> >> their new SSE and SSE2 units are 32 and 64 bits, instead of the
> >> 80-bit internal representations of the fp unit.
>
> Lieven> I don't know how feasible this is, but one way to keep numerical
> Lieven> analysts happy would be to have an option to save the full 80-bit
> Lieven> internal representations whenever fp registers have to be spilled into
> Lieven> memory.
>
> CMUCL used to have an 80-bit long-float.

I assume that this means that it no longer has one. Do you know
what the reasoning was for removing it?

> Another option, implemented in CMUCL, is to allow the user to set the
> precision bits in the FP control register. I think this tells the
> hardware to perform rounding at the desired point. So if you said
> round at 53 bits (double-float), all basic ops are rounded to
> double-float precision, even if the numbers are 80-bit floats.

I believe that this is the problem for Allegro CL. Trouble is, the
default precision is not predictable, and the Calling Standards call
for the preservation of the FPU Control Word across function calls.
This means that it has to be saved and restored when calling out to
foreign functions, though it can be assumed to be default values when
within Lisp code.

Duane Rettig

Apr 25, 2002, 5:00:01 PM
"Joe Marshall" <prunes...@attbi.com> writes:

> The article to which you refer,
> ``Matlab's Loss is Nobody's Gain''
> is available at
> http://www.cs.berkeley.edu/~wkahan/MxMulEps.pdf

I have now skimmed through the following references:

http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf

http://www.cs.berkeley.edu/~wkahan/MxMulEps.pdf

The first has some dates as late as Nov 2001, but was apparently
originally presented Mar 1, 1998. The second has a date of
Aug 23, 1998. So they both seem to be constructed in the same
time frame.

(With apologies in advance to Dr. Kahan for this reaction):

I was most entertained by the first paper, and somewhat put off by
the apparent religious fervor that was being spewed. It got a little
tiring to be preached to about the evils of the saying: "Arithmetic
should be barely more precise than the data and the desired result".
Repeating the phrase over and over again, for the purposes of
discrediting it, sometimes tends to have the opposite effect. It is
similar to a preacher slamming any particular movie for its evil
nature, and by the end of the sermon the entire congregation wants
to go out and see this terrible movie to judge for themselves ...
I was also made ashamed of my depraved and ignorant nature by the
constant use of the phrase "95% of the folks out there are completely
clueless about floating point." (it makes me feel guilty even though
I know I am not clueless about floating point). Methinks the gentleman
doeth protest too much...

He even hints an attribution of Apple's decline to their
abandonment of the 68k and its superior floating point, comparing
it to the Intel chips which are the only ones left with such superior
fp, and then asks the question "Coincidence?" This question, having been
written in 1998, does not foresee Apple's renewed strength (having come
about without the use of a superior FP chip :-).

===
(a story of Kahan):
I remember in early 1990, when Franz got our first RS/6000 beta several
months before IBM's FCS (first customer ship) of the RS/6000, I had
to have my office door changed to have a perpetual keyed-only lock (it's
there to this day) and nobody (not even Franz employees) was admitted to
even _see_ the machine who had not signed the NDA with IBM. Dr. Kahan
was one of the few people from outside the company who was allowed by
IBM to see the machine (he had helped them to design it). He spent 30
seconds looking at the machine, and I opened up the cage for him to see
the insides, including the chip set, and then he took the rest of his
time with me explaining (in strong terms) what a huge mistake IBM had
made in creating the Fused Multiply/Accumulate instruction, and how
it can lose accuracy in some cases (because it does not do a proper IEEE
rounding between the multiply and add/subtract phases).
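
Kahan's objection there can be made concrete: a fused multiply-add rounds once, where separate IEEE operations round twice, so the two can disagree. A sketch in Python (the operands are my own illustration, and the fused operation is simulated with exact rationals since not every Python exposes one):

```python
from fractions import Fraction

# Doubles chosen so the exact product sits just under 1:
a = 1.0 + 2**-52
b = 1.0 - 2**-52
c = -1.0

# Ordinary double arithmetic: a*b rounds to 1.0 first, then + c gives 0.0.
unfused = (a * b) + c

# A fused multiply-add rounds only once, after the exact a*b + c.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))

assert unfused == 0.0
assert fused == -(2.0**-104)   # the residual the separate operations lost
```

Neither answer is "wrong" in isolation, but code written to expect a proper IEEE rounding between the multiply and the add sees a different result under FMA, which is precisely the complaint.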

===

But the gist of both papers seems at once conflicting and in
agreement. I draw my own conclusions, apart from the very directed
agendas of these papers, which seem to pop out at me. Dr. Kahan seems
to have softened his initial intense dislike for the
Fused Multiply/Accumulate instruction, apparently after eventually
finding a way to use it and still be accurate.

Please bear in mind that I only scanned these documents, and do not
pretend to have the correct interpretations of Dr. Kahan's intentions.
But this is what I got out of it:

- Dr. Kahan seems to recognize the tradeoffs between speed and accuracy,
and states that float operations should be performed with as much
precision as can be done quickly in hardware. This does _not_ necessarily
tell me that he is thus advocating 80-bit floats, though he does indeed
advocate them; most of the time, when he promotes the use of more
precision, it is where (32-bit) single-floats are pitted against
(64-bit) double-floats. Thus, I presume that he does not advocate
80-bit floats on RISC hardware, nor even 64-bit floats on hardware that
only has 32-bit floats (like the old Crays, for instance).

- His general request, as I see it, is for more programmer control over
floating point operations. He does not want the languages to make
decisions about what is "correct", but to give the programmer the
ability to select the parameters that are right for the problem.

- His main gripe against Java was not that they didn't have 80-bit
floats, but that they did not have any way to handle IEEE exceptions.

- He wants languages to allow for all three float types. This
would have a direct application in Common Lisp if we implemented
long-floats as 80-bit floats.

- He wants to be able to programmatically tell the compiler to
turn off the Fused Multiply/Accumulate instruction.

There are several places in which he advocates promotion of precision
within calculations, but he does not at all address the problem of
mixing precisions in different legs of the calculations. Based on his
warning against the Fused Multiply/Accumulate instruction, I suspect
that he would consider it obvious that both legs of any calculation
must be done with similar precisions.
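That suspicion can be checked with the same kind of rational simulation as before (my own example, not from the papers): let u*v and r*q be mathematically equal, round one product to 53 bits on a spill to memory while the other stays in a 64-bit-significand register, and the "zero" difference is no longer zero:

```python
from fractions import Fraction

def round_p(x, p):
    """Round a positive rational x to p significant bits, ties to even."""
    e = 0
    while x >= 2**p:
        x /= 2
        e += 1
    while x < 2**(p - 1):
        x *= 2
        e -= 1
    n = int(x)
    frac = x - n
    if frac > Fraction(1, 2) or (frac == Fraction(1, 2) and n % 2 == 1):
        n += 1
    return Fraction(n) * Fraction(2)**e

u = v = Fraction(1) + Fraction(1, 2**30)   # exactly representable doubles
product = u * v                            # 1 + 2^-29 + 2^-60: needs 61 bits

in_memory   = round_p(product, 53)         # u*v, spilled to a 64-bit double
in_register = round_p(product, 64)         # r*q, still in an x87 register

# Mathematically u*v - r*q is 0, but with mixed precisions it isn't:
assert in_register - in_memory == Fraction(1, 2**60)
```

When both legs are rounded to the same precision, whichever it is, the difference is exactly zero again.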

Duane Rettig

Apr 25, 2002, 5:00:01 PM
"Joe Marshall" <prunes...@attbi.com> writes:

> The reason I stopped composing it was that I was going back over
> what I was doing and re-testing things. An odd thing happened.
> I had rebooted between the day before yesterday and yesterday,
> and the floating point math changed behavior.

Was this on Windows or Linux?

> Some experimentation has shown that some of the time it works
> with 53-bits precision, but other times it is in a very
> odd state (that isn't 53, 80, or any reasonable `mixture').
> It may be that the signal handler is not saving and restoring
> the floating point state correctly. The output below shows
> the weirdness.

The precision bits on the x86 are supposed to be preserved across
function calls, but they apparently aren't (Allegro CL does, but
apparently some libraries do not). It is unlikely that the signal
handlers are performing incorrectly, because the FSAVE and FRSTOR
instructions do save and restore the entire FP Control word along with
the rest of the FP context, which should work correctly unless someone
is aggressively modifying that context. So I have to assume that this
requirement is being ignored by some library routines, and we will have
to protect ourselves from it.

Not that too much precision is a Bad Thing, of course, but when you want
to measure epsilons for particular float sizes, too much precision will
give you unexpected results ...

Raymond Toy

Apr 25, 2002, 5:26:39 PM
>>>>> "Duane" == Duane Rettig <du...@franz.com> writes:

Duane> Raymond Toy <t...@rtp.ericsson.se> writes:
>>
>> CMUCL used to have an 80-bit long-float.

Duane> I assume that this means that it no longer has one. Do you know
Duane> what the reasoning was for removing it?

It's still in the sources. I just think no one tried to use it much
and it didn't get maintained or updated. It was nice to have, though.
Might try to do the cross-compile to get this working again some
day....

Ray

Raymond Toy

Apr 25, 2002, 5:35:36 PM
>>>>> "Duane" == Duane Rettig <du...@franz.com> writes:


Duane> Not that too much precision is a Bad Thing, of course, but when you want
Duane> to measure epsilons for particular float sizes, too much precision will
Duane> give you unexpected results ...

A while ago, as a test of f2cl, I converted the TOMS SPECFUN library
(double precision version). It came with a bunch of tests of the
accuracy of the routines. When I ran them, I got results that would
say such and such test had about 52.x bits of precision. Pretty good
for a double-float algorithm.

But then it also said it lost 10.y bits of precision!

Took me a while to figure how this could be. The library included a
routine to compute at run-time various parameters of the FP
arithmetic, and it obviously computed epsilon to 80-bits because the
compiler (CMUCL) was able to keep those computations in the FP
registers.

Ray

Joe Marshall

Apr 25, 2002, 10:48:41 PM

"Duane Rettig" <du...@franz.com> wrote in message news:47kmve...@beta.franz.com...

>
> (With apologies in advance to Dr. Kahan for this reaction):
>
> I was most entertained by the first paper, and somewhat put off by
> the apparent religious fervor that was being spewed. It got a little
> tiring to be preached to about the evils of the saying: "Arithmetic
> should be barely more precise than the data and the desired result".

A few of the `papers' on Kahan's web site seem to be slide presentations
and, of course, the accompanying commentary is lost. Subtlety isn't
at a premium for a slide presentation.

> I was also made ashamed of my depraved and ignorant nature by the
> constant use of the phrase "95% of the folks out there are completely
> clueless about floating point." (it makes me feel guilty even though
> I know I am not clueless about floating point). Methinks the gentleman
> doeth protest too much...

This is just a special case of `95% of the folks out there are
completely clueless.'

A quick google on the web turned up these comments about floating point:

`Floating-point arithmetic on digital computers is inherently inexact.'

`Most floating-point numbers a computer can represent are just approximations.'

`Every time a floating point operation is performed, rounding occurs.'

`...real and complex numbers cannot be represented exactly on computers...'

>
> Please bear in mind that I only scanned these documents, and do not
> pretend to have the correct interpretations of Dr. Kahan's intentions.
> But this is what I got out of it:
>
> - Dr. Kahan seems to recognize the tradeoffs between speed and accuracy,
> and states that float operations should be performed with as much
> precision as can be done quickly in hardware. This does _not_ necessarily
> tell me that he is thus advocating 80-bit floats, though he does indeed
> advocate them; most of the time he promotes usage of more precision is
> in those times when floats (32-bit) single-floats are pitted against
> doubles (64-bit double-floats). Thus, I presume that he does not advocate
> 80-bit floats on RISC hardware, nor even 64-bit floats on hardware that
> only has 32-bit floats (like the old Crays, for instance).
>
> - His general request, as I see it, is for more programmer control over
> floating point operations. He does not want the languages to make
> decisions about what is "correct", but to give the programmer the
> ability to select the parameters that are right for the problem.

Ultimately, you want the best answer you can get in a reasonable amount
of time without having to spend too much time instructing the computer
how to do the work. In the case of floating point, the ideal situation
would be for the ultimate answer to be the same as if you had computed
using infinite precision and rounded only at the last step. And when
you want to use basic functions, these too ought to be as precise as
possible.
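
That ideal of rounding only at the last step, versus rounding at every intermediate step, shows up even in plain double arithmetic. A standard illustration (my own sketch, not from the thread):

```python
# Ten additions of 0.1, each one rounded to double precision:
naive = 0.0
for _ in range(10):
    naive += 0.1
assert naive != 1.0        # the accumulated rounding error is visible

# One multiply, hence one rounding, lands on 1.0 exactly:
assert 0.1 * 10 == 1.0
```

The stored double 0.1 is already an approximation; the point is only that one rounding at the end and ten roundings along the way yield different answers.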

Unfortunately, it is really hard to do this right over the entire
range of floating point numbers, and someone always finds the screw
cases. One of Kahan's major gripes is it isn't always easy to
fix the problem without the extra control.

>
> - - His main gripe against Java was not that they didn't have 80-bit
> floats, but that they did not have any way to handle IEEE exceptions.
>
> - - He wants languages to allow for all three float types. This
> would have a direct application in Common Lisp if we implemented
> long-floats as 80-bit floats.
>
> - - He wants to be able to programmatically tell the compiler to
> turn off the Fused Multiply/Accumulate instruction.
>

This is pretty much what I got out of it, too. Oh, and perhaps
one other thing: floating point math is not a `black art', you
just have to apply a little careful thought.
