
CMUCL performance in the Shootout


Michael Park

May 23, 2003, 10:10:22 PM
Why did CMUCL do so poorly in the "Shootout" ? It is a few times
slower than C for matrix multiplication. All types were declared. On
my machine, when I test it from REPL, so there are no start-up costs,
Lisp is 4-5 times slower than C. Why?

http://www.bagley.org/~doug/shootout/bench/matrix/

P.S. I hope no one gets over-sensitive because of this question.

Scott A Douglass

May 24, 2003, 2:04:02 AM

--On Friday, May 23, 2003 7:10 PM -0700 Michael Park
<dont_...@whoever.com> wrote:

> Why did CMUCL do so poorly in the "Shootout" ? It is a few times
> slower than C for matrix multiplication. All types were declared. On
> my machine, when I test it from REPL, so there are no start-up costs,
> Lisp is 4-5 times slower than C. Why?
>
> http://www.bagley.org/~doug/shootout/bench/matrix/

Why did C do so poorly in the "Shootout"? It is a few times slower than
CMUCL for regular expression matching. Why?

http://www.bagley.org/~doug/shootout/bench/regexmatch/


I'd hardly call 6th overall "doing so poorly"!

I guess when Doug states in his "Best of Show":

"I've dabbled with Lisp (e.g. CMUCL) on and off for many years, but the
shootout gave me an opportunity to look more at Common Lisp and at Scheme
(Bigloo) too, which I hadn't done before. This family of languages has so
much to offer, it's a shame that they are not more popular."

he simply didn't know what he was talking about.

Quit trying to get a rise out of the group...

Florian Weimer

May 24, 2003, 9:13:25 AM
Scott A Douglass <sd...@andrew.cmu.edu> writes:

> Why did C do so poorly in the "Shootout"? It is a few times slower than
> CMUCL for regular expression matching. Why?

In this case, Shootout does not measure C performance, but PCRE
performance, 8-)

Rolf Wester

May 24, 2003, 9:16:04 AM

I have benchmarked a 300x300 matrix multiplication. The fastest CMUCL
version was only 2 times slower than the C++ version using TNT::Array2D
(Template Numerical Library). The CMUCL program still made array bound
checks while the C++ program did not. So I think CMUCL is quite fast in
this example although not as fast as C++ (a C program using raw C-arrays
may be even faster).

The main advantage of CMUCL for me is the combination of fast
interactive and dynamic development and reasonably fast compiled code.
There is no other language/implementation that is comparable to CMUCL
in that respect.

Rolf Wester

Gareth McCaughan

May 24, 2003, 9:54:33 AM
Michael Park wrote:

> Why did CMUCL do so poorly in the "Shootout" ? It is a few times
> slower than C for matrix multiplication. All types were declared. On
> my machine, when I test it from REPL, so there are no start-up costs,
> Lisp is 4-5 times slower than C. Why?

Looking at the disassembled code from both versions
(which I'll call "C" and "L" with the obvious meanings):

- C has an inner loop 9 instructions long; L has an
inner loop 21 instructions long.

- C keeps the accumulated sum in a register
throughout the calculation; L spills it.

- C keeps the inner loop's counter in a register
throughout the calculation; L spills it.

- C moves one indexing operation out of the inner loop;
L doesn't.

> P.S. I hope no one gets over-sensitive because of this question.

If you want to avoid people getting over-sensitive,
it might be advisable to phrase the question in a
less provocative way. You found *one* test in which
CMUCL does much worse than C; that doesn't constitute
"doing poorly" on its own. (My vague recollection:
there are a couple of tests on which it's substantially
worse than C, and for the rest it's similar in speed.)

--
Gareth McCaughan Gareth.M...@pobox.com
.sig under construc

Scott McKay

May 24, 2003, 10:55:23 AM

"Scott A Douglass" <sd...@andrew.cmu.edu> wrote in message
news:29817535.1053741842@pepermint...

>
> --On Friday, May 23, 2003 7:10 PM -0700 Michael Park
> <dont_...@whoever.com> wrote:
>
> > Why did CMUCL do so poorly in the "Shootout" ? It is a few times
> > slower than C for matrix multiplication. All types were declared. On
> > my machine, when I test it from REPL, so there are no start-up costs,
> > Lisp is 4-5 times slower than C. Why?
> >
> > http://www.bagley.org/~doug/shootout/bench/matrix/
>
> Why did C do so poorly in the "Shootout"? It is a few times slower than
> CMUCL for regular expression matching. Why?
>
> http://www.bagley.org/~doug/shootout/bench/regexmatch/
>
>

Doesn't this graph really show that we should all be using OCaml?

;-)


Nikodemus Siivola

May 24, 2003, 11:27:11 AM
Scott McKay <s...@attbi.com> wrote:

> Doesn't this graph really show that we should all be using OCaml?

Btw -- there is somewhere an interesting variation on the shootout, with
a *bytes* of code metric, using both plain and bzip2'ed
sources. Can't seem to find it right now, though.

IIRC OCaml did a lot worse in those if any significant weight was given to
the source size.

What I'd like to see (or do, if I find the time) would be a version of the
shootout using various CL implementations -- both commercial and free --
plus maybe g++ and Sun javac to give a non-CL baseline. An FP-intensive
test would be interesting as well.

Cheers,

-- Nikodemus

Bruce Hoult

May 24, 2003, 8:55:43 PM
In article <vJLza.251402$pa5.2...@rwcrnsc52.ops.asp.att.net>,
"Scott McKay" <s...@attbi.com> wrote:

> Doesn't this graph really show that we should all be using OCaml?

I've played with OCaml and it certainly seems to be a very attractive
language if you already know what you want before you start, but I never
seem to. The OCaml guys have had some spectacular successes in the ICFP
contest (e.g. the Ray Tracer in '00), but also some big failures (e.g.
last year).

Four weeks to the next one!

-- Bruce
sticking with Dylan, but hoping to see more CL entries

Neil Schemenauer

May 25, 2003, 12:34:04 PM
Bruce Hoult <br...@hoult.org> wrote:
> The OCaml guys have had some spectacular successes in the ICFP
> contest (e.g. the Ray Tracer in '00), but also some big
> failures (e.g. last year).

I think you mean two years ago. Last year first place was taken
by a team using OCaml.

Neil

Bruce Hoult

May 25, 2003, 5:35:59 PM
In article <0g6Aa.13240$IK4....@nwrddc03.gnilink.net>,
Neil Schemenauer <nas-u...@arctrix.com> wrote:

Hmm ... you're correct.

I was thinking of "the OCaml guys" at INRIA (Camls 'R Us), who came in
81st last year.

There were a number of other teams using OCaml who did better. Which is
of course a good reflection on the language.

-- Bruce

David Steuber

May 25, 2003, 6:30:49 PM
I love trolls. My favorite troll is one that has been sautaid (ispell
sucks) in a mushroom, garlic, butter sauce. hmmm mmm good!

Just don't expose them to sunlight. Stone is much harder to eat.

--
(describe 'describe)

Marc Spitzer

May 25, 2003, 6:50:51 PM
David Steuber <david....@verizon.net> writes:

> I love trolls. My favorite troll is one that has been sautaid (ispell
> sucks) in a mushroom, garlic, butter sauce. hmmm mmm good!
>
> Just don't expose them to sunlight. Stone is much harder to eat.

But then you get lawn trolls, get enough and you can start
your own miniature golf course/theme park.

marc

Dan Schmidt

May 25, 2003, 11:18:04 PM
Bruce Hoult <br...@hoult.org> writes:

My team came in 24th out of 168, and the three of us probably had
around 10 man-hours of experience in OCaml before the contest started.
That slowed us down, but there was a lot about OCaml that sped us up
even more. (I do like Lisp too.)

--
http://www.dfan.org

Michael Park

May 26, 2003, 2:22:55 AM
Gareth McCaughan <Gareth.M...@pobox.com> wrote in message news:<slrnbcuucp.59a....@g.local>...

> Michael Park wrote:
>
> > Why did CMUCL do so poorly in the "Shootout" ? It is a few times
> > slower than C for matrix multiplication. All types were declared. On
> > my machine, when I test it from REPL, so there are no start-up costs,
> > Lisp is 4-5 times slower than C. Why?
>
> Looking at the disassembled code from both versions
> (which I'll call "C" and "L" with the obvious meanings):
>
> - C has an inner loop 9 instructions long; L has an
> inner loop 21 instructions long.
>
> - C keeps the accumulated sum in a register
> throughout the calculation; L spills it.
>
> - C keeps the inner loop's counter in a register
> throughout the calculation; L spills it.
>
> - C moves one indexing operation out of the inner loop;
> L doesn't.

1.) Is there any way I can convince CMUCL to perform these particular
optimizations? Or do some kind of inline assembly? "Silly loops" like
these are kind of important in the code I tend to write.

2.) I always kind of envied people who could read assembly. With no
prior knowledge of x86 registers or assembly language, how much
learning would be required (in earth years) to learn this kind of
stuff? Any specific tutorial suggestions? Thanks

P.S. I don't think "bounds checking" necessarily needs to be
compromised. Bounds checks can probably be optimised away, as no one
resizes the matrices while they are being multiplied.

P.P.S. I want Lisp programs to run faster, I must be a troll. Thanks
for noticing.

Michael Park

May 26, 2003, 2:59:40 AM
Bruce Hoult <br...@hoult.org> wrote in message news:<bruce-052C32....@copper.ipg.tsnz.net>...

> In article <vJLza.251402$pa5.2...@rwcrnsc52.ops.asp.att.net>,
> "Scott McKay" <s...@attbi.com> wrote:
>
> > Doesn't this graph really show that we should all be using OCaml?
>
> I've played with OCaml and it certainly seems to be a very attractive
> language if you already know what you want before you start, but I never
> seem to. The OCaml guys have had some spectacular successes in the ICFP
> contest (e.g. the Ray Tracer in '00), but also some big failures (e.g.
> last year).
>
> Four weeks to the next one!

Four weeks or four months?

> sticking with Dylan, but hoping to see more CL entries

How does one submit a CL entry? As code only? As an image? As fasl?
What if their version of CMUCL is different?

Christian Lynbech

May 26, 2003, 4:23:11 AM
>>>>> "Michael" == Michael Park <dont_...@whoever.com> writes:

Michael> How does one submit a CL entry? As code only? As an image? As fasl?
Michael> What if their version of CMUCL is different?

There will be all the info you need on how to submit an entry. My
recollection is that (at least in the past) you get some general
info on the environment in which you run, how the application is
started, and how data is fed to the application, and you basically just
submit a binary that can fit into that framework (I cannot remember
specifics, but I would be surprised if the rules would exclude
something like CMUCL from participating).

Check out links from some of the old contests to learn more.


------------------------+-----------------------------------------------------
Christian Lynbech | christian #\@ defun #\. dk
------------------------+-----------------------------------------------------
Hit the philistines three times over the head with the Elisp reference manual.
- pet...@hal.com (Michael A. Petonic)

Bruce Hoult

May 26, 2003, 6:06:43 AM
In article <ff20888b.03052...@posting.google.com>,
dont_...@whoever.com (Michael Park) wrote:

> Bruce Hoult <br...@hoult.org> wrote in message
> news:<bruce-052C32....@copper.ipg.tsnz.net>...
> > In article <vJLza.251402$pa5.2...@rwcrnsc52.ops.asp.att.net>,
> > "Scott McKay" <s...@attbi.com> wrote:
> >
> > > Doesn't this graph really show that we should all be using OCaml?
> >
> > I've played with OCaml and it certainly seems to be a very attractive
> > language if you already know what you want before you start, but I never
> > seem to. The OCaml guys have had some spectacular successes in the ICFP
> > contest (e.g. the Ray Tracer in '00), but also some big failures (e.g.
> > last year).
> >
> > Four weeks to the next one!
>
> Four weeks or four months?

27-29 June. OK, that's five weeks.


> > sticking with Dylan, but hoping to see more CL entries
>
> How does one submit a CL entry? As code only? As an image? As fasl?
> What if their version of CMUCL is different?

In the past the judge's machine has always been x86 Linux.

If your language allows you to submit a statically-linked binary then
that is the safest option.

They also put quite a number of interpreters and other language
implementations on the machine. If you ask for something -- e.g. the
latest version of CMUCL -- then going on past record they will be happy
to have that pre-installed.

Going from the results table, last year there were three entries using
Common Lisp, so clearly someone knows the best way to do it :-)

-- Bruce

Christophe Rhodes

May 26, 2003, 7:10:35 AM
dont_...@whoever.com (Michael Park) writes:

> Gareth McCaughan <Gareth.M...@pobox.com> wrote in message news:<slrnbcuucp.59a....@g.local>...

>> Looking at the disassembled code from both versions
>> (which I'll call "C" and "L" with the obvious meanings):
>>
>> - C has an inner loop 9 instructions long; L has an
>> inner loop 21 instructions long.
>>
>> - C keeps the accumulated sum in a register
>> throughout the calculation; L spills it.
>>
>> - C keeps the inner loop's counter in a register
>> throughout the calculation; L spills it.
>>
>> - C moves one indexing operation out of the inner loop;
>> L doesn't.
>
> 1.) Is there any way I can convince CMUCL to perform these particular
> optimizations? Or do some kind of inline assembly? "Silly loops" like
> these are kind of important in the code I tend to write.

Yes. Easily, no.

Inline assembly can be obtained in a manner described in the following
message (well, it's for SBCL, but the details have diverged little):
<URL:http://www.geocrawler.com/lists/3/SourceForge/1664/0/6521166/>.

In terms of getting this kind of thing to happen automatically:
CMUCL's register allocator is a little bit lame. Firstly, and most
egregiously, it allocates inwards rather than outwards, so that outer-loop
index variables tend to be allocated registers and inner-loop variables
memory locations. Yes, this is suboptimal; no, I don't know why;
maybe it isn't too hard to fix. If you're interested in delving into
this area, look for TN (standing for "Temporary Name", we think :) in
the code.

Secondly, CMUCL's optimizer is unable to reorder loops, combine loop
variables and the like because, somewhat surprisingly, it does no
structural analysis (as understood these days) at all. No SSA, no
identification of schemata, nothing. I have partially-written code to
begin giving the compiler some smarts in this area, but absent some
copious free time "partially-written" is how it'll stay :-/.
Similarly, it ought to be possible to do some automatic vectorization,
using the SSE units on modern x86s; see
<http://www.geocrawler.com/lists/3/SourceForge/1663/0/6811094/> for
one hack to do it. This may help if you're doing vector arithmetic as
in modern Fortrans, I guess.

> 2.) I always kind of envied people who could read assembly. With no
> prior knowledge of x86 registers or assembly language, how much
> learning would be required (in earth years) to learn this kind of
> stuff? Any specific tutorials suggestions? Thanks

To the level that you need to know, not very long. Maybe a couple of
days? As long as you don't want to build something like a "standalone
executable", assembly language is just another programming language :-)

> P.P.S. I want Lisp programs to run faster, I must be a troll. Thanks
> for noticing.

Um... maybe I shouldn't have bothered replying.

Christophe
--
http://www-jcsu.jesus.cam.ac.uk/~csr21/ +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%") (pprint #36rJesusCollegeCambridge)

Kaz Kylheku

May 26, 2003, 1:55:26 PM
dont_...@whoever.com (Michael Park) wrote in message news:<ff20888b.0305...@posting.google.com>...

> Why did CMUCL do so poorly in the "Shootout" ? It is a few times
> slower than C for matrix multiplication. All types were declared. On
> my machine, when I test it from REPL, so there are no start-up costs,
> Lisp is 4-5 times slower than C. Why?

Who cares? If your C compiler does a better job than your Lisp one,
then write the multiplication routine in C and call it from Lisp! You
don't have to choose one language over the other.

These stupid compiler benchmarks do not lead to any useful conclusion
beyond their numeric results. Should I waste ten years writing a
program in C that can be written in five months in Lisp, because at
the end of that ten years, that program will be able to multiply
matrices faster, the odd time that it needs to? The market for the
program will probably evaporate in a decade!

That Lisp compilers can handle numeric processing at even 50% of the
speed of C compilers already destroys the ``Lisp is slow'' argument.
You can find such discrepancies between the implementations of the
*same* language!

Scott McKay

May 26, 2003, 5:01:07 PM

"Michael Park" <dont_...@whoever.com> wrote in message
news:ff20888b.03052...@posting.google.com...

> Gareth McCaughan <Gareth.M...@pobox.com> wrote in message
news:<slrnbcuucp.59a....@g.local>...

>


> 1.) Is there any way I can convince CMUCL to perform these particular
> optimizations? Or do some kind of inline assembly? "Silly loops" like
> these are kind of important in the code I tend to write.
>
> 2.) I always kind of envied people who could read assembly. With no
> prior knowledge of x86 registers or assembly language, how much
> learning would be required (in earth years) to learn this kind of
> stuff? Any specific tutorials suggestions? Thanks
>
> P.S. I don't think "bounds checking" necessarily needs to be
> compromised. Bounds checks can probably be optimised away, as no one
> resizes the matrices while they are being multiplied.
>

Good type inference can be used to optimize out some
bounds-checking, but the optimizations are not
necessarily thread-safe.

(loop for x across v doing ...) can be optimized

(loop for i from 0 below (length v) doing ...) can be optimized

These are pretty common...
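
For illustration, a minimal sketch of a loop that CMUCL can optimize
this way; the function name and element type are made up, not taken
from the shootout code:

(defun sum-elements (v)
  ;; With the vector's type fully declared and safety lowered, the
  ;; compiler can prove every index is in bounds and elide the checks.
  (declare (type (simple-array double-float (*)) v)
           (optimize (speed 3) (safety 0)))
  (loop for x of-type double-float across v
        sum x of-type double-float))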


Gareth McCaughan

May 26, 2003, 2:51:37 PM
Michael Park wrote:

[I said:]


> > Looking at the disassembled code from both versions
> > (which I'll call "C" and "L" with the obvious meanings):
> >
> > - C has an inner loop 9 instructions long; L has an
> > inner loop 21 instructions long.
> >
> > - C keeps the accumulated sum in a register
> > throughout the calculation; L spills it.
> >
> > - C keeps the inner loop's counter in a register
> > throughout the calculation; L spills it.
> >
> > - C moves one indexing operation out of the inner loop;
> > L doesn't.
>
> 1.) Is there any way I can convince CMUCL to perform these particular
> optimizations? Or do some kind of inline assembly? "Silly loops" like
> these are kind of important in the code I tend to write.

I don't know of any way to make CMU CL compile this sort of
code better, but I'm not an expert on CMU CL's back-end.
It's possible that the difficulty is quite fundamental;
for instance, if some bits of CMU CL assume that certain
registers have particular information in them *at all times*
then there might just not be enough registers left. ia32
is a pretty register-starved architecture.

I think some CMU CL experts read comp.lang.lisp. If you
don't want to bet on whether they'll read this thread,
you could try mailing the CMU CL developers' mailing list,
but if you do that then I strongly recommend that you be
a little more careful in how you word your criticisms. :-)

Being iggerant of CMU CL, I also don't know offhand
whether it can do inline assembly. I do know that it
has a foreign function interface; perhaps you can write
your inner loops in C. Take a look in the CMU CL user's
manual at the chapter called "Alien Objects". It even
has a section on accessing Lisp arrays from C.
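
For flavor, a sketch of what such a declaration looks like with
CMU CL's aliens. The C function "dot" and its signature here are made
up for illustration, and I'm assuming the object file has already been
loaded (e.g. with ext:load-foreign):

;; C side (assumed): double dot(double *a, double *b, int n);
(alien:def-alien-routine "dot" double-float
  (a (* double-float))
  (b (* double-float))
  (n c-call:int))

After this, DOT is an ordinary Lisp function taking alien pointers;
the "Alien Objects" chapter shows how to get such pointers from Lisp
arrays.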

> 2.) I always kind of envied people who could read assembly. With no
> prior knowledge of x86 registers or assembly language, how much
> learning would be required (in earth years) to learn this kind of
> stuff? Any specific tutorials suggestions? Thanks

It depends on what you want to be able to do, but it
shouldn't be too hard to learn enough for drawing the
sorts of conclusions I drew above. (In particular, I
don't really know ia32 assembler, though I do have some
experience at that level with other architectures.)

I don't have any particular texts to recommend; sorry.
Most of the ones I remember learning from are way out
of date, and in fact my learning has mostly been by
doing.

> P.S. I don't think "bounds checking" necessarily needs to be
> compromised. Bounds checks can probably be optimised away, as no one
> resizes the matrices while they are being multiplied.

At (speed 3) (safety 0) they are probably omitted anyway.
There certainly aren't any bounds checks in the inner loop
of the code CMU CL generates.

> P.P.S. I want Lisp programs to run faster, I must be a troll. Thanks
> for noticing.

I don't think you're a troll, but comp.lang.lisp has seen
so many trolls asking "why oh why oh why is lisp so slow?
such-and-such a stupid benchmark runs twice as fast in C",
where it usually turns out that their benchmark has
absolutely nothing to do with reality, that they've done
it wrong anyway, and that they aren't really interested in Lisp
other than for telling us all how bad it is.

Gareth McCaughan

May 26, 2003, 6:20:16 PM
Kaz Kylheku wrote:

> Who cares? If your C compiler does a better job than your Lisp one,
> then write the multiplication routine in C and call it from Lisp! You
> don't have to choose one language over the other.

Very true.

> These stupid compiler benchmarks do not lead to any useful conclusion
> beyond their numeric results. Should I waste ten years writing a
> program in C that can be written in five months in Lisp, because at
> the end of that ten years, that program will be able to multiply
> matrices faster, the odd time that it needs to? The market for the
> program will probably evaporate in a decade!

Not so very true. Firstly, most things don't take 20 times
longer in C than in Lisp. Secondly, some people's programming
is mostly inner-loopy numeric stuff, so "the odd time that
it needs to" may be way off. Thirdly, some programs really
do need every bit of speed they can get; I've written some
where writing them in half the time and getting half the
speed would have been a really bad tradeoff.

That doesn't mean Lisp is a bad choice for these applications;
just that your first point (foreign function calls) is the
one that really matters.

> That Lisp compilers can handle numeric processing at even 50% of the
> speed of C compilers already destroys the ``Lisp is slow'' argument.
> You can find such discrepancies between the implementations of the
> *same* language!

Yep. So now we're onto the "Lisp isn't the absolute fastest
language around" argument. That's a pretty good argument to
be in even if we lose, frankly. :-)

(But for the particular microbenchmark we're talking about
here, it was 25% rather than 50%.)

Michael Park

May 26, 2003, 6:58:28 PM
k...@ashi.footprints.net (Kaz Kylheku) wrote in message news:<cf333042.03052...@posting.google.com>...

> dont_...@whoever.com (Michael Park) wrote in message news:<ff20888b.0305...@posting.google.com>...
> > Why did CMUCL do so poorly in the "Shootout" ? It is a few times
> > slower than C for matrix multiplication. All types were declared. On
> > my machine, when I test it from REPL, so there are no start-up costs,
> > Lisp is 4-5 times slower than C. Why?
>
> Who cares? If your C compiler does a better job than your Lisp one,
> then write the multiplication routine in C and call it from Lisp! You
> don't have to choose one language over the other.

Good point, I guess. Is FFI fast?



> These stupid compiler benchmarks do not lead to any useful conclusion
> beyond their numeric results. Should I waste ten years writing a
> program in C that can be written in five months in Lisp,

Hyperbole? Or was Lisp 24 times more productive than C or C++ in your
personal experience?

Aleksandr Skobelev

May 27, 2003, 3:39:05 AM
dont_...@whoever.com (Michael Park) writes:

> Why did CMUCL do so poorly in the "Shootout" ? It is a few times
> slower than C for matrix multiplication. All types were declared. On
> my machine, when I test it from REPL, so there are no start-up costs,
> Lisp is 4-5 times slower than C. Why?

Just adding '(declaim (start-block main))'/'(declaim (end-block))' around
the code gives about a 25% speedup. So 10,000 iterations with 30x30
matrices take 3.27-3.57 secs for CMUCL vs 1.13 secs for GCC 3.2.3[1], and
10 iterations with 300x300 matrices take 4.86-5.56 secs vs 2.91-3.00.

Footnotes:
[1] gcc-3.2 -O3 -ffast-math -march=athlon-tbird -o matrix matrix.c
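
For anyone unfamiliar with the idiom, a minimal made-up sketch of what
those block declarations look like in a source file (only the block
boundaries matter; the functions are placeholders):

(declaim (start-block main))

(defun square (x)
  (declare (double-float x))
  (* x x))

(defun main (x)
  (declare (double-float x))
  ;; Within the block, this call to SQUARE can compile as a fast local
  ;; call rather than a full call through the function cell.
  (square (+ x 1d0)))

(declaim (end-block))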

Thomas F. Burdick

May 27, 2003, 3:03:11 PM
dont_...@whoever.com (Michael Park) writes:

> k...@ashi.footprints.net (Kaz Kylheku) wrote:
>
> > These stupid compiler benchmarks do not lead to any useful conclusion
> > beyond their numeric results. Should I waste ten years writing a
> > program in C that can be written in five months in Lisp,
>
> Hyperbole? Or was Lisp 24 times more productive than C or C++ in your
> personal experience?

Not hyperbole for someone used to both Lisp and C. In my personal
experience, it's usually an order of magnitude more productive. Once
in a while, two orders.

C++ has its own issues, in that you *can* be a lot more productive
than in C by making heavy use of the STL, but if the program isn't
fast enough, it's notoriously difficult to increase its efficiency
without just massively rewriting it.

--
/|_ .-----------------------.
,' .\ / | No to Imperialist war |
,--' _,' | Wage class war! |
/ / `-----------------------'
( -. |
| ) |
(`-. '--.)
`. )----'

Kaz Kylheku

May 27, 2003, 3:25:05 PM
Scott A Douglass <sd...@andrew.cmu.edu> wrote in message news:<29817535.1053741842@pepermint>...
> --On Friday, May 23, 2003 7:10 PM -0700 Michael Park
> <dont_...@whoever.com> wrote:
>
> > Why did CMUCL do so poorly in the "Shootout" ? It is a few times
> > slower than C for matrix multiplication. All types were declared. On
> > my machine, when I test it from REPL, so there are no start-up costs,
> > Lisp is 4-5 times slower than C. Why?
> >
> > http://www.bagley.org/~doug/shootout/bench/matrix/
>
> Why did C do so poorly in the "Shootout"? It is a few times slower than
> CMUCL for regular expression matching. Why?
>
> http://www.bagley.org/~doug/shootout/bench/regexmatch/

Probably because the C program was using a regular expression library?
The C program uses something called ``pcre'' and the C++ program uses
some library called ``zopyra''. So these are effectively benchmarks
for these libraries compiled with these compilers.

Regular expression libraries for C tend to be poorly optimized,
because the language provides no built-in compiler. In effect, the
writer of a regular expression library is subject to Greenspun's Tenth
Rule: he has to not only write the parser and translator for regular
expressions from scratch, but choose an intermediate representation,
and write the virtual machine that interprets that representation.
I've never seen a regex library for C which dynamically compiled
regexes to native machine code; the reason is that it's probably too
much work to be worth the effort.

A Lisp-based regex compiler could, on the other hand, quite easily
take advantage of dynamic compilation. Just translate the regex to the
source code of a Lisp function, and compile it.

But the CMUCL code doesn't actually use regular expressions. It uses
the META package: the META:WITH-STRING-META macro and explicit calls to
META:MATCH. It's not entirely clear that this should be categorized as
a regular expression benchmark.

It would make more sense to choose some actual regex implementation
for the benchmark, like Edi Weitz's CL-PPCRE:
http://www.weitz.de/cl-ppcre/

From the CL-PPCRE abstract:

``It is fast. If compiled with CMUCL it outperforms Perl's highly
optimized regex engine (written in C) which to my knowledge is faster
than most other regex engines around. If compiled with CLISP it is
still comparable to CLISP's own regex implementation which is also
written in C.''

Edi Weitz

May 27, 2003, 6:58:02 PM
k...@ashi.footprints.net (Kaz Kylheku) writes:

FWIW, I actually did that yesterday out of curiosity. It turned out
that my code (which basically just combined Jochen Schmidt's CMUCL
code with the Perl regex found at the "Shootout" site) was almost
identical in speed on my machine:

Perl 5.8: 27.0 sec
Perl 5.6.1: 19.7 sec
O'Caml 3.04: 19.3 sec
CMUCL 18e with META: 14.5 sec
CMUCL 18e with CL-PPCRE 0.5.4: 12.5 sec

[Average values for three consecutive runs with n = 100000 excluding
one initial run to fill the disk cache.]

BTW, CL-PPCRE does not do any compilation at run-time. It rather
translates the regex into a chain of closures which were compiled when
the library was built.

Edi.

(proclaim '(optimize (speed 3) (safety 0) (space 0) (debug 0) (compilation-speed 0)))
(setf ext:*bytes-consed-between-gcs* 5000000)
(use-package :cl-ppcre)

(defparameter *scanner*
  (load-time-value
   (create-scanner
    "(?: ^ | [^\\d\\(])  # must be preceded by non-digit
     ( \\( )?            # match 1: possible initial left paren
     (\\d\\d\\d)         # match 2: area code is 3 digits
     (?(1) \\) )         # if match 1 then match right paren
     [ ]                 # area code followed by one space
     (\\d\\d\\d)         # match 3: prefix of 3 digits
     [ -]                # separator is either space or dash
     (\\d\\d\\d\\d)      # match 4: last 4 digits
     (?: \\D | $ )       # must be followed by a non-digit"
    :extended-mode t)))

(defun main ()
  (let ((n (parse-integer (or (car (last extensions:*command-line-strings*)) "1")))
        (input (loop for line = (read-line *standard-input* nil 'eof)
                     until (eq line 'eof)
                     collect line)))
    (loop for i of-type fixnum from 1 below n
          do (loop for line of-type simple-base-string in input
                   do (scan *scanner* line)))
    (loop with i of-type fixnum = 0
          for line in input
          do (multiple-value-bind (match-start match-end reg-starts reg-ends)
                 (scan *scanner* line)
               (declare (ignore match-start match-end))
               (when reg-starts
                 (locally (declare (simple-base-string line))
                   (format t "~A: (~A) ~A-~A~%"
                           (incf i)
                           (subseq line (svref reg-starts 1) (svref reg-ends 1))
                           (subseq line (svref reg-starts 2) (svref reg-ends 2))
                           (subseq line (svref reg-starts 3) (svref reg-ends 3)))))))))

Kirk Kandt

May 27, 2003, 8:57:25 PM
You need to be careful about benchmarks. Seldom are they done well. Seldom
are they not biased.

Note that the Java implementation is about nine times slower than the g++
implementation. Note that the Java implementation is about three times
slower than the Lisp implementation. Does this make sense to you? It does
not to me. Something is invalid about the benchmark.

After spending 20 years programming in Lisp, I have spent the last ten
programming in C++ and Java. :( About 5 years ago I wrote an optimization
system that solved non-linear programming problems in C++. I did everything
possible to optimize it. Virtually all computations were double-precision
matrix operations. Later, I ported it to Java 1.2 without doing anything to
tune the compilation or code-generation settings; the settings and
compiler were whatever was standard in JDK 1.2. The Java version was 40%
slower. When doing the quick port to Java, I no longer performed my own
memory management, since I did not write a Java equivalent of a C++ new and
delete operator. (I planned on doing it but never got around to finishing
the port.) I assume that cost me between 20% and 40% of the performance
penalty. Regardless, there was not a 300% difference.

I just spent about 15 seconds looking at the C++ code. Some brief
comments... (1) It uses the naive approach to matrix multiplication, which
is slower than other known approaches for solving the problem. (See Matrix
Computations by Golub and Van Loan, for example.) (2) The m1 and m3 vectors
could be cached instead of being recomputed over and over (slightly less
readable but much more efficient). (3) Is there something unique about 30 x
30 matrices? Would the same results be achieved if the matrices were
different sizes? If not, why not? (4) Integer arrays? That's what I always
use when I solve my intensive mathematical problems. :) Would the same
results have been achieved if the programmer had used items that were
double-precision, which are actually useful for solving real problems? The
Java and C++ implementations are different! Were other Lisp optimization
settings used? The "better" ones are not always better. I haven't hacked
Lisp in a long time but I expect there are some subtle inefficiencies in
the Lisp implementation.

The bottom-line is this benchmark is meaningless, as most published ones
are.

"Michael Park" <dont_...@whoever.com> wrote in message

news:ff20888b.0305...@posting.google.com...


> Why did CMUCL do so poorly in the "Shootout" ? It is a few times
> slower than C for matrix multiplication. All types were declared. On
> my machine, when I test it from REPL, so there are no start-up costs,
> Lisp is 4-5 times slower than C. Why?
>
> http://www.bagley.org/~doug/shootout/bench/matrix/
>

Edi Weitz

May 27, 2003, 9:18:35 PM
"Kirk Kandt" <kka...@dslextreme.com> writes:

> You need to be careful about benchmarks. Seldom are they done
> well. Seldom are they not biased.
>

> [snip]


>
> The bottom-line is this benchmark is meaningless, as most published
> ones are.

At least the author himself doesn't deny this:

<http://www.bagley.org/~doug/shootout/conclusion.shtml>

Adrian Kubala

May 28, 2003, 3:26:19 PM
On Tue, 27 May 2003, Kirk Kandt wrote:
> The java and C++ implementations are different!

You imply that every language should use the same algorithm. Part of the
point in benchmarking languages is to see which make it easier to write
more efficient algorithms with the same expenditure. Not that this
benchmark really does that either... I'd like to see an experiment where
we take experts in each language and give them a maximum of X hours to
write something -- then we change the requirements totally and give them Y
more hours. They can use any libraries or existing code they want (though
the tasks must obviously be chosen carefully).

I guess you can see things like this already in programming
competitions...

Kirk Kandt

May 28, 2003, 4:31:07 PM
I did not mean to imply that the same algorithm should be used in each
language. What I meant to say is that you need apples-to-apples
comparisons. In this case the person should have coded both implementations
in both C++ and Java and then reported the results. Then you'd have a
clearer picture that would allow you to make a better assessment of the
benchmarks.

"Adrian Kubala" <adr...@sixfingeredman.net> wrote in message
news:Pine.LNX.4.44.030528...@gwen.sixfingeredman.net...

Christopher Browne

May 28, 2003, 9:27:24 PM
In an attempt to throw the authorities off his trail, Adrian Kubala <adr...@sixfingeredman.net> transmitted:

If a particular algorithm is optimal for a particular purpose, then it
is likely that people will choose that algorithm whatever the
language. Or you'd _hope_ that you'd be able to consider it.

Apparently, in this case, the matrix algorithms were so different that
it amounts to the different implementations computing different
things, or, at least, that some programs did a Bad Job.

It is quite unfortunate that CMUCL gets "injured" by not optimizing
the code quite as well as some other language implementations,
effectively changing the algorithm to a less efficient one. That's
not what you want to have happen. Fortunately/Hopefully, the
"injury" is one that isn't typical.
--
select 'aa454' || '@' || 'freenet.carleton.ca';
http://www.ntlug.org/~cbbrowne/linuxxian.html
"Those who doubt the importance of a convenient notation should try
writing a LISP interpreter in COBOL or doing long division with Roman
numerals." -- Hal Fulton

ozan s yigit

May 29, 2003, 10:36:02 AM
Kaz Kylheku writes, among other things:

> I've never seen a regex library for C which dynamically compiled
> regexes to native machine code;

actually, ken thompson's original (well known, rarely read) paper on
regular expressions shows how his implementation dynamically compiles
them to native machine code.

cheers... oz
---
[1] Thompson, K. "Regular expression search algorithm,"
Comm. ACM 11:6, 419-422, 1968.
---
logic's hell! - russell

Gavin

Jun 20, 2003, 10:53:29 AM

"Gareth McCaughan" <Gareth.M...@pobox.com> wrote in message
news:slrnbd4ohp.2eb1....@g.local...


Read or skim the cmu-user.pdf that comes with the CMUCL distribution for
type inferencing, compiler settings, design choices, and extensions.

Any Lisp program (e.g. under CMUCL) will run differently (faster or slower)
depending on the actual physical machine it is running on; even with the
same compiler settings and optimizations turned on, the performance and
benchmarks would differ. Most Lisp users are not using the same kind of
Unix or Linux, are they? Nor the same amount of RAM, the same virtual
memory settings, the same number of processes running in the background,
or the same quality and number of registers, etc.

The same Lisp program written in CMUCL or some other implementation will
run much faster on a 2.3 GHz Pentium machine with 1 GB RAM than on a
1.8 GHz Celeron with 256 MB RAM (assuming both users use the same CMUCL
compiler settings). In that case, even a Lisp program running on the
2.3 GHz machine may run faster than a C/C++ program running on the 1.8 GHz
machine. Even the kind of hard disk used may play a part in the speed of
the Lisp program a person writes. Performance measurements are best made
using the same computer, or two or more computers with identical
specifications.

Anyway, CMUCL is one of the fastest free Lisp implementations around. Here
is a URL for CMUCL benchmarks:
http://www.cons.org/cmucl/benchmarks/index.html

Realistic comparisons can only be made if both users use the same computer
to test CMUCL or other Lisp or language implementations.
Thanks, Gareth, for your comments.

Gavin


Rayiner Hashem

Jun 20, 2003, 8:38:29 PM
As a mainly C++ guy, this thread has been interesting to me. I find
Lisp very elegant as a language, but am concerned about performance.
For some of the work I do, a 50% performance differential means that I
have to wait 2 days for results instead of one. Besides, performance
is just another aspect of elegance, no? Anyway, I have a few questions
for those with some experience with various Lisp implementations.

1) Is it worth buying a commercial implementation? Is Allegro CL or
Franz CL significantly faster than CMU CL, particularly for numerics
work?

2) How are Lisp compilers at doing high level optimizations? One of
the things I like about C++ is that C++ compilers have gotten good
enough that they can take high-level, abstract interfaces and make
them perform as fast as hand-written, low-level C code will be. So you
really just have to pay attention to performance when choosing
algorithms, not really when you're just writing the code. Simple,
straight-forward code will easily get you more than halfway to optimal
performance. Do good Lisp compilers behave the same way, or do you
have "uglify" your code significantly to get the kind of performance
Lisp advocates brag about?

Pascal Costanza

Jun 20, 2003, 9:10:09 PM
In article <a3995c0d.03062...@posting.google.com>,
hel...@mindspring.com (Rayiner Hashem) wrote:

> As a mainly C++ guy, this thread has been interesting to me. I find
> Lisp very elegant as a language, but am concerned about performance.
> For some of the work I do, a 50% performance differential means that I
> have to wait 2 days for results instead of one. Besides, performance
> is just another aspect of elegance, no? Anyway, I have a few questions
> for those with some experience with various Lisp implementations.
>

There are other aspects to consider here: for example, what are your
options when you get a bug very late in a running program?

Lisp's interactive nature allows you to correct the bug and let the
program continue from that point without having to restart the whole
program. So, say, if your C++ program has a bug that lets it crash after
20 hours you will have to fix the bug and then start all over again.
Then your performance advantage is already lost. Now you get another bug
after 22 hours - and then you have already reached the third day because
you need to start your program for the third time.

With Lisp you get the same first bug after 40 hours. You correct it and
continue the program. You get the second bug after another 4 hours. You
correct it and then let the program finish its job. Voila, you have
beaten the C++ version. :)

Of course, this is an example made up on the spot. But there are some
people here in c.l.l who have actual experience with similar situations.
The point is: you have to balance the pure and raw runtime efficiency
with your actual productivity. Indeed, you pay a price for Lisp's
flexibility - but you also get something for it...


Pascal

Bruce Hoult

Jun 20, 2003, 9:54:57 PM
In article <costanza-F314F5...@news.netcologne.de>,
Pascal Costanza <cost...@web.de> wrote:

> There are other aspects to consider here: for example, what are your
> options when you get a bug very late in a running program?
>
> Lisp's interactive nature allows you to correct the bug and let the
> program continue from that point without having to restart the whole
> program. So, say, if your C++ program has a bug that lets it crash after
> 20 hours you will have to fix the bug and then start all over again.
> Then your performance advantage is already lost. Now you get another bug
> after 22 hours - and then you have already reached the third day because
> you need to start your program for the third time.
>
> With Lisp you get the same first bug after 40 hours. You correct it and
> continue the program. You get the second bug after another 4 hours. You
> correct it and then let the program finish its job. Voila, you have
> beaten the C++ version. :)

I'm wondering whether this might play a part in next weekend's ICFP
contest.

Unlike in previous years, it has been announced that you won't submit a
program to them to test, but will run your program on your own machine.

Unless they're assuming that no one is on a dial-up connection or behind
a firewall, I suspect this means that what you have to submit is simply
an optimal solution to the problem, not the program to find the
solution. Which further leads me to suspect that the trade-off between
programming time and execution time may be critical.


And of course there is another alternative between the Lisp and C++
scenarios you give above: you can write a C++ program with a designed-in
"checkpointing" feature, so that you have opportunities to dump partial
results on a regular basis and restart later.

-- Bruce

Edi Weitz

Jun 21, 2003, 3:36:33 AM
hel...@mindspring.com (Rayiner Hashem) writes:

> 1) Is it worth buying a commercial implementation? Is Allegro CL or
> Franz CL significantly faster than CMU CL, particularly for numerics
> work?

There are lots of scenarios where it is worth buying one of the
commercial implementations (Franz, Xanalys, Corman, Digitool,
Scieneer, ...) but I'd say that this is most often not about speed. In
many cases CMUCL is indeed the fastest CL around today, for numerics
work it might also be worthwhile to check out CLISP.

> 2) How are Lisp compilers at doing high level optimizations? One of
> the things I like about C++ is that C++ compilers have gotten good
> enough that they can take high-level, abstract interfaces and make
> them perform as fast as hand-written, low-level C code will be. So
> you really just have to pay attention to performance when choosing
> algorithms, not really when you're just writing the code. Simple,
> straight-forward code will easily get you more than halfway to
> optimal performance. Do good Lisp compilers behave the same way, or
> do you have "uglify" your code significantly to get the kind of
> performance Lisp advocates brag about?

Usually you'll proceed like this if you want to squeeze more
performance out of your Lisp program:

1. Use a profiler (almost all implementations have one, and there are
also some implementation-dependent profilers around) to identify
the "hot spots" of your program.

2. Add some type and optimization declarations to these hot spots to
increase performance.

3. The good compilers, notably Python (the compiler for CMUCL, SBCL,
and SCL), will also help you in identifying places where the
compiler needs your guidance for optimization.

4. If you're still not happy with your results, you might want to
decide to change one or two important data representations because
your profiler found out that you "cons" too much, i.e. that your GC
is more busy than it should be. Some knowledge about your
particular implementation might help here, and you might actually
come across situations that I'd call "uglification" but Lisp
(especially its macro facilities) will help you in writing pretty
good abstractions and if you've done so the "uglification" effort
might be negligible.

5. As a last resort you can use your Lisp's foreign function interface
to implement critical sections in C or C++. (I've never done this,
though.)
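
To make step 1 concrete, a sketch with CMUCL's bundled PROFILE package;
the function names are hypothetical stand-ins for your own hot spots:

;; Instrument the suspects, exercise the program, read the report.
(profile:profile norm update-state)
(main)
(profile:report-time)   ; per-function time, call counts, and consing
(profile:unprofile)     ; remove the instrumentation again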

These steps will generally lead to programs sufficiently fast for most
purposes. A comparable C/C++ program _might_ still be a bit faster but
you have to decide (in advance...) if it's worth it. This newsgroup
has a lot of regular contributors who are C++ experts and who would
nevertheless opt for Lisp in most cases.

Or, for a different perspective: I myself am not a C++ expert, so I have
to take your word that current C++ compilers are pretty good at
optimizing high-level abstractions, maybe better than current CL
compilers. But there's nothing inside the CL language specification
that "forbids" Lisp compilers to do the same thing. If you're a
commercial customer of one of the big Lisp vendors I'm sure they'll be
happy to optimize specific parts of the compilation process if you can
convince them that this is a significant bottleneck for you.

HTH,
Edi.

prunes...@attbi.com

Jun 21, 2003, 10:53:05 AM
Edi Weitz <e...@agharta.de> writes:

> Or, for a different perspective: I myself am not a C++ expert so I've
> to take your word that current C++ compilers are pretty good at
> optimizing high-level abstractions, maybe better than current CL
> compilers.

In my experience, C++ compilers are absolutely *terrible* at
optimizing high-level abstractions. But I mean high-level
abstractions such as Lisp might define, not what seems high-level to a
C++ programmer.

To give an example, suppose you wanted to perform some operation
like this:

(let ((counter 0))
  (map 'list (lambda (element)
               (cons (incf counter) element))
       some-list))

Well, C++ doesn't have closures, but it *does* have classes, so you
can stack allocate a class that contains a reference to the counter
variable and pass a reference to that to the MAP member of the LIST
class. Hey, it works. Slowly.

What C++ doesn't do (I've never seen it), is realize that MAP can be
open-coded and that the closure doesn't need to exist.

This is a simple high-level optimization.
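
For comparison, here is roughly what an open-coding compiler could turn
the MAP call into -- the same loop with no closure object allocated (a
hand-written sketch, assuming SOME-LIST holds the input):

(let ((counter 0)
      (result '()))
  (dolist (element some-list (nreverse result))
    (push (cons (incf counter) element) result)))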

Paolo Amoroso

Jun 21, 2003, 2:11:18 PM
On Sat, 21 Jun 2003 03:10:09 +0200, Pascal Costanza <cost...@web.de>
wrote:

> With Lisp you get the same first bug after 40 hours. You correct it and
> continue the program. You get the second bug after another 4 hours. You
> correct it and then let the program finish its job. Voila, you have
> beaten the C++ version. :)
>
> Of course, this is an example made up on the spot. But there are some
> people here in c.l.l who have actual experience with similar situations.

Some coverage of this can be found in the report:

The Planning and Scheduling Working Group Report on Programming Languages
http://www.ess.stsci.edu/psdb/spike/spike-wg/psflang-report.pdf


Paolo
--
Paolo Amoroso <amo...@mclink.it>

Thomas F. Burdick

Jun 21, 2003, 10:51:07 PM
hel...@mindspring.com (Rayiner Hashem) writes:

> As a mainly C++ guy, this thread has been interesting to me. I find
> Lisp very elegant as a language, but am concerned about performance.
> For some of the work I do, a 50% performance differential means that I
> have to wait 2 days for results instead of one.

In cases where it really matters, you can generally get performance
more like 80-90% of low-level C code. One of the many reasons that
people often get performance more like 50% is that they hold Lisp code
to higher standards: where they #define'd constants in the C code,
they make the Lisp code use user-specified parameters; they want the
Lisp code to get correct results for all input, where they're okay
with C's lousy numeric tower; etc.

> Besides, performance is just another aspect of elegance, no?

I certainly think so.

> 1) Is it worth buying a commercial implementation? Is Allegro CL or
> Franz CL significantly faster than CMU CL, particularly for numerics
> work?

It might be, but not because of the compiler. Python (the CMUCL
compiler) can generate very good code (esp on RISC architectures),
although in some cases you need to learn how to optimize your code for
it (but that's the case in any language). The CMUCL mailing lists are
generally very helpful, but if you want more support than that, you'll
either need a commercial implementation, or you can hire a CMUCL
developer as a consultant.

> 2) How are Lisp compilers at doing high level optimizations? One of
> the things I like about C++ is that C++ compilers have gotten good
> enough that they can take high-level, abstract interfaces and make
> them perform as fast as hand-written, low-level C code will be.

Really? In my experience, if you write idiomatic, higher-level C++
code, you'll get code more like 85-95% of low-level C code (unless you
get bitten by an inefficient bit of the STL). One thing you might
have noticed about Lisp code is that "high level" means something
higher level in Lisp than in C++. All this said, Lisp compilers
aren't bad. They don't do some of the static analysis heroics they
could, but you can generally expect them to do a good job, and only
deal with the times they fail.

Also, this is a big point: in C++, you're at the mercy of the
compiler, whereas in Lisp, the compiler works *with* you. You can
build mini-compilers in your macros, and you can write compiler macros
that can perform weird domain-specific optimizations that you couldn't
expect any general-purpose compiler to do. (This is actually what
brought me to Lisp -- I wanted to write efficient code without losing
elegance; so I was reinventing compiler macros in C++, until someone
pointed me at Lisp).

> So you really just have to pay attention to performance when
> choosing algorithms, not really when you're just writing the
> code. Simple, straight-forward code will easily get you more than
> halfway to optimal performance. Do good Lisp compilers behave the
> same way, or do you have "uglify" your code significantly to get the
> kind of performance Lisp advocates brag about?

Speaking only for CMUCL here, optimized code generally looks nice.
Once in a while, you need to uglify it to get performance (and you can
usually hide this behind only semi-ugly macros), but the vast majority
of the time the ugly code you see for CMUCL doesn't need to be. The
proper way to optimize code with Python is to stepwise change it to be
friendlier to Python's analysis. Just throwing 100k declarations and
(the ...) forms at your code is the wrong approach.

As an often performance-minded ex-C++'er, I highly recommend you try
Lisp. My code looks nicer than it ever did in C++, and where it
matters, I get the performance I need: two of my personal Lisp success
stories were a CL program that was rewritten in C++ (gcc) that never
was able to match the performance of the CL version; and a CL rewrite
of a C++ (Sun Forte) program where CMUCL was able to get performance
in the 90-105% range of Forte.

David Steuber

Jun 22, 2003, 2:23:29 PM
I don't really know much about how compilers work, particularly when
it comes to producing optimized code. However, would I be wrong in
thinking that a high level language has more information available for
the compiler to optimize the generated code?

Some years back, I ran into an interesting case with a Java vs C test
I had written for numerical performance. I had this little bit of
code:

sin(x);
cos(x);

The Java compiler recognized what was going on and generated assembler
using Intel's fsincos FPU instruction. The C++ compiler (VC++ 5.0)
missed that optimization. The result was the Java code executed a
fair bit faster.

Considering the mathematical nature of Lisp, I would wonder if it is
generally possible to have the compiler do code transformations to
produce the best possible code for the algorithm. Or if not possible
in the general case, have a large library of patterns that it
recognizes so that it can optimize those specific cases.

--
One Editor to rule them all. One Editor to find them,
One Editor to bring them all and in the darkness bind them.

(do ((a 1 b) (b 1 (+ a b))) (nil a) (print a))

Rayiner Hashem

Jun 22, 2003, 5:36:59 PM
> I don't really know much about how compilers work, particularly when
> it comes to producing optimized code. However, would I be wrong in
> thinking that a high level language has more information available for
> the compiler to optimize the generated code?
I realize that a high level language theoretically has more
information available to give to the optimizer, I was just wondering
whether the current Lisp compilers really took advantage of that. From
what I've seen in the very helpful thread that followed, it appears
that cmucl does a pretty good job of this, a lot better than the
shootout results would lead me to believe. A lot of my work isn't
speed sensitive (heck, Python is fast enough most of the time) but the
4x-5x differentials in the shootout results made me a little
uncomfortable for the computational stuff. To its credit, C++
generally does a good job for this type of work; in particular, I've
never had a bug crop up in the middle of a large computation. However,
from what I've seen, Lisp is a fine language, and I wanted to try it
out for something which it seems well suited to. Anyway, thanks to all
those who took the time to respond to a Lisp-newb like me :)

Hartmann Schaffer

Jun 22, 2003, 5:46:21 PM
In article <87fzm2t...@verizon.net>,
David Steuber <david....@verizon.net> writes:
> ...

> Some years back, I ran into an interesting case with a Java vs C test
> I had written for numerical performance. I had this little bit of
> code:
>
> sin(x);
> cos(x);
>
> The Java compiler recognized what was going on and generated assembler
> using Intel's sincos FPU instruction. The C++ compiler (VC++ 5.0)
> missed that optimization. The result was the Java code executed a
> fair bit faster.

you can do this optimization in any language, provided the standard
library is part of the language specification and the standard forbids
replacing the standard library functions by functions that behave
differently. I have seen C compilers perform this particular
optimization. I also remember some early paper on Algol 68 stating that
this is something an Algol 68 user should ask his system administrator
to add to the compiler.

> ...

hs

--

ceterum censeo SCO esse delendam

Rayiner Hashem

Jun 22, 2003, 5:48:18 PM
> With Lisp you get the same first bug after 40 hours. You correct it and
> continue the program. You get the second bug after another 4 hours. You
> correct it and then let the program finish its job. Voila, you have
> beaten the C++ version. :)
Assuming that the bug wasn't one that generated incorrect data in
certain cases without causing a crash. Without a good analysis of the
code to prove that the data produced prior to the bug being
encountered is valid, I'd be very uncomfortable using the generated
results. So basically, you have to get a bug that:

a) Always crashes the program when a bug is encountered, and never
generates bad data
- and -
b) Doesn't show up in the tests you did on smaller datasets.

That's an awfully convenient bug for the Lisp guy, no?

Paul F. Dietz

unread,
Jun 22, 2003, 5:58:52 PM6/22/03
to
Rayiner Hashem wrote:

> I realize that a high level language theoretically has more
> information available to give to the optimizer; I was just wondering
> whether the current Lisp compilers really took advantage of that. From
> what I've seen in the very helpful thread that followed, it appears
> that cmucl does a pretty good job of this, a lot better than the
> shootout results would lead me to believe.

CMUCL's (and SBCL's) compiler most needs work on the register allocator
and other more mundane compiler optimizations (for example, lifting
invariant code out of loops).

Paul

Henrik Motakef

unread,
Jun 22, 2003, 6:05:42 PM6/22/03
to
On Sun, 22 Jun 2003 19:23:29 +0000, David Steuber wrote:
> Considering the mathematical nature of Lisp, I would wonder if it is
> generally possible to have the compiler do code transformations to produce
> the best possible code for the algorithm. Or if not possible in the
> general case, have a large library of patterns that it recognizes so that
> it can optimize those specific cases.

In a way, CL even allows you to build such a library yourself, as a mere
programmer, using DEFINE-COMPILER-MACRO and friends.

Of course, this does not necessarily mean that you are able to get down to
the level of optimized processor instructions - that would be rather hard
to specify portably, after all. Yet, your implementation of choice may
allow you to do this - using SBCL or CMUCL, you could use stuff in
compiler macros (or elsewhere) that is roughly equivalent to inline
assembly in C, only more convenient. I guess other implementations allow
similar stunts.
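
For instance, here is a minimal sketch of the portable part (MY-EXPT is
a made-up function, purely for illustration): a compiler macro that
rewrites calls with a constant argument into cheaper inline code, and
declines everything else:

(defun my-expt (base power)
  (expt base power))

(define-compiler-macro my-expt (&whole form base power)
  ;; If POWER is the literal 2 at macroexpansion time, expand into an
  ;; inline squaring; otherwise return FORM unchanged to decline.
  (if (eql power 2)
      (let ((b (gensym "BASE")))
        `(let ((,b ,base))
           (* ,b ,b)))
      form))

Now (my-expt x 2) compiles as (* x x), while (my-expt x n) still calls
the ordinary function.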

Peter Seibel

unread,
Jun 22, 2003, 7:00:02 PM6/22/03
to
hel...@mindspring.com (Rayiner Hashem) writes:

> > With Lisp you get the same first bug after 40 hours. You correct it and
> > continue the program. You get the second bug after another 4 hours. You
> > correct it and then let the program finish its job. Voila, you have
> > beaten the C++ version. :)

> Assuming that the bug wasn't one that generated incorrect data in
> certain cases without causing a crash. Without a good analysis of
> the code to prove that the data produced prior to the bug being
> encountered is valid, I'd be very uncomfortable using the generated
> results.

On the other hand, in either language, a program that runs to
completion (without crashing or dropping you into the debugger) could
also have a bug that silently corrupts data. Why be comfortable with
those results? (That's a rhetorical question but probably one worth
thinking about.)

> So basically, you have to get a bug that:
>
> a) Always crashes the program when a bug is encountered, and never
> generates bad data
> - and -
> b) Doesn't show up in the tests you did on smaller datasets.
>
> That's an awfully convenient bug for the Lisp guy, no?

Well, it's certainly an inconvenient bug for the C++ guy. ;-) Here's
an example that might be more common than you think: your program is
running along and 40 hours in it tries to write some data to disk.
Whoops, disk full! And your code doesn't have any code to deal with
that (that's the bug).

In Lisp you'd get dropped into the debugger with a message saying the
write failed because the disk is full. You clean up some space on the
disk and tell it to retry the write. Assuming you can count on the
Lisp system to have failed the disk write cleanly--i.e. when it
failed, it didn't write part of the data that it is now going to
rewrite--there's no reason to expect that the data you've generated so
far is corrupted; everything would have worked fine if you had cleaned
up your disk before starting the run. But with Lisp you can do it now
and carry on. Other obvious candidates are: permissions set wrong on a
file, tape not in tape drive, host temporarily unreachable because of
networking glitch, etc.
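
A minimal sketch of what that could look like (WRITE-RESULTS-TO-DISK is
a hypothetical helper; real code would be more careful):

(defun save-results (results path)
  (loop
    (restart-case
        ;; RETURN leaves the LOOP once the write succeeds.
        (return (write-results-to-disk results path))
      (retry ()
        :report "Retry writing the results."
        ;; Fall through; the LOOP takes us back to the write.
        nil))))

If the write signals an error, the debugger offers the RETRY restart;
you free some disk space, invoke it, and the loop tries the write again
without throwing away the computation.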


-Peter


--
Peter Seibel pe...@javamonkey.com

Lisp is the red pill. -- John Fraser, comp.lang.lisp

Daniel Barlow

unread,
Jun 22, 2003, 6:33:04 PM6/22/03
to
hel...@mindspring.com (Rayiner Hashem) writes:

> speed sensitive (heck, Python is fast enough most of the time) but the
> 4x-5x differentials in the shootout results made me a little
> uncomfortable for the computational stuff. To its credit, C++

You might also consider that CMUCL (and SBCL) arrange certain types of
specialised array in memory in a C-compatible way, so if you're
talking about computations on arrays you can code that bit in C (or
even in assembler following C calling conventions) and have it access
the Lisp array directly (i.e. without copying anything). There's an
example of this in the CMUCL manual.
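
Roughly, the call side could look like this (a sketch from memory of
the CMUCL FFI, so treat the exact names and types as assumptions and
check the manual's example):

(def-alien-routine "sum_doubles" double-float
  (data system-area-pointer)
  (len int))

(let ((v (make-array 1000 :element-type 'double-float)))
  ;; VECTOR-SAP gives C a raw pointer to V's data; WITHOUT-GCING keeps
  ;; the collector from moving V while C holds that pointer.
  (sys:without-gcing
    (sum-doubles (sys:vector-sap v) (length v))))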


-dan

--

http://www.cliki.net/ - Link farm for free CL-on-Unix resources

Pascal Costanza

unread,
Jun 23, 2003, 9:25:32 AM6/23/03
to

Sure. And very annoying for the C++ guy because he has to restart the
program nonetheless.

The point is: Not all bugs are disastrous. See also Peter's reply.


Pascal

--
Pascal Costanza University of Bonn
mailto:cost...@web.de Institute of Computer Science III
http://www.pascalcostanza.de Römerstr. 164, D-53117 Bonn (Germany)

Paolo Amoroso

unread,
Jun 23, 2003, 11:06:14 AM6/23/03
to
On 22 Jun 2003 14:48:18 -0700, hel...@mindspring.com (Rayiner Hashem)
wrote:

> Assuming that the bug wasn't one that generated incorrect data in
> certain cases without causing a crash. Without a good analysis of the
> code to prove that the data produced prior to the bug being
> encountered is valid, I'd be very uncomfortable using the generated
> results. So basically, you have to get a bug that:

When this happens in Lisp, you are thrown into a break loop. From there, you
can use powerful inspection and introspection tools--not to mention the
whole language itself--for checking whether the data produced so far is
consistent and usable.

Rayiner Hashem

unread,
Jun 23, 2003, 6:57:34 PM6/23/03
to
> CMUCL's (and SBCL's) compiler most needs work on the register allocator
> and other more mundane compiler optimizations (for example, lifting
> invariant code out of loops).
Hmm. Has anybody considered having cmucl apply its high-level
optimizations, then generate RTL code so GCC's (excellent, btw)
back-end could perform these optimizations and generate machine code?

Edi Weitz

unread,
Jun 23, 2003, 7:08:34 PM6/23/03
to
hel...@mindspring.com (Rayiner Hashem) writes:

Considered, yes. See this thread:

<http://article.gmane.org/gmane.lisp.cmucl.general/203>

But I don't think anything has come out of it yet.

Edi.

Gareth McCaughan

unread,
Jun 23, 2003, 7:33:41 PM6/23/03
to
"Gavin" <gov...@pacific.net.sg> writes:

[in reply to me saying various things about CMU CL, though
I can't see anything of mine that Gavin actually replied to]


> Read or skim the cmu-user.pdf that comes with the CMUCL distribution for
> type inferencing , compiler settings and
> design choices and extensions.

None of that makes any difference to the very specific case
being discussed here, in which the problem is not related
to any shortcoming in CMU CL's type inference.

> Any program in Lisp (e.g. CMUCL) will run differently (faster or slower)
> depending on the actual physical machine
> it is running on. (Even with the same compiler settings and optimizations
> turned on, the performance and benchmarks would be different.)

Er, yes. Is that supposed to be news?

[SNIP: further variations on the not very surprising theme
that running a program on a better computer can make it run faster.]

> Anyway, CMUCL is one of the fastest free Lisp implementations around. Here
> is an URL for CMUCL benchmarks.
> http://www.cons.org/cmucl/benchmarks/index.html
>
> Any realistic comparisons can only be made if both users use the same PC or
> computer to test out CMUCL or some other Lisp or language implementations.

Which they have been doing.

--
Gareth McCaughan

David Steuber

unread,
Jun 24, 2003, 2:39:41 AM6/24/03
to
h...@heaven.nirvananet (Hartmann Schaffer) writes:

This was of course a single, trivial example. I was actually rather
put off that the C++ compiler didn't do the optimization also. Sorry
for any confusion.

My question still stands though. Is it possible for a Lisp compiler
to produce the best possible machine code instructions for the Lisp
code it is compiling?

I'm one of those people who likes a program to be efficient in time
and space. I also like the idea that a language environment offers
the tools to allow the programmer to be effecient. I know that CMUCL
produces some very fast code in many situations and that the Python
compiler still has room for improvement.

Eh, I still have a lot to learn in any case. I'm already convinced
that Lisp is an excellent programming language. I just hate the idea
of C folks (like me) having the criticism that Lisp produces slow
code. This is especially so when a C programmer would have to write a
heck of a lot more than what is in my sig to produce the Fibonacci
sequence (with bignums). Well, GNU does have a bignum library, but I
think you get what I mean.

Joe Marshall

unread,
Jun 24, 2003, 9:39:14 AM6/24/03
to
David Steuber <david....@verizon.net> writes:

> My question still stands though. Is it possible for a Lisp compiler
> to produce the best possible machine code instructions for the Lisp
> code it is compiling?

What do you mean `best possible'? Do you care about compilation time?

Henry Massalin's `Superoptimizer' produces code by exhaustively
searching the instruction sequence space. It takes a *long* time to
run, but the resulting code is hard to improve upon.
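
For flavor, here is a toy sketch of the idea, with a few arithmetic
"ops" standing in for machine instructions (everything here is
illustrative; a real superoptimizer searches actual instruction
encodings and verifies far more carefully than probing a few inputs):

(defparameter *ops*
  (list (cons '1+ #'1+)
        (cons '1- #'1-)
        (cons 'double (lambda (x) (* 2 x)))
        (cons 'neg (lambda (x) (- x)))))

(defun run-seq (seq x)
  ;; Apply each op in SEQ to X, left to right.
  (reduce (lambda (v op) (funcall (cdr op) v)) seq :initial-value x))

(defun superopt (target probes max-len)
  ;; Try ever-longer sequences; return the first one that agrees with
  ;; TARGET on every probe input.
  (labels ((extend (seq len)
             (if (zerop len)
                 (when (every (lambda (x) (= (run-seq seq x)
                                             (funcall target x)))
                              probes)
                   (mapcar #'car seq))
                 (some (lambda (op)
                         (extend (append seq (list op)) (1- len)))
                       *ops*))))
    (loop for len from 1 to max-len
          thereis (extend '() len))))

;; (superopt (lambda (x) (+ (* 2 x) 1)) '(0 1 5 -3) 3) => (DOUBLE 1+)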

Eric Smith

unread,
Jun 24, 2003, 2:41:20 PM6/24/03
to
Joe Marshall <j...@ccs.neu.edu> wrote in message news:<adc7lh...@ccs.neu.edu>...

> Henry Massalin's `Superoptimizer' produces code by exhaustively
> searching the instruction sequence space. It takes a *long* time to
> run, but the resulting code is hard to improve upon.

One way to possibly improve the speed of superoptimized code is to
optimize it some more at run time, taking the patterns of data it
happens to be processing into account. This is one way a 100 MB
program can run faster than an equivalent 100 KB program. This
assumes of course that the amount of data being processed is enough to
cover the speed cost of the run time optimization. But that's likely
to be true in a lot of cases, because when a program most needs to be
superoptimized is when it has to process large amounts of data.
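
Lisp at least exposes the needed building blocks at the user level. A
toy sketch (illustrative only): generate and natively compile a loop
specialized to a scaling factor that was discovered in the data at run
time:

(defun make-specialized-scaler (factor)
  ;; COMPILE produces native code with FACTOR folded in as a literal
  ;; constant, so the multiplication can be open-coded.
  (compile nil
           `(lambda (v)
              (declare (type (simple-array double-float (*)) v))
              (map-into v (lambda (x) (* ,factor x)) v))))

;; (funcall (make-specialized-scaler 2.5d0) data)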

John Klein

unread,
Jun 25, 2003, 8:31:19 AM6/25/03
to
David Steuber <david....@verizon.net> wrote in message news:<87smq0r...@verizon.net>...


> Eh, I still have a lot to learn in any case. I'm already convinced
> that Lisp is an excellent programming language. I just hate the idea
> of C folks (like me) having the criticism that Lisp produces slow
> code.

If C folks are concerned with speed, why are they using C and not Fortran [1]?
(For numerical work, at least.)

Also note that the difference between gcc and native C can be much larger [2]
than the usual 30% difference quoted for C (gcc?) and CMUCL. So
I hope that these C folks are staying away from gcc, provided they haven't all
converted to Fortran.

This is just to put the C-centric speed obsession into perspective.

In response to the original question 'why was cmucl several times slower than
gcc on the matrix part of the shootout', one possible answer came up in the
CMUCL mailing list. CMUCL stores multidimensional arrays as a header containing
a pointer to a 1-d array containing the data. So (aref a i j) involves a
bit more work than a[i][j]. It was mentioned that this could be changed,
but the CMUCL maintainers are already rather busy and, erm, underpaid.
Perhaps the shootout could be rewritten to use matrices represented as 1d
arrays.
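
For illustration, the 1-d version is straightforward (MREF is a made-up
accessor name): store an n-by-m matrix row-major in a simple vector and
do the index arithmetic inline, with no header to chase:

(defun make-matrix (rows cols)
  (make-array (* rows cols) :element-type 'double-float
                            :initial-element 0d0))

(declaim (inline mref))
(defun mref (m cols i j)
  ;; Row-major indexing straight into the flat data vector.
  (declare (type (simple-array double-float (*)) m)
           (fixnum cols i j))
  (aref m (+ (* i cols) j)))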

[1] http://www.hpcf.cam.ac.uk/C_rant.html

[2] http://www.coyotegulch.com/reviews/intel_comp/intel_gcc_bench2.html

Rayiner Hashem

unread,
Jun 25, 2003, 6:11:46 PM6/25/03
to
> If C folks are concerned with speed, why are they using C and not Fortran [1]?
> (For numerical work, at least.)
>
> Also note that the difference between gcc and native C can be much larger [2]
> than the usual 30% difference quoted for C (gcc?) and CMUCL. So
> I hope that these C folks are staying away from gcc, provided they haven't all
> converted to Fortran.
Most of us C folk aren't concerned with something small like a 30%
difference. We're mainly concerned when we see a difference of 5x in a
simple benchmark. At least for me, it leads to the fear that I'll
start having to be careful about code-level (rather than
algorithm-level) optimizations to avoid hitting those 5x cases. For
most of us, we don't have to do that with C/C++, because the compilers
are so good. Of course, there is the factor that good C/C++
programmers have an intuitive understanding of the performance
characteristics of the language, and don't have to think too hard
about avoiding those pitfalls. Lisp, being a different language with
different performance characteristics, doesn't allow us to use that
same intuition. On the other hand, I don't think there is any hard
information about whether Lisp has more of those pitfalls to fall
into, though I'd be inclined to think so, given the shootout results
(which were coded in a straightforward way).

John Klein

unread,
Jun 26, 2003, 5:07:10 AM6/26/03
to
hel...@mindspring.com (Rayiner Hashem) wrote in message news:<a3995c0d.0306...@posting.google.com>...

> Most of us C folk aren't concerned with something small like a 30%
> difference.

Well, sometimes native C (and presumably native Fortran) thrashes gcc
by a factor of 5 (Monte-Carlo in SciMark 2.0 in the 2nd ref I gave).
And in some Shootout benchmarks (e.g. Ackermann) OCaml thrashes gcc by a
factor of 2. So C people concerned with a possible five-fold loss of speed
in CMUCL should also worry that they are using the wrong implementation or
even language.

> We're mainly concerned when we see a difference of 5x in a
> simple benchmark. At least for me, it leads to the fear that I'll
> start having to be careful about code-level (rather than
> algorithm-level) optimizations to avoid hitting those 5x cases.

Perhaps. But this is just one benchmark of many.

In practice I've found that simply using C is one great big
code-level optimization. C forces you to write at the machine's level, always.
Lisp allows you to program at an abstract, human-friendly level, and to revert
explicitly to machine-conscious, C-like programming only when necessary - that
is, to revert to the code-level optimizations that C forces on you implicitly.
Paul Graham, I think, describes Lisp as two languages: a slow-ish, flexible,
abstract, and easy-to-write language, and a fast but heavily declared and
machine-conscious language. C contains only the latter.

I used to have some of these concerns when I came to Lisp from C, but then I
found out that they were misplaced, and my priorities were all wrong.
Human time is valuable, machine time is cheap. Bugs waste far more
hours of precious human time than a program that runs 50% slower than the
theoretical optimum.

Now, C and Fortran just anger me. Absurdities like core dumps make me
want to pick up my machine and hurl it through the window. Because after
using Lisp I realise that it doesn't have to be this way.

David Steuber

unread,
Jun 28, 2003, 3:52:38 PM6/28/03
to
Joe Marshall <j...@ccs.neu.edu> writes:

> David Steuber <david....@verizon.net> writes:
>
> > My question still stands though. Is it possible for a Lisp compiler
> > to produce the best possible machine code instructions for the Lisp
> > code it is compiling?
>
> What do you mean `best possible'? Do you care about compilation time?

I guess I have to fall back to hand waving here. By "best possible",
I would mean code that runs as fast as possible on the target
processor. Where static compilation is possible, compile time isn't a
huge issue. Then again, I am still thinking as a C programmer. Since
CMUCL compiles each function before it is evaluated, compile time
could be important, particularly if there isn't enough information to
compile the function until runtime.

> Henry Massalin's `Superoptimizer' produces code by exhaustively
> searching the instruction sequence space. It takes a *long* time to
> run, but the resulting code is hard to improve upon.

I wasn't aware that a brute force solution would be required. What
sort of compile time is saved by settling for code that will probably
run at least 90% as fast as the (unknown) theoretical fastest possible
code? What about 80% as fast? Is there a well-defined point of
diminishing returns?

What Eric Smith posted about code optimizing itself for datasets
sounds interesting. I hadn't thought about that. If particular
sections of code get many passes, it would be cool if the program
could gradually optimize itself over time. I don't know how practical
that is in real life.

David Steuber

unread,
Jun 28, 2003, 4:08:42 PM6/28/03
to
jk27...@yahoo.com (John Klein) writes:

> If C folks are concerned with speed, why are they using C and not Fortran [1]?

Does Fortran still have numerical superiority over C? I thought that
was gone.

> Also note that the difference between gcc and native C can be much larger [2]
> than the usual 30% difference quoted for C (gcc?) and CMUCL. So
> I hope that these C folks are staying away from gcc, provided they haven't all
> converted to Fortran.

I'm not sure what you mean by 'native C' and GCC. Are you referring to
the Intel C compiler that recently benchmarked faster than GCC 3.0?

> In response to the original question 'why was cmucl several times slower than
> gcc on the matrix part of the shootout', one possible answer came up in the
> CMUCL mailing list. CMUCL stores multidimensional arrays as a header containing
> a pointer to a 1-d array containing the data. So (aref a i j) involves a
> bit more work than a[i][j]. It was mentioned that this could be changed,
> but the CMUCL maintainers are already rather busy and, erm, underpaid.
> Perhaps the shootout could be rewritten to use matrices represented as 1d
> arrays.

The C compilers I've used all represent multidimensional arrays as a
contiguous block of memory. When you use a[i][j] notation to index
the array, you are still doing the calculations for the array offset.
Perhaps compilers have gotten good enough to optimize that out. I
don't know. In most cases, I've been able to just use a pointer and
increment that. If the pointer is a register variable, you can fly
through the array.

Ah, you are referring to the Intel compiler. I remember this review.

Christophe Rhodes

unread,
Jun 28, 2003, 4:14:39 PM6/28/03
to
David Steuber <david....@verizon.net> writes:

> jk27...@yahoo.com (John Klein) writes:
>
>> If C folks are concerned with speed, why are they using C and not
>> Fortran [1]?
>
> Does Fortran still have numerical superiority over C? I thought that
> was gone.

Certainly the superiority is not gone if one compares Fortran77 code
to C. It's possible that heavy use of some of the newer Fortran
features might destroy the simplicity of the internal representations
that older-style code allows, which makes it relatively easy to
implement heavily optimizing compilers, but I don't know.

Christophe
--
http://www-jcsu.jesus.cam.ac.uk/~csr21/ +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%") (pprint #36rJesusCollegeCambridge)

Florian Weimer

unread,
Jun 28, 2003, 4:17:18 PM6/28/03
to
David Steuber <david....@verizon.net> writes:

> My question still stands though. Is it possible for a Lisp compiler
> to produce the best possible machine code instructions for the Lisp
> code it is compiling?

If you phrase it this way, the answer is clearly no, because of the
halting problem. 8-)

David Steuber

unread,
Jun 28, 2003, 4:35:06 PM6/28/03
to
jk27...@yahoo.com (John Klein) writes:

> Well, sometimes native C (and presumably native Fortran) thrashes gcc
> by a factor of 5 (Monte-Carlo in SciMark 2.0 in the 2nd ref I gave).
> And in some Shootout bencharks (eg Ackermann) ocaml thrashes gcc by a
> factor of 2. So C people concerned with a possible five-fold loss of speed
> in CMUCL should also worry that they are using the wrong implementation or
> even language.

This is certainly a valid point. There is an FFT library that was
written in OCaml because the authors didn't like C or C++. Perhaps
they also expected OCaml to produce faster code.

> > We're mainly concerned when we see a difference of 5x in a
> > simple benchmark. At least for me, it leads to the fear that I'll
> > start having to be careful about code-level (rather than
> > algorithm-level) optimizations to avoid hitting those 5x cases.
>
> Perhaps. But this is just one benchmark of many.

Yes, but that psychological factor shouldn't be ignored.

> I used to have some of these concerns when I came to Lisp from C, but then I
> found out that they were misplaced, and my priorities were all wrong.
> Human time is valuable, machine time is cheap. Bugs waste far more
> hours of precious human time than a program that runs 50% slower than the
> theoretical optimum.

As time passes, I come closer to realizing the above fact. My last
programming project was in Perl. That is nowhere near as fast as C.
But the application was web based. Saturating a T3 line was not a
problem even in Perl. Bandwidth was the limiting factor, not computer
power.

> Now, C and Fortran just anger me. Absurdities like core dumps make me
> want to pick up my machine and hurl it through the window. Because after
> using Lisp I realise that it doesn't have to be this way.

I hate it when an application up and dies. I would probably be a
strong Java advocate and programmer now if several things didn't
happen when I was first experimenting with the language. It was in an
NT environment with Sun's JDK and NT Emacs for my editor. I was ready
to start doing some GUI programming. I started finding bugs in the
behavior of the AWT. The last straw was when the JRE pulled an access
violation, NT's segfault, and died.

After all the hype about producing bug free software faster, Java went
and pulled that little trick. I bailed on Java then. It's not like I
was thrilled with the langauge in the first place anyway. While it
had a nice shallow learning curve, it was a low level programming
language masquarading as a high level language. I was no more
productive in it than with C++.

Lisp feels sooooo different. I'm not productive in it yet because of
various personal issues getting in the way. But I can already see
that the language is inherently more productive. The basics aren't
hard at all. I expect that before I qualify as a Lisp master, I will
be more productive than I ever was with any other language.

I'm looking forward to that; the magical point where the language
becomes internalized and I start thinking in Lisp. I think that may
be the proverbial "it" in my programming career.

Thomas F. Burdick

unread,
Jun 28, 2003, 7:01:53 PM6/28/03
to
David Steuber <david....@verizon.net> writes:

> jk27...@yahoo.com (John Klein) writes:
>
> > If C folks are concerned with speed, why are they using C and not Fortran [1]?
>
> Does Fortran still have numerical superiority over C? I thought that
> was gone.

Last I looked, the gap hadn't narrowed very much.

> > Also note that the difference between gcc and native C can be much larger [2]
> > than the usual 30% difference quoted for C (gcc?) and CMUCL. So
> > I hope that these C folks are staying away from gcc, provided they haven't all
> > converted to Fortran.
>
> I'm not sure what you mean by 'native C' and GCC. Are you referring to
> the Intel C compiler that recently benchmarked faster than GCC 3.0?

And the Forte compiler on Sun, and generally all the native compilers
on various architectures.

> > In response to the original question 'why was cmucl several times slower than
> > gcc on the matrix part of the shootout', one possible answer came up in the
> CMUCL mailing list. CMUCL stores multidimensional arrays as a header containing
> > a pointer to a 1-d array containing the data. So (aref a i j) involves a
> > bit more work than a[i][j]. It was mentioned that this could be changed,
> > but the CMUCL maintainers are already rather busy and, erm, underpaid.
> > Perhaps the shootout could be rewritten to use matrices represented as 1d
> > arrays.
>
> The C compilers I've used all represent multidimensional arrays as a
> contiguous block of memory. When you use a[i][j] notation to index
> the array, you are still doing the calculations for the array offset.
> Perhaps compilers have gotten good enough to optimize that out.

Reread the quoted passage above. What you have in CMUCL (for >=2d
arrays only) is something like this:

VAR ---> [ description of dimensions and contents ]
              |
              +---> [ row-major vector of data ]

So, to do an (aref a i j), you have to follow the pointer to the
header block, figure out how to do the multiplication, do the
multiplication, then index into the 1-d array.

In contrast, in C you just have:

VAR ---> [ row-major vector of data ]

When you do a[i][j], you just do the multiplication, and index into
the 1-d array.

> I don't know. In most cases, I've been able to just use a pointer
> and increment that. If the pointer is a register variable, you can
> fly through the array.

Usually, C programmers don't use real multi-dimensional arrays; they
just use vectors, plus some information about the array's dimensions.
This is because the a[i][j] syntax only works if the a's dimensions
are always the same. That's the disadvantage of the C approach. Of
course, you can always do the same thing in Lisp, which is probably
why no one's changed this in CMUCL.
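
Portable CL also lets you sidestep some of the per-access overhead
without changing the representation; a small sketch, walking the
row-major index directly instead of paying the (aref a i j) arithmetic
on each access (whether the implementation open-codes this well is
another question):

(defun sum-matrix (a)
  (declare (type (array double-float (* *)) a))
  (let ((sum 0d0))
    (declare (double-float sum))
    ;; ROW-MAJOR-AREF walks the underlying data vector linearly.
    (dotimes (k (array-total-size a) sum)
      (incf sum (row-major-aref a k)))))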

Gareth McCaughan

unread,
Jun 28, 2003, 9:18:06 PM6/28/03
to
David Steuber wrote:

> This is certainly a valid point. There is an FFT library that was
> written in OCaml because the authors didn't like C or C++. Perhaps
> they also expected OCaml to produce faster code.

If you mean FFTW, the library itself isn't in OCaml, but
some of its code is generated by an OCaml program.

--
Gareth McCaughan

prunes...@attbi.com

unread,
Jun 29, 2003, 12:11:57 PM6/29/03
to
David Steuber <david....@verizon.net> writes:

>> > My question still stands though. Is it possible for a Lisp compiler
>> > to produce the best possible machine code instructions for the Lisp
>> > code it is compiling?

> Joe Marshall <j...@ccs.neu.edu> writes:
>>
>> What do you mean `best possible'? Do you care about compilation time?
>
> I guess I have to fall back to hand waving here. By "best possible",
> I would mean code that runs as fast as possible on the target
> processor.

[snip]

Since you seem to be extremely concerned about the performance of
compiled code, let me point you at a few interesting papers:

Todd Proebsting ( http://www.research.microsoft.com/~toddpro/ )
suggests that advances in compiler optimization double computing power
every 18 years. While the 18 years figure was mostly to echo `Moore's
Law', consider this thought experiment by Bill Pugh
( http://www.cs.umd.edu/~pugh/IsCodeOptimizationRelevant.pdf )

``18 years from now, if we pull a Pentium III out of the deep
freeze, apply our future compiler technology to SPECINT2000, and
get an additional 2x speed improvement, I will be
impressed/amazed.''

The point here being that the `theoretically fastest possible code'
really isn't *that much* faster than what compilers currently
produce. In the amount of time you spend looking for a better
compiler optimization, computer hardware will have gotten at least
that much faster.

> Does Fortran still have numerical superiority over C?

FORTRAN77 will generally still beat C.
Assembly will generally beat C.

But if you really want flat-out performance, you want to use... Lisp.
See the `Supercomputer Toolkit' paper
( http://www.hpl.hp.com/techreports/94/HPL-94-30.pdf ):
the partially evaluated Lisp program was able to issue a floating
point operation on 98% of the instructions.

Hartmann Schaffer

unread,
Jun 29, 2003, 4:43:30 PM6/29/03
to
In article <xcvd6gx...@conquest.ocf.berkeley.edu>,
t...@conquest.OCF.Berkeley.EDU (Thomas F. Burdick) writes:
> ...

>> The C compilers I've used all represent multidimensional arrays as a
>> contiguous block of memory. When you use a[i][j] notation to index
>> the array, you are still doing the calculations for the array offset.
>> Perhaps compilers have gotten good enough to optimize that out.
>
> Reread the quoted passage above. What you have in CMUCL (for >=2d
> arrays only) is something like this:
>
> VAR ---> [ description of dimensions and contents ]
>               |
>               +---> [ row-major vector of data ]
>
> So, to do an (aref a i j), you have to follow the pointer to the
> header block, figure out how to do the multiplication, do the
> multiplication, then index into the 1-d array.

actually, given that these descriptions tend not to change between
accesses, quite a few optimizations should be fairly easy to
implement, esp. since aliasing problems should be much easier to
track in lisp than in C

> ...


> Usually, C programmers don't use real multi-dimensional arrays; they
> just use vectors, plus some information about the array's dimensions.
> This is because the a[i][j] syntax only works if the a's dimensions
> are always the same.

except the outermost dimension

Joe Marshall

unread,
Jun 30, 2003, 9:29:48 AM6/30/03
to
Florian Weimer <f...@deneb.enyo.de> writes:

I dunno if the halting problem applies here. Given two pieces of
code, it is possible to prove that one has the same effect and
produces the same answer as the other, provided that the other
produces an answer. So you don't have to prove that the program
terminates, just that the new program produces the same answer as the
old one (and some people, myself included, are happy if the new
program *doesn't* produce something *different* than the old one).

David Steuber

unread,
Jun 30, 2003, 10:29:42 PM6/30/03
to
prunes...@attbi.com writes:

> Since you seem to be extremely concerned about the performance of
> compiled code, let me point you at a few interesting papers:

Let's just say that performance does matter to me. If I buy hardware
that is twice as fast, I expect to have my programs run in half the
time or do twice as much.

> The point here being that the `theoretically fastest possible code'
> really isn't *that much* faster than what compilers currently
> produce. In the amount of time you spend looking for a better
> compiler optimization, computer hardware will have gotten at least
> that much faster.

If the compiler is producing reasonable code, meaning not slower than
it needs to be, then I guess I won't worry about the possibility that
I may be able to hand code better in assembler.

In general, GCC is the metric I would use to compare the performance
of Lisp against something else like C. If the difference is small,
then the faster development time of Lisp will certainly pay off.

> But if you really want flat-out performance, you want to use... Lisp.

I really like that answer. And performance of the code is not my only
concern. My primary goal is faster development and spending less time
debugging. I just don't want to be punished with slow code as a
result of rapid application development.

Thanks to everyone for their insights.
