
C speed vs functional languages


Bruce Hoult

Mar 20, 2000
In article <s71d7ot...@barnowl.CS.Berkeley.EDU>, David Gay
<dg...@barnowl.CS.Berkeley.EDU> wrote:

> bruce...@pobox.com (Bruce Hoult) writes:
> > The conceptual work needed to make compiled Scheme go as fast as C was done
> > by ... oh ... 1976 or so [1].
> >
> > These days, Scheme compilers such as Stalin [2] produce awesome code. In
> > fact Stalin often produces faster code than hand-written C despite the
> > fact that it generates C, because you would never ever have the patience
> > and skill to do all the bookkeeping required to do it by hand.
>
> I've heard these kinds of claim rather too often to believe them without
> substantial proof. References ? (I looked through the Stalin release
> and the author's web site, but didn't find anything)

OK, I've found a concrete (and impressive) claim...

In the articles...

<http://deja.com/getdoc.xp?fmt=text&AN=338901555>
<http://deja.com/getdoc.xp?fmt=text&AN=339742270>
<http://deja.com/getdoc.xp?fmt=text&AN=341354188>

... Siskind claims a 21:1 speed advantage for Stalin-compiled Scheme vs C,
on a 2D numerical integration code.

Here is the Scheme code:

(define (integrate-1D L U F)
  (let ((D (/ (- U L) 8.0)))
    (* (+ (* (F L) 0.5)
          (F (+ L D))
          (F (+ L (* 2.0 D)))
          (F (+ L (* 3.0 D)))
          (F (+ L (* 4.0 D)))
          (F (- U (* 3.0 D)))
          (F (- U (* 2.0 D)))
          (F (- U D))
          (* (F U) 0.5))
       D)))

(define (integrate-2D L1 U1 L2 U2 F)
  (integrate-1D L2 U2 (lambda (y) (integrate-1D L1 U1 (lambda (x) (F x y))))))

(define (zark U V)
  (integrate-2D 0.0 U 0.0 V (lambda (X Y) (* X Y))))

(define (r-total N)
  (do ((I 1 (+ I 1))
       (Sum 0.0 (+ Sum (zark (* I 1.0) (* I 2.0)))))
      ((> I N) Sum)))

(define (i-total N)
  (do ((I 1 (+ I 1))
       (Sum 0.0 (+ Sum (let ((I2 (* (* I I) 1.0))) (* I2 I2)))))
      ((> I N) Sum)))

(define (error-sum-of-squares N)
  (do ((I 1 (+ I 1))
       (Sum 0.0 (+ Sum (let ((E (- (r-total I) (i-total I)))) (* E E)))))
      ((> I N) Sum)))

(begin (display (error-sum-of-squares 1000)) (newline))


Siskind does admit to a slight cheat in this two-year-old message -- he
hand-expanded the 35-line Scheme program into a 342-line version, claiming
that Stalin would soon make this transformation automatically.

Given that it's now two years later, I just tried the Scheme code (35
lines) and the C code (58 lines) from the second deja message referenced
above. Using what appears to be the standard set of flags (those in
benchmarks/compile-stalin-benchmark) I got the following runtimes on my
200 MHz Pentium Pro, 256 KB cache, RedHat Linux 5.2 (1.0.36),
egcs-2.91.66:

C : 27.90 sec
Scheme: 5.85 sec

I don't know if some other set of flags would get the Scheme program down
to the claimed 1 second runtime -- I just use the standard set of flags
all the time. Beating the C code by a factor of nearly 5 is still pretty
good.

-- Bruce

Bernd Paysan

Mar 20, 2000
Bruce Hoult wrote:
> ... Siskind claims a 21:1 speed advantage for Stalin-compiled Scheme vs C,
> on a 2D numerical integration code.
>
> Here is the Scheme code:
[snip]

> (define (integrate-2D L1 U1 L2 U2 F)
> (integrate-1D L2 U2 (lambda (y) (integrate-1D L1 U1 (lambda (x) (F x y))) )))

And on the C code, he used function pointers to emulate lambda
functions. Yes, we know, that's not what you really want. After all, the
Scheme compiler doesn't do anything more than expand the lambda functions
and compile the expanded function instead of using function pointers.

It's completely fair to say "C isn't well suited for functional
programming", because C isn't a functional programming language. Use
Scheme to do functional programming, any day.

The impressive part is not that Scheme can beat C on functional
programming; the impressive part is that Scheme compilers like Stalin
can do a good job on functional programming style, and allow the
programmer to use such a comfortable style instead of writing ugly C
code.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

Robert Harley

Mar 20, 2000

Bernd Paysan <bpa...@mikron.de> writes:
> The impressive part is not that Scheme can beat C on functional
> programming; the impressive part is that Scheme compilers like Stalin
> can do a good job on functional programming style, and allow the
> programmer to use such a comfortable style instead of writing ugly C
> code.

Indeed, it is so nice to write:

(let ((D (/ (- U L) 8.0)))

instead of:

d = (u-l)/8;

I think I'll switch to Lisp on the spot.

Bye,
Rob.

Ketil Z Malde

Mar 20, 2000
Robert Harley <har...@corton.inria.fr> writes:

>> can do a good job on functional programming style, and allow the
>> programmer to use such a comfortable style instead of writing ugly C
>> code.

> Indeed, it is so nice to write:

> (let ((D (/ (- U L) 8.0)))

> instead of:

> d = (u-l)/8;

Apart from the two code snippets not being equivalent, what's the
purpose of this remark? Why not compare trivial pieces of code using,
say, continuation passing? Or even a simple tree descent or graph
traversal? Or why not a routine taking a function as a parameter?

> I think I'll switch to Lisp on the spot.

So syntax is the determining factor? Tell me, what is the semicolon
doing up there? Not using C or C++, are we? Surely not.

-kzm
--
If I haven't seen further, it is by standing in the footprints of giants

Sander Vesik

Mar 20, 2000
Bruce Hoult <bruce...@pobox.com> wrote:
> OK, I've found a concrete (and impressive) claim...

> In the articles...

> ... Siskind claims a 21:1 speed advantage for Stalin-compiled Scheme vs C,


> on a 2D numerical integration code.

Please forgive my French, but:

While sham benchmarks suck, this is just *FUCKING* *DISGUSTING*.

First, they do function inlining in their Scheme compiler, but don't
run the compilation process for the C code with -finline-functions,
so that the C compiler can do the same. The claim that C compilers
don't (or cannot) do cross-procedure optimisations is a lie. The way
they write the code is not only (contrary to their claims) unnatural but
also not standard C.

If you change the C code so that function zarkon is outside of zark,
the time it takes to call zark 10^6 times drops by a third. There is
no reason why zarkon should be a subfunction of zark. Even keeping
with the "higher function" paradigm, achieving a 4x performance boost
in a rewrite is trivial.

> -- Bruce

--
Sander

There is no love, no good, no happiness and no future -
these are all just illusions.

Sander Vesik

Mar 20, 2000
Bernd Paysan <bpa...@mikron.de> wrote:

> Bruce Hoult wrote:
>> ... Siskind claims a 21:1 speed advantage for Stalin-compiled Scheme vs C,
>> on a 2D numerical integration code.
>>
>> Here is the Scheme code:
> [snip]
>> (define (integrate-2D L1 U1 L2 U2 F)
>> (integrate-1D L2 U2 (lambda (y) (integrate-1D L1 U1 (lambda (x) (F x y))) )))

> And on the C code, he used function pointers to emulate lambda
> functions. Yes, we know, that's not what you really want. After all, the
> Scheme compiler doesn't do anything more than expand the lambda functions
> and compile the expanded function instead of using function pointers.

That's not the bad part. You can use function pointers and still get
good and fast code. What he did was take advantage of the fact that
gcc provides an extension that it cannot properly optimise itself.

> It's completely fair to say "C isn't well suited for functional
> programming", because C isn't a functional programming language. Use
> Scheme to do functional programming, any day.

> The impressive part is not that Scheme can beat C on functional
> programming; the impressive part is that Scheme compilers like Stalin
> can do a good job on functional programming style, and allow the
> programmer to use such a comfortable style instead of writing ugly C
> code.

A straightforward numerical C code is no more ugly than LISP or Scheme. When
properly written, both should run at roughly the same speed with a decent
compiler (par for the functional language). The hard part of the work is in
both cases done by the C compiler.

> --
> Bernd Paysan
> "If you want it done right, you have to do it yourself"
> http://www.jwdt.com/~paysan/

--

Maynard Handley

Mar 20, 2000
In article <brucehoult-21...@bruce.bgh>, bruce...@pobox.com
(Bruce Hoult) wrote:

>In article <rz74sa1...@corton.inria.fr>, Robert Harley
><har...@corton.inria.fr> wrote:

I hate to spoil the party here, but doesn't this argument pretty much sum
up why 99% of all software sucks?
Here we have examples of how code written in one style vs another is
supposed to be so much simpler to understand, yet the REAL issues that
help us understand code, namely comments and sensible function names, are
apparently considered unimportant.
Apparently a function name of zark() is supposed to convey meaningful
information, and we are supposed to pick up on the difference between
i-total and r-total by intuition. The name of the algorithm being used to
perform the integration is notably mentioned nowhere.

As long as people continue to engage in these bullshit wars about
epiphenomena that are tangential to the real issues of how to write
comprehensible code, the SW industry will continue as the province of
amateurs rightly sneered at by real engineers.

Maynard

>> Bernd Paysan <bpa...@mikron.de> writes:
>> > The impressive part is not that Scheme can beat C on functional
>> > programming; the impressive part is that Scheme compilers like Stalin
>> > can do a good job on functional programming style, and allow the
>> > programmer to use such a comfortable style instead of writing ugly C
>> > code.
>>

>> Indeed, it is so nice to write:
>>
>> (let ((D (/ (- U L) 8.0)))
>>
>> instead of:
>>
>> d = (u-l)/8;
>>

>> I think I'll switch to Lisp on the spot.
>

>I agree with you. Which is why I'm programming in Dylan as much as I can
>get away with these days.
>
>Here is the same code in Dylan. Note that it's a direct translation of
>the Scheme, and is only different in shallow syntactic ways -- the semantics
>are (for this program) absolutely identical -- but it's sooo much easier
>to understand, in my opinion at least:
>
>module: test2D
>
>define method integrate-1D(L, U, F)
>  let D = (U - L) / 8.0;
>  D * (F(L) * 0.5 +
>       F(L + D) +
>       F(L + 2.0 * D) +
>       F(L + 3.0 * D) +
>       F(L + 4.0 * D) +
>       F(U - 3.0 * D) +
>       F(U - 2.0 * D) +
>       F(U - D) +
>       F(U) * 0.5)
>end integrate-1D;
>
>define method integrate-2D(L1, U1, L2, U2, F)
>  integrate-1D(L2, U2,
>               method(y) integrate-1D(L1, U1, method(x) F(x, y) end) end)
>end integrate-2D;
>
>define method zark(U, V)
>  integrate-2D(0.0, U, 0.0, V, method(X, Y) X * Y end)
>end zark;
>
>define method r-total(N)
>  for (I from 1 to N,
>       Sum = 0.0 then Sum + zark(I * 1.0, I * 2.0))
>    finally Sum
>  end
>end r-total;
>
>define method i-total(N)
>  for (I from 1 to N,
>       Sum = 0.0 then Sum + begin
>                             let I2 = I * I * 1.0;
>                             I2 * I2
>                           end)
>    finally Sum
>  end
>end i-total;
>
>define method error-sum-of-squares(N)
>  for (I from 1 to N,
>       Sum = 0.0 then Sum + begin
>                             let E = r-total(I) - i-total(I);
>                             E * E
>                           end)
>    finally Sum
>  end
>end error-sum-of-squares;
>
>define method main(appname, #rest arguments)
>  format-out("%=\n", error-sum-of-squares(1000));
>  exit(exit-code: 0);
>end main;

Greg Alexander

Mar 21, 2000
AMEN!! Every time I see a comparison thread like this I feel bad that I
don't have the time necessary to understand what the benchmark is really
all about (if it's not obvious in 1 minute and it's on usenet, it's
generally not worth the time). I always mean to eventually figure out
what the benchmark is all about and how the party claiming the better
results is cheating, but I never make any headway, because they never
program with good style, and never comment except in self-centered and
meandering paragraphs disconnected from the code itself, often filled
with newly-invented terminology specific to the side of the tracks they
are advocating.

In article <handleym-200...@handma3.apple.com>, Maynard Handley wrote:
>In article <brucehoult-21...@bruce.bgh>, bruce...@pobox.com
>(Bruce Hoult) wrote:
>
>>In article <rz74sa1...@corton.inria.fr>, Robert Harley
>><har...@corton.inria.fr> wrote:
>
>I hate to spoil the party here, but doesn't this argument pretty much sum
>up why 99% of all software sucks?
>Here we have examples of how code written in one style vs another is
>supposed to be so much simpler to understand, yet the REAL issues that
>help us understand code, namely comments and sensible function names, are
>apparently considered unimportant.
>Apparently a function name of zark() is supposed to convey meaningful
>information, and we are supposed to pick up on the difference between
>i-total and r-total by intuition. The name of the algorithm being used to
>perform the integration is notably mentioned nowhere.
>
>As long as people continue to engage in these bullshit wars about
>epiphenomena that are tangential to the real issues of how to write
>comprehensible code, the SW industry will continue as the province of
>amateurs rightly sneered at by real engineers.
>
>Maynard

Ketil Z Malde

Mar 21, 2000
hand...@ricochet.net (Maynard Handley) writes:

> I hate to spoil the party here, but doesn't this argument pretty much sum
> up why 99% of all software sucks?

Touché.

Another suspicion that has been slowly creeping into my mind for some
time is that such microbenchmarks are next to useless for measuring
performance for larger applications with more interaction.

I've seen plenty of arguments about how the C family (except Java)
allows you to write faster programs than higher level languages, and
yet most of the programs I encounter are dog slow, bordering on
painful.

I don't think that is a problem that can be solved by more inlining of
functions, or using cpp #defines instead of virtual functions[0].

> the SW industry will continue as the province of amateurs rightly
> sneered at by real engineers.

Hey, at least we can quote properly! Sneer, sneer :-)

-kzm

[0] A defense for MFC's braindead idea of everything being an int,
presented in a course I attended. Really.

Bruce Hoult

Mar 21, 2000
In article <95357265...@haldjas.folklore.ee>, Sander Vesik
<san...@haldjas.folklore.ee> wrote:

> A straightforward numerical C code id no more ugly than LISP or Scheme. When
> propely written, both should run at roughly the same speed with a decent
> compiler (pair for the functional language). The hard part of the work is in
> both cases done by the C compiler.

Care to present your properly-written version of this algorithm?

Maynard is of course correct in that real programs need comments and
meaningful names. But he seems to forget that the less code you have to
write, and the less "tricky" it is, the fewer comments you need and the
less work you have to put in to figure out what it does (whether someone
else wrote it or the someone is you).

-- Bruce

Bruce Hoult

Mar 21, 2000
In article <95357175...@haldjas.folklore.ee>, Sander Vesik
<san...@haldjas.folklore.ee> wrote:

> While sham benchmarks suck, this is just *FUCKING* *DISGUSTING*.
>
> First, they do function in lining in their scheme compiler, but don't
> run the compilation process for the C code with -finline-functions,
> so that the C compiler can do the same.

You know, you'd be a lot more credible if you'd actually try it before you
spout.

I see no difference whatsoever between compiling that code with
-finline-functions and without. It's 27.97 seconds either way, and five
times slower than the Scheme. That's with egcs-2.91.66, on a PPro 200.
Quite possibly, that's because I *already* used the -O3 flag, which
includes -finline-functions in the first place.

Just for you, I just tried it using "gcc -O3 -fomit-frame-pointer
-funroll-all-loops", which got it down to 26.84 sec, vs the scheme at 5.85
sec.


> The claim that C compilers
> don't (or cannot) do cross-procedure optimisations is a lie. The way
> they write the code is not only (contrary to their claims) unnatural but
> also not standard C.

So show us how you'd write it. But don't forget that the Scheme can also
be written differently, with opportunities for optimization more explicit,
and quite possibly faster.


> If you change the C code so that function zarkon is outside of zark,
> the time it takes to call zark 10^6 times drops by a third. There is
> no reason why zarkon should be a subfunction of zark. Even keeping
> with the "higher function" paradigm, achieving a 4x performance boost
> in a rewrite is trivial.

That's a bold claim. Prove it.

Bearing in mind, of course, that this is a numeric problem, the natural
province of C and FORTRAN, and that:
- the garbage collection overhead will kill the scheme code
- the constant boxing and unboxing overhead will kill the scheme code
- the function call overhead will kill the scheme code


I've got no great truck with Scheme, and 90+% of the code I've ever written
is in C++, which I'm expert at (or, at least, was in ARM days and before
-- I haven't really tracked the changes in the ANSI standard), and most of
the rest is in C or Pascal. But I've been playing with Scheme (and Dylan)
and they really *do* seem to work as advertised, and I'm looking for
reasons not to use them for real, serious, work. And I just can't see
much downside, given the current crop of (free, open source) compilers.

-- Bruce

Stefan Monnier <foo@acm.com>

Mar 21, 2000
>>>>> "Bruce" == Bruce Hoult <bruce...@pobox.com> writes:
> the rest is in C or Pascal. But I've been playing with Scheme (and Dylan)
> and they really *do* seem to work as advertised, and I'm looking for
> reasons not to use them for real, serious, work. And I just can't see
> much downside, given the current crop of (free, open source) compilers.

From what I can tell, Scheme has reasonably poor support for separate
compilation, which might be relevant to you or not.


Stefan

Andrew Reilly

Mar 21, 2000
On 21 Mar 2000 06:04:05 -0500, "Stefan Monnier <foo@acm.com>" wrote:
>From what I can tell, Scheme has reasonably poor support for separate
>compilation, which might be relevant to you or not.

I think that's the price that you just have to pay for abstraction
flattening and compile-time optimisation.

Until Crusoe/Fx32/Dynamo/HotSpot-style run-time optimisation is
more widespread and sorted, fast and separate compilation are ideas
that just don't want to be in the same box.

The alternatives that have both "fast" _and_ "separate compilation"
don't also have "abstraction flattening" (C, Fortran), and those
that have the abstractions and the separate compilation don't have
"fast".

--
Andrew

Brian Drummond

Mar 21, 2000
On Tue, 21 Mar 2000 07:53:30 GMT, Ketil Z Malde <ke...@ii.uib.no> wrote:

>hand...@ricochet.net (Maynard Handley) writes:
>
>> I hate to spoil the party here, but doesn't this argument pretty much sum
>> up why 99% of all software sucks?
>
>Touché.
>
>Another suspicion that has been slowly creeping into my mind for some
>time, is that such microbenchmarks are next to useless for measuring
>performance for larger applications with more interaction.
>
>I've seen plenty of arguments about how the C family (except Java)
>allows you to write faster programs than higher level languages, and
>yet most of the programs I encounter are dog slow, bordering on
>painful.
>
>I don't think that is a problem that can be solved by more inlining of
>functions, or using cpp #defines instead of virtual functions[0].
>

Agreed. I get the impression, especially when stepping through Windows
code, that C is often used to emulate, step by painful step, brick by
brick, exactly those mechanisms that were offered for "free" (to the
programmer) by those high level languages, and which were rejected as
slow and inefficient ten years ago.

Sometimes I think we _have_ closed the semantic gap.
By programming in C.

(Just took a look at Dylan though... at first glance, it seems to get a
lot right from the point of view of expressive power ... thanks Bruce! )

- Brian


KSG

Mar 21, 2000
In article <brucehoult-21...@bruce.bgh>,
bruce...@pobox.com (Bruce Hoult) wrote:
>> If you change the C code so that function zarkon is outside of zark,
>> the time it takes to call zark 10^6 times drops by a third. There is
>> no reason why zarkon should be a subfunction of zark. Even keeping
>> with the "higher function" paradigm, achieving a 4x performance boost
>> in a rewrite is trivial.
>
>That's a bold claim. Prove it.

I used the second of the handwritten C codes given in the Deja
archive. This code ran in 10.47 seconds on a 200MHz PPro
machine with gcc-2.7.2.3 using -O3.

A few optimizations (all well known, from strength reduction to
constant folding to inlining to selective procedure cloning),
which I think even increase readability, drop the run time
to 0.6s. This compares to the C code generated from the Scheme
code, which runs in 1.6s. Here is my resulting C code. It is
possible that the optimizations I performed don't actually
generate the correct answer, but it looks correct.

I could also obviously inline the 'integrate_1d' call in
the 'integrate_1d_out' procedure, but I get bored easily. I'm
sure if anyone cares they can do it themselves. Here is the new
code that I timed:

#include <stdio.h>

double closure_y, closure_l1, closure_u1, (*closure_f)(double, double);

double integrate_1d(double, double, double);
double integrate_1d_out(double, double);
double integrate_2d(double, double, double, double,
                    double (*)(double, double));
double zarkon(double, double);
double zark(double, double);
double r_total(int);
double sqr(double);
double i_total(int);
double error_sum_of_squares(int);
int main(void);

double integrate_1d(double l, double u, double y)
{ double d = (u-l)/8.0;
  return (4.0*d + 4.5*l + 3.5*u)*d*y; }

double integrate_1d_out(double l, double u)
{ double d = (u-l)/8.0;
  double l1 = closure_l1;
  double u1 = closure_u1;
  return (integrate_1d(l1, u1, (l)*0.5)+
          integrate_1d(l1, u1, (l+d))+
          integrate_1d(l1, u1, (l+2.0*d))+
          integrate_1d(l1, u1, (l+3.0*d))+
          integrate_1d(l1, u1, (l+4.0*d))+
          integrate_1d(l1, u1, (u-3.0*d))+
          integrate_1d(l1, u1, (u-2.0*d))+
          integrate_1d(l1, u1, (u-d))+
          integrate_1d(l1, u1, (u)*0.5))*d; }

double integrate_2d(double l1, double u1, double l2, double u2,
                    double (*f)(double, double))
{ closure_l1 = l1;
  closure_u1 = u1;
  closure_f = f;
  return integrate_1d_out(l2, u2); }

double zarkon(double x, double y) { return x*y; }

double zark(double u, double v)
{ return integrate_2d(0.0, u, 0.0, v, zarkon); }

double r_total(int n)
{ double sum = 0.0;
  int i;
  for (i = 1; i<=n; i++) sum += zark(i*1.0, i*2.0);
  return sum; }

double sqr(double x) { return x*x; }

double i_total(int n)
{ double sum = 0.0;
  int i;
  for (i = 1; i<=n; i++) sum += sqr(i*i*1.0);
  return sum; }

double error_sum_of_squares(int n)
{ double sum = 0.0;
  int i;
  for (i = 1; i<=n; i++) sum += sqr(r_total(i)-i_total(i));
  return sum; }

int main(void)
{ printf("%f \n", error_sum_of_squares(1000));
  return 0; }



Martin Rodgers

Mar 21, 2000
Voice in the desert: Quiet, isn't it, Stefan Monnier <f...@acm.com>?

> From what I can tell, Scheme has reasonably poor support for separate
> compilation, which might be relevant to you or not.

Stalin has poor support for separate compilation, but then Stalin does
whole program analysis. Would a C compiler doing WPA be any better?

Scheme doesn't preclude separate compilation. What does EVAL do? R5RS
says, "Evaluates expression in the specified environment and returns
its value." What is a Scheme program? R5RS says, "Programs are
typically stored in files or entered interactively to a running Scheme
system, although other paradigms are possible; questions of user
interface lie outside the scope of this report."

http://swissnet.ai.mit.edu/~jaffer/r5rs_toc.html
--
Email address intentionally munged | You can never browse enough
http://www.wildcard.demon.co.uk

Bruce Hoult

Mar 22, 2000
In article <0cfdb2c0...@usw-ex0101-008.remarq.com>, KSG
<kgatlin...@cs.ucsd.edu.invalid> wrote:

> In article <brucehoult-21...@bruce.bgh>,
> bruce...@pobox.com (Bruce Hoult) wrote:
> >> If you change the C code so that function zarkon is outside of zark,
> >> the time it takes to call zark 10^6 times drops by a third. There is
> >> no reason why zarkon should be a subfunction of zark. Even keeping
> >> with the "higher function" paradigm, achieving a 4x performance boost
> >> in a rewrite is trivial.
> >
> >That's a bold claim. Prove it.
>
> I used the second of the handwritten C codes given in the Deja
> archive. This code ran in 10.47 seconds on a 200MHz PPro
> machine with gcc-2.7.2.3 using -O3.
>
> A few optimizations (all well known, from strength reduction to
> constant folding to inlining to selective procedure cloning),
> which I think even increase readability, drop the run time
> to 0.6s. This compares to the C code generated from the Scheme
> code, which runs in 1.6s. Here is my resulting C code. It is
> possible that the optimizations I performed don't actually
> generate the correct answer, but it looks correct.

It looks correct to me, too. And the optimisations are, as you say, all
well known. Basically, you've made specialised versions of the 1D integral
depending on whether it is being called with the actual function to be
integrated or as the 2D outer loop, and then you've inlined the
function to be integrated in the actual 1D case and applied a bit of
simple algebra. Good.

The problem is:

- if you can do this, why can't your C compiler?

- what if you want to integrate a function other than Z = X * Y? Do you
want to hand-expand that one as well? And the third one? And the fourth?
Or would you rather the compiler did it for you?

- what if you want to do a three-dimensional integral instead of a
two-dimensional one? The Scheme code extends in a very obvious and
straightforward way (this is the heart of the example). What if you wanted
to try out a more sophisticated integrate-1D, maybe with an adaptive step
size or something? It's a trivial change to the Scheme program, but it's
a heck of a lot of work for your hand-expanded C version.


If this is the performance bottleneck in your whole program then, cool,
you can do it by hand. Hell, you could hand-code it in assembler if you
wanted to. But do you want to do this for your *entire* program, or is it
better to let the compiler do it, if it can?

-- Bruce

p.s. isn't it amazing that we all (Siskind, two years ago, me and you
now) are using 200 MHz Pentium Pros?

Bruce Hoult

Mar 22, 2000
In article <38d7...@tequila.cs.yale.edu>, "Stefan Monnier <f...@acm.com>"
<monnier+comp/arch/news/@flint.cs.yale.edu> wrote:

> >>>>> "Bruce" == Bruce Hoult <bruce...@pobox.com> writes:
> > the rest is in C or Pascal. But I've been playing with Scheme (and Dylan)
> > and they really *do* seem to work as advertised, and I'm looking for
> > reasons not to use them for real, serious, work. And I just can't see
> > much downside, given the current crop of (free, open source) compilers.
>

> From what I can tell, Scheme has reasonably poor support for separate
> compilation, which might be relevant to you or not.

I don't understand. C/C++ also don't have support for separate
compilation, at least not as the term is understood by the designers of
Modula-2, Ada and so forth.

OTOH, Dylan *does* have excellent support for separate compilation, while
also having the semantics of Scheme in a conventional ("ALGOL-like", as in
C/Java/Pascal/Ada/Modula-2 etc) syntax and object-orientation.


Actually, I'm becoming less and less convinced of the utility of separate
compilation. It *can* speed up builds during iterative development, but
I've seen far too many linkers that take longer to sort out the object
files on a real project than it would have taken to just recompile
everything in a single file. Separate compilation is also necessary to
interface to code for which the author doesn't want to give you source
(e.g. the OS itself).

-- Bruce

Al Grant

Mar 22, 2000
"Bruce Hoult" <bruce...@pobox.com> wrote in message
news:brucehoult-22...@bruce.bgh...

> - what if you want to integrate a function other than Z = X * Y? Do you
> want to hand-expand that one as well? And the third one? And the fourth?
> Or would you rather the compiler did it for you?

Yes, what if. You have several functions, you put them in a
list, and integrate them all, say
(map integrate-2d (f1 f2 f3 f4))
or whatever the Scheme syntax is. Scheme's a functional
language so this is a perfectly natural way of doing it.
Do you think Scheme does that integration 21 times faster
than C?


Bruce Hoult

Mar 22, 2000
In article <8ba4kf$f2t$1...@pegasus.csx.cam.ac.uk>, "Al Grant"
<ag...@cam.ac.uk> wrote:

> "Bruce Hoult" <bruce...@pobox.com> wrote in message
> news:brucehoult-22...@bruce.bgh...
> > - what if you want to integrate a function other than Z = X * Y? Do you
> > want to hand-expand that one as well? And the third one? And the fourth?
> > Or would you rather the compiler did it for you?
>
> Yes, what if. You have several functions, you put them in a
> list, and integrate them all, say
> (map integrate-2d (f1 f2 f3 f4))
> or whatever the Scheme syntax is. Scheme's a functional
> language so this is a perfectly natural way of doing it.

Indeed it is.


> Do you think Scheme does that integration 21 times faster
> than C?

It is *extremely* likely that a Scheme compiler, when presented with a map
over a short literal list, will expand it to...

(list (integrate-2d f1) (integrate-2d f2) (integrate-2d f3) (integrate-2d f4))

... at which point it comes down to whether it can optimize the individual
function calls. Which, as we have previously seen, Stalin does.

In fact, I'd expect any good C compiler, when presented with...

float f1(float, float){ ... };
float f2(float, float){ ... };
float f3(float, float){ ... };
float f4(float, float){ ... };

typedef float (*func2d)(float, float);

func2d myFuncs[] = {f1, f2, f3, f4};
float myResults[4];

for (int i = 0; i < 4; ++i) {
    myResults[i] = integrate_2d(myFuncs[i]);
}

... to be able to unroll the loop.

-- Bruce

Martin Schoenert

Mar 22, 2000
Bruce Hoult <bruce...@pobox.com> wrote:

> Actually, I'm becoming less and less convinced of the utility of separate
> compilation. It *can* speed up builds during iterative development, but
> I've seen far too many linkers that take longer to sort out the object
> files on a real project than it would have taken to just recompile
> everything in a single file.

Ha, my thought exactly. Hm, wait.
That was what I thought about Turbo Pascal
on an Apple ][ with a Z80@1MHz running CP/M.
I guess it is an idea whose time comes and goes.

Martin.

--
Martin Sch"onert, Martin.S...@web.de
One must imagine Sisyphos happy. - Albert Camus

KSG

Mar 24, 2000
In article <brucehoult-25...@bruce.bgh>,
bruce...@pobox.com (Bruce Hoult) wrote:
>In article <38DBF599...@moene.indiv.nluug.nl>, Toon Moene
><to...@moene.indiv.nluug.nl> wrote:

>
>> Bruce Hoult wrote:
>>
>> > In article <0cfdb2c0...@usw-ex0101-008.remarq.com>, KSG
>> > <kgatlin...@cs.ucsd.edu.invalid> wrote:
>>
>> > > I used the second of the handwritten C codes given in the Deja
>> > > archive. This code ran in 10.47 seconds on a 200MHz PPro
>> > > machine with gcc-2.7.2.3 using -O3.
>>
>> > p.s. isn't it amazing that we all (Siskind, two years ago, me and you
>> > now) are using 200 MHz Pentium Pros?
>>
>> Or, for that matter, gcc-2.7.2.3, which is 5 years old ?
>
>Well, *I* was using egcs-2.91.66, which is less than a year old (yeah, OK,
>I should grab 2.95.x sometime...)

I should just mention that I actually use a Pentium II in lab
running 2.95. My home computer is a Pentium 100 with Win'98
(gotta have my Tiger Woods Golf, which BTW "requires" a 133MHz
Pentium but runs perfectly fine on my machine). The PPro200 was
another machine that I just happen to have access to, but I wanted to use
the same configuration as the original author (at least to the
extent that was reasonable).

I had some comments regarding the actual topic at hand, but I
think I smell Chinese food...

KSG

Toon Moene

unread,
Mar 25, 2000, 3:00:00 AM3/25/00
to
Bruce Hoult wrote:

> In article <0cfdb2c0...@usw-ex0101-008.remarq.com>, KSG
> <kgatlin...@cs.ucsd.edu.invalid> wrote:

> > I used the second of the handwritten C codes given in the Deja
> > archive. This code ran in 10.47 seconds on a 200MHz PPro
> > machine with gcc-2.7.2.3 using -O3.

> p.s. isn't it amazing that we all (Siskind, two years ago, me and you
> now) are using 200 MHz Pentium Pros?

Or, for that matter, gcc-2.7.2.3, which is 5 years old ?

--
Toon Moene - mailto:to...@moene.indiv.nluug.nl - phoneto: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
GNU Fortran 95: http://xena.eas.asu.edu/~andy (under construction)

Bruce Hoult

unread,
Mar 25, 2000, 3:00:00 AM3/25/00
to

> Bruce Hoult wrote:
>
> > In article <0cfdb2c0...@usw-ex0101-008.remarq.com>, KSG
> > <kgatlin...@cs.ucsd.edu.invalid> wrote:
>
> > > I used the second of the handwritten C codes given in the Deja
> > > archive. This code ran in 10.47 seconds on a 200MHz PPro
> > > machine with gcc-2.7.2.3 using -O3.
>
> > p.s. isn't it amazing that we all (Siskind, two years ago, me and you
> > now) are using 200 MHz Pentium Pros?
>
> Or, for that matter, gcc-2.7.2.3, which is 5 years old ?

Well, *I* was using egcs-2.91.66, which is less than a year old (yeah, OK,
I should grab 2.95.x sometime...)

-- Bruce

q...@pobox.com

unread,
Mar 30, 2000, 3:00:00 AM3/30/00
to
Sander Vesik <san...@haldjas.folklore.ee> wrote:
> Bruce Hoult <bruce...@pobox.com> wrote:
> > OK, I've found a concrete (and impressive) claim...
I am not so impressed ...
> > ... Siskind claims a 21:1 speed advantage for Stalin-compiled
> > Scheme vs C, on a 2D numerical integration code.
>
> Please forgive my french, but:
>
> While sham benchmarks suck, this is just *FUCKING* *DISGUSTING*.
>
> First, they do function inlining in their scheme compiler, but don't
> run the compilation process for the C code with -finline-functions,
> so that the C compiler can do the same. The claim that C compilers
> don't (or cannot do) cross-procedure optimisations is a lie. The way
> they write the code is not only (contrary to their claims) unnatural
> but also not standard in C.
Yeah, it looked like they were trying to write in Pascal, with their
localized function. I think I'm going to have to agree with your
French here.

> If you change the C code so that function zarkon is outside of zark,
> the time it takes to call zark 10^6 times reduces by a third. There is
> no reason why zarkon should be a subfunction of zark. Even keeping
> with the "higher function" paradigm, achieving a 4x performance boost
> in a rewrite is trivial.
Why not just change the function calls to macro defines? That would
undoubtedly have led to equal or better performance from the C
compiler.
--
Paul Hsieh
(Bear with me ... this is my post via deja)


Sent via Deja.com http://www.deja.com/
Before you buy.

Paul Hsieh

unread,
Mar 30, 2000, 3:00:00 AM3/30/00
to
bruce...@pobox.com says...

> In article <8bvomb$sfs$1...@nnrp1.deja.com>, q...@pobox.com wrote:
> > Sander Vesik <san...@haldjas.folklore.ee> wrote:
> > > Bruce Hoult <bruce...@pobox.com> wrote:
> > > > OK, I've found a concrete (and impressive) claim...
> > I am not so impressed ...
> > > > In the articles...
> > >
> > > > <http://deja.com/getdoc.xp?fmt=text&AN=338901555>
> > > > <http://deja.com/getdoc.xp?fmt=text&AN=339742270>
> > > > <http://deja.com/getdoc.xp?fmt=text&AN=341354188>
> > >
> > > > ... Siskind claims a 21:1 speed advantage for Stalin-compiled
> > > > Scheme vs C, on a 2D numerical integration code.
> > >
> > > Please forgive my french, but:
> > >
> > > While sham benchmarks suck, this is just *FUCKING* *DISGUSTING*.
> > >
> > > First, they do function inlining in their scheme compiler, but don't
> > > run the compilation process for the C code with -finline-functions,
> > > so that the C compiler can do the same. The claim that C compilers
> > > don't (or cannot do) cross-procedure optimisations is a lie. The way
> > > they write the code is not only (contrary to their claims) unnatural
> > > but also not standard in C.
> >
> > Yeah, it looked like they were trying to write in pascal, with their
> > localized function. I think I'm going to have to agree with your
> > french here.
>
> This is rather amusing. You want to use a gcc-specific flag
> ("-finline-functions") on the one hand

I do? That was Sander. I was mostly agreeing with his french.

> [...] , but don't want to use gcc extensions which improve the power
> and readabilty on the other.

Using non-ANSI syntax that to this very moment I am having a hard time
grasping is indeed something I don't want.

More importantly gcc is something I don't want. It in no way is
representative of real world C compiler performance ability (except on
Linux where there is currently no choice.)

> First of all, the extensions make no difference to the speed,

Well, they do to a person trying to write C code in a natural way, since
there is no way they would even consider using them. I mean, you are
using a very Pascal-like (as far as a C programmer who knows
nothing about Scheme is concerned) extension whose usage is ridiculously
rare even in the Pascal world, and which doesn't even work in the real C world.

> [...] and secondly on the tests which *I* did, a sufficient
> optimisation level was used so that -finline-functions was
> automatically applied.

Which means nothing to me. gcc is not a serious compiler, and that is
not serious C code.



> > > If you change the C code so that function zarkon is outside of zark,
> > > the time it takes to call zark 10^6 times reduces by a third. There is
> > > no reason why zarkon should be a subfunction of zark. Even keeping
> > > with the "higher function" paradigm, achieving a 4x performance boost
> > > in a rewrite is trivial.
> >
> > Why not just change the function calls to macro defines? That would
> > undoubtedly have led to equal or better performance from the C
> > compiler.
>

> No it won't. Try it for yourself, and show the results.

Well I don't have a Scheme compiler to work with. But beyond that, I'll
be honest: I've been staring at this for several minutes just trying to
figure out precisely what it is you are trying to write
here, and it's just a little hairy to me.

I mean I think I understand on a high level that this is serious scope
abuse, coupled with potentially, the absolute worst possible case of
pointer aliasing as really just a contorted way of re-expressing a doubly
nested for-loop, but I can't tell for sure (I don't have a compiler that
can compile *any* of the small source code at the first URL.)

Maybe if you (or someone else) re-expresses it in Fortran (and by that I
mean, by not using any scope tricks or artificial language structures
that only serve to obfuscate the code) so that I at least know what this
calculation is I'd be more able to tackle this. I mean is it really just
a simplistic numeric 2d integration?

I am willing to go the distance on this one -- you make a bold claim, and
if I am right I want to crush this sham with a devastating blow. If I am
wrong, then I would like to know what computational structure is missing
in the C language that I've been missing all these years that somehow is
encapsulated in this scheme language. Just give me a better description
of the program.

> Macros are unmanagable in large-scale programs.

Ok, thanks for your definitive judgement on this. Without the opinion of
fair minded scheme coders, there is no way I'd ever correct these
misconceptions of mine. I think I'll just stop using macros right away.

> [...] You can't pass macros to library functions.

No but you can pass library functions to macros. integrate_1d() (in the
short code sequence) is an obvious candidate for macro-fication -- the
result would remove two artificial levels of indirection. I think
integrate_2d() can also be macrofied but as my brain is blowing up from
the scoping details I will reserve judgement on it.

> Macros don't unroll loops for you. If macros are your solution then
> you've just thoroughly discredited yourself.

*sigh ...* Look if you don't like my comments why don't you just
killfile me?

--
Paul Hsieh
http://www.pobox.com/~qed/programming.html

James Van Buskirk

unread,
Mar 31, 2000, 3:00:00 AM3/31/00
to

Paul Hsieh wrote in message ...

>Maybe if you (or someone else) re-expresses it in Fortran (and by that I
>mean, by not using any scope tricks or artificial language structures
>that only serve to obfuscate the code) so that I at least know what this
>calculation is I'd be more able to tackle this. I mean is it really just
>a simplistic numeric 2d integration?


Yeah, I tried re-expressing it in Fortran but many readers of
comp.lang.fortran found my code (all 1 executable statements of it)
undecipherable. However, Jos Bergervoet provided a link to a gif
file that is eminently readable. The thread is entitled "fortran 90
slower than 77?"

Tom Payne

unread,
Mar 31, 2000, 3:00:00 AM3/31/00
to
Bruce Hoult <bruce...@pobox.com> wrote:
> In article <8bvomb$sfs$1...@nnrp1.deja.com>, q...@pobox.com wrote:

>> Sander Vesik <san...@haldjas.folklore.ee> wrote:

>> > Bruce Hoult <bruce...@pobox.com> wrote:

[...]

>> > > OK, I've found a concrete (and impressive) claim...
>> I am not so impressed ...
>> > > In the articles...
>> >
>> > > <http://deja.com/getdoc.xp?fmt=text&AN=338901555>
>> > > <http://deja.com/getdoc.xp?fmt=text&AN=339742270>
>> > > <http://deja.com/getdoc.xp?fmt=text&AN=341354188>
>> >
>> > > ... Siskind claims a 21:1 speed advantage for Stalin-compiled
>> > > Scheme vs C, on a 2D numerical integration code.
>> >
>> > Please forgive my french, but:
>> >
>> > While sham benchmarks suck, this is just *FUCKING* *DISGUSTING*.
>> >
>> > First, they do function inlining in their scheme compiler, but don't
>> > run the compilation process for the C code with -finline-functions,
>> > so that the C compiler can do the same. The claim that C compilers
>> > don't (or cannot do) cross-procedure optimisations is a lie. The way
>> > they write the code is not only (contrary to their claims) unnatural
>> > but also not standard in C.

>> Yeah, it looked like they were trying to write in pascal, with their
>> localized function. I think I'm going to have to agree with your
>> french here.

> This is rather amusing.
[...]

Agreed!

If I understand correctly, you are comparing two functionally
equivalent C programs, one of which is 21 times faster than the other
under a particular C implementation, and on that basis trying to make
a claim that Scheme is more efficient than C. Right?

Tom Payne

Bruce Hoult

unread,
Apr 1, 2000, 3:00:00 AM4/1/00
to
In article <MPG.134de16fc...@news.pacbell.net>,
DONT.qed...@pobox.com (Paul Hsieh) wrote:

It's hardly *my* problem if you can't grasp the use of a feature that is
present and commonly used in everything from Algol to Pascal to PL/I to
Modula-2 to Ada to Dylan to Perl to Scheme.


> More importantly gcc is something I don't want. It in no way is
> representative of real world C compiler performance ability (except on
> Linux where there is currently no choice.)

It seems to be used a *lot* more places than merely Linux. I'm a contract
programmer and am often in the position of using whatever compiler people
tell me to use, and to date that's almost always been gcc if the platform
is any kind of Un*x -- certainly that's always been the case on Solaris
jobs I've had (as well as Linux), and I seem to recall it being the case
on SGI and Stratus jobs I've had as well.

As it happens, I prefer MrC as an optimising C compiler, but you probably
haven't heard of that.


> > [...] and secondly on the tests which *I* did, a sufficient
> > optimisation level was used so that -finline-functions was
> > automatically applied.
>
> Which means nothing to me. gcc is not a serious compiler, and that is
> not serious C code.

What is "serious code"?

To me, "serious code" is code that is short, succinct, easy to write and
easy to understand later. The fewer lines of fluff and boilerplate the
better because every line you don't have to write is a potential bug you
didn't make, and every line of boilerplate you don't have to read is
something that you don't have to ask yourself "is this *really* just
standard boilerplate, or is there actually something subtly different
*this* time?"

If those goals are not incompatible with fast execution then all the better.


> > > > If you change the C code so that function zarkon is outside of zark,
> > > > the time it takes to call zark 10^6 times reduces by a third. There is
> > > > no reason why zarkon should be a subfunction of zark. Even keeping
> > > > with the "higher function" paradigm, achieving a 4x performance boost
> > > > in a rewrite is trivial.
> > >
> > > Why not just change the function calls to macro defines? That would
> > > undoubtedly have led to equal or better performance from the C
> > > compiler.
> >
> > No it won't. Try it for yourself, and show the results.
>
> Well I don't have a scheme compiler to work with. But beyond that, I'll
> be honest, I've been staring at this for serveral minutes just trying to
> figure out how to precisely understand what it is you are trying to write
> here and its just a little hairy to me.

I didn't write it. Actually, neither did Siskind; I believe it's just an
example that someone sent to him.


> I mean I think I understand on a high level that this is serious scope
> abuse,

Abuse? It's purely lexical scoping and downward function arguments.
There aren't even closures involved. Even the much-maligned Pascal can do
this.


> coupled with potentially, the absolute worst possible case of
> pointer aliasing

The mere fact that references are involved in the high-level formulation
of the algorithm doesn't necessarily imply that pointers are involved in
the low-level implementation. Except in C, of course, where they are a
major impediment to optimisation.


> as really just a contorted way of re-expressing a doubly
> nested for-loop, but I can't tell for sure (I don't have a compiler that
> can compile *any* of the small source code at the first URL.)

Yes, the end effect is a doubly nested for-loop, using generic components
(algorithms). It's much like C++'s template functions, except a lot
easier to understand.


> Maybe if you (or someone else) re-expresses it in Fortran (and by that I
> mean, by not using any scope tricks or artificial language structures
> that only serve to obfuscate the code) so that I at least know what this
> calculation is I'd be more able to tackle this. I mean is it really just
> a simplistic numeric 2d integration?

In this particular case, yes. Obtained from recursive use of a
one-dimensional numeric integration. With a couple more lines of code you
could have a 3- or 4-dimensional integration.


> I am willing to go the distance on this one -- you make a bold claim, and
> if I am right I want to crush this sham with a devastating blow. If I am
> wrong, then I would like to know what computational structure is missing
> in the C language that I've been missing all these years

Oh, about a dozen of them. GOTO-free programming. Assignment-free
programming. Side effect-free programming. Loop-free programming.
Pointer arithmetic-free programming. Generic algorithms. Automatic type
inferencing. Automatic memory management.


> > Macros are unmanagable in large-scale programs.
>
> Ok, thanks for your definitive judgement on this. Without the opinion of
> fair minded scheme coders, there is no way I'd ever correct these
> misconceptions of mine. I think I'll just stop using macros right away.

Fair-minded Scheme coders? Petit moi? Well, getting that way, after
surviving (to date) eleven years of professional C++ programming and I
don't want to know how many before that of C and Pascal and PL/I. I happen
to quite like infix syntax and so prefer to use Dylan rather than Scheme,
but Scheme is definitely a bit ahead in compiler research right at the
moment.


> > [...] You can't pass macros to library functions.
>
> No but you can pass library functions to macros. integrate_1d() (in the
> short code sequence) is an obvious candidate for macro-fication -- the
> result would remove two artificial levels of indirection.

It's far better as a generic algorithm, as in e.g. the C++ STL. Which is
effectively exactly what it is in Scheme, except expressed with a lot less
fluff.


> > Macros don't unroll loops for you. If macros are your solution then
> > you've just thoroughly discredited yourself.
>
> *sigh ...* Look if you don't like my comments why don't you just
> killfile me?

Because you're going to crush me, remember? Bad form to not see that.

-- Bruce

Bruce Hoult

unread,
Apr 1, 2000, 3:00:00 AM4/1/00
to
In article <8c1k43$b8c$1...@slb1.atl.mindspring.net>, "James Van Buskirk"
<tor...@ix.netcom.com> wrote:

> Paul Hsieh wrote in message ...
>

> >Maybe if you (or someone else) re-expresses it in Fortran (and by that I
> >mean, by not using any scope tricks or artificial language structures
> >that only serve to obfuscate the code) so that I at least know what this
> >calculation is I'd be more able to tackle this. I mean is it really just
> >a simplistic numeric 2d integration?
>
>

> Yeah, I tried re-expressing it in Fortran but many readers of
> comp.lang.fortran found my code (all 1 executable statements of it)
> undecipherable. However, Jos Bergervoet provided a link to a gif
> file that is eminently readable. The thread is entitled "fortran 90
> slower than 77?"

You wrote:

> program zaxxon
> implicit none
> double precision, parameter :: weights(9) = 0.5*(/1,2,2,2,2,2,2,2,1/)
> integer i, j, k, n
>
> write(*,*) sum((/((sum((/(dot_product(weights*2*k/8,matmul(reshape((/ &
> (((k*j/8.0d0)*(2*k*i/8.0d0),i=0,8),j=0,8)/),(/9,9/)),weights*k/8)) &
> ,k=1,n)/))-sum((/(dble(k)**4,k=1,n)/)))**2,n=1,1000)/))
>
> end program zaxxon
>
> If the compiler sees all the opportunities for optimization here it
> should be able to deal with this in a couple of microseconds. Do
> any current compilers get within a factor of 1000 of this ideal
> time?

You're absolutely correct. For example, one really big win is to simply
memoize the results of the zark function, as it is repeatedly called with
the same arguments (that's the k=1,n part in your code).

And, of course, there is no input so the compiler *could* simply rewrite
the code to be "print 0.0". I would argue that it *shouldn't* do that,
because that won't minimize the total of compile time and runtine -- the
most a compiler should do is have the generted program calculate the
result the first time it is run and store it somewhere permanent, checking
for a pre-calculated result on subsequent executions (if any).

-- Bruce

Bruce Hoult

unread,
Apr 1, 2000, 3:00:00 AM4/1/00
to
In article <8c3d88$ets$1...@pravda.ucr.edu>, Tom Payne
<t...@roam-thp2.cs.ucr.edu> wrote:

> > This is rather amusing.
> [...]
>
> Agreed!
>
> If I understand correctly, you are comparing two functionally
> equivalent C programs, one of which is 21 times faster than the other
> under a particular C implementation, and on that basis trying to make
> a claim that Scheme is more efficient than C. Right?

That's right. You've got two choices. You can spend a given amount of
effort writing a program in C. Or you can spend the same amount of effort
writing the same program in Scheme instead and then have the Scheme
compiler write C code for you, and have your program execute faster as a
result.

The Scheme compiler doesn't do anything that you couldn't do yourself, but
if the computer can do it for you then why not let it? Given heroic
effort you could make the C as fast as the Scheme, but it's likely to take
you a lot longer than the 20 seconds that it takes Stalin. And if the
Scheme compiler misses something (as it no doubt will), you are of course
free to put more effort into *that* version as well.


This is exactly parallel to the old argument about whether you should
write your programs in assembler or in a "high level" language such as C
or FORTRAN. All the C or FORTRAN compiler does is write assembler, the
same as you can. On small sections of code you can usually beat the
compiler, given sufficient effort, but it is seldom worth the trouble, and
is probably never worth the trouble for an entire large program.

-- Bruce

Paul Hsieh

unread,
Apr 1, 2000, 3:00:00 AM4/1/00
to
bruce...@pobox.com says...

> DONT.qed...@pobox.com (Paul Hsieh) wrote:
> > bruce...@pobox.com says...
> > > [...] but don't want to use gcc extensions which improve the power
> > > and readabilty on the other.
> >
> > Using non-ANSI syntax that to this very moment I am having a hard time
> > grasping is indeed something I don't want.
>
> It's hardly *my* problem if you can't grasp the use of a feature that is
> present and commonly used in everything from Algol to Pascal to PL/I to
> Modula-2 to Ada to Dylan to Perl to Scheme.

It's not present in C, and when written in C code, I didn't understand it.


> > More importantly gcc is something I don't want. It in no way is
> > representative of real world C compiler performance ability (except on
> > Linux where there is currently no choice.)
>
> It seems to be used a *lot* more places than merely Linux. I'm a contract
> programmer and am often in the position of using whatever compiler people
> tell me to use, and to date that's almost always been gcc if the platform
> is any kind of Un*x -- certainly that's always been the case on Solaris
> jobs I've had (as well as Linux), and I seem to recall it being the case
> on SGI and Stratus jobs I've had as well.

Well, but this was a speed test. While gcc might do great on Sun boxes,
it is woefully behind the times on pretty much all other platforms.



> as it happens, I prefer MrC as an optomising C compiler, but you probably
> haven't heard of that.

It's the C compiler Motorola came up with for their PowerPC. Chris Hecker
wrote about it in his article wherein he disparaged various C compilers
as being "not up to snuff".



> > > [...] and secondly on the tests which *I* did, a sufficient
> > > optimisation level was used so that -finline-functions was
> > > automatically applied.
> >
> > Which means nothing to me. gcc is not a serious compiler, and that is
> > not serious C code.
>
> What is "serious code"?

Code that you might actually write, and you might actually want to
maintain, and which, since you have timed it, have at least some
motivation to consider performance.



> To me, "serious code" is code that is short, succinct, easy to write and
> easy to understand later. The fewer lines of fluff and boilerplate the
> better because every line you don't have to write is a potential bug you
> didn't make, and every line of boilerplate you don't have to read is
> something that you don't have to ask yourself "is this *really* just
> standard boilerplate, or is there actually something subtly different
> *this* time?"
>
> If those goals are not incompatible with fast execution then all the better.

And what about portability? (I'll admit that's not real high on my list,
but by using this scoped function implementation you aren't cutting
yourself off from just a few compilers, but in fact most of them;
irreparably!)



> > > > > If you change the C code so that function zarkon is outside of zark,
> > > > > the time it takes to call zark 10^6 times reduces by 1/3th. There is
> > > > > no reason why zarkon should be a subfunction of zark. Even keeping
> > > > > with the "higher function" paradigm, achieveing 4x performance boost
> > > > > in a rewrite is trivial.
> > > >
> > > > Why not just change the function calls to macro defines. That would
> > > > undoubtedly have lead to equal or better performance from the C
> > > > compiler.
> > >
> > > No it won't. Try it for yourself, and show the results.

Here it is:

#define mac_int1d(xl,xu,xd,yl,yu,f) \
( f(yl,yu,xl, 0)*0.5 + \
f(yl,yu,xl, xd) + \
f(yl,yu,xl, 2*xd) + \
f(yl,yu,xl, 3*xd) + \
f(yl,yu,xl, 4*xd) + \
f(yl,yu,xu,-3*xd) + \
f(yl,yu,xu,-2*xd) + \
f(yl,yu,xu, -xd) + \
f(yl,yu,xu, 0)*0.5)*xd

#define mac_inner(y,dummy,x,xd) zarkon(x+xd,y)

static double mac_outer(double xl,double xu,double y,double yd) {
double xd = (xu-xl)/8;

return mac_int1d(xl,xu,xd,y+yd,0,mac_inner);
}

static double mac_int2d(double l1, double u1, double l2, double u2) {
double yd = (u2-l2)/8;

return mac_int1d(l2,u2,yd,l1,u1,mac_outer);
}

static double mac_zark( double u, double v ) {
return mac_int2d(0.0, u, 0.0, v);
}

This leads to a completely inlined solution -- which in and of itself should
be fairly hard to beat. Using WATCOM C/C++, it ends up using the
Athlon's FPU add pipeline with 35% efficiency, which led me to the
conclusion that a 21:1 ratio in performance of Scheme over C is a serious
exaggeration. If the Scheme solution even beat 1:1, I would be very
surprised.

For a more complete view of my analysis see:

http://www.pobox.com/~qed/int2.zip

> > I mean I think I understand on a high level that this is serious scope
> > abuse,
>
> Abuse? It's purely lexical scoping and downward function arguments.

And it doesn't exist in the ANSI C standard.

> There aren't even closures involved Even the much maligned Pascal can do
> this.

Actually Pascal is the only language that I've ever seen this done in.
But then I graduated.



> > coupled with potentially, the absolute worst possible case of
> > pointer aliasing
>
> The mere fact that references are involved in the high level formulation
> of the algorithm doesn't necessarily imply that pointers are involved in
> the low level implementation. Except in C, of course, where they are a
> major impediment to optomisation.

But in this case, it's more than that. It's obvious that it's superfluous to
at least some degree.



> > as really just a contorted way of re-expressing a doubly
> > nested for-loop, but I can't tell for sure (I don't have a compiler that
> > can compile *any* of the small source code at the first URL.)
>
> Yes, the end effect is a doubly nested for-loop, using generic components
> (algorithms).

That's all I needed to know.

> [...] It's much like C++'s template functions, except a lot easier to
> understand.

Except by me.



> > Maybe if you (or someone else) re-expresses it in Fortran (and by that I
> > mean, by not using any scope tricks or artificial language structures
> > that only serve to obfuscate the code) so that I at least know what this
> > calculation is I'd be more able to tackle this. I mean is it really just
> > a simplistic numeric 2d integration?
>
> In this particular case, yes. Obtained from recursive use of a
> one-dimensional numeric integration. With a couple more lines of code you
> could have a 3- or 4-dimensional integration.

The code I've written above can similarly be extended to 3- or 4-
dimensions as well.



> > I am willing to go the distance on this one -- you make a bold claim, and
> > if I am right I want to crush this sham with a devastating blow. If I am
> > wrong, then I would like to know what computational structure is missing
> > in the C language that I've been missing all these years
>

> Oh, about a dozen of them. GOTO-free programming. Side effect-free
> programming. Loop-free programming. Pointer arithmetic-free
> programming. Generic algorithms.

These I've already got as much as I want. It's just a question of writing
with a particular style (like using the preprocessor as I've done above.)

> Assignment-free programming. Automatic type inferencing. Automatic
> memory management.

The first, I don't understand the benefit of. The second is available in
C++, though it does not add anything functional to the final binaries.

The last is a double-edged sword at best -- by giving up low-level
control, you have less control over how this affects performance. Given
that processors seem to be heading towards an era of memory bandwidth
limitations, I should think that this will become an increasingly large
issue.

Tim Bradshaw

unread,
Apr 2, 2000, 4:00:00 AM4/2/00
to
* Paul Hsieh wrote:

> #define mac_int1d(xl,xu,xd,yl,yu,f) \
> ( f(yl,yu,xl, 0)*0.5 + \
> f(yl,yu,xl, xd) + \
> f(yl,yu,xl, 2*xd) + \
> f(yl,yu,xl, 3*xd) + \
> f(yl,yu,xl, 4*xd) + \
> f(yl,yu,xu,-3*xd) + \
> f(yl,yu,xu,-2*xd) + \
> f(yl,yu,xu, -xd) + \
> f(yl,yu,xu, 0)*0.5)*xd

This will blow up in a really splendid way if any of the arguments
have side-effects.

--tim


Paul Hsieh

unread,
Apr 2, 2000, 4:00:00 AM4/2/00
to
In article <ey33dp4...@cley.com>, t...@cley.com says...

> * Paul Hsieh wrote:
> > #define mac_int1d(xl,xu,xd,yl,yu,f) \
> > ( f(yl,yu,xl, 0)*0.5 + \
> > f(yl,yu,xl, xd) + \
> > f(yl,yu,xl, 2*xd) + \
> > f(yl,yu,xl, 3*xd) + \
> > f(yl,yu,xl, 4*xd) + \
> > f(yl,yu,xu,-3*xd) + \
> > f(yl,yu,xu,-2*xd) + \
> > f(yl,yu,xu, -xd) + \
> > f(yl,yu,xu, 0)*0.5)*xd
>
> This will blow up in a really splendid way if any of the arguments
> have side-effects.

Well yeah, that's typical of any macro that takes arguments. But hey,
thanks for the programming tip. A closer inspection of the code I
presented will show that this macro does not take arguments that have
side effects.

--
Paul Hsieh
http://www.pobox.com/~qed/int2.zip

Tim Bradshaw

unread,
Apr 2, 2000, 4:00:00 AM4/2/00
to
* Paul Hsieh wrote:
> Well yeah, that's typical of any macro that takes arguments. But hey,
> thanks for the programming tip. A closer inspection of the code I
> presented will show that this macro does not take arguments that have
> side effects.

My mistake, I'd assumed that you were interested in stuff like
maintainable code. It's easy to maintain a toy example like your
code, perhaps a little harder if the macro is off in some header file,
which was my point.

--tim

Bernd Paysan

unread,
Apr 2, 2000, 4:00:00 AM4/2/00
to

If you want maintainable macros, and can use GCC, use GCC's bracketed
blocks:

#define mac_int1d(_xl,_xu,_xd,_yl,_yu,f) \
({ typeof(_xl) xl=_xl; \
   typeof(_xu) xu=_xu; \
   typeof(_xd) xd=_xd; \
   typeof(_yl) yl=_yl; \
   typeof(_yu) yu=_yu; \
   (f(yl,yu,xl, 0)*0.5 + \
    f(yl,yu,xl, xd) + \
    f(yl,yu,xl, 2*xd) + \
    f(yl,yu,xl, 3*xd) + \
    f(yl,yu,xl, 4*xd) + \
    f(yl,yu,xu,-3*xd) + \
    f(yl,yu,xu,-2*xd) + \
    f(yl,yu,xu, -xd) + \
    f(yl,yu,xu, 0)*0.5)*xd; })

The original example used GCC extensions to make the program slower in
C, so now we use GCC extensions to keep it fast and make it more
maintainable. The assignments go away (no cost) if there are no side
effects.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

Andy Glew

unread,
Apr 2, 2000, 4:00:00 AM4/2/00
to
> Actually, I didn't know that gcc did that. Is there any real advantage
> over C++ inline functions?

Oh, hell yes!

There are a whole slew of situations where C++ inline functions
and templates don't work. E.g. any place where you want to
stringify a variable name.

Macros are, overall, much more powerful than inline functions.
C++ statement expressions make them safe, in the limited
area of functionality that inline functions cover.

Another neat use of statement expressions, this time outside
macros: you can use them to create lambda functions in C++.

Andy Glew

unread,
Apr 2, 2000, 4:00:00 AM4/2/00
to
> Macros are, overall, much more powerful than inline functions.
> C++ statement expressions make them safe, in the limited
> area of functionality that inline functions cover.

Woops. That's "GCC statement expressions..."

Andy Glew

unread,
Apr 2, 2000, 4:00:00 AM4/2/00
to
> And, of course, the possibility of passing syntactically-invalid
> fragments, which I trust no one here is crazy enough to do :-)

#define public_getter_setter(type,name) \
private: \
type m_ ##name; \
public: \
type get_ ##name() { return m_ ##name; } \
void set_ ##name( type val) { m_ ##name = val; } \
// end public_getter_setter

Does this qualify as a fragment?


By the way: I was heartily opposed to non-syntactic macros,
until Bjarne Stroustrup showed how useful they could be.


Bruce Hoult

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
In article <38E7BF3B...@gmx.de>, Bernd Paysan <bernd....@gmx.de> wrote:

> If you want maintainable macros, and can use GCC, use GCC's bracketed
> blocks:

But gcc is a crap compiler! You don't want to use that!

Actually, I didn't know that gcc did that. Is there any real advantage
over C++ inline functions?

-- Bruce

Zack Weinberg

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
In article <brucehoult-03...@bruce.bgh>,

The optimizer does a better job; for example, it can see if some variable is
actually a constant, and CSE will move code between the bracketed block and
the surroundings. This should be taken to indicate that the function
inliner sucks, not that bracketed blocks are so wonderful.

In C++, the function inliner just got a lot better recently, so the above
may not be true. It is still true of C.

zw

[Disclaimer: I am paid to work on GCC.]

Natarajan Krishnaswami

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
On Mon, 03 Apr 2000 11:58:31 +1200, Bruce Hoult <bruce...@pobox.com> wrote:
> Actually, I didn't know that gcc did that. Is there any real advantage
> over C++ inline functions?

The example Bernd Paysan gave was type-generic (via the gcc typeof
extension), so the C++ analogue would be an inline template function.

One difference, then, is that type-safety is not mandatory. Since
they're macros, their arguments are unevaluated. Similarly, they have
access to any variables and (local) functions in the scope of the
"calling" environment.

E.g.:
#define foo(x,unused) ({typeof(x) y=x; y, z=y+1;})
#include <stdio.h>
int main()
{
    int a, z=1;
    a=foo(3, 1.1);        // no type checking is performed on 1.1
    a=foo(3, "elephant");
    printf("%d\n", z);
}

will print 4.

Also, these are always emitted inline (e.g., regardless of whether one
uses -fno-inline-functions), whereas C++ functions marked for inlining
may or may not actually be inlined.

Any potential utility beyond that offered by inline functions arises
precisely because they're not functions, aesthetic considerations aside.
;-)


<N/>
--
you have been evaluated. you have a negative reference count. prepare
to be garbage collected. persistence is futile.
-- Erik Naggum

Bruce Hoult

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
In article <8c8q7n$3...@spool.cs.wisc.edu>, "Andy Glew" <gl...@cs.wisc.edu> wrote:

> Another neat use of statement expressions, this time outside
> macros: you can use them to create lambda functions in C++.

Now you're talking ;-)

However, I expect that Paul Hsieh would complain that he can't understand
it in a C program and that it makes the program slower...

-- Bruce

Bruce Hoult

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
In article <8c8sbk$q96$1...@eeyore.INS.CWRU.Edu>, nx...@po.cwru.edu wrote:

> On Mon, 03 Apr 2000 11:58:31 +1200, Bruce Hoult <bruce...@pobox.com> wrote:
> > Actually, I didn't know that gcc did that. Is there any real advantage
> > over C++ inline functions?
>
> The example Bernd Paysan gave was type-generic (via the gcc typeof
> extension), so the C++ analogue would be an inline template function.

Right.


> One difference, then, is that type-safety is not mandatory. Since
> they're macros, their arguments are unevaluated. Similarly, they have
> access to any variables and (local) functions in the scope of the
> "calling" environment.
>
> E.g.:
> #define foo(x,unused) ({typeof(x) y=x; y, z=y+1;})
> #include <stdio.h>
> int main()
> {
> int a, z=1;
> a=foo(3, 1.1); // no type checking is performed on 1.1
> a=foo(3, "elephant");
> printf("%d\n", z);
> }
>
> will print 4.

OK, so we're down to precisely *one* semantic difference from the
lexically nested functions with untyped arguments -- offered by at least
Scheme, Dylan and Perl -- that Paul complained so bitterly about: the
possibility of unevaluated arguments (which your example doesn't make use
of).

And, of course, the possibility of passing syntactically-invalid
fragments, which I trust no one here is crazy enough to do :-)

> Also, these are always emitted inline (e.g., regardless of whether one
> uses -fno-inline-functions), whereas C++ functions marked for inlining
> may or may not actually be inlined.

Well, that's a quality of implementation issue, not a language design
one. It's often useful to be able to *not* have things inlined -- for
debugging, for example.

-- Bruce

Bruce Hoult

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
In article <8c8vu0$6...@spool.cs.wisc.edu>, "Andy Glew" <gl...@cs.wisc.edu> wrote:

As I was typing that I thought to myself "bet I get a reply from 'crazy'
Glew" :-)

> > And, of course, the possibility of passing syntactically-invalid
> > fragments, which I trust no one here is crazy enough to do :-)
>

> #define public_getter_setter(type,name) \
> private: \
> type m_ ##name; \
> public: \
> type get_ ##name() { return m_ ##name; } \
> void set_ ##name( type val) { m_ ##name = val; } \
> // end public_getter_setter
>
> Does this qualify as a fragment?

No, I don't think so. Both arguments are complete syntactic elements (a
typename and a variable name), and the macro even generates a complete
syntactic element -- not perhaps a stand-alone one, but as long as it's
used within a class declaration there shouldn't be too many problems.
Certainly the intent is clean.

In Dylan or other languages with powerful macro facilities built into the
language (rather than as a language-ignorant preprocessor) you couldn't
have this as a stand-alone macro, but it could be one production of a
bigger macro.


> By the way: I was heartily opposed to non-syntactic macros,
> until Bjarne Stroustrup showed how useful they could be.

Of course this is a classic example of Stroustrup's thesis that most uses
of macros are illustrations of somthing lacking in the language. In
Dylan, when you write "slot name :: type;" in a class declaration it
automatically does everything you did in your macro, plus it's easier to
customize because you can instead write something like...

slot name :: type,
getter: myName,
setter: #f;

... to use a different public name, and not have a setter at all. If you
want to do that for a field then you'll have to either make your macro
*much* more complicated (could you even do it?), or have multiple macros,
or not use the macro at all.

-- Bruce

Tim Bradshaw

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
* Bernd Paysan wrote:

> If you want maintainable macros, and can use GCC, use GCC's bracketed
> blocks:

> #define mac_int1d(_xl,_xu,_xd,_yl,_yu,f) \
> ({ typeof(_xl) xl=_xl; \
>    typeof(_xu) xu=_xu; \
>    typeof(_xd) xd=_xd; \
>    typeof(_yl) yl=_yl; \
>    typeof(_yu) yu=_yu; \
>    (f(yl,yu,xl, 0)*0.5 + \
>     f(yl,yu,xl, xd) + \
>     f(yl,yu,xl, 2*xd) + \
>     f(yl,yu,xl, 3*xd) + \
>     f(yl,yu,xl, 4*xd) + \
>     f(yl,yu,xu,-3*xd) + \
>     f(yl,yu,xu,-2*xd) + \
>     f(yl,yu,xu, -xd) + \
>     f(yl,yu,xu, 0)*0.5)*xd; })

It's nice to know that GCC has this extension, but I think if I was
using gcc I'd use inline functions, which is what this macro seems to
be really, since I suspect this still suffers from name-capture issues
(the other classic macro problem, well known to Scheme / Lisp people),
and I'm not sure there's a way around that even in gcc other than
really obscure variable names and hope -- you need a considerably more
powerful & introspective macro system than C offers.

Anyway, this is kind of off-topic by now, sorry!

--tim

Andi Kleen

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
Tim Bradshaw <t...@cley.com> writes:

>
> It's nice to know that GCC has this extension, but I think if I was
> using gcc I'd use inline functions, which is what this macro seems to
> be really, since I suspect this still suffers from name-capture issues

Gcc often generates faster code when you use the macro instead of
the inline function (better register allocation, better CSE).

gcc's poor register allocation on x86 makes it often an advantage
to _not_ use inline/macros though :/ [current releases can only
assign variables to registers over the whole lifetime of the variable,
moving some bigger block into a subroutine often eliminates a lot
of spilling]

-Andi

Andi Kleen

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
"Andy Glew" <gl...@cs.wisc.edu> writes:
>
> By the way: I was heartily opposed to non-syntactic macros,
> until Bjarne Stroustrup showed how useful they could be.

.. unless you get an error message (the difference between
macro assembler and a real high level language)

-Andi

--
This is like TV. I don't like TV.

Ketil Z Malde

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
bruce...@pobox.com (Bruce Hoult) writes:

> And, of course, there is no input so the compiler *could* simply rewrite
> the code to be "print 0.0".

Ahem, you won't mind if I butt in?

I think it's a sensible thing to do, if the compiler can prove that it
is in fact all the program is going to do. Normally, a program would
not be written to only evaluate one constant, anyway.

I've seen (somebody with better memory might provide a reference) a
benchmark won by C++, where the whole calculation were done with some
clever template tricks (and of course, you can use C macros in many
cases). Of course, doing this explicitly and uglifying your code in
the process is probably not a good idea in general.

-kzm
--
If I haven't seen further, it is by standing in the footprints of giants

James Van Buskirk

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to

Ketil Z Malde wrote in message ...

>bruce...@pobox.com (Bruce Hoult) writes:


>> And, of course, there is no input so the compiler *could* simply rewrite
>> the code to be "print 0.0".

>I think it's a sensible thing to do, if the compiler can prove that it
>is in fact all the program is going to do. Normally, a program would
>not be written to only evaluate one constant, anyway.


I was trying to make a statement about my weird religious views on
language and benchmarks with my Fortran code, much like the thread
originator (see, I'm on-topic.) For example I think it looks compact
and readable to write the solution to a difference equation as

solution = (/ (sum(differences(:i)), i = 1, size(differences)) /)

and since the above is so 'normal' in my view that compilers by
now should recognize this idiom and carry out the solution with
only size(differences)-1 additions and no memory allocation. The
Fortran code I gave was similar in that the 'outer but one'
(indexed by k) loops depended only on the outer index (n) through
the upper bounds of their summations so that an obvious
optimization would be to evaluate each k-sum by just adding the
next term to the previous k-sum for a savings factor of about 500.
Compiler vendors seem to consider this behavior more suitable
for a CAS than a compiler, however. Also I wrote the inner loops
(indexed by i and j) as a tensor contraction in order to
make it possible in principle for a compiler to see that
dble(k)**4 could be factored out leaving a constant tensor
contraction that could be evaluated outside of the k- and n-
loops. Either of these optimizations would save a huge chunk
of time and if all the code is there for the compiler to figure
out it may be possible for a given compiler for a given language
to do so and create tremendously fast code. This makes for a
rather dubious benchmark in my view because one may really be
testing whether the compiler just 'gets it' as regards one of
these algebraic insights rather than generates efficient code
in the more general case where programmers typically assume
that the compiler isn't smart enough to figure these things
out and makes any algebraic simplifications by hand before
presenting their code to the compiler. In the given case the
typical programmer would be correct -- the number of Fortran
compilers that as much as run the code as given to completion
that I am aware of at this point is fewer than the number that
encounter an internal compiler error, and I haven't seen any
yet that have spotted either of the above simplifications.


Maynard Handley

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
In article <ey3vh20...@cley.com>, Tim Bradshaw <t...@cley.com> wrote:

>* Paul Hsieh wrote:
>> Well yeah, that's typical of any macro that takes arguments. But hey,
>> thanks for the programming tip. A closer inspection of the code I
>> presented will show that this macro does not take arguments that have
>> side effects.
>
>My mistake, I'd assumed that you were interested in stuff like
>maintainable code. It's easy to maintain a toy example like your
>code, perhaps a little harder if the macro is off in some header file,
>which was my point.

This is somewhat bogus in a decent environment. Metrowerks IDE colors
macros differently from functions, and allows one to double-click on a
macro to go straight to the definition. If one now adds in some personal
discipline (eg naming any macro that has side effects on an argument by
with an appended string of _CAREFUL) the problem becomes non-existent.
(Of course that would require caring about real issues like sane naming
conventions, rather than pointless issues like my language is bigger than
your language.)

Maynard

Maynard Handley

unread,
Apr 3, 2000, 3:00:00 AM4/3/00
to
In article <MPG.13507801d...@news.pacbell.net>,
DONT.qed...@pobox.com (Paul Hsieh) wrote:

>> as it happens, I prefer MrC as an optimising C compiler, but you probably
>> haven't heard of that.
>
>It's the C compiler Motorola came up with for their PowerPC. Chris Hecker
>wrote about it in his article wherein he disparaged various C compilers
>as being "not up to snuff".

Credit where credit is due.
Motorola own and are responsible for mcc.
Apple own and are responsible for mrc.
When I last had access to mcc, (three to four years ago) mcc and mrc were
much of a muchness, but mrc was easier to deal with (in the sense that it
had far fewer flags to tweak code gen). Mrc has improved a bit since then.
I imagine mcc has as well, but I've not had contact with it.

In theory IBM's xlc has them both beat. In my VERY limited dealings with
xlc it would appear that this is true but not significantly.

Maynard

Jay

unread,
Apr 4, 2000, 3:00:00 AM4/4/00
to
> > Using non-ANSI syntax that to this very moment I am having a hard time
> > grasping is indeed something I don't want.

Right. Using non-standard language features that are
not in a majority of C/C++ compilers is basically
programming incompetence.

On the other hand, I would say not understanding
the abilities and limitations of the current
level of compiler optimization that is generally
available and writing terrible hand optimized code
even though it wouldn't run any faster than
clean code is also incompetence.

> It's hardly *my* problem if you can't grasp the use of a feature that is
> present and commonly used in everything from Algol to Pascal to PL/I to
> Modula-2 to Ada to Dylan to Perl to Scheme.
>

A basic understanding of this field would indicate
that the field of computer language design is a despicable
sewer. It's a worthless leper field barely acknowledged
by Computer Scientists. The honest CS academics realize
that they can't software engineer/program their way
out of a paper bag and in general they have been holding
down the whole Software Engineering field so as to
force their own worthless Computer "Science"
theoretical masturbation (or pure DARPA prostitution).
Simply, most CS-ers are too incompetent at SE to know squat about
good language design.

That said, "Pascal" local functions are an anachronism
(nested functions that inherit their scope). C got it right by accident
(K&R were basically high level language design clowns).
Pascal got it because Algol had it.
Modula2 and Ada got it because Pascal had it, even though
they had other superior language features that covered its uses.
PL/I and Perl are language design crap. Scheme takes it
to a different level, but I don't want to go into
the Scheme loneliness.

The best software engineering language design I have seen is
Eiffel. Eiffel does not have local nested functions. Hurray!
Unfortunately, in the last Java language change,
the pinheads at Sun added something like nested functions
(inside classes) into Java. Again an incompetent MIT Scheme
looney was the cause.

> > More importantly gcc is something I don't want. It in no way is
> > representative of real world C compiler performance ability (except on
> > Linux where there is currently no choice.)
>

> What is "serious code"?


>
> To me, "serious code" is code that is short, succinct, easy to write and
> easy to understand later.

IMO, serious code is well documented, straightforward (simple to the point
of stupid), and written with excellent identifier names. Note that I don't
include short, succinct (or terse), or easy to write as drivers.

> The fewer lines of fluff and boilerplate the
> better because every line you don't have to write is a potential bug you
> didn't make,

Bzzt. Code can be longer and be easier to understand and maintain and
have fewer errors.


> > I mean I think I understand on a high level that this is serious scope
> > abuse,
>
> Abuse? It's purely lexical scoping and downward function arguments.
> There aren't even closures involved Even the much maligned Pascal can do
> this.

Nope, he is right; it's pretty much anti-SE scope abuse.

> > I am willing to go the distance on this one -- you make a bold claim, and
> > if I am right I want to crush this sham with a devastating blow. If I am
> > wrong, then I would like to know what computational structure is missing
> > in the C language that I've been missing all these years
>
> Oh, about a dozen of them. GOTO-free programming. Assignment-free
> programming. Side effect-free programming. Loop-free programming.
> Pointer arithmetic-free programming. Generic algorithms. Automatic type
> inferencing. Automatic memory management.

GOTO is convenient for automatic code generators.

Assignment-free and loop-free programming are functional programming, and that
has so far been a failure. Automatic type inferencing is another crackpot
"isn't this neat" feature that has not proved its case that it is
totally necessary or even desirable for Software Engineering.

"comp.arch" seems a weird place to argue high-level language design
issues. Even more ironic is that increasingly complex architectures
are creating niggling performance issues that require ever
more attention and control (such as managing memory footprint)
from programmers.

Jay

Bruce Hoult

unread,
Apr 4, 2000, 3:00:00 AM4/4/00
to
In article <8calkq$3un$1...@slb1.atl.mindspring.net>, "James Van Buskirk"
<tor...@ix.netcom.com> wrote:

> I was trying to make a statement about my wierd religious views on
> language and benchmarks with my Fortran code, much like the thread
> originator (see, I'm on-topic.)

I've looked up some of your stuff over on c.l.fortran and I think we're
largely of a mind.

To me, the criteria for some feature to be in a language or not are:

- does it make life more convenient for the programmer? Can they as a
  result:
  - write less code to do the same thing
  - have less chance of making stupid errors
  - have an easier time coming back and understanding it later
  - still have the flexibility to do non-standard things without
    abandoning the construct altogether
- do we know, at least at the level of having been done in someone's
  PhD thesis, how to implement this construct at least as efficiently
  as the manually-written alternatives, and preferably far more
  efficiently in special cases.

Stroustrup's additions to C to get C++ are all of this nature, as were
Sussman and Steele's design of the Lisp dialect Scheme and the additions
and trimmings at Apple/CMU/Harlequin of those who took the best parts of
Scheme, SmallTalk and CLOS to create Dylan.


> For example I think it looks compact
> and readable to write the solution to a difference equation as
>
> solution = (/(sum(differences(:i), i = 1, size(differences))/)
>
> and since the above is so 'normal' in my view that compilers by
> now should recognize this idiom and carry out the solution with
> only size(differences)-1 additions and no memory allocation.

As usual, implementation can lag somewhat behind definition...


> Either of these optimizations would save a huge chunk
> of time and if all the code is there for the compiler to figure
> out it may be possible for a given compiler for a given language
> to do so and create tremendously fast code. This makes for a
> rather dubious benchmark in my view because one may really be
> testing whether the compiler just 'gets it' as regards one of
> these algebraic insights

Well, that is *exactly* the point of this benchmark: to see how well the
compiler can do on a compactly-expressed high-level description of an
algorithm so that the programmer *doesn't* have to spend the time finding
all the optimisations himself.


> rather than generates efficient code
> in the more general case where programmers typically assume
> that the compiler isn't smart enough to figure these things
> out and makes any algebraic simpifications by hand before
> presenting their code to the compiler. In the given case the
> typical programmer would be correct -- the number of Fortran
> compilers that as much as run the code as given to completion
> that I am aware of at this point is fewer than the number that
> encounter an internal compiler error, and I haven't seen any
> yet that have spotted either of the above simplifications.

Then perhaps it's a good example to submit to compiler vendors?

-- Bruce

Bruce Hoult

unread,
Apr 4, 2000, 3:00:00 AM4/4/00
to
In article <38E930F7...@jay.jay.org>, Jay <j...@jay.jay.org> wrote:

> [...] despicable sewer. Its a worthless leper field [...] worthless
> Computer "Science" theoretical masturbation [...] language design clowns
> [...] Scheme loneliness [...] pinheads at Sun [...] incompetent MIT
> Scheme looney [...] crackpot "isn't this neat" feature

I suspect that you're really not expecting a serious answer. However...


> Right. Using non-standard language features that are
> not in a majority of C/C++ compilers is basically
> programming incompetence.

That depends on your aims, surely? If you explicitly want portability to
something other than the several dozen OS's on which gcc runs, then yes,
using gcc extensions would be incompetence. It is, however, a *lot* more
portable than using the various extensions in MS VC++.

I don't use gcc extensions myself, because I tend to move code back and
forth between gcc, CodeWarrior and VC++. But I don't regard those who do
-- such as Andy Glew -- as necessarily incompetent.


> On the other hand, I would say not understanding
> the abilities and limitations of the current
> level of compiler optimization that is generally
> available and writing terrible hand optimized code
> even though it wouldn't run any faster than
> clean code is also incompetence.

I agree here.


> > It's hardly *my* problem if you can't grasp the use of a feature that is
> > present and commonly used in everything from Algol to Pascal to PL/I to
> > Modula-2 to Ada to Dylan to Perl to Scheme.
> >
>
> A basic understanding of this field would indicate
> that the field of computer language design is a despicable
> sewer. It's a worthless leper field barely acknowledged
> by Computer Scientists. The honest CS academics realize
> that they can't software engineer/program their way
> out of a paper bag and in general they have been holding
> down the whole Software Engineering field so as to
> force their own worthless Computer "Science"
> theoretical masturbation (or pure DARPA prostitution).
> Simply, most CS-ers are too incompetent at SE to know squat about
> good language design.

I'm a little confused here. You have a very low opinion of CS academics,
who you say often have very little real-world software engineering
experience. And at the same time your criticism of the language design
rests on the opinions of those whom you yourself hold in low esteem?


> That said, "Pascal" local functions are an anachronism
> (nested functions that inherit their scope). C got it right by accident
> (K&R were basically high level language design clowns).
> Pascal got it because Algol had it.
> Modula2 and Ada got it because Pascal had it, even though
> they had other superior language features that covered its uses.

What "superior language features" are those? The only possibilities I can
see are "objects", which neither Ada nor Modula-2 has. Built in support
for tasking is the only other common feature in Modula-2 and Ada that
isn't in the others mentioned, but I don't see how that relates.


> PL/I and Perl are language design crap.

Agreed.


> > > I mean I think I understand on a high level that this is serious scope
> > > abuse,
> >
> > Abuse? It's purely lexical scoping and downward function arguments.
> > There aren't even closures involved. Even the much maligned Pascal can do
> > this.
>
> Nope, he is right; it's pretty much anti-SE scope abuse.

So what scopes do you want? Local and global and that's it? Not even
structs/objects? Ick.


> "comp.arch" seems a weird place to argue high-level language design
> issues. Even more ironic is that increasingly complex architectures
> are making niggling performance issues that require even
> more attention and control (such as manipulating memory footprint)
> by programmers.

The pioneers in any new field of architecture will of course painstakingly
hand-write their assembler to get that control, but the people who follow
will expect their compilers to do it for them. The more abstract the
language, the more scope the compiler has to take advantage of the
characteristics of future machines -- whatever they are. Yes, that's
*hard*, and in the past hasn't been done well. My thesis, however, is
that compilation techniques for high-level languages have now, in the real
world not just in academic dreams, caught up with that can be done with a
low level language such as C. I've been programming real-world solutions
to my employers' and clients' problems in C for fifteen years now and it's
time it was replaced. C++ solved a lot of real SE problems, but at a
*horrific* cost in language complexity -- I can write and understand it
just fine but I have a devil of a time finding others who can. Java is
cleaner but a huge step backwards in so many respects.

-- Bruce

Bernd Paysan

unread,
Apr 4, 2000, 3:00:00 AM4/4/00
to
Bruce Hoult wrote:
> But gcc is a crap compiler! You don't want to use that!

GCC's backend has to catch up, because development slowed down under
Kenner. GCC's extensions (designed under RMS) are fine. I've worked with
other compilers, and while GCC doesn't have the best backend anymore, it
is still the best compiler for cross-platform projects. And some of the
extensions are crucial if you want performance, good or bad backend
notwithstanding (e.g. you can't write a fast portable Forth interpreter
without labels as values). And since the EGCS effort took off, there is
again rapid development.

> Actually, I didn't know that gcc did that. Is there any real advantage
> over C++ inline functions?

Difficult to say. A compiler can still reduce function pointers to
static function calls when inlining, and strip off the reference
character of a &var parameter, if it's only used as "call by name"
inside the inlined function.

Bernd Paysan

unread,
Apr 4, 2000, 3:00:00 AM4/4/00
to
Bruce Hoult wrote:
> That depends on your aims, surely? If you explicitly want portability to
> something other than the several dozen OS's on which gcc runs, then yes,
> using gcc extensions would be incompetence. it is, however, a *lot* more
> portable than using the various extensions in MS VC++.

AFAIK GCC runs on a lot more platforms than "several dozen". Gforth e.g.
relies heavily on GCC, and for those platforms we don't have a GCC, we
do a native port (i.e. use assembler), because on those platforms, using
C doesn't make sense at all (they are performance-impaired 8 bit
microprocessors).

Andy Glew

unread,
Apr 4, 2000, 3:00:00 AM4/4/00
to
> I don't use gcc extensions myself, because I tend to move code back and
> forth between gcc, CodeWarrior and VC++. But I don't regard those who do
> -- such as Andy Glew -- as necessarily incompetent.

I oscillate on using GCC extensions, usually veering towards
not using them.

Pro: GCC exists on all the platforms I need to access.
GCC extensions are a hell of a lot more portable than
standard C++ --- almost no compilers support the full C++
standard

Con: but GCC is a pretty poor compiler. Using an extension
makes it impossible to use a better optimizer, or to use
some neat non-free tools, like code browsers, etc.

For example, last time I checked, 64-bit "long long" was
not in standard C/C++ --- but I won't give it up. Fortunately,
64 bit long long is supported by most C++ compilers.
Actually, I try to use the ANSI C standard uint64_t, etc.,
anticipating that they will make it into standard C++
at some point (unless C++ dies completely, as seems likely).
Just checked: there *is* one (1) use of long long in the C++
standard, but it is not defined. This implies that it may be
in the ANSI C standard, which I do not have online in a searchable
form.

GCC extensions like statement expressions, and typeof
are *really* *nice*. They make code much more maintainable.
But, I almost never use them. I do try to campaign to get them
accepted by the standard.

I do admit to using varargs macros, suitably #ifdeffed for GCC.
Mainly for one thing: debug and trace functions:
#define eprintf(fmt,args...) fprintf(stderr,fmt, ##args)
(or whatever).
This is just easier than having to link in a separate varargs
function for every proxy like this. "Keeping it in the header"
has distinct advantages.

Similarly, I use __FUNCTION__ and __PRETTY_FUNCTION__
in debug output.

I'm willing to use zero length arrays. The corresponding feature is
called a "flexible array member" in the most recent C standard. This illustrates
the dynamics of standard: zero length arrays were de-facto
in pre ANSI C standard days, but they did not make it into
the ANSI C standard; they did not get into C++, and, in fact,
many C++ compilers took steps to make them impossible;
but they have made it into the most recent C standard.
They're a good idea, corresponding to real machines.
If everyone programmed strictly to the standard, no progress
would be made.

I am constantly on the verge of using GCC's
#pragma interface
#pragma implementation
Standard C++ is a wasteful and redundant language;
declaring and defining things in two places is
redundant. GCC has taken steps to reduce this
redundancy.

Of the extensions I do use, they are mostly #ifdeffed.
This makes it less hassle to maintain my programs under
GCC than under other compilers, but I still can use other
compilers, albeit with some loss of features.
*** varargs macros
*** long long

The extensions I do not use are mostly un-ifdeffable
--- if you use ifdefs, you would be missing the point.
*** interface/implementation
*** typeof


Jonathan Thornburg

Apr 4, 2000
This thread started with the observation that for a particular toy
benchmark,
[I'm using Hennessy and Patterson (H&P)'s definition of
"toy benchmark": one with typically O(100) or fewer
source code lines, not derived from a "real" application,
taking no or almost no input, and hence producing
predictable-in-advance output.]
a particular Scheme implementation was 20 or so times faster than a
particular C implementation. The implicit (but clearly implied)
generalization was (is) that this performance ratio is likely to hold
for a wide range of other programs.

Unfortunately, one of the basic lessons about benchmarking we should
all have learned by now, is that toy benchmarks do *not* accurately
predict performance on "real-world" programs.
[See H&P for an extensive and authoritative discussion,
examples, etc. Their discussion is aimed at comparing
different _hardware_ platforms, but it's equally applicable
to comparing different _software_ environments.]
In fact, H&P argue that toy benchmarks have exactly one valid use:
teaching beginning students how to program.

Having said this, the claim that a given problem programmed in Scheme
is likely to run a lot faster than the same problem programmed in C,
is an interesting one. So far I haven't seen any substantive evidence
for it, but I'm open to being convinced otherwise. (If nothing else,
I have plenty of C and C++ code which I'd _love_ to see run 20+ times
faster! In fact, I'd quite happily settle for "just" a factor of 2.)

So... What would constitute _substantive_ evidence for Scheme's claimed
performance superiority over C/C++/etc? Well, how about taking some
existing _credible_ C/C++/Fortran (say) programs, reengineering them
in Scheme, and showing that the Scheme versions are smaller, faster,
more accurate, more robust in the face of ill-conditioned inputs,
have fewer bugs, or whatever.

Since the original "Scheme is 21 times faster than C" claim was for
a toy numerical benchmark, here are some suggested "credible numerical
programs" (in this case subroutines or subroutine libraries) for this
experiment. The Fortran, C, or C++ versions of all of these are available
from http://www.netlib.org and mirrors.
* Quadpack [numerical integration]
* ODEPACK or RKSUITE [ODE integration]
* the basic LU-decomposition routines from LAPACK [linear algebra]
* Dierckx or Pchip [spline fitting]
* subplex [unconstrained optimization via simplex method]
* Lawson-Hanson [least-squares]
* meschach, laso, y12m, Kundert [sparse-matrix linear algebra]

Or, for non-numerical code, how about Apache, GNU Make, or Lynx?
Or your favorite module of the Linux or {Open,Net,Free}BSD kernel?
(I'll nominate some of the TCP/IP stack, or if that's too big, how
about the Linux /dev/random driver and/or ipsec cryptography stack?)

The key point is that all of these are nontrivial programs, with
good reputations (i.e. the existing C/Fortran/C++/whatever versions
are good enough that improvements on them would be quite interesting),
well-defined semantics and freely-available source code to start the
reengineering process, and small enough (many only 1-10K lines of
C/C++/Fortran each) that (particularly with the claimed
higher-level-language-than-C/Fortran/etc programmer productivity
benefits) the experiment shouldn't take too long.

*This* experiment would be interesting, and unlike toy benchmarks,
it would actually tell us something useful about performance on other
real-world applications.

--
-- Jonathan Thornburg <jth...@galileo.thp.univie.ac.at>
http://www.thp.univie.ac.at/~jthorn/home.html
Universitaet Wien (Vienna, Austria) / Institut fuer Theoretische Physik
Software Development Axiom: "The first 90% of the code accounts for
the first 90% of the development time. The remaining 10% of the code
accounts for the other 90% of the development time." -- Tom Cargill

Larry Elmore

Apr 4, 2000
"Bruce Hoult" <bruce...@pobox.com> wrote in message
news:brucehoult-04...@bruce.bgh...

> In article <38E930F7...@jay.jay.org>, Jay <j...@jay.jay.org> wrote:
>
> > That said, "Pascal" local functions are an anachronism
> > (nested functions that inherit their scope). C got it right by accident
> > (K&R were basically high level language design clowns).
> > Pascal got it because Algol had it.
> > Modula2 and Ada got it because Pascal had it, even though
> > they had other superior language features that covered its uses.

What's so bad about local functions? They're one of the reasons I prefer Ada
to C/C++, but a minor one (a big reason is that Ada's generics are _vastly_
better than the abortion used by C++, IMHO (though still limited compared to
CLOS)).

> What "superior language features" are those? The only possibilities I can
> see are "objects", which neither Ada nor Modula-2 has. Built in support
> for tasking is the only other common feature in Modula-2 and Ada that
> isn't in the others mentioned, but I don't see how that relates.

Actually, Ada 95 does have full support for objects.

Larry

Jay

Apr 4, 2000
Bruce Hoult wrote:
>
> In article <38E930F7...@jay.jay.org>, Jay <j...@jay.jay.org> wrote:
>
> > [...] despicable sewer. Its a worthless leper field [...] worthless
> > Computer "Science" theoretical masturbation [...] language design clowns
> > [...] Scheme loneliness [...] pinheads at Sun [...] incompetent MIT
> > Scheme looney [...] crackpot "isn't this neat" feature
>
> I suspect that you're really not expecting a serious answer. However...

I just wanted to state my position on the state of this field.

I would really LOVE for there even to be a
respected "Roger Ebert" pundit for language design
in this field, or even "Crossfire"-like pundits in
language camps. But I don't see this; it's pretty much a bunch
of political bickering (and envy) that then turned
into silence and dismissal.

> > > It's hardly *my* problem if you can't grasp the use of a feature that is
> > > present and commonly used in everything from Algol to Pascal to PL/I to
> > > Modula-2 to Ada to Dylan to Perl to Scheme.
> > >
> >
> > A basic understanding of this field would indicate
> > that the field computer language design is a despicable
> > sewer. Its a worthless leper field barely acknowledged
> > by Computer Scientists. The honest CS academics realize
> > that they can't software engineer/program their way
> > out of a paper bag and in general they have been holding
> > down the whole Software Engineering field so as to
> > force their own worthless Computer "Science"
> > theoretical masturbation (or pure DARPA prostitution).
> > Simply, most CS-ers are too incompetent at SE to know squat about
> > good language design.
>
> I'm a little confused here. You have a very low opinion of CS academics,
> who you say often have very little real world software engineer
> experience. And at the same time your criticism of the language design
> rests on the opinions of those whom you yourself hold in low esteem?

The problem with CS is that language design
has been marginalized in the academic community
and is not studied, as the academics have mostly gone on to
other areas. The problem is not so much that they don't
have real world software engineering experience as that
they take the word "Science" in Computer Science too
seriously; this pushes them to study more abstract
"timeless" scientific and mathematical principles, trying
to be the next Von Neumann, rather than a dirty and
incremental subject such as SE and SE language design,
which is more project sociology and psychology (of code understanding
by humans).

> > That said, "Pascal" local functions are an anachronism
> > (nested functions that inherit their scope). C got it right by accident
> > (K&R were basically high level language design clowns).
> > Pascal got it because Algol had it.
> > Modula2 and Ada got it because Pascal had it, even though
> > they had other superior language features that covered its uses.
>

> What "superior language features" are those? The only possibilities I can
> see are "objects", which neither Ada nor Modula-2 has.

Ada and Modula2 had packages/modules, which are abstract data types
and are objects except that they lack runtime interface polymorphism
and, less importantly, code-sharing structural inheritance.
But they are hella old now. As far as nested functions go, packages
and modules provided superior capabilities.

> > PL/I and Perl are language design crap.
>
> Agreed.
>
> > > > I mean I think I understand on a high level that this is serious scope
> > > > abuse,
> > >
> > > Abuse? It's purely lexical scoping and downward function arguments.
> > > There aren't even closures involved Even the much maligned Pascal can do
> > > this.
> >
> > Nope he is right, its pretty much anti-SE scope abuse.
>
> So what scopes do you want? Local and global and that's it? Not even
> structs/objects? Ick.

There should primarily be only one scope for (non-constant) variables:
parameters and local variables. OO languages have an implicit
this (self) parameter on every object, with a shortcut
(*this).Variable --> Variable. Not declaring the "this" parameter
and the shortcut are for convenience. It would be trivial
to define a "C++" which required the first parameter to be
a variable "O" (object) of the class type; in fact
it would not be so bad, because you could declare
it constant there instead of the weird "void f(...) const;"
syntax. Except at the local block level inside
a method, scoping is now a much less relevant concept.
The scoping ("what does this code do") "puzzles" taught in CS
should be ridiculed as crap code.

Also, global variables should almost never be used.

> > "comp.arch" seems a weird place to argue high-level language design
> > issues. Even more ironic is that increasingly complex architectures
> > are making niggling performance issues that require even
> > more attention and control (such as manipulating memory footprint)
> > by programmers.
>
> The pioneers in any new field of architecture will of course painstakingly
> hand-write their assembler to get that control, but the people who follow
> will expect their compilers to do it for them.

Yes, people should expect excellence from their compilers, but they
also must know the limitations of the current (and near future)
compilers. I think the analogy breaks down a bit here
because I see the language field as mature or at least stagnant.

> The more abstract the
> language, the more scope the compiler has to take advantage of the
> characteristics of future machines -- whatever they are.
> Yes, that's > *hard*, and in the past hasn't been done well.

That's the problem. It doesn't help us in the here and now (or
the near future). The problem, of course, is that we have
seen too many bizarre whiz-bang languages greatly oversold
as the next silver bullet. What also irritates me is
that such new language features never seem to have
any Software Engineering analysis of why they are
desirable. That would be too "unscientific", I guess.

> My thesis, however, is
> that compilation techniques for high-level languages have now, in the real
> world not just in academic dreams, caught up with that can be done with a
> low level language such as C. I've been programming real-world solutions
> to my employers' and clients' problems in C for fifteen years now and it's
> time it was replaced. C++ solved a lot of real SE problems, but at a
> *horrific* cost in language complexity -- I can write and understand it
> just fine but I have a devil of a time finding others who can. Java is
> cleaner but a huge step backwards in so many respects.

Agreed, on all points with respect to C, C++ and Java.
But, I would say that many of these compiler techniques
and language design features have been around forever,
but their use has been hampered by the
fact that the language field is more politics
and religion than technically serious.

Jay

David Hoyt

Apr 4, 2000
Jay wrote:

> The problem with CS is that language design
> has been been marginalized in the academic community
> and is not studied as the academics have mostly gone on to
> other areas. The problem is not so much that they don't
> have real world software engineering experience is that
> they take the word "Science" in Computer Science too
> seriously, this pushes them to study more abstract
> "timeless" scientific and mathematical principles and trying
> to be the next Von Neumann type of stuff, than a dirty and
> incremental subject such as SE and SE language design
> which is more project sociology and psychology (of code understanding
> by humans).

Golly, what a troll. There are plenty of (computer) language people
dealing with real issues. Self and Smalltalk are very good examples
of sociology-inspired languages. David Ungar and both of the
Wirfs-Brocks have spent a fair amount of time and energy on
teaching, software engineering, communication and other
development-related issues. David and Allen have shown how to
make maintainable code real fast (or fit into really small places).
Rebecca spends a lot of time on teaching real world software
development skills to students. While they all work in industry,
David especially has a pretty big impact via academia.

I have a hard time believing that I am justifying academia here,
it's not my usual shtick; but if there is one place where I think
there is a huge amount of good work going on, it's in language
design and compiler technology. Try reading SigPOPL, related
conferences and the like. There is a lot of good (real) stuff going
on. It's hardly the fault of the academics that superior software-
engineering languages like Dylan, Smalltalk or ML find fewer
users than Cobol or C.

david | dh...@acm.org


Bruce Hoult

Apr 5, 2000
In article <38EA4EC0...@jay.jay.org>, Jay <j...@jay.jay.org> wrote:

> I would really LOVE for there to even be a
> respected "Roger Ebert" pundit for language design
> in this field or even a "Crossfire" like pundits in
> language camps. But I don't see this, its pretty much a bunch
> of political bickering (and envy) that then turned
> into silence and dismissal.

I don't get to see US TV very often, but I've seldom or never seen
anything productive come from "Crossfire". Film criticism is a lot easier
than language criticism in that the range of what people expect from the
film is a lot narrower.


> > > That said, "Pascal" local functions are an anachronism
> > > (nested functions that inherit their scope). C got it right by accident
> > > (K&R were basically high level language design clowns).
> > > Pascal got it because Algol had it.
> > > Modula2 and Ada got it because Pascal had it, even though
> > > they had other superior language features that covered its uses.
> >
> > What "superior language features" are those? The only possibilities I can
> > see are "objects", which neither Ada nor Modula-2 has.
>
> Ada and Modula2 had packages/modules which are abstract data types
> and are object except they lack runtime interface polymorphism
> and less importantly code sharing structural inheritance.
> But they are hella old now. As far as nesting functions, packages
> and modules provided superior capabilities.

The really funny thing here is that packages/modules/objects are *exactly*
semantically equivalent to closures.


> > So what scopes do you want? Local and global and that's it? Not even
> > structs/objects? Ick.
>
> There should primarily be only one scope for (non-constant) variables:
> parameters and local variables. OO languages have an implicit
> this (self) parameter on every object with a short cut
> (*this).Variable --> Variable. Not declaring the "this" parameter
> and the short cut are for convenience. It would be trivial
> to define a "C++" which required the first parameter to be
> a variable "O" (object) of the class type, in fact
> it would not be so bad because you could declare
> it constant there instead of the weird "void f(...) const;"
> syntax.

Not *all* OO languages have this strange distinction of the 'this'
parameter -- in Dylan and CLOS, for example, the method to call is
determined by the runtime type of *all* the arguments.

And, once again, once you accept the need for "object" scope, and for
block scope inside a function, I don't see how you can complain about
nested functions and closures which are *exactly* the same thing as
objects, just more conveniently expressed.


> Except at the local block level inside
> a method, scoping is now a pretty a much less relevant concept.
> The scoping ("what does this code do") "puzzles" taught in CS
> should be ridiculed as crap code.

Of course code should be written so as to be understandable. But people
should also understand the language rules and a "what does this code do"
puzzle is a good test of that understanding. It may be, of course, that
the rules for some particular language are too complex, and *that* is what
makes the code hard to understand. In fact for C++ that is almost
certainly the case. I'm certainly in favour of languages which are far
simpler (and yet more powerful) than C++.


> Also, global variables should almost never be used.

Agreed.


> > The pioneers in any new field of architecture will of course painstakingly
> > hand-write their assembler to get that control, but the people who follow
> > will expect their compilers to do it for them.
>
> Yes, people should expect excellence from their compilers, but they
> also must know the limitations of the current (and near future)
> compilers. I think the analogy breaks down a bit here
> because I see the language field as mature or at least stagnant.

I suspect that just about every sensible (and many not) language design
idea has been tried *somewhere*. Hell, most of them had been tried by
1980. What is not at all stagnant is the *implementation* of these ideas,
and the selection of which ideas make a language simple, powerful, easy to
use, and capable of being efficiently implemented.


> > The more abstract the
> > language, the more scope the compiler has to take advantage of the
> > characteristics of future machines -- whatever they are.
> > Yes, that's > *hard*, and in the past hasn't been done well.
>
> Thats the problem. It doesn't help us in the here and now (or
> the near future).

Well I think that's where you're wrong. Stay tuned.

-- Bruce

Bruce Hoult

Apr 5, 2000
In article <8ccvob$rf2$1...@mach.thp.univie.ac.at>,
jth...@mach.thp.univie.ac.at (Jonathan Thornburg) wrote:

> This thread started with the observation that for a particular toy
> benchmark,
> [I'm using Hennessy and Patterson (H&P)'s definition of
> "toy benchmark": one with typically O(100) or fewer
> source code lines, not derived from a "real" application,
> taking no or almost no input, and hence producing
> predictible-in-advance output.]
> a particular Scheme implementation was 20 or so times faster than a
> particular C implementation. The implicit (but clearly implied)
> generalization was (is) that this performance ratio is likely to hold
> for a wide range of other programs.

No I don't think that's the claim. At least, it's certainly not *my*
claim, and I started the thread :-) What I hope people are taking from
the 20:1 claim on a particular benchmark is not that Scheme is faster than
C in any absolute sense -- which of course can't be true given that the
Scheme compiler in question is compiling to C -- but that there is no
reason for Scheme to be *slower* than C, and that speed is not therefore a
reason to avoid it.

There are a large number of reasons why you might program in a high level
language such as Scheme, Common LISP, Dylan, CAML or Smalltalk. These
include such things as shorter, easier to understand code; better
type-safety and memory management; overflow and bounds checks; and the
ability to express more powerful abstractions.

The traditional reasons *not* to use these languages have been:
- unfamiliar syntax
- slow speed
- implementations not available on the platforms you want
- poor integration with the OS and other legacy code
- multi-megabyte "hello world"

All of these have been addressed recently: unfamiliar syntax by Dylan,
widely available implementations by compiling to C, integration by
providing a C FFI and compiling to C, huge executables by removing
features such as "eval" that require the development environment at
runtime (and also by letting C++ catch up -- g++ makes half-megabyte
images for trivial programs).


> Unfortunately, one of the basic lessons about benchmarking we should
> all have learned by now, is that toy benchmarks do *not* accurately
> predict performance on "real-world" programs.
> [See H&P for an extensive and authoratative discussion,
> examples, etc. Their discussion is aimed at comparing
> different _hardware_ platforms, but it's equally applicable
> to comparing different _software_ environments.]
> In fact, H&P argue that toy benchmarks have exactly one valid use:
> teaching beginning students how to program.

I've got, read, and taken to heart both H&P and P&H.

Speed differences on toy benchmarks of course don't carry through
unchanged to larger programs. Once you start dealing with out-of-cache
data, paging, disk or network I/O the difference in speed between a fast
and a slow language implementation largely goes away. Which is why things
such as Perl are doing so well now. Hell, even interpreting a different
ISA is getting close to being a "below the noise threshold" thing, if the
interpreter fits in cache and the actual purpose of the program doesn't.
See the JVM, VirtualPC, VMWare, Apple's 68K emulator, and all the other
interpreter/JIT compiler efforts which are becoming common.

The interesting thing is that most programs contain *parts* which are
effectively toy benchmarks, and this is where a lot of the real-world
speed differences come from. The overall speed ratios aren't anywhere
near the ones seen on toy benchmarks, but they are big enough that they
are a worry. There are huge numbers of special-purpose high level
languages out there that are just great as long as what you are doing fits
into what the designer of the language expected you to do, but as soon as
you leave that area it all falls apart speed-wise. This includes most
so-called "4th generation" languages and languages that have a compiled
library but whose user code is interpreted.

I've tried using those special-purpose languages, and I've always in the
end, after much frustration, ended up coming back to C/C++ -- hell I'm
probably one of the few heretics who prefers to write his web CGIs in C++
(with a library that I've built up over time) rather than in Perl.


> Having said this, the claim that a given problem programmed in Scheme
> is likely to run a lot faster than the same problem programmed in C,
> is an interesting one. So far I haven't seen any substantive evidence
> for it, but I'm open to being convinced otherwise. (If nothing else,
> I have plenty of C and C++ code which I'd _love_ to see run 20+ times
> faster! In fact, I'd quite happily settle for "just" a factor of 2.)

The claim that all programs run faster in Scheme than in C is a straw
man. If, however, you'd like to see some evidence then all you have to do
is download Siskind's "stalin" Scheme compiler and try it for yourself.
It comes as a (pretty huge) single C file which you just have to compile
using your local C compiler and you're away.

Agreed, and I've noted those programs as ones to look at for my own work.

-- Bruce

Jay

Apr 5, 2000
David Hoyt wrote:

>
> Jay wrote:
>
> > The problem with CS is that language design
> > has been been marginalized in the academic community
> > and is not studied as the academics have mostly gone on to
> > other areas. The problem is not so much that they don't
> > have real world software engineering experience is that
> > they take the word "Science" in Computer Science too
> > seriously, this pushes them to study more abstract
> > "timeless" scientific and mathematical principles and trying
> > to be the next Von Neumann type of stuff, than a dirty and
> > incremental subject such as SE and SE language design
> > which is more project sociology and psychology (of code understanding
> > by humans).
>
> Golly, what a troll. There are plenty of (computer) language people
> dealing with real issues. Self and Smalltalk are very good examples
> of sociology inspired languages.

Smalltalk? Self? What incompetently SE-designed crap!
Let's throw away static type checking! We must be geniuses!
Sociologically "inspired" maybe, but the result is basic
SE language design incompetence. And weren't these made
at Sun and Xerox PARC, not academia?

> David Unger and both of the
> Wirth-Brocks have spent a fair amount of time and energy on
> teaching, software engineering, communication and other
> development related issues. David and Alan have shown how to
> make maintainable code real fast (or fit into really small places).
> Rebecca spends a lot of time on teaching real world software
> development skills to students. While they all work in industry,
> David especially, has a pretty big impact via academia.

What percentage of "elite" CS academia is working on software engineering
and SE language design (<5%)??? How many BS (or even MS/PhD) software
engineering degrees are offered at big colleges even now?
I tell you first hand that the CS academics mostly don't give a fuck about
SE and that CS graduates are basically totally SE incompetent
when they get their degree.

Let me see: they mostly all work in industry....
What impact do they have on academia? Let me see:
academia is pretty much doing/teaching C/C++, and
poorly at that. Looks like zip to me.

> I have a hard time believing that I am justifying academia here,
> it's not my usual stick; but if there is one place where I think
> there is a huge amount of good work going on, its in language
> design and compiler technology. Try reading SigPOPL, related
> conferences and the like. There is a lot of good (real) stuff going
> on.

Worthless masturbation and minutiae:

You guys don't believe me go to

http://www.cs.princeton.edu/~appel/popl99/program.html

and see for yourself.

> It's hardly the fault of the academics that superior software-
> engineering languages like Dylan, Smalltalk or ML find fewer
> users than Cobol or C.

Dylan, Smalltalk and ML, what a load of crap!
Watch them go nowhere. I am most open
to ML, but I have never seen a definitive
answer as to why "type inferencing" is a critical
language feature for SE reasons. Given that
the vast majority of programmers are total
bozos (this is a shit-hole field with zero
qualifications and no SE degrees),
"flexibility" (say, over generics/templates)
is a double-edged sword.

One must separate academic fantasies in a stale
(almost dead) field from reality.

Jay

Jay

Apr 5, 2000
Bruce Hoult wrote:
>
> In article <38EA4EC0...@jay.jay.org>, Jay <j...@jay.jay.org> wrote:
>
> > I would really LOVE for there to even be a
> > respected "Roger Ebert" pundit for language design
> > in this field or even a "Crossfire" like pundits in
> > language camps. But I don't see this, its pretty much a bunch
> > of political bickering (and envy) that then turned
> > into silence and dismissal.
>
> I don't get to see US TV very often, but I've seldom or never seen
> anything productive come from "Crossfire". Film criticism is a lot easier
> than language criticism in that the range of what people expect from the
> film is a lot narrower.
>
> > > > That said, "Pascal" local functions are an anachronism
> > > > (nested functions that inherit their scope). C got it right by accident
> > > > (K&R were basically high level language design clowns).
> > > > Pascal got it because Algol had it.
> > > > Modula2 and Ada got it because Pascal had it, even though
> > > > they had other superior language features that covered its uses.
> > >
> > > What "superior language features" are those? The only possibilities I can
> > > see are "objects", which neither Ada nor Modula-2 has.
> >
> > Ada and Modula2 had packages/modules which are abstract data types
> > and are object except they lack runtime interface polymorphism
> > and less importantly code sharing structural inheritance.
> > But they are hella old now. As far as nesting functions, packages
> > and modules provided superior capabilities.
>
> The really funny thing here is that packages/modules/objects are *exactly*
> semantically equivalent to closures.

Semantics? Jeez. Who gives a rat's ass about semantics?
Not really relevant to the thought processes of a programmer on the street.

>
> > > So what scopes do you want? Local and global and that's it? Not even
> > > structs/objects? Ick.
> >
> > There should primarily be only one scope for (non-constant) variables:
> > parameters and local variables. OO languages have an implicit
> > this (self) parameter on every object with a short cut
> > (*this).Variable --> Variable. Not declaring the "this" parameter
> > and the short cut are for convenience. It would be trivial
> > to define a "C++" which required the first parameter to be
> > a variable "O" (object) of the class type, in fact
> > it would not be so bad because you could declare
> > it constant there instead of the weird "void f(...) const;"
> > syntax.
>

> Not *all* OO languages have this strange distinction of the 'this'
> parameter -- in Dylan and CLOS, for example, the method to call is
> determined by the runtime type of *all* the arguments.

Yawn. Dylan and CLOS are crackpot languages on the fringes with
close to zip use. So now you want us to look at the world from
the theoretical views of Dylan and CLOS. Bzzzt.

> And, once again, once you accept the need for "object" scope,

I don't need to think in object scope terms with
most "OO" languages.

> and for block scope inside a function,

Declare a temporary local variable within a block or at the
point it is used. Not exactly an earth-shaking
concept, and since subroutines should be short
and fit on a screen in well-engineered code, how complex
can it get?

> I don't see how you can complain about
> nested functions and closures which are *exactly* the same thing as
> objects, just more conveniently expressed.

An orthogonality argument: you have nesting in the small (blocks or
for-loops, etc.) so you must have nesting everywhere. Unfortunately,
nesting in the small is okay, but nesting in the large (if say
a program grows) becomes complex and hard to read.
Programmer laziness starts to become an issue: small
code may start under a screen and grow to over a screen.

My position is that orthogonality is a bad language-design argument,
and that one must look at each case/combination to see if you allow
something or not, for SE reasons.

> > Except at the local block level inside
> > a method, scoping is now a pretty a much less relevant concept.
> > The scoping ("what does this code do") "puzzles" taught in CS
> > should be ridiculed as crap code.
>

> Of course code should be written so as to be understandable. But people
> should also understand the language rules and a "what does this code do"
> puzzle is a good test of that understanding. It may be, of course, that
> the rules for some particular language are too complex, and *that* is what
> makes the code hard to understand. In fact for C++ that is almost
> certainly the case. I'm certainly in favour of languages which are far
> simpler (and yet more powerful) than C++.

I am also a simple-language fan, yet I would make it less flexible
than C++. Flexibility breeds complexity and "code that goes over
the heads" of average programmers (isn't this trick cool).
IMO, the goal should be to restrict, simplify and dumb down
languages (Java looked promising, but I don't like what
Sun is doing with it). I really don't see programming nirvana
coming from "mathematically elegant" language design.

Jay

Bruce Hoult

Apr 5, 2000, 3:00:00 AM
In article <38EA954B...@jay.jay.org>, Jay <j...@jay.jay.org> wrote:

> > The really funny thing here is that packages/modules/objects are *exactly*
> > semantically equivalent to closures.
>
> Semantics? Jeez. Who gives a rat's ass about semantics?
> Not really relevant to the thought processes of a programmer on the street.

Once you've learned the basic syntax of a language, the meaning of a
construct is the only thing which is important. You're trying to get a
job done, which means that what you're doing is trying to make a program
with the same meaning as your specification.

Anything which adds to that is useful; anything which subtracts from that
is harmful.


> > Not *all* OO languages have this strange distinction of the 'this'
> > parameter -- in Dylan and CLOS, for example, the method to call is
> > determined by the runtime type of *all* the arguments.
>
> Yawn. Dylan and CLOS are crackpot languages on the fringes with
> close to zip use. So now you want us to look at the world from
> the theoretical views of Dylan and CLOS. Bzzzt.

Mere unsupported assertion. Every new language starts without users. I
was using C++ when it was in the same situation and it wasn't at all clear
whether it was possible to convince all the existing C programmers to use
it and now look at it.


> > And, once again, once you accept the need for "object" scope,
>
> I don't need to think in object scope terms with
> most "OO" languages.

Bullshit. What are all those names that aren't declared locally and
aren't globals either? Are they in the current class, or in one of the
superclasses? And which one? You've obviously never programmed using one
of the typical C++ GUI frameworks.


> > and for block scope inside a function,
>
> Declare a temporary local variable within a block or at the
> point it is used. Not exactly an earth-shaking
> concept, and since subroutines should be short
> and fit on a screen in well-engineered code, how complex
> can it get?

And if a function, including any nested functions, fits on a screen then
how complex can it get? There's no difference whatever in terms of
complexity of understanding between a nested function and a nested loop.


> > I don't see how you can complain about
> > nested functions and closures which are *exactly* the same thing as
> > objects, just more conveniently expressed.
>
> An orthogonality argument: you have nesting in the small (blocks,
> for-loops, etc.), so you must have nesting everywhere. Unfortunately,
> nesting in the small is okay, but nesting in the large (if, say,
> a program grows) becomes complex and hard to read.
> Programmer laziness then starts to become an issue: small
> code may start under a screen and grow to over a screen.

Then it's time to refactor.

Nesting, like anything else, can be overdone. That doesn't mean that it
shouldn't be possible at all.


> My position is that orthogonality is a bad language-design argument,
> and one must look at each case/combination to see whether you allow
> something or not, for SE reasons.

Which is *exactly* what the designers of Dylan did.

-- Bruce

Jay

Apr 5, 2000, 3:00:00 AM
Bruce Hoult wrote:
>
> In article <38EA954B...@jay.jay.org>, Jay <j...@jay.jay.org> wrote:
>
> > > The really funny thing here is that packages/modules/objects are *exactly*
> > > semantically equivalent to closures.
> >
> > Semantics? Jeez. Who gives a rat's ass about semantics?
> > Not really relevant to the thought processes of a programmer on the street.
>
> Once you've learned the basic syntax of a language, the meaning of a
> construct is the only thing which is important. You're trying to get a
> job done, which means that what you're doing is trying to make a program
> with the same meaning as your specification.

All programmers need is an intuitive understanding of what a language
feature does. When you link semantics with closures, I must
assume you are talking about a more theoretical definition of
semantics. Closures are really not a necessary concept
for understanding objects well enough to do software. That these
concepts can be theoretically linked is a good example
of mathematical masturbation and not focusing on SE fundamentals
(dumbass programmers).


> Anything which adds to that is useful; anything which subtracts from that
> is harmful.
>

> > > Not *all* OO languages have this strange distinction of the 'this'
> > > parameter -- in Dylan and CLOS, for example, the method to call is
> > > determined by the runtime type of *all* the arguments.
> >
> > Yawn. Dylan and CLOS are crackpot languages on the fringes with
> > close to zip use. So now you want us to look at the world from
> > the theoretical views of Dylan and CLOS. Bzzzt.
>

> Mere unsupported assertion. Every new language starts without users. I

> was using C++ when it was in the same situation and it wasn't at all clear
> whether it was possible to convince all the existing C programmers to use
> it and now look at it.

I predict doom for those languages; C++ only succeeded because it
was seen as THE new version of C, and C was dominant.

>
> > > And, once again, once you accept the need for "object" scope,
> >
> > I don't need to think in object scope terms with
> > most "OO" languages.
>

> Bullshit. What are all those names that aren't declared locally and
> aren't globals either? Are they in the current class, or in one of the
> superclasses? And which one? You've obviously never programmed using one
> of the typical C++ GUI frameworks.

That's not scoping, that's inheritance and type extension.

> > > and for block scope inside a function,
> >
> Declare a temporary local variable within a block or at the
> point it is used. Not exactly an earth-shaking
> concept, and since subroutines should be short
> and fit on a screen in well-engineered code, how complex
> can it get?
>

> And if a function, including any nested functions, fits on a screen then
> how complex can it get? There's no difference whatever in terms of
> complexity of understanding between a nested function and a nested loop.

(1) Scoping makes all the parameters (including the implicit this)
and locals available to the nested function, thus making its behavior
dependent on the surrounding scope rather than making it an
independent function. I find it easier to understand
functions that are passed all the variables they use instead
of relying on their scope. My theory is that
some programmers like to use scoping because they
are too lazy to pass parameters.

(2) I don't like separating the function's parameter list
from its code with inserted functions, and it's easy to
just move the function above the current one. It's then
independent and thus has a higher chance of being
reused.

(3) My belief is that a nested loop is actually more easily
understood by most programmers, especially if the programmers
start doing "interesting" things with recursion.

Basically, just unnest it!




> > > I don't see how you can complain about
> > > nested functions and closures which are *exactly* the same thing as
> > > objects, just more conveniently expressed.
> >
> > An orthogonality argument: you have nesting in the small (blocks,
> > for-loops, etc.), so you must have nesting everywhere. Unfortunately,
> > nesting in the small is okay, but nesting in the large (if, say,
> > a program grows) becomes complex and hard to read.
> > Programmer laziness then starts to become an issue: small
> > code may start under a screen and grow to over a screen.
>

> Then it's time to refactor.

Basic SE factoid: programmers are too lazy and incompetent to refactor.

>
> Nesting, like anything else, can be overdone. That doesn't mean that it
> shouldn't be possible at all.

I always hate the no-Nanny-ism argument. I look at language design
as a budget: you don't want to go over budget with too much complexity
(degrees of freedom). You look at a feature, where it would
be convenient to use, and whether you think that the applications
justify its inclusion. My feeling is that this
feature is not really necessary for most programming
and is subject to abuse, so it looks like a good place to make a budget
cut. If you don't make these types of hard decisions you get C++.

> > My position is that orthogonality is a bad language-design argument,
> > and one must look at each case/combination to see whether you allow
> > something or not, for SE reasons.
>

> Which is *exactly* what the designers of Dylan did.

I should look at Dylan then; it's just so depressing
because such new languages almost always fail.

Jay

Paul Hsieh

Apr 5, 2000, 3:00:00 AM
bruce...@pobox.com says...

> jth...@mach.thp.univie.ac.at (Jonathan Thornburg) wrote:
> > This thread started with the observation that for a particular toy
> > benchmark,
> > [I'm using Hennessy and Patterson (H&P)'s definition of
> > "toy benchmark": one with typically O(100) or fewer
> > source code lines, not derived from a "real" application,
> > taking no or almost no input, and hence producing
> > predictible-in-advance output.]
> > a particular Scheme implementation was 20 or so times faster than a
> > particular C implementation. The implicit (but clearly implied)
> > generalization was (is) that this performance ratio is likely to hold
> > for a wide range of other programs.
>
> No I don't think that's the claim. At least, it's certainly not *my*
> claim, and I started the thread :-) What I hope people are taking from
> the 20:1 claim on a particular benchmark is not that Scheme is faster than
> C in any absolute sense -- which of course can't be true given that the
> Scheme compiler in question is compiling to C -- [...]

Well, what I am hoping for is that my previous post was not ignored,
though unfortunately it seems it was. This 21:1 claim just looks like a
case of poor benchmarking. My belief is that the code is no better than
1:1 against *real* C code in the example given (though I could only
really prove that it could not be better than 3:1 on an Athlon using
WATCOM C/C++, and it would only achieve even that if it somehow performed
better scheduling than the base C compiler.)

> [...] but that there is no reason for Scheme to be *slower* than C,

> and that speed is not therefore a reason to avoid it.

That isn't supported by anything posted thus far. No fair comparisons
that I know of have yet been fully explored (I did not get a chance to
try a Scheme compile using WATCOM C/C++ as a back end on my Athlon.)



> There are a large number of reasons why you might program in a high level
> language such as Scheme, Common LISP, Dylan, CAML or Smalltalk. These
> include such things as shorter, easier to understand code.

That is too much of a generalization. Scheme is a derivative of Lisp,
after all. This in itself can pose problems for algorithms that are
irregular or non-recursive.

(I can repost my chess program's inner loop, if you are looking for a
real challenge in programming for clarity. My C implementation has a
couple of gotos strategically placed. While they can be removed, I don't
see how to do it without impacting performance.)

> [...] Better type-safety and memory management.

Does it support programmatically aligned memory allocations?

> [...] Overflow and bounds checks.

Well there goes any credible claims about performance.

> [...] Ability to express more powerful abstractions.

And you don't think a "slow" language exacerbates this problem by
requiring larger code footprints, having no mechanisms for aligning data,
and having potentially worse problems with register spill/fill?

> [...] Which is why things such as Perl are doing so well now.

Well, hang on there. I don't know of any head-to-head speed trials
between C and Perl programs. If people are using Perl, it's because they
don't care about performance. Nobody believes that Perl in and of itself
has acceptable performance versus C (or anything compiled, for that
matter.)

> [...] Hell, even interpreting a different
> ISA is getting close to being a "below the noise threshold" thing, if the
> interpreter fits in cache and the actual purpose of the program doesn't.
> See the JVM, VirtualPC, VMWare, Apple's 68K emulator, and all the other
> interpreter/JIT compiler efforts which are becoming common.

Whoa there, Nelly. JIT compilers are compilers. They don't insert type
checking, or better memory management, or anything like that. We are
talking about apples and oranges here. The reasons why interpreting ISAs
is interesting and showing some degree of promise in terms of
performance have nothing to do with higher-level languages like Scheme.

> > Having said this, the claim that a given problem programmed in Scheme
> > is likely to run a lot faster than the same problem programmed in C,
> > is an interesting one. So far I haven't seen any substantive evidence
> > for it, but I'm open to being convinced otherwise. (If nothing else,
> > I have plenty of C and C++ code which I'd _love_ to see run 20+ times
> > faster! In fact, I'd quite happily settle for "just" a factor of 2.)
>
> The claim that all programs run faster in Scheme than in C is a straw
> man. [...]

Then can we change that to "any programs"? I haven't seen the first one.

> [...] If, however, you'd like to see some evidence then all you have to do
> is download Siskind's "stalin" Scheme compiler and try it for yourself.
> It comes as a (pretty huge) single C file which you just have to compile
> using your local C compiler and you're away.

Does it target ANSI C (as opposed to gcc)?



> > So... What would constitute _substantive_ evidence for Scheme's claimed
> > performance superiority over C/C++/etc? Well, how about taking some
> > existing _credible_ C/C++/Fortran (say) programs, reengineering them
> > in Scheme, and showing that the Scheme versions are smaller, faster,
> > more accurate, more robust in the face of ill-conditioned inputs,
> > have fewer bugs, or whatever.
> >
> > Since the original "Scheme is 21 times faster than C" claim was for
> > a toy numerical benchmark, here are some suggested "credible numerical
> > programs" (in this case subroutines or subroutine libraries) for this
> > experiment. The Fortran, C, or C++ versions of all of these are available
> > from http://www.netlib.org and mirrors.
> > * Quadpack [numerical integration]
> > * ODEPACK or RKSUITE [ODE integration]
> > * the basic LU-decomposition routines from LAPACK [linear algebra]
> > * Dierckx or Pchip [spline fitting]
> > * subplex [unconstrained optimization via simplex method]
> > * Lawson-Hanson [least-squares]
> > * meschach, laso, y12m, Kundert [sparse-matrix linear algebra]

Well, why don't we start with Stream? That's a simple and short one.
Let's see just how fast Scheme's array-bounds-checking functionality is
compared with C's non-existent array bounds checking. (These examples
all look like they would take quite some time to port over.)

> > Or, for non-numerical code, how about Apache, GNU Make, or Lynx?
> > Or your favorite module of the Linux or {Open,Net,Free}BSD kernel?
> > (I'll nominate some of the TCP/IP stack, or if that's too big, how
> > about the Linux /dev/random driver and/or ipsec cryptography stack?)

Well, then we get all the same complaints that are heaped on SPEC, and
then some. I would instead propose implementation of some standard
algorithms such as (1) the Boyer-Moore string-searching algorithm, (2) a
hash-table lookup, (3) Huffman encoding, (4) maze solving.

And in each of these cases, I would want to avoid the
"specifically engineered" scenario that the 81-sample numerical
integration suffered from (the example looks far less sensible if you
want to take several thousand samples, which would be more realistic,
for example.)

> > The key point is that all of these are nontrivial programs, with
> > good reputations (i.e. the existing C/Fortran/C++/whatever versions
> > are good enough that improvements on them would be quite interesting),
> > well-defined semantics and freely-available source code to start the
> > reengineering process, and small enough (many only 1-10K lines of
> > C/C++/Fortran each) that (particularly with the claimed
> > higher-level-language-than-C/Fortran/etc programmer productivity
> > benefits) the experiment shouldn't take too long.

It's a nice idea, but I'd bet against getting any takers for *your*
challenges, because I think they are longer than they look when you take
into consideration that they are being ported to a language that requires
a different mindset than that of a C programmer.

> > *This* experiment would be interesting, and unlike toy benchmarks,
> > it would actually tell us something useful about performance on other
> > real-world applications.
>
> Agreed, and I've noted those programs as ones to look at for my own work.

Dude, if you are serious about establishing the credibility of such
efforts, you need to involve a larger audience, and that means starting
from even simpler examples.

--
Paul Hsieh
http://www.pobox.com/~qed/

Ketil Z Malde

Apr 5, 2000, 3:00:00 AM
jth...@mach.thp.univie.ac.at (Jonathan Thornburg) writes:

> * Quadpack [numerical integration]
> * ODEPACK or RKSUITE [ODE integration]
> * the basic LU-decomposition routines from LAPACK [linear algebra]
> * Dierckx or Pchip [spline fitting]
> * subplex [unconstrained optimization via simplex method]
> * Lawson-Hanson [least-squares]
> * meschach, laso, y12m, Kundert [sparse-matrix linear algebra]

I notice you don't seem to have any FFT in there, I wonder why that
is? :-)

Also, you point at several typical server applications, now, I can
think of one popular mail server that really should have been written
in a different language, not for speed, but for safety.

Tim Bradshaw

Apr 5, 2000, 3:00:00 AM
* Paul Hsieh wrote:

> That is too much of a generalization. Scheme is a derivative of Lisp,
> after all. This in itself can pose problems for algorithms that are
> irregular or non-recursive.

What? Learn something about Lisp before you post rubbish, please!
Common Lisp has the hairiest looping constructs you've ever seen, as
well as GO TO and an extremely powerful macro system if you want to
cook your own control constructs. Scheme, which tends to provide
general lower-level facilities rather than prepackaged tools like CL,
has call/cc from which control constructs of arbitrary complexity can
be created.

--tim

Jonathan Thornburg

Apr 5, 2000, 3:00:00 AM
In article <brucehoult-05...@bruce.bgh>,

Bruce Hoult <bruce...@pobox.com> wrote:
>g++ makes half-megabyte
>images for trivial programs).

Hmm. On an x86 GNU/Linux system, I just tried the following:

% cat >hello.cc
#include <stdio.h>

int main()
{
printf("hello, world\n");
return 0;
}
% cat >hello2.cc
#include <iostream>

int main()
{
cout << "hello, world\n";
return 0;
}
% g++ --version
2.95.2
% g++ -O -o hello hello.cc
% g++ -O -o hello2 hello2.cc
% ./hello
hello, world
% ./hello2
hello, world
% ls -l ./hello ./hello2
-rwxr-xr-x 1 jthorn hpc 12779 Apr 5 13:14 ./hello*
-rwxr-xr-x 1 jthorn hpc 12933 Apr 5 13:15 ./hello2*
%

(I also got essentially-identical results with the egcs-2.91.66 which
Red Hat ships with their 6.1 GNU/Linux release.)

Admittedly, this is only a single pair of trivial programs, but that
still sure doesn't look like half a megabyte to me. Perhaps your g++
was mistakenly built without shared library support?

--
-- Jonathan Thornburg <jth...@galileo.thp.univie.ac.at>
http://www.thp.univie.ac.at/~jthorn/home.html
Universitaet Wien (Vienna, Austria) / Institut fuer Theoretische Physik

Amount of all stock owned by the least wealthy 90% of America: 18%
Amount of all stock owned by the most wealthy 1% of America: 41%
-- Economic Policy Institute

Jonathan Thornburg

Apr 5, 2000, 3:00:00 AM
In article <8ccvob$rf2$1...@mach.thp.univie.ac.at>, I
(Jonathan Thornburg <jth...@galileo.thp.univie.ac.at>) suggested
JT= So... What would constitute _substantive_ evidence for Scheme's claimed
JT= performance superiority over C/C++/etc? Well, how about taking some
JT= existing _credible_ C/C++/Fortran (say) programs, reengineering them
JT= in Scheme, and showing that the Scheme versions are smaller, faster,
JT= more accurate, more robust in the face of ill-conditioned inputs,
JT= have fewer bugs, or whatever.
JT=
JT= Since the original "Scheme is 21 times faster than C" claim was for
JT= a toy numerical benchmark, here are some suggested "credible numerical
JT= programs" (in this case subroutines or subroutine libraries) for this
JT= experiment. The Fortran, C, or C++ versions of all of these are available
JT= from http://www.netlib.org and mirrors.
JT= * Quadpack [numerical integration]
JT= * ODEPACK or RKSUITE [ODE integration]
JT= * the basic LU-decomposition routines from LAPACK [linear algebra]
JT= * Dierckx or Pchip [spline fitting]
JT= * subplex [unconstrained optimization via simplex method]
JT= * Lawson-Hanson [least-squares]
JT= * meschach, laso, y12m, Kundert [sparse-matrix linear algebra]

In article <MPG.1354aa5f9...@news.pacbell.net>,
Paul Hsieh <DONT.qed...@pobox.com> replied
PH> Well, why don't we start with Stream? That's a simple and short one.

Unfortunately, stream (http://www.cs.virginia.edu/stream/), though
an excellent benchmark for measuring sustainable memory bandwidth as
seen from a high-level language program, isn't a very good _language_
benchmark: any at-least-reasonable language implementation should be
able to get comparable stream scores. (A few Fortran/C compilers have
boosted stream scores somewhat (I think on the order of 50%) by clever
prefetching, but that's still not a huge effect.

In fact, given today's main-memory/cache access time ratios of 30-100,
you can hide a lot of cpu-core (language and/or hardware) inefficiency
under stream's totally main-memory-bandwidth-dominated access patterns.

PH> Lets see just how fast Scheme's array bounds checking functionality is as
PH> compared with C's non-existent array bounds checking.

In particular, there should be no problem hiding array bounds checking
in stream, even with a dumb compiler which doesn't optimize it away.
(This is not an argument for or against array bounds checking itself,
merely a statement that its performance impact should be negligible
in stream.)

Jonathan Thornburg

Apr 5, 2000, 3:00:00 AM
In article <8ccvob$rf2$1...@mach.thp.univie.ac.at>,
I (Jonathan Thornburg <jth...@galileo.thp.univie.ac.at>) suggested
as credible benchmark programs for language comparisons,
>> * Quadpack [numerical integration]
>> * ODEPACK or RKSUITE [ODE integration]
>> * the basic LU-decomposition routines from LAPACK [linear algebra]
>> * Dierckx or Pchip [spline fitting]
>> * subplex [unconstrained optimization via simplex method]
>> * Lawson-Hanson [least-squares]
>> * meschach, laso, y12m, Kundert [sparse-matrix linear algebra]

In article <KETIL-vk1k...@eris.bgo.nera.no>,
Ketil Z Malde <ke...@ii.uib.no> asked


>I notice you don't seem to have any FFT in there, I wonder why that
>is? :-)

It's because I haven't personally done any FFTs in the last couple of
years, so I don't off-the-cuff know of any good FFT libraries. But a
quick search on netlib shows several of them, any of which would also
make interesting benchmarks.


>Also, you point at several typical server applications, now, I can
>think of one popular mail server that really should have been written
>in a different language, not for speed, but for safety.

Certainly a reengineering of sendmail would be of interest, as
would comparisons with other more-recently-written MTAs such as
Bernstein's qmail, Wietse's vmail, etc.

--
-- Jonathan Thornburg <jth...@galileo.thp.univie.ac.at>
http://www.thp.univie.ac.at/~jthorn/home.html
Universitaet Wien (Vienna, Austria) / Institut fuer Theoretische Physik

"The first strike in the American Colonies was in 1776 in Philadelphia,
when [...] carpenters demanded a 72-hour week." -- Anatole Beck

Jonathan Thornburg

Apr 5, 2000, 3:00:00 AM
In article <brucehoult-05...@bruce.bgh>,
Bruce Hoult <bruce...@pobox.com> wrote
> Once you start dealing with out-of-cache
> data, paging, disk or network I/O the difference in speed between a fast
> and a slow language implementation largely goes away.

I think there's an important point hidden here. For *some* programs,
Bruce's statement is or may be true. But for other programs, it's
spectacularly false.

> Which is why things
> such as Perl are doing so well now. Hell, even interpreting a different
> ISA is getting close to being a "below the noise threshold" thing, if the
> interpreter fits in cache and the actual purpose of the program doesn't.

Again, for *some* programs, this is true, and for others it's not.

For example, I've been working on 3 main pieces of code in the last month:
* One is written in Maple (a Lisp-like symbolic algebra system), and
typically takes half an hour to run on a very slow computer, but
doesn't get run very often. In practice this performance is "good
enough", and the program is doing a lot of symbolic manipulation
anyway, so changing to a lower-level language would be really dumb.
* One is written in Perl, and typically takes a minute or so to run
(mostly cpu-bound, though also doing a moderate amount of i/o) on
a midrange PC. This performance is also "good enough" in practice,
and Perl is about the right level of language, so this too is likely
to stay in its current form.
* One is written in a mixture of C++, C generated by the Maple code,
and Fortran, and typically takes a week or so (almost entirely cpu-bound)
to run a small test case on a midrange PC; this is clearly "too slow".
This program may get moved to a supercomputer soon, but will probably
stay in its present mix of languages for the time being. (Other
comparable programs are mostly C or Fortran.)
From my knowledge of what this program is doing (time-integrating
a very complicated system of partial differential equations), I think
Bruce's statement would be spectacularly false for this program. That
is, I think moving this program to a higher-level language(s) would make
it a lot slower. But I don't have any proof of this, and given that
I've spent a year's part-time work writing the current code, I'm
reluctant to start over again on a new version in a different language
just for the experiment. Sigh...


I suggested as good benchmarks for inter-language comparisons,


JT= * Quadpack [numerical integration]
JT= * ODEPACK or RKSUITE [ODE integration]
JT= * the basic LU-decomposition routines from LAPACK [linear algebra]
JT= * Dierckx or Pchip [spline fitting]
JT= * subplex [unconstrained optimization via simplex method]
JT= * Lawson-Hanson [least-squares]
JT= * meschach, laso, y12m, Kundert [sparse-matrix linear algebra]
JT=
JT= Or, for non-numerical code, how about Apache, GNU Make, or Lynx?
JT= Or your favorite module of the Linux or {Open,Net,Free}BSD kernel?
JT= (I'll nominate some of the TCP/IP stack, or if that's too big, how
JT= about the Linux /dev/random driver and/or ipsec cryptography stack?)

and Bruce replied


> Agreed, and I've noted those programs as ones to look at for my own work.

One other key point that I'd want to see addressed before seriously
trying a Lisp-family language for number-crunching is debugger support
for mixed-language environments. That is, if my present C++/C/Fortran
executable dumps core, gdb will let me look at what's going on fairly
well (provided I can reproduce the problem with a -g binary). And
most Lisp-family languages have long come with excellent debuggers
for their environments. But what happens if my Dylan program calls
(via the Dylan foreign-function interface) a C++ class, which calls
a Fortran program, which crashes? I don't know of any debugger which
will let me do a good job tracking the bug(s) across the Lisp-family
and Algol/Fortran-family languages. Improvements in this direction
would be *very* welcome!

Kjetil Torgrim Homme

Apr 5, 2000, 3:00:00 AM
[Paul Hsieh]

> Well, hang on there. I don't know of any head-to-head speed trials
> between C and Perl programs. If people are using Perl, it's
> because they don't care about performance. Nobody believes that
> Perl in and of itself has acceptable performance versus C (or anything
> compiled, for that matter.)

If the performance is unacceptable, why are people using it?

People are generally using Perl for tasks which are I/O-bound or
require a lot of text processing. Perl contains a library which many
C programs will have a hard time beating.

Kernighan and Van Wyk: "Timing Trials, or, the Trials of Timing:
Experiments with Scripting and User-Interface Languages"

<URL:http://cm.bell-labs.com/cm/cs/who/bwk/interps/pap.html>


Kjetil T.

Dave Hansen

Apr 5, 2000, 3:00:00 AM
On Tue, 4 Apr 2000 08:39:30 -0500, "Andy Glew" <gl...@cs.wisc.edu>
wrote:

[...]


>For example, last time I checked, 64-bit "long long" was
>not in standard C/C++ --- but I won't give it up. Fortunately,

FWIW, "long long" *is* in the latest C standard. It was a subject of
much debate. Some felt the natural name for a 64-bit integer was
"long", others objected simply to the syntax ("long long" isn't a
type: it's a syntax error), but the overwhelming weight of existing
practice tipped the balance.

[...]


>I do admit to using varargs macros, suitably ifdeffed GCC

Varargs macros are also in the new C standard, but they're not quite
the same as GCC's.

[...]


>Similarly, I use __FUNCTION__ and __PRETTY_FUNCTION__
>in debug output.

__FUNCTION__ is also there, but I think it might be spelled
differently (__FUNC__? I don't have the new standard handy).

[...]

>If everyone programmed strictly to the standard, no progress
>would be made.

It's hard (or at least it *should* be hard) to get anything into the
standard that doesn't have an example implementation.

Regards,

-=Dave
--
The opinions expressed in this post are my own and do not
represent the views of CyberOptics Corporation.

Andy Glew

Apr 5, 2000, 3:00:00 AM
> In Dylan, when you write "slot name :: type;" in a class declaration it
> automatically does everything you did in your macro, plus it's easier to
> customize because you can instead write something like...
>
> slot name :: type,
> getter: myName,
> setter: #f;
>
> -- Bruce


I'm familiar with languages that
(1) allow a data member or a getter/setter to be syntactically
indistinguishable - so you can start off with an exposed
data member and then convert to a getter/setter without
changing any code
(2) implicitly create getter/setters.
E.g. Sather, Eiffel.

I'd accept Bertrand Meyer's dictum that this should be a basic
principle of language design, except for one issue which I have
not managed to come to a satisfactory resolution on, and
on which I welcome advice from any quarter:

* Pass by Reference *
or, equivalently, pass by pointer.

Let's talk about C-style pass by pointer, since C++-style
pass by reference is just syntactic sugar over it.

Say you have a function that modifies an argument
passed by reference:

void increment(int* x) { ++(*x); }

If you have a raw data member, it can be passed by reference
to this function
struct Sclass { int i; int j; };
struct Sclass sobj;
increment(&sobj.i);
increment(&sobj.j);

If you want to make this data member private, then you can't take the
address any longer. (Or, if you can take the address of the member, then
it isn't very private.)

One could imagine passing a thunk that uses accessors:

template<class C, typename MemberType>
class getset_value {
    C& obj;
    // pointers to the accessor member functions
    MemberType (C::*getoff)();
    void (C::*setoff)(MemberType);
public:
    getset_value(C& o, MemberType (C::*g)(), void (C::*s)(MemberType))
        : obj(o), getoff(g), setoff(s) {}

    // minimal proxying
    getset_value& operator++() {
        MemberType m = (obj.*getoff)();
        ++m;
        (obj.*setoff)(m);
        return *this;
    }
};

But this requires the using code to be changed:

template<class I>
void increment(I* x) { ++(*x); }

class Sclass
{
private:
    int i; int j;
public:
    int get_i() { return i; }
    void set_i(int n) { i = n; }
    int get_j() { return j; }
    void set_j(int n) { j = n; }
};
Sclass sobj;
getset_value<Sclass,int> pi(sobj, &Sclass::get_i, &Sclass::set_i);
getset_value<Sclass,int> pj(sobj, &Sclass::get_j, &Sclass::set_j);
increment(&pi);
increment(&pj);

Although one can imagine making the above totally transparent
- e.g. there could be an Sclass::i "data" member that is a getset_value -
this does not hide the fact that the original user function
increment(int* i)
needs to be changed,
or that such thunking is a lot more expensive than
dereferencing a pointer.

Even such relatively heroic measures are not 100% complete.

The bottom line, at least insofar as I can tell, is that
attempting to transparently substitute a getter/setter for a data member
fails in the presence of pass by reference.

There appear to be only a limited number of possible choices:

(1) Don't provide transparent getter/setters --- C++
(2) Provide transparent getter/setters, but don't allow pass by reference
-- Eiffel
(3) Provide both implicit getter/setters and pass by reference;
situations where a passed by reference data member is changed
to a getter/setter access are flagged as compile-time errors.
This is not totally transparent, but it is good enough
for 90% of all situations.
I think Ada is in this camp.
(4) Use a language that implicitly supports the sort of thunking
I loosely describe above. E.g. LISP or Smalltalk.

For a time I was seriously interested in (2) approaches,
replacing pass by reference by copy-in/out.
But, pass by reference is required for large linked
data structures, and I think that allowing pass by reference
for these but not for basic types like "int" would be
a regrettable irregularity in a language definition.

Anyway.... I keep hoping that there is some magic way
out of this.

---


Somewhere I read a comment, I think it was by Stroustrup
talking about a conversation with the designer of the original
C, or someone of that ilk, that said something like
"C was successful because it had a machine model
that corresponded well to actual hardware."

I think this applies here:
getter/setters are the *right* thing to do from a programmer's
point of view - functional interception is the most general form
- but the getter/setter breaks down in the presence of pass-by-reference,
and pointers in general. And pass-by-reference and pointers in
general correspond so exactly to CPU hardware, that it would
be silly to do without them in a general purpose systems language.
(Even Java's references are only slightly adorned pointers.)


David Gay

Apr 5, 2000

bruce...@pobox.com (Bruce Hoult) writes:
> In article <8ccvob$rf2$1...@mach.thp.univie.ac.at>,
> jth...@mach.thp.univie.ac.at (Jonathan Thornburg) wrote:
>
> > This thread started with the observation that for a particular toy
> > benchmark,
> > [I'm using Hennessy and Patterson (H&P)'s definition of
> > "toy benchmark": one with typically O(100) or fewer
> > source code lines, not derived from a "real" application,
> > taking no or almost no input, and hence producing
> > predictible-in-advance output.]
> > a particular Scheme implementation was 20 or so times faster than a
> > particular C implementation. The implicit (but clearly implied)
> > generalization was (is) that this performance ratio is likely to hold
> > for a wide range of other programs.
>
> No I don't think that's the claim. At least, it's certainly not *my*
> claim, and I started the thread :-) What I hope people are taking from
> the 20:1 claim on a particular benchmark is not that Scheme is faster than
> C in any absolute sense -- which of course can't be true given that the
> Scheme compiler in question is compiling to C -- but that there is no
> reason for Scheme to be *slower* than C, and that speed is not therefore a
> reason to avoid it.

Reasons why Scheme might be slower than C:
- untyped language, dynamic type checks (and array bounds checks)
- "everything is a pointer" (aka no unboxing)
- garbage collection
- the Scheme rules for arithmetic

Reasons why Scheme might be faster than C:
- no violation of type rules allowed, so better analysis of data structures
should be possible (note that this is true for Java, ML, etc. It isn't really
true for Pascal&co)

Personally, though there is a bunch of nice research addressing all
these issues, I don't think you can wholly get rid of the overheads listed
above.

I also notice that this discussion has two views of language performance:
- the performance you get in language X for a "typical" program
- the best performance you can get in language X with large amounts of hand
optimisation

The people who care most about performance are generally talking about the
second kind of performance. I believe that C will always win over Scheme
(or Java) for the "best possible performance". I'm ready to contemplate the
idea that Java (but not Scheme) might beat C for typical programs, though
I'm not extremely optimistic (Java at least doesn't have to deal with all
values being untyped, or the Scheme arithmetic rules).

--
David Gay - Yet Another Starving Grad Student
dg...@cs.berkeley.edu

Michael Lyle

Apr 5, 2000
In article <8cfmcp$2...@spool.cs.wisc.edu>, "Andy Glew" <gl...@cs.wisc.edu> wrote:

> > In Dylan, when you write "slot name :: type;" in a class declaration it
> > automatically does everything you did in your macro, plus it's easier to
> > customize because you can instead write something like...
> >
> > slot name :: type,
> > getter: myName,
> > setter: #f;
> >
> > -- Bruce
>
>
> I'm familiar with languages that
> (1) allow a data member or a getter/setter to be syntactically
> indistinguishable - so you can start off with an exposed
> data member and then convert to a getter/setter without
> changing any code
> (2) implicit create getter/setters.
> E.g. Sather, Eiffel.
>
> I'd accept Bertrand Meyer's dictum that this should be a basic
> principle of language design, except for one issue, which I have
> not managed to come to a satisfactory resolution for, and
> which I seek advice anywhere:
>
> * Pass by Reference *
> or, equivalently, pass by pointer.
>
> Let's talk about C-style pass by pointer, since C++ style
> pass by reference is just syntactic sugar about this.

..... Stuff snipped....>

> Although one can imagine making the above totally transparent
> - e.g. there could be an Sclass::i "data" member that is a getset_value
> this does not hide the fact that the original user function
> increment(int* i)
> needs to be changed;
> or that such thunking is a lot less expensive than
> dereferencing a pointer.
>

...... more stuff snipped.......>

>
> Anyway.... I keep hoping that there is some magic way
> out of this.
>

Like maybe call by name? In spite of another thread this week disparaging
Algol and all its descendants.

>
> Somewhere I read a comment, I think it was by Stroustrup
> talking about a conversation with the designer of the original
> C, or someone of that ilk, that said something like
> "C was successful because it had a machine model
> that corresponded well to actual hardware."
>
> I think this applies here:
> getter/setters are the *right* thing to do from a programmer's
> point of view - functional interception is the most general form
> - but the getter/setter breaks down in the presence of pass-by-reference,
> and pointers in general. And pass-by-reference and pointers in
> general correspond so exactly to CPU hardware, that it would
> be silly to do without them in a general purpose systems language.
> (Even Java's references are only slightly adorned pointers.)

I like this, and Burroughs/Unisys A series have this in hardware, thus
meeting both criteria ("the right thing", and "corresponded well to
hardware"): Any reference to an Indirect Reference Word (IRW) or a Stuffed
IRW (SIRW) by the hardware will either reference a primitive type, or if it
finds a PCW (program control word), will execute the thunk. It uses tags
to distinguish PCWs from primitive types at runtime, so depending on which
one shows up at the end of an (S)IRW chain, you get exactly what you
wanted.

The really nice thing about this mechanism from a programmer's point of view
is that the thunk gets executed in the environment of the variable referred
to, rather than the environment of the function accessing the "variable".
This allows, for instance, a correct action by a fixup routine called by an
IEEE trap on a case by case basis, rather than a generic basis.

So, this being comp.arch, maybe we should fix the hardware rather than fix
the software?

Michael

--
Michael
Michael Lyle (ml...@wco.com)

Andy Glew

Apr 5, 2000
> Like maybe call by name? In spite of another thread this week disparaging
> Algol and all its descendants.
>
> I like this, and Burroughs/Unisys A series have this in hardware, thus
> meeting both criteria ("the right thing", and "corresponded well to
> hardware"): Any reference to an Indirect Reference Word (IRW) or a Stuffed
> IRW (SIRW) by the hardware will either reference a primitive type, or if it
> finds a PCW (program control word), will execute the thunk. It uses tags
> to distinguish PCWs from primitive types at runtime, so depending on which
> one shows up at the end of an (S)IRW chain, you get exactly what you
> wanted.
>
> The really nice thing about this mechanism from a programmer's point of view
> is that the thunk gets executed in the environment of the variable referred
> to, rather than the environment of the function accessing the "variable".
> This allows, for instance, a correct action by a fixup routine called by an
> IEEE trap on a case by case basis, rather than a generic basis.
>
> So, this being comp.arch, maybe we should fix the hardware rather than fix
> the software?


Q: can you use this technique to implement something like:

int i;
read(0,&i,sizeof(int));

or, as a more aggressive example:

void foo( int* iptr ) {
*(char*)(iptr+1) = 'c';
*(char*)(iptr+3) = 'b';
}

This is usually where attempts to emulate references/pointers
via call-by-name break down.

Now, I can imagine that "the right thing" to do is a read-modify-write
for the above. I haven't seen it done, though.

The IRW technique seems to work best on a word-oriented architecture.
For reasons like this, I have great sympathy for word oriented architectures.

Tim McCaffrey

Apr 5, 2000
In article <8cftqm$6...@spool.cs.wisc.edu>, Andy Glew says...

>
>> Like maybe call by name? In spite of another thread this week disparaging
>> Algol and all its descendants.
>>
>
>Q: can you use this technique to implement something like:
[snip]

>or, as a more aggressive example:
>
> void foo( int* iptr ) {
> *(char*)(iptr+1) = 'c';
> *(char*)(iptr+3) = 'b';
> }
>
>This is usually where attempts to emulate references/pointers
>via call-by-name break down.
>
>Now, I can imagine that "the right thing" to do is a read-modify-write
>for the above. I haven't seen it done, though.
>
>The IRW technique seems to work best on a word-oriented architecture.
>For reasons like this, I have great sympathy for word oriented architectures.
>
>

The above example is one of those cases which don't really map well from C to
Algol. You can't really take a pointer to just any integer; it must be an
array. You can do the following, which accomplishes the same end result:

Begin
Integer Array I[0:1];
Pointer Iptr;

Procedure Foo(I);
Integer Array I[*];

Begin
Pointer Iptr;
Iptr := Pointer(I,8);
Replace Iptr+1 by "c" for 1;
Replace Iptr+3 by "b" for 1;
End;

Iptr := Pointer(I,8);
replace Iptr by 8" " for 5,48"00";
display(I);
Foo(I);
display(I);
End.

On the other hand, here is something that doesn't exactly map to C, and really
shows what call-by-name does:

begin
File term(KIND=REMOTE,MYUSE=IO);
integer Array I[0:5];
integer Ix;

procedure Foo(I);
integer I;

begin
Write(term,//,I,I,I,I,I,I,I,I,I);
end;

Ix := 0;

FOR Ix := 0 STEP 1 UNTIL 5 DO I[Ix] := Ix*Ix;
Ix := 0;
Foo(IF (Ix LEQ 4) Then I[Ix:=*+1] Else I[5]);
End.

Output is:
1 4 9 16 25 25 25 25 25

Tim McCaffrey


Andy Freeman

Apr 5, 2000
In article <s71n1n8...@barnowl.CS.Berkeley.EDU>,

David Gay <dg...@barnowl.CS.Berkeley.EDU> wrote:
> I also notice that this discussion has two views of language performance:
> - the performance you get in language X for a "typical" program
> - the best performance you can get in language X with large amounts of
> hand optimisation
>
> The people who care most about performance are generally talking about
> the second kind of performance.

There are other assumptions made by the "performance, performance,
performance" folks. Often their core algorithms are simple or at
least small, and more often, they are fairly stable. In addition,
the ratio of code time to run time is fairly small. Those circumstances
make hand optimization possible and reasonable, and don't penalize
"unsafe" practices.

However, those circumstances are far less common than their proponents
assert and they certainly don't justify their influence on the vast
majority of programming. Moreover, said proponents often "forget"
that the huge advances in their domain often come from changing the
algorithm, which is where their methods are weakest (but again, that
doesn't hurt them much because of the aforementioned stability).

Late answers are wrong answers, but program speed isn't the only
thing that delays answers. Programming time also delays answers.
Moreover, delay isn't the only error - wrong answers are also
wrong, no matter how quickly they are produced.

-andy


Sent via Deja.com http://www.deja.com/
Before you buy.

Andy Glew

Apr 5, 2000
> >Q: can you use this technique to implement something like:
> >or, as a more aggressive example:
> >
> > void foo( int* iptr ) {
> > *(char*)(iptr+1) = 'c';
> > *(char*)(iptr+3) = 'b';
> > }
>
> The above example is one of those cases which don't really map well from C to
> Algol. you can't really take a pointer of just any integer, it must be an
> array. You can do the following, which accomplishes the same end result:

It's not a question of "maps well to Algol".

The point is that all byte oriented machines can do
what I described in the code above. And just about
all modern machines are byte oriented. And even
word oriented machines can do the above, it's just
that the hardware is limited not to.

The point is that on all modern machines you *can* take
a reference or pointer to any integer, not just an array.
Conversely, and more importantly, you *can* write into the
middle of an array, or to a single data member of an object;
you don't need to adopt the functional viewpoint of treating all
arrays and objects as immutable - if you want to modify
one part, you have to copy the whole thing.

This is what I think Stroustrup meant by the "machine
model" implicit in the language. Just putting IRW
into microcode or a sequencer doesn't change
what the hardware is doing.

(By the way, I don't thik this means that IRW is a bad idea.
I think that IRW can be adapted to a byte oriented world.)


If a technique like IRW is to be seriously considered for
modern machines, it must either support all C code that
is legal and/or runs on widely used machines, or it must
have a good story why not. For example, it may be reasonable
to say "IRW allows data members to be replaced transparently
by getter/setter functions, except in case of subword access,
in which case IRW traps."

> On the other hand, here is something that doesn't exactly map to C, and really
> shows what call-by-name does

Yes, I think all long-time comp.archers are familiar with Jensen's device.
Two points:

a) Call by name is considerably more expensive than call-by-reference/pointer.
Even with the hardware/microcode support for IRW.

b) Jensen's device is considered to be
REALLY BAD PROGRAMMING STYLE
leading to hard to read and maintain programs.
(For that matter, call-by-reference/pointer also causes problems,
when several arguments overlap.)

It may be that call-by-name is the "best" way to try to achieve
transparent getter/setters. If so, then it is actually a good argument
for not having transparency.

Trying to avoid call-by-name was one of the things that led me to
consider adapting copy-in/out argument passing for transparent
getter/setters:

E.g.

If you have a call

void foo( int* i);
struct C { int dm; } c;
foo(&c.dm);

and you want to replace the raw data member by
functional interception

class C {
int dm1, dm2;
public:
int get_dm() { return dm1 + dm2; }
void set_dm( int v ) { dm1 = v *3/4; dm2 = v - dm1; }
};

note: there *is* no data member any more.

Recall: I want transparency:
I want code
c.dm = 8978; ==to map to==> c.set_dm(8978);
foo = c.dm; ==to map to==> foo = c.get_dm();

The question is, how to handle references and pointers
to dm?

The copy-in/out approach is to instantiate a temporary integer,
initialized by get_dm(). The address of the integer is passed.
Then, after the function returns, the result is copied back to whence
it came via set_dm().

I.e.
foo(&c.dm);
==maps to==>
{ // temporary scope
int tmp = c.get_dm();
foo(&tmp);
c.set_dm(tmp);
}

By the way, I have created a C++ template & a macro
for syntactic sugar that accomplishes the above.


This approach starts to have problems when the
same reference is passed more than once, or
two overlapping references are passed.

void bar(int*ip,char*cp);

bar( &c.dm, (char*)(&c.dm)+1 );

It's a bit of a challenge to arrange so that only
one temporary object is created for the
copy-in/out of overlapping parameters.

The biggest problem with copy-in/out is, what
do you do when the "member" is/was a large
linked data structure, to which you now have
a functional interface? Instantiating a large
linked data structure just to preserve copy-in/out
semantics would be foolish.

This forces me to the conclusion: we need pass by
reference. Uniformity requires pass by reference
for all types of objects. Since pass by reference
does not seem to be compatible with transparent
getters/setters, the latter lose. I still want to be able
to use semi-transparent getters/setters, but code
changes will be required if pass by reference is used.

It might be nice to have the concept "This data member
cannot be passed by reference or pointer". You can come
close in C++ by overloading the address-of operator, operator&().


====


By the way, just in case the Ada and Java people are feeling smug:

I have already seen, on a programming mailing list,
the following workaround for pass by reference:
since Java apparently won't pass primitive types by reference,
the programmer had fallen into the habit of putting
such objects into one-element arrays, since arrays *are*
passed by reference.


Bruce Hoult

Apr 6, 2000
In article <MPG.1354aa5f9...@news.pacbell.net>,
DONT.qed...@pobox.com (Paul Hsieh) wrote:

> > No I don't think that's the claim. At least, it's certainly not *my*
> > claim, and I started the thread :-) What I hope people are taking from
> > the 20:1 claim on a particular benchmark is not that Scheme is faster than
> > C in any absolute sense -- which of course can't be true given that the
> > Scheme compiler in question is compiling to C -- [...]
>
Well, what I was hoping for is that my previous post was not ignored,
but it seems that it was, unfortunately.

Not ignored, just postponed. I've got a looming deadline :-(


> This 21:1 claim just looks like a case of poor
> benchmarking. My belief is that the code is no better than 1:1 than
> *real* C code in the example given (though I could only really prove that
> it could not be better than 3:1 on an Athlon using WATCOM C/C++, and
> would only be achieved that if it somehow performed better scheduling
> than the base C compiler.)

It doesn't have to perform better scheduling than the base C compiler, but
merely a) expose bigger and better basic blocks to the base C compiler to
be scheduled, or b) move repeated calculations out of loops or c) remove
redundant calculations entirely.


> > [...] but that there is no reason for Scheme to be *slower* than C,
> > and that speed is not therefore a reason to avoid it.
>
> That isn't supported by anything posted thus far. No fair comparisons
> that I know of have yet been fully explored (I did not get a chance to
> try a Scheme compile using WATCOM C/C++ as a back end on my Athlon.)

Well you're free to download Stalin and try it. You only need one file...

<ftp://ftp.nj.nec.com/pub/qobi/stalin.tar.Z>

But you don't even need to do that -- just take the Stalin-generated C
code at the end of <http://deja.com/getdoc.xp?fmt=text&AN=341354188> (it's
2252 lines long) and compile and run it.


> > There are a large number of reasons why you might program in a high level
> > language such as Scheme, Common LISP, Dylan, CAML or Smalltalk. These
> > include such things as shorter, easier to understand code.
>
> That is too much of a generalization. Scheme is a derivative of Lisp
> after all. This is of itself can pose problems for algorithms that are
> irregular or non-recursive.

I do believe you missed Steele's 1976 paper showing that tail recursion
and goto are equivalent and that either can be mechanically and
efficiently translated into the other.


> (I can repost my Chess program inner loop, if you are looking for a real
> challenge in programming for clarity. My C implementation has a couple
> gotos strategically placed. While they can be removed, I don't see how
> to do it without impacting performance.)

Sure, go ahead if it's not too big. Please post a complete, compilable
and runnable example, rather than just a single function.


> > [...] Better type-safety and memory management.
>
> Does it support programmatically aligned memory allocations?

Sorry. What is "it"?


> > [...] Overflow and bounds checks.
>
> Well there goes any credible claims about performance.

Not true. Compilers have now gotten very good at keeping track of the
type of variables, and this includes the possible range of values of an
integer value. This analysis itself benefits immensely from the simple
control structures and assign-once semantics of functional languages. If
you've proved that an integer is in a certain range then if that integer
is used to index an array of the same or greater bounds then clearly no
bounds check is needed. Even in the case where you don't know the bounds
until runtime, you can still at least move the checks outside of inner
loops.


> > Speed differences on toy benchmarks of course don't carry through
> > unchanged to larger programs. Once you start dealing with out-of-cache
> > data, paging, disk or network I/O the difference in speed between a fast
> > and a slow language implementation largely goes away.
>
> And you don't think a "slow" language exacerbates this problem by
> requiring larger code footprints, having no mechanisms for aligning data,
> potentially worse problems with register spill/fill?

I see I've lost you. I'm explaining here the reasons that although Scheme
or Dylan may be a "fast" language on micro-benchmarks the same magnitude
of gains won't carry through to full-sized programs. I'm doing this by
pointing out why even "slow" languages can do quite well on many tasks.


> > [...] Which is why things such as Perl are doing so well now.
>
> Well hang on there. I don't know of any head to head speed trials
> between C and Perl programs. If people are using Perl, its because they
> don't care about performance. Nobody believes that Perl in of itself has
> acceptable performance versus C (or anything compiled for that matter.)

That's not so. See:

<http://www.lib.uchicago.edu/keith/crisis/benchmarks/invert/>

Here, the author presents a simple task which can be implemented in Perl
as (I've slightly rewritten it by using arrays instead of strings):

while (<STDIN>) {
chomp;
($a, $b) = split(/\t/);
($hash{$b} && push @{$hash{$b}}, $a) || ($hash{$b} = [$a]);
}

foreach $b (sort keys %hash) {
print "$b";
foreach $a (sort @{$hash{$b}}){
print "\t$a";
}
print "\n";
}

This program takes data like this...

a 1
a 2
b 2
c 2
c 1
c 3

... and turns it into this ...

1 a c
2 a b c
3 c

The actual test data file is around 5600 lines and 600 KB.

On this type of task Perl is not easy to beat -- none of the other
languages that he benchmarked even came close. He didn't try C, but I did
and a reasonably direct translation to C++ turned out to be slower than
Perl:

#include <iostream>
#include <string>
#include <map>
#include <vector>
#include <algorithm>
using namespace std;

int main()
{
map<string, vector<string> > hash;
string a, b;
while (cin >> a >> b){
hash[b].push_back(a);
}

for (map<string, vector<string> >::iterator b = hash.begin();
b != hash.end();
++b)
{
cout << (*b).first;
sort((*b).second.begin(), (*b).second.end());
for (vector<string>::iterator a = (*b).second.begin();
a != (*b).second.end(); ++a){
cout << "\t" << *a;
}
cout << endl;
}
}

I've also translated it into Dylan:

define inline method find(c, buf, start)
for (p from start below buf.size, until: buf[p] == c) finally p; end;
end;

define method main(appname, #rest arguments)
let hash = make(<string-table>);
block ()
while (#t)
let line :: <byte-string> = read-line(*standard-input*);
let tab = find('\t', line, 0);
let a = copy-sequence(line, start: 0, end: tab);
let b = copy-sequence(line, start: tab + 1);
hash[b] := pair(a, element(hash, b, default: #()));
end;
exception (e :: <end-of-stream-error>)

for (b in hash.key-sequence.sort!)
format-out("%s", b);
for (a in hash[b].sort!)
format-out("\t%s", a);
end;
format-out("\n");
end;

end;
end main;


Using d2c this turns out to be about the same speed as the C++, and slower
than the Perl. The reason appears to be extreme pessimisation in the d2c
text I/O library -- which appears to be quite fixable but I haven't done
it yet.

> > [...] Hell, even interpreting a different
> > ISA is getting close to being a "below the noise threshold" thing, if the
> > interpreter fits in cache and the actual purpose of the program doesn't.
> > See the JVM, VirtualPC, VMWare, Apple's 68K emulator, and all the other
> > interpreter/JIT compiler efforts which are becoming common.
>
> Woah there nelly. JIT compilers are compilers. They don't insert type
> checking, or better memory management, or anything like that. We are
> talking about apples and oranges here. The reasons why interpreting ISAs
> are interesting and showing some degree of promise in terms of
> performance has nothing to do with higher level languages like Scheme.

And I didn't say it did. I'm pointing out that quite huge differences in
raw execution speed (speed on toy benchmarks) can often be masked in real
programs.


> > > Having said this, the claim that a given problem programmed in Scheme
> > > is likely to run a lot faster than the same problem programmed in C,
> > > is an interesting one. So far I haven't seen any substantive evidence
> > > for it, but I'm open to being convinced otherwise. (If nothing else,
> > > I have plenty of C and C++ code which I'd _love_ to see run 20+ times
> > > faster! In fact, I'd quite happily settle for "just" a factor of 2.)
> >
> > The claim that all programs run faster in Scheme than in C is a straw
> > man. [...]
>
> Then can we change that to "any programs"? I haven't seen the first one.

Yes you have. Try the C generated by Stalin (from a much shorter Scheme
program) at the end of <http://deja.com/getdoc.xp?fmt=text&AN=341354188>.


> > [...] If, however, you'd like to see some evidence then all you have to do
> > is download Siskind's "stalin" Scheme compiler and try it for yourself.
> > It comes as a (pretty huge) single C file which you just have to compile
> > using your local C compiler and you're away.
>
> Does it target ANSI C (as opposed to gcc)?

Yes. There is one gcc extension used (a pragma to inform the compiler
that certain functions never return), but that is #ifdef'd on __GNUC__.


> Well, why don't we start with Stream? That's a simple and short one.
> Lets see just how fast Scheme's array bounds checking functionality is as
> compared with C's non-existent array bounds checking. (These examples
> all look like they would take quite some time to port over.)

Sure. The source can be found where?


> > > *This* experiment would be interesting, and unlike toy benchmarks,
> > > it would actually tell us something useful about performance on other
> > > real-world applications.
> >
> > Agreed, and I've noted those programs as ones to look at for my own work.
>
> Dude, if you are serious about establishing the credibility of such
> efforts, you need to involve a larger audience, and that means starting
> from even simpler examples.

Then you get the accusation of "toy benchmarks". You can't win...

-- Bruce

Bruce Hoult

Apr 6, 2000
In article <8cf7fg$t4g$1...@mach.thp.univie.ac.at>,
jth...@mach.thp.univie.ac.at (Jonathan Thornburg) wrote:

> % g++ -O -o hello hello.cc
> % g++ -O -o hello2 hello2.cc
> % ./hello
> hello, world
> % ./hello2
> hello, world
> % lsl ./hello ./hello2
> -rwxr-xr-x 1 jthorn hpc 12779 Apr 5 13:14 ./hello*
> -rwxr-xr-x 1 jthorn hpc 12933 Apr 5 13:15 ./hello2*
> %
>
> (I also got essentially-identical results with the egcs-2.91.66 which
> Red Hat ships with their 6.1 GNU/Linux release.)
>
> Admittedly, this is only a single pair of trivial programs, but that
> still sure doesn't look like half a megabyte to me. Perhaps your g++
> was mistakenly built without shared library support?

No, I'm deliberately using it to generate statically linked images in
order to compare the amount of overhead in the runtime libraries for
various languages.

-- Bruce

Bruce Hoult

Apr 6, 2000
In article <8cfaiq$t8f$1...@mach.thp.univie.ac.at>,
jth...@mach.thp.univie.ac.at (Jonathan Thornburg) wrote:

> On other key point that I'd want to see addressed before seriously
> trying a Lisp-family language for number-crunching, is debugger support
> for mixed-language environments. That is, if my present C++/C/Fortran
> executable dumps core, gdb will let me look at what's going on fairly
> well (provided I can reproduce the problem with a -g binary). And
> most Lisp-family languages have long come with excellent debuggers
> for their environments. But what happens if my Dylan program calls
> (via the Dylan foreign-function interface) a C++ class, which calls
> a Fortran program, which crashes? I don't know of any debugger which
> will let me do a good job tracking the bug(s) across the Lisp-family
> and Algol/Fortran-family languages. Improvements in this direction
> would be *very* welcome!

The Dylan compiler I'm using compiles to C, so you can always debug the C
in the normal way. d2c generates the names of the C functions and
variables from the names of the Dylan functions and variables so you can
keep track fairly well. The biggest problem is the amount of inlining
that goes on in a typical Dylan compilation, but the same applies to C++
these days. If you need to debug then it really helps to turn off
inlining... d2c also includes "#line" directives in the generated C so
gdb can relate you right back to the original Dylan, but once again that's
a lot more useful if you turn off a few optimisations.

-- Bruce

Chris Morgan

Apr 6, 2000
bruce...@pobox.com (Bruce Hoult) writes:

> Speed differences on toy benchmarks of course don't carry through
> unchanged to larger programs. Once you start dealing with out-of-cache
> data, paging, disk or network I/O the difference in speed between a fast
> and a slow language implementation largely goes away. Which is why things
> such as Perl are doing so well now.

It couldn't actually be that Perl actually does some stuff quite quick
now, could it?

I'm reminded of the measurements that Kernighan and Pike did in their
Markov chain analysis case study (in their 'The Practice of
Programming' - http://cm.bell-labs.com/cm/cs/tpop/) - four
implementations of the same program in good idiomatic style for each
language (straightforward pointers, malloc, free in C, STL style in
C++). The four languages were C, C++, Perl and Java. As I remember
(the book's at work) C was fastest, then Perl, then C++, then Java. It
surprised me anyway (I work in C and C++).

Cheers,

Chris

--
Chris Morgan <cm at mihalis.net> http://mihalis.net
"O gummier hum warder buffer-lore rum
Enter dare enter envelopes ply"

Bruce Hoult

Apr 6, 2000, 3:00:00 AM
In article <8cftqm$6...@spool.cs.wisc.edu>, "Andy Glew" <gl...@cs.wisc.edu> wrote:

> Q: can you use this technique to implement something like:
>

> int i;
> read(0,&i,sizeof(int));


>
> or, as a more aggressive example:
>
> void foo( int* iptr ) {
> *(char*)(iptr+1) = 'c';
> *(char*)(iptr+3) = 'b';
> }
>

> This is usually where attempts to emulate references/pointers
> via call-by-name break down.
>
> Now, I can imagine that "the right thing" to do is a read-modify-write
> for the above. I haven't seen it done, though.

I think the appropriate question is "what are you *really* trying to
achieve here?" The things that you have expressed are inherently
low-level things. But what is the reason for doing them? What is the
bigger picture?

The second example, in particular is likely to be a lot slower than
something like (on bigendian)...

int foo(int i){
    i = (i & ~(255<<16)) | (('c'<<16) & (255<<16));
    i = (i & ~255) | ('b' & 255);
    return i;
}

... because the semantics of C force *iptr into memory, even if the
function is inlined.

-- Bruce

Bruce Hoult

Apr 6, 2000, 3:00:00 AM
In article <87u2hgy...@think.mihalis.net>, Chris Morgan
<c...@mihalis.net> wrote:

> bruce...@pobox.com (Bruce Hoult) writes:
>
> > Speed differences on toy benchmarks of course don't carry through
> > unchanged to larger programs. Once you start dealing with out-of-cache
> > data, paging, disk or network I/O the difference in speed between a fast
> > and a slow language implementation largely goes away. Which is why things
> > such as Perl are doing so well now.
>
> It couldn't actually be that Perl actually does some stuff quite quick
> now, could it?

Certainly it is, as long as you stay within its domain -- searching and
manipulating arrays and hashes of text -- Perl is very hard to beat
without writing a *lot* more code.

But get out of its domain and it's *slow*. My favourite quick&dirty
benchmark of a new system is:

time perl -e 'for($i=0;$i<=1000000;++$i){$t+=$i;}print"$t\n"'

My PPro 200 takes about 2.6 seconds, my PPC Macs are about 2.0, and I've
seen plenty of Suns etc at 10 - 20 sec.

Any reasonable compiler will produce code that takes about 1/100 sec on a
200 MHz PPC.


> I'm reminded of the measurements that Kernighan and Pike did in their
> Markov chain analysis case study (in their 'The Practice of
> Programming' - http://cm.bell-labs.com/cm/cs/tpop/) - four
> implementations of the same program in good idiomatic style for each
> language (straightforware pointers, malloc, free in C, STL style in
> C++). The four languages were C, C++, Perl and Java. As I remember
> (the book's at work) C was fastest, then Perl, then C++, then Java. It
> surprised me anyway (I work in C and C++).

I posted a reference yesterday to another benchmark which showed Perl to
be very good.

-- Bruce

Andy Glew

Apr 6, 2000, 3:00:00 AM
> > void foo( int* iptr ) {
> > *(char*)(iptr+1) = 'c';
> > *(char*)(iptr+3) = 'b';
> > }
>
> The second example, in particular is likely to be a lot slower than
> something like (on bigendian)...
>
> int foo(int i){
> i = (i & ~(255<<16)) | (('c'<<16) & (255<<16));
> > i = (i & ~255) | ('b' & 255);
> return i;
> }
>
> ... because the semantics of C force *iptr into memory, even if the
> function is inlined.


Sorry, no.

*iptr may be read at the beginning of foo, modified
in a register as you propose, and then written back.
The semantics of C do *NOT* force iptr to be in memory
for the entire life of the function, because it can be trivially
proven that there are no overlaps in the function foo.
Whether any compilers do this is moot.

Especially since the in-memory code is faster...

If *iptr had been declared volatile, it would have to stay in
memory.

Tzvetan Mikov

Apr 6, 2000, 3:00:00 AM

Bruce Hoult <bruce...@pobox.com> wrote in message
news:brucehoult-07...@bruce.bgh...
> In article <8ci4jn$d...@spool.cs.wisc.edu>, "Andy Glew" <gl...@cs.wisc.edu>
wrote:
>

> > > > void foo( int* iptr ) {
> > > > *(char*)(iptr+1) = 'c';
> > > > *(char*)(iptr+3) = 'b';
> > > > }
> > >
> > > The second example, in particular is likely to be a lot slower than
> > > something like (on bigendian)...
> > >
> > > int foo(int i){
> > > i = (i & ~(255<<16)) | (('c'<<16) & (255<<16));
> > > i = (i & ~255) | ('b' & 255);
> > > return i;
> > > }
> > >
> > > ... because the semantics of C force *iptr into memory, even if the
> > > function is inlined.

These two functions are not equivalent. They would be if the first one was
written as:

void foo ( int * iptr ) {
    *((char *)iptr + 1) = 'c';
    *((char *)iptr + 3) = 'b';
}

Anyway, I assume that's what Andy Glew actually meant. In that case assuming
that sizeof(int) >= 4, I don't see what in the semantics of C would prevent
the complete elimination of the memory read and write if the function was
inlined. In my opinion this would be a trivial optimisation.

-tzvetan


Andy Glew

Apr 6, 2000, 3:00:00 AM
> > - garbage collection
>
> If the compiler can analyse the lifetime and prove that the value doesn't
> escape then it can be stack/register allocated just as in C. If the
> compiler can't figure out the lifetime then there's a good chance that the
> programmer can't either

By the way, I want garbage collection in C++,
but I want to be able to explicitly declare stack
local variables (just like today), and I want to
be able to explicitly delete an object that I think
is no longer in use.

If I attempt to delete an object that is no longer
in use, but for which there are outstanding
references/pointers, then this should be an
error that the garbage collector can detect.
No need to expensively detect the error right
away - it's okay if the error is reported at the
next garbage collection sweep.

Similarly, having a stack local object go out of
scope and get destroyed is similarly an error that
the garbage collector should be able to detect.

One way to do this is to have versions attached to
objects. That would work for Java, but would be harder
for C++.

Reasons:

(a) GC'ed languages are vulnerable to "memory leaks"
when a pointer that should have been nulled wasn't,
keeping the object alive.

(b) One of C++'s great strengths is the
"Resource Allocation Is Initialization" construct,
which I call "Resource Management via Scoping".

Automatically calling a destructor at a known time,
for all possible exits from a block, is important.

By comparison, things like Java's finalize method,
or the try/catch/finally construct, are much less useful.

"Scoped resource management" is the best way
I know of to program in the presence of exceptions.

(c) Marginal performance.


Muzaffer Kal

Apr 6, 2000, 3:00:00 AM
"Andy Glew" <gl...@cs.wisc.edu> wrote:

>> > - garbage collection
>>
>> If the compiler can analyse the lifetime and prove that the value doesn't
>> escape then it can be stack/register allocated just as in C. If the
>> compiler can't figure out the lifetime then there's a good chance that the
>> programmer can't either
>
>By the way, I want garbage collection in C++,
>but I want to be able to explicitly declare stack
>local variables (just like today), and I want to
>be able to explicitly delete an object that I think
>is no longer in use.

I think the following is more or less what you want if you haven't
seen it already:
http://www.hpl.hp.com/personal/Hans_Boehm/gc/

David Gay

Apr 6, 2000, 3:00:00 AM

bruce...@pobox.com (Bruce Hoult) writes:
> In article <s71n1n8...@barnowl.CS.Berkeley.EDU>, David Gay
> <dg...@barnowl.CS.Berkeley.EDU> wrote:
>
> > Reasons why Scheme might be slower than C:
>
> These are all good reasons why interpreted or simply-compiled Scheme is
> slower than C, but there's a lot that a good compiler can do.

I should mention that I'm a grad student in programming languages, so I'm
well aware of the various techniques proposed for addressing these issues
(and have done some research on them myself). I just don't believe (and
have seen no real evidence - this thread has only produced a not very
useful toy benchmark) that they are sufficiently good to overcome all the
overhead. In particular, I'm skeptical about the possibility of good alias
analysis which is required for nearly all the issues I brought up.

I therefore believe that C will always beat Scheme if you care about
performance. On the same lines, assembler will always beat C ;-)

Bruce Hoult

Apr 7, 2000, 3:00:00 AM
In article <8ci4jn$d...@spool.cs.wisc.edu>, "Andy Glew" <gl...@cs.wisc.edu> wrote:

> > > void foo( int* iptr ) {
> > > *(char*)(iptr+1) = 'c';
> > > *(char*)(iptr+3) = 'b';
> > > }
> >
> > The second example, in particular is likely to be a lot slower than
> > something like (on bigendian)...
> >
> > int foo(int i){
> > i = (i & ~(255<<16)) | (('c'<<16) & (255<<16));
> > i = (i & ~255) | ('c' & 255);
> > return i;
> > }
> >
> > ... because the semantics of C force *iptr into memory, even if the
> > function is inlined.
>
>

> Sorry, no.
>
> *iptr may be read at the beginning of foo, modified
> in a register as you propose, and then written back.
> The semantics of C do *NOT* force iptr to be in memory
> for the entire life of the function, because it can be trivially
> proven that there are no overlaps in the function foo.
> Whether any compilers do this is moot.

The semantics of C force *iptr to be in memory at the start and end of the
function. Those may well be totally unnecessary reads and writes
depending on what the rest of the program is doing.


> Especially since the in-memory code is faster...

Not on the machines I use. The bitfield inserts are one instruction each.


In any case, you didn't say what you're trying to do with this code.
You're writing code that is explicitly machine-level, not code that
expresses what you are trying to do, which makes it very hard to know
which parts of your machine-level expression are necessary and which parts
are incidental. On dumb compilers this is your only choice for reasonable
performance, but on smart ones you may be giving up the opportunity for
optimisations.

-- Bruce

Bruce Hoult

Apr 7, 2000, 3:00:00 AM

> Reasons why Scheme might be slower than C:

These are all good reasons why interpreted or simply-compiled Scheme is
slower than C, but there's a lot that a good compiler can do.

> - untyped language, dynamic type checks (and array bounds checks)

You of course want to move these things out of inner loops, and creating
specialised versions of or inlining functions can help to preserve
knowledge of types -- think of C++ template functions which are also a
form of programming using untyped variables.

Optional type declarations (as in Dylan) can of course help out the
compiler and be used as semi-manual hints if the compiler just doesn't
"get it" in some particular complex situation. Often simply declaring the
types of function arguments (which you do anyway for overloaded function
resolution) is all the compiler needs to figure out the types of
everything.


> - "everything is a pointer" (aka no unboxing)

Um: "everything is a *reference*". It only actually needs to be a pointer
if the compiler can't prove that the object is of a particular fixed size
and that the lifetime is limited to that of the block in which it was
created. Ok, that's a bit oversimplified, but basically if you create a
new variable, use it a bit, and either don't pass it to any other
functions, or those functions have been analysed and are known not to
"capture" the value, then it can be stack-allocated.

Collections (arrays etc) can be a problem as it can be hard to prove that
only values of a particular type are ever put in the array. Dylan allows
you to declare that a particular array, string, hash etc only contains
values of a particular type. This gets rid of boxing (for immutable types
such as int, float, char) and typechecks when you extract the value.


> - garbage collection

If the compiler can analyse the lifetime and prove that the value doesn't
escape then it can be stack/register allocated just as in C. If the
compiler can't figure out the lifetime then there's a good chance that the
programmer can't either and that they'll be using something like explicit
reference-counting (or will have a memory leak) -- and custom-rolled
reference counting is *less* efficient than modern garbage collectors
because it constantly requires the count to be incremented/decremented,
often overwhelming the actual work being performed.

There *are* cases where the programmer knows the lifetime of something
that the compiler can't figure out and do an explicit delete at the right
time. But such cases are pretty rare (in my experience) outside either a)
a pointer that is a field in an object being freed in the destructor of
that object, or b) a pointer that is allocated and freed in the same
function because of API requirements.


> - the Scheme rules for arithmetic

Dylan (and some Schemes) offer modulo arithmetic integer types as well as
unlimited precision ones. Dylan also offers integer subrange types which
let the representation be nailed down at the cost of occasional overflow
checks (which can mostly be eliminated in practice).


> Personnally, and though there is a bunch of nice research addressing all
> these issues, I don't think you can wholly get rid of the overheads listed
> above.

You can't wholly get rid of them, but you can get them out of the inner
loops so that the practical effect is unmeasurable.


> I also notice that this discussion is has two views of language performance:
> - the performance you get in language X for a "typical" program
> - the best performance you can get in language X with large amounts of hand
> optimisation
>
> The people who care most about performance are generally talking about the
> second kind of performance.

Sometimes I'm in one group and sometimes in the other. When I'm in the
second group I'm not using C anyway, I'm dropping down into assembler.

-- Bruce

Dave Harris

Apr 7, 2000, 3:00:00 AM
gl...@cs.wisc.edu (Andy Glew) wrote:
> such thunking is a lot less expensive than dereferencing a pointer.

I think you mean, "more expensive". The point of the setter is to allow
arbitrary code to run. I don't see how you can achieve that without
nominally passing a function.

The runtime cost depends on whether the compiler can optimise it well. In
the C++ case, with suitable templates, I wouldn't be surprised if the
final code boiled down to a pointer dereference. Other languages likewise.
One approach would be for the compiler to make 2 versions of the increment
function behind the scenes, one taking a pointer and the other the more
general function. There isn't any need to expose the pointer version to
humans.

The need for this is pretty rare, though. Most of the time you want to
pass a full object anyway.


> (2) Provide transparent getter/setters, but don't allow pass by
> reference
> -- Eiffel

> [...]


> But, pass by reference is required for large linked
> data structures, and I think that allowing pass by reference
> for these but not for basic types like "int" would be
> a regrettable irregularity in a language definition.

I think you're getting confused. You can certainly write linked lists in
Eiffel. It does not treat "int" specially; it has a general mechanism for
what it calls expanded types. Also, Eiffel does not allow transparent
setters, only getters. Assignment is flat-out forbidden from outside the
class.


> Somewhere I read a comment, I think it was by Stroustrup
> talking about a conversation with the designer of the original
> C, or someone of that ilk, that said something like
> "C was successful because it had a machine model
> that corresponded well to actual hardware."

It was Stepanov, talking about significance of STL iterators as an
abstraction/generalisation of machine pointers.

I don't think it means we should look at what hardware can do efficiently,
and then force a use for it in high level languages.


> getter/setters are the *right* thing to do from a programmer's
> point of view - functional interception is the most general form
> - but the getter/setter breaks down in the presence of
> pass-by-reference, and pointers in general. And pass-by-reference
> and pointers in general correspond so exactly to CPU hardware, that
> it would be silly to do without them in a general purpose systems
> language.

Again I think this is confused. We need references to objects, but not
necessarily to variables. References to objects suffice for linked lists.

Dave Harris, Nottingham, UK | "Weave a circle round him thrice,
bran...@cix.co.uk | And close your eyes with holy dread,
| For he on honey dew hath fed
http://www.bhresearch.co.uk/ | And drunk the milk of Paradise."

Bruce Hoult

Apr 7, 2000, 3:00:00 AM
In article <8cfmcp$2...@spool.cs.wisc.edu>, "Andy Glew" <gl...@cs.wisc.edu> wrote:

> > In Dylan, when you write "slot name :: type;" in a class declaration it
> > automatically does everything you did in your macro, plus it's easier to
> > customize because you can instead write something like...
> >
> > slot name :: type,
> > getter: myName,
> > setter: #f;
> >
> > -- Bruce
>
>
> I'm familiar with languages that
> (1) allow a data member or a getter/setter to be syntactically
> indistinguishable - so you can start off with an exposed
> data member and then convert to a getter/setter without
> changing any code
> (2) implicit create getter/setters.
> E.g. Sather, Eiffel.

Dylan is (2).


> I'd accept Bertrand Meyer's dictum that this should be a basic
> principle of language design, except for one issue, which I have
> not managed to come to a satisfactory resolution for, and
> which I seek advice anywhere:
>
> * Pass by Reference *
> or, equivalently, pass by pointer.
>
> Let's talk about C-style pass by pointer, since C++ style
> pass by reference is just syntactic sugar about this.
>

> Say you have a function that modifies an argument
> passed by reference:
>
> void increment(int* x) { ++(*x); }
>
> If you have a raw data member, it can be passed by reference
> to this function
> struct Sclass { int i; int j; };
> struct Sclass sobj;
> increment(&sobj.i);
> increment(&sobj.j);
>
> If you want to make this data member private, then you can't take the
> address any longer. (Or, if you can take the address of the member, then
> it isn't very private.)

You're worried about the called function potentially keeping a copy of the
address?

I think there are only three solutions to this:

- copy in/copy out
- pass in a setter function and a getter function
- use a macro, rather than a function


I'll give examples of each of these using Dylan. I've tested all code
using the Gwydion project "d2c" compiler, but for the sake of brevity
I'll leave out definitions from later examples that are unchanged from
earlier ones...

Copy in/out is really the traditional style. e.g. in Dylan:

define inline method increment(x :: <integer>) => (result :: <integer>)
  x + 1;
end;

define class Sclass (<object>)
  slot i :: <integer>, init-value: 10;
  slot j :: <integer>, init-value: 13;
end Sclass;

define method test()
  let sobj = make(Sclass);
  sobj.i := increment(sobj.i);
  sobj.j := increment(sobj.j);
  sobj;  // return the object, not the result of the last assignment
end;

define method main(appname, #rest arguments)
  let sobj = test();
  format-out("i=%d j=%d\n", sobj.i, sobj.j);
  test1();
  exit(exit-code: 0);
end method main;


Note 1) "sobj.i" actually means "i(sobj)" and "sobj.i := foo" actually
means "i-setter(foo, sobj)" and the dot and := syntax is just a handy
shortcut.

Note 2) extending this to functions having multiple "reference" arguments
wouldn't be a problem because Dylan has multi-valued functions.

let (i, j, k) = mutate(i, j, k);


The second technique of explicitly passing in an object reference plus a
getter and setter for the correct field is a bit simpler in Dylan than in
C++:

define inline method increment(obj, field, field-setter) => ()
  obj.field := obj.field + 1; // means: field-setter(field(obj) + 1, obj)
end;

define method test()
  let sobj = make(Sclass);
  increment(sobj, i, i-setter);
  increment(sobj, j, j-setter);
  sobj;
end;

On the Dylan compiler I'm using, each (inlined) call to increment gets
optimised to a couple of machine instructions.

With this technique it is passing in a variable that is not a field of an
object that is a little bit messy:

define method test1()
  let v = 100;
  increment(
    #f, // pass any old object
    method (obj) v end,
    method (newVal, obj) v := newVal end
  );
  format-out("v=%d\n", v);
end;


A slightly different approach would be to explicitly pass in just two
items, a getter and a setter function for the exact item. This make both
field and non-field cases equally messy :-):

define method increment(get, set) => ()
  set(get() + 1)
end;

define method test()
  let sobj = make(Sclass);
  increment(curry(i, sobj), rcurry(i-setter, sobj));
  increment(curry(j, sobj), rcurry(j-setter, sobj));
  sobj;
end;

define method test1()
  let v = 100;
  increment(
    method () v end,
    method (newVal) v := newVal end
  );
  format-out("v=%d\n", v);
end;

In Dylan, the macro solution looks like this:

define macro increment
  { increment( ?var:expression ) }
    => { ?var := ?var + 1 }
end;

define method test()
  let sobj = make(Sclass);
  increment(sobj.i);
  increment(sobj.j);
  sobj;
end;

define method test1()
  let v = 100;
  format-out("v=%d\n", increment(v));
end;

Dylan macros are syntactically and semantically safe, so this is a much
better alternative than the equivalent in C. If ?var was used more than
once on the RHS then you could safely introduce a temp variable, and so
forth.

Another use of macros (perhaps a better one) would be to encapsulate and
simplify the use of one of the non-macro solutions.


> The bottom line, at least insofar as I can tell, is that
> attempting to transparently substitute a getter/setter for a data member
> fails in the presence of pass by reference.

The only thing that really failed (apart from the large amount of work you
had to do and the ugly syntax) was that you had to make "increment" into a
template function.


> There appear to be only a limited number of possible choices:
>
> (1) Don't provide transparent getter/setters --- C++

> (2) Provide transparent getter/setters, but don't allow pass by reference
> -- Eiffel

> (3) Provide both implicit getter/setters and pass by reference;
> situations where a passed by reference data member is changed
> to a getter/setter access are flagged as compile-time errors
> This is not totally transparent, but it is good enough
> for 90% of all situations.
> I think Ada is in this camp
> (4) Use a language that implicitly supports the sort of thunking
> I loosely describe above. E.g. LISP or Smalltalk.

Also Dylan.


> For a time I was seriously interested in (2) approaches,
> replacing pass by reference by copy-in/out.

> But, pass by reference is required for large linked
> data structures,

Why? Are you sure that you're not confusing passing by reference and
passing a reference by value? The difference lies in whether or not
assigning to the formal parameter changes the value of the actual
argument.

LISP-family languages (for example) are rather well known for their use in
manipulating large linked data structures, and yet pass-by-value is all
they have.


> and I think that allowing pass by reference
> for these but not for basic types like "int" would be
> a regrettable irregularity in a language definition.

This is confusing semantics and implementation. It may also be confusing
names (variables) and values (objects).

In languages such as Dylan and Scheme all values are conceptually (but not
necessarily in actual implementation) objects floating around on the heap
and all variables (names) are references to these objects. You can always
rebind (assign) a variable to refer to a different object using ":=" in
Dylan or "set!" in Scheme, but calling a function with the variable as an
argument can never result in that variable being bound to a different
object. What the called function *might* do is internally change the
value of the object. Some types of objects (mutable ones) can be updated
in this way, and some (immutable ones) can't. Integers and floats and
booleans and chars are immutable objects -- you don't change the value of
"3" when you add "2" to it, you instead (conceptually) create a *new*
object with the value "5".

Java is almost, but not quite, like this too.

This is in obvious contrast to languages such as FORTRAN that really do
use pass-by-reference. Everyone knows the classic FORTRAN:

subroutine f(n)
n = n + 1
end

call f(3)
if (3 .eq. 4) bigtrouble


One upshot of all this is that if the compiler knows that a variable is of
an immutable type then it doesn't *have* to pass around references to the
values to which it refers -- if the type is immutable then you can unbox
it and copy it from place to place and no one can tell the difference
because no one ever tries to modify it. If you need to pass it to a
function which doesn't already know its type then you have to box it up at
that point, but then that's life.

For some reason the designers of Java didn't seem to know this and as a
result we are forever stuck with the ugly as sin "int"/"Integer" dichotomy
and every programmer has to box and unbox their integers explicitly (and
typical Java programs are *riddled* with it).


> Anyway.... I keep hoping that there is some magic way
> out of this.

I recommend copy in/out with multi-valued functions (if you need more than
one by-reference argument) and a nice little macro to wrap it if you don't
like typing the same field reference on both sides of the assignment.

In the absence of multi-valued functions in C++ you might want to try
putting the "by-reference" arguments in a struct, and wrapping the whole
thing in a macro:

struct foo_args {
    int a, b, c;
    foo_args(int a, int b, int c) : a(a), b(b), c(c) {}
};

void foo_impl(foo_args &args)
{
    args.a = args.b + args.c;
}

#define foo(A,B,C) \
    {foo_args args(A,B,C); \
    foo_impl(args); \
    A=args.a; B=args.b; C=args.c;}


> Somewhere I read a comment, I think it was by Stroustrup
> talking about a conversation with the designer of the original
> C, or someone of that ilk, that said something like
> "C was successful because it had a machine model
> that corresponded well to actual hardware."

Well, that certainly made the mapping from C code to machine code easy for
even the most basic compiler to do efficiently.


> I think this applies here:

> getter/setters are the *right* thing to do from a programmer's
> point of view - functional interception is the most general form

I agree.


> - but the getter/setter breaks down in the presence of pass-by-reference,
> and pointers in general. And pass-by-reference and pointers in
> general correspond so exactly to CPU hardware, that it would
> be silly to do without them in a general purpose systems language.

> (Even Java's references are only slightly adorned pointers.)

- don't provide/use pass by reference

- don't confuse references to values with pointers. They are not the same
thing. A reference is a high level conceptual thing. A pointer is a low
level implementation detail. A reference *might* well end up in the
machine code as a pointer, but it might not too. Just as a for-instance,
a pointer to something implies that the something has an address, which
implies that it is in memory. But modern machines have more and more
registers, keeping things in them is good, and registers don't have
addresses.

-- Bruce
