Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Final kick at the "Merced Guessing Game"

8 views
Skip to first unread message

Paul DeMone

unread,
Sep 30, 1997, 3:00:00 AM9/30/97
to

While reading through an old (Dec 1995) issue of Proceedings of the IEEE the
other day while waiting for an HSPICE run to finish I came across a remarkable
paper entitled "Compiler Technology for Future Microprocessors". The authors
(W. Hwu, R. Hank, D. Gallagher, S. Mahlke, D. Lavery, G. Haab, J. Gyllenhaal,
and D. August) are associated with the University of Illinois but the research
seems to have strong ties with HP.

The primary thrust of the paper was to describe compiler algorithms and hardware
mechanisms that could be used to get around the problem of branches and false
dependencies in gathering enough instruction level parallelism for an eight issue
RISC-like machine. The paper showed how predication (kind of like condition codes
that can be used to permit conditional execute any instruction, a generalization of
what ARM does) can be used in conjunction with loop unrolling and basic block
oalesing to dramatically decrease the number of conditional branches that need to
be predicted/executed (in one example mispredicted branches using the 2-bit branch
prediction model were reduced by almost three orders of magnitude within a complex
inner loop).

The authors give examples of how an eight issue processor with full predication
could give a remarkable improvement (1.2x to 4.5x) over an eight issue design with a
non-predicated ISA. Their description of their architecture strikes me as eerily
similar to some predictions about the Intel IA-64 Merced design so I will repeat it:

"The processor model used for these experiments is an eight-issue extension of
the HP PA-RISC[TM] architecture. The processor may issue eight instructions
of any type each cycle except branches, which are restricted to one per cycle.
The processor has 64 integer, 64 floating-point, and 64 predicate registers."

The design would seem to effectively avoid the complexity and cycle time impact of
the paraphenalia of aggressive out-of-order designs such as register renaming and
in-order completion unit. The paper gives brief descriptions of advanced compiler
techniques to transform around loops (essentially static compile time register re-
naming) and also discusses how to incorporate memory data dependence analysis (used
by the Multiflow Trace scheduling compiler for the TRACE series of VLIW computers)
for another significant speedup.

HP and Intel will disclose the native Merced instruction set (IA-64) in a couple of
weeks but if they exploit the advances described in this paper it could really be
something with the potential for dramatically raising the bar of microprocessor
performance and not just typical Intel hype and FUD. Comments?


All opinions strictly my own.

--
Paul W. DeMone The 801 experiment SPARCed an ARMs race
Kanata, Ontario to put more PRECISION and POWER into
dem...@mosaid.com architectures with MIPSed results but
PaulD...@EasyInternet.net ALPHA's well that ends well.


Guy Harris

unread,
Sep 30, 1997, 3:00:00 AM9/30/97
to

Paul DeMone <PaulD...@EasyInternet.net> wrote:
>While reading through an old (Dec 1995) issue of Proceedings of the IEEE the
>other day while waiting for an HSPICE run to finish I came across a remarkable
>paper entitled "Compiler Technology for Future Microprocessors".

A Web search for that title found a page at

http://www.cs.colorado.edu/~klauser/ilp/ilp.html

which is a reading list of papers in instruction-level parallelism. It
has a link to a PostScript version of a paper with that title, at:

ftp://ftp.crhc.uiuc.edu/pub/IMPACT/journal/ieeepro.future.95.ps

although in a quick look I didn't see the comment you quoted aabout the
processor model they used, so maybe this is an earlier version of the
paper.

The page cited above does have a number of papers from HP Labs....

> The design would seem to effectively avoid the complexity and cycle time
> impact of the paraphenalia of aggressive out-of-order designs such as
> register renaming and in-order completion unit.

So are they going to blow off making Merced run IA32 code really really
really fast, leaving that up to future IA32 implementations (perhaps on
the theory that Merced will be used mainly in machines that don't have
to run IA32 stuff really really really fast, just "fast enough"), and
maybe require software to do binary-to-binary translation of IA32
programs if you want them to run really really really fast? (I.e., it
doesn't have to run Windows 98 or Office 97 really really really fast,
as a P6-generation or next-generation-after-P6 machine can handle that
job, it just has to run big database XXX or big server YYY or big
number-crunching program ZZZ really really really fast, and those will
be compiled into native code?)
--
Reply, or follow up, but don't do both, please.

postmaster@localhost
postmaster@[127.0.0.1]

Nick Maclaren

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

In article <3431A2...@EasyInternet.net>, Paul DeMone <PaulD...@EasyInternet.net> writes:
|>
|> HP and Intel will disclose the native Merced instruction set (IA-64) in a couple of
|> weeks but if they exploit the advances described in this paper it could really be
|> something with the potential for dramatically raising the bar of microprocessor
|> performance and not just typical Intel hype and FUD. Comments?

It will have little or no practical effect, because the dominating
languages and programming styles have one unpredictable branch or
data indirection every few instructions. Until and unless C, C++,
most windowing systems and similar tripe are scrapped, almost all
systems will run at a miniscule fraction of their potential CPU
performance. A quick look at Java implies that it probably has the
same problems, but I cannot swear to it.


Nick Maclaren,
University of Cambridge Computer Laboratory,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email: nm...@cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679

J. Corbally

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

> It will have little or no practical effect, because the dominating
> languages and programming styles have one unpredictable branch or
> data indirection every few instructions. Until and unless C, C++,
> most windowing systems and similar tripe are scrapped, almost all
> systems will run at a miniscule fraction of their potential CPU
> performance. A quick look at Java implies that it probably has the
> same problems, but I cannot swear to it.
>
> Nick Maclaren,
> University of Cambridge Computer Laboratory,
> New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
> Email: nm...@cam.ac.uk
> Tel.: +44 1223 334761 Fax: +44 1223 334679

I'm sure you have some new hyper-advanced programming language in the
waiting that will realise this "potential CPU performance", and obsolete
all other languages. Yeah, right.....

Please explain yourself....

Regards,


James.....


--
All opinions expressed herein are my own personal observations
and DO NOT represent the views of my past, current, or future
employers. All opinions included herein due to quoting are
the personal opinions of their respective posters, and no
implication of agreement or endorsement on my part shall be
inferred in them.

Remove *NO_SPAM* to enable mailing

Ketil Z Malde

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

nm...@cus.cam.ac.uk (Nick Maclaren) writes:

> If you take a close look at C++, the X Windowing System, the various
> IBM and Microsoft windowing systems etc., you will see that an
> 'object' is implemented as a data structure and you invoke a method
> by picking an address up from that structure and branching to it.

Right - and in Pascal, Algol, Cobol, and all those other superior
languages, function calls are made without branching? Which is why
Pascal and Cobol is commonly used when performance is important? Right?

And since programs written in C runs at a ``miniscule fraction of their
potential CPU performance'', could you please inform me which Pascal,
Algol, Cobol or Modula-[23] compiler that create such vastly improved
speed of execution? And - how much faster can I expect applications to
run?

HIBT?

~kzm
--
If I haven't seen further, it is by standing in the footprints of giants

Nick Maclaren

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

In article <60t42r$4...@scnews.sc.intel.com>, "J. Corbally" <icorb*NO_SPAM*@*NO_SPAM*indigo.ie> writes:
|> > It will have little or no practical effect, because the dominating
|> > languages and programming styles have one unpredictable branch or
|> > data indirection every few instructions. Until and unless C, C++,
|> > most windowing systems and similar tripe are scrapped, almost all
|> > systems will run at a miniscule fraction of their potential CPU
|> > performance. A quick look at Java implies that it probably has the
|> > same problems, but I cannot swear to it.
|>
|> I'm sure you have some new hyper-advanced programming language in the
|> waiting that will realise this "potential CPU performance", and obsolete
|> all other languages. Yeah, right.....
|>
|> Please explain yourself....

Whether I do or not is irrelevant, as all of the Fortrans, Cobols,
Algols, Pascals, Modulas etc. are streets better. So, interestingly
enough, are 'object oriented' systems like Smalltalk and (I believe)
Eiffel - but they have other problems.

If you take a close look at C++, the X Windowing System, the various
IBM and Microsoft windowing systems etc., you will see that an
'object' is implemented as a data structure and you invoke a method
by picking an address up from that structure and branching to it.

Furthermore, most of the important data is indirected through multiple
levels of data structure.

When I first described this to a hardware designer, back in the early
days of the latest RISC hype, he said "But nobody writes code like
that!" When I said that the C++ and X Windowing System did little
else, he said "Well, people won't use them then."

Tom Thornhill

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

Paul DeMone <PaulD...@EasyInternet.net> wrote in article <3431A2...@EasyInternet.net>...

> While reading through an old (Dec 1995) issue of Proceedings of the IEEE the
> other day while waiting for an HSPICE run to finish I came across a remarkable
> paper entitled "Compiler Technology for Future Microprocessors". The authors
> (W. Hwu, R. Hank, D. Gallagher, S. Mahlke, D. Lavery, G. Haab, J. Gyllenhaal,
> and D. August) are associated with the University of Illinois but the research
> seems to have strong ties with HP.
>
> The primary thrust of the paper was to describe compiler algorithms and hardware
> mechanisms that could be used to get around the problem of branches and false
> dependencies in gathering enough instruction level parallelism for an eight issue
> RISC-like machine. The paper showed how predication (kind of like condition codes
> that can be used to permit conditional execute any instruction, a generalization of
> what ARM does) can be used in conjunction with loop unrolling and basic block
> oalesing to dramatically decrease the number of conditional branches that need to
> be predicted/executed (in one example mispredicted branches using the 2-bit branch
> prediction model were reduced by almost three orders of magnitude within a complex
> inner loop).
>
> The authors give examples of how an eight issue processor with full predication
> could give a remarkable improvement (1.2x to 4.5x) over an eight issue design with a
> non-predicated ISA. Their description of their architecture strikes me as eerily
> similar to some predictions about the Intel IA-64 Merced design so I will repeat it:
>
> "The processor model used for these experiments is an eight-issue extension of
> the HP PA-RISC[TM] architecture. The processor may issue eight instructions
> of any type each cycle except branches, which are restricted to one per cycle.
> The processor has 64 integer, 64 floating-point, and 64 predicate registers."
>
> The design would seem to effectively avoid the complexity and cycle time impact of
> the paraphenalia of aggressive out-of-order designs such as register renaming and
> in-order completion unit. The paper gives brief descriptions of advanced compiler
> techniques to transform around loops (essentially static compile time register re-
> naming) and also discusses how to incorporate memory data dependence analysis (used
> by the Multiflow Trace scheduling compiler for the TRACE series of VLIW computers)
> for another significant speedup.
>
> HP and Intel will disclose the native Merced instruction set (IA-64) in a couple of
> weeks but if they exploit the advances described in this paper it could really be

Do you have a date?

> something with the potential for dramatically raising the bar of microprocessor
> performance and not just typical Intel hype and FUD. Comments?
>
>

> All opinions strictly my own.
>
> --
> Paul W. DeMone The 801 experiment SPARCed an ARMs race
> Kanata, Ontario to put more PRECISION and POWER into
> dem...@mosaid.com architectures with MIPSed results but
> PaulD...@EasyInternet.net ALPHA's well that ends well.
>
>

I think you're onto something here - I would expect that if Intel introduce
a new architecture they'd try to ensure that relatively simple implementations
of it ( ie those without renaming and out of order execution ), outperform
agressive implementaions of Alpha AXP ( 21264 ? ). This gives them something
to scale towards.

Also since they've waited for so long before replacing IA32, you'd expect
them to come up with something other than an Alpha clone.

--
Mailto: ttho...@best.ms.philips.com-nospam ( remove the -nospam before replying )
Talkto: Tom Thornhill, Philips Medical Systems Nederland BV
Tel +31 4027 64080 (Work), +31 499 338211 (Hotel - after 5pm UK time)


Tom Thornhill

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

Guy Harris <g...@netapp.com> wrote in article <60slvu$3...@tooting.netapp.com>...

>
> So are they going to blow off making Merced run IA32 code really really
> really fast, leaving that up to future IA32 implementations (perhaps on
> the theory that Merced will be used mainly in machines that don't have
> to run IA32 stuff really really really fast, just "fast enough"), and
> maybe require software to do binary-to-binary translation of IA32
> programs if you want them to run really really really fast? (I.e., it
> doesn't have to run Windows 98 or Office 97 really really really fast,
> as a P6-generation or next-generation-after-P6 machine can handle that
> job, it just has to run big database XXX or big server YYY or big
> number-crunching program ZZZ really really really fast, and those will
> be compiled into native code?)

Surely for IA32 code it would be possible to do on the fly translation
a la the PPro from instructions fetched from memory to the instructions
executed by the pipeline(s). I suppose that moving complexity out of
the pipeline just means that you have to do more work on the translation,
but that work will already have been done for Willamette (sp?), assuming
that uses PPro-style decoupled architecture.

I suppose you could have a kind of modular design with the same pipeline
design in both chips but with a highly optimised ( and big )
translation unit in Willamette feeding a few pipelines, but a less optimised
and smaller translation unit feeding more pipelines in Merced.

That way you could have a modular design that could be tuned for executing
translated IA32 or native IA64. Eventually you can lose the translation unit
completely and move to an IA64 only architecture at a higher clock speed.
You could then scale this by adding superscalar execution for the VLIW instructions
( is this possible )

Hmm, maybe I should learn VHDL and find out how practical all this stuff is.

> --
> Reply, or follow up, but don't do both, please.
>
> postmaster@localhost
> postmaster@[127.0.0.1]
>


--
***If you can understand this then your brain needs servicing URGENTLY***

Richard A. Schumacher

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

[...]

>The authors give examples of how an eight issue processor with full predication
>could give a remarkable improvement (1.2x to 4.5x) over an eight issue design with a
>non-predicated ISA. Their description of their architecture strikes me as eerily
>similar to some predictions about the Intel IA-64 Merced design so I will repeat it:

[...]

>HP and Intel will disclose the native Merced instruction set (IA-64) in a couple of
>weeks but if they exploit the advances described in this paper it could really be

>something with the potential for dramatically raising the bar of microprocessor
>performance and not just typical Intel hype and FUD. Comments?


Yes. The label reads "Intel", but it's HP inside.

--
-----------------------------------------------
I do not speak in any capacity for the
Convex Division of the Hewlett Packard Company.

Michael D. Nahas

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

Ketil Z Malde wrote:

> nm...@cus.cam.ac.uk (Nick Maclaren) writes:
>
> > If you take a close look at C++, the X Windowing System, the various
> > IBM and Microsoft windowing systems etc., you will see that an
> > 'object' is implemented as a data structure and you invoke a method
> > by picking an address up from that structure and branching to it.
>

> Right - and in Pascal, Algol, Cobol, and all those other superior
> languages, function calls are made without branching? Which is why
> Pascal and Cobol is commonly used when performance is important? Right?
>
> And since programs written in C runs at a ``miniscule fraction of their
> potential CPU performance'', could you please inform me which Pascal,
> Algol, Cobol or Modula-[23] compiler that create such vastly improved
> speed of execution? And - how much faster can I expect applications to
> run?

I'm afraid that I must admit that Maclaren, even with his outrageous
claim, is correct. Malde has missed the problem - it is not branching,
but indirection, A.K.A. pointer chasing.

For a standard function, you just branch to the address. With C++'s
virtual
functions you either follow a pointer or perform an indexed lookup to a
jump
table, which will load the address you will jump to, and then branch to
that
address.

Why is indirection bad? When the destination address of a jump is known,
the processor
can begin fetching from the new address and continue executing. When the
destination address of the jump is unknown, the processor usually stalls
until
the address comes in from memory. Some processors do have hardware to
speed up indirect jumps, but the hardware was developed using C benchmarks,

where 90% of indirect jumps are just returns from functions.

The truely evil part of indirection is not in function calls, but in the
non-array
data structures of most programs. Think of traversing a linked list. To
load the
first node in the list, it takes as much time as the memory latency. To
load the
second node, you must first load the first node, and then pay the cost of
the
memory latency time again. So to traverse a N node linked list, it takes N
times
the memory latency.

For most cases, the nodes are contiguous in memory and most programs run
inside the L1 and L2 cache, so these latencies are not too bad for most
programs.
However, programs which do not fit in large L2 caches, such as sparse
matrix
programs and the X Windows system, pay heavily. The latency time of going
to
main memory is growing exponentially in terms of CPU cycles.

So, is this a problem, and if so, how do we fix it. Yes, it is a problem,
because any
stall to the pipeline which prevents speculative execution is bad. How do
we
solve it? The first thing to do is to make programs which have this
problem part
of the SPEC Benchmarks. This will make it the focus of everyone's
attention.
The other thing that can be done is to look at Compiler optimizations -
many of
the indirect function calls can be eliminated, especially if all the source
is compiled
together. Separate compilation of object files requires general interfaces
which
prevent many optimizations.

Also, a note on Java. All Java functions are virtual and all require a
general
interface on a class level. Lastly, all objects are referenced indirectly,
so even an
array of Java objects is two levels of indirection to get the object you
want! So,
Java is even worse than C++.

Mike

As always, comments and criticism welcome.

--
/********************************************************\
* Michael David Nahas na...@virginia.edu *
* Graduate Student in CS University of Virginia *
\********************************************************/


bri...@bit3.com

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

In article <34326556...@virginia.edu>,

Michael D. Nahas <na...@virginia.edu> wrote:
>I'm afraid that I must admit that Maclaren, even with his outrageous
>claim, is correct. Malde has missed the problem - it is not branching,
>but indirection, A.K.A. pointer chasing.
>
>For a standard function, you just branch to the address. With C++'s
>virtual
>functions you either follow a pointer or perform an indexed lookup to a
>jump
>table, which will load the address you will jump to, and then branch to
>that
>address.
>
>Why is indirection bad? When the destination address of a jump is known,
>the processor
>can begin fetching from the new address and continue executing. When the
>destination address of the jump is unknown, the processor usually stalls
>until
>the address comes in from memory. Some processors do have hardware to
>speed up indirect jumps, but the hardware was developed using C benchmarks,
>
>where 90% of indirect jumps are just returns from functions.

The problem is not the double indirection- which can be handled if the CPU
prefetchs data as well as instructions- but the fact that you are doing two
memory accesses now instead of one, and usually widely seperate memory
accesses (most implementations don't AFAIK actually keep the virtual
function tables with the class data, but in a seperate structure shared by
all objects of the same type- which is usually somewhere far away from
where the data ends up). This is a symptom of the fact that memory has not
increased in speed the same way CPUs have (I put 70ns memory into my 386-
granted, no FPM or EDO, but still... And the 386 still needed cache).

The only thing more painfull on performance than leaving cache and waiting on
main memory is leaving main memory and swapping pages into disk. Compilers
(and programmers!) should worry more about locality of reference.

If you are going to blow cache, the question arises "is it worth it?". For
sparse matricies, the answer is obviously yes (the alternative is to swap
like mad)- although the matrix solving programs should be written to be
cache aware, and keep locality of reference up. For virtual functions,
I think the benefits outweigh the gains, although it depends critically on
what you're doing. I note the Fortran is still one of the two best languages
for doing numeric computation in- only recently have there been C compilers
as good as Fortran compilers (and even in these cases you often find they
share an optimizer with a Fortran compiler- examples include SGI's MipsPro
compilers, and HP's compilers).

I don't speak for Bit 3.
Brian


--
"NT and security should never be used in the same breath."
- Winn Schwartau, EETimes #972, 22 September 1997 pp.96

-Brian

Stefan Monnier

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

"Michael D. Nahas" <na...@virginia.edu> writes:
> I'm afraid that I must admit that Maclaren, even with his outrageous
> claim, is correct. Malde has missed the problem - it is not branching,
> but indirection, A.K.A. pointer chasing.

I believe everyone here is aware of the fact that indirection is *the*
problem. The question was "why should indirection be more of an issue with
C/C++ than with Pascal, Algol, ... ?"


Stefan

Nick Maclaren

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

In article <5l90wd2...@tequila.systemsz.cs.yale.edu>,

In a word - "aliasing". With multiple levels of indirection, a conforming
C compiler has to assume that almost every location can be aliased, and
therefore must make a store reference every time. Almost all fancy
optimisations that are possible in those languages will cause at least
some legal C programs to fail. "const" helps a little, and "restrict"
(in C9X) will help a bit more, but at best will bring C up to the very
low level of Fortran. And I am not optimistic about even that.

Michael Nahas also correctly points out that C++, X, Java etc. have MORE
indirections than the equivalent Fortran or 'ordinary' C. This is a matter
of interface style as much as the underlying language, but that doesn't
help at all. We are stuck with both of them.

To respond to the previous posting by bri...@bit3.com, the reason that
the multiple indirection is nasty is precisely because it causes trouble
with data prefetching! Consider A->B->C->D->E. At each stage, the CPU
has to perform the address calculation before it can start prefetching
the next address. Now, this very simple case CAN be done by using a
multi-threaded architecture, with prefetching threads running ahead to
preload the next operands.

But what about (P = A->B ? (Q = P->C ? Q->D : P->E) : A->F)? To handle
THAT, you need speculative pre-execution, which was dabbled with in the
1960s and 1970s, and quite rightly dropped as an abomination. And, if
you look at the source of the X Windowing System, you will see code that
is pretty well equivalent :-(

For efficient use of modern architectures, we need languages in which we
programmers can express what we want to say in a fashion that can be
converted to the sort of instruction stream that executes fast. And
the current trends are in exactly the opposite direction.

P Murray

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

Nick Maclaren (nm...@cus.cam.ac.uk) wrote:

: For efficient use of modern architectures, we need languages in which we


: programmers can express what we want to say in a fashion that can be
: converted to the sort of instruction stream that executes fast. And
: the current trends are in exactly the opposite direction.

Any suggestions on what such a language would look like? Hopefully
it will not look like ASM. Would it be possible to modify C/C++ in
such a way as to allow this?

And finally, could anyone recommend ways to code in C/C++ that
avoid these problems and allow the optimiser to produce the
most efficient code? I assume virtual function calls would
be out, as well as all forms of aliasing.

Cheers,
- Peter.

Nick Maclaren

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

In article <60ui2q$33a$1...@cdn-news.telecom.com.au>,

P Murray <p...@vus002.telecom.com.au> wrote:
>Nick Maclaren (nm...@cus.cam.ac.uk) wrote:
>
>: For efficient use of modern architectures, we need languages in which we
>: programmers can express what we want to say in a fashion that can be
>: converted to the sort of instruction stream that executes fast. And
>: the current trends are in exactly the opposite direction.
>
>Any suggestions on what such a language would look like? Hopefully
>it will not look like ASM. Would it be possible to modify C/C++ in
>such a way as to allow this?

Well, I would favour something using the more successful ideas of
Algol 68 :-) But it would have to include some of the concepts of the
stricter functional languages.

>And finally, could anyone recommend ways to code in C/C++ that
>avoid these problems and allow the optimiser to produce the
>most efficient code? I assume virtual function calls would
>be out, as well as all forms of aliasing.

Yes. And, if you can code C without implying aliasing, I take my hat
off to you! The biggest problem with attempting this is that you can
only 'const qualify' at the top level and, for efficient code-generation,
you need to do it at other levels. I believe that the new 'restrict' in
C9X is a sort of recursive 'const', but have not seen a specification.

John Wilson

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

In article <60uek7$fon$1...@lyra.csx.cam.ac.uk>,

Nick Maclaren <nm...@cus.cam.ac.uk> wrote:
>For efficient use of modern architectures, we need languages in which we
>programmers can express what we want to say in a fashion that can be
>converted to the sort of instruction stream that executes fast. And
>the current trends are in exactly the opposite direction.

Luckily we still have good ol' assembly language!

Sorry, couldn't resist -- the paragraph above nicely supports my
personal prejudices though. :-)

John Wilson
D Bit

Paul DeMone

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

Nick Maclaren wrote:
>
> In article <3431A2...@EasyInternet.net>, Paul DeMone <PaulD...@EasyInternet.net> writes:
> |>
> |> HP and Intel will disclose the native Merced instruction set (IA-64) in a couple of
> |> weeks but if they exploit the advances described in this paper it could really be
> |> something with the potential for dramatically raising the bar of microprocessor
> |> performance and not just typical Intel hype and FUD. Comments?
>
> It will have little or no practical effect, because the dominating
> languages and programming styles have one unpredictable branch or
> data indirection every few instructions. Until and unless C, C++,
> most windowing systems and similar tripe are scrapped, almost all
> systems will run at a miniscule fraction of their potential CPU
> performance. A quick look at Java implies that it probably has the
> same problems, but I cannot swear to it.

I thought much along the same line until I read this paper. some sample code is
the inner loop from Unix utilities. These are really twisty C programs chosen for
their branch-ugliness, i.e. the poor way conventional compilers and machines can
handle them. The job their compiler does on the inner loop of the wc (word count)
program is amazing. It eliminates most branches by replicating basic blocks and
using predication to coalesce around branches and form larger basic blocks. The
combination of loop unrolling and formation of hyperblocks allows a good deal of
instruction level parallelism to be extracted. In the case of wc the use of pred-
ication reduced dynamic branch execution from 572K to 315K while reducing branch
misprediction using the 2-bit scheme from 52K to 56. On the eight issue target CPU
wc was sped up by 5.5x versus a single issue RISC CPU while grep was sped up 4x.
The SPEC92 programs espresso, eqntott, alvinn, ear, and sc were sped up 3.3x, 5.5x,
7.7x, and 5.8x respectively versus a single issue RISC.

Considering that the three issue P6/Pentium II and quad issue RISCs like R10K and
Alpha 21164 probably average only about 2.5x (or less) speed up versus a single
issue RISC the performance of this new architecture with its advanced compiler is
damn impressive! Don't forget that this new design likely can be implemented without
all the overhead of out-of-order machines like P6 and R10K and so can be clocked
much, much faster (maybe even faster than Alpha in the same process). Granted some
of the apparent speedup of the new design will be lost from main memory latency and
bandwidth limitations or icache inefficiency from code bloat but it still could be
a significant step forward.

>
> Nick Maclaren,
> University of Cambridge Computer Laboratory,
> New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
> Email: nm...@cam.ac.uk
> Tel.: +44 1223 334761 Fax: +44 1223 334679

--

Paul DeMone

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

Tom Thornhill wrote:
[snip]

> > HP and Intel will disclose the native Merced instruction set (IA-64) in a couple of
> > weeks but if they exploit the advances described in this paper it could really be
>
> Do you have a date?

Approximately 1:00 pm Tuesday Oct 14 (at the 1997 Microprocessor Forum,
Fairmont Hotel in San Jose).

[snip]


> I think you're onto something here - I would expect that if Intel introduce
> a new architecture they'd try to ensure that relatively simple implementations
> of it ( ie those without renaming and out of order execution ), outperform
> agressive implementaions of Alpha AXP ( 21264 ? ). This gives them something
> to scale towards.
>
> Also since they've waited for so long before replacing IA32, you'd expect
> them to come up with something other than an Alpha clone.

Compiler technology and architecture seems to have advanced significantly
since Alpha was designed (itself essentially a classic RISC but streamlined
for superscalar and out-of-order implementations) so Intel would have to
stumble badly not to be able to get some degree of advantage.

>
> --


> Mailto: ttho...@best.ms.philips.com-nospam ( remove the -nospam before replying )
> Talkto: Tom Thornhill, Philips Medical Systems Nederland BV
> Tel +31 4027 64080 (Work), +31 499 338211 (Hotel - after 5pm UK time)

--

Allen Kim

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

Nick Maclaren wrote:
>
> But what about (P = A->B ? (Q = P->C ? Q->D : P->E) : A->F)? To handle
> THAT, you need speculative pre-execution, which was dabbled with in the
> 1960s and 1970s, and quite rightly dropped as an abomination. And, if
> you look at the source of the X Windowing System, you will see code that
> is pretty well equivalent :-(
>
> For efficient use of modern architectures, we need languages in which we
> programmers can express what we want to say in a fashion that can be
> converted to the sort of instruction stream that executes fast. And
> the current trends are in exactly the opposite direction.

Despite the fact that you guys are coming pretty close to speaking pure
"Greek" here (or is it "geek"?), I'll try and add a little to the
discussion.

My impression with Merced and VLIW-like architectures which move branch
optimization out to the compiler is that instructions like the one above
can be solved by very advanced techniques which effectively turn pieces
of code into "trees" which each calculate a number of possible outcomes
at once. Now I know that with that statement, I probably distorted the
actual concept, but I'm hoping you'll understand. Anyway, trees like
this would involve multiple memory accesses at once, which could be the
potential bottleneck unless some very sophisticated caching and bus
transaction mechanisms are developed.

See this link for more info:

http://www.research.ibm.com/vliw/home.html

Disclaimer: I don't speak officially for Intel, and I don't work on the
Merced processor itself. I *do* work on the chipset which will support
Merced, but that doesn't mean I know the intricate details of the
processor.

--
Allen Kim
alle...@scic.intel.com

Jim Hull

unread,
Oct 1, 1997, 3:00:00 AM10/1/97
to

> is Tom Thornhill
> > is Paul DeMone

> > HP and Intel will disclose the native Merced instruction set (IA-64)

> > in a couple of weeks ...

> Do you have a date?

October 14, at Microprocessor Forum.

For details, check:

http://www.MDRonline.com/mpf

--
Jim Hull

"If so strong the force in Yoda is, construct a sentence with words in
the proper order then why can't he?" -- Teng-Kiat Lee

Vesa Karvonen

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

Nick Maclaren wrote:
> In article <60ui2q$33a$1...@cdn-news.telecom.com.au>,
> P Murray <p...@vus002.telecom.com.au> wrote:
> >Nick Maclaren (nm...@cus.cam.ac.uk) wrote:
> >
> >: For efficient use of modern architectures, we need languages in which we

> >: programmers can express what we want to say in a fashion that can be
> >: converted to the sort of instruction stream that executes fast. And
> >: the current trends are in exactly the opposite direction.
> >
> >Any suggestions on what such a language would look like? Hopefully
> >it will not look like ASM. Would it be possible to modify C/C++ in
> >such a way as to allow this?
>
> Well, I would favour something using the more successful ideas of
> Algol 68 :-) But it would have to include some of the concepts of the
> stricter functional languages.
>
> >And finally, could anyone recommend ways to code in C/C++ that
> >avoid these problems and allow the optimiser to produce the
> >most efficient code? I assume virtual function calls would
> >be out, as well as all forms of aliasing.
>
> Yes. And, if you can code C without implying aliasing, I take my hat
> off to you! The biggest problem with attempting this is that you can
> only 'const qualify' at the top level and, for efficient code-generation,
> you need to do it at other levels. I believe that the new 'restrict' in
> C9X is a sort of recursive 'const', but have not seen a specification.

Being an assembly language programmer before learning C, I used
to think a bit like you are thinking after seeing the bad code
produced by aliasing problems and the like in C. However, after
reading "The C++ Programming Language" (second edition) by
Bjarne Stroustrup, and seeing that he actually tells you which
language constructs generate extra overhead, I have to disagree
about leaving language features out simply because they are "bad
for the optimizer/processor"; you can always tell the programmers
what is bad for the optimizer and the processor. Leaving out
useful language constructs, like virtual functions, will only
make the language hard to use, by modern standards, and restricts
the useability of the language by making certain applications
too difficult to implement in the language.

With the exception of aliasing problems and problems with
restructuring floating point expressions, I see very little
room for defining a "faster" C (for single processor systems
anyway).

The "restrict" qualifier would/will fix the aliasing problem
in C. Of course, the problem is that a lot of old code has to
be modified, which is usually rather costly.

It might not even be necessary to have "restrict", since
some of the worse aliasing problems could also be fixed by
specifying some kind of syntax for parallel execution of
statements/expressions. Such a syntax could come very handy
in vector code, for instance.

Most optimizing C compilers have optimization switches that
can be used to fix both the aliasing and the floating point
expression restructuring problems.

--
<-->
Vesa Karvonen | If I were speaking for my
Housemarque Games, Inc. | employer, you would not be
http://www.housemarque.com | reading this...

Joseph H Allen

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

In article <3432F8...@HousemarqueREM.fi>,
Vesa Karvonen <Vesa.K...@HousemarqueREM.fi> wrote:
>Nick Maclaren wrote:

>> Yes. And, if you can code C without implying aliasing, I take my hat
>> off to you! The biggest problem with attempting this is that you can
>> only 'const qualify' at the top level and, for efficient code-generation,
>> you need to do it at other levels. I believe that the new 'restrict' in
>> C9X is a sort of recursive 'const', but have not seen a specification.

More complicated key words. Yuck, that's just what C needs.

>Being an assembly language programmer before learning C, I used
>to think a bit like you are thinking after seeing the bad code
>produced by aliasing problems and the like in C.

I'm still waiting for someone to try a hardware fix for the aliasing
problem. One way is to make the internal registers more like a cache than
they are now. Then when you load a word into a register, you also load the
memory address where the word came from into some auxiliary part of the
register. The register snoops the bus at this point: if a subroutine
modifies the memory at that address, a bit in the register is set. If the
caller then accesses that register you get a trap or delay and a reload
occurs. Using this, you can have your compiler safely assume that there are
no aliases, and avoid doing tons of reloads after every subroutine call.

Of course this only solves the problem of aliases which depend on a single
memory location as their source, not complicated values which depend on
multiple memory locations. But in a language like C++, I would guess that
most of the aliases are like this anyway (like the base addresses of
objects, or whatever). I suppose you could even have more general alias
detection hardware which would allow you to detect if any of the sources of
a derived value were modified- it would just cost more.

Other hardware could conceivably solve more general alias and dependancy
problems than those caused by non-local subroutine calls. For example:

if a==z then x=5
y=x+3

causes a pipeline stall of some sort, because y is an alias for x, but
'y=x+3' can not occur until it is known whether the previous conditional is
going to modify x. The compiler can't do this because a and z could be
arbitrary non-constant values.

But with extra hardware, there are possibilities. Maybe you do y=x+3 at the
same time that you do the if(a==z), but also have a 'y=8' on standby, in
case the x=5 ended up happening. A branchless version of the above code
could be done with a few extra interesting flag bits, instructions which
occur only conditionally as on the ARM and a compiler which is aware of the
processor's pipeline structure.

--
/* jha...@world.std.com (192.74.137.5) */ /* Joseph H. Allen */
int a[1817];main(z,p,q,r){for(p=80;q+p-80;p-=2*a[p])for(z=9;z--;)q=3&(r=time(0)
+r*57)/7,q=q?q-1?q-2?1-p%79?-1:0:p%79-77?1:0:p<1659?79:0:p>158?-79:0,q?!a[p+q*2
]?a[p+=a[p+=q]=q]=q:0:0;for(;q++-1817;)printf(q%79?"%c":"%c\n"," #"[!a[q-1]]);}

Eric Hildum

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

Michael D. Nahas wrote:

>
> Ketil Z Malde wrote:
>
> So, is this a problem, and if so, how do we fix it. Yes, it is a problem,
> because any
> stall to the pipeline which prevents speculative execution is bad. How do
> we
> solve it? The first thing to do is to make programs which have this
> problem part
> of the SPEC Benchmarks. This will make it the focus of everyone's
> attention.
> The other thing that can be done is to look at Compiler optimizations -
> many of
> the indirect function calls can be eliminated, especially if all the source
> is compiled
> together. Separate compilation of object files requires general interfaces
> which
> prevent many optimizations.
>

No, it only only requires a reasonable object code format and linker.
That is, it requires a link editor such as exists under VMS which is
capable of selecting the branch or call instruction and addressing, not
the trival symbol table lookup and drop the bytes in the slot that Unix
linkers perform.

--
"The idea that Bill Gates has appeared like a knight in shining armour
to lead all customers out of a mire of technological chaos neatly
ignores the fact that it was he who, by peddling second-rate technology,
led them into it in the first place."-Douglas Adams

Eric Hildum
Eric....@Japan.NCR.COM

Torben AEgidius Mogensen

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

p...@vus002.telecom.com.au (P Murray) writes:

>Nick Maclaren (nm...@cus.cam.ac.uk) wrote:

>: For efficient use of modern architectures, we need languages in which we
>: programmers can express what we want to say in a fashion that can be
>: converted to the sort of instruction stream that executes fast. And
>: the current trends are in exactly the opposite direction.

>Any suggestions on what such a language would look like? Hopefully
>it will not look like ASM. Would it be possible to modify C/C++ in
>such a way as to allow this?

A possibility is SML using the region-based memory model. You would
still do pointer cahsing, but the regions model keep a fairly good
locality (elements of the same list stay close, even when other data
is allocated in-between). In SML you also avoid most of the aliasing
problems of C/C++. The technology still needs a few years to be
competitive with C. One problem is that the polymorphic type system in
SML introduces extra indirections (boxing). This can be solved if you
compile the entire program in one job, as the polymorphism can be
eliminated when all uses are known. Though, theoretically, this can
give an exponetial blow-up in code size, experiments show that the
code expansion is usually small.

You can use the regions model on C-like languages, but you would have
to give up on a lot of the 'dirty' features of C, such as type
casting and varargs.

>And finally, could anyone recommend ways to code in C/C++ that
>avoid these problems and allow the optimiser to produce the
>most efficient code? I assume virtual function calls would
>be out, as well as all forms of aliasing.

It is not enough that you avoid aliasing. The compiler must be able to
recognize that there is none. And this is not often possible,
especially if you use separate compilation. If a language is designed
such that aliasing doesn't occur or only does so in easily
identifiable contexts (e.g. array lookups), the compiler can use this
assumption safely, even without complex and inaccurate analysis.

Torben Mogensen (tor...@diku.dk)

Wilco Dijkstra

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

Joseph H Allen wrote:

> I'm still waiting for someone to try a hardware fix for the aliasing
> problem. One way is to make the internal registers more like a cache than
> they are now. Then when you load a word into a register, you also load the
> memory address where the word came from into some auxiliary part of the
> register. The register snoops the bus at this point: if a subroutine
> modifies the memory at that address, a bit in the register is set. If the
> caller then accesses that register you get a trap or delay and a reload
> occurs. Using this, you can have your compiler safely assume that there are
> no aliases, and avoid doing tons of reloads after every subroutine call.

I don't think it's a good idea to try to create expensive hardware
solutions
to the aliasing problem - it's software that created it in the first
place!
You just need to find the cases where aliasing is appropriate and ban
all other
uses. This is similar to gotos.
It's my belief that almost all aliasing is avoidable at low cost during
compilation. For example most aliasing is due to function calls. With
global
aliasing analysis the complete set of aliases can be calculated. Of
course
a good language which avoids most of the aliasing problems (eg. the ADA
parameter
passing mechanism) would help!

Nick Maclaren

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

In article <6108a1$1...@news-rocq.inria.fr>, har...@pauillac.inria.fr (Robert Harley) writes:
|> Nick Maclaren (nm...@cus.cam.ac.uk) writes:
|> >[...] The biggest problem with attempting this is that you can

|> >only 'const qualify' at the top level and, for efficient code-generation,
|> >you need to do it at other levels.
|>
|> Huh? What's wrong with 'double const*const*const triMat'?

How do you get a 'double *** triMat' to that? You have to put a
cast in, which essentially says "don't do ANY checking", and would
be equally happy with a 'double ** triMat' or 'double **** triMat'
argument.

|> >I believe that the new 'restrict' in
|> >C9X is a sort of recursive 'const', but have not seen a specification.
|>

|> The object accessed via pointers based on a given restricted pointer
|> is only accessed in that way. No recursive 'const' in sight.

Oh, hell. Well, that's not going to help a lot then, except for the
simplest of cases (which, admittedly, do include the BLAS).

Patrick F. McGehearty

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

In article <60vj9p$7...@vidar.diku.dk>,

Torben AEgidius Mogensen <tor...@diku.dk> wrote:
>p...@vus002.telecom.com.au (P Murray) writes:
>
>>Nick Maclaren (nm...@cus.cam.ac.uk) wrote:
>
>>: For efficient use of modern architectures, we need languages in which we
>>: programmers can express what we want to say in a fashion that can be
>>: converted to the sort of instruction stream that executes fast. And
>>: the current trends are in exactly the opposite direction.
>
>>Any suggestions on what such a language would look like? Hopefully
>>it will not look like ASM. Would it be possible to modify C/C++ in
>>such a way as to allow this?
>
>A possibility is SML using the region-based memory model. You would
>still do pointer cahsing, but the regions model keep a fairly good
>locality (elements of the same list stay close, even when other data
>is allocated in-between). In SML you also avoid most of the aliasing
>problems of C/C++. The technology still needs a few years to be
>competitive with C. One problem is that the polymorphic type system in
>SML introduces extra indirections (boxing). This can be solved if you
>compile the entire program in one job, as the polymorphism can be
>eliminated when all uses are known. Though, theoretically, this can
>give an exponetial blow-up in code size, experiments show that the
>code expansion is usually small.
>

Code expansion is not the problem. Compile time is.
Large development projects involve 100's of modules of code
in the 100,000 to 1,000,000 lines of code range. It only takes one
unbounded non-linear optimization algorithm to cause full-program compiles
to take so long that you start worrying about the halting problem. That is,
after a couple of days of computing, you don't know if the compiler is
in an infinite loop or still computing. The "couple of days" is not an
exaggeration. I've seen a 1000 line program take over a day due to a
single unbounded non-linear optimization algorithm.

As someone working professionally in compiler performance for the last few
years, I'll take any help I can get from language definition to bound the
amount of code that must be examined to determine if an optimization is
safe. Since I've been in graduate school (over 20 years ago), I've been
hearing about how this language feature or that can be optimized, but I've
seldom seen these optimizations actually be performed. It takes a lot of
work to cover all the cases, and rarely used language features are often
ignored when it comes time to enhance optimizers.

Nick is absolutely dead-on when he identifies the aliasing problem as one of
the most troubling in C. Fortran avoids that problem by defining it away.
Users are seldom aware of the compiler switches which say "ignore aliasing".
In addition, the switches are different for different platforms, so standard
make files for multiplatform products don't use them. A plausible
compromise for future languages would be to have a default that assumes
Fortran aliasing rules and to require a declaration to allow aliasing when
it is needed. Then most of the time, the compiler can generate fast code
and only when aliasing is strictly needed will optimization be degraded.

On the C++ virtual function issue, what is needed is more training of C++
programmers to avoid inefficient constructs except when they are really
necessary, and to provide whatever language advisories possible to help the
compiler along. While processor speeds continue to increase at exponential
rates, memory latencies are improving much more slowly. Ten years ago, the
cost of a non-cache memory access was less than 10 instructions. Four years
ago, it was less than 40 instructions. Now it is well over 100 instructions
on super-scalar architectures. The exact numbers will vary depending on
which architectural family you look at, but the trend line is clear. And
everything I'm aware of indicates that it will only get worse in the future.
Virtual functions that cannot be optimized to static addresses at compile or
link time are the enemy of performance, and future processor speed
improvements will only make situation more extreme.

- Patrick McGehearty
pat...@convex.hp.com

Richard A. Schumacher

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

>>For efficient use of modern architectures, we need languages in which we
>>programmers can express what we want to say in a fashion that can be
>>converted to the sort of instruction stream that executes fast. And
>>the current trends are in exactly the opposite direction.

>Luckily we still have good ol' assembly language!

And, luckily, in about six years there will be only one
architecture in wide use, which nicely reduces the scope
of the problem...

Robert Harley

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

Nick Maclaren (nm...@cus.cam.ac.uk) writes:
>[...] The biggest problem with attempting this is that you can
>only 'const qualify' at the top level and, for efficient code-generation,
>you need to do it at other levels.

Huh? What's wrong with 'double const*const*const triMat'?

>I believe that the new 'restrict' in
>C9X is a sort of recursive 'const', but have not seen a specification.

The object accessed via pointers based on a given restricted pointer
is only accessed in that way. No recursive 'const' in sight.

-- Rob.

Robert Harley

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

Nick Maclaren (nm...@cus.cam.ac.uk) wrote:

>har...@pauillac.inria.fr (Robert Harley) writes:
>> Nick Maclaren (nm...@cus.cam.ac.uk) writes:
>> >[...] The biggest problem with attempting this is that you can
>> >only 'const qualify' at the top level and, for efficient code-generation,
>> >you need to do it at other levels.
>>
>> Huh? What's wrong with 'double const*const*const triMat'?
>
>How do you get a 'double *** triMat' to that? You have to put a cast in,

Well sure, if for some strange reason you wanted to turn a
pointer-pointer-pointer into a pointer-pointer, C allows it via a cast.
But why on earth would you want to do that? I can only assume that
you meant 'double ** triMat' in which case assign it:
double const*const*const constTriMat;
constTriMat = triMat;

or pass it as an argument:
double myFunc(double const*const*const bla);
myFunc(triMat);

or something. Type qualifiers don't affect type compatibility, so
it's fine to add a 'const' in this manner. It's also OK discard a
'const' but compilers give a warning if you do so.


>which essentially says "don't do ANY checking", [...]

Whereas your beloved Fortran checks that the user's assumptions about
lack of aliasing are correct, perhaps?


>> The object accessed via pointers based on a given restricted pointer

>> is only accessed in that way. [...]


>
>Oh, hell. Well, that's not going to help a lot then, except for the

>simplest of cases [...]

Huh?

-- Rob.

Robert Harley

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

schu...@convex.hp.com (Richard A. Schumacher) writes:
>>Luckily we still have good ol' assembly language!
>
>And, luckily, in about six years there will be only one
>architecture in wide use, which nicely reduces the scope
>of the problem...

Hey, nice of a guy from HP to defend Alpha like that!

=:-)

.-. Robert...@inria.fr .-.
/ \ .-. .-. / \
/ \ / \ .-. _ .-. / \ / \
/ \ / \ / \ / \ / \ / \ / \
/ \ / \ / `-' `-' \ / \ / \
\ / `-' `-' \ /
`-' Linux + 500MHz Alpha + 256MB SDRAM = heaven `-'

Nick Maclaren

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

In article <610rbh$5...@news-rocq.inria.fr>,
Robert Harley <har...@pauillac.inria.fr> wrote:

[ Correctly pointing out that I had miscounted the number of '*'s in
the previous posters C type definition. ]

>... Type qualifiers don't affect type compatibility, so


>it's fine to add a 'const' in this manner. It's also OK discard a
>'const' but compilers give a warning if you do so.

This is getting rather off-group, but I don't want anyone to be taken in.
Type qualifiers DO affect type compatibility, your example is incorrect
C, and many compilers will refuse to compile it at all.

I could explain why such constructions are needed, and the consequences
for both efficiency and 'software engineering', but these newgroups are
not appropriate places for that.

Andrew Glew

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to Nick Maclaren

> Well, I would favour something using the more successful ideas of
> Algol 68 :-) But it would have to include some of the concepts of the
> stricter functional languages.

Nick: you and I both love Algol68!

But when I have looked at Algol68 compiler produced assembly code, it looks
remarkably like C++.

The best thing that Algol68 had going for it is that it allowed many of the
virtual function decisions to be made lexically, at compile time. But only, I
believe,
in the absence of separate compilation (or in the presence of a smart linker).

Stanley Chow

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

In article <610dd5$ia0$1...@lyra.csx.cam.ac.uk>,
Nick Maclaren <nm...@cus.cam.ac.uk> wrote:

>In article <6108a1$1...@news-rocq.inria.fr>, har...@pauillac.inria.fr (Robert Harley) writes:
>|> Nick Maclaren (nm...@cus.cam.ac.uk) writes:
>|> >[...] The biggest problem with attempting this is that you can
>|> >only 'const qualify' at the top level and, for efficient code-generation,
>|> >you need to do it at other levels.
>|>
>|> Huh? What's wrong with 'double const*const*const triMat'?
>
>How do you get a 'double *** triMat' to that? You have to put a
>cast in, which essentially says "don't do ANY checking", and would
>be equally happy with a 'double ** triMat' or 'double **** triMat'
>argument.

Or if the "double ***" is a type macro somewhat that may or may not
be changed from compile to compile.


--
Stanley Chow; sc...@nortel.ca, (613) 763-2831
Nortel/BNR, PO Box 3511 Station C, Ottawa, Ontario
Me? Represent other people? Don't make them laugh so hard.

Jon Leech

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

In article <60uek7$fon$1...@lyra.csx.cam.ac.uk>,

Nick Maclaren <nm...@cus.cam.ac.uk> wrote:
>For efficient use of modern architectures, we need languages in which we
>programmers can express what we want to say in a fashion that can be
>converted to the sort of instruction stream that executes fast. And
>the current trends are in exactly the opposite direction.

Perhaps language designers are more concerned with efficient use of
programmers than efficient use of hardware. What we need to do on today's
hardware may be completely inappropriate 10 years down the road, while
fundamental data structures remain pretty much the same.
Jon
__@/

bri...@bit3.com

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

>On the C++ virtual function issue, what is needed is more training of C++
>programmers to avoid inefficient constructs except when they are really
>necessary, and to provide whatever language advisories possible to help the
>compiler along. While processor speeds continue to increase at exponential
>rates, memory latencies are improving much more slowly. Ten years ago, the
>cost of a non-cache memory access was less than 10 instructions. Four years
>ago, it was less than 40 instructions. Now it is well over 100 instructions
>on super-scalar architectures. The exact numbers will vary depending on
>which architectural family you look at, but the trend line is clear. And
>everything I'm aware of indicates that it will only get worse in the future.
>Virtual functions that cannot be optimized to static addresses at compile or
>link time are the enemy of performance, and future processor speed
>improvements will only make situation more extreme.

There's a reason Java defined all methods as virtual methods- and why it was
bad that C++ didn't. One of the big advantages of OO programming is code
reuse- the problem with this is that you never know, when you write an object,
what uses you may put that object to months or years from now. In fact, it
is the mark of a good object design when it's used in ways the original
author never thought of. The cost of this is (currently) an extra memory
reference, which is almost certainly not cached, and possibly even swapped
out.

We're trying to get away from the model where "this code will only be used
in this way in this program" is the norm. Two technologies that may make
the situation better, as I see it:

1) Just in time compliation, which can say "In the general case,
this needs to be a virtual function, but in this specific case I know it
will always be this function."

2) Large, native system libraries, which should be written and
compiled by people who really know the system, and only for that system,
probably with -no-aliasing flags set (among others).

But, in the end, we still have the basic problem: CPUs are way faster than
main memory and getting faster. _This_ is the problem, not "bad programming".
There is a solution to it (faster memory), but wether or not it's worth it
will be determined by the market. And we are seeing, at least in the
workstations, more attention being paid to the memory architectures to help
allieviate this problem. You're also seeing bigger caches, the other way
to allieviate the problem.

I don't speak for Bit 3.

Brian.

>
>- Patrick McGehearty
>pat...@convex.hp.com

Stefan Monnier

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

nm...@cus.cam.ac.uk (Nick Maclaren) writes:
> In article <5l90wd2...@tequila.systemsz.cs.yale.edu>,
> Stefan Monnier <monnier+comp/arch/news/@tequila.cs.yale.edu> wrote:
> >I believe everyone here is aware of the fact that indirection is *the*
> >problem. The question was "why should indirection be more of an issue with
> >C/C++ than with Pascal, Algol, ... ?"
> In a word - "aliasing". With multiple levels of indirection, a conforming
> C compiler has to assume that almost every location can be aliased, and
> therefore must make a store reference every time. Almost all fancy

I'm sorry, but I can't seem to remember that Pascal disallows aliasing (I
honestly don't know about Algol). Furthermore, do you really believe that
aliasing information would help for GUI-like code ?

Do you really think the problem is due to coding style (like the Xwin code) and
to programming languages (like C) rather than to what is being coded (like a
user interface rather than a numerical simulation) ?

> For efficient use of modern architectures, we need languages in which we
> programmers can express what we want to say in a fashion that can be
> converted to the sort of instruction stream that executes fast. And
> the current trends are in exactly the opposite direction.

Because the trend is to say "CPUs are fast enough, let's care about
programmer's convenience". You seem to care about Fortran code, so probably
more numerical-oriented, in which case convenience is still a secondary concern
whereas speed is paramount. I can only agree to the fact that languages should
be developped to better take advantage of current (and future) technology, but
programmer's convenience cannot be sacrificed so easily so it's a very hard
design problem and I still haven't heard of any convincing proposition (the
best I've heard is in the NA area where Sisal sounds pretty interesting).


Stefan

Greg Franks

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

schu...@convex.hp.com (Richard A. Schumacher) writes:

> And, luckily, in about six years there will be only one
> architecture in wide use, which nicely reduces the scope
> of the problem...
>

Ha ha hahahahahah.

I wonder how many of those same processors will be found in toasters?
--
__@ Greg Franks, (613) 520-5726 _~@ __O
_`\<,_ Systems Engineering, Carleton University, -^\<;^\<,
(*)/ (*) Ottawa, Ontario, Canada K1S 5B6. (*)%---/(*)
"Where do you want to go today?" Outside.

Richard A. Schumacher

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to


>> And, luckily, in about six years there will be only one
>> architecture in wide use, which nicely reduces the scope
>> of the problem...
>>

>Ha ha hahahahahah.

>I wonder how many of those same processors will be found in toasters?

And how many of those toasters will care how quickly indirection
is resolved.

George Herbert

unread,
Oct 2, 1997, 3:00:00 AM10/2/97
to

Richard A. Schumacher <schu...@convex.hp.com> wrote:
>>> And, luckily, in about six years there will be only one
>>> architecture in wide use, which nicely reduces the scope
>>> of the problem...
>
>>Ha ha hahahahahah.
>>I wonder how many of those same processors will be found in toasters?
>
>And how many of those toasters will care how quickly indirection
>is resolved.

My fondest hope, and worstest fear, is that my VCR will start
borrowing CPU cycles on my toaster and microwave across the
house-area-network...

-george william herbert
gher...@crl.com


Anil Thomas Maliyekkel

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

Robert Harley (har...@pauillac.inria.fr) wrote:
: schu...@convex.hp.com (Richard A. Schumacher) writes:
: >>Luckily we still have good ol' assembly language!
: >
: >And, luckily, in about six years there will be only one

: >architecture in wide use, which nicely reduces the scope
: >of the problem...

: Hey, nice of a guy from HP to defend Alpha like that!

: =:-)

: .-. Robert...@inria.fr .-.
: / \ .-. .-. / \
: / \ / \ .-. _ .-. / \ / \
: / \ / \ / \ / \ / \ / \ / \
: / \ / \ / `-' `-' \ / \ / \
: \ / `-' `-' \ /
: `-' Linux + 500MHz Alpha + 256MB SDRAM = heaven `-'


I thought he was talking about Java chips.


Richard A. O'Keefe

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

Stefan Monnier <monnier+comp/arch/news/@tequila.cs.yale.edu> writes:

>I'm sorry, but I can't seem to remember that Pascal disallows aliasing (I
>honestly don't know about Algol). Furthermore, do you really believe that
>aliasing information would help for GUI-like code ?

Do you mean "code like that which USES a GUI" or
"code like that which IMPLEMENTS a GUI"?

I note that one of my favourite languages (Clean) has an elegant
GUI *using* interface which has complete freedom from aliasing.
Clean is a *pure* lazy functional language, there are no assignments
at all. The intention is that the compiler will _implement_
"make a record/array just like this one except for these few
differences" destructively, _but_ only when it is safe, and the
type system tells the compiler at compile time which references
are the only reference to such a datum. Let me offer an example:

:: Complex = { re :: Real, im :: Real }

conj_in_place : *Complex -> *Complex
conj_in_place c=:{im=i} = {c & im= ~i}

The * here in front of the type before the arrow means that a call to
this function is legal only if the compiler can prove that this is the
_only_ reference remaining to that record. The * in front of the type
after the arrow means that there will be no other references to the
result. Because the argument is 'dead' after the call, the compiler
is allowed to update it in place.

The uniqueness type system in Clean is surprisingly easy to use, and
it means that the compiler can be *certain* that aliasing is not taking
place.

(To be fair, since Clean is a higher order language with Haskell-ish
type classes, it has its own fair share of calling functions via pointers.)

As for code that *implements* a GUI, the bit-shuffling stuff that
gets done in the server would indeed benefit from a language that
allows more and better optimisation.

>Do you really think the problem is due to coding style (like the Xwin
>code) and to programming languages (like C) rather than to what is being
>coded (like a user interface rather than a numerical simulation) ?

Yep. I keep on pointing out to students here (with examples) that the
*really* big speedups come from high level reasoning that eliminates
operations completely, rather than from *manual* low-level fiddling.

Computer hardware has changed *immensely* in my lifetime. My intuition
about what is fast and what is slow keeps needing dramatic revision.
The one guideline that still works after 20 years is "keep your code as
clean as possible so that the compiler has a chance of figuring out
what's supposed to happen". (Well, there's another one: keep your
working set small; avoid going to slow memory if you can; if you can't
avoid that, avoid the disc; if you can't avoid disc, avoid tape; and
whatever you do, don't use paper tape.)

>> For efficient use of modern architectures, we need languages in which we
>> programmers can express what we want to say in a fashion that can be
>> converted to the sort of instruction stream that executes fast. And
>> the current trends are in exactly the opposite direction.

>Because the trend is to say "CPUs are fast enough, let's care about
>programmer's convenience".

Hmm. Current trends include C, C++, and Java, which are all of them
remarkably opposed to the programmer's convenience. (I have a hard time
believing how clumsy and verbose Java is.)

--
John Æneas Byron O'Keefe; 1921/02/04-1997/09/27; TLG,TLTA,BBTNOTL.
Richard A. O'Keefe; http://www.cs.rmit.edu.au/%7Eok; RMIT Comp.Sci.

Steven Correll

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In article <610gbg$k...@deadzone.rsn.hp.com>,

Patrick F. McGehearty <pat...@convex.COM> wrote:
>On the C++ virtual function issue, what is needed is more training of C++
>programmers to avoid inefficient constructs except when they are really
>necessary, and to provide whatever language advisories possible to help the
>compiler along...

>Virtual functions that cannot be optimized to static addresses at compile or
>link time are the enemy of performance, and future processor speed
>improvements will only make situation more extreme.

Telling C++ programmers in 1997 to avoid virtual functions because they
make the program run slower is a lot like telling Fortran programmers
in 1970 to avoid subroutine calls for that same reason. They are all
too likely to follow your advice, and given the choice of maintaining a
ten-year-old C++ program with "virtual" missing from most but not all
of its functions, versus a twenty-year-old Fortran program which was
written as a single 100000 line subroutine, I might choose the Fortran
program--at least it's provably possible to restructure it based solely
on static information. But probably I'd just look for a new line of
work, like clearing stuck toilets.
--
Steven Correll == PO Box 66625, Scotts Valley, CA 95067 == s...@netcom.com

Richard A. O'Keefe

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

har...@pauillac.inria.fr (Robert Harley) writes:
>Well sure, if for some strange reason you wanted to turn a
>pointer-pointer-pointer into a pointer-pointer, C allows it via a cast.
>But why on earth would you want to do that? I can only assume that
>you meant 'double ** triMat' in which case assign it:
> double const*const*const constTriMat;
> constTriMat = triMat;

>or pass it as an argument:
> double myFunc(double const*const*const bla);
> myFunc(triMat);

Doesn't work.

There's an illuminating discussion in the rationale of the POSIX.1
standard about why they didn't chuck a const or two into the prototypes
for the exec() functions.

"Specifying two levels of const-qualification for the argv[] and envp[]
parameters for the exec functions may seem to be the natural choice,
given that these functions do not modify either the array of pointers
or the characters to which the function (sic) points, but this would
disallow existing correct code. Instead, only the array of pointers
is noted as constant. The table of assignment compatibility for
dst = src, derived from the C standard, summarizes the compatibility:

dst const char * const char
src char *[] char *[] const [] * const []
char *[] VALID VALID
const char *[] VALID VALID
char * const [] VALID
const char * const [] VALID

" Entries not marked VALID are not valid at all.

THIS is Nick Maclaren's point. Given
char **triMat;
the declaration
double const * const * const constTriMat = triMat;
is simply illegal.

It's illegal because there are plausible implementations where it won't
work. Anyone who thinks it does clearly hasn't bothered to read the
standard, or even a good textbook (not there are many).

>Whereas your beloved Fortran checks that the user's assumptions about
>lack of aliasing are correct, perhaps?

Yes, there are indeed static checkers available for Fortran that will
check such assumptions. I have not used any myself, but I am aware of
two.

Michael Williams

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In article <6112t5$a...@valencia.rsn.hp.com>,

Richard A. Schumacher <schu...@convex.hp.com> wrote:
>>I wonder how many of those same processors will be found in toasters?
>
>And how many of those toasters will care how quickly indirection
>is resolved.

I think the toaster industry will be more interested in how many watts
the processor gives off. Maybe we should take StrongARM and respin it
in 1um ECL, running at 25V, to break into this lucrative market.

Mike.

Michael Williams

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In article <610gbg$k...@deadzone.rsn.hp.com>,
Patrick F. McGehearty <pat...@convex.COM> wrote:
> That is,
>after a couple of days of computing, you don't know if the compiler is
>in an infinite loop or still computing. The "couple of days" is not an
>exaggeration. I've seen a 1000 line program take over a day due to a
>single unbounded non-linear optimization algorithm.

I once discovered a case where our compiler's CSE would take O(e^N)
to compile code, where N was the number of variables in the expression.

(The code was x5=x1+x2+x3+x4; x6=x2+x3+x4+x5; ... x4=xN+x1+x2+x3.)

I calculated that a simple 52 line program would take order of 250
years per line to compile. Needless to say, it got fixed.

Mike.

Larry Kilgallen

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In article <610m6v$4l...@fido.asd.sgi.com>, nos...@oddhack.engr.sgi.com (Jon Leech) writes:
> In article <60uek7$fon$1...@lyra.csx.cam.ac.uk>,
> Nick Maclaren <nm...@cus.cam.ac.uk> wrote:
>>For efficient use of modern architectures, we need languages in which we
>>programmers can express what we want to say in a fashion that can be
>>converted to the sort of instruction stream that executes fast. And
>>the current trends are in exactly the opposite direction.
>
> Perhaps language designers are more concerned with efficient use of
> programmers than efficient use of hardware. What we need to do on today's
> hardware may be completely inappropriate 10 years down the road, while
> fundamental data structures remain pretty much the same.

Efficient usage of programmers is important to any business.
In order to couple that with efficient use of hardware it is
not sufficient to have a language in which:

It is Possible

to write an efficient program, but rather it is necessary to
have a language in which:

It is Natural

to write an efficient program. That is why methods of adding
extra performance-hinting language elements are a poor choice
compared to using a language which happens to naturally avoid
the aliasing problems which have been discussed.

Larry Kilgallen

Robert Harley

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

Richard A. O'Keefe (o...@goanna.cs.rmit.edu.au) wrote:
>Doesn't work.
>
>There's an illuminating discussion in the rationale of the POSIX.1
>standard [...]

>
>THIS is Nick Maclaren's point. Given
> char **triMat;
>the declaration
> double const * const * const constTriMat = triMat;
>is simply illegal.

Sure, and it's "simply illegal" to assume that integer arithmetic is
done in binary.


>It's illegal because there are plausible implementations where it won't work.

>[...ad hominem deleted...]

But guess what, there are other implementations that are not only
plausible but actually exist! And they have implementation-dependent
features left open by the standard, and extensions not specified by
the standard! Gasp!

Over here in the real world, we compile our programs with them:
compilers for which integers are in binary and type qualifiers
give warnings sometimes, not errors. If you are allergic to such a
warning, stick in a cast as Nick suggested.


>>Whereas your beloved Fortran checks that the user's assumptions about
>>lack of aliasing are correct, perhaps?
>
>Yes, there are indeed static checkers available for Fortran that will
>check such assumptions. I have not used any myself, but I am aware of
>two.

Ah, so in the case of C refer to the standard only, in order to
criticize limitations in it. But for other languages, dredge up extra
features you "are aware of" in implementations you've never used.

Everybody is fooled by this devastating ploy, so we'll now delete our
programs, apps and O.S.es, the transport protocols that will send this
message and so on, and rewrite them in a superior language.

-- Rob.

Alex Colvin

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

>Hmmm... I think Pascal had the right approach when it came to reference
>parameters: the effects of passing the same variable into a function
>via more than one VAR parameter were "undefined", which left the
>language implementor more scope for optimization.

Last time I mentioned this, I was informed that this restrictition had
been dropped.

Support for aliasing may be a good thing. Undefining the effect of
aliasing looks likely to lead to seriously obscure bugs, unless you can
check this behavior at compile time.

--
Alex Colvin
alex....@dartmouth.edu


Mike Smith

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

Nick Maclaren <nm...@cus.cam.ac.uk> wrote in article
<60uek7$fon$1...@lyra.csx.cam.ac.uk>...
>
> But what about (P = A->B ? (Q = P->C ? Q->D : P->E) : A->F)? To handle
> THAT, you need speculative pre-execution, which was dabbled with in the
> 1960s and 1970s, and quite rightly dropped as an abomination. And, if

Eh? Somebody better tell this to the people at Intel, AMD,
and Cyrix, all of which have processors that perform specul-
ative execution, IIRC.

--Mike Smith

Doug Siebert

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

gher...@crl3.crl.com (George Herbert) writes:

>My fondest hope, and worstest fear, is that my VCR will start
>borrowing CPU cycles on my toaster and microwave across the
>house-area-network...


If it runs "Windows for VCRs" it probably will need the extra CPU time
for the neural network software to detect and eliminate any ads from
Netscape or whoever so you don't see them and realize you have a choice :)

--
Douglas Siebert Director of Computing Facilities
douglas...@uiowa.edu Division of Mathematical Sciences, U of Iowa

He who lives in a glass house should not invite in he who is without sin.

Jon Leech

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In article <1997Oct3.071424.1@eisner>,

Um, I was speaking to indirection and pointer-chasing, not aliasing. I
thought that would be clear by the reference to "fundamental data
structures".
Jon
__@/

Jon Leech

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In article <61355n$1rf$1...@void.agames.com>,
Mike Albaugh <alb...@agames.com> wrote:
[much good stuff snipped]
> As for "future machines", it is unlikely that in my lifetime
>we will see machines which handsomely reward gratuitous complexity
>(unrelated to quirks of the machine) in source code :-)

But it's eminently possible that we could see a time when the horrid
latency problem we grapple with today is history.
Jon
__@/

Maynard Handley

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In article <61355n$1rf$1...@void.agames.com>, alb...@agames.com (Mike
Albaugh) wrote:

> Jon Leech (nos...@oddhack.engr.sgi.com) wrote:
> : In article <60uek7$fon$1...@lyra.csx.cam.ac.uk>,


> : Nick Maclaren <nm...@cus.cam.ac.uk> wrote:
> : >For efficient use of modern architectures, we need languages in which we
> : >programmers can express what we want to say in a fashion that can be
> : >converted to the sort of instruction stream that executes fast. And
> : >the current trends are in exactly the opposite direction.
>
> : Perhaps language designers are more concerned with efficient use of
> : programmers than efficient use of hardware. What we need to do on today's
> : hardware may be completely inappropriate 10 years down the road, while
> : fundamental data structures remain pretty much the same.
>

> While I have some sympathy for this notion, in practice many of
> the same things being touted for "enhanced programmer productivity"
> and damned for "poor fit to hardware" are not even unalloyed blessings
> from a software point of view. They tend to make writing the "first cut"
> releatively painless, but to also to confound maintenance and extension.
> If you want a language in which trivial programs are easy to write,
> well, BASIC is not such a bad choice. If you want to write high-reliability
> programs with long life-cycles and low maintenance costs, well, none
> of the current languages are appropriate.
>

I think Mike put together a pretty coherent argument regarding issues that
have bugged me about discussions of language utility.
To give a simple example here, I once wrote a fairly large and complex
program (an MPEG-1 encoder/decoder) using the buzz-words of the time, C++,
objects and encapsulation. What I found after a while was that
understanding the source was far from trivial because
myObject->SomeFunction() could refer to any of a dozen functions. The
only way to figure out how any code worked was to walk through it in the
debugger and see exactly what sort of object myObject was and thus which
version of SomeFunction() was being called. Writing the code was easy,
understanding it was not.
(Note that this is in contrast to C. In C while one CAN easily write
incomprehensible programs, one can use one's head and not do so. Here
specifically using the language features that were supposed to boost my
productivity was the cause of the problems.)

Now maybe I'm a moron. Maybe my methods for slicing up an app into objects
were stupid compared to how OO advocates would do things. However I find
it hard to believe I did things so badly. My suspicion, rather, is that
this is a real problem (at least with C++, and given the way OO is
supposed to work, as I understand it, with pretty much any OO language).

In an ideal world, where one starts from some well-defined spec for a
well-defined app that's been written many times before and is understood,
maybe it shouldn't matter which version of SomeFunction() gets called as
long as its the one appropriate to myObject.
But in the real world at least some code changes over time. One adds
features. One does not know what one can do until one tries things. In
such a world hiding the specific function that's called is a recipe for
trouble.

Maynard

--
My opinion only

Andrew Glew

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

Y'all may be in a terminology battle:

Current usage:

Speculative execution = executing instructions before you know for certain that
you need to execute them, usually as a result of branch prediction (but also
assuming cache hit, no fault, etc., etc.) Usually speculative execution implies
only one predicted path.

Eager execution = executing both sides of some flakey branches.


Allen J. Baum

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In article <611a1r$n...@crl3.crl.com>, gher...@crl3.crl.com (George
Herbert) wrote:

> Richard A. Schumacher <schu...@convex.hp.com> wrote:

> >>> And, luckily, in about six years there will be only one
> >>> architecture in wide use, which nicely reduces the scope
> >>> of the problem...
> >

> >>Ha ha hahahahahah.


> >>I wonder how many of those same processors will be found in toasters?
> >
> >And how many of those toasters will care how quickly indirection
> >is resolved.
>

> My fondest hope, and worstest fear, is that my VCR will start
> borrowing CPU cycles on my toaster and microwave across the
> house-area-network...

No way - all the cycles will be sucked up by the Java, I mean coffee maker.

--
***********************************************
* Allen J. Baum *
* Digital Semiconductor *
* 181 Lytton Ave. *
* Palo Alto, CA 94306 *
***********************************************

Stefan Monnier

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

Wilco Dijkstra <Wilco.D...@NOSPAMarm.com> writes:

> With global aliasing analysis the complete set of aliases can be
> calculated.

A problem that requires global analysis to be solved is a BIG problem that
should be gotten rid of instead of "solved". Especially because global analysis
is not possible in some settings (that I consider as desirable, by the way).


Stefan

Stefan Monnier

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

jha...@world.std.com (Joseph H Allen) writes:
> I'm still waiting for someone to try a hardware fix for the aliasing
> problem.

It's not so simple: knowledge of "no-aliasing" is not used just to spare a
reload at each use of some value, it's used for dependency analysis to allow
code reordering.


Stefan

Mike Albaugh

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

Jon Leech (nos...@oddhack.engr.sgi.com) wrote:
: In article <60uek7$fon$1...@lyra.csx.cam.ac.uk>,
: Nick Maclaren <nm...@cus.cam.ac.uk> wrote:
: >For efficient use of modern architectures, we need languages in which we
: >programmers can express what we want to say in a fashion that can be
: >converted to the sort of instruction stream that executes fast. And
: >the current trends are in exactly the opposite direction.

: Perhaps language designers are more concerned with efficient use of
: programmers than efficient use of hardware. What we need to do on today's
: hardware may be completely inappropriate 10 years down the road, while
: fundamental data structures remain pretty much the same.

While I have some sympathy for this notion, in practice many of
the same things being touted for "enhanced programmer productivity"
and damned for "poor fit to hardware" are not even unalloyed blessings
from a software point of view. They tend to make writing the "first cut"
releatively painless, but to also to confound maintenance and extension.
If you want a language in which trivial programs are easy to write,
well, BASIC is not such a bad choice. If you want to write high-reliability
programs with long life-cycles and low maintenance costs, well, none
of the current languages are appropriate.

No, I'm not saying that I "know the answer" and have designed
the be-all-and-end-all language. In fact, I strongly believe that such
does not exist. (check "The Psychology of Computer Programming", Weinberg,
in re: "Fischer's Fundamental Theorem") What I'm saying is that languages
need to walk a fine line between "Here is what I want to accomplish,
algorithm choice is up to you and I won't even give you hints at usage
patterns" and naked (not even real macros) assembly. If you think the
former is always a good thing, maybe I can get you an NDA and show you
the C++ code I inherited, which could only be compiled correctly by a single
user with a custom "include" path, and would only run on his particular
machine, although it appeared to compile correctly elsewhere. A language
wherein the entire history of the developer's environment must be known
and taken into account to understand a single, straightforward-looking
statement is _not_ robust, no matter how "abstract" (said with reverence)

So, what does this have to do with the current discussion?
If you look at it, there is an underlying current behind both.
The management of complexity is the key issue in design, including
software design. Techniques which _appear_ to reduce complexity
while in fact "diffusing" it outward and sweeping it under the rug
are _not_ your friend. Whether they require the CPU to chase a
chain of pointers (page-faulting on each) or "simply" require
a maintenance programmer to have more than ten fingers to keep
in "the book", they are a _bad_ idea. Enabling a fool to build
a 12-story building without knowing the structural properties
of the foundation (and with no way to check them) is _not_ a
benefit to society. I wish I could remember the source of the
quote: "A design is finished not when there is nothing more to
add, but when there is nothing left to remove".

Block-structured (Algolish) languages were a big step
forward, but lately I've come to believe that was mainly true
in an environment where all sources were easily available, so
that the actual function of the "hidden" stuff could be verified
and corrected if need be. Nowadays, the hidden stuff is far
larger than the visible stuff, has a far greater impact, and is
far more likely to be buggy. Not a promising situation.

As for "future machines", it is unlikely that in my lifetime
we will see machines which handsomely reward gratuitous complexity
(unrelated to quirks of the machine) in source code :-)

Mike
| alb...@agames.com, speaking only for myself

Nick Maclaren

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In article <01bcd02b$58e67280$0200000a@kld_mcs>,

Please don't confuse preloading and branch prediction with speculative
execution. I gave a very simple example, which could be done solely by
preexecuting address calculation instructions but, in general, that is
not the case. I am pretty sure that none of the processors you mention
do any significant amount of speculative execution.


Nick Maclaren,
University of Cambridge Computer Laboratory,
New Museums Site, Pembroke Street, Cambridge CB2 3QG, England.
Email: nm...@cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679

Rob Barris

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In article <613p2b$g1r$1...@lyra.csx.cam.ac.uk>, nm...@cus.cam.ac.uk (Nick
Maclaren) wrote:

> In article <01bcd02b$58e67280$0200000a@kld_mcs>,
> Mike Smith <kld_m...@earthlink.nospam.net> wrote:
> >Nick Maclaren <nm...@cus.cam.ac.uk> wrote in article
> ><60uek7$fon$1...@lyra.csx.cam.ac.uk>...
> >>
> >> But what about (P = A->B ? (Q = P->C ? Q->D : P->E) : A->F)? To handle
> >> THAT, you need speculative pre-execution, which was dabbled with in the
> >> 1960s and 1970s, and quite rightly dropped as an abomination. And, if
> >
> >Eh? Somebody better tell this to the people at Intel, AMD,
> >and Cyrix, all of which have processors that perform specul-
> >ative execution, IIRC.
>
> Please don't confuse preloading and branch prediction with speculative
> execution. I gave a very simple example, which could be done solely by
> preexecuting address calculation instructions but, in general, that is
> not the case. I am pretty sure that none of the processors you mention
> do any significant amount of speculative execution.

I think what goes on inside processors such as PentiumPro / Pentium-II
as well as PowerPC-604 (and several others, Alpha included? dunno) very
much qualifies as "speculative execution", perhaps it would help if you
shared your definitions of "speculative" and "significant".

The fact that these processors employ register renaming and are able to
generate results speculatively and later choose whether or not to make
those results part of the visible state certainly sounds like "speculative
execution" to me.

Rob Barris Quicksilver Software Inc. rba...@quicksilver.com
* Opinions expressed not necessarily those of my employer *

Jason....@anu.edu.au

unread,
Oct 3, 1997, 3:00:00 AM10/3/97
to

In <3433AA...@NOSPAMarm.com> Wilco Dijkstra <Wilco.D...@NOSPAMarm.com> writes:

>It's my belief that almost all aliasing is avoidable at low cost during
>compilation. For example most aliasing is due to function calls. With
>global
>aliasing analysis the complete set of aliases can be calculated. Of
>course
>a good language which avoids most of the aliasing problems (eg. the ADA
>parameter
>passing mechanism) would help!

Hmmm... I think Pascal had the right approach when it came to reference
parameters: the effects of passing the same variable into a function
via more than one VAR parameter were "undefined", which left the

language implementor more scope for optimization. The mechanism
was perhaps less important than the simple fact that there was no
guarantee that reference parameters would work correctly in the presence
of aliasing, so you didn't do it.

Conclusion: C/C++ would be nicer languages if they had much more undefined
behaviour. :-) :-)

(Not having any C++ standards around, I ought to ask:
Are C++ references guaranteed to work in the presence of aliasing?)
_____________________________________________________________________________
Jason Ozolins (Jason....@anu.edu.au) | Put your hands over your eyes.
Teaching and Learning Technology Support Unit| Jump out of the plane.
Australian National University | There is no pilot.
My opinions, not ANU's. | You are not alone.
--

H. Eckert

unread,
Oct 4, 1997, 3:00:00 AM10/4/97
to

nm...@cus.cam.ac.uk (Nick Maclaren):
> I am, unfortunately, not joking when I say that there are critical paths
> in the X Windowing System etc. that need that sort of optimisation for
> efficiency.

If you define "efficiency" as getting the maximum possible data
throughput from a given machine, you're probably right since a general
purpose machine cannot suit every task equally well. But what does
"efficiency" mean to a GUI ? It is certainly nice to have a machine
that does every graphics operation faster than you can visibly follow
but that alone doesn't help a lot...

> Most foully, the paths can run ahead at different rates, and you
> STILL have to get interruption and recovery right :-(

So you need some synchronisation points where you lose some possible
CPU cycles in NOPs. Some friction is always present; the task for
optimization is to reduce them as much as possible...

> 2) EITHER Intel have reinvented the "Model 91 pipe-line drain"
> instruction under another name OR I could write a valid C program that
> would execute incorrectly.

Do you care to elaborate ? IMHO this would be a sign for a broken
compile which produces incorrect code.

> My point about languages is that it becomes possible to implement reliable
> speculative execution relatively easily, IF you can assume that certain
> constraints are met. In particular, you know that the operands may be
> invalid, but they will not change between the preloading and the final use.
> And that is PRECISELY what you do not get with C or any of its derivatives.

Ok, so you want to have a language where you can put additional
optimization hints into the code. This is fine but pretty useless
unless the programmer knows a lot more about performance implications
of certain high-level instructions. This is the school of "you need
to know how the computer works from inside to produce good code" to
which I agree to some extent. This is why assembly language is still
a part in a university computer science degree.

Greetings,
Ripley
--
Ach komm, ich denke schon, daß die Raver sehr tolerant sind. Gerade bei dieser
Love Parade. Du mußt dir nur mal ansehen, wer da sein wird: Westbam, Marusha,
Blümchen, ein Wagen vom Marienhof ... Also, das würde meine Toleranzgrenze weit
überschreiten. (M. Friese in bln.termine, <5pq8le$g1s$1...@news.cs.tu-berlin.de>)

Nick Maclaren

unread,
Oct 4, 1997, 3:00:00 AM10/4/97
to

In article <rbarris-ya0232800...@206.82.216.1>,

Rob Barris <rba...@quicksilver.com> wrote:
>
> I think what goes on inside processors such as PentiumPro / Pentium-II
>as well as PowerPC-604 (and several others, Alpha included? dunno) very
>much qualifies as "speculative execution", perhaps it would help if you
>shared your definitions of "speculative" and "significant".

Yes, I think that Andrew Glew and you are right, and there is considerable
terminological confusion. A paragraph that I left out of my previous
posting said that there was no clear boundary between simple operand
pre-loading and speculative/eager/etc/ execution in their fullest sense.
However, I was referring to actions like the following:

Preloading operands based on,
address calculations based on,
both paths of a conditional branch based on,
some tests based on,
preloaded operands based on,
address calculations based on,
both paths of a conditional branch based on,
a test that has not yet been evaluated.

I am, unfortunately, not joking when I say that there are critical paths
in the X Windowing System etc. that need that sort of optimisation for

efficiency. And remember that the address calculations will, in general,
need both addition and multiplication units. Most foully, the paths can


run ahead at different rates, and you STILL have to get interruption
and recovery right :-(

Naturally, I could be wrong, but I understood the Pentium XXX designs to
include only very limited speculative execution facilities. If I am
wrong, I am pretty certain of the following:

1) Interrupt handling has been broken again (though this is getting
be an Intel tradition, and they have got it codified in standards).

2) EITHER Intel have reinvented the "Model 91 pipe-line drain"
instruction under another name OR I could write a valid C program that
would execute incorrectly.

My point about languages is that it becomes possible to implement reliable


speculative execution relatively easily, IF you can assume that certain
constraints are met. In particular, you know that the operands may be
invalid, but they will not change between the preloading and the final use.
And that is PRECISELY what you do not get with C or any of its derivatives.

Andy "Krazy" Glew

unread,
Oct 5, 1997, 3:00:00 AM10/5/97
to Nick Maclaren

> I am, unfortunately, not joking when I say that there are critical paths
> in the X Windowing System etc. that need that sort of optimisation for
> efficiency. And remember that the address calculations will, in general,
> need both addition and multiplication units. Most foully, the paths can
> run ahead at different rates, and you STILL have to get interruption
> and recovery right :-(

Actually, precise interrupts seem to be a solved problem. There is some
residual advantage to imprecise interrupts and exceptions, as in Alpha, but it
mainly relates to buffer sizes. Further applications of the exception hierarchy
(see the earlier thread on out-of-order retirement) narrow it further, and, in
fact, seem to surpass exposed imprecise interrupts.

> Naturally, I could be wrong, but I understood the Pentium XXX designs to
> include only very limited speculative execution facilities. If I am
> wrong, I am pretty certain of the following:

Depends on what you mean by limited:

The Pentium Pro and Pentium II have a ReOrder Buffer of 40 instructions, a load
buffer of 16 (?), and a store buffer of 12. The reservation station size is 20.

Thus, 20 instructions can be waiting to execute at any time. At least an
additional 20 may have executed, and be waiting for the oldest to execute. In
addition, there are many more pipestages of instructions in the process of
being fetched.

There is no limit on how many predicted branches may be speculated past. I.e.
every entry in the reorder buffer, and all the instructions in earlier
pipestages, may be correctly predicted branches.

Cache misses can be performed speculatively: L1 I and D, L2. TLB misses can be
performed speculatively.

Stores are not performed speculatively. Nor are accesses to memory mapped I/O
and ordinary I/O spaces.

Is this "significant" speculative execution? I think that it is under any
reasonable definition of significant. It is not, of course, eager execution.
What's left is increasing the size of buffers, and tuning the algorithms to be
more efficient of resources.

The only significant restrictions on speculative execution are

(1) I/O not being done speculatively - which, by the way, I think can now be
done away with.

(2) The single Instruction Pointer (Program Counter) for instruction fetch.

and maybe

(3) The Pentium Pro family's mispredicted branch recovery is not so efficient.
If code is divided into segments between mispredicted branches, P6 can only
have one such segment under execution at any time (mo9dulo a limited amount of
overlap). This wasn't a limitation in our pipeline because the latencies
worked out right, and it should be obvious how it can be fixed.


> 1) Interrupt handling has been broken again (though this is getting
> be an Intel tradition, and they have got it codified in standards).

Depends. The APIC did break some interrupt code, but this was an APIC(Advanced
Programmable Interrupt Controller) issue, introduced in the i486 generation and
standardized in the Pentium (P54C) generation. Not an issue with the dynamic
execution model of the P6 (out-of-order, speculative).

> 2) EITHER Intel have reinvented the "Model 91 pipe-line drain"
> instruction under another name OR I could write a valid C program that
> would execute incorrectly.

Nope. "Serializing instructions" do exist. But almost nobody uses them.

I don't think that you can write a valid uniprocessor C program that will
break.

Gianni Mariani

unread,
Oct 5, 1997, 3:00:00 AM10/5/97
to

By this definition the MIPS R10K and the HP PA8XXX do speculative
execution. To some extent however, the MIPS compilers will
emit "eager" code in the sense that in some cases, both sides
of an if () statement are computed if scheduling of instructions
that way are an advantage.

--

_ ` _ ` Webtv Service Globalization
/ \ / / \ /-- /-- /
/ // / / / / / / / We mean to be understood
\_/ \ \_ \/ /_/ /_/ o
/ /
\_/ (415) 614 6043 Opinions mine etc ...

Richard A. O'Keefe

unread,
Oct 6, 1997, 3:00:00 AM10/6/97
to

har...@pauillac.inria.fr (Robert Harley) writes:

>Richard A. O'Keefe (o...@goanna.cs.rmit.edu.au) wrote:
>>Doesn't work.
>>
>>There's an illuminating discussion in the rationale of the POSIX.1
>>standard [...]
>>
>>THIS is Nick Maclaren's point. Given
>> char **triMat;
>>the declaration
>> double const * const * const constTriMat = triMat;
>>is simply illegal.

>Sure, and it's "simply illegal" to assume that integer arithmetic is
>done in binary.

This is completely wrong, and betrays a total lack of familiarity
with the history or content of the C standard.

The C standard *requires* binary integer arithmetic.
C89 was carefully worded to allow any of
sign-and-magnitude (like Unisys A series machines)
ones complement (like Unisys 2200 machines)
twos complement
But the C standard is quite clear that all integer arithmetic results
must be as if one of these systems were in use.

In contrast, the mismatched pointer example is one where the
compiler is not merely not required to make it work, the
compiler is *REQUIRED* to issue a diagnostic.

>But guess what, there are other implementations that are not only
>plausible but actually exist!

So? The C standardisers were not interested in limiting C to only
the machines where your prejudices happen to be valid. The C9X
revisers have no interest in such limitations either.

>And they have implementation-dependent
>features left open by the standard, and extensions not specified by
>the standard! Gasp!

Yeah, but mixing up pointers to differently qualified objects is NOT
only of those things left open.

>Over here in the real world, we compile our programs with them:
>compilers for which integers are in binary

which the C standard clearly and explicitly requires

>and type qualifiers
>give warnings sometimes, not errors.

Then you are not using standard C compilers. That is your privilege.

>If you are allergic to such a
>warning, stick in a cast as Nick suggested.

No. The diagnostics are required because a cast _wouldn't_ fix the
problem.

>>>Whereas your beloved Fortran checks that the user's assumptions about
>>>lack of aliasing are correct, perhaps?
>>
>>Yes, there are indeed static checkers available for Fortran that will
>>check such assumptions. I have not used any myself, but I am aware of
>>two.

>Ah, so in the case of C refer to the standard only, in order to
>criticize limitations in it.

Are you serious?

I have not criticised any limitation in the C standard at all.
I have pointed out that the kind of Stupid Pointer Tricks (comp.std.c
idiom) you want to play are explicitly forbidden by the standard.
I THINK THIS IS RIGHT. I am not criticising the standard there,
I am approving it.

>But for other languages, dredge up extra
>features you "are aware of" in implementations you've never used.

Those checkers enforce aspects of the Fortran standard.
C compilers are required to enforce aspects of the C standard.
As for me, I use the best static checkers for C that I can get
my hands on, and if I did as much Fortran coding as I do C,
I would make sure to use the best static checkers for Fortran
that I could get my hands on.


>Everybody is fooled by this devastating ploy, so we'll now delete our
>programs, apps and O.S.es, the transport protocols that will send this
>message and so on, and rewrite them in a superior language.

Oh grow up.

Where did I say C was inferior?
On *one specific point*, I provided factual support for Nick Maclaren's
claim. That has absolutely no bearing on whether one language is
better than the other *in general* or for any particular *nominated
purpose*.

C was not designed for high performance numeric computing, so it is
no disgrace that it isn't _quite_ as well suited to that task as
Fortran, which _was_ designed for high performance numeric computing.

Fortran was not designed for implementing operating systems, so it is
no disgrace that it (F95) isn't _quite_ as well suited to that task
as C, which _was_ designed for implementing operating systems.

C is good enough to stand on its own merits without people telling
lies about it.

Richard A. O'Keefe

unread,
Oct 6, 1997, 3:00:00 AM10/6/97
to

m...@coos.dartmouth.edu (Alex Colvin) writes:

>>Hmmm... I think Pascal had the right approach when it came to reference
>>parameters: the effects of passing the same variable into a function
>>via more than one VAR parameter were "undefined", which left the
>>language implementor more scope for optimization.

>Last time I mentioned this, I was informed that this restrictition had
>been dropped.

ISO 10206:1990, Extended Pascal has four kinds of parameters:

a: integer; pass by copy-in, modifiable
var b: integer; pass by reference, modifiable
protected c: integer; pass by copy-in, not modifiable
protected var d: integer; pass by reference, not modifiable

The confusing thing (Note 2, p55) is that a protected variable
parameter (protected var) may not be changed, but it may change.
This is the only mention of 'aliasing' that I can find in that
standard.

Of course this kind of thing is very upsetting to optimisers.
It means that in the sequence of statements

x := d;
<FOO!>
x := d;

where <FOO!> contains no other assignments to x, if <FOO!> contains
any assignments to non-local integer variables or procedure calls,
x will have to be reloaded, even tough _this_ procedure isn't allowed
to modify d directly. This is even more upsetting to human reasoning
than to optimisers.


>Support for aliasing may be a good thing. Undefining the effect of
>aliasing looks likely to lead to seriously obscure bugs, unless you can
>check this behavior at compile time.

Consider this:

C If this were F90, I'd have INTENT OUT stated explicitly
SUBROUTINE BREAK(X, Y)
REAL X, Y
X = 1.0
Y = 2.0
END

PROGRAM MAIN
REAL A(10)
INTEGER I, J

READ * I, J
CALL BREAK(A(I), A(J))
PRINT *, A(I), A(J)
END

What is the output of this program?
How is the compiler to check at compile time that I.NE.J?

It is important to distinguish between case where you WANT to support
aliasing (which is the expensive option) and cases where you DON'T
(which is the high performance option).

Ada 95 did the right thing here, requiring you to explicitly declare
any potentially aliased variables.

Aliasing is very confusing for people to deal with.
Of course, as soon as you have objects as in Smalltalk or Common Lisp or
Java, you have precisely the same kind of problem when the same object
reference can be passed (the _reference_ is passed by copy-in) to a
procedure.

Nick Maclaren

unread,
Oct 6, 1997, 3:00:00 AM10/6/97
to

In article <3437F16A...@cs.wisc.edu>,

"Andy \"Krazy\" Glew" <gl...@cs.wisc.edu> writes:
|>
|> The Pentium Pro and Pentium II have a ReOrder Buffer of 40 instructions, a load
|> buffer of 16 (?), and a store buffer of 12. The reservation station size is 20.
|>
|> There is no limit on how many predicted branches may be speculated past. I.e.
|> every entry in the reorder buffer, and all the instructions in earlier
|> pipestages, may be correctly predicted branches.
|>
|> Is this "significant" speculative execution? I think that it is under any
|> reasonable definition of significant. It is not, of course, eager execution.
|> What's left is increasing the size of buffers, and tuning the algorithms to be
|> more efficient of resources.

The traditional meaning of "speculative" included what people have now
separated off as "eager" (assuming that usage is now standard), and is
what I was referring to. In the cases that I was referring to, it is
essential to execute ahead of BOTH branches, and I was therefore
referring to what you call "eager execution". I made this quite clear
in the example immediately preceding the first paragraph you quoted.

The reason is that the same code is used for two objects, which are
likely to need different branch predictions. To get this "right" for
single-path speculative execution, you need to do branch prediction on
the basis of the DATA location as well as the CODE location.

To sidetrack a little, rip...@nostromo.in-berlin.de (H. Eckert) writes:

|> If you define "efficiency" as getting the maximum possible data
|> throughput from a given machine, you're probably right since a general
|> purpose machine cannot suit every task equally well. But what does
|> "efficiency" mean to a GUI ? It is certainly nice to have a machine
|> that does every graphics operation faster than you can visibly follow
|> but that alone doesn't help a lot...

Unfortunately, most GUIs (and definitely including anything based on X)
make timing assumptions about the responses. There are a very large
number of X applications where a 0.01 second delay will cause them to
behave incorrectly - and, in fact, I am using one right now (xrn)!
These may be bugs in the application, but are actually almost unfixable,
as they are really due to design errors in X.

I have had similar effects in OS/2, Windows 3.1 and MacOS. I have never
used Microsoft's latest utterings, but have reason to believe that the
same would apply.

|> Is this "significant" speculative execution? I think that it is under any
|> reasonable definition of significant. It is not, of course, eager execution.
|> What's left is increasing the size of buffers, and tuning the algorithms to be
|> more efficient of resources.

Yes, it is. But not under any reasonable definition of "speculative"!
Using your terminology, consider my posting to refer to "significant
eager execution".

bri...@bit3.com

unread,
Oct 6, 1997, 3:00:00 AM10/6/97
to

In article <handleym-031...@handma.apple.com>,

Maynard Handley <hand...@apple.com> wrote:
>In article <61355n$1rf$1...@void.agames.com>, alb...@agames.com (Mike
>Albaugh) wrote:
>
>I think Mike put together a pretty coherent argument regarding issues that
>have bugged me about discussions of language utility.
>To give a simple example here, I once wrote a fairly large and complex
>program (an MPEG-1 encoder/decoder) using the buzz-words of the time, C++,
>objects and encapsulation. What I found after a while was that
>understanding the source was far from trivial because
> myObject->SomeFunction() could refer to any of a dozen functions. The
>only way to figure out how any code worked was to walk through it in the
>debugger and see exactly what sort of object myObject was and thus which
>version of SomeFunction() was being called. Writing the code was easy,
>understanding it was not.
>(Note that this is in contrast to C. In C while one CAN easily write
>incomprehensible programs, one can use one's head and not do so. Here
>specifically using the language features that were supposed to boost my
>productivity was the cause of the problems.)
>
>Now maybe I'm a moron. Maybe my methods for slicing up an app into objects
>were stupid compared to how OO advocates would do things. However I find
>it hard to believe I did things so badly. My suspicion, rather, is that
>this is a real problem (at least with C++, and given the way OO is
>supposed to work, as I understand it, with pretty much any OO language).

This is a common symptom of a bad object design- you shouldn't case _which_
function gets called- just so long as it applies the correct concept to
the object. A good example of a properly overridden function would be
open(). When applied to objects representing files, network connects, or
fifo pipes, the function would execute radically different code- but the
code calling doesn't care, so long as it opens the object (whatever it is)
and makes it ready for reading or writing. Several different implementations,
but one concept.

Programming languages do not, can not, prevent a programmer from writting
a bad program. This is a problem, not only with OO languages, but with
all languages. OO languages just have a (slightly) different concept of
"good" then procedural languages.

>
>In an ideal world, where one starts from some well-defined spec for a
>well-defined app that's been written many times before and is understood,
>maybe it shouldn't matter which version of SomeFunction() gets called as
>long as its the one appropriate to myObject.
>But in the real world at least some code changes over time. One adds
>features. One does not know what one can do until one tries things. In
>such a world hiding the specific function that's called is a recipe for
>trouble.

Functions get rewritten, and have different behaviors after being rewritten,
which breaks code. You violate the assumptions the code calling your code
makes, and you have problems. In procedural languages as well as OO
languages. Not allowing the programmer to generalize leads to reinventing
the wheel everytime you turn around. It also limits the code- my code
may conceptually work with files, pipes, and network connections, but since
I called file_open() and not pipe_open() or network_open(), it doesn't.
To make it work with pipes and networks, you need to have three nearly
identical copies of the code- one calling each set of functions. And be
sure to remember if you change one peice of code to also change the other
two! Or is it three? Maybe four?...

Which is why I say: you can take my virtual functions when you pry them
from cold, dead fingers. :-)

Brian
I don't speak for Bit 3

>
>Maynard
>
>--
>My opinion only


--
"NT and security should never be used in the same breath."
- Winn Schwartau, EETimes #972, 22 September 1997 pp.96

-Brian

Graham Hughes

unread,
Oct 6, 1997, 3:00:00 AM10/6/97
to

-----BEGIN PGP SIGNED MESSAGE-----

>>>>> "Richard" == Richard A O'Keefe <o...@goanna.cs.rmit.edu.au> writes:

Richard> I think perhaps it is time instead that _you_ took your
Richard> blinkers off. Fortran is not my favourite language by a long
Richard> way, but it has come a *long* way in the last 20 years, and
Richard> is by most criteria a more 'modern' language than C.

Well, that's partly because F90/95 really isn't the same language as
F77.

Certain language communities evolve in different ways; the Wirth crowd
tends to invent new languages with different names but similar M.O.'s
(Pascal->Modula 2->Ada|Oberon), LISP reinvents itself with different
semantics but almost exactly the same syntax, and Fortran invents new
languages and calls them the same thing. Note that both Fortran and
LISP are longer lived than anything out there, so this may not be an
entirely bad thing....

What was the epigram? ``I don't know what language we'll be using in
2030, but I know it'll be called FORTRAN?''
- --
Graham Hughes <graham...@resnet.ucsb.edu> / Never attribute to malice that
MIME OK, PGP preferred / which can adequately be
http://A31147.resnet.ucsb.edu/~graham/ / explained by stupidity.
/ -- Hanlon's Razor

-----BEGIN PGP SIGNATURE-----
Version: 2.6.3
Charset: noconv
Comment: Processed by Mailcrypt 3.4, an Emacs/PGP interface

iQCVAwUBNDnLziqNPSINiVE5AQEp1wP+LTa1OFoS+oBQxei5lQnu7RUUDUqe2kbt
FeNcuZyri5IL8D9qvjdG9hssfly0FbAQ5Pe34Cvy2MIav4FVc3XTtdtN1UTGVsoc
pKc2nS06sjXiFXlrGmWo8XEliaZy3piIKGIOSdcjaUPFeNxzJVfq/8BHSdjQI0QB
oo2J/Ey6XDM=
=Bd4P
-----END PGP SIGNATURE-----

boo

unread,
Oct 6, 1997, 3:00:00 AM10/6/97
to

Richard A. O'Keefe wrote:
> C was not designed for high performance numeric computing, so it is
> no disgrace that it isn't _quite_ as well suited to that task as
> Fortran, which _was_ designed for high performance numeric computing.
>
> Fortran was not designed for implementing operating systems, so it is
> no disgrace that it (F95) isn't _quite_ as well suited to that task
> as C, which _was_ designed for implementing operating systems.

Oh puh-leeze.

Who cares what it was designed for? C is far more powerful than
Fortran, especially for the kind of work that some of us
do in the real world. If Rob can show me how to do something
with a particular compiler that speeds up what I need done,
I am listening to him, not to you.

Time to dismount that Ivory Tower of yours.

Nick Maclaren

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

In article <61cqkc$9...@news-rocq.inria.fr>,
har...@pauillac.inria.fr (Robert Harley) writes:
|> o...@goanna.cs.rmit.edu.au (Richard A. O'Keefe) writes:
|> >There is precisely *one* construction present in C which is not
|> >supported as well or better by Fortran, and that is pointer arithmetic.
|>
|> That sounds suspiciously like "There is precisely *one* thing present
|> on car X which is not on car Y, and that is wheels".

Are you aware of the facts (a) that unchecked pointer arithmetic
is the source of most of the foulest bugs in existing software,
(b) that many software engineering experts believe that it is
impossible to use safely and (c) that a vast amount of experience
shows that it can usually be omitted with little hassle to the
programmer?

Your analogy is more than a little inappropriate.

|> Hey, if some people are under the delusion that C sucks but
|> Fortran/Lisp/etc are great, that's just fine. But please resist the
|> urge to barge into every thread and yammer on about it.

The original question was "what effect do you think that the amazing
new features will have", and I answered "very little, because of the
loathesome coding style (and language properties) of many standard
applications." You may regard that as irrelevant, but you would be
wrong.

Martin Rodgers

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

Serge Pachkovsky wheezed these wise words:

> Who cares what it was designed for? Fortran is far more powerful than
> C, especially for the kind of work that some of us do in the real world.

Which egg orientation do you prefer?
--
<URL:http://www.wildcard.demon.co.uk/> You can never browse enough
Please note: my email address is gubbish
assert(got(coat)).

Martin Rodgers

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

Robert Harley wheezed these wise words:

> Hey, if some people are under the delusion that C sucks but
> Fortran/Lisp/etc are great, that's just fine. But please resist the
> urge to barge into every thread and yammer on about it.

Let's ask the C/C++ if they could please follow the same advise. When
I started reading UseNet, for the first 3 years I saw massed attacks
from C/C++ programmers, massively crossposting, attacking Lisp.

If C++ is such a fine language, then do so many of its supporters
express their support by "dumping" on other languages? What are they
trying to prove? Ditto for C.

Now, can anyone tell me what was significant about 1995 that caused
those massed attacks (mentioned above) to cease? I think that the form
of those attacks may offer a wee clue: garbage collection. Is there no
stronger argument against Lisp than GC? It appears not, judging by the
post-95 nature of attacks on Lisp.

Surely there're better arguments for using C/C++ than kicking some
other language? I appreciate good intentions, but I'm puzzled by the
frequency with which I find people on UseNet (and elsewhere) trying to
convince a group of people that they shouldn't be using the tools that
they are in fact using. Even if we do all need saving from ourselves,
could there not be a better way of doing it than the "hard sell"?

Richard's comments look to me like a reasonable response to a hard
sell. It's a popular one, and it goes like this, "Thanks for your
time, but I'm perfectly happy with what I'm currently using." Some
people go as far as hanging up the phone or closing the door, rather
than continue an argument that has no end.

Of course, I could be wrong about Richard's comments. I can't
speak for him! I'll get my coat.

boo

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

Richard A. O'Keefe wrote:
> I strongly suspect that you don't know what modern Fortran is like.

> There is precisely *one* construction present in C which is not
> supported as well or better by Fortran, and that is pointer arithmetic.

And you think this is a minor difference?

>
> Modules? Fortran has them, C hasn't.
> Overloading? Fortran has it, C hasn't.
> Records? Fortran and C both have them.
> Pointers? Fortran and C both have them.
> Dynamic allocation? Fortran and C both have them.
> Recursive procedures? Fortran and C both have them.
> Nested procedures? Fortran has one level of nesting, C has none.
> Strings? Fortran has them, C hasn't.
> Character-oriented IO? Fortran and C both have it.
> Record-oriented IO? Fortran has it, C hasn't.
> User defined operators? Fortran has them, C hasn't.
> Free-format source form? Fortran and C both have it.
> Bitwise operations? Fortran and C both have them.
> Control over record layout? Neither language supports this. (Yes, I know
> about bitfields, and I also know how extremely weak are the
> guarantees you get about them.)

Ever heard of C++?

Hahaha! Now I get it! This is a troll! Nobody could possibly
think this is a serious reply.

What I do with my C++/C/assembly combination, you cannot possibly
do with Fortran.... as quickly and easily as I do. I work in the
virtual reality field, with parallel algorithms, gigabyte databases
and realtime 3D OpenGL animation, and some serious number crunching
too. The most heavily used routines I code in C with a little
inline assembly (which is quick enough to do for any CPU once you
have a little practice). What do you do? Simple number crunching
of the style done in the 60's and 70's? Sure, Fortran will suffice!

> I think perhaps it is time instead that _you_ took your blinkers off.
> Fortran is not my favourite language by a long way, but it has come
> a *long* way in the last 20 years, and is by most criteria a more
> 'modern' language than C.

I happen to have used modern variants of Fortran, I'll have you
know. Fortran has its place, certainly. It is possible to produce
very efficient code with a high level language that is easy to
master. But outside the realm of brain-dead number crunching, it
cannot compare with efficient C++/C.

Martin Rodgers

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

Greg Weeks wheezed these wise words:

> |> Which egg orientation do you prefer?
>

> Big Endian.

Anyone unsure what we're talking about should read:
<URL:http://www.jaffebros.com/lee/gulliver/bk1/chap1-4.html>

Greg Weeks

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

In article <MPG.ea44c489...@news.demon.co.uk>, mcr@this_email_address_intentionally_left_crap_wildcard.demon.co.uk (Martin Rodgers) writes:
|> Serge Pachkovsky wheezed these wise words:
|>
|> > Who cares what it was designed for? Fortran is far more powerful than
|> > C, especially for the kind of work that some of us do in the real world.
|>
|> Which egg orientation do you prefer?

Big Endian.

Greg Weeks

Carlie Coats

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

In article <34392537...@tonys.ecology.umn.edu>,

boo <b...@tonys.ecology.umn.edu> wrote:
| Richard A. O'Keefe wrote:

This from an advocate of a language that makes arrays second class
citizens? C can't even do the equivalent of

SUBROUTINE S( M, N, P, A )
INTEGER, INTENT( IN ): M, N, P
REAL, INTENT( IN ): A( M, N, P )
...

nor can C++, for that matter. Fortran can. So can lisp.

_Whose_ ivory tower?

William Clodius

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

Richard A. O'Keefe wrote:
> <snip>

> >Who cares what it was designed for? C is far more powerful than
> >Fortran, especially for the kind of work that some of us
> >do in the real world.
>
> I strongly suspect that you don't know what modern Fortran is like.
> There is precisely *one* construction present in C which is not
> supported as well or better by Fortran, and that is pointer arithmetic.
> <snip>

F90 also lacks pointers to procedures. It lacks stream based I/O. (There
are contexts where non-advancing I/O is not sufficient.)

--

William B. Clodius Phone: (505)-665-9370
Los Alamos Nat. Lab., NIS-2 FAX: (505)-667-3815
PO Box 1663, MS-C323 Group office: (505)-667-5776
Los Alamos, NM 87545 Email: wclo...@lanl.gov

Shankar Unni

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

Robert Harley wrote:

> o...@goanna.cs.rmit.edu.au (Richard A. O'Keefe) writes:

> >There is precisely *one* construction present in C which is not
> >supported as well or better by Fortran, and that is pointer arithmetic.
>

> That sounds suspiciously like "There is precisely *one* thing present
> on car X which is not on car Y, and that is wheels".

Yo, Robert (and you, too, "b...@somethingorother.umn.edu"):

Richard and Nick have probably forgotten more C than you'll ever learn
in your lifetime, OK? Give it a rest; listen and learn. There's a few
lifetimes of actual working experience behind these opinions..

We see this sort of crazed C cheerleading about the start of every
semester. We've all programmed in C (and in C++). We are painfully aware
of almost every strength and weakness in both C and C++.

No one's slamming C as the worst of all languages. I'll still write an
OS, if I have to, in C or C++, but after breaking my ankles on all the
gopher holes, I'll be real careful where I put my foot the next time.

But if I'm doing numerical analysis or linear regressions, I'll pick
Fortran almost any day. No, I won't do it in C++, even if I can make a
"Matrix" class ("look, Mommy! I made a Matrix!"). If you don't
understand why, you need to take a *real* "Numerical Analysis with C++"
class.

--
Shankar Unni Powertel Global, Inc.
(650) 259-1700 sha...@powertelglobal.com
(650) 259-1702 (fax) sha...@webnexus.com

William Clodius

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

boo wrote:
>
> Richard A. O'Keefe wrote:
> > I strongly suspect that you don't know what modern Fortran is like.
> > There is precisely *one* construction present in C which is not
> > supported as well or better by Fortran, and that is pointer arithmetic.
>
> And you think this is a minor difference?
>

I suspect he considers it to be a major advantage, for Fortran.

> <snip>


>
> What I do with my C++/C/assembly combination, you cannot possibly
> do with Fortran.... as quickly and easily as I do. I work in the
> virtual reality field, with parallel algorithms, gigabyte databases
> and realtime 3D OpenGL animation, and some serious number crunching
> too. The most heavily used routines I code in C with a little
> inline assembly (which is quick enough to do for any CPU once you
> have a little practice). What do you do? Simple number crunching
> of the style done in the 60's and 70's? Sure, Fortran will suffice!

> <snip>

It is relatively easy to find out what Richard O'Keefe does, check his
news postings using Dejanews (http://www2.dejanews.com/), I suspect you
might find his postings to comp.std.c to be of particular interest, also
check his web page (http://www.cs.rmit.edu.au/%7Eok), note Richard
should remove the trailing semicolon from his signature. While Richard
has done some number crunching, his interests primarilly lies elsewhere:
teaching, algorithms (parallel and sequential), databases, logic
programming, functional programming, language design, and language
implementation.

If Richard comments on an aspect of a language pay attention! I have
read knowledgeable comments from him on Ada (83 & 95), Algol 68, APL, C
(K&R & ISO), C++, Clean, Fortran (77, 90, & HPF), Lisp, Mercury, Pascal
(Standard, Extended, & Borland), Perl, Prolog, S, Scheme, and Sisal.
Comments that indicate that he has both thoroughly read the
documentation and used implementations for all of the above languages.
Further, he has worked on compilers for at least two languages: Prolog
and a dialect of S. He has claimed to have experience with 100+
languages, and I believe him. If you think he has said something that is
incorrect look closely: you probably read too quickly, but it might be a
rare case of ambiguity or (obvious) opinion that is not labeled as such.

Stefan Monnier

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

gee...@pitt.edu (Gregg E Economou) writes:
> >Dynamic allocation? Fortran and C both have them.
> You're confusing issues of the library with the language.

Actually, not necessarily. Dynamic allocation can be a language issue.
Dynamis allocation in C is sufficiently minimal that it can be delt with by a
few library functions, but you must admit that the usual practice is to say "a
garbage collected language" rather than "a language with a GC library".

> a point to be thought of- Fortan is a high level language, C is not.

Which is the whole point of Nick. He's not trying to make "you should always
use C vs. you should always use Fortran" but he's just pointing out that for
many uses, C is too low level a language for a compiler to be able to take
advantage of modern processors. It's fine as long as you understand that it's a
low level language and that you shouldn't expect the compiler to reason too
much about the program and make something clever with it. In other words,
because it's low level you can write "down to the metal" code and because it's
low level, you're forced to write "down to the metal code" (for instance,
you might have to do what amounts to trace scheduling by hand).

So please let's get back on track: what should a language look like to make it
possible for compilers to generate good code ?


Stefan

boo

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

On 7 Oct 1997 12:53:16 -0400, Carlie Coats (x...@hilbert.ncsc.org) said:
> > This from an advocate of a language that makes arrays second class
> > citizens? C can't even do the equivalent of
> >
> > SUBROUTINE S( M, N, P, A )
> > INTEGER, INTENT( IN ): M, N, P
> > REAL, INTENT( IN ): A( M, N, P )
> > ...
> >
> > nor can C++, for that matter. Fortran can. So can lisp.

Oh, so you think that high level language constructs are
something to boast about?

How about

#include <LifeTheUniverseAndEverything.h>

int main(void){
SolveMyProblem();
return 0;
}

Can yours do that? That header file I included contains the
prototypes for the functions I like the most, in my
own particular implementation of C. Is this high level
enough for you?

Paul added:
> Excuse the rest of us C/C++ biggots who don't know the F90 syntax, but
> what does the above syntax signify?

I suspect it says something about how the data ("arrays") should
be laid out in memory, and says something about how you are
going to access it, to help the compiler out. In other
words, let the compiler do the thinking, and let the
programmer concentrate on the abstract stuff. This kind
of thing is possible in a simple language like Fortran.

To the original poster: If I know something about the
architecture of the machine you are using (such as a T3E, for
example), I could write some nodal C code with inline
assembly or optimized C (with some non-standard
code, perhaps) that will thrash your high level Fortran code
any day. I may have to use my noggin a little to figure
out the best way to lay the problem out, but that is ok.
Of course, you most likely will get your code running
before I do.... and THAT precisely is where the power
of Fortran is. Of course, Fortran is also ideal for people
who don't quite "get all the pointer nonsense in C"
(probably the biggest stumbling block that sends people
scurrying back to Fortran, basic, pascal, etc.).

boo

Richard A. O'Keefe

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

boo <b...@tonys.ecology.umn.edu> writes:
>Oh puh-leeze.

>Who cares what it was designed for? C is far more powerful than
>Fortran, especially for the kind of work that some of us
>do in the real world.

I strongly suspect that you don't know what modern Fortran is like.


There is precisely *one* construction present in C which is not
supported as well or better by Fortran, and that is pointer arithmetic.

Modules? Fortran has them, C hasn't.


Overloading? Fortran has it, C hasn't.

Records? Fortran and C both have them.
Pointers? Fortran and C both have them.


Dynamic allocation? Fortran and C both have them.

Recursive procedures? Fortran and C both have them.
Nested procedures? Fortran has one level of nesting, C has none.
Strings? Fortran has them, C hasn't.
Character-oriented IO? Fortran and C both have it.
Record-oriented IO? Fortran has it, C hasn't.
User defined operators? Fortran has them, C hasn't.
Free-format source form? Fortran and C both have it.
Bitwise operations? Fortran and C both have them.
Control over record layout? Neither language supports this. (Yes, I know
about bitfields, and I also know how extremely weak are the
guarantees you get about them.)

>If Rob can show me how to do something

>with a particular compiler that speeds up what I need done,
>I am listening to him, not to you.

Why shouldn't you listen to both of us? I am, after all, telling the truth,
as you can easily verify for yourself.

>Time to dismount that Ivory Tower of yours.

I think perhaps it is time instead that _you_ took your blinkers off.


Fortran is not my favourite language by a long way, but it has come
a *long* way in the last 20 years, and is by most criteria a more
'modern' language than C.

--

Robert Harley

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

o...@goanna.cs.rmit.edu.au (Richard A. O'Keefe) writes:
>There is precisely *one* construction present in C which is not
>supported as well or better by Fortran, and that is pointer arithmetic.

That sounds suspiciously like "There is precisely *one* thing present


on car X which is not on car Y, and that is wheels".


>[... FUD deleted...]

Hey, if some people are under the delusion that C sucks but
Fortran/Lisp/etc are great, that's just fine. But please resist the
urge to barge into every thread and yammer on about it.

-- R.

Serge Pachkovsky

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

boo (b...@tonys.ecology.umn.edu) wrote:
: Oh puh-leeze.

: Who cares what it was designed for? C is far more powerful than
: Fortran, especially for the kind of work that some of us
: do in the real world.

Oh puh-leeze.

Who cares what it was designed for? Fortran is far more powerful than
C, especially for the kind of work that some of us do in the real world.

Cheese,

/Serge.P

--

Russian guy from the Zurich university...


cliffc

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

Mike Albaugh wrote:
> I wish I could remember the source of the quote:
> "A design is finished not when there is nothing more to
> add, but when there is nothing left to remove".

From the chapter heading in my thesis on compiler IR design:

"A designer knows he has achieved perfection not when there is
nothing left to add, but when there is nothing left to take away."
-- Antoine de Saint-Exup'ery (1900-1944)

The French author, best known for his classic children's book
"The Little Prince", was also an aircraft designer.

Cliff

--
Cliff Click Compiler Designer and Researcher
cliffc at acm.org JavaSoft
(408) 863-3266 MS UCUP02-302

Hugh LaMaster

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

cliffc wrote:

> "A designer knows he has achieved perfection not when there is
> nothing left to add, but when there is nothing left to take away."
> -- Antoine de Saint-Exup'ery (1900-1944)
>
> The French author, best known for his classic children's book
> "The Little Prince", was also an aircraft designer.

"Night Flight" is an adult book to remember him by.

--
Hugh LaMaster, M/S 258-5, Email:
hlam...@mail.arc.nasa.gov.NOSPAM
NASA Ames Research Center Or: lama...@nas.nasa.gov.NOSPAM
Moffett Field, CA 94035-1000 Junkmail: Don't: USC 18 section 2701
Phone: 415/604-1056 Disclaimer: Unofficial, personal
*opinion*.

Paul Hsieh

unread,
Oct 7, 1997, 3:00:00 AM10/7/97
to

On 7 Oct 1997 12:53:16 -0400, Carlie Coats (x...@hilbert.ncsc.org) said:
> This from an advocate of a language that makes arrays second class
> citizens? C can't even do the equivalent of
>
> SUBROUTINE S( M, N, P, A )
> INTEGER, INTENT( IN ): M, N, P
> REAL, INTENT( IN ): A( M, N, P )
> ...
>
> nor can C++, for that matter. Fortran can. So can lisp.

Excuse the rest of us C/C++ biggots who don't know the F90 syntax, but

what does the above syntax signify?

--
Paul Hsieh
http://www.geocities.com/SiliconValley/9498/mailme.html

Zalman Stern

unread,
Oct 8, 1997, 3:00:00 AM10/8/97
to

Richard A. O'Keefe (o...@goanna.cs.rmit.edu.au) wrote:
: Aliasing is very confusing for people to deal with.

: Of course, as soon as you have objects as in Smalltalk or Common Lisp or
: Java, you have precisely the same kind of problem when the same object
: reference can be passed (the _reference_ is passed by copy-in) to a
: procedure.

See for example item 17 in Scott Meyers' _Effective_C++_ titled "Check for
assignment to self in operator=." The item explains that the statement:
a = a;
in C++ may not be a "nop" for an improperly implemented assignment
operator. (I wonder if the C++ draft standard allows the compiler to punt
on calling the assignment operator in that case. I doubt it...)

-Z-

Richard A. O'Keefe

unread,
Oct 8, 1997, 3:00:00 AM10/8/97
to

William Clodius <wclo...@lanl.gov> writes:

>F90 also lacks pointers to procedures.

True. As someone who is peeved with the oddly restricted function
pointers in Ada95, I *should* have remembered this one. Mind you,
function pointers in C are seriously restricted too.

>It lacks stream based I/O.

I am not quite certain what you mean by this.
If you mean the kind of thing that is found in C,
then I have a printed copy of the specification for stream-based IO in
Fortran somewhere in this office.

>(There
>are contexts where non-advancing I/O is not sufficient.)

Agreed.

Remember though, that the context of this thread was
- hardware and optimisation
- limited or not by common programming languages?

An important thing to understand is that if you are seriously concerned
about high performance and reliability, a language may be as useful for
what it _doesn't_ allow as for what it does.

Gregg E Economou

unread,
Oct 8, 1997, 3:00:00 AM10/8/97
to

In article <61c9kl$94l$1...@goanna.cs.rmit.edu.au>,

Richard A. O'Keefe <o...@goanna.cs.rmit.edu.au> wrote:
>
>I strongly suspect that you don't know what modern Fortran is like.
>There is precisely *one* construction present in C which is not
>supported as well or better by Fortran, and that is pointer arithmetic.

I strongly suspect you dont know what C is like. Go read K&R.

>Modules? Fortran has them, C hasn't.

C is very simple, and designed such that you can implement things of
a higher nature with it if you wish. You can write code that
behaves in a particular style in C.

>Overloading? Fortran has it, C hasn't.

one legitimate arguemtn in a cast of many pointless ones. touche.
(need an accent-grave there)

>Records? Fortran and C both have them.
>Pointers? Fortran and C both have them.
>Dynamic allocation? Fortran and C both have them.

You're confusing issues of the library with the language.
That migth be fine in an HLL like Fortran but in C the library
is not part of the language, even if it is standardized and commonly
accepted.

>Recursive procedures? Fortran and C both have them.
>Nested procedures? Fortran has one level of nesting, C has none.
>Strings? Fortran has them, C hasn't.

But C is lower-level. You can *implement* those things inside a C progam
if you like, but dont have to take on the extra weight if you dont.
Much more flexible.

>Character-oriented IO? Fortran and C both have it.
>Record-oriented IO? Fortran has it, C hasn't.

C has NO I/O. All I/O in the C libraries is implemented using the
basics of the langugae itself in manipulating memory directly. There is
no I/O native to the language. There is almost nothing beyond the
basics of memory handling, logic, and conditionals in the language itself.
The fact that C is used for such a wide variety of things is testimony to
the incredible flexibility it has by NOT being bound to things liek I/O.

>User defined operators? Fortran has them, C hasn't.
>Free-format source form? Fortran and C both have it.
>Bitwise operations? Fortran and C both have them.
>Control over record layout? Neither language supports this. (Yes, I know
> about bitfields, and I also know how extremely weak are the
> guarantees you get about them.)

see above.


>
>Why shouldn't you listen to both of us? I am, after all, telling the truth,
>as you can easily verify for yourself.

only a truth that is based on misdirection. You are obviously mistaken as
to what C is and what the C libraries are, and are mixing the two fo them up.

>I think perhaps it is time instead that _you_ took your blinkers off.
>Fortran is not my favourite language by a long way, but it has come
>a *long* way in the last 20 years, and is by most criteria a more
>'modern' language than C.

a point to be thought of- Fortan is a high level language, C is not.
The utility of C is that it is very universally portable and applicable-
because the laguage itself assumes only the presence of memory, a processing
capability to perform logical and arithmetic operations, and the ability
to perform conditional executions. The manipulation of memory is the
implementation of almost all higher-level fucntionality in the C libraries.
The beauty fo this is that it ends up that C programs very closely resemble
the actual activity fo the machine and low-level routines with a high-level
structure and syntax, and lead to a much simpler and more portabel coding than
an assembly lends, but is still very close down there to the hardware.

However, it is not ncessarily the best thing for _every_ job. For example,
renaming a list of files would be much simpler to accomplish with a short shell
script, and would be much more logically done that way than with a C program
designed explicitly for such a purpose.

anwyays.. lets NOT start another language war, eh?

isildur

David Chase

unread,
Oct 8, 1997, 3:00:00 AM10/8/97
to

Gregg E Economou wrote:
> ... The manipulation of memory is the
> implementation of almost all higher-level functionality in the C libraries.
> The beauty of this is that it ends up that C programs very closely resemble

a rat's nest of memory and functional unit latency.

> anwyays.. lets NOT start another language war, eh?

Why should the last word be given to someone who so clearly doesn't
know what he is talking about? Or were you just trolling?

--
David Chase, ch...@world.std.com

Richard A. O'Keefe

unread,
Oct 8, 1997, 3:00:00 AM10/8/97
to

gee...@pitt.edu (Gregg E Economou) writes:
>I strongly suspect you dont know what C is like. Go read K&R.

I have been using C since the days when UNIX V6+ was the latest thing.
I've fixed bugs in two C compilers.
I've bought and read three copies of the C standard (they keep disappearing
from my office; I'm getting sick of it),
and have obtained and read every publically available document about C9x.
(I was writing a proposal to add offset types to C, but didn't get it
finished in time to submit it to the committee. Sigh. It'd sure make
putting linked structures in shared memory easier.)

>>Modules? Fortran has them, C hasn't.

>C is very simple, and designed such that you can implement things of
>a higher nature with it if you wish. You can write code that
>behaves in a particular style in C.

If you think C is simple, you probably haven't seen the insides of a
C compiler. And you are going to have a nasty shock when C9X comes out.

>You're confusing issues of the library with the language.

>That migth be fine in an HLL like Fortran but in C the library
>is not part of the language, even if it is standardized and commonly
>accepted.

Clearly you have never participated in any of the threads about this
in comp.std.c. The C `library' *is* an official part of the language;
malloc() is as standard as &; sqrt() is as standard as +. (Technically,
there are two kinds of

>But C is lower-level. You can *implement* those things inside a C progam
>if you like, but dont have to take on the extra weight if you dont.
>Much more flexible.

>C has NO I/O.

Wrong. stdio is a fully standard part of C. getc() is as standard
as x = *p++.

>The fact that C is used for such a wide variety of things is testimony to
>the incredible flexibility it has by NOT being bound to things liek I/O.

If you want to talk about C being useful for 8051s you are right.
If you are talking about C systems running on Macs, PCs, UNIX boxes,
MVS, CMS, VMS, BeOS, AmigaDOS, then you are talking through your hat.

>only a truth that is based on misdirection. You are obviously mistaken as
>to what C is and what the C libraries are, and are mixing the two fo them up.

Obviously? I think perhaps you have not read the standard with due care
and attention.

>a point to be thought of- Fortan is a high level language, C is not.
>The utility of C is that it is very universally portable and applicable-
>because the laguage itself assumes only the presence of memory, a processing
>capability to perform logical and arithmetic operations, and the ability
>to perform conditional executions.

Oh dear. There are so many misconceptions here that it is hard to know
where to start.

Take the B6700 mainframes (now the A series).
*Lovely* machines. Fortran runs *beautifully* on them, as do
COBOL and Pascal. There _is_ a C compiler for them now, but
the implementors had to fight past C's utterly inappropriate
assumptions about memory. Heck, C wasn't even a good fit to the PC,
thanks to its extremely strong assumptions about memory.

C in practice does not merely assume the PRESENCE of memory,
it demands a FLAT, LINEAR, BYTE-ADDRESSED memory.

There have been plenty of machines with segmented (PC) or tree-structured
(B6700) memories. There have been plenty of machines with word-addressed
memories. The first MIPS designs didn't have byte loads and stores.
The first Alphas didn't have byte loads or stores. I don't remember
which models of the AMD29k if any have byte loads and stores, but the
oldest ones didn't.

While the C _standard_ doesn't say that NULL pointers are all bits zero,
there are enough C programs out there that rely on this that one manufacturer
(Prime) had to add special instructions just for C to their computers.
Those same computers had supported Fortran, COBOL, Pascal, and even PL/I
just fine.

Indeed, I once ran into trouble porting a C program to a machine which
was just your ordinary average M68k UNIX box _except_ that the user's
half of the address space was the _negative_ half, and large chunks of
the code relied on user addresses looking positive when you cast them
to long. "Very universally portable?" I think not.

I'm trying to bring this thread back into relevance to comp.arch, which
is where I am reading it.

The basic point I want to make is that for *most* (not all) tasks, if you
want efficiency, you want a *high* level language, not an old low level
language. Why? Because hardware has changed a lot in the last 20 years
too. On a good modern machine, load/store can be a HUNDRED TIMES SLOWER
than an add (UltraSPARC II: 4-way superscalar, it's fairly easy to fill
all the slots in vector code with a bit of unrolling, load from main
memory is 30 cycles, 30*4 = 120).

There are plenty of tasks where a low level language is just right,
and C is a welcome step up from assembler. (Programming an 8051 is an
area that forcibly suggests itself.) But writing a program that is a
good fit to _this_ year's machine is a good way to end up with a program
that is relatively slow on _next_ year's machine.

Compiler writers are doing truly _awesome_ things these days. I for one
would rather not get in their way.

--
John Æneas Byron O'Keefe; 1921/02/04-1997/09/27; TLG,TLTA,BBTNOTL.

Richard A. O'Keefe; RMIT Comp.Sci; http://www.cs.rmit.edu.au/%7Eok

Reinder Verlinde

unread,
Oct 8, 1997, 3:00:00 AM10/8/97
to

In article <5l3emde...@tequila.systemsz.cs.yale.edu>, Stefan Monnier
<monnier+comp/arch/news/@tequila.cs.yale.edu> wrote:

>
>So please let's get back on track: what should a language look like to make it
>possible for compilers to generate good code ?
>

My guesses have more to do with compilation environments than with
language. If you allow the linker (or whatever you will call it) to detect,
for instance, that function f(x,y) is only called with 'y' values of 2 and
456, or with a x value of 17, the linker could decide to write f2(x,
f456(x), and f17(y) for you, and it might even inline their calls. With
aggresive constant folding, the resulting functions might be a lot smaller
than the original ones (e.g. if f contains a switch( x) or switch( y)). Of
course, the language may make it easier to detect these kinds of things.
For some ideas about what makes a language compilable witout sacrificing
ease of coding too much, see 'Dylan' (sort-of an infix Lisp where the
programmer can specify types of variables, and the compiler is supposed to
propagate this type-information, allowing it to do method selection at
compilation time as much as possible)

A language-related thing is the idea to abstract away the hardware. For
instance, consider implementing C with its bitwise operators on a ternary
machine, or packed boolean arrays on a machine without bitwise operators.

Abstracting away the hardware can in C be done up to some point by using
libraries. For instance, the STL for a particular system can be optimized
for that system, just and OS calls can.

Reinder

--
Reinder Verlinde

Zalman Stern

unread,
Oct 8, 1997, 3:00:00 AM10/8/97
to

Paul Hsieh (q...@chromatic.com) wrote:

: On 7 Oct 1997 12:53:16 -0400, Carlie Coats (x...@hilbert.ncsc.org) said:
: > This from an advocate of a language that makes arrays second class
: > citizens? C can't even do the equivalent of
: >
: > SUBROUTINE S( M, N, P, A )
: > INTEGER, INTENT( IN ): M, N, P
: > REAL, INTENT( IN ): A( M, N, P )
: > ...
: >
: > nor can C++, for that matter. Fortran can. So can lisp.

: Excuse the rest of us C/C++ biggots who don't know the F90 syntax, but
: what does the above syntax signify?

An array with three dimensions in which all three are of dynamic extent. In
psuedo C think of it as:
void S(int N, int M, int P, double A[M][N][P]);

There is no way in C to do this such that you can write A[x][y][z] and have
it work. (One can use interior pointers for the outer dimensions as one
sometimes sees in DSP code, but that is a completely different technique
which increases the indirection hit we are trying to reduce.) One works
around this using macros or other hacks which makes some feel quite clever
and makes others wonder at what passes for programming tools these days :-)

(I have to admit I can't recall if FORTRAN is row major or column major
anymore. My recollection is that its the opposite of C which I believe
would make it row major. Is that right? 'Tis an irrelevant detail to this
discussion however.)

-Z-

Zalman Stern

unread,
Oct 8, 1997, 3:00:00 AM10/8/97
to

Richard A. O'Keefe (o...@goanna.cs.rmit.edu.au) wrote:
: The first MIPS designs didn't have byte loads and stores.

True of the Stanford MIPS research project "designs," not the MIPS R-series
from the company that is now a subsidairy of SGI.

: The first Alphas didn't have byte loads or stores.

Correct. But the ISA has always been byte addressed and had support (albeit
slow) for byte load/stores from the start. They do now have byte loads and
stores and it really looks like a screwup that they didn't in the first
place.

Don't forget to mention the CDC bit-addressed designs covered in another
thread recently and the word addressed PDP-10.

-Z-

Robert Harley

unread,
Oct 8, 1997, 3:00:00 AM10/8/97
to

Shankar Unni <sha...@powertelglobal.com> writes:
>Yo, Robert (and you, too, "b...@somethingorother.umn.edu"):
>
>Richard and Nick have probably forgotten more C than you'll ever learn
>in your lifetime, OK? Give it a rest; listen and learn. There's a few
>lifetimes of actual working experience behind these opinions..

Yo Shankar, their lifetimes of experience hacking primitive Fortrans and
Lisps in 500BC apparently don't help when talking about ANSI C and C9X.

Why exactly should we "listen and learn" to people who screw up on
'const' and admit to not knowing what 'restrict' is, when we're
shaving cycles off critical numerical code with both of them here and now?
Or who claim that:

>Nested procedures? Fortran has one level of nesting, C has none.

when GCC, for instance, has arbitrarily deep nesting with Pascal-style
lexical scoping? Oh no! It's not strictly standard-conforming!
La-di-frickin-da. It's works just fine on Alpha, ARM, Intel, Sparc
and experimental backroom chips nobody ever heard of, in my
ever-so-humble experience. (However I find it totally useless).

I believe that Richard does have a clue, however he also has a very
obvious bias. As Mark Twain said:

"First get the facts straight,
then you are free to distort them at your leisure."


>We see this sort of crazed C cheerleading about the start of every

>semester. We've all programmed in C (and in C++). [...]

Jeez, lose the patronising bullshit.

-- Rob.

Carlie Coats

unread,
Oct 8, 1997, 3:00:00 AM10/8/97
to

In article <343AF933...@tonys.ecology.umn.edu>,
boo <b...@tonys.ecology.umn.edu> blathers:

> On 7 Oct 1997 12:53:16 -0400, Carlie Coats (x...@hilbert.ncsc.org) said:
> > > This from an advocate of a language that makes arrays second class
> > > citizens? C can't even do the equivalent of
> > >
> > > SUBROUTINE S( M, N, P, A )
> > > INTEGER, INTENT( IN ): M, N, P
> > > REAL, INTENT( IN ): A( M, N, P )
> > > ...
> > >
> > > nor can C++, for that matter. Fortran can. So can lisp.
>
> Oh, so you think that high level language constructs are
> something to boast about?

"High level constructs"? Since when is being able to do decent
things with (multi-dimensional) arrays high level? Maybe instead
I should have said

<RANT>
When I'm programming in C or C++, why the *_hell_* can't
I do easy things like

void foo( const int m,
const int n,
const int p,
float a[p][n][m] ) {...}

This idiocy about all subscripting except the outermost
being required to be compile-time constant is really and
truly _brain-dead_!
</RANT>

Would that satisfy you?


It is loading more messages.
0 new messages