
Difference between Itanium and Alpha


glen herrmannsfeldt

Jan 28, 2003, 1:08:41 PM

"Yousuf Khan" <bbbl6...@yahoo.com.nospam> wrote (snip)
> If you want a 64-bit processor that will also run 32-bit OS, then wait to
> get the AMD Opteron. That processor, much like the Alpha, has a 32-bit and
> 64-bit mode that are subsets and supersets of each other. But better yet the
> instruction set is based on x86 for the Opteron.

I would say that Alpha has a much better instruction set. x86 is
more popular, though.

comp.arch added.

-- glen

Yousuf Khan

Jan 28, 2003, 1:27:42 PM
"glen herrmannsfeldt" <g...@ugcs.caltech.edu> wrote in message
news:JGzZ9.69011$VU6....@rwcrnsc52.ops.asp.att.net...

After further reading of my post you may have discerned that I said "better
yet the instruction set is based on x86 for the Opteron", meaning I'm saying
it's better because it's based off of the x86 instruction set. I wasn't
saying the Alpha was better. If the desire is to run the largest number of
applications, then the x86 instruction set is the one to have.

Yousuf Khan


Eric Smith

Jan 28, 2003, 5:33:42 PM
"Yousuf Khan" <bbbl6...@yahoo.com.nospam> writes:
> If you want a 64-bit processor that will also run 32-bit OS, then wait to
> get the AMD Opteron. That processor, much like the Alpha, has a 32-bit and
> 64-bit mode

I don't have the reference manual handy, but I don't recall the Alpha
architecture having a 32-bit mode. Are you sure?

The C compiler did have some options for the sizes of pointers and various
integer types, but I don't think that depended on the processor having a
different mode.

The Alpha did have a special mode for execution of PALcode. The processors
had a few extra registers and instructions for use by PALcode that were
not part of the architecture. But that didn't have anything special for
32-bit code.

David Mosberger-Tang

Jan 28, 2003, 6:28:33 PM
>>>>> On 28 Jan 2003 14:33:42 -0800, Eric Smith <eric-no-s...@brouhaha.com> said:

Eric> "Yousuf Khan" <bbbl6...@yahoo.com.nospam> writes:

Eric> I don't have the reference manual handy, but I don't recall
Eric> the Alpha architecture having a 32-bit mode. Are you sure?

Eric> The C compiler did have some options for the sizes of pointers
Eric> and various integer types, but I don't think that depended on
Eric> the processor having a different mode.

This is much more a question of the operating system (and its supported
data models) than a question of architecture. For example, HP-UX on the
Itanium Processor Family defaults to a 32-bit (ILP32) data model.
Only if you turn on a flag (+DD64, IIRC) will you get the 64-bit
(LP64) data model (the kernel is always 64-bit, but there is nothing
in the architecture that forces this choice).

--david

Yousuf Khan

Jan 28, 2003, 9:27:04 PM
"Eric Smith" <eric-no-s...@brouhaha.com> wrote in message
news:qhvg085...@ruckus.brouhaha.com...

> "Yousuf Khan" <bbbl6...@yahoo.com.nospam> writes:
> > If you want a 64-bit processor that will also run 32-bit OS, then wait to
> > get the AMD Opteron. That processor, much like the Alpha, has a 32-bit and
> > 64-bit mode
>
> I don't have the reference manual handy, but I don't recall the Alpha
> architecture having a 32-bit mode. Are you sure?

It's probably too strong to call the 32-bit operations a "mode". It's
simply using only 32 bits out of the full 64 bits.

Yousuf Khan


Brannon Batson

Jan 28, 2003, 11:52:11 PM
"Yousuf Khan" <bbbl6...@yahoo.com.nospam> wrote in message news:<yYzZ9.212634$pDv....@news04.bloor.is.net.cable.rogers.com>...

> "glen herrmannsfeldt" <g...@ugcs.caltech.edu> wrote in message
> news:JGzZ9.69011$VU6....@rwcrnsc52.ops.asp.att.net...
> >
> > [snip]

>
> After further reading of my post you may have discerned that I said "better
> yet the instruction set is based on x86 for the Opteron", meaning I'm saying
> it's better because it's based off of the x86 instruction set. I wasn't
> saying the Alpha was better. If the desire is to run the largest number of
> applications, then the x86 instruction set is the one to have.
>
> Yousuf Khan

The Alpha ISA *is* better (by several metrics) when compared to
x86-32, x86-64, Sparc, IA64, etc. However, those metrics are of
primary importance just to the people who have to design hardware that
implements the ISA's and perhaps write low-level software. For those
people, the differences are dramatic, for everyone else, the only real
difference per ISA is the software availability.

Brannon
not speaking for Intel

Terje Mathisen

Jan 29, 2003, 2:53:33 AM
Brannon Batson wrote:
> The Alpha ISA *is* better (by several metrics) when compared to
> x86-32, x86-64, Sparc, IA64, etc. However, those metrics are of

I agree.

> primary importance just to the people who have to design hardware that
> implements the ISA's and perhaps write low-level software. For those
> people, the differences are dramatic, for everyone else, the only real
> difference per ISA is the software availability.
>
> Brannon
> not speaking for Intel

'not speaking for Intel', right!

It is of course a bit too late to be speaking for DEC. :-(

Terje
--
- <Terje.M...@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"

Nick Maclaren

Jan 29, 2003, 3:05:43 AM

In article <4495ef1f.03012...@posting.google.com>,

Brannon...@yahoo.com (Brannon Batson) writes:
|>
|> The Alpha ISA *is* better (by several metrics) when compared to
|> x86-32, x86-64, Sparc, IA64, etc. However, those metrics are of
|> primary importance just to the people who have to design hardware that
|> implements the ISA's and perhaps write low-level software. For those
|> people, the differences are dramatic, for everyone else, the only real
|> difference per ISA is the software availability.

Well, there are other metrics, too, including ones which are of
primary importance to those who have to design, implement, debug
and maintain software - and remember that the majority of people
who do the last two on vendors' software are customers!

Few modern architectures score well on those, and the Alpha is
definitely not the best.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nm...@cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679

Bill

Jan 29, 2003, 7:12:20 AM
"Brannon Batson" <Brannon...@yahoo.com> wrote in message
news:4495ef1f.03012...@posting.google.com...

>
> The Alpha ISA *is* better (by several metrics) when compared to
> x86-32, x86-64, Sparc, IA64, etc. However, those metrics are of
> primary importance just to the people who have to design hardware that
> implements the ISA's and perhaps write low-level software. For those
> people, the differences are dramatic, for everyone else, the only real
> difference per ISA is the software availability.
>

How do you quantify and grade ISAs? Forgive me if this is embarrassingly
obvious to computer architects - I'm not one.

Bill


Andi Kleen

Jan 29, 2003, 8:54:48 AM
Brannon...@yahoo.com (Brannon Batson) writes:
>
> The Alpha ISA *is* better (by several metrics) when compared to
> x86-32, x86-64, Sparc, IA64, etc. However, those metrics are of
> primary importance just to the people who have to design hardware that
> implements the ISA's and perhaps write low-level software. For those

... Not to forget compiler writers.

Also you often have to differentiate between system and user level
programming. For example most of the really ugly parts of i386
(like the complicated segment handling) are at operating system level.
Most user level programmers using the services of an operating system can
just ignore them.

Even on many RISC architectures that look clean at first glance,
the system-level details can be surprisingly kludgy and ugly
when you look closer.

As an example for x86-64: the architecture manual actually splits
the description of user level programming and system level programming
into different parts. If you only read the user level parts
and perhaps ignore the instruction encoding everything looks much cleaner.

I believe IA64 splits the architecture manual in a similar way.

Of course sometimes the differences get blurry, like when you want
to know about performance counters and similar stuff.

On IA32 it is half split: the introductory descriptions are in different books,
but the instruction reference is unified.

-Andi

Nick Maclaren

Jan 29, 2003, 9:10:17 AM

In article <m31y2wh...@averell.firstfloor.org>,

Andi Kleen <fre...@alancoxonachip.com> writes:
|>
|> ... Not to forget compiler writers.

And run-time system writers - if there are any left!

|> Also you often have to differentiate between system and user level
|> programming. For example most of the really ugly parts of i386
|> (like the complicated segment handling) are at operating system level.
|> Most user level programmers using the services of an operating system can
|> just ignore them.

I wish :-(

|> Even on many RISC architectures that look clean at first glance,
|> the system-level details can be surprisingly kludgy and ugly
|> when you look closer.
|>
|> As an example for x86-64: the architecture manual actually splits
|> the description of user level programming and system level programming
|> into different parts. If you only read the user level parts
|> and perhaps ignore the instruction encoding everything looks much cleaner.
|>

|> Of course sometimes the differences get blurry, like when you want
|> to know about performance counters and similar stuff.

And how, to all of the above! On a lot of architectures you have
to study the system level specification (and sometimes even the
half-documented implementation details) if you have any serious
interest in debugging, tuning or exception handling. It isn't
that you need to USE the facilities, but you need to know what
you have to do to avoid the "gotchas" - and when to recognise a
"gotcha".

Andi Kleen

Jan 29, 2003, 9:21:34 AM
nm...@cus.cam.ac.uk (Nick Maclaren) writes:
>
> And how, to all of the above! On a lot of architectures you have
> to study the system level specification (and sometimes even the
> half-documented implementation details) if you have any serious
> interest in debugging, tuning or exception handling. It isn't

I know you like to scare people, Nick. But really, it is exaggerated.

There are a few cases where the architecture (like the early Alpha's FP
exceptions) doesn't shield you from system-level details sufficiently ;).
But even then it can mostly be ignored.

I don't believe, for example, that you will need to know anything
about the weirdnesses of IA32 segment types to do normal user-level
programming in a modern 32-bit IA32 OS. Same for the more obscure details
of x87/SSE2 FPU exception handling. This assumes that your operating
system implements them correctly. If it doesn't, you will need to learn
about it to fix or work around it.

Or both IA64 and Sparc have horribly over complicated register window
implementations (also known as the system programmer's nightmare)
But at user level you can completely ignore that.

-Andi

Nick Maclaren

Jan 29, 2003, 9:50:21 AM

In article <m3y954g...@averell.firstfloor.org>, Andi Kleen <fre...@alancoxonachip.com> writes:
|> nm...@cus.cam.ac.uk (Nick Maclaren) writes:
|> >
|> > And how, to all of the above! On a lot of architectures you have
|> > to study the system level specification (and sometimes even the
|> > half-documented implementation details) if you have any serious
|> > interest in debugging, tuning or exception handling. It isn't
|>
|> I know you like to scare people, Nick. But really, it is exaggerated.
|>
|> There are a few cases where the architecture (like the early Alpha's FP
|> exceptions) doesn't shield you from system-level details sufficiently ;).
|> But even then it can mostly be ignored.

Like cache line sizes, associativity and so on? Like the way that
the Pentium 4 handles performance registers in HypeThreading mode?
Like the way that the systems I use handle denormalised numbers?
All of those are needed for serious tuning, and there are dozens
of other examples.

Similarly, any programmer who needs to handle exceptions and
interrupts on an Alpha (and many other CPUs) really DOES need to
know about things like the interrupt shadow (or whatever it is
called). Not least because you need to be able to read between
the lines of the compiler manuals to select options that work.

[ In this case, when I first hit it, there WEREN'T any that did!
The next version of the compiler introduced them, but I believe
that even the worst of the "gotchas" are STILL undocumented. ]

And, yes, I can assure you that it's not just me. Because many
people know that I am the man who knows where the bodies are
buried, they come to me asking why they can't get the system to
work properly. Sometimes I can even explain it :-)

|> I don't believe, for example, that you will need to know anything
|> about the weirdnesses of IA32 segment types to do normal user-level
|> programming in a modern 32-bit IA32 OS. Same for the more obscure details
|> of x87/SSE2 FPU exception handling. This assumes that your operating
|> system implements them correctly. If it doesn't, you will need to learn
|> about it to fix or work around it.

I didn't say you have to know everything. And how are you proposing
to do the last without the information I am referring to? It does,
after all, occupy a great deal of many applications programmers'
time.

|> Or both IA64 and Sparc have horribly over complicated register window
|> implementations (also known as the system programmer's nightmare)
|> But at user level you can completely ignore that.

Oh, yeah? Now try debugging a large, optimised application that
has just crashed with a particularly horrible SIGSEGV, that you
suspect is caused by a code generation error.

Del Cecchi

Jan 29, 2003, 9:52:47 AM
In article <qhvg085...@ruckus.brouhaha.com>,

And I am pretty sure all the 64 bit PowerPC processors run in 32 bit mode as
well. Just pick up a RS/6000 with a 630 in it and you are good to go. No waiting
required.
--

Del Cecchi
cec...@us.ibm.com
Personal Opinions Only

Robert Myers

Jan 29, 2003, 10:34:58 AM
Nick Maclaren wrote:
> In article <m31y2wh...@averell.firstfloor.org>,
> Andi Kleen <fre...@alancoxonachip.com> writes:
> |>
> |> ... Not to forget compiler writers.
>
> And run-time system writers - if there are any left!
>

Um, what? Even without checking google, I would have thought run-time
systems to be one of the hottest areas of development going, and a quick
check of google (did you mean _runtime_ system?) produces pages and
pages of links, no matter whether you hyphenate it or not.

At least one catastrophic mistake of IA-64 is that it assumes that
enough is known at compile-time to pack instructions into Very Long
Instruction Words so as to allow optimally parallel execution as devised
and prescribed by the compiler.

Alpha makes no similar assumption, but does it matter?

In fact, one wonders if the ISA matters at all precisely _because_ of
the possibility of run-time supervision of execution--something that
happens right now within the beating hearts of the cleverly designed
Intel processors I use every day.

IMHO, Transmeta has at least part of the right idea, no matter how hard
a time they are having in executing it.

If Intel has any brains, and I would imagine that it does, it is
watching Transmeta very closely to try to learn from its mistakes,
rather than fixating on which particular ISA is to be presented to the
outside world.

Were I a writer of runtime systems, it would frustrate me greatly that I
could only get at the instruction stream *before* it got to the processor.

Blue sky: imagine a processor with hooks that would allow user designed
intervention within the processor, most likely through a
field-programmable part of the die. An invitation to chaos? Oh,
absolutely.

RM

Andi Kleen

Jan 29, 2003, 10:50:21 AM
nm...@cus.cam.ac.uk (Nick Maclaren) writes:
>
> Like cache line sizes, associativity and so on? Like the way that
> the Pentium 4 handles performance registers in HypeThreading mode?

Details for performance registers are shielded from you by a good
profiler.

Ok basic knowledge of caches is useful for optimization, but that
has nothing to do with the ISA (but the microarchitecture)

> Like the way that the systems I use handle denormalised numbers?

Floating point at this level is clearly user level ISA.

>
> Similarly, any programmer who needs to handle exceptions and
> interrupts on an Alpha (and many other CPUs) really DOES need to

You don't handle interrupts in a user level program.

Exceptions are sometimes handled, but a good operating system shields you from
a lot of details.

> know about things like the interrupt shadow (or whatever it is
> called). Not least because you need to be able to read between
> the lines of the compiler manuals to select options that work.

Interrupt shadow registers are only interesting for an operating system.
Joe Application programmer couldn't care less about them and he will
never use the switches in the compiler required for it.

> [ In this case, when I first hit it, there WEREN'T any that did!
> The next version of the compiler introduced them, but I believe
> that even the worst of the "gotchas" are STILL undocumented. ]

I'm sure you weren't writing a normal user level application when you
needed it. You may need it for driver programming, but that's clearly
not user level.


> |> Or both IA64 and Sparc have horribly over complicated register window
> |> implementations (also known as the system programmer's nightmare)
> |> But at user level you can completely ignore that.
>
> Oh, yeah? Now try debugging a large, optimised application that
> has just crashed with a particularly horrible SIGSEGV, that you
> suspect is caused by a code generation error.

You mean a bug in the runtime?

Window handling consists of two parts:

- Compiler/User issuing instructions to allocate/free windows.
That is relatively simple and part of the user level ISA.
If your compiler gets that wrong you can easily check it without
knowing anything about system level ISA.

- Window overflow handlers and similar functions.
This is clearly system level ISA and normally part of the operating
system kernel. If you suspect that is buggy and you want to debug
your kernel then you will need to know about the system level
details. But that's definitely not what Joe Application programmer wants
or needs to know.

All your examples so far have reinforced my original point, thank you :-)


-Andi

Larry Kilgallen

Jan 29, 2003, 11:52:44 AM
In article <CwSZ9.83166$6G4.11978@sccrnsc02>, Robert Myers <rmyer...@attbi.com> writes:

> At least one catastrophic mistake of IA-64 is that it assumes that
> enough is known at compile-time to pack instructions into Very Long
> Instruction Words so as to allow optimally parallel execution as devised
> and prescribed by the compiler.
>
> Alpha makes no similar assumption, but does it matter?

It matters to those who are trying to create software for distribution
and use on a variety of systems (e.g., commercial software). Those
who only create software for use on machine they own can standardize
on particular versions of the chip.

David Mosberger-Tang

Jan 29, 2003, 12:23:13 PM
>>>>> On Wed, 29 Jan 2003 16:50:21 +0100, Andi Kleen <fre...@alancoxonachip.com> said:

>> |> Or both IA64 and Sparc have horribly over complicated register window
>> |> implementations (also known as the system programmer's nightmare)
>> |> But at user level you can completely ignore that.
>>
>> Oh, yeah? Now try debugging a large, optimised application that
>> has just crashed with a particularly horrible SIGSEGV, that you
>> suspect is caused by a code generation error.

Andi> - Window overflow handlers and similar functions. This is
Andi> clearly system level ISA and normally part of the operating
Andi> system kernel. If you suspect that is buggy and you want to
Andi> debug your kernel then you will need to know about the system
Andi> level details. But that's definitely not what Joe
Andi> Application programmer wants or needs to know.

Just to be clear: ia64 does _not_ require kernel-support for "window overflow".
All spilling/filling of stacked registers is handled autonomously by
the register stack engine.

(And I'll refrain from replying on Nick's comments; he's clearly always
right no matter what the facts... ;-)

--david

Andi Kleen

Jan 29, 2003, 12:28:37 PM
David Mosberger-Tang <David.M...@acm.org> writes:
>
> Just to be clear: ia64 does _not_ require kernel-support for "window overflow".
> All spilling/filling of stacked registers is handled autonomously by
> the register stack engine.

Ok, but you have to set it up correctly etc. (if everything was automated
and trouble free you surely could have described it shorter in your book ;)

The sparc normally has overflow handlers in software.

-Andi

Nick Maclaren

Jan 29, 2003, 12:54:27 PM

In article <CwSZ9.83166$6G4.11978@sccrnsc02>,
Robert Myers <rmyer...@attbi.com> writes:
|> Nick Maclaren wrote:
|> > In article <m31y2wh...@averell.firstfloor.org>,
|> > Andi Kleen <fre...@alancoxonachip.com> writes:
|> > |>
|> > |> ... Not to forget compiler writers.
|> >
|> > And run-time system writers - if there are any left!
|>
|> Um, what? Even without checking google, I would have thought run-time
|> systems to be one of the hottest areas of development going, and a quick
|> check of google (did you mean _runtime_ system?) produces pages and
|> pages of links, no matter whether you hyphenate it or not.

Doubtless. Perhaps I should have expanded the statement to
"run-time system writers that attempt to tackle the system
interface issues properly".

Nick Maclaren

Jan 29, 2003, 1:06:13 PM

In article <m3vg08g...@averell.firstfloor.org>, Andi Kleen <fre...@alancoxonachip.com> writes:
|> nm...@cus.cam.ac.uk (Nick Maclaren) writes:
|> >
|> > Like cache line sizes, associativity and so on? Like the way that
|> > the Pentium 4 handles performance registers in HypeThreading mode?
|>
|> Details for performance registers are shielded from you by a good
|> profiler.

Then you clearly haven't looked at the problem I am referring to.
It isn't possible for a profiler to shield those details from the
programmer in that case.

|> Ok basic knowledge of caches is useful for optimization, but that
|> has nothing to do with the ISA (but the microarchitecture)

And hence is sometimes not even documented in the system programmer
material, but in even more obscure manuals.

|> > Like the way that the systems I use handle denormalised numbers?
|>
|> Floating point at this level is clearly user level ISA.

And is often not documented at that level.

|> > Similarly, any programmer who needs to handle exceptions and
|> > interrupts on an Alpha (and many other CPUs) really DOES need to
|>
|> You don't handle interrupts in a user level program.

Oh, yes, you do. You may call them something else, but an exception
or signal that interrupts executing code, suspends it and calls a
handler is an interrupt. In order to understand the constraints,
AS SEEN BY UNPRIVILEGED CODE, you often need to dive very deeply
into the system and hardware manuals.

|> Exceptions are sometimes handled, but a good operating system shields you from
|> a lot of details.

Fine. Do you know of one? A mainstream one? I don't.

|> > know about things like the interrupt shadow (or whatever it is
|> > called). Not least because you need to be able to read between
|> > the lines of the compiler manuals to select options that work.
|>
|> Interrupt shadow registers are only interesting for an operating system.
|> Joe Application programmer couldn't care less about them and he will
|> never use the switches in the compiler required for it.

It wasn't the registers I was referring to but the constraints on
the state of a program when a handler might be called.

|> > [ In this case, when I first hit it, there WEREN'T any that did!
|> > The next version of the compiler introduced them, but I believe
|> > that even the worst of the "gotchas" are STILL undocumented. ]
|>
|> I'm sure you weren't writing a normal user level application when you
|> needed it. You may need it for driver programming, but that's clearly
|> not user level.

Yes, I was. All I was doing was writing a checker for <math.h>.

|> > |> Or both IA64 and Sparc have horribly over complicated register window
|> > |> implementations (also known as the system programmer's nightmare)
|> > |> But at user level you can completely ignore that.
|> >
|> > Oh, yeah? Now try debugging a large, optimised application that
|> > has just crashed with a particularly horrible SIGSEGV, that you
|> > suspect is caused by a code generation error.
|>
|> You mean a bug in the runtime?

I said a code generation error - i.e. an error in the code produced
by the compiler, such that it breaks the (fiendishly complicated)
constraints of using the register windows. Or an overwriting bug
(yours or the run-time systems) such that the source-level data
does not match the actual registers. All standard stuff.

Pete Zaitcev

Jan 29, 2003, 1:22:33 PM
> Or both IA64 and Sparc have horribly over complicated register window
> implementations (also known as the system programmer's nightmare)
> But at user level you can completely ignore that.
>
> -Andi

I think that sparc64 (aka Ultrasparc) is much improved in this
regard over the sparc, check out arch/sparc/kernel/entry.S and
arch/sparc64/kernel/entry.S. But perhaps it's just my envy of DaveM
speaking ;-)

The Itanic does seem a little annoying with two stacks, but I did
not see David M-T complaining.

-- Pete

David Mosberger-Tang

Jan 29, 2003, 2:45:02 PM
>>>>> On Wed, 29 Jan 2003 18:28:37 +0100, Andi Kleen <fre...@alancoxonachip.com> said:

Andi> David Mosberger-Tang <David.M...@acm.org> writes:
>> Just to be clear: ia64 does _not_ require kernel-support for
>> "window overflow". All spilling/filling of stacked registers is
>> handled autonomously by the register stack engine.

Andi> Ok, but you have to set it up correctly etc.

Oh, certainly.

Andi> (if everything was automated and trouble free you surely could
Andi> have described it shorter in your book ;)

Oh, it _is_ automated once setup. The reason I spent so much time on
the RSE in my book is because it's something new. Once you develop an
intuitive notion of how the RSE works, there is no need to go down to
that detailed level anymore, just like nobody would be spending much
time nowadays on describing how a regular memory-stack works and,
e.g., why you can't put live data below the stack pointer, etc.

Put differently, if every architecture for the last 30 years had used
a similar stack engine, I could have shortened the discussion
_greatly_. Yes, this sounds obvious, but just consider how many
"complicated" things we take for granted because we have been familiar
with them for years and developed intuitive models for how they
work...

--david

Nick Maclaren

Jan 29, 2003, 4:12:33 PM
In article <ug4r7rh...@panda.mostang.com>,

David Mosberger-Tang <David.M...@acm.org> wrote:
>
>Oh, it _is_ automated once setup. The reason I spent so much time on
>the RSE in my book is because it's something new. Once you develop an
>intuitive notion of how the RSE works, there is no need to go down to
>that detailed level anymore, just like nobody would be spending much
>time nowadays on describing how a regular memory-stack works and,
>e.g., why you can't put live data below the stack pointer, etc.

Hang on. That's not equivalent. Even for someone who was familiar
with the concept and had used systems with that mechanism, it was and
is pretty complicated and the details were and are extremely unobvious.
It is several times as complex as the most complicated 'traditional'
stack mechanism that I know of.

Furthermore, the form of stack management that moves the pointer at
every block entry and exit has been standard practice for about 40
years, and is MUCH simpler, yet it STILL gets extra explanation because
of its complexity and unobviousness. Some developers have even backed
off using it because of the number of its users (e.g. in the library
and diagnostic tool areas) that got it wrong.

>Put differently, if every architecture for the last 30 years had used
>a similar stack engine, I could have shortened the discussion
>_greatly_. Yes, this sounds obvious, but just consider how many
>"complicated" things we take for granted because we have been familiar
>with them for years and developed intuitive models for how they
>work...

Yes, indeed, you could. Probably by a factor of three in bulk. Yet
your book would have been incomplete if you hadn't included almost
all of the detail and complication. All right, I am getting old, but
doubly manipulated stack mechanisms are not something that I have ever
found trivial, and I have designed a few.

And, as with all such mechanisms, users can get away with ignorance
until something goes badly wrong. Even if they are lucky, the
debugger will tell them little more than the contents of memory and
registers - if they are unlucky, as is common, the debugger will core
dump or tell them the wrong values :-(

In that case, they have to work out whether there is a bug in their
code or in the compiler causing the memory/register/RSE/etc. state
to be what it shouldn't be. And, to do so, they need to know all of
the horrible details. And, by 'their' code, I am not implying that
they wrote it or even understand it in detail.

Even ignoring 'trivial' errors, only a few percent of SIGSEGVs are
that bad - thank heavens! - but the aggregate time their investigation
consumes is out of all proportion to their number.

glen herrmannsfeldt

Jan 29, 2003, 4:40:10 PM

"Bill" <bil...@attbidot.com> wrote in message
news:EyPZ9.11807$to3....@rwcrnsc51.ops.asp.att.net...

Probably about the same way as religions or political parties.
(Especially on the day after the state of the union speech.)

Things like how easy it is to learn and write useful programs with,
how easy it is to write a good compiler for, and how easy it is to
design hardware to implement are what I would consider.

-- glen


Andi Kleen

Jan 29, 2003, 5:08:16 PM
David Mosberger-Tang <David.M...@acm.org> writes:

> e.g., why you can't put live data below the stack pointer, etc.

Actually the x86-64 ABI allows it for a limited "redzone", because it
apparently helps the compiler to generate better code in some circumstances.

Of course it has to be turned off for the kernel which may need to
handle hardware interrupts on the same stack.

-Andi

Brannon Batson

Jan 29, 2003, 6:56:31 PM
nm...@cus.cam.ac.uk (Nick Maclaren) wrote in message news:<b1820n$c59$1...@pegasus.csx.cam.ac.uk>...
> [snip]

>
> Few modern architectures score well on those, and the Alpha is
> definitely not the best.

Nick, to satisfy my morbid curiosity, can you give a short list of
the things that Alpha ISA does that complicates your life?

Maybe:

1. No atomic ops, no implied fairness on LL/STC
2. ...

Brannon
not speaking for Intel

>[snip]

Andi Kleen

Jan 29, 2003, 7:06:06 PM
[back to comp.arch only]

Brannon...@yahoo.com (Brannon Batson) writes:

> nm...@cus.cam.ac.uk (Nick Maclaren) wrote in message news:<b1820n$c59$1...@pegasus.csx.cam.ac.uk>...
>> [snip]
>>
>> Few modern architectures score well on those, and the Alpha is
>> definitely not the best.
>
> Nick, to satisfy my morbid curiosity, can you give a short list of
> the things that Alpha ISA does that complicates your life?

At least from the Linux perspective, the Alpha has too relaxed a memory
model, which makes implementing certain lockless data structures harder.

See page 27 in http://www-124.ibm.com/linux/papers/lse/ols2002/rcu.pdf

It is the only architecture supported by Linux with this problem,
requiring the extra read barriers.

-Andi

David Mosberger-Tang

unread,
Jan 29, 2003, 7:51:01 PM1/29/03
to
>>>>> On Wed, 29 Jan 2003 23:08:16 +0100, Andi Kleen <fre...@alancoxonachip.com> said:

Andi> David Mosberger-Tang <David.M...@acm.org> writes:
>> e.g., why you can't put live data below the stack pointer, etc.

Andi> Actually the x86-64 ABI allows it for a limited "redzone",
Andi> because it apparently helps the compiler to generate better
Andi> code in some circumstances.

Andi> Of course it has to be turned off for the kernel which may
Andi> need to handle hardware interrupts on the same stack.

Of course. You're only adding fuel to my argument... ;-)

(BTW: The ia64 ABI does the "redzone" reservation such that it always
works, even in the kernel).

--david

Robert Myers

unread,
Jan 29, 2003, 8:28:54 PM1/29/03
to

What childhood fantasy are people holding onto? IA-64 is moving out to
sea aboard a nuclear-powered aircraft carrier with a complete air wing
and carrier battle group, and escorted by a nuclear attack submarine.

The physical Itanium chip that the world ends up using will likely bear
very little resemblance to Intel's original vision--especially the weird
idea that the complexity would be offloaded onto the compiler--but the
ISA we will be using will be IA-64.

RM

Nick Maclaren

unread,
Jan 30, 2003, 3:04:02 AM1/30/03
to

In article <qd%Z9.91160$AV4.2131@sccrnsc01>,

Hmm. Those of us who are old enough to remember the Vietnam war
will also remember when that was being said about (a) Fortran
and (b) the IBM 360/370 architecture.

Nick Maclaren

unread,
Jan 30, 2003, 3:25:21 AM1/30/03
to

In article <4495ef1f.03012...@posting.google.com>, Brannon...@yahoo.com (Brannon Batson) writes:

No. More basic. My copy of the architecture is at home, but my
recollection was of at least the following:

Considerable difficulty in trapping integer overflow (a
common mistake). I may be misremembering this one.

Poor support for memory synchronisation, which is important
if you have no atomic operations. LL/STC is NOT a good solution.
Again, I may be misremembering this one.

The shadow of undefined length associated with floating-point
operations and (worse) the fact that all results are undefined.

The last is the aspect that is made such a spectacular pig's
ear of in the compilers and run-time libraries. The specification
makes life easy for the hardware people at the expense of making
it impossible for the compiler and run-time system people. A few
trivial changes could improve all that considerably.

Of course, it is reasonable to say that the difficulties imposed
by the hardware are no excuse for producing ghastly software, and
are certainly no excuse for not documenting the restrictions.
But it is true to say that the best that the software people could
do is to detect and diagnose the more common failing cases, and
to document the restrictions.

Please note that I am NOT talking about IEEE support, but am
talking about the default, efficient mode. Nor am I talking
about fixups, but about reliable error flagging with reasonable
continuation semantics. All traditional (1960s) technology.

Jan C. Vorbrüggen

unread,
Jan 30, 2003, 7:21:05 AM1/30/03
to
> |> Exceptions are sometimes handled, but a good operating system shields
> |> you from a lot of details.
> Fine. Do you know of one? A mainstream one? I don't.

Does VMS count, or isn't it "mainstream" enough for you?

Jan

Nick Maclaren

unread,
Jan 30, 2003, 7:49:17 AM1/30/03
to

In article <3E3918B1...@mediasec.de>, Jan C. =?iso-8859-1?Q?Vorbr=FCggen?= <jvorbr...@mediasec.de> writes:
|> > |> Exceptions are sometimes handled, but a good operating system shields
|> > |> you from a lot of details.
|> > Fine. Do you know of one? A mainstream one? I don't.
|>
|> Does VMS count, or isn't it "mainstream" enough for you?

It quarter counts. It is half mainstream, and it is half way
there. Note that I wouldn't give MVS even that much, though it
beats the hell out of Unices and Microsoft systems in this
respect.

All right, it is some years since I looked at VMS, but I have
heard from several people that the design has not changed and
it is the deep design that causes the constraints.

Robert Myers

unread,
Jan 30, 2003, 10:04:55 AM1/30/03
to
Nick Maclaren wrote:
> In article <qd%Z9.91160$AV4.2131@sccrnsc01>,
> Robert Myers <rmyer...@attbi.com> writes:
> |>
> |> What childhood fantasy are people holding onto? IA-64 is moving out to
> |> sea aboard a nuclear-powered aircraft carrier with a complete air wing
> |> and carrier battle group, and escorted by a nuclear attack submarine.
> |>
> |> The physical Itanium chip that the world ends up using will likely bear
> |> very little resemblance to Intel's original vision--especially the weird
> |> idea that the complexity would be offloaded onto the compiler--but the
> |> ISA we will be using will be IA-64.
>
> Hmm. Those of us who are old enough to remember the Vietnam war
> will also remember when that was being said about (a) Fortran
> and (b) the IBM 360/370 architecture.
>

I'm definitely old enough to remember the VietNam war as well as the
statements about Fortran. I'm not onto your personality well enough yet
to know for sure when you're kidding.

Fortran is certainly alive and well, and may yet outlast C.

As to the 360/370 architecture, were 360 assembler and IBM JCL my most
important early experiences with computers, I would almost certainly
have confined my interest in computers to Fortran (or PL/I) programs.
Being a scientist, though, I soon learned that IBM computers were for
preparing W2 forms, and found the machines arising out of the mind of
Seymour Cray sufficiently appealing that I wanted to find out how they
worked and have never lost interest since.

As unappealing as I found it, though, System 360 was a bet the ranch
proposition for IBM that succeeded commercially. Del Cecchi has said
with a straight face that IBM is still selling systems that are direct
descendants of System 360, and I think I understand his personality well
enough to know that he was not kidding.

My bet is on IA-64.

RM

Nick Maclaren

unread,
Jan 30, 2003, 10:28:00 AM1/30/03
to

In article <qab_9.7639$G83.336@sccrnsc04>,
Robert Myers <rmyer...@attbi.com> writes:

|> Nick Maclaren wrote:
|> > In article <qd%Z9.91160$AV4.2131@sccrnsc01>,
|> > Robert Myers <rmyer...@attbi.com> writes:
|> > |>
|> > |> What childhood fantasy are people holding onto? IA-64 is moving out to
|> > |> sea aboard a nuclear-powered aircraft carrier with a complete air wing
|> > |> and carrier battle group, and escorted by a nuclear attack submarine.
|> > |>
|> > |> The physical Itanium chip that the world ends up using will likely bear
|> > |> very little resemblance to Intel's original vision--especially the weird
|> > |> idea that the complexity would be offloaded onto the compiler--but the
|> > |> ISA we will be using will be IA-64.
|> >
|> > Hmm. Those of us who are old enough to remember the Vietnam war
|> > will also remember when that was being said about (a) Fortran
|> > and (b) the IBM 360/370 architecture.
|>
|> I'm definitely old enough to remember the VietNam war as well as the
|> statements about Fortran. I'm not onto your personality well enough yet
|> to know for sure when you're kidding.

You may never know for sure! I was roughly half joking there, but
was serious in the references. The one relating to the first
paragraph should be obvious.

|> Fortran is certainly alive and well, and may yet outlast C.

But it does NOT dominate even scientific computing, let alone be
nearly a monopoly language as was predicted back then.

|> As unappealing as I found it, though, System 360 was a bet the ranch
|> proposition for IBM that succeeded commercially. Del Cecchi has said
|> with a straight face that IBM is still selling systems that are direct
|> descendents of System 360, and I think I understand his personality well
|> enough to know that he was not kidding.

He is correct, of course, but just how many people use them now?
They were predicted back then to monopolise the server market
more-or-less for all time.

|> My bet is on IA-64.

In the medium term (say 2-10 years from now), possibly, though I
remain skeptical. But it is vanishingly unlikely to dominate
the market either as thoroughly or as long as either Fortran
or System 360/370 did.

Rob Young

unread,
Jan 30, 2003, 11:15:45 AM1/30/03
to
In article <b1bga0$cn8$1...@pegasus.csx.cam.ac.uk>, nm...@cus.cam.ac.uk (Nick Maclaren) writes:

>
> |> My bet is on IA-64.
>
> In the medium term (say 2-10 years from now), possibly, though I
> remain skeptical. But it is vanishingly unlikely to dominate
> the market either as thoroughly or as long as either Fortran
> or System 360/370 did.
>

Define market.

Here is my definition of market in the context of IA64...
IA32. Since arguably 80%+ of all server and
desktop revenues are IA32 related (anybody got figures handy?)
I would say that is the market. All others are niches.

If all that is left of IA32 is IA64 10 years from now and IA64
is 80% of the market, then IA64 is *the* market.

Intel made IA32 fly. Who among us 10 years ago would have thought
IA32 would have near the performance it does 10 years out? And would
be dominating the market? Any Usenet/Google links to substantiate
such a claim?

Stick 10 years behind IA64 and it is pretty clear that pig will
fly. Shoot, Itanium isn't doing too bad right now in an Altix.

Or is it?

We all know what modest performance but tremendous business
acumen can result in - IA32. Now you have very good performance
and they haven't lost their edge on the business side - quite
contrary - Intel is still very much a winner there. Just check
out their recent pricing moves over the last 2-3 years if you
doubt that.

Intel will work their pricing model magic, IA64 will happen.

I wouldn't deny they may take some pricing dings (looking back
5-10 years from now), but when you own the market you take your dings.

Rob

Tony Nelson

unread,
Jan 30, 2003, 11:27:20 AM1/30/03
to
In article <m33cnba...@averell.firstfloor.org>,
Andi Kleen <fre...@alancoxonachip.com> wrote:

Hmm. On PowerPC, the hardware "supports" a Red Zone even for
interrupts, because interrupts don't touch the stack, only interrupt
handlers do, and they must follow the ABI and support the Red Zone.
____________________________________________________________________
TonyN.:' tony...@shore.net
'

Nick Maclaren

unread,
Jan 30, 2003, 12:01:37 PM1/30/03
to
In article <f+++0L...@eisner.encompasserve.org>,

you...@encompasserve.org (Rob Young) writes:
|>
|> Here is my definition of market in the context of IA64...
|> IA32. Since arguably 80%+ of all server and
|> desktop revenues are IA32 related (anybody got figures handy?)
|> I would say that is the market. All others are niches.

Hmm. Now look at the margin, which is equally important.

|> If all that is left of IA32 is IA64 10 years from now and IA64
|> is 80% of the market, then IA64 is *the* market.

Given that precondition, that is clearly true.

|> Intel made IA32 fly. Who among us 10 years ago would have thought
|> IA32 would have near the performance it does 10 years out? And would
|> be dominating the market? Any Usenet/Google links to substantiate
|> such a claim?

I would have to search, but MOST people thought that x86 would
dominate the desktop and even small server market. In 1990-1992,
the 80386 and 486 had become the workhorses of the desktop arena
and were making inroads into small servers. In early 1993, Intel
had FINALLY got the worst of the bugs out of the Pentium and were
about to launch it for real.

And it would run legacy software, but faster :-)

|> Stick 10 years behind IA64 and it is pretty clear that pig will
|> fly. Shoot, Itanium isn't doing too bad right now in an Altix.

Just like the i860 did?

|> We all know what modest performance but tremendous business
|> acumen can result in - IA32. Now you have very good performance
|> and they haven't lost their edge on the business side - quite
|> contrary - Intel is still very much a winner there. Just check
|> out their recent pricing moves over the last 2-3 years if you
|> doubt that.
|>
|> Intel will work their pricing model magic, IA64 will happen.

Perhaps. We shall see.

Bernd Paysan

unread,
Jan 30, 2003, 11:57:15 AM1/30/03
to
Rob Young wrote:
> We all know what modest performance but tremendous business
> acumen can result in - IA32.

However, most of x86's success came from its being a commodity right from
the start. Intel licensed it to AMD, and there were other clones like NEC's
V20/V30. The whole reason the Wintel monopoly came about was that it didn't
start as a monopoly, but as close to a free market as the whole computer
business (with all its copyright- and patent-based monopolies) ever got.

IA64 is trying the opposite: go to a more proprietary architecture, which
no one can clone. For me, that's comparable to IBM's strategy with PS/2 and
OS/2, and like that effort, it's likely to fail. Not for technical merits,
but for not being open (and backward compatible) enough.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

glen herrmannsfeldt

unread,
Jan 30, 2003, 1:10:46 PM1/30/03
to

"Robert Myers" <rmyer...@attbi.com> wrote in message
news:qab_9.7639$G83.336@sccrnsc04...
(snip)

>
> As to the 360/370 architecture, were 360 assembler and IBM JCL my most
> important early experiences with computers, I would almost certainly
> have confined my interest in computers to Fortran (or PL/I) programs.
> Being a scientist, though, I soon learned that IBM computers were for
> preparing W2 forms, and found the machines arising out of the mind of
> Seymour Cray sufficiently appealing that I wanted to find out how they
> worked and have never lost interest since.
>
> As unappealing as I found it, though, System 360 was a bet the ranch
> proposition for IBM that succeeded commercially. Del Cecchi has said
> with a straight face that IBM is still selling systems that are direct
> descendants of System 360, and I think I understand his personality well
> enough to know that he was not kidding.

They are still selling machines that will run problem state
programs that ran on S/360. They won't run unmodified OS that
would run on S/360, though. They have twice extended the address
space of the architecture, including backward compatibility for
older programs. From 24 bit to 31 bit to 64 bit addressing.

A z/architecture machine running OS/390 or z/OS should still run
the Fortran G and H compilers, the PL/I F compiler, etc., along
with the compiled programs.

I wonder if Windows 2020 will still run MSDOS 1.0 programs?

-- glen


glen herrmannsfeldt

unread,
Jan 30, 2003, 1:12:46 PM1/30/03
to

"Nick Maclaren" <nm...@cus.cam.ac.uk> wrote in message
news:b1bga0$cn8$1...@pegasus.csx.cam.ac.uk...
(Snip regarding IBM S/360 and successors)

>
> He is correct, of course, but just how many people use them now?
> They were predicted back then to monopolise the server market
> more-or-less for all time.

They now run Linux on them. Many are being used to run web
servers, or so I hear. I haven't actually seen one of these new
machines.

-- glen

Del Cecchi

unread,
Jan 30, 2003, 12:51:48 PM1/30/03
to
followups trimmed

In article <b1bga0$cn8$1...@pegasus.csx.cam.ac.uk>,
nm...@cus.cam.ac.uk (Nick Maclaren) writes:
|>

Back then the term server hadn't even been invented. And lots of people use them
now. IBM still sells on the order of 3 Billion Dollars worth per year.

And I try to not kid in this newsgroup except on homework questions.

|>
|> |> My bet is on IA-64.
|>
|> In the medium term (say 2-10 years from now), possibly, though I
|> remain skeptical. But it is vanishingly unlikely to dominate
|> the market either as thoroughly or as long as either Fortran
|> or System 360/370 did.

Don't forget S/390, Z/900, eserver Z series, et al.

|>
|>
|> Regards,
|> Nick Maclaren,
|> University of Cambridge Computing Service,
|> New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
|> Email: nm...@cam.ac.uk
|> Tel.: +44 1223 334761 Fax: +44 1223 334679

--

Del Cecchi
cec...@us.ibm.com
Personal Opinions Only

Nick Maclaren

unread,
Jan 30, 2003, 1:44:53 PM1/30/03
to
In article <b1bonk$dfg$1...@news.rchland.ibm.com>,

Del Cecchi <cec...@us.ibm.com> wrote:
>|>
>|> He is correct, of course, but just how many people use them now?
>|> They were predicted back then to monopolise the server market
>|> more-or-less for all time.
>
>Back then the term server hadn't even been invented. And lots of people use them
>now. IBM still sells on the order of 3 Billion Dollars worth per year.

That is all true (well, I assume the $3e9 is). My point isn't that
they have disappeared, but that they don't dominate. In terms of
the number of DIRECT users, they are negligible - but that isn't
the only measure of importance.

>And I try to not kid in this newsgroup except on homework questions.

I feel no such compunction, but usually do it by posting something
that is absolutely true :-)

>|> |> My bet is on IA-64.
>|>
>|> In the medium term (say 2-10 years from now), possibly, though I
>|> remain skeptical. But it is vanishingly unlikely to dominate
>|> the market either as thoroughly or as long as either Fortran
>|> or System 360/370 did.
>
>Don't forget S/390, Z/900, eserver Z series, et al.

They have never dominated more than a VERY specialist market. The
System 360/370 dominated much of computing - as some people believe
the IA-64 will.

Nick Maclaren

unread,
Jan 30, 2003, 1:49:16 PM1/30/03
to
In article <GUd_9.25367$to3....@rwcrnsc51.ops.asp.att.net>,

glen herrmannsfeldt <g...@ugcs.caltech.edu> wrote:
>
>They are still selling machines that will run problem state
>programs that ran on S/360. They won't run unmodified OS that
>would run on S/360, though. They have twice extended the address
>space of the architecture, including backward compatibility for
>older programs. From 24 bit to 31 bit to 64 bit addressing.
>
>A z/architecture machine running OS/390 or z/OS should still run
>the Fortran G and H compilers, the PL/I F compiler, etc., along
>with the compiled programs.
>
>I wonder if Windows 2020 will still run MSDOS 1.0 programs?

Round about 1990, I got a bug fixed that had to do with support
for object modules compiled under Fortran G. The relevant teams
in IBM no longer had the Fortran G and H Programmer's Guide and
I had to post them the extract :-)

Yousuf Khan

unread,
Jan 30, 2003, 2:14:49 PM1/30/03
to
"Tony Nelson" <tony...@shore.net> wrote in message
news:tonynlsn-410319...@news.primus.ca...

> > Actually the x86-64 ABI allows it for a limited "redzone", because it
> > apparently helps the compiler to generate better code in some
circumstances.
> >
> > Of course it has to be turned off for the kernel which may need to
> > handle hardware interrupts on the same stack.
>
> Hmm. On PowerPC, the hardware "supports" a Red Zone even for
> interrupts, because interrupts don't touch the stack, only interrupt
> handlers do, and they must follow the ABI and support the Red Zone.

What exactly is this RedZone you're talking about?

Yousuf Khan


Andi Kleen

unread,
Jan 30, 2003, 2:46:51 PM1/30/03
to
Tony Nelson <tony...@shore.net> writes:
>
> Hmm. On PowerPC, the hardware "supports" a Red Zone even for
> interrupts, because interrupts don't touch the stack, only interrupt
> handlers do, and they must follow the ABI and support the Red Zone.

x86-64 would support it in theory too. It has interrupt stacks in the
hardware. The problem is that the interrupt stack mechanism does not
really support nested interrupts easily. To support nested interrupts
it is faster to use software interrupt stack switching: let the hardware
write the interrupt frame to the old stack and then switch
the stack pointer. This prevents using the redzone.

-Andi

Eric Smith

unread,
Jan 30, 2003, 3:38:24 PM1/30/03
to
David Mosberger-Tang <David.M...@acm.org> writes:
> This is much more a question of operating system (and it's supported
> data models) than a question of archicture. For example, HP-UX on the
> Itanium Processor Family defaults to a 32-bit (ILP32) data model.
> Only if you turn on a flag (+DD64, IIRC), will you get the 64-bit
> (LP64) data model (the kernel is always 64-bit, but there is nothing
> in the architecture that forces this choice).

I think that's what I said. I was asking about someone's earlier claim
that the Alpha and Opteron *processors* have a 32-bit mode, which I
believe is false in the case of the Alpha.

Chris Torek

unread,
Jan 30, 2003, 3:36:40 PM1/30/03
to
In article <slrnb3g672....@devserv.devel.redhat.com>
Pete Zaitcev <zai...@yahoo.com> writes:
[on register window over/under-flow handling]
>I think that sparc64 (aka Ultrasparc) is much improved in this
>regard over the sparc ...

It is -- but getting all the details right is still hairy.

What V9 does, that V8-and-earlier made effectively impossible, is:

- a window over/under flow causes a fault [same as V8], but
- a fault within the fault handler for OF/UF is allowed in hardware.

This means the kernel code for "save a register window on normal
overflow" is just:

store regs, as_if_user[addr] # i.e., store with user's permissions,
# not privileged permissions
return from trap

which is of course quite short and, usually, fast. The same goes
for window underflow, which just uses "load as if user".

The problem comes in when the store or load fails. In this case,
you get a nested trap-within-trap. Then you have to figure out
WHY the store or load failed. (Most of this information is delivered
via the trap type.) TLB miss? Alignment? Page currently paged-out
but valid? Page never read/write-able? ECC error? TLB miss
followed by ECC error during TLB fill?

Note that in this last case, you must not lose the register
information (because V9 "ECC fault" traps include corrected ECC
errors).

It really does get rather hairy.
--
In-Real-Life: Chris Torek, Wind River Systems (BSD engineering)
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W)
Domain: to...@bsdi.com http://67.40.109.61/torek/ (for the moment)
(you probably cannot email me -- spam has effectively killed email)

Nick Maclaren

unread,
Jan 30, 2003, 4:56:48 PM1/30/03
to
In article <qhznpii...@ruckus.brouhaha.com>,

I believe that to be so - I certainly can't find one in my handbook.

Opteron certainly does, though what it does not have is a 32-bit
x86-64 mode for user processes. If you want a 32-bit mode, you get
x86 and like it.

At least SPARC, MIPS and POWER are like PA-RISC, except that I can't
say whether the architecture allows the kernel to be either.

And, no, I have no idea what AIX means by supporting 64-bit user
processes with a 32-bit kernel. I meant to find out, but never had
time.

Zack Weinberg

unread,
Jan 30, 2003, 5:01:44 PM1/30/03
to
>Opteron certainly does, though what it does not have is a 32-bit
>x86-64 mode for user processes. If you want a 32-bit mode, you get
>x86 and like it.

Nothing stopping software support for what Alpha compilers call
the 'truncated address space' option, though, where the chip is
in x86-64 mode but pointers are 32 bits wide. (For user space
code only.)

I'm not aware of anyone having done this yet though.

zw

Brannon Batson

unread,
Jan 30, 2003, 6:47:48 PM1/30/03
to
nm...@cus.cam.ac.uk (Nick Maclaren) wrote in message news:<b1anhh$kqp$1...@pegasus.csx.cam.ac.uk>...

> In article <4495ef1f.03012...@posting.google.com>, Brannon...@yahoo.com (Brannon Batson) writes:
> |> nm...@cus.cam.ac.uk (Nick Maclaren) wrote in message news:<b1820n$c59$1...@pegasus.csx.cam.ac.uk>...
> |> > [snip]
> |> >
> |> > Few modern architectures score well on those, and the Alpha is
> |> > definitely not the best.
> |>
> |> Nick, to satisfy my morbid curiosity, can you give a short list of
> |> the things that Alpha ISA does that complicates your life?
> |>
> |> Maybe:
> |>
> |> 1. No atomic ops, no implied fairness on LL/STC
> |> 2. ...
>
> No. More basic.

>[snip]


>
> Poor support for memory synchronisation, which is important
> if you have no atomic operations. LL/STC is NOT a good solution.
> Again, I may be misremembering this one.

Are you just alluding to my #1 point above, or was there something
more subtle that concerned you? I waffle on LL/STC vs. atomic
operations, myself. On the one hand, LL/STC is faster, more general,
and easier to implement. On the other hand, LL/STC as it is defined
under Alpha can create some hard forward progress issues on poorly
written software, and there are some fairness issues in hardware as
well.

> [snip, complaints I don't understand on things I don't really care about]

Brannon
not speaking for Intel (or Dec/Compaq/HP)

McCalpin

unread,
Jan 30, 2003, 6:43:49 PM1/30/03
to
In article <b1c730$156$1...@pegasus.csx.cam.ac.uk>,

Nick Maclaren <nm...@cus.cam.ac.uk> wrote:
>
>And, no, I have no idea what AIX means by supporting 64-bit user
>processes with a 32-bit kernel. I meant to find out, but never had
>time.

Well, it means that you can run user applications with native
64-bit integers and 64-bit pointers even if the O/S does not use
those itself. It knows how to create large memory regions for
user processes and how to do the appropriate context state
fiddling when 64-bit user processes need it.

It works just fine, with the minor caveat that the 32-bit
version of AIX can only "see" 96 GB of memory. If you need
to use more than that in a single O/S image, you need the
64-bit version of AIX.
--
John D. McCalpin, Ph.D. mcca...@austin.ibm.com
Senior Technical Staff Member IBM POWER Microprocessor Development
"I am willing to make mistakes as long as
someone else is willing to learn from them."

Tony Nelson

unread,
Jan 30, 2003, 10:27:17 PM1/30/03
to
In article
<JQe_9.514765$F2h1....@news01.bloor.is.net.cable.rogers.com>,
"Yousuf Khan" <bbbl6...@yahoo.com.nospam> wrote:

> What exactly is this RedZone you're talking about?

"The Red Zone is for loading and unloading only." It's stack space
beyond the stack pointer, used by leaf procedures to reduce overhead.
(Don't gag, now.) Interrupt handlers must be careful to skip past the
Red Zone so as not to tromp on anything. On PowerPC, this is easy
enough, as an interrupt just sets a couple of spare registers and
transfers to the handler, which will set up a stack frame and re-enable
interrupts if it wishes to. For a description of the Red Zone on Power
Macs, see
<http://developer.apple.com/techpubs/mac/runtimehtml/RTArch-61.html> .
____________________________________________________________________
TonyN.:' tony...@shore.net
'

Sander Vesik

unread,
Jan 30, 2003, 10:37:20 PM1/30/03
to
In comp.arch Andi Kleen <fre...@alancoxonachip.com> wrote:
>> know about things like the interrupt shadow (or whatever it is
>> called). Not least because you need to be able to read between
>> the lines of the compiler manuals to select options that work.
>
> Interrupt shadow registers is only interesting for an operating system.
> Joe Application programmer couldn't care less about them and he will
> never use the switches in the compiler required for it.

These could be available/in-use in a signal handler.

>
>> [ In this case, when I first hit it, there WEREN'T any that did!
>> The next version of the compiler introduced them, but I believe
>> that even the worst of the "gotchas" are STILL undocumented. ]
>
> I'm sure you weren't writing an normal user level application when you
> needed it. You may need it for driver programming, but that's clearly
> not user level.
>
>
>> |> Or both IA64 and Sparc have horribly over complicated register window
>> |> implementations (also known as the system programmer's nightmare)
>> |> But at user level you can completely ignore that.
>>
>> Oh, yeah? Now try debugging a large, optimised application that
>> has just crashed with a particularly horrible SIGSEGV, that you
>> suspect is caused by a code generation error.
>
> You mean a bug in the runtime?
>
> Window handling consists of two parts:
>
> - Compiler/User issuing instructions to allocate/free windows.
> That is relatively simple and part of the user level ISA.
> If your compiler gets that wrong you can easily check it without
> knowing anything about system level ISA.

This is not quite so - selective scribbling on the stack is a counter-
example, and runtimes that do garbage collection are another. Compilers
need to at least know there are windows (otherwise the resulting
code for tail calls and leaf functions is probably quite pessimal),
and a programmer / designer who is aware that there are windows, and of
how many are likely to be available, can produce a faster-running design.

>
> -Andi
>

--
Sander

+++ Out of cheese error +++

Jan C. Vorbrüggen

unread,
Jan 31, 2003, 3:28:20 AM1/31/03
to
> Are you just alluding to my #1 point above, or was there something
> more subtle that concerned you? I waffle on LL/STC vs. atomic
> operations, myself. On the one hand, LL/STC is faster, more general,
> and easier to implement. On the other hand, LL/STC as it is defined
> under Alpha can create some hard forward progress issues on poorly
> written software, and there are some fairness issues in hardware as well.

In my view, LL/STC wins because you can build atomic ops out of it, but
you can't use atomic ops to build LL/STC - well, I suppose you could, but
it would be putting the cart before the horse.

With regard to the other issues, doesn't every synchronization/lock
mechanism suffer from these issues, i.e., the hardware _and_ software
architects _and_ implementors must know what they're doing, or things
_will_ go wrong?

Jan

Jan C. Vorbrüggen

unread,
Jan 31, 2003, 3:30:28 AM1/31/03
to
> > What exactly is this RedZone you're talking about?
>
> "The Red Zone is for loading and unloading only." It's stack space
> beyond the stack pointer, used by leaf procedures to reduce overhead.
> (Don't gag, now.) Interrupt handlers must be careful to skip past the
> Red Zone so as not to tromp on anything.

As I understand it, interrupt and exception handlers are forbidden to use
memory addressed at (SP):-x(SP), in addition to everything just above the
stack pointer, SP. Easy for a compiler to obey. Just don't forget to set
the proper switch when compiling your handler/driver 8-|.

Jan

Jan C. Vorbrüggen

unread,
Jan 31, 2003, 3:32:27 AM1/31/03
to
> To support nested interrupts
> it is faster to use software interrupt stack switching: let the hardware
> write the interrupt header to the old stack and then switch
> the stack pointer. This prevents using the redzone.

Why "prevents"? "makes it more difficult", I would agree. Depending on
the details, I can also see one additional word of overhead (i.e., do the
old frame-pointer trick), but that should be it.

Jan

Jan C. Vorbrüggen

Jan 31, 2003, 3:46:12 AM
> >Don't forget S/390, Z/900, eserver Z series, et al.
>
> They have never dominated more than a VERY specialist market. The
> System 360/370 dominated much of computing - as some people believe
> the IA-64 will.

Please define your figure of merit for "dominates".

Number of units sold - 4-bit/8-bit systems win by orders of magnitude.
Number of users - embedded systems of all sorts (4- to 64-bit) win...
Revenue and earnings - embedded systems...
Amount of money moved - "mainframes" (360 et seq., Burroughs, VAX/Alpha)
win...
Public perception - only Wintel is on the map.

360/370 had the advantage of being one of the very few systems on the market,
and the only one with a really large, multinational corporation with a well-
established worldwide sales force behind it. The surprise is that others
could survive alongside it...

NASA even used a cluster (as one would call it today) of, I believe,
370/168 systems as real-time systems for processing Gemini and Apollo
telemetry. A few years later, one of the PC companies had an ad with a
Saturn at take-off and the caption "You can now have more computing power
on your desk than all of NASA had for the moon shots." And it was perfectly
true...

Jan

Nick Maclaren

Jan 31, 2003, 3:47:10 AM
In article <4495ef1f.03013...@posting.google.com>,

Brannon Batson <Brannon...@yahoo.com> wrote:
>> |>
>> |> Maybe:
>> |>
>> |> 1. No atomic ops, no implied fairness on LL/STC
>>
>> No. More basic.
>
>> Poor support for memory synchronisation, which is important
>> if you have no atomic operations. LL/STC is NOT a good solution.
>> Again, I may be misremembering this one.
>
>Are you just alluding to my #1 point above, or was there something
>more subtle that concerned you? I waffle on LL/STC vs. atomic
>operations, myself. On the one hand, LL/STC is faster, more general,
>and easier to implement. On the other hand, LL/STC as it is defined
>under Alpha can create some hard forward progress issues on poorly
>written software, and there are some fairness issues in hardware as
>well.

Not more subtle, but more what I think Andi Kleen was referring to.
Consider a single-threaded, single CPU program that updates an
in-memory structure and receives commands via signal. Now, the POSIX
way is to play horrible games with signal masking, but there is an
older and better method that works in many cases.

You treat the existing copy of the structure as read-only, create a
new one, and swap the pointers. Now, the ONLY requirements for this
to be bulletproof are (a) that the pointer update is atomic as seen
by the interrupt mechanism and (b) that you can force the other
updates to complete before the pointer update.

The advantages of this technique (which I am sure you know) are
many. One of the main ones is that it works equally well whether
the source of asynchronicity is an interrupt or is a parallel thread.
The structure updater does not need to know which. Furthermore, it
works even in cases where the interrupt is an exception that leads
to undefined completion, cancellation or corruption of the active
state at the time the interrupt occurs.

Now, the Alpha seems to rely on barriers, which can be used to handle
this case (except that they aren't easily usable from high-level and
portable code), but they are far too heavyweight. Also, in general,
they need to be inserted into both the updating and reading threads
(though I am not sure if this is true for the Alpha), and that is
not always achievable.

Nick Maclaren

Jan 31, 2003, 3:51:56 AM
In article <b1cdbl$ras$1...@ausnews.austin.ibm.com>,

McCalpin <mcca...@gmp246.austin.ibm.com> wrote:
>In article <b1c730$156$1...@pegasus.csx.cam.ac.uk>,
>Nick Maclaren <nm...@cus.cam.ac.uk> wrote:
>>
>>And, no, I have no idea what AIX means by supporting 64-bit user
>>processes with a 32-bit kernel. I meant to find out, but never had
>>time.
>
>Well, it means that you can run user applications with native
>64-bit integers and 64-bit pointers even if the O/S does not use
>those itself. It knows how to create large memory regions for
>user processes and how to do the appropriate context state
>fiddling when 64-bit user processes need it.

Yes, I understand at that level. What I was referring to were the
details, and whether there are actually any other restrictions than
the 96 GB real memory limit. For example, on MVT and MVS, there
were a fair number of subtleties that affected the obscurer parts
of the system during the various phases when similar techniques
were adopted.

It would not surprise me if (for example) the maximum size of buffer
acceptable to the read or write system calls was different between
the two kernels. I have been caught by that sort of thing before.

Nick Maclaren

Jan 31, 2003, 3:58:11 AM
In article <3E3A37D4...@mediasec.de>,

Jan C. =?iso-8859-1?Q?Vorbr=FCggen?= <jvorbr...@mediasec.de> wrote:
>> >Don't forget S/390, Z/900, eserver Z series, et al.
>>
>> They have never dominated more than a VERY specialist market. The
>> System 360/370 dominated much of computing - as some people believe
>> the IA-64 will.
>
>Please define your figure of merit for "dominates".
>
>Number of units sold - 4-bit/8-bit systems win by orders of magnitude.
>Number of users - embedded systems of all sorts (4- to 64-bit) win...
>Revenue and earnings - embedded systems...
>Amount of money moved - "mainframes" (360 et seq., Burroughs, VAX/Alpha)
> win...

Today? I find that hard to believe. It certainly was true then.

>Public perception - only Wintel is on the map.

And back in the late 1960s and early 1970s, IBM's System 360/370
was pretty close to that.

But I was thinking as much of the markets that occupy most of the
executives' planning time, the marketdroid's budgets etc. Consider
Intel - despite its large embedded sales, they are run with relatively
little fuss. The real hoo-hah is at the high end.

Terje Mathisen

Jan 31, 2003, 7:24:41 AM
glen herrmannsfeldt wrote:
> I wonder if Windows 2020 will still run MSDOS 1.0 programs?

Sure it will!

The machine it runs on will even support all the uglies, like
self-modifying code, just to make it possible for my MIME text
executable to still work:

The first two lines contain the primary bootstrap (i.e.
self-modification of a single Jcc opcode), as well as a generalized two
char to one binary byte converter:

ZRYPQIQDYLRQRQRRAQX,2,NPPa,R0Gc,.0Gd,PPu.F2,QX=0+r+E=0=tG0-Ju E=
EE(-(-GNEEEEEEEEEEEEEEEF 5BBEEYQEEEE=DU.COM=======(c)TMathisen95

Load them up in DEBUG or your favourite 16-bit Dos debugger to see what
it's doing. :-)

Terje

--
- <Terje.M...@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"

Andi Kleen

Jan 31, 2003, 9:28:38 AM
Terje Mathisen <terje.m...@hda.hydro.com> writes:

> glen herrmannsfeldt wrote:
>> I wonder if Windows 2020 will still run MSDOS 1.0 programs?
>
> Sure it will!

x86-64 dropped vm86 mode. If it catches on and Windows 2020 is a long
mode operating system then it won't run MSDOS 1.0 programs directly on
the CPU.

Of course you could always run it in a software emulator then, and
it will likely still be faster than the real thing.
(that is today the case with amiga emulators for example)

-Andi

Keith R. Williams

Jan 31, 2003, 9:26:08 AM
In article <3E3A37D4...@mediasec.de>, jvorbr...@mediasec.de
says...

> > >Don't forget S/390, Z/900, eserver Z series, et al.
> >
> > They have never dominated more than a VERY specialist market. The
> > System 360/370 dominated much of computing - as some people believe
> > the IA-64 will.
>
> Please define your figure of merit for "dominates".
>
> Number of units sold - 4-bit/8-bit systems win by orders of magnitude.
> Number of users - embedded systems of all sorts (4- to 64-bit) win...
> Revenue and earnings - embedded systems...
> Amount of money moved - "mainframes" (360 et seq., Burroughs, VAX/Alpha)
> win...
> Public perception - only Wintel is on the map.
>
> 360/370 had the advantage of being one of the very few systems on the market,
> and the only one with a really large, multinational corporation with a well-
> established worldwide sales force behind it. The surprise is that others
> could survive besides it...
>
> NASA even used a cluster (as one would call it today) of, I believe,
> 370/168 systems as real-time systems for processing Gemini and Apollo
> telemetry.

The 3168 came out in '72 and the final lunar landing was in '72, so I
doubt it was used for Gemini/Apollo in real time. ;-)

> A few years later, one of the PC companies had an ad with a
> Saturn at take-off and the caption "You can now have more computing power
> on your desk than all of NASA had for the moon shots." And it was perfectly
> true...

...and now your microwave oven.

--
Keith

Jan C. Vorbrüggen

Jan 31, 2003, 9:52:47 AM
> >Amount of money moved - "mainframes" (360 et seq., Burroughs, VAX/Alpha)
> > win...
>
> Today? I find that hard to believe. It certainly was true then.

I sure hope my bank isn't using a W2K/SQL Server-based system to manage
my account...

> >Public perception - only Wintel is on the map.
>
> And back in the late 1960s and early 1970s, IBM's System 360/370
> was pretty close to that.

Yeah - but that's because there wasn't much competition, was there? OTOH,
DEC certainly managed to get a PDP-11 here and a PDP-10 there placed
prominently in films that were about computers...

> But I was thinking as much of the markets that occupy most of the
> executives' planning time, the marketdroid's budgets etc. Consider
> Intel - despite its large embedded sales, they are run with relatively
> little fuss. The real hoo-hah is at the high end.

That's because the buying decisions are made by the "man on the street".
The corporate decision maker you influence over a nice dinner or via his
personal bank account; you need a proper level of marketing hype for the
consumer...

Jan

Del Cecchi

Jan 31, 2003, 10:04:34 AM
In article <3E3A37D4...@mediasec.de>,
Jan C. =?iso-8859-1?Q?Vorbr=FCggen?= <jvorbr...@mediasec.de> writes:
snip

|>
|> 360/370 had the advantage of being one of the very few systems on the market,
|> and the only one with a really large, multinational corporation with a well-
|> established worldwide sales force behind it. The surprise is that others
|> could survive alongside it...
|>
snip
|>
|> Jan
Yep, those companies like NCR, Burroughs, Univac, GE, et al were just garage shops.
NOT

Anne & Lynn Wheeler

Jan 31, 2003, 10:53:06 AM
cec...@signa.rchland.ibm.com (Del Cecchi) writes:
> Yep, those companies like NCR, Burroughs, Univac, GE, et al were just garage shops.
> NOT

some recent discussions about the BUNCH and then later snow white and
the seven dwarfs:
http://www.garlic.com/~lynn/2002o.html#78 Newsgroup cliques?
http://www.garlic.com/~lynn/2003.html#36 mainframe
http://www.garlic.com/~lynn/2003.html#71 Card Columns
http://www.garlic.com/~lynn/2003b.html#5 Card Columns

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
Internet trivia 20th anv http://www.garlic.com/~lynn/rfcietff.htm

Nick Maclaren

Jan 31, 2003, 11:19:49 AM

In article <3E3A8DBF...@mediasec.de>, Jan C. =?iso-8859-1?Q?Vorbr=FCggen?= <jvorbr...@mediasec.de> writes:
|> > >Amount of money moved - "mainframes" (360 et seq., Burroughs, VAX/Alpha)
|> > > win...
|> >
|> > Today? I find that hard to believe. It certainly was true then.
|>
|> I sure hope my bank isn't using a W2K/SQPServer-based system to manage
|> my account...

Oh, right. I misunderstood. Yes, that is so.

|> > >Public perception - only Wintel is on the map.
|> >
|> > And back in the late 1960s and early 1970s, IBM's System 360/370
|> > was pretty close to that.
|>
|> Yeah - but that's because there wasn't much competition, was there? OTOH,
|> DEC certainly managed to get a PDP-11 here and a PDP-10 there placed
|> prominently in films that were about computers...

Hmm. ICL were streets ahead of IBM in terms of design. A pity about
their manufacturing and production. There were a lot of other, often
better, mainframe vendors, companies like Prime etc. There was as
much competition in the mainframe market then as there is in the
commercial "virtual paper" market today.

|> > But I was thinking as much of the markets that occupy most of the
|> > executives' planning time, the marketdroid's budgets etc. Consider
|> > Intel - despite its large embedded sales, they are run with relatively
|> > little fuss. The real hoo-hah is at the high end.
|>
|> That's because the buying decisions are made by the "man on the street".
|> The corporate decision maker you influence over a nice dinner or via his
|> personal bank account; you need a proper level of marketing hype for the
|> consumer...

I don't see the "man on the street" buying a lot of IA-64 systems
until at least the FOURTH IA-64 chip :-)

Bernd Paysan

Jan 31, 2003, 11:09:25 AM
Andi Kleen wrote:
> x86-64 dropped vm86 mode. If it catches on and Windows 2020 is a long
> mode operating system then it won't run MSDOS 1.0 programs directly on
> the CPU.

There's always a way around that. Microsoft found a way to switch from the
286 PM back to real mode, which was considerably harder than switching
from x86-64 to compatibility mode. 286 PM was an "enter once, never
leave" mode, and you really had to reset the CPU, either with a triple
fault or with the keyboard controller. x86-64 in comparison is really easy
to leave (though the triple fault and keyboard controller reset also
work).

BTW: AMD just announced that they'll delay Athlon-64 another half year, while
the Opteron ships on time. Either they have severe yield problems, or they
want to wait for a Windows XP for x86-64 (and that it will be there in
September is still a tough bet). Maybe Microsoft still struggles with the
difficulty of letting a DOS box run side by side with 64-bit programs ;-).

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

M. Ranjit Mathews

Jan 31, 2003, 11:47:40 AM
Andi Kleen wrote:
> Terje Mathisen <terje.m...@hda.hydro.com> writes:
>
>>glen herrmannsfeldt wrote:
>>
>>>I wonder if Windows 2020 will still run MSDOS 1.0 programs?
>>Sure it will!
>
> x86-64 dropped vm86 mode.

It might be useful to have a vm-Athlon 32 mode, though. It might
facilitate vmware and win4lin working well under a 64 bit OS and might
allow x86-64 to be used for server consolidation by letting a large
x86-64 machine/ cluster host many virtual x86-32 machines.

> If it catches on and Windows 2020 is a long mode operating system then it won't run
> MSDOS 1.0 programs directly on the CPU.

Hmm, to think that DOS programs were ubiquitous a mere 7 years back.

Yousuf Khan

Jan 31, 2003, 12:13:18 PM
"M. Ranjit Mathews" <ranjit_...@yahoo.com> wrote in message
news:3E3A9DB6...@yahoo.com...

> > x86-64 dropped vm86 mode.
>
> It might be useful to have a vm-Athlon 32 mode, though. It might
> facilitate vmware and win4lin working well under a 64 bit OS and might
> allow x86-64 to be used for server consolidation by letting a large
> x86-64 machine/ cluster host many virtual x86-32 machines.

Oh, that's there alright. In long mode, you have a 64-bit submode and a
Compatibility submode. 64-bit is full 64-bit support, while Compatibility is
all previous Protected mode incarnations (16-bit and 32-bit) running
natively under a long-mode OS.

Yousuf Khan


Fred Kleinsorge

Jan 31, 2003, 12:31:04 PM

"Brannon Batson" <Brannon...@yahoo.com> wrote in message
news:4495ef1f.03013...@posting.google.com...

> nm...@cus.cam.ac.uk (Nick Maclaren) wrote in message
news:<b1anhh$kqp$1...@pegasus.csx.cam.ac.uk>...
> > In article <4495ef1f.03012...@posting.google.com>,
Brannon...@yahoo.com (Brannon Batson)

> >


> > Poor support for memory synchronisation, which is important
> > if you have no atomic operations. LL/STC is NOT a good solution.
> > Again, I may be misremembering this one.
>
> Are you just alluding to my #1 point above, or was there something
> more subtle that concerned you? I waffle on LL/STC vs. atomic
> operations, myself. On the one hand, LL/STC is faster, more general,
> and easier to implement. On the other hand, LL/STC as it is defined
> under Alpha can create some hard forward progress issues on poorly
> written software, and there are some fairness issues in hardware as
> well.
>

As a SW guy, LL/STC has grown on me. Of course it took a number of years,
and successive generations of Alpha, before us SW guys really understood the
sequence well enough to design really good code. Simple atomic operations
have spoiled us.

Stephen Fuld

Jan 31, 2003, 12:32:23 PM

"Jan C. Vorbrüggen" <jvorbr...@mediasec.de> wrote in message
news:3E3A37D4...@mediasec.de...

> > >Don't forget S/390, Z/900, eserver Z series, et al.
> >
> > They have never dominated more than a VERY specialist market. The
> > System 360/370 dominated much of computing - as some people believe
> > the IA-64 will.
>
> Please define your figure of merit for "dominates".
>
> Number of units sold - 4-bit/8-bit systems win by orders of magnitude.
> Number of users - embedded systems of all sorts (4- to 64-bit) win...
> Revenue and earnings - embedded systems...
> Amount of money moved - "mainframes" (360 et seq., Burroughs, VAX/Alpha)
> win...
> Public perception - only Wintel is on the map.

A perhaps interesting metric is number of jobs for programmers writing code
for. . .

I don't know who would win on that one. Does anyone know?

--
- Stephen Fuld
e-mail address disguised to prevent spam


Fred Kleinsorge

Jan 31, 2003, 12:42:02 PM

"Nick Maclaren" <nm...@cus.cam.ac.uk> wrote in message
news:b1dd6e$rff$1...@pegasus.csx.cam.ac.uk...

Not quite sure I understand this. Your concern (at least on Alpha) is only
valid if there are multiple CPUs (or a DMA device). That is, an aligned
long or quadword write in user mode - when interrupted on the same
processor - will not read back stale data. (In general, almost every
interrupt will generate a memory barrier anyway). If you need to do
*multiple* writes, then you need to use some type of lock.

The DEC-C language supports the built-in __MB() which will force a memory
barrier, and will also force the compiler to honor some optimization rules
about optimizing across a MB. There is also a write memory barrier (in that
case, I think you need to use an ASM statement to issue it) which is a little
more lightweight than a MB, and will simply force write ordering (designed
for writing to devices).

MB's are only really interesting if you are interacting with a DMA device,
or another CPU. Not just an asynchronous interrupt on the same CPU.

Nick Maclaren

Jan 31, 2003, 1:11:41 PM
In article <3e3ab2d9$1...@hpb10302.boi.hp.com>,

Fred Kleinsorge <kleinsorge@star-dot-zko-dot-dec-dot-com> wrote:
>
>"Brannon Batson" <Brannon...@yahoo.com> wrote in message
>news:4495ef1f.03013...@posting.google.com...
>> nm...@cus.cam.ac.uk (Nick Maclaren) wrote in message
>news:<b1anhh$kqp$1...@pegasus.csx.cam.ac.uk>...
>> > In article <4495ef1f.03012...@posting.google.com>,
>Brannon...@yahoo.com (Brannon Batson)
>
>> > Poor support for memory synchronisation, which is important
>> > if you have no atomic operations. LL/STC is NOT a good solution.
>> > Again, I may be misremembering this one.

I was slightly misremembering, but my point stands. MB is a fairly
heavyweight operation, and it is common to want to synchronise only
accesses. Also, the handbook is wrong in saying that it is needed
only for multiprocessor systems - it is also needed if memory access
exceptions are part of the program's semantic model.

>> Are you just alluding to my #1 point above, or was there something
>> more subtle that concerned you? I waffle on LL/STC vs. atomic
>> operations, myself. On the one hand, LL/STC is faster, more general,
>> and easier to implement. On the other hand, LL/STC as it is defined
>> under Alpha can create some hard forward progress issues on poorly
>> written software, and there are some fairness issues in hardware as
>> well.
>
>As a SW guy, LL/STC has grown on me. Of course it took a number of years,
>and successive generations of Alpha before us SW guys really understood the
>sequence well enough to design really good code. Simple atomic operations
>has spoiled us.

Hmm. As a third-party software guy, I remain unhappy about the progress
problem Brannon mentioned and don't see that it is entirely a matter of
poorly written software. There isn't enough information in my copy of
the Alpha architecture handbook to be able to tell what will be handled
well and what won't. For example the lock unit can be anything from
8 bytes to a page - how do I write an efficient, portable program
where I need different algorithms according to which it is?

As a support guy, I loathe things like this, though I have never had
to support Alphas. If I discover some poorly written LL/STC code that
runs badly but does not obviously break the rules, often vendor code,
then trying to report it can be hell and getting it dealt with almost
or actually impossible. LL/STC is just too complicated and too
underspecified, and I don't see how it could be fully specified without
hobbling the hardware people.

Software developers for their own company's hardware have it easy :-(

Nick Maclaren

Jan 31, 2003, 1:37:11 PM
In article <3e3ab56b$1...@hpb10302.boi.hp.com>,

Fred Kleinsorge <kleinsorge@star-dot-zko-dot-dec-dot-com> wrote:
>
>
>Not quite sure I understand this. Your concern (at least on Alpha) is only
>valid if there are multiple CPUs (or a DMA device). That is, an aligned
>long or quadword write in user mode - when interrupted on the same
>processor - will not read back stale data. (In general, almost every
>interrupt will generate a memory barrier anyway). If you need to do
>*multiple* writes, then you need to use some type of lock.

Not stale data - corrupted data - i.e. half old and half new. But,
otherwise, yes.

However, the Alpha architecture handbook I have says very clearly
that memory barriers are NOT created by interrupts, and that the
software must do it.

>The DEC-C language supports the built-in __MB() which will force a memory
>barrier, and will also force the compiler to honor some optimization rules
>about optimizing across a MB. There is also a write memory barrier (in that
>case, I think you need to use a ASM statement to issue it) which is a little
>more lightweight than a MB, and will simply force write ordering (designed
>for writing to devices).

Which is good, though not portable. I can't find a write memory barrier
in my (old) copy of the Alpha architecture handbook.

>MB's are only really interesting if you are interacting with a DMA device,
>or another CPU. Not just an asynchronous interrupt on the same CPU.

Sorry, but that is NOT true. Not at all, at all. If the handling
of memory exceptions is part of your program's semantic model, and
the main code does something like this

STORE 0,fred
STORE 1,joe

and both stores cause interrupts, then you have a race condition on
interrupts as well as stores, and I don't think that is architected.
That assumes, of course, that stores within interrupt handlers and
the main code are synchronised strictly according to the architecture,
which is not something I would like to rely on too much.

Perhaps an even more common "gotcha" is when joe interrupts and the
handler changes the accessibility of fred (e.g. by freeing the memory).
Metadata changes are often not synchronised with access.

Note that I am not saying that this all can't be got right, but what
I am saying is that you either have to use an explicit memory barrier
or rely on unarchitected properties. Or, of course, the architecture
could specify all of this :-)

Fred Kleinsorge

Jan 31, 2003, 1:59:11 PM

"Nick Maclaren" <nm...@cus.cam.ac.uk> wrote in message
news:b1efon$pu2$1...@pegasus.csx.cam.ac.uk...

> In article <3e3ab56b$1...@hpb10302.boi.hp.com>,
> Fred Kleinsorge <kleinsorge@star-dot-zko-dot-dec-dot-com> wrote:
> >
> >
> >Not quite sure I understand this. Your concern (at least on Alpha) is
only
> >valid if there are multiple CPUs (or a DMA device). That is, an aligned
> >long or quadword write in user mode - when interrupted on the same
> >processor - will not read back stale data. (In general, almost every
> >interrupt will generate a memory barrier anyway). If you need to do
> >*multiple* writes, then you need to use some type of lock.
>
> Not stale data - corrupted data - i.e. half old and half new. But,
> otherwise, yes.
>

For a single write, to an address that doesn't cross a cache line, you
should never see word tearing. In fact, we have a device that uses this
trick for its queuing.

> However, the Alpha architecture handbook I have says very clearly
> that memory barriers are NOT created by interrupts, and that the
> software must do it.
>

Right, and unless *you* are coding your own OS, you will find a MB someplace
in the path: in the platform PALcode, or in the existing Tru64 or VMS (and
probably Linux) OS interrupt service.

> >The DEC-C language supports the built-in __MB() which will force a memory
> >barrier, and will also force the compiler to honor some optimization
rules
> >about optimizing across a MB. There is also a write memory barrier (in
that
> >case, I think you need to use a ASM statement to issue it) which is a
little
> >more lightweight than a MB, and will simply force write ordering
(designed
> >for writing to devices).
>
> Which is good, though not portable. I can't find a write memory barrier
> in my (old) copy of the Alpha architecture handbook.
>

#ifdef ALPHA
#define _memory_barrier() __MB()
#define _WMB() asm("wmb")
#else
// For other systems... roll your own. For instance Memory Fence
// on IA64 == MB on Alpha for all intents & purposes
#define _memory_barrier()
#define _WMB()
#endif

Look in 4.11.7 of the 2nd edition of the manual.

The write memory barrier was created in response to a demand from the
graphics group. We had a really fast PIO device, but the problem was A) the
original device did not have well ordered registers, and B) in a loop, you
had to issue a MB to prevent aggregation in the write buffers. The MB on
EV4 was super heavyweight. (The MB has progressively become less and less
heavyweight since then.) But the WMB was designed to only guarantee that
writes would be ordered with respect to themselves - that is, all writes
before the WMB would complete before any writes after the WMB (but nothing
is implied about reads). Of course, in the mean time, we found that the
graphics device didn't decode all the address pins, and had ghost alias
addresses, and that we could mung our software to use alternate HW aliases.
And knowing explicit things about how EV4 (and EV5) flushed write buffers,
we could be assured that they would be written in-order to the device.

Fred Kleinsorge

Jan 31, 2003, 2:12:26 PM

"Nick Maclaren" <nm...@cus.cam.ac.uk> wrote in message
news:b1ee8t$ot1$1...@pegasus.csx.cam.ac.uk...

The progress problem exists no matter what you do. Either you do it in SW,
or you do it in HW. My fear is that the HW may *not* do it, or do it in a
heavy-handed way that makes all atomic operations slower than they should be.

Well written SW *can* guarantee forward progress for atomic (LL/STC)
sequences.

> mentioned and don't see that it is entirely a matter of
> poorly written software. There isn't enough information in my copy of
> the Alpha architecture handbook to be able to tell what will be handled
> well and what won't. For example the lock unit can be anything from
> 8 bytes to a page - how do I write an efficient, portable program
> where I need different algorithms according to which it is?
>

Frankly, the "lock unit" on EV6 and later is all of memory. The only really
important thing for performance in placement of the lock is to make sure
that it is entirely within its own cache line. Wanna be safe, place it at
the start of a page and don't put anything else in it. Kind of a waste
though.

On EV6 and later there is only a local "lock" flag. This flag is cleared if
*any* memory reference is made. Memory can't be referenced between the
LL/STC, so the only way the flag can be touched is if there is an interrupt.

The STC succeeds if the local flag is valid AND the cache write transaction
succeeds.

> As a support guy, I loathe things like this, though I have never had
> to support Alphas. If I discover some poorly written LL/STC code that
> runs badly but does not obviously break the rules, often vendor code,
> then trying to report it can be hell and getting it dealt with almost
> or actually impossible. LL/STC is just too complicated and too
> underspecified, and I don't see how it could be fully specified without
> hobbling the hardware people.
>

Frankly, most 3rd party guys shouldn't use LL/STC directly. They should use
the built-in atomic operators in DEC-C, or use a system library call for a
spinlock, or some other type of semaphore. Or you had better really know
what you are doing.

Just like most 3rd party guys should never try to write a multi-threaded
application, or a parallel application - most of them will get it wrong ;-)

> Software developers for their own company's hardware have it easy :-(
>

Don't bet on it.

Nick Maclaren

Jan 31, 2003, 2:17:54 PM
In article <3e3ac...@hpb10302.boi.hp.com>,

Fred Kleinsorge <kleinsorge@star-dot-zko-dot-dec-dot-com> wrote:
>
>For a single write, to an address that doesn't cross a cache line, you
>should never see word tearing. In fact, we have a device that uses this
>trick for it's queuing.

Is that architected? If so, that is what I regard as the basic level
of atomic operation (i.e. atomic loads and stores). The way I read
the architecture, it is almost certain but not architected.

>> However, the Alpha architecture handbook I have says very clearly
>> that memory barriers are NOT created by interrupts, and that the
>> software must do it.
>
>Right, and unless *you* are coding your own OS, you will find that in the
>platform PALcode, or in the exiting Tru64 or VMS (and probably Linux)
>OS interrupt service, you will find a MB someplace in the path.

Well, my interest is in VERY efficient, application-level exception
handling - more-or-less theoretical in the systems I used, but I have
heard it is still alive and well in embedded systems.

>Look in 4.11.7 of the 2nd edition of the manual.

Mine is the first edition :-)

>The write memory barrier was created in response to a demand from the

>graphics group ...

Ah! Thanks. Yes, that is precisely the primitive I was talking about!

With that primitive and guaranteed atomic loads and stores of 2^N
bytes up to (say) N = 4, you can do 90% of what most people need to
do in the middle of really performance critical code. For the
remainder, which facilities are best gets rather religious.

So, I am happy to admit that I have been maligning the modern Alpha
architecture, though I remain a little dubious about the number of
things that are true but not architected. However, as you know, that
is not unique to the Alpha :-(

John Dallman

Jan 31, 2003, 2:42:00 PM
In article <b1c730$156$1...@pegasus.csx.cam.ac.uk>, nm...@cus.cam.ac.uk (Nick
Maclaren) wrote:

> >I think that's what I said. I was asking about someone's earlier claim
> >that the Alpha and Opteron *processors* have a 32-bit mode, which I
> >believe is false in the case of the Alpha.
> At least SPARC, MIPS and POWER are like PA-RISC, except that I can't
> say whether the architecture allows the kernel to be either.

SPARC Solaris and MIPS Irix happily host 32-bit processes under a 64-bit
kernel but while they have 32-bit kernels, I'm pretty sure that those
don't support 64-bit code. Solaris defaults to the 64-bit kernel on
UltraSPARC II and later. You can run the 32-bit kernel on those
processors, but you have to be firm to get it to happen.

---
John Dallman j...@cix.co.uk
"Any sufficiently advanced technology is indistinguishable from a
well-rigged demo"

John Dallman

unread,
Jan 31, 2003, 2:42:00 PM1/31/03
to
In article <m3el6t8...@averell.firstfloor.org>,
fre...@alancoxonachip.com (Andi Kleen) wrote:

> Terje Mathisen <terje.m...@hda.hydro.com> writes:
> > glen herrmannsfeldt wrote:
> >> I wonder if Windows 2020 will still run MSDOS 1.0 programs?
> > Sure it will!
> x86-64 dropped vm86 mode. If it catches on and Windows 2020 is a long
> mode operating system then it won't run MSDOS 1.0 programs directly on
> the CPU.

Win64 doesn't run 16-bit programs any more anyway. It was left out
deliberately, as I understand it, probably because the complexity was
getting silly. Itanium x86 mode and Opteron compatibility mode could both
run the 16-bit code, but the OS support isn't there.

M. Ranjit Mathews

unread,
Jan 31, 2003, 3:24:32 PM1/31/03
to
John Dallman wrote:
> nm...@cus.cam.ac.uk (Nick Maclaren) wrote:

> SPARC Solaris and MIPS Irix happily host 32-bit processes under a 64-bit
> kernel but while they have 32-bit kernels, I'm pretty sure that those
> don't support 64-bit code.

Solaris 2.6 and AIX 4.3 had 32 bit kernels that supported 32 and 64 bit
applications.

"AIX 4.3 is a 32 bit kernel and allows 32 and 64 bit executables."
http://hpcf.nersc.gov/computers/SP/64bit.html

> Solaris defaults to the 64-bit kernel on
> UltraSPARC II and later. You can run the 32-bit kernel on those
> processors, but you have to be firm to get it to happen.

From Solaris 2.7 onward, that is.

Andi Kleen

unread,
Jan 31, 2003, 3:30:42 PM1/31/03
to
"Fred Kleinsorge" <kleinsorge@star-dot-zko-dot-dec-dot-com> writes:
>
> MB's are only really interesting if you are interacting with a DMA device,
> or another CPU. Not just an asynchronous interrupt on the same CPU.

Things would be soooo much easier if all computers only had a single CPU ....
but they haven't.

A lot of software has to be MP safe these days, so you cannot describe
"or another CPU" as some kind of exceptional situation. It is standard.

-Andi

Andi Kleen

unread,
Jan 31, 2003, 3:34:36 PM1/31/03
to
"Fred Kleinsorge" <kleinsorge@star-dot-zko-dot-dec-dot-com> writes:
>
> #ifdef ALPHA
> #define _memory_barrier() __MB()
> #define _WMB() asm("wmb")
> #else

The point is that Alpha needs many more memory barriers than the other
common architectures (IA64, IA32, PPC, Sparc, PA-RISC, ....). When you're
writing portable code this can be a major issue, because even when
you put the memory barriers in, you have no way to test that they're
correct unless you're lucky enough to have a multiprocessor Alpha testbox
too.

-Andi

John Dallman

unread,
Jan 31, 2003, 6:01:00 PM1/31/03
to
In article <3E3AD08...@yahoo.com>, ranjit_...@yahoo.com (M. Ranjit
Mathews) wrote:

> Solaris 2.6 ... had 32 bit kernels that supported 32 and 64 bit
> applications.


> > Solaris defaults to the 64-bit kernel on
> > UltraSPARC II and later. You can run the 32-bit kernel on those
> > processors, but you have to be firm to get it to happen.
> From Solars 2.7 onward, that is.

Ah, fair enough. We jumped from Solaris 2.5 to Solaris 7; the customers
didn't ask for 2.6 until we'd been on 7 for a couple of years.

Richard Krehbiel

unread,
Jan 31, 2003, 6:42:42 PM1/31/03
to
Andi Kleen wrote:
> Terje Mathisen <terje.m...@hda.hydro.com> writes:
>
>
>>glen herrmannsfeldt wrote:
>>
>>>I wonder if Windows 2020 will still run MSDOS 1.0 programs?
>>
>>Sure it will!
>
>
> x86-64 dropped vm86 mode.

Not completely, only when the OS is a new "long mode" 64-bit OS. When
the OS is a 32-bit OS, vm86 mode is still there.

Charles Shannon Hendrix

unread,
Jan 31, 2003, 10:36:26 PM1/31/03
to
In article <m365s58...@averell.firstfloor.org>, Andi Kleen wrote:

> The point is that Alpha needs many more memory barriers than the other
> common architectures (IA64, IA32, PPC, Sparc, PA-RISC, ....). When you're
Is this true of all Alphas?

I thought that the 21264 and up were supposed to address this and other
issues, and that some versions didn't need it, or at least not as much?


Charles Shannon Hendrix

unread,
Jan 31, 2003, 10:34:27 PM1/31/03
to
In article <b1ei52$rfj$1...@pegasus.csx.cam.ac.uk>, Nick Maclaren wrote:

> So, I am happy to admit that I have been maligning the modern Alpha
> architecture, though I remain a little dubious about the number of
> things that are true but not architected. However, as you know, that
> is not unique to the Alpha :-(

I am pretty sure that the 21064 pretty well drained the pipeline when
it saw a trapb instruction, to guarantee that nothing would trap before
moving on.

Other Alphas didn't do this, but I think this CPU was used so much in
Alphastations and low-end Alphaservers that it gave the Alpha a worse
reputation than it deserved.

There were also some nasty bugs in the 21064.

Charles Shannon Hendrix

unread,
Jan 31, 2003, 10:06:14 PM1/31/03
to
In article <b1dd6e$rff$1...@pegasus.csx.cam.ac.uk>, Nick Maclaren wrote:

> Now, the Alpha seems to rely on barriers, which can be used to handle
> this case (except that they aren't easily usable from high-level and
> portable code), but they are far too heavyweight. Also, in general,

I think what happens on a trap-barrier is that the CPU stalls until
it's not possible that any instructions before the barrier will have
exceptions.

You do this by putting a trapb after the point of exception, and then
you will know precisely where the exception occurred. It's expensive
because instructions between current barrier and <something???> have to
be checked. During that time you can't jump in or out of that block of
instructions, and no registers can be modified.

I took this to mean that for all practical purposes, the CPU was
unavailable during this time. Seems like if you were careful you could
avoid excessive use, but maybe that's not true.

Math exceptions were different. The Alpha could report the trap without
a barrier pause. This was called an imprecise exception, because you
knew a trap had occurred, but not how many instructions past it had
been executed.

I wonder why all instructions couldn't work this way?

I thought though, that later Alphas changed this a little.


Larry Kilgallen

unread,
Jan 31, 2003, 11:31:32 PM1/31/03
to
In article <3E3A33A4...@mediasec.de>, Jan C. Vorbrüggen <jvorbr...@mediasec.de> writes:
>> Are you just alluding to my #1 point above, or was there something
>> more subtle that concerned you? I waffle on LL/STC vs. atomic
>> operations, myself. On the one hand, LL/STC is faster, more general,
>> and easier to implement. On the other hand, LL/STC as it is defined
>> under Alpha can create some hard forward progress issues on poorly
>> written software, and there are some fairness issues in hardware as well.
>
> In my view, LL/STC wins because you can build atomic ops out of it, but
> you can't use atomic ops to build LL/STC - well, I suppose you could, but
> it would be putting the cart before the horse.
>
> With regard to the other issues, doesn't every synchronization/lock
> mechanism suffer from these issues, i.e., the hardware _and_ software
> architects _and_ implementors must know what they're doing, or things
> _will_ go wrong?

On Alpha, higher level language programmers do not need to do these
details. They use the paradigm provided by their language (tasking,
threads, whatever) and rely on the compiler and runtime writers to
have gotten the synchronization details correct. After all, they
are relying on compiler/runtime implementation for a lot of other
support.

Larry Kilgallen

unread,
Jan 31, 2003, 11:41:19 PM1/31/03
to

> The point is that Alpha needs many more memory barriers than the other
> common architectures (IA64, IA32, PPC, Sparc, PA-RISC, ....). When you're
> writing portable code this can be a major issue, because even when
> you put the memory barriers in, you have no way to test that they're
> correct unless you're lucky enough to have a multiprocessor Alpha testbox
> too.

If it isn't tested, does it really count as software ?

del cecchi

unread,
Jan 31, 2003, 11:51:48 PM1/31/03
to

"Charles Shannon Hendrix" <sha...@news.widomaker.com> wrote in message
news:slrnb3mg23....@news.widomaker.com...
snip

> I am pretty sure that the 21064 pretty well drained the pipeline when
> it saw a trapb instruction, to garantee that nothing would trap before
> moving on.
>
> Other Alphas didn't do this, but I think this CPU was used so much in
> Alphastations and low-end Alphaservers that it gave the Alpha a worse
> reputation than it deserved.
>
> There were also some nasty bugs in the 21064.
>
>
Bugs? Quirks? I'm shocked. Shocked I say. How can that be? :-)
Bill, say it isn't so!


Bill Todd

unread,
Feb 1, 2003, 1:33:53 AM2/1/03
to

"Andi Kleen" <fre...@alancoxonachip.com> wrote in message
news:m38yx18...@averell.firstfloor.org...

The context of Fred's statement was a response to a specific example
contending that an issue existed with interrupts on a single CPU.

- bill

Peter Dickerson

unread,
Feb 1, 2003, 5:26:04 AM2/1/03
to
"John Dallman" <j...@cix.co.uk> wrote in message
news:memo.2003013...@jgd.compulink.co.uk...

> In article <m3el6t8...@averell.firstfloor.org>,
> fre...@alancoxonachip.com (Andi Kleen) wrote:
>
> > Terje Mathisen <terje.m...@hda.hydro.com> writes:
> > > glen herrmannsfeldt wrote:
> > >> I wonder if Windows 2020 will still run MSDOS 1.0 programs?
> > > Sure it will!
> > x86-64 dropped vm86 mode. If it catches on and Windows 2020 is a long
> > mode operating system then it won't run MSDOS 1.0 programs directly on
> > the CPU.
>
> Win64 doesn't run 16-bit programs any more anyway. It was left out
> deliberately, as I understand it, probably because the complexity was
> getting silly. Itanium x86 mode and Opteron compatibility mode could both
> run the 16-bit code, but the OS support isn't there.

In the case of the Opteron, 16-bit protected mode, yes, but not vm86 from a
64-bit OS. I.e., 16-bit Windows apps but not MSDOS apps.

Peter

Terje Mathisen

unread,
Feb 1, 2003, 7:06:30 AM2/1/03
to
Fred Kleinsorge wrote:
> Of course, in the mean time, we found that the graphics device didn't
> decode all the address pins, and had ghost alias addresses, and that
> we could mung our software to use alternate HW aliases.

> And knowing explicit things about how EV4 (and EV5) flushed write
> buffers, we could be assured that they would be written in-order
> to the device.

That's about the best description of non-portable coding that I've read
in a long time. :-)

Terje
--
- <Terje.M...@hda.hydro.com>
"almost all programming can be viewed as an exercise in caching"

Nick Maclaren

unread,
Feb 1, 2003, 7:21:04 AM2/1/03
to
In article <9nhM3T...@eisner.encompasserve.org>,

Larry Kilgallen <Kilg...@SpamCop.net> wrote:
>
>On Alpha, higher level language programmers do not need to do these
>details. They use the paradigm provided by their language (tasking,
>threads, whatever) and rely on the compiler and runtime writers to
>have gotten the synchronization details correct. After all, they
>are relying on compiler/runtime implementation for a lot of other
>support.

That's a nice theory - pity about the practice.

Neither C nor Fortran HAVE any synchronisation paradigms, and I am
pretty sure the same is true of C++ (and probably Cobol and PL/I).
This isn't theoretical, either, as there almost certainly still is
a code-generation 'feature' where updates to C volatile objects are
not being synchronised properly on the Alpha. I checked with the
C standard committee, and the standard doesn't require that.

That excludes the minor detail that I failed to see any synchronisation
opcodes in the compiled code, and failed to find any compiler options
to request them. No, the reality is that almost all such programmers
solve such problems by closing their eyes and praying - which isn't
a bad strategy in the absence of any better one - and are saved by the
fact that the probability of these nasty race conditions is so low.

God help us with such an approach when controlling chemical plants.

Andi Kleen

unread,
Feb 1, 2003, 8:09:45 AM2/1/03
to
"Bill Todd" <bill...@metrocast.net> writes:
>
> The context of Fred's statement was a response to a specific example
> contending that an issue existed with interrupts on a single CPU.

Nick's example of the interrupt may have been misleading, but the general
problem exists on any multithreaded application on an operating system
that supports multiple CPUs.

-Andi

Nick Maclaren

unread,
Feb 1, 2003, 9:38:09 AM2/1/03
to
In article <m3bs1wn...@averell.firstfloor.org>,

Oh, yes, that is so. My point was and is that it also applies to
many or most systems in the context of at least some interrupts.
While such things are not truly parallel, they are usually specified
and sometimes implemented in a similar way.

Essentially, the problem arises in any system where not all interrupts
are absolutely precise and synchronised, and that includes pretty well
all systems nowadays. It is usually covered up by the software, at
the expense of making all out-of-line interrupt handling horrendously
expensive. But, even when it is, it is rarely SPECIFIED to be done
and the programmer has to rely on undocumented behaviour.

And, as I say, the language standards and compilers don't help anyone
to write clean or portable code that is safe in this respect.
