Itanium finally passes Alpha at HP

Yousuf Khan

Aug 21, 2004, 4:38:35 PM

Itanium sales have finally surpassed Alpha sales at HP. Looks like it's
mostly in the OpenVMS market though. Most OpenVMS customers are entrenching
around Itanium now. The Alpha-Tru64 market still seems to be volatile.

http://www.techweb.com/wire/story/TWB20040820S0005

Yousuf Khan

E.S.

Aug 21, 2004, 11:16:23 PM

Yousuf Khan wrote:
> Itanium sales have finally surpassed Alpha sales at HP.

Where did you find that in the article you quoted ?

Alan Walpool

Aug 22, 2004, 12:46:18 AM

>>>>> "E" == E S <e...@ecubics.com> writes:

E> Yousuf Khan wrote:
>> Itanium sales have finally surpassed Alpha sales at HP.

E> Where did you find that in the article you quoted ?

>> Looks like it's mostly in the OpenVMS market though. Most OpenVMS
>> customers are entrenching around Itanium now. The Alpha-Tru64
>> market still seems to be volatile.
>> http://www.techweb.com/wire/story/TWB20040820S0005
>>

Looks like someone posted before reading the article. It appears the
only thing the article discusses is an HP claim the performance of
openvms on itanium has surpassed the alpha. HP would love to kill the
alpha line, and it probably would save them some money on alpha
development. The alpha was dead when compaq bought it. Nice systems
but bound to fail eventually for many different reasons.

I have an old alpha system. Very nice computer, but now it is a
collector's item. You can tell by the low prices used alpha systems
fetch on e-bay that the alpha is pretty much history. With 64 bit
coming online in the commodity market it is just a matter of time. Sun
is trying to make the switch before it is too late.

Whatever. 64 bit has finally arrived!!! It goes to show that being
first does not mean you have a winner. The marketing of computers is
not like the Olympics ;-)).

Later,

Alan

Nick Maclaren

Aug 22, 2004, 4:50:10 AM

In article <873c2fi...@spamme.onzedge.net>,

Alan Walpool <awal...@onzedge.net> wrote:
>
>Looks like someone posted before reading the article. It appears the
>only thing the article discusses is an HP claim the performance of
>openvms on itanium has surpassed the alpha. HP would love to kill the
>alpha line, and it probably would save them some money on alpha
>development. The alpha was dead when compaq bought it. Nice systems
>but bound to fail eventually for many different reasons.

All things are transitory. But it is as false to say that the Alpha
was dead when Compaq bought it as it is to say that Compaq killed
the most successful RISC architecture. The situation was that it
would have needed a massive change in approach to stop it fading,
but it is quite possible that would have changed it from a small
player to an x86 replacement. And it is possible that it would
have sunk even faster if that were attempted. We shall now never
know.

Regards,
Nick Maclaren.

Yousuf Khan

Aug 22, 2004, 10:40:41 AM

E.S. <e...@ecubics.com> wrote:
> Yousuf Khan wrote:
>> Itanium sales have finally surpassed Alpha sales at HP.
>
> Where did you find that in the article you quoted ?

Oops, you're right upon further rereading, I find they were talking about
surpassing the performance of, not the sales of. Oh well, my bad. :-)

John Savard

Aug 22, 2004, 2:06:00 PM

On Sat, 21 Aug 2004 23:46:18 -0500, Alan Walpool <awal...@onzedge.net>
wrote, in part:

>You can tell by the low prices used alpha systems
>fetch on e-bay that the alpha is pretty much history.

Pity I live in Canada and not the United States. It'll be awkward for me
to take advantage of one of those bargains - I'll have to wait till
someone in Canada wants to get rid of his Alpha.

John Savard
http://home.ecn.ab.ca/~jsavard/index.html

John Savard

Aug 22, 2004, 2:07:25 PM

On Sat, 21 Aug 2004 23:46:18 -0500, Alan Walpool <awal...@onzedge.net>
wrote, in part:

>Whatever. 64 bit has finally arrived!!! It goes to show that being
>first does not mean you have a winner. The marketing of computers is
>not like the Olympics ;-)).

Well, the Itanium is also a really big chip with low yields.

64-bit has finally arrived, since AMD *was* first with something - first
at making it affordable.

John Savard
http://home.ecn.ab.ca/~jsavard/index.html

Rob Stow

Aug 22, 2004, 4:47:37 PM

John Savard wrote:

1.) The Alpha servers/workstations available on E-Bay are
seldom with processors faster than 233 or 266 MHz.
In other words, only the really ancient stuff is being
sold on E-Bay - so it's no surprise that the prices are
low.

Components for upgrading more modern Alpha servers, such
as 1 and 2 GB Memory upgrades for Alpha servers, by contrast
are selling for big bucks. People are willing to pay big
premiums to keep their Alpha servers alive and well -
hardly a sign that the Alpha is history.

2.) Check the "for sale" newsgroups for your province or city.
Even in Saskatchewan (sk.forsale) we occasionally get local
sales of Alpha systems comparable to what is available on
E-Bay. However, those too are 233 and 266 MHz systems
almost all the time.

3.) What's wrong with using E-Bay but just limiting your search
to Canadian sellers ?

--
Reply to rob.sto...@shaw.ca
Do not remove anything.

Bernd Paysan

Aug 22, 2004, 5:19:48 PM

John Savard wrote:
> 64-bit has finally arrived, since AMD *was* first with something -
> first at making it affordable.

And I thought that was Nintendo (sorry, couldn't resist ;-).

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

Yousuf Khan

Aug 22, 2004, 5:43:27 PM

John Savard wrote:
> On Sat, 21 Aug 2004 23:46:18 -0500, Alan Walpool
> <awal...@onzedge.net> wrote, in part:
>
>> You can tell by the low prices used alpha systems
>> fetch on e-bay that the alpha is pretty much history.
>
> Pity I live in Canada and not the United States. It'll be awkward for
> me to take advantage of one of those bargains - I'll have to wait till
> someone in Canada wants to get rid of his Alpha.

One of the projects I was on last year was busy buying new Alphas too.

Yousuf Khan

Alan Walpool

Aug 22, 2004, 8:51:36 PM

>>>>> "John" == John Savard <jsa...@excxn.aNOSPAMb.cdn.invalid> writes:

John> On Sat, 21 Aug 2004 23:46:18 -0500, Alan Walpool
John> <awal...@onzedge.net> wrote, in part:

>> Whatever. 64 bit has finally arrived!!! It goes to show that being
>> first does not mean you have a winner. The marketing of computers
>> is not like the Olympics ;-)).

John> Well, the Itanium is also a really big chip with low yields.

John> 64-bit has finally arrived, since AMD *was* first with
John> something - first at making it affordable.

Good point. AMD's 64 bit was first when it comes to price, and is still
the low price leader at the current time. Looks like the Intel 64 bit
x86 processor is not going to compete with the AMD low end 64 bit
processors when it comes to price.

Sorry, but I forgot to mention the 64 bit PowerPC, which has been around
a while. PowerPC is probably cheap, but Mac is not cheap. To be complete,
Sun has 64 bit also but is moving to AMD 64 bit. Itanium is in there
somewhere.

I still like Alpha; good to hear there is still a demand for this
processor.

Curious - are there any other 64 bit processors that are still on the
market and being actively used?

Later,

Alan

John Mashey

Aug 23, 2004, 4:07:29 AM

Alan Walpool <awal...@onzedge.net> wrote in message news:<874qmuy...@spamme.onzedge.net>...

> Curious - are there any other 64 bit processors that are still on the
> market and being actively used?

1) In 1Q92, SGI shipped the Crimson product, which used a 64-bit MIPS
R4000, albeit with 32-bit software. DEC shipped Alphas in systems
later that year, and {HP, Sun, IBM, in some order or other} over the
next few years.

2) R4x00s were shipping in Nintendo N64s around 1996.

3) There are of course lots of this family still around, and still
shipping ... in particular, folks like PMC-Sierra and Broadcom &
others sell them ...
and they've been in most Cisco routers for a long time, as well as
lots of laser printers, set-top boxes, and other products. See
http://www.mips.com/content/Ecosystem/Licensees/ProductCatalog/licensees
for companies with current licenses (not all of which use 64-bit, but
many), but many of these licensees sell the chips to other companies.

4) With all due respect to general-purpose computing, I suspect that
most of the world's 64-bit micros are *not* in general-purpose
computers. It is somewhat ironic that 64-bit micros have been there
in HPC and *embedded* for over a decade, and have taken soooo long to
get into the desktop and mid-range [and of course, are being viewed by
the press as "something new".] :-) AMD did a very nice job extending
32-bit X86 to 64-bit, almost exactly analogous to what {MIPS, HP, IBM,
Sun} did. Of course, 64-bit super-computers have been around ~30
years.

5) Toshiba's sampling a TX//99-core based one for $45 apiece in 100s,
and I think NEC VR4133s are <$30. There may be cheaper 64-bitters out
there [I just haven't looked lately.]

Alan Walpool

Aug 23, 2004, 10:57:53 AM

>>>>> "John" == John Mashey <old_sys...@yahoo.com> writes:

John> 4) With all due respect to general-purpose computing, I suspect
John> that most of the world's 64-bit micros are *not* in
John> general-purpose computers. It is somewhat ironic that 64-bit
John> micros have been there in HPC and *embedded* for over a decade,
John> and have taken soooo long to get into the desktop and mid-range
John> [and of course, are being viewed by the press as "something
John> new".] :-) AMD did a very nice job extending 32-bit X86 to
John> 64-bit, almost exactly analogous to what {MIPS, HP, IBM, Sun}
John> did. Of course, 64-bit super-computers have been around ~30
John> years.

John, that was interesting reading. I was not aware that 64 bit had
such a large foothold in embedded systems. Amazing, but my take on the
above notes is: why did it take x86 so long to make it to 64 bit? Looks
like Intel was never going to move from 32 bit unless forced to do so.
I guess this is one time competition forced the x86 desktop to 64 bit.
I wish I had purchased some AMD stock when it was below 10 bucks a share
several years ago. ;-)). That is how it goes.

Thanks,

Alan

David Kanter

Aug 23, 2004, 11:11:33 AM

> Well, the Itanium is also a really big chip with low yields.

What makes you think Itanium has low yields? IIRC >70% of the die
area is cache, which is very easy to repair.
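
The repair argument can be put in rough numbers. The sketch below uses the simple Poisson yield model, Y = exp(-A * D), and assumes (optimistically) that every defect landing in cache can be repaired with spare rows or columns, so only the non-cache area counts against yield. The die area and defect density are made-up illustrative values, not Itanium figures; only the 70% cache share comes from the paragraph above, and the function names are hypothetical.

```python
import math


def poisson_yield(area_cm2, defects_per_cm2):
    """Fraction of good dies under the simple Poisson model Y = exp(-A * D)."""
    return math.exp(-area_cm2 * defects_per_cm2)


def yield_with_repairable_cache(area_cm2, defects_per_cm2, cache_fraction):
    """Yield if every defect in the cache area can be repaired (an assumption).

    Only the remaining logic area is then exposed to yield-killing defects.
    """
    logic_area_cm2 = area_cm2 * (1.0 - cache_fraction)
    return poisson_yield(logic_area_cm2, defects_per_cm2)


# Illustrative inputs only: a 4 cm^2 die and 0.5 defects/cm^2 are assumptions.
AREA_CM2 = 4.0
DEFECTS_PER_CM2 = 0.5
CACHE_FRACTION = 0.70  # the ">70% of the die area is cache" figure

if __name__ == "__main__":
    raw = poisson_yield(AREA_CM2, DEFECTS_PER_CM2)
    repaired = yield_with_repairable_cache(AREA_CM2, DEFECTS_PER_CM2, CACHE_FRACTION)
    print(f"no repair:   {raw:.1%}")
    print(f"with repair: {repaired:.1%}")
```

With these assumed inputs the unrepaired die yields roughly 14% and the repairable one roughly 55%, which is why a large die that is mostly cache need not have low yields.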

> 64-bit has finally arrived, since AMD *was* first with something - first
> at making it affordable.

That is true.

David

Bernd Paysan

Aug 23, 2004, 11:31:26 AM

Alan Walpool wrote:
> I wish I had purchased some AMD stock when it was below 10 bucks a share
> several years ago. ;-)). That is how it goes.

Huh? Several years ago (2000-2001 frame), AMD was at one point up to ~50
bucks (actually, 100 bucks, that was before the split); last year, it was
down to 5 bucks or so, and now it's at 12 bucks, so not much away from your
"strong buy" price.

Alan Walpool

Aug 23, 2004, 4:03:25 PM

>>>>> "Bernd" == Bernd Paysan <bernd....@gmx.de> writes:

Bernd> Alan Walpool wrote:
>> I wish I had purchased some AMD stock when it was below 10 bucks a
>> share several years ago. ;-)). That is how it goes.

Bernd> Huh? Several years ago (2000-2001 frame), AMD was at one point
Bernd> up to ~50 bucks (actually, 100 bucks, that was before the
Bernd> split); last year, it was down to 5 bucks or so, and now it's
Bernd> at 12 bucks, so not much away from your "strong buy" price.

I was wrong; thanks for the correction. The years are starting to run
together.

Thanks,

Alan

Bill Todd

Aug 26, 2004, 2:51:42 AM

"Yousuf Khan" <bbb...@ezrs.com> wrote in message
news:Jf2Wc.26900$UYx....@twister01.bloor.is.net.cable.rogers.com...

> E.S. <e...@ecubics.com> wrote:
> > Yousuf Khan wrote:
> >> Itanium sales have finally surpassed Alpha sales at HP.
> >
> > Where did you find that in the article you quoted ?
>
> Oops, you're right upon further rereading, I find they were talking about
> surpassing the performance of, not the sales of. Oh well, my bad. :-)

It's also wise to observe exactly who was doing the talking - Terry Shannon,
well-known HP shill.

The only figures I've seen from that presentation indicated that while the
top-of-the-line Itanic was able to beat lower-end Alphas in the traditional
'Vax Unit of Performance' (VUPS) metric, the 1.15 GHz Alpha was still
marginally faster, even in its previous-generation process and without using
the absolutely newest (1.3 GHz) models.

Of course, if Alpha hadn't been killed three years ago, the new Itanics
would have been competing (though the performance gap would have made that
word somewhat laughable) against EV8, with over twice the performance of EV7
plus 4-way SMT (offering another factor of close to 3 in commercial
workloads like TPC-C, according to the simulations performed prior to EV8's
demise - enough to leave POWER5 in the dust as well, though in EV8's absence
POWER5 seems a good bet to thoroughly shame Itanic for at least the next 30
months in TPC-C).

- bill

Nick Maclaren

Aug 26, 2004, 5:43:41 AM

In article <MvqdnbwJLoa...@metrocastcablevision.com>,

"Bill Todd" <bill...@metrocast.net> writes:
|>
|> It's also wise to observe exactly who was doing the talking - Terry Shannon,
|> well-known HP shill.

That is unfair. He is biassed, because he wouldn't get cooperation
if he wasn't, but he is not simply a shill.

One line I noticed from his presentation was "No denial of service
attacks... and more". Like the 13th stroke of a clock, that casts
doubt over the rest of HP's claims. I have seen that claim made
by many vendors over many years, and it is invariably a sign of a
presentation that is largely bullshit. The reason is that it is
provably equivalent to solving the halting problem.

No, I can't say WHICH of the other statements in HP's presentation
are bullshit, and which are largely true, but I am sure that quite
a lot will be the former - simply because of that inclusion. It
really IS that indicative - marketdroids please note!

Regards,
Nick Maclaren.

Alex Johnson

Aug 26, 2004, 8:50:27 AM

Bill Todd wrote:
> Of course, if Alpha hadn't been killed three years ago, the new Itanics
> would have been competing (though the performance gap would have made that
> word somewhat laughable) against EV8, with over twice the performance of EV7
> plus 4-way SMT (offering another factor of close to 3 in commercial
> workloads like TPC-C, according to the simulations performed prior to EV8's
> demise - enough to leave POWER5 in the dust as well, though in EV8's absence
> POWER5 seems a good bet to thoroughly shame Itanic for at least the next 30
> months in TPC-C).

This is untrue. The EV8 was reported as having vastly better
performance than Itanium or Pentium 4, and I was eagerly waiting for it
(almost drooling). But one thing you leave out is the Alpha team's
tendency to keep projects going long after their planned release dates.
EV8 would not be available realistically until 2006. By that time it
would be 4 threads for Alpha vs 2 cores*2 threads for Pentium 4,
Itanium, and Power. It would be at best comparable, if not behind the
times, since all three of those architectures would have their 2*2 by
2005. Alpha has a problem with "give me 2% more performance and you can
delay the project as long as it takes". The team recognizes this as
their downfall and points it out all the time.

Alex
--
My words are my own. They represent no other; they belong to no other.
Don't read anything into them or you may be required to compensate me
for violation of copyright. (I do not speak for my employer.)

Peter Boyle

Aug 26, 2004, 10:00:57 AM

On Thu, 26 Aug 2004, Alex Johnson wrote:

> 2005. Alpha has a problem with "give me 2% more performance and you can
> delay the project as long as it takes". The team recognizes this as
> their downfall and points it out all the time.

sources for this?
Peter

>
> Alex
> --
> My words are my own. They represent no other; they belong to no other.
> Don't read anything into them or you may be required to compensate me
> for violation of copyright. (I do not speak for my employer.)
>
>

Peter Boyle pbo...@physics.gla.ac.uk

Matt R.

Aug 26, 2004, 1:43:29 PM

"Yousuf Khan" <bbb...@ezrs.com> wrote in message news:<fpOVc.436$Zu...@news04.bloor.is.net.cable.rogers.com>...

QUOTE:
"The performance crossover point -- the point at which IPF will meet,
and begin to exceed by a widening margin, the performance of Alpha --
is expected to occur in the EV7z/Madison9M timeframe," Shannon said
referring to the final iteration of the Alpha family (EV7z) and a new
Itanium configuration (Madison9M.)"
(End of quote)

Well, this is interesting, isn't it? The Madison9M can finally beat
the lower clocked Alpha. That makes me wonder what the promise of
higher IPC rates for EPIC compared to other ISAs is really worth,
especially when you take into account the fact that the Madison has
a huge 9 MiB L3 cache on die compared to the tiny 1.75 MiB L2 of the EV7z.
On-die cache usually helps performance a lot.

That's what we have seen regarding single CPU performance, e.g. SPEC
INT/FP BASE 2000 results are higher for Itanium than for Alpha. So if
HP (ok, not really HP but Terry Shannon) is still saying that with
Madison9M the performance level of Alpha is reached/overtaken, they/he
must be referring to SMP systems. Here Alpha has an advantage with its four
high-speed interconnects for direct CPU communication compared to the
shared bus of Itanium, which seems to be a bottleneck.

So if performance for larger systems is more dependent on interconnect
choice and its implementation, I ask myself why a new ISA is necessary
and why other ISAs had to die in favor of this new one. The point,
it seems to me, is not to choose an ISA with a theoretical advantage in ILP
rate but to implement good (low latency) CPU interconnects.

Call me an Alpha fanboy, but I find it simply amusing that a 0.18 µm
Alpha EV7z finally gets overtaken in terms of performance by a not
yet released 0.09 µm Madison with 5 times the amount of on-die cache.
Well, I wonder when the point of "overtaken" would have been reached
if the original plans for an EV79 (0.13 µm, 3 MB L2, 1.8 GHz+) had been
realized and not delayed and finally canceled by HP.

Sorry for OT, but when hearing those statements from the above quote I
really ask myself why Alpha is being replaced by Itanium.

Regards,

Matt

Stefan Monnier

Aug 26, 2004, 2:46:15 PM

> drooling). But one thing you leave out is the Alpha team's tendency to keep
> projects going long after their planned release dates. EV8 would not be

My memory is as blurry as ever, but I faintly seem to remember that 21064 and
21164 were pretty much right on time. 21364 was late indeed, but with
plenty of excuses for it, given the tribulations of the company during
this time. As for 21264, I can't remember at all, tho I'd guess it was
a bit late, as most other CPUs are. So it seemed like the team was not
particularly bad w.r.t deadlines.

Stefan

Stefan Monnier

Aug 26, 2004, 2:49:06 PM

> Sorry for OT, but when hearing those statements from the above quote I
> really ask myself why Alpha is being replaced by Itanium.

Why would you think the reasons are technical?

Stefan

Bill Todd

Aug 26, 2004, 3:41:48 PM

"Nick Maclaren" <nm...@cus.cam.ac.uk> wrote in message
news:cgkbcd$co2$1...@pegasus.csx.cam.ac.uk...

>
> In article <MvqdnbwJLoa...@metrocastcablevision.com>,
> "Bill Todd" <bill...@metrocast.net> writes:
> |>
> |> It's also wise to observe exactly who was doing the talking - Terry Shannon,
> |> well-known HP shill.
>
> That is unfair. He is biassed, because he wouldn't get cooperation
> if he wasn't, but he is not simply a shill.

I suppose if you have kept anywhere nearly as close track of Terry's pro-HP
blather as I have, especially since June 25, 2001, you have a right to make
such a statement (though I'll still challenge it).

Have you?

I'll append below a copy of the rebuttal to his two new 'analyses' that I
posted on comp.os.vms:

In "Beyond Superdome", he first waxes poetic about current Superdome
capabilities, such as their internal interconnect fabric. Let's see: this
is the server architecture (at least somewhat reminiscent of the old and
rather mediocre GS320 server architecture) that using 64 top-of-the-line
Itanics barely manages to stay ahead of the new POWER5 box that requires
only 16 processors (on a grand total of 8 chips, since they're dual-core) in
TPC-C, right?

Then he crows, "HP delivers dual core before Intel" as some kind of
significant achievement. Well, maybe. Of course, Sun is delivering
dual-core SPARC processors today, and IBM started delivering dual-core
POWER4s nearly three years ago. So what beating Intel to the punch mostly
proves is just how far behind the curve Itanic really is, I'd suggest.

Then he starts talking about "How Superdome will maintain leadership", but
in fact that will be impossible - because it's not in the lead right now, so
there's no way it can 'maintain' any lead. And in fact, it will only fall
farther behind during the time-frame during which any reasonably informed
projections can be made.

Terry's first projected performance graph for OLTP explains why. Doubling
current system performance by about a year from now actually sounds pretty
impressive, until you recognize that Superdome's TPC-C performance today
with 64 processors falls slightly behind today's previous-design-generation
POWER4+ systems that use only half that number of processors and only
slightly manages to beat today's POWER5 boxes that use only 1/4th as many
processors. POWER5 will shortly be available with up to 64 processors,
which means that it should beat today's Superdome performance by a factor of
at least 3 within a few months. When Montecito comes along late next year
it will indeed close much of this gap with POWER5 (Terry's second
TPC-C-specific performance graph suggests it should slightly exceed 2
million tpmC), but POWER5 (a full process generation behind Montecito but
still heading for about 3 million tpmC late *this* year) will no longer be
IBM's top-of-the-line product by then, since POWER5+ (in the same process
generation as Montecito) should then be shipping and upping the ante
significantly.
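
The arithmetic behind the "factor of at least 3" can be sketched as follows. It assumes, as a simplification, that the 64-processor Superdome and the 16-processor POWER5 box score about the same on TPC-C today (the text above says Superdome is slightly ahead), and it introduces a hypothetical scaling-efficiency parameter for growing POWER5 from 16 to 64 processors. The function names and that parameter are illustrative, not taken from any benchmark report.

```python
def per_processor_ratio(procs_a, procs_b, tput_a=1.0, tput_b=1.0):
    """How many times more throughput each processor of B delivers than each of A.

    Throughputs default to equal, matching the rough parity assumed here.
    """
    return (tput_b / procs_b) / (tput_a / procs_a)


def projected_advantage(procs_b, procs_b_scaled, efficiency, start_ratio=1.0):
    """Throughput of a scaled-up B relative to A.

    start_ratio is B's throughput divided by A's before scaling (1.0 = parity).
    efficiency is the fraction of ideal linear scaling actually achieved
    (1.0 = perfect scaling); this is an assumed knob, not a measured value.
    """
    growth = procs_b_scaled / procs_b
    return start_ratio * growth * efficiency


if __name__ == "__main__":
    # 64 Itanium processors vs 16 POWER5 processors at roughly equal tpmC.
    print(per_processor_ratio(64, 16))        # per-processor advantage
    # Growing POWER5 from 16 to 64 processors at an assumed 75% efficiency.
    print(projected_advantage(16, 64, 0.75))  # system-level advantage
```

Under these assumptions each POWER5 processor does the work of four Itanium processors, and a 64-processor POWER5 system reaches a 3x lead as long as it keeps at least 75% of ideal scaling from 16 to 64 processors.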

No, once the full-sized POWER5 boxes appear there's no way that
top-of-the-line Superdome OLTP performance should be able to reach more than
about 50% of top-of-the-line POWER performance any time soon. Maybe Tukwila
will help close that gap when it arrives in 2007. Or maybe not, because
POWER6 is due around then. And Fujitsu has regular enhancements to SPARC64
coming along to keep pace with Itanic (though not POWER), regardless of what
one may think of Sun's future efforts for that architecture.

But perhaps the more important observation is that all the glowing
descriptions Terry makes about Superdome are not only features that IBM
perfected many years ago but are things that make pricing anything *but*
commodity-level. So Superdome won't be offering industry-leading
performance *or* industry-leading price/performance, because the x86-64
brigade will be attacking it from beneath on the second front.

'Son of Superdome' will be "Superdome-centric with Alpha attributes"?
That's, like, deja vu all over again: exactly the kind of thing that people
like Terry and Kerry and Rob were telling us the week of June 25, 2001 -
except that the time-frame being discussed back then for the appearance of
the "Alpha/IA64 hybrid" was about now, not 2007.

Well, given that 'about now' is upon us and I don't see any "Alpha/IA64
hybrids" being benchmarked, 2007 seems at least a lot more credible. I
guess my prediction of 2006 three years ago was slightly optimistic, but for
a 5-year-out guesstimate I don't feel *that* ashamed of it.

Terry may have been able to make the Superdome story sound superficially
attractive, but it just doesn't stand up to scrutiny when the *rest* of the
industry is taken into account. And it's also worth considering just how
Terry's fawning descriptions of more long-term architecture development gibe
with the recent reports of a grinning Shane Robison wielding an axe to R&D
like Jack Nicholson in The Shining.

Moving right along, we come to "Why IPF and Why HP: Because SPARC is dead,
Power5 isn't ready for prime time and Extensions won't cut it in the
datacenter."

SPARC is dead, eh? Or 'no longer relevant', as a later slide says. Someone
better tell Fujitsu so it will stop stomping all over the latest Itanics in
commercial benchmarks like jbb2000: that's really not suitable behavior for
a 'dead' processor. And by all means make sure those HP customers who are
defecting to Sun know this: what on earth do you suppose they're thinking?!

As for POWER5 not being ready for prime time, I guess we can say good-bye to
IBM: if they've released an unready product to their customer base, said
base won't be with them for long. Or could one possibly suppose that Terry
is simply blowing yet more thick, black smoke out of his ass for HP?

Intel waited until 1986 to 'begin executing a plan to achieve microprocessor
dominance'? Don't let the iAPX-432 people hear you say that! Oh, wait -
that failed and disappeared after a few years of futile effort...

And Terry's still shouting as loudly as he can (what size font was that?)
that Compaq made the *right* decision to kill Alpha. Well, despite what he
claims, three-plus years later history really doesn't seem at all inclined
to support that thesis, but when you're talking about what history *will*
prove you always have a built-in response to such observations: just wait
some more...

Terry's purported 'analyses' of the relative potential of EPIC vs. RISC, of
relative performance predictions for Itanic vs. Alpha, and of the commercial
viability of Alpha remain as chock-full of shit as ever. They are no more
convincing when looking back from today's vantage point than they were three
years ago, and I'm not going to bother to debunk them in detail yet again:
POWER5 has already done a more than adequate job of doing that out in the
real world, and EV8 (which of course would be shipping today, had it not
been canceled) would have done an even better one. As for the idea that at
least Itanic would provide a compatible hardware platform on which to run
both IA64 *and* IA32 code, well... turns out the hardware supporting IA32
wasn't quite up to the job, so they're replacing it with software
emulation - you know, like Alpha used? But that emulation, though faster
than the previous disaster on Itanic, still can't hold a candle to native
IA32 processors that can run *both* 32-bit and 64-bit code at full speed.

And what's with the slide that shows 64-bit Itanic code out-performing IA32
by a factor of about 2:1 right about now? Last time I checked, they were
pretty much dead-even in many benchmarks, and where that was not true the
leads were split about evenly. Couldn't be just a *bit* of misdirection in
such a slide, could there?

Note how carefully Terry refers to x86-64 as 'extensions' to a 32-bit
architecture, rather than as an actual 64-bit architecture. Kind of makes
you wonder why he doesn't refer to IA32 as 'extensions' to a 16-bit
architecture, doesn't it? After all, that's exactly the same concept.

If you don't believe that IA32 qualifies as a 'real' 32-bit architecture
(despite rather a lot of commercial and scientific evidence to the
contrary), I guess you could swallow the suggestion that x86-64 isn't really
a 64-bit architecture, even though it shows every promise of competing on
equal (and often better) footing with the 'real' 64-bit architectures out
there. Ah, Terry. And if you think that Itanic offers any performance
advantage over x86-64, you haven't looked at benchmarks lately.

As for talking about the decline in quality of the trade press, Terry,
that's pretty hard to stomach coming from such a transparent HP
sock-puppet. But the bolder the lie, the more you seem attracted to it.

But inundating readers with dozens of pages of impressive-sounding buzzwords
that he himself apparently understands only in the vaguest terms may
actually be effective in convincing some portion of the population. It
really does seem that HP ought to be paying him *something* for this effort,
plus perhaps a significant tip for the total abandonment of personal
integrity that it requires.

- bill

Bill Todd

Aug 26, 2004, 3:43:08 PM

"Alex Johnson" <comp...@jhu.edu> wrote in message
news:cgkmak$t6a$1...@news01.intel.com...

> Bill Todd wrote:
> > Of course, if Alpha hadn't been killed three years ago, the new Itanics
> > would have been competing (though the performance gap would have made that
> > word somewhat laughable) against EV8, with over twice the performance of EV7
> > plus 4-way SMT (offering another factor of close to 3 in commercial
> > workloads like TPC-C, according to the simulations performed prior to EV8's
> > demise - enough to leave POWER5 in the dust as well, though in EV8's absence
> > POWER5 seems a good bet to thoroughly shame Itanic for at least the next 30
> > months in TPC-C).
>
> This is untrue. The EV8 was reported as having vastly better
> performance than Itanium or Pentium 4, and I was eagerly waiting for it
> (almost drooling). But one thing you leave out is the Alpha team's
> tendency to keep projects going long after their planned release dates.
> EV8 would not be available realistically until 2006.

Utter crap.

- bill

Del Cecchi

Aug 26, 2004, 3:52:07 PM

"Bill Todd" <bill...@metrocast.net> wrote in message
news:rLWdnWEku78...@metrocastcablevision.com...

snip
> - bill
>
Do you have a link to terry shannon's article? Is it on his web site? I
want to read about power5 not being ready... :-)
Although I have to feel a little sorry for him, given how he's already had
two companies shot out from under him and everything.

del cecchi
(followups trimmed)
>
>

Bill Todd

Aug 26, 2004, 4:12:30 PM

"Del Cecchi" <cecchi...@us.ibm.com> wrote in message
news:2p6tb8F...@uni-berlin.de...

...

> Do you have a link to terry shannon's article? Is it on his web site? I
> want to read about power5 not being ready... :-)

Well, it's one of Terry's casual slights rather than something he actually
(if incompetently) attempts to substantiate in any manner (at least if he
did, I missed it). It appears in his link to the actual .pdf file, at the
top of this Web page:

http://www.shannonknowshpc.com/stories.php?story=04/08/24/5313953

- bill

Nick Maclaren

Aug 26, 2004, 4:12:45 PM

In article <rLWdnWEku78...@metrocastcablevision.com>,

Bill Todd <bill...@metrocast.net> wrote:
>
>> |> It's also wise to observe exactly who was doing the talking - Terry Shannon,
>> |> well-known HP shill.
>>
>> That is unfair. He is biassed, because he wouldn't get cooperation
>> if he wasn't, but he is not simply a shill.
>
>I suppose if you have kept anywhere nearly as close track of Terry's pro-HP
>blather as I have, especially since June 25, 2001, you have a right to make
>such a statement (though I'll still challenge it).
>
>Have you?

No, but I have read a fair number of his articles, both before and
after that, and have found them useful. I can tell you why our
opinions differ.

I regard the trumpet blowing, flag waving and generalised hype as
the content-free rubbish that it is. I doubt that you will disagree
with THAT - and, if you could get him drunk enough, I doubt that
Terry Shannon would, either. Producing that verbiage is the price of
getting the information that he does. I let that wash over my head.

Where he differs from the true shills is that he doesn't manipulate
the facts, and leaves the real information in there for those who
are prepared to dig it out. And I have generally found it pretty
reliable. There are other commentators who I have seen lying black
is white about hard facts in order to justify their position.

I agree that the quality of his articles has gone down as DEC gave
way to Compaq and Compaq to HP, because he has been trying to
maintain a positive spin in the face of a more and more negative
situation.

Regards,
Nick Maclaren.

Del Cecchi

Aug 26, 2004, 5:06:10 PM

"Bill Todd" <bill...@metrocast.net> wrote in message

news:dL2dnQiIn-Z...@metrocastcablevision.com...

Thanks. He claims to be independent, but he sure has drunk the kool-aid.

del cecchi
>
>

Nick Maclaren

Aug 26, 2004, 6:58:57 PM

In article <2p71m3F...@uni-berlin.de>,

Del Cecchi <cecchi...@us.ibm.com> wrote:
>"Bill Todd" <bill...@metrocast.net> wrote in message
>news:dL2dnQiIn-Z...@metrocastcablevision.com...
>

>> > Do you have a link to terry shannon's article? Is it on his web site?
>> > I want to read about power5 not being ready... :-)
>>
>> Well, it's one of Terry's casual slights rather than something he actually
>> (if incompetently) attempts to substantiate in any manner (at least if he
>> did, I missed it). It appears in his link to the actual .pdf file, at the
>> top of this Web page:
>>
>> http://www.shannonknowshpc.com/stories.php?story=04/08/24/5313953
>>

>Thanks. He claims to be independent, but he sure has drunk the kool-aid.

Well, there is that, but I should very much like to know what his
basis is for saying that, and wouldn't assume that he has none, based
on IBM's record of the past few years. On the other hand, based on
HIS record, I am not going to assume that he has even a scrap of
evidence for that statement, either.

Regards,
Nick Maclaren.

Alex Johnson

Aug 27, 2004, 8:24:49 AM

Matt R. wrote:
> Call me an Alpha fanboy but I find it simply amusing that a 0.18 µm
> Alpha EV7z finally gets overtaken in terms of performance from a not
> yet released 0.09 µm Madison with 5 times the amount of on-die cache.
> Well, I wonder when the point of "overtaken" would have been reached
> if the original plans for an EV79 (0.13 µm, 3 MB L2, 1.8 GHz+) had been
> realized and not delayed and finally canceled by HP.

All very valid, but a technical correction on your rant: Madison9M is
0.13 µm, not 0.09 µm.

Del Cecchi

Aug 27, 2004, 8:30:52 AM

"Nick Maclaren" <nm...@cus.cam.ac.uk> wrote in message

news:cglpvh$ls4$1...@pegasus.csx.cam.ac.uk...

> In article <2p71m3F...@uni-berlin.de>,
> Del Cecchi <cecchi...@us.ibm.com> wrote:
> >"Bill Todd" <bill...@metrocast.net> wrote in message
> >news:dL2dnQiIn-Z...@metrocastcablevision.com...
> >
> >> > Do you have a link to terry shannon's article? Is it on his web
> >> > site? I want to read about power5 not being ready... :-)
> >>
> >> Well, it's one of Terry's casual slights rather than something he
> >> actually (if incompetently) attempts to substantiate in any manner
> >> (at least if he did, I missed it). It appears in his link to the
> >> actual .pdf file, at the top of this Web page:
> >>
> >> http://www.shannonknowshpc.com/stories.php?story=04/08/24/5313953
> >>
> >Thanks. He claims to be independent, but he sure has drunk the kool-aid.
>
> Well, there is that, but I should very much like to know what his
> basis is for saying that, and wouldn't assume that he has none, based
> on IBM's record of the past few years.
>

What is THAT supposed to mean? I know you have had issues with software
etc, but I took his statement as being about processor hardware. And I have seen no
reports of significant hardware problems with Power4, 4+, or 5. Or with
the G5 thing either, although I have seen reports of supply problems.

del cecchi

Alex Johnson

Aug 27, 2004, 8:57:08 AM

Bill Todd wrote:

> In 'Beyond Superdome", he first waxes poetic about current Superdome
> capabilities, such as their internal interconnect fabric. Let's see: this
> is the server architecture (at least somewhat reminiscent of the old and
> rather mediocre GS320 server architecture) that using 64 top-of-the-line
> Itanics barely manages to stay ahead of the new POWER5 box that requires
> only 16 processors (on a grand total of 8 chips, since they're dual-core) in
> TPC-C, right?

I believe you have misinterpreted the "16 processor" POWER5. IBM
actually refers to chips. "16 processor" as reported is 16 POWER5
chips, comprising 32 cores, allowing 64 threads of execution. So the
64-thread Madison vs the 64-thread POWER5 having similar performance is
just a sign that things are about equal. I'm stunned by how good POWER5
is. But I know that next year Montecito will go from 1 thread per
package to 4 threads per package. Itanium will be down to a 16P system
to compete with IBM's 16P system.

> Then he crows, "HP delivers dual core before Intel" as some kind of
> significant achievement. Well, maybe. Of course, Sun is delivering
> dual-core SPARC processors today, and IBM started delivering dual-core
> POWER4s nearly three years ago. So what beating Intel to the punch mostly
> proves is just how far behind the curve Itanic really is, I'd suggest.

And Itanium being behind the curve is a joint decision between intel and
HP, pushed by HP. If not for staffing levels on Itanium a few years
back and HP pushing to be the first to do the interesting dual-core
project, there would have been a dual-core Itanium 2 on the market last
year.

> Doubling
> current system performance by about a year from now actually sounds pretty
> impressive, until you recognize that Superdome's TPC-C performance today
> with 64 processors falls slightly behind today's previous-design-generation
> POWER4+ systems that use only half that number of processors and only
> slightly manages to beat today's POWER5 boxes that use only 1/4th as many
> processors.

As explained above, if you compare per thread, these machines are
equivalent in size (64P Madison, 32P * 2 cores POWER4+, 16P * 2 cores *
2 threads POWER5).

> When Montecito comes along late next year
> it will indeed close much of this gap with POWER5 (Terry's second
> TPC-C-specific performance graph suggests it should slightly exceed 2
> million tpmC), but POWER5 (a full process generation behind Montecito but
> still heading for about 3 million tpmC late *this* year) will no longer be
> IBM's top-of-the-line product by then, since POWER5+ (in the same process
> generation as Montecito) should then be shipping and upping the ante
> significantly.

I have not seen these graphs. Could you tell me what configuration
those X million tpmC results are for? 4P, 16P, 64P, 64 *thread*? How
are the estimates being made? I don't have a lot of TPC numbers, but I
know a 4-socket Madison today is 121K and a 4-socket POWER5 (yes, that's
16 threads) is 371K and Montecito is supposed to also be around 370K in
4-socket. It will be a tight race. If you could explain the
configurations, that would help me. If you could quote published
4-socket numbers for POWER4 and POWER4+, that would help me (I'm trying
to make a table).
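
For anyone building the same table, here is a rough sketch of how the 4-socket figures quoted above look under different normalizations. It assumes only what is stated in this thread (Madison: one core and one thread per socket; POWER5: two cores per chip and two threads per core). The dictionary layout and names are illustrative, and the Montecito figure is left out because it is only a projection.

```python
# 4-socket TPC-C figures as quoted in the post (tpmC); not re-verified here.
systems = {
    "Madison": {"tpmc": 121_000, "sockets": 4,
                "cores_per_socket": 1, "threads_per_core": 1},
    "POWER5": {"tpmc": 371_000, "sockets": 4,
               "cores_per_socket": 2, "threads_per_core": 2},
}


def normalize(s):
    """Return tpmC per socket, per core and per hardware thread."""
    cores = s["sockets"] * s["cores_per_socket"]
    threads = cores * s["threads_per_core"]
    return {
        "per_socket": s["tpmc"] / s["sockets"],
        "per_core": s["tpmc"] / cores,
        "per_thread": s["tpmc"] / threads,
    }


table = {name: normalize(s) for name, s in systems.items()}
for name, row in table.items():
    print(name, {k: round(v) for k, v in row.items()})
```

Which column is the fair one is the open question: per thread Madison comes out ahead, while per core or per socket POWER5 does.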

> And Fujitsu has regular enhancements to SPARC64
> coming along to keep pace with Itanic (though not POWER), regardless of what
> one may think of Sun's future efforts for that architecture.

> SPARC is dead, eh? Or 'no longer relevant', as a later slide says.

> Someone better tell Fujitsu so it will stop stomping all over the
> latest Itanics in commercial benchmarks like jbb2000: that's really
> not suitable behavior for a 'dead' processor. And by all means make
> sure those HP customers who are defecting to Sun know this: what on
> earth do you suppose they're thinking?!

Fujitsu is far ahead of Sun in performance, but they are far behind even
the laggard (intel) when it comes to features. They say dual-core at
end of '05, dual-core with 2 threads each sometime in '07. Compare that
to Montecito which is mid-'05 with dual-core and 2 threads per core.
Two years ahead. What Fujitsu *does* have that keeps pace, or even
stays ahead, is RAS. I don't have much hope for the SPARC family going
ahead. Niagara and Rock could either be a revolution or a flop, but I
think that if Sun sticks with SPARC-64, it will drag them to the bottom
of the ocean.

Itanium is a lousy performer in Java. That is because Java employs
self-modifying code and the Itanium spec explicitly states you can't do
that. It was shortsighted to put that in, but they hoped to escape the
IA32 complexities self-modifying code added to the design. It cost them
vast amounts of performance. It was analyzed and Montecito should have
much better jbb results as they've added new instructions and features
to directly speed up SMC. After all, intel targeted Sun when they
marketed Itanium. To leave Java performance in the shitter would be
marketing suicide.

> Well, given that 'about now' is upon us and I don't see any "Alpha/IA64
> hybrids" being benchmarked, 2007 seems at least a lot more credible. I
> guess my prediction of 2006 three years ago was slightly optimistic, but for
> a 5-year-out guesstimate I don't feel *that* ashamed of it.

Yeah, having "hybrid" designs now was always BS. It was from an
external guess with no information. Assuming the Alpha folks were
divided up and sent to each project being worked on in 2001, there might
be Alpha concepts coming out now, but intel kept the Alphans together
and gave them a new project of their own that wasn't on any roadmaps in
2001. Shannon just couldn't have known that, and spread his rosy,
idealized picture of the future. It was unprofessional to report on what
he'd like to see rather than what he knows, but it's common practice.

Alex

Nick Maclaren

Aug 27, 2004, 9:16:25 AM

In article <2p8nrtF...@uni-berlin.de>,

"Del Cecchi" <cecchi...@us.ibm.com> writes:
|>
|> What is THAT supposed to mean? I know you have had issues with software
|> etc, but I took his statement as a processor hardware. And I have seen no
|> reports of significant hardware problems with Power4, 4+, or 5. Or with
|> the G5 thing either, although I have seen reports of supply problems.

Well, I have posted evidence several times, and am pretty sure that
you responded to at least one of those postings. But I may be
misremembering. There are two aspects to this:

IBM has had significant problems with a good many components
(disks at least 3 times, the POWER4, the PSeries PCI, the various
Switches and so on), and was extremely effective at stopping
the information from reaching the press. That is good short-term
marketing. What is not good is when it leads customers to buy
what they think is going to be an excellent product (and have paid
premium prices for), only to find that the problems they then have
were known to IBM long before the sale.

Please note that I am NOT referring primarily to my personal
experience, though I am not denying that we have seen some of
that, but am referring to information that is widely known and I
have heard first-hand from some of the customers involved.

In the case of the POWER4, it was hyped as having world-shattering
memory performance, that would remain excellent under load. Oops.
It has AT LEAST the following problems, which I have near-certain
knowledge were known by IBM (and being worked on, unsuccessfully)
for a long time before they were admitted to customers:

The latency and even bandwidth degrades badly as the number of
CPUs in use for memory-bound threads increases, whether or not
there is any memory overlap. At one stage, a 32-CPU Regatta had
fractionally over twice the aggregate memory bandwidth of an 8-CPU
655, as measured by John McCalpin. I can witness from personal
tests that this problem applies even within a 655.

    The claimed performance was more dependent on the use of 16 MB
pages than the claims implied, and in more ways than were expected.
The software to make use of this was late, and appeared in a form
where very few customers (even HPC ones) could make use of it. We
can't, and are typical in this respect. I can't tell you whether
this is purely the fault of AIX, or whether hardware constraints
are at least a partial cause.
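
To put a number on the first of those two points, here is a minimal sketch of the scaling efficiency implied by that measurement. It assumes exactly 2x the aggregate bandwidth for 4x the CPUs (the figure given is "fractionally over twice", so this slightly understates it); the function name is illustrative.

```python
def scaling_efficiency(cpus_small, cpus_large, bandwidth_ratio):
    """Fraction of ideal (linear) bandwidth scaling actually achieved."""
    ideal_ratio = cpus_large / cpus_small
    return bandwidth_ratio / ideal_ratio


# 8-CPU 655 vs 32-CPU Regatta; 2.0 is an assumed floor for the ratio.
eff = scaling_efficiency(8, 32, 2.0)
print(f"scaling efficiency: {eff:.0%}")
```

In other words, each CPU in the larger box sees roughly half the memory bandwidth it would get in the smaller one.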

Regards,
Nick Maclaren.

Del Cecchi

Aug 27, 2004, 9:44:49 AM

"Nick Maclaren" <nm...@cus.cam.ac.uk> wrote in message

news:cgnc79$2oi$1...@pegasus.csx.cam.ac.uk...

The memory bandwidth thing seems to be the only one on your list that
relates to processor "readiness for prime time" or whatever shannon said.
Switches, Disks, PCI etc. are hardly evidence supporting Shannon's snarky
remarks about the Power5 processor, as I'm sure you realize.

IBM is perfectly capable of releasing hardware and software that turn out to
not work too well, especially at first. It is something that goes back at
least to 360 days, before even my time. But to pick some examples of when
they did so and use them to cast aspersions on Power5 does not become you,
Nick.

del cecchi

Paul Repacholi

Aug 27, 2004, 9:35:06 AM

Stefan Monnier <mon...@iro.umontreal.ca> writes:

The EV6 was significantly delayed due to funding being chopped and
having to remove/not do development and debugging extras for the chip.
Net result was that when early chips dropped their guts, you had 2 choices:
reset and read out the Jtag, nothing interesting here, move on, or
suck out a pile of bits and wonder which of them are live registers,
where the PC is, or thinks it is...

That delayed the 264, and also delayed the EV7, as it was to use the
EV6 core as a drop-in part, sort of, plus delayed people moving onto
the EV8. Well done curly!

The EV8 was not far from 1st tape out when the axe fell I was told.

--
Paul Repacholi 1 Crescent Rd.,
+61 (08) 9257-1001 Kalamunda.
West Australia 6076
comp.os.vms,- The Older, Grumpier Slashdot
Raw, Cooked or Well-done, it's all half baked.
EPIC, The Architecture of the future, always has been, always will be.

Nick Maclaren

Aug 27, 2004, 10:15:23 AM

In article <2p8s6jF...@uni-berlin.de>,

"Del Cecchi" <cecchi...@us.ibm.com> writes:
|>
|> The memory bandwidth thing seems to be the only one on your list that
|> relates to processor "readyness for prime time" or whatever shannon said.
|> Switches, Disks, PCI etc. are hardly evidence supporting Shannon's snarky
|> remarks about the Power5 processor, as I'm sure you realize.

I never said that they were. They are, however, good evidence of
IBM's recent marketing approach and skills.

|> IBM is perfectly capable of releasing hardware and software that turn out to
|> not work too well, especially at first. It is something that goes back at
|> least to 360 days, before even my time. But to pick some examples of when
|> they did so and use them to cast aspersions on Power5 is not become you
|> Nick.

That is not my point. The first sentence is true of every other
vendor I have heard of.

The point is that IBM used to be more scrupulous about ensuring
that customers were properly informed. In recent years, IBM has
frequently been very cavalier with the truth, and many customers
have felt misled. There is considerable evidence that this was
not simply a matter of a few salesmen, but is at least a corporate
malaise.

Furthermore, IBM has been very successful at ensuring that the
facts did not reach the press, or even customers, which caused the
customer problems I referred to. The only information I saw on
the POWER4 memory problems at the same stage in its cycle as the
POWER5 is now was a couple of remarks, very similar to Shannon's,
by people who were promoting other systems.

And, as at present, the first people who even hinted that all
was not well with the POWER4 were flamed for it. It was only
LATER that we learnt they were right.

We were late on the POWER4 scene but, from talking to people who
used it early on, IBM and its supporters were spreading misleading
and (in some of the latter cases) even false information about its
memory system for at least 6 months after the problems were known.
I do not know when they were known to be insoluble.

Given the above, I neither believe nor disbelieve Shannon (nor
you, for that matter), and await some independent evidence. The
POWER5 may have resolved all of the major problems of the POWER4
and introduced no new ones, or it may not have. I don't know.

Regards,
Nick Maclaren.

Robert Myers

Aug 27, 2004, 11:15:43 AM

Del Cecchi wrote:

<snip>

> What is THAT supposed to mean? I know you have had issues with software
> etc, but I took his statement as a processor hardware. And I have seen no
> reports of significant hardware problems with Power4, 4+, or 5. Or with
> the G5 thing either, although I have seen reports of supply problems.
>

Maybe not the best place to ask it, but what does it mean when IBM sells
a high profile installation to the army based on Opteron:

http://www.nwfusion.com/news/2004/0803ibmtap.html

IBM presumably wants to sell the best processor for the job, and the
best processor for this particular job, apparently, isn't Power.

Possible answers:

1. The infrastructure for Opteron is more mature for this application
than the infrastructure for Power.

2. The customer had a strong preference for AMD.

3. Power is slotted to sell at a higher price point and IBM doesn't want
to erode the margins on its top-billed processor just to win a
competitive procurement.

4. IBM needs the higher price on Power to maintain decent margin because
it is more expensive to make and/or IBM can't make Power in sufficient
quantities at an acceptable cost (cost, yield, and volume being related
in complicated ways).

Shannon's claim with respect to Alpha is that it was too expensive to
make (DEC was losing $800 on every Alpha, or some such) and that HPaq
needed a "value" proposition. Whatever the virtues of Power relative to
Itanium, it clearly is not aiming at a value proposition, and no matter
how I read the Aberdeen procurement, it confirms that IBM isn't aiming
at a value proposition for Power, but will be perfectly happy to sell
Opteron into that market, instead.

None of that translates into Power 5 not being "ready for prime time,"
but it suggests a more complicated landscape than Power wins, Itanium
loses, or vice versa. IBM could well be in a position to sell Power 5
into a high-price, high-margin slot and never in a position to sell it,
or any derivative, into a high-volume "value proposition"--witness the
reported problems with G5.

I haven't been very articulate in asking it, but I'm wondering if the
same forces that kept Alpha in a niche and eventually sidelined it might
not do the same things to Power. If it seems like I'm casting
aspersions on IBM or IBM Microelectronics in the process, that reflects
on the limitations in my skill with language, not on my intent.

RM

Bill Todd

Aug 27, 2004, 11:45:58 AM

"Robert Myers" <rmyer...@comcast.net> wrote in message
news:zeIXc.100391$TI1.43430@attbi_s52...

...

> Shannon's claim with respect to Alpha is that it was too expensive to
> make (DEC was losing $800 on every Alpha, or some such) and that HPaq
> needed a "value" proposition.

And you would be wise not to base any attempted logical extrapolations on
this 'claim', because to the degree to which it is valid at all it is a
purely paper 'loss'.

Alpha development cost about $150 million annually. Alpha system profits
ran well over $1 billion annually. Alpha manufacturing volume has been
claimed by people who probably know more than I do about such things as
being about 100K annually (though I've also seen figures as high as 300K).

Even if production costs were such that each Alpha indeed was 'sold' to
manufacturing at an $800 paper loss (and I'm afraid that given Terry's other
gross misstatements I don't entirely trust him on this), the $1 billion or
more annual net system profit left after the above $150 million in annual
development cost was subtracted was sufficient to push this $80 million
paper loss almost down into the 'noise' category.
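
A quick sanity check of that arithmetic, as a sketch only: every input is one of the rough figures quoted in this post, the $1 billion is treated as a floor, and the names are illustrative.

```python
# Figures are the rough ones quoted in this post, not audited numbers.
LOSS_PER_CHIP = 800                 # claimed paper loss per Alpha, USD
NET_SYSTEM_PROFIT = 1_000_000_000   # "$1 billion or more", used as a floor


def paper_loss(units_per_year):
    """Total annual paper loss at the claimed per-chip figure."""
    return units_per_year * LOSS_PER_CHIP


for units in (100_000, 300_000):    # low and high volume estimates
    loss = paper_loss(units)
    share = loss / NET_SYSTEM_PROFIT
    print(f"{units:>7,} units: ${loss:,} paper loss, "
          f"{share:.0%} of net profit")
```

The $80 million figure corresponds to the 100K volume estimate; the 300K estimate would triple it, which is still well short of the quoted profit.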

It's not as if DECpaq sold Alphas primarily in boxes or trays, after all:
they all went into reasonably high-margin systems built by DECpaq, and the
overall result was that Alpha was a *very* profitable business and could
have been even more so if actual marketing had been used to reassure
customers about its owner's long-term commitment and thus drive up volumes.

- bill

Robert Myers

Aug 27, 2004, 12:09:42 PM

Bill Todd wrote:

<snip>

>
> It's not as if DECpaq sold Alphas primarily in boxes or trays, after all:
> they all went into reasonably high-margin systems built by DECpaq, and the
> overall result was that Alpha was a *very* profitable business and could
> have been even more so if actual marketing had been used to reassure
> customers about its owner's long-term commitment and thus drive up volumes.
>

I think we've had a version of this exchange...at least once. You
_have_ suggested in the past that DECpaq might have gone into the
merchant chip business, but I won't hold you to that, if that's not the
position you want to take at the moment.

A processor like Power (or Alpha) might make perfectly good sense as the
essential ingredient for a high-margin line of big iron and not
otherwise. If that's the case, then the margins of the big iron have to
subsidize the costs of processor development and manufacturing. That
means you can sell 32-processor (or whatever) SMP Power boxes to
commercial clients but not multi-thousand processor clusters to the Army.
I infer that to be the current situation with Power, and I infer it to
have been the situation with Alpha.

Whether a processor so situated can long endure is another matter. The
overwhelming lesson of history at one point seems to have been: if it
can't survive as a commodity, it can't survive. The stated logic of
IBM's accounting for profit by division would seem to support that
logic, but we all know that there is a great deal of room for creativity
in such accountings.

What will actually happen, I would not venture to guess. That IBM has
been able to nurture its beleaguered big iron business so successfully
while purportedly turning itself into a services business is a tribute
to their management and marketing savvy, not to mention to their
tenacity. DEC, at the critical moment, seems to have run short of
everything but tenacity, not to say denial. How history might have been
different is rarely an empty exercise, but it is, in the end, history.

As to my relying on anything by Terry Shannon, I don't know him at all.
The cited document is puzzling in all kinds of ways, not the least of
which is that I can't imagine who its target audience is intended to be,
or who would be persuaded by it. That doesn't stop me from
trying to parse what I imagine the real point of the story might be.

RM

Eric Gouriou

Aug 27, 2004, 12:25:39 PM

Alex Johnson wrote:
[...]

> Itanium is a lousy performer in Java. That is because Java employs
> self-modifying code and the Itanium spec explicitly states you can't do
> that.

 The last few times I looked, HP's Itanium JVM (derived from Sun's Hotspot)
held the top SpecJBB numbers, and not just at one price point but
for every hardware level (4 way, 16 way, 64 way). I must admit I haven't
looked in a couple of months though.

In the last results I saw, a 64 way Superdome had better numbers than
the top Sun platform (106 way ?).

The Itanium architecture has no problem with self-modifying code.
One just has to be explicit about it (sync.i). It's trivial for a JIT.
Bundle-sized atomic writes (documented in the publicly available
architecture documents) will provide further help once available.

I won't comment on the rest of your post.

Eric

Nick Maclaren

Aug 27, 2004, 12:34:49 PM

In article <412F6083...@hp.com>,

Eric Gouriou <eric.g...@hp.com> writes:
|>
|> In the last results I saw, a 64 way Superdome had better numbers than
|> the top Sun platform (106 way ?).

That's the F15K with all possible MaxCats. 106 CPUs in a box is
a joke, and nobody has that many. We are seriously unusual, even
at 100. The UltraSPARC IIIcu never was the fastest CPU around,
and the strength of the F15K is that it maintains performance
under parallel, memory-limited loading. My guess is that benchmark
is heavily CPU-limited.

There is now the dual-core F25K, which can have up to 144 CPUs
in a box. That might be faster. But no current SPARCs are known
for blazing single-CPU performance.

Regards,
Nick Maclaren.

Del Cecchi

Aug 27, 2004, 12:41:01 PM

"Robert Myers" <rmyer...@comcast.net> wrote in message
news:zeIXc.100391$TI1.43430@attbi_s52...

> Del Cecchi wrote:
>
> <snip>
>
> > What is THAT supposed to mean? I know you have had issues with software
> > etc, but I took his statement as a processor hardware. And I have seen
> > no reports of significant hardware problems with Power4, 4+, or 5. Or
> > with the G5 thing either, although I have seen reports of supply problems.
> >
>
> Maybe not the best place to ask it, but what does it mean when IBM sells
> a high profile installation to the army based on Opteron:
>
> http://www.nwfusion.com/news/2004/0803ibmtap.html
>
> IBM presumably wants to sell the best processor for the job, and the
> best processor for this particular job, apparently, isn't Power.

Actually IBM wants to sell whatever the customer wants to buy.

>
> Possible answers:
>
> 1. The infrastructure for Opteron is more mature for this application
> than the infrastructure for Power.
>
> 2. The customer had a strong preference for AMD.

My guess is that the RFP or whatever the government calls it specified AMD.
But I don't know that, and I don't even know where to look to find out.
Must be a public record. Was it bid or sole source? Don't know that
either.

>
> 3. Power is slotted to sell at a higher price point and IBM doesn't want
> to erode the margins on its top-billed processor just to win a
> competitive procurement.

Could be, if it was that kind of procurement. The P series boxes are
certainly more expensive than the 325s.

>
> 4. IBM needs the higher price on Power to maintain decent margin because
> it is more expensive to make and/or IBM can't make Power in sufficient
> quantities at an acceptable cost (cost, yield, and volume being related
> in complicated ways).

Power4/+/5 clearly was targeted at a different kind of system, and is not
as cheap as Opteron. G5 is more in the same league.

>
> Shannon's claim with respect to Alpha is that it was too expensive to
> make (DEC was losing $800 on every Alpha, or some such) and that HPaq
> needed a "value" proposition. Whatever the virtues of Power relative to
> Itanium, it clearly is not aiming at a value propostion, and no matter
> how I read the Aberdeen procurement, it confirms that IBM isn't aiming
> at a value proposition for Power, but will be perfectly happy to sell
> Opteron into that market, instead.

I wouldn't say "instead", I would say "as well". I like to think we have
gotten beyond dictating to customers.

>
> None of that translates into Power 5 not being "ready for prime time,"
> but it suggests a more complicated landscape than Power wins, Itanium
> loses, or vice versa. IBM could well be in a position to sell Power 5
> into a high-price, high-margin slot and never in a position to sell it,
> or any derivative, into a high-volume "value proposition"--witness the
> reported problems with G5.

I just took issue with the dismissive comments implying that the Power5
(along with everyone else) does not constitute a plausible alternative to
Itanium.

>
> I haven't been very articulate in asking it, but I'm wondering if the
> same forces that kept Alpha in a niche and eventually sidelined it might
> not do the same things to Power. If it seems like I'm casting
> aspersions on IBM or IBM Microelectronics in the process, that reflects
> on the limitations in my skill with language, not on my intent.
>
> RM

Not you. Hard to say about economic factors. That's another issue.

del
>

Bill Todd

Aug 27, 2004, 1:15:04 PM

"Alex Johnson" <comp...@jhu.edu> wrote in message

news:cgnb35$673$1...@news01.intel.com...

> Bill Todd wrote:
>
> > In 'Beyond Superdome", he first waxes poetic about current Superdome
> > capabilities, such as their internal interconnect fabric. Let's see:
> > this is the server architecture (at least somewhat reminiscent of the
> > old and rather mediocre GS320 server architecture) that using 64
> > top-of-the-line Itanics barely manages to stay ahead of the new POWER5
> > box that requires only 16 processors (on a grand total of 8 chips,
> > since they're dual-core) in TPC-C, right?
>
> I believe you have misinterpretted the "16 processor" POWER5.

You believe incorrectly.

> IBM actually refers to chips.

What IBM may or may not 'refer' to (and they're not always very consistent
in this) is not what matters in this case. What matters is how TPC-C counts
processors - and TPC-C counts cores as processors.

> "16 processor" as reported is 16 POWER5
> chips, comprised of 32 cores, allowing 64 threads of execution.

Incorrect. It is 16 cores, on 8 chips, allowing 32 threads of execution.

> So the 64-thread Madison vs the 64-thread POWER5 having similar
> performance is just a sign that things are about equal.

Absolute rubbish. The POWER5 system uses 1/4 as many cores, on 1/8 as many
chips, to achieve 80% of the result. Equating dual-thread SMT to twice as
many cores is the kind of nonsense even someone like Terry probably would
not try to pull off: at *most*, it likely improves the TPC-C throughput of
each core by about 30%, which still leaves each 'raw' (non-SMT) POWER5 core
pumping *well* over twice the TPC-C throughput of each Itanic core (hardly
surprising, since even the 32-core non-SMT POWER4+ system marginally beat
out the 64-processor Superdome in TPC-C).
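
For what it's worth, the per-core ratio behind that claim can be sketched as follows. The inputs are the figures in this post (80% of the result on 16 cores versus 64, and an assumed SMT gain of at most 30%); the function and its parameter names are illustrative.

```python
def per_core_ratio(result_fraction, core_fraction, smt_gain=0.0):
    """Per-core throughput of system A relative to system B.

    result_fraction: A's benchmark result as a fraction of B's
    core_fraction:   A's core count as a fraction of B's
    smt_gain:        assumed fractional throughput gain from SMT on A
    """
    return result_fraction / core_fraction / (1.0 + smt_gain)


with_smt = per_core_ratio(0.80, 16 / 64)
without_smt = per_core_ratio(0.80, 16 / 64, smt_gain=0.30)
print(f"per core, SMT on:  {with_smt:.2f}x")
print(f"per core, SMT off: {without_smt:.2f}x")
```

Dividing out the full 30% is the most generous case for Itanium; a smaller SMT gain pushes the raw per-core ratio higher still.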

> I'm stunned by how good POWER5 is. But I know that next year Montecito
> will go from 1 thread per package to 4 threads per package. Itanium will
> be down to a 16P system to compete with IBM's 16P system.

If you consider that having less than half the TPC-C performance of the
POWER5 system with an equal number of cores qualifies as 'competing with'
it, perhaps.

>
> > Then he crows, "HP delivers dual core before Intel" as some kind of
> > significant achievement. Well, maybe. Of course, Sun is delivering
> > dual-core SPARC processors today, and IBM started delivering dual-core
> > POWER4s nearly three years ago. So what beating Intel to the punch
> > mostly proves is just how far behind the curve Itanic really is, I'd
> > suggest.
>
> And Itanium being behind the curve is a joint decision between intel and
> HP, pushed by HP. If not for staffing levels on Itanium a few years
> back and HP pushing to be the first to do the interesting dual-core
> project, there would have been a dual-core Itanium 2 on the market last
> year.

More bullshit.

Adding staffing to Itanic wouldn't have speeded up the applicable process
technology significantly, so any dual-core Itanic released last year would
still have been in 130 nm. IIRC the current Madison core occupies about 43%
of the space on a 376 mm^2 chip: doubling it would have left no room for
the gargantuan caches that Itanic requires to achieve competitive
performance, not to mention creating a 200W chip to have to cool.
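
The die-area part of that argument works out roughly as below. This is a sketch that takes the recalled ("IIRC") figures at face value and assumes the die size and process stay the same; the names are illustrative.

```python
DIE_AREA_MM2 = 376.0   # Madison die size as recalled in the post
CORE_FRACTION = 0.43   # share of the die taken by one core, as recalled

core_area = DIE_AREA_MM2 * CORE_FRACTION
left_single = DIE_AREA_MM2 - core_area       # non-core area today
left_dual = DIE_AREA_MM2 - 2 * core_area     # non-core area with two cores

print(f"one core:             {core_area:.1f} mm^2")
print(f"left with one core:   {left_single:.1f} mm^2")
print(f"left with two cores:  {left_dual:.1f} mm^2")
```

On those numbers a second core would cut the area available for cache and everything else to about a quarter of what it is now.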

>
> > Doubling
> > current system performance by about a year from now actually sounds
> > pretty impressive, until you recognize that Superdome's TPC-C
> > performance today with 64 processors falls slightly behind today's
> > previous-design-generation POWER4+ systems that use only half that
> > number of processors and only slightly manages to beat today's POWER5
> > boxes that use only 1/4th as many processors.
>
> As explained above, if you compare per thread, these machines are
> equivalent in size (64P Madison, 32P * 2 cores POWER4+, 16P * 2 cores *
> 2 threads POWER5).

As explained above, that explanation is even more chock-full of shit than
Terry's tend to be.

>
> > When Montecito comes along late next year
> > it will indeed close much of this gap with POWER5 (Terry's second
> > TPC-C-specific performance graph suggests it should slightly exceed 2
> > million tpmC), but POWER5 (a full process generation behind Montecito but
> > still heading for about 3 million tpmC late *this* year) will no longer be
> > IBM's top-of-the-line product by then, since POWER5+ (in the same process
> > generation as Montecito) should then be shipping and upping the ante
> > significantly.
>
> I have not seen these graphs. Could you tell me what configuration
> those X million tpmC results are for? 4P, 16P, 64P, 64 *thread*. How
> are the estimates being made. I don't have a lot of TPC numbers, but I
> know a 4-socket Madison today is 121K and a 4-socket POWER5 (yes, that's
> 16 threads) is 371K and Montecito is supposed to also be around 370K in
> 4-socket. It will be a tight race. If you could explain the
> configurations, that would help me. If you could quote published
> 4-socket numbers for POWER4 and POWER4+, that would help me (I'm trying
> to make a table).

Why don't you try getting a clue what you're talking about first? Learning
something about what SMT is and is not would be a good start. Then try
getting some *quantitative* idea about how much the different SMT
implementations you're so casually throwing together add to the performance
of the core they're associated with.

>
> > And Fujitsu has regular enhancements to SPARC64
> > coming along to keep pace with Itanic (though not POWER), regardless of what
> > one may think of Sun's future efforts for that architecture.
>
> > SPARC is dead, eh? Or 'no longer relevant', as a later slide says.
> > Someone better tell Fujitsu so it will stop stomping all over the
> > latest Itanics in commercial benchmarks like jbb2000: that's really
> > not suitable behavior for a 'dead' processor. And by all means make
> > sure those HP customers who are defecting to Sun know this: what on
> > earth do you suppose they're thinking?!
>
> Fujitsu is far ahead of Sun in performance, but they are far behind even
> the laggard (intel) when it comes to features.

Except that they seem to be trouncing Intel in jbb2000, and seem likely to
do very well in other commercial benchmarks given Fujitsu's experience in
large-system design. Funny about that.

> They say dual-core at
> end of '05, dual-core with 2 threads each sometime in '07. Compare that
> to Montecito which is mid-'05 with dual-core and 2 threads per core.
> Two years ahead.

Or zero years ahead, depending upon how useful Montecito's relatively crude
two-way SMT turns out to be. But even if SPARC64 falls slightly behind
Itanic in performance (a fact decidedly not yet in evidence) it probably
won't hurt Sun: it will still be far more relatively competitive in
performance than any Sun SPARC has been in recent memory, so should if
anything improve Sun's position.

- bill

Alex Johnson

Aug 27, 2004, 1:15:18 PM

Paul Repacholi wrote:
> That delayed the 264, and also delayed the EV7, as it was to use the
> EV6 core as a drop in part, sort of, plus delayed people moving onto
> the EV8. Well done curly!
>
> The EV8 was not far from 1st tape out when the axe fell I was told.

I have to correct myself. I was not accurate when I guessed a 2006 EV8
release to the public. From the horse's mouth: "end of this year or
early 2005." So the Alpha with 4 threads would have been competing
against the POWER with 4 threads and in 6-9 months the Pentium 4 with 4
threads and the Itanium with 4 threads. All is right in the world
again, as every company stays relatively abreast of its competitors'
thread count.

...except SPARC64, which will have 4 threads in 2007, from news earlier
this year.

Jouni Osmala

Aug 27, 2004, 2:54:48 PM

> >>You can tell by the low prices used alpha systems
> >>fetch on e-bay that the alpha is pretty much history.
> >
> >
> > Pity I live in Canada and not the United States. It'll be awkward for me
> > to take advantage of one of those bargains - I'll have to wait till
> > someone in Canada wants to get rid of his Alpha.
> >
> > John Savard
> > http://home.ecn.ab.ca/~jsavard/index.html
>
> 1.) The Alpha servers/workstations available on E-Bay are
> seldom with processors faster than 233 or 266 MHz.
> In other words, only the really ancient stuff is being
> sold on E-Bay - so it's no surprise that the prices are
> low.
>
> Components for upgrading more modern Alpha servers, such
> as 1 and 2 GB Memory upgrades for Alpha servers, by contrast
> are selling for big bucks. People are willing to pay big
> premiums to keep their Alpha servers alive and well -
> hardly a sign that the Alpha is history.
>
> 2.) Check the "for sale" newsgroups for your province or city.
> Even in Saskatchewan (sk.forsale) we occasionally get local
> sales of Alpha systems comparable to what is available on
> E-Bay. However, those too are 233 and 266 MHz systems
> almost all the time.
>
> 3.) What's wrong with using E-Bay but just limiting your search
> to Canadian sellers ?

Still, there WERE some Alphas available some time ago, and I missed
my opportunity to get one. [I remember that in high school I was an
intern at a certain city hall, where there were workstations with
96 KB of on-die L2 cache running at over 500 MHz, at a time when I
was using a brand-new PPro. Anyway, they went into the dumpster when
Compaq decided to kill Alpha, and PCs replaced them. I would have
loved to get that piece of hardware as an upgrade from my 366 MHz
Celeron at the time, since I've been using Linux for quite a while.]
Nowadays people put things on eBay, but if a shop that uses Alpha is
migrating away, there is a chance of getting a medium-age machine
cheaply enough.

Toon Moene

Aug 27, 2004, 3:15:06 PM

Alex Johnson wrote:

> Bill Todd wrote:
>
>> Of course, if Alpha hadn't been killed three years ago, the new Itanics
>> would have been competing (though the performance gap would have made that
>> word somewhat laughable) against EV8,
>

> This is untrue. The EV8 was reported as having vastly better
> performance than Itanium or Pentium 4, and I was eagerly waiting for it
> (almost drooling). But one thing you leave out is the Alpha team's
> tendency to keep projects going long after their planned release dates.

The problem with *your* analysis is that you view the Alpha demise in
isolation.

Alpha development was drained of resources long before the axe fell.

--
Toon Moene - e-mail: to...@moene.indiv.nluug.nl - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Maintainer, GNU Fortran 77: http://gcc.gnu.org/onlinedocs/g77_news.html
A maintainer of GNU Fortran 95: http://gcc.gnu.org/fortran/

Thu

Aug 27, 2004, 10:34:16 PM

Alex Johnson <comp...@jhu.edu> wrote in message news:<cgnb35$673$1...@news01.intel.com>...

> Bill Todd wrote:
>
> > In 'Beyond Superdome', he first waxes poetic about current Superdome
> > capabilities, such as their internal interconnect fabric. Let's see: this
> > is the server architecture (at least somewhat reminiscent of the old and
> > rather mediocre GS320 server architecture) that using 64 top-of-the-line
> > Itanics barely manages to stay ahead of the new POWER5 box that requires
> > only 16 processors (on a grand total of 8 chips, since they're dual-core) in
> > TPC-C, right?
>
> I believe you have misinterpreted the "16 processor" POWER5. IBM
> actually refers to chips. "16 processor" as reported is 16 POWER5
> chips, comprised of 32 cores, allowing 64 threads of execution. So the
> 64-thread Madison vs the 64-thread POWER5 having similar performance is
> just a sign that things are about equal. I'm stunned by how good POWER5
> is. But I know that next year Montecito will go from 1 thread per
> package to 4 threads per package. Itanium will be down to a 16P system
> to compete with IBM's 16P system.

No, you are incorrect.

IBM always uses "16 processors" to mean the number of cores. So a
maximum p570 is 8 dual-core Power5 chips: 16 cores and, with SMT, 32
threads.

On a per-chip basis, then, Power5 has > 6X the TPC-C performance of
the Itanium chips in the HP Superdome.

Here's their spec submission for a fully loaded p570:
http://www.spec.org/cpu2000/results/res2004q3/cpu2000-20040712-03234.html

>
> > Then he crows, "HP delivers dual core before Intel" as some kind of
> > significant achievement. Well, maybe. Of course, Sun is delivering
> > dual-core SPARC processors today, and IBM started delivering dual-core
> > POWER4s nearly three years ago. So what beating Intel to the punch mostly
> > proves is just how far behind the curve Itanic really is, I'd suggest.
>
> And Itanium being behind the curve is a joint decision between intel and
> HP, pushed by HP. If not for staffing levels on Itanium a few years
> back and HP pushing to be the first to do the interesting dual-core
> project, there would have been a dual-core Itanium 2 on the market last
> year.
>
> > Doubling
> > current system performance by about a year from now actually sounds pretty
> > impressive, until you recognize that Superdome's TPC-C performance today
> > with 64 processors falls slightly behind today's previous-design-generation
> > POWER4+ systems that use only half that number of processors and only
> > slightly manages to beat today's POWER5 boxes that use only 1/4th as many
> > processors.
>
> As explained above, if you compare per thread, these machines are
> equivalent in size (64P Madison, 32P * 2 cores POWER4+, 16P * 2 cores *
> 2 threads POWER5).

Nope, wrong again. Power4+ is 16 chips * 2 cores. Power4+ is 4X better
per chip than Itanium in TPC-C.

>
> > When Montecito comes along late next year
> > it will indeed close much of this gap with POWER5 (Terry's second
> > TPC-C-specific performance graph suggests it should slightly exceed 2
> > million tpmC), but POWER5 (a full process generation behind Montecito but
> > still heading for about 3 million tpmC late *this* year) will no longer be
> > IBM's top-of-the-line product by then, since POWER5+ (in the same process
> > generation as Montecito) should then be shipping and upping the ante
> > significantly.
>
> I have not seen these graphs. Could you tell me what configuration
> those X million tpmC results are for? 4P, 16P, 64P, 64 *thread*. How
> are the estimates being made. I don't have a lot of TPC numbers, but I
> know a 4-socket Madison today is 121K and a 4-socket POWER5 (yes, that's
> 16 threads) is 371K and Montecito is supposed to also be around 370K in
> 4-socket. It will be a tight race. If you could explain the
> configurations, that would help me. If you could quote published
> 4-socket numbers for POWER4 and POWER4+, that would help me (I'm trying
> to make a table).

IBM has not published any smaller-configuration numbers for Power4+.

Here are the best non-clustered results in terms of performance for
Itanium and Power5/Power4+, grouped by core count.

4 cores
IBM eServer p5 570 4P - Oracle 10G - 194,391
HP Integrity rx5670 Linux - Oracle 10G - 136,110

8 Cores
IBM eServer p5 570 8P - Oracle 10G - 371,044
Bull NovaScale 5080 - C/S SQL Server 2000 - 175,366

16 cores
IBM eServer p5 570 16P - IBM DB2 UDB 8.1 - 809,144
Unisys ES7000 Aries 420 Enterprise Server - SQL Server 2000 - 309,036

32 cores
IBM eServer pSeries 690 - IBM DB2 UDB 8.1 - 1,025,486
NEC Express5800/1320Xd - Oracle 10G - 683,575

64 cores
HP Integrity Superdome - Oracle 10G - 1,008,144

All itaniums used 1.5ghz chips
power5s were 1.9Ghz
the p690 (power4+) was 1.9ghz
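The per-chip and per-core ratios can be checked with a few lines of Python. The tpmC figures are the ones listed above; the chip counts are the assumptions stated in this thread (dual-core Power5 and Power4+ chips, single-core Itanium 2 chips), and the dictionary keys are just labels made up for this sketch.

```python
# tpmC results as listed above; chip counts assume dual-core Power5/Power4+
# chips and single-core Itanium 2 (Madison) chips.
results = {
    "p5_570_power5": {"tpmc": 809_144, "cores": 16, "chips": 8},
    "p690_power4plus": {"tpmc": 1_025_486, "cores": 32, "chips": 16},
    "superdome_itanium2": {"tpmc": 1_008_144, "cores": 64, "chips": 64},
}


def per_unit(name, unit):
    """tpmC divided by the number of cores or chips in the named system."""
    entry = results[name]
    return entry["tpmc"] / entry[unit]


baseline = per_unit("superdome_itanium2", "chips")
per_chip_ratio = {name: per_unit(name, "chips") / baseline for name in results}

for name in results:
    print(f"{name}: {per_unit(name, 'cores'):,.0f} tpmC/core, "
          f"{per_chip_ratio[name]:.1f}x Superdome per chip")
```

On these inputs the per-chip ratio comes out a little above 6 for the p5 570 and a little above 4 for the p690, consistent with the "> 6X" and "4X" figures quoted earlier in this post.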

The 64 way (32 Chips * 2 cores) Power5 based p5 590 is due to be
announced in the next 2 months!

Another thing to note is that Power5 performs better with DB2 than with
Oracle 10G. If IBM had submitted 4-way and 8-way TPC-C results using
DB2, the performance would likely be ~220K and ~420K respectively.

A few things to note:
Montecito's dual-thread implementation is not SMT; it is the much
simpler HMT. The performance increase expected from it is much less
than from SMT.

Itanium's Java performance is not that lousy. It is approximately equal
to Power4+ and a bit below Sparc64V. Way behind Power5, though.

I'll do another comparison using specjbb2000

8 core
IBM eServer p5 570 1.9Ghz - 328996
Fujitsu PRIMEPOWER650 1.89Ghz- 213956
HP Integrity rx7620 Server - 190393

16 core
IBM eServer p5 570 1.9Ghz - 633106
Fujitsu PRIMEPOWER900 1.89Ghz - 402961
HP Integrity rx8620 Server 1.5Ghz - 341098

32 core
Fujitsu PRIMEPOWER1500 1.89Ghz - 663133
NX7700 i9510 1.5Ghz - 580536
IBM eServer pSeries 690 Turbo 1.7Ghz - 553480

48 core
Sun Fire E6900 1.2Ghz - 421773

64 core
HP Integrity Superdome server 1.5Ghz - 1008604
Fujitsu PRIMEPOWER2500 1.3Ghz - 835479

112 core
Fujitsu PRIMEPOWER2500 1.3Ghz - 1420177

Note that the Power4+ results didn't use the fastest processor (1.9 GHz).
The PRIMEPOWER 2500 is shipping (or about to) with 1.82 GHz Sparc64V.
No official results are in for that config yet.

Of course, Itaniums at 1.7 GHz / 9M are due very soon as well.

As you can see, in specjbb2000 the lead of Power5 over Itanium is not as
great as its lead in TPC-C.
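To make that last comparison concrete, here is a small sketch that computes the Power5-to-Itanium ratio at 8 and 16 cores for both benchmarks, using only the numbers listed in this post. Note the caveat that the TPC-C entries use different databases and vendors, so the ratios are indicative at best.

```python
# (Power5 score, best Itanium 2 score) at equal core counts, from the lists above.
tpcc = {8: (371_044, 175_366), 16: (809_144, 309_036)}        # tpmC
specjbb = {8: (328_996, 190_393), 16: (633_106, 341_098)}     # SPECjbb2000 score


def power5_lead(table):
    """Map core count -> Power5 score divided by Itanium 2 score."""
    return {cores: p5 / ia64 for cores, (p5, ia64) in table.items()}


tpcc_lead = power5_lead(tpcc)
jbb_lead = power5_lead(specjbb)

for cores in sorted(tpcc):
    print(f"{cores} cores: TPC-C lead {tpcc_lead[cores]:.2f}x, "
          f"SPECjbb2000 lead {jbb_lead[cores]:.2f}x")
```

At both core counts the SPECjbb2000 ratio is below 2x, while the TPC-C ratio is above 2x.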

del cecchi

Aug 27, 2004, 10:47:27 PM

"Thu" <th...@iprimus.com.au> wrote in message
news:8341eb81.04082...@posting.google.com...

> Alex Johnson <comp...@jhu.edu> wrote in message
> news:<cgnb35$673$1...@news01.intel.com>...

snip.

>
> Here's their spec submission for a fully loaded p570:
> http://www.spec.org/cpu2000/results/res2004q3/cpu2000-20040712-03234.html
> >
> > > Then he crows, "HP delivers dual core before Intel" as some kind of
> > > significant achievement. Well, maybe. Of course, Sun is delivering
> > > dual-core SPARC processors today, and IBM started delivering dual-core
> > > POWER4s nearly three years ago. So what beating Intel to the punch mostly
> > > proves is just how far behind the curve Itanic really is, I'd suggest.
> >

snip

Sorry if I screwed up the attributions. The above may have been written
by Bill Todd. Not sure.

Anyway, finally crawled through the pitch. HP didn't do dual core.
What they did was redesign the package to allow two chips to fit in an
area where Intel packages one chip. Their "special relationship"
apparently lets them buy unpackaged die.

Terry has really slipped since he was writing about DEC.
I guess all his sources got laid off or sent to Intel.

del cecchi

Tony Hill

Aug 28, 2004, 3:00:04 AM

On Fri, 27 Aug 2004 16:25:39 GMT, Eric Gouriou <eric.g...@hp.com>
wrote:

>Alex Johnson wrote:
>[...]
>> Itanium is a lousy performer in Java. That is because Java employs
>> self-modifying code and the Itanium spec explicitly states you can't do
>> that.
>
> The last few times I looked, HP's Itanium JVM (derived from Sun's Hotspot)
>hold the top SpecJBB numbers, and not just at one price point but
>for every hardware level (4 way, 16 way, 64 way). I must admit I haven't
>looked in a couple of months though.

You've missed some recent results. HP's Itanium performance in
SPECjbb2000 isn't really what I would call "lousy", but it's
definitely NOT where Intel would like it to be I'm sure, particularly
if you look at 4-core systems (the most widely used config in this
benchmark). Here the fastest 5 chips are:

1. IBM Power5 - 170127
2. AMD Opteron - 133427
3. Intel XeonMP - 118031
4. Intel Itanium2 - 116466
5. IBM Power4 - 96377

The results for the Itanium are a bit dated and the most current
version of the hardware and software would probably move it up ahead
of the Xeon at least, but it seems unlikely that it could match the
Opteron's performance in this test, while the Power5 looks well out of
reach.

Also of note is that HP's PA-8800 processor is still turning in some
VERY respectable scores in this test considering how dated the design
is. There are no 4-core designs, but their 4-chip/8-core turns in a
result of 214932, finishing just ahead of Fujitsu's SPARC64V at 213956
and noticeably ahead of the top 8-processor Itanium result of 190393
(again, a slightly dated result but on current hardware).

Anyway, as it stands right now it looks like IBM's Power5 is the chip
to beat in this test (as it is in pretty much all benchmarks) on the
high-end while AMD's Opteron and Intel's XeonMP are almost certainly
the "bang-for-buck" leaders (again, pretty much the norm here).

-------------
Tony Hill
hilla <underscore> 20 <at> yahoo <dot> ca

Bill Todd

Aug 28, 2004, 3:03:19 AM

"del cecchi" <dcecchi...@att.net> wrote in message
news:2paa59F...@uni-berlin.de...

>
> "Thu" <th...@iprimus.com.au> wrote in message
> news:8341eb81.04082...@posting.google.com...
> > Alex Johnson <comp...@jhu.edu> wrote in message
> > news:<cgnb35$673$1...@news01.intel.com>...
> snip.
> >
> > Here's their spec submission for a fully loaded p570:
> > http://www.spec.org/cpu2000/results/res2004q3/cpu2000-20040712-03234.html
> > >
> > > > Then he crows, "HP delivers dual core before Intel" as some kind of
> > > > significant achievement. Well, maybe. Of course, Sun is delivering
> > > > dual-core SPARC processors today, and IBM started delivering dual-core
> > > > POWER4s nearly three years ago. So what beating Intel to the punch mostly
> > > > proves is just how far behind the curve Itanic really is, I'd suggest.
> > >
> snip
>
> sorry if I screwed up the attributions. The above may have been written
> by Bill Todd. Not sure.

The last part you quoted was.

>
> Anyway, finally crawled through the pitch. HP didn't do dual core.
> What they did was redesign the package to allow two chips to fit in an
> area where Intel packages one chip. Their "special relationship"
> apparently lets them buy unpackaged die.

Actually, my impression was that Terry was referring to the true dual-core
PA-RISC 8800 (which I think is currently shipping), not to the dual-chip
kludge HP created as an interim band aid for Itanic.

- bill

Casper H.S. Dik

Aug 28, 2004, 5:12:04 AM

nm...@cus.cam.ac.uk (Nick Maclaren) writes:

>That's the F15K with all possible MaxCats. 106 CPUs in a box is
>a joke, and nobody has that many. We are seriously unusual, even
>at 100. The UltraSPARC IIIcu never was the fastest CPU around,
>and the strength of the F15K is that it maintains performance
>under parallel, memory-limited loading. My guess is that benchmark
>is heavily CPU-limited.

The OP could have done a simple SPEC query to find out; the HP
result seems to have lost its top billing to a 112-way Fujitsu
PRIMEPOWER system (which seems to deliver about as many "SpecJBBs"
per CPU per MHz as the HP system).
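A quick sketch of that per-CPU-per-MHz comparison, using the SPECjbb2000 figures posted earlier in this thread (112-way PRIMEPOWER 2500 at 1.3 GHz, 64-way Superdome at 1.5 GHz). The metric itself is only a crude normalization, not anything SPEC defines.

```python
# SPECjbb2000 scores, CPU counts and clocks as posted earlier in the thread.
systems = {
    "primepower2500": {"score": 1_420_177, "cpus": 112, "mhz": 1300},
    "superdome": {"score": 1_008_604, "cpus": 64, "mhz": 1500},
}


def score_per_cpu_mhz(name):
    """Crude normalization: score divided by (CPU count * clock in MHz)."""
    s = systems[name]
    return s["score"] / (s["cpus"] * s["mhz"])


fujitsu = score_per_cpu_mhz("primepower2500")
hp = score_per_cpu_mhz("superdome")
print(f"Fujitsu: {fujitsu:.2f}, HP: {hp:.2f}, ratio: {fujitsu / hp:.2f}")
```

On these figures the two systems land within about 10% of each other, with the HP system slightly ahead.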

Casper
--
Expressed in this posting are my opinions. They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.

Nick Maclaren

Aug 28, 2004, 5:19:06 AM

In article <8341eb81.04082...@posting.google.com>,

Thu <th...@iprimus.com.au> wrote:
>
>On a per chip basis then, Power5 is > 6X the performance on TPC-C
>compared to the HP Superdome.

What I would like to know is how the various designs work on benchmarks
that are not embarrassingly parallel (in a memory access sense). The
tradeoff appears to be between single-threaded performance and
scalability (even for non-communicating codes), and you can't have
both. Now, I haven't tested the Altix or SuperDome myself, but I
can witness that most other designs seem to do one or the other well,
but never both.

This is why SGI and Sun systems do much better in practice than that
sort of benchmark would imply - and do NOT trail in the way that they
are said to in the press.

>Montecito's dual thread implementation is not SMT, it is the much
>simpler HMT. Performance increase expected from this is much less
>than SMT.

What on earth is that new TLA? Anyway, a simpler form of threading
is likely to be MORE efficient than Pentium 4 SMT, because the latter
is a technical failure (though a marketing success). Even Eggers'
model (MIPS-based) showed that there was a pretty marginal gain (and
might be none) over a simple CMT design.

Regards,
Nick Maclaren.

Steve Greatbanks

Aug 28, 2004, 8:58:28 AM

"Alex Johnson" <comp...@jhu.edu> wrote in message
news:cgnb35$673$1...@news01.intel.com...

> Bill Todd wrote:
>
>> In 'Beyond Superdome', he first waxes poetic about current Superdome
>> capabilities, such as their internal interconnect fabric. Let's see: this
>> is the server architecture (at least somewhat reminiscent of the old and
>> rather mediocre GS320 server architecture) that using 64 top-of-the-line
>> Itanics barely manages to stay ahead of the new POWER5 box that requires
>> only 16 processors (on a grand total of 8 chips, since they're dual-core) in
>> TPC-C, right?
>
> I believe you have misinterpreted the "16 processor" POWER5. IBM
> actually refers to chips. "16 processor" as reported is 16 POWER5 chips,
> comprised of 32 cores, allowing 64 threads of execution. So the 64-thread
> Madison vs the 64-thread POWER5 having similar performance is just a sign
> that things are about equal. I'm stunned by how good POWER5 is. But I
> know that next year Montecito will go from 1 thread per package to 4
> threads per package. Itanium will be down to a 16P system to compete with
> IBM's 16P system.

Bill is right. The p5 570, as benchmarked for TPC-C[1], has 4 "building
blocks", each of which is a 4-way machine. According to the relevant
redpaper[2], each of these building blocks has two processor slots, and
each of the processor cards contains a single DCM (dual chip module). The
DCM comprises a dual-core Power 5 and the off-chip L3 cache. That means
the 16-way box mentioned has 4 building blocks, each with two chips, each
with two cores (and each of the cores is SMT-capable), so the "16
processor" Power 5 box is really "16 cores on 8 chips".

[1] http://www.tpc.org/tpcc/results/tpcc_result_detail.asp?id=104071202
[2] http://www.redbooks.ibm.com/redpapers/pdfs/redp9117.pdf
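The counting above can be written out as a small sketch. The function and parameter names are made up for illustration; the second call reproduces the reading quoted from Alex (16 chips), which is the one being corrected here.

```python
def topology(chips, cores_per_chip, threads_per_core):
    """Return (chips, cores, hardware threads) for a system."""
    cores = chips * cores_per_chip
    threads = cores * threads_per_core
    return chips, cores, threads


# p5 570 as described above: 4 building blocks x 2 dual-core chips, 2-way SMT.
actual = topology(chips=4 * 2, cores_per_chip=2, threads_per_core=2)

# The quoted (mistaken) reading: "16 processor" taken to mean 16 chips.
misread = topology(chips=16, cores_per_chip=2, threads_per_core=2)

print("actual  (chips, cores, threads):", actual)
print("misread (chips, cores, threads):", misread)
```

The first call gives 8 chips, 16 cores and 32 threads; the misreading doubles every one of those numbers.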


Rupert Pigott

Aug 28, 2004, 8:28:38 PM

Alex Johnson wrote:

[SNIP]

> As explained above, if you compare per thread, these machines are
> equivalent in size (64P Madison, 32P * 2 cores POWER4+, 16P * 2 cores *
> 2 threads POWER5).

Can you explain to me why you think 2-way SMT is equivalent
to 2 processors?

I can't see how it can be in terms of transistor count or
performance characteristics (think about contention). Just
to add to my confusion, the consensus is that SMT gives < 30%
more oomph at best (depending on workload of course)...

I can see how you could claim that a 64 *package* Madison box
was analogous to a 32 *package* dual-core box though.
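To put a number on that objection, here is a minimal sketch that converts a core count plus an assumed SMT throughput gain into "non-SMT core equivalents". The 30% figure is the rough upper bound mentioned above, not a measurement, and the linear scaling is itself an assumption.

```python
def core_equivalents(cores, smt_gain):
    """Throughput in units of one non-SMT core, if SMT adds smt_gain per core."""
    return cores * (1.0 + smt_gain)


# 16 Power5 cores with 2-way SMT, using the ~30% upper bound from above.
with_smt = core_equivalents(16, 0.30)

# Counting threads as processors implicitly assumes a 100% gain per core.
thread_counting = core_equivalents(16, 1.00)

print(f"16 cores at +30%: {with_smt:.1f} core equivalents")
print(f"16 cores counted per thread: {thread_counting:.0f} core equivalents")
```

Under that assumption 16 SMT cores behave like roughly 21 plain cores, not 32.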

Cheers,
Rupert

Bill Todd

Aug 28, 2004, 10:52:12 PM

"Rupert Pigott" <r...@try-removing-this.darkboong.demon.co.uk> wrote in
message news:10937393...@teapot.planet.gong...

...

> Just
> to add to my confusion, the consensus is that SMT gives < 30%
> more oomph at best (depending on workload of course)...

While this indeed seems to be about the limit (and in fact seems quite a bit
too generous on average) for the throughput boost that existing SMT
implementations can provide (though I may have encountered a claim of 40%
for one outlier application somewhere), the simulations performed for EV8
seemed to indicate that a single core with far more resources (in terms of
execution units, number of in-flight instructions supported, etc.) than the
current SMT cores provide could obtain significantly higher percentage
throughput boosts from SMT.

- bill

Nick Maclaren

Aug 29, 2004, 6:01:01 AM

In article <MM2dnV8ff-g...@metrocastcablevision.com>,

Bill Todd <bill...@metrocast.net> wrote:
>
>>Just
>> to add to my confusion, the consensus is that SMT gives < 30%
>> more oomph at best (depending on workload of course)...
>
>While this indeed seems to be about the limit (and in fact seems quite a bit
>too generous on average) for the throughput boost that existing SMT
>implementations can provide (though I may have encountered a claim of 40%
>for one outlier application somewhere), the simulations performed for EV8
>seemed to indicate that a single core with far more resources (in terms of
>execution units, number of in-flight instructions supported, etc.) than the
>current SMT cores provide could obtain significantly higher percentage
>throughput boosts from SMT.

As did Eggers' simulation, which was based on MIPS. My guess is that
much of the problem of Hyperthreading was that it had to start from
the x86 architecture, which was already too complex and messy. But
it could have been other factors as well.

However, the flaw in Eggers' work (I have not seen DEC's) is that it
did not compare SMT with CMP using the same number of transistors.
Yes, the latter would have been slower for serial code, and perhaps
even for a small number of threads, but is MUCH more scalable. Even
Eggers' model ran out of steam at 4 cores (just arguably 8). But, as
I have posted, all these TLAs and other acronyms are intended to give
a veneer of importance to minor variants of a general model.

There is a continuum of shared memory designs, according to how close
the sharing is to the core. SMT is nearly as close as it is possible
to get (and certainly as close as it is sane to go), but you can move
that out to L1, L2, L3 or main memory. And, of course, you can share
different resources at different levels - e.g. compare the memory
bandwidth handling between the Opteron, MIPS/SPARC and POWERx, all
of which do it differently, and at a different level from any on-chip
multi-threading.

Regards,
Nick Maclaren.

Paul Repacholi

Aug 29, 2004, 9:34:57 AM

nm...@cus.cam.ac.uk (Nick Maclaren) writes:

> However, the flaw in Eggers' work (I have not seen DEC's) is that it
> did not compare SMT with CMP using the same number of transistors.

So what is the performance boost you get from CMP with ~110% of a single
core?

Robert Myers

Aug 29, 2004, 10:34:00 AM

Bill Todd wrote:

> "Rupert Pigott" <r...@try-removing-this.darkboong.demon.co.uk> wrote in
> message news:10937393...@teapot.planet.gong...
>
> ...
>
> Just
>> to add to my confusion, the consensus is that SMT gives < 30%
>> more oomph at best (depending on workload of course)...
>
>
> While this indeed seems to be about the limit (and in fact seems quite a bit
> too generous on average) for the throughput boost that existing SMT
> implementations can provide (though I may have encountered a claim of 40%
> for one outlier application somewhere),

http://www-106.ibm.com/developerworks/linux/library/l-htl/

Table 7. 45% geometric mean improvement for handling chat rooms on a
linux kernel tweaked for Hyperthreading.

RM

Bill Todd

Aug 29, 2004, 11:24:34 AM

"Robert Myers" <rmyer...@comcast.net> wrote in message

news:sPlYc.80023$Fg5.55137@attbi_s53...

Well, I suppose that could have been what I remembered - and it does appear
to be the outlier in the article. Still, somewhat better than I'd expect:
I wonder exactly what it is about that workload that's so much more
HT-friendly than the others (unless it's something dead-simple like a
ridiculously short time quantum per thread, such that context-switching
overheads dominate the workload and halving them helps a *lot*).

- bill

Nick Maclaren

Aug 29, 2004, 1:27:32 PM

In article <877jrin...@k9.prep.synonet.com>,

Paul Repacholi <pr...@prep.synonet.com> wrote:
>nm...@cus.cam.ac.uk (Nick Maclaren) writes:
>
>> However, the flaw in Eggers' work (I have not seen DEC's) is that it
>> did not compare SMT with CMP using the same number of transistors.
>
>So what is the performance boost you get from CMP with ~110% of a single
>core?

Sigh. I said that the flaw in her work is that it did not provide
that information. No, I don't know. What I do know is that the
claimed benefits of SMT are dubious without that comparison.

OK?

Regards,
Nick Maclaren.

Robert Redelmeier

Aug 29, 2004, 5:05:51 PM

In comp.sys.ibm.pc.hardware.chips Bill Todd <bill...@metrocast.net> wrote:
>> http://www-106.ibm.com/developerworks/linux/library/l-htl/
>> Table 7. 45% geometric mean improvement for handling chat rooms on a
>> linux kernel tweaked for Hyperthreading.

> I wonder exactly what it is about that workload that's so
> much more HT-friendly than the others (unless it's something
> dead-simple like a ridiculously short time quantum per
> thread, such that context-switching overheads dominate the
> workload and halving them helps a *lot*).

Well, IRC does mean ridiculously short timeslices (little work)
before a blocking syscall that yields the CPU. Just shovelling
data from one port to another.

Also important in this case will be doing useful work during
memory latency fetches. The busmaster ethernet devices will
drop data into RAM, and some code (probably the kernel TCP/IP
stack) will stall loading it into cache.

-- Robert in Houston

Maynard Handley

Aug 29, 2004, 7:02:53 PM

In article <F8-dncgAU95...@metrocastcablevision.com>,
"Bill Todd" <bill...@metrocast.net> wrote:

For christ sake. Why do we have to keep going through this?
Surely it's really simple.
SMT will perform useful work if execution slots are available, and not
otherwise. So if the code running has properties like
- it frequently misses in L1 (either I or D)
- it frequently mispredicts branches
- it consists of long streams of sequentially dependent instructions
(say integer ops) on a machine that has two or three integer exec units
then SMT will work wonderfully.
If these properties don't hold, then it won't.
(And of course, there is the issue of locks and so on shared in L1 which
may help certain types of code.)

BUT of course these are properties of an optimal SMT system. If the
particular IMPLEMENTATION of an SMT system is poorly designed, for
example the number of rename registers or completion buffers, when
shared across both threads, falls below the knee of the curve; or if the
system allows one blocked thread to block execution for both threads
(for example miss to main memory of thread 0, thread 0 keeps executing,
along with thread 1, until all the completion buffers are full, then
both threads block), then obviously the results may be far far more
disappointing than the above analysis would suggest.
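As a toy illustration of the ideal case only, here is a sketch of the "free execution slots" argument. It assumes each thread on its own fills a fixed fraction of the issue slots and that a second thread can use every slot the first leaves empty; none of the implementation pitfalls just described (shared rename registers, one thread blocking both) are modeled, so treat the result as an upper bound. The function name and the utilization numbers are made up for the example.

```python
def ideal_smt_gain(single_thread_utilization, threads=2):
    """Upper-bound throughput gain from SMT in an idealized slot-filling model.

    single_thread_utilization: fraction of issue slots one thread fills alone.
    Extra threads can only fill slots that would otherwise go unused.
    """
    if not 0.0 < single_thread_utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    combined = min(1.0, threads * single_thread_utilization)
    return combined / single_thread_utilization - 1.0


# Stall-heavy code (many cache misses, mispredicts, dependent chains).
stally = ideal_smt_gain(0.40)
# Code that already keeps most execution slots busy.
busy = ideal_smt_gain(0.80)
# Code that saturates the core on its own.
saturated = ideal_smt_gain(1.00)

print(f"40% utilized: +{stally:.0%}, 80% utilized: +{busy:.0%}, "
      f"100% utilized: +{saturated:.0%}")
```

Even in this best case the gain depends entirely on how many slots the code leaves idle, which is the point being made above.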

As such, complaining about "SMT" being this or that is like complaining
that "RISC" is this or that; it's a complete waste of time for most
people. How about we establish a rule from now on that any discussions
about SMT start along the lines of
"SMT on Prescott sucks because of ..." or
"SMT on Power5 only gets 15% performance boost on my code, clearly
everyone at IBM is an idiot..."
Even more useful would be criticisms of specific SMT implementations
that actually tell us what went wrong --- not enough resources,
resources are statically not dynamically partitioned, even if one thread
is blocked, the second thread only gets half the fetch slots from the I1
cache, completion buffer/ROB fills up on L2 miss like I describe above
or whatever.

Maynard

David Schwartz

Aug 30, 2004, 5:45:53 AM

"Robert Redelmeier" <red...@ev1.net.invalid> wrote in message
news:PyrYc.14774$af6....@newssvr22.news.prodigy.com...

> In comp.sys.ibm.pc.hardware.chips Bill Todd <bill...@metrocast.net>
> wrote:

>>> http://www-106.ibm.com/developerworks/linux/library/l-htl/
>>> Table 7. 45% geometric mean improvement for handling chat rooms on a
>>> linux kernel tweaked for Hyperthreading.

>> I wonder exactly what it is about that workload that's so
>> much more HT-friendly than the others (unless it's something
>> dead-simple like a ridiculously short time quantum per
>> thread, such that context-switching overheads dominate the
>> workload and halving them helps a *lot*).

The workload is bogus, deliberately designed to inflate the numbers.

> Well, IRC does mean ridiculously short timeslices (little work)
> before a blocking syscall that yields the CPU. Just shovelling
> data from one port to another.

> Also important in this case will be doing useful work during
> memory latency fetches. The busmaster ethernet devices will
> drop data into RAM, and some code (probably the kernel TCP/IP
> stack) will stall loading it into cache.

If you look at the way they created the test, the 'chat' test is really
just a measure of how fast you can do context switches. With HT (and this
ridiculously unrealistic type of workload), you need half as many context
switches. Only an idiot would design a chat application such that a context
switch would be needed every time the server wanted to change which client
it was working on behalf of.
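
For a rough feel of what such a test measures, here is a minimal sketch that bounces one byte between two threads over a pair of pipes, so that every round trip forces two blocking handoffs. It is not the benchmark from the IBM article; the function name and the default iteration count are arbitrary. On a multi-core machine the two threads may sit on different cores, in which case this times wake-up latency rather than a true context switch.

```python
import os
import threading
import time


def pingpong(round_trips=2000):
    """Bounce one byte between two threads; return seconds per round trip."""
    a_r, a_w = os.pipe()   # main -> worker
    b_r, b_w = os.pipe()   # worker -> main

    def worker():
        for _ in range(round_trips):
            os.read(a_r, 1)       # block until main hands over
            os.write(b_w, b"x")   # hand control back

    t = threading.Thread(target=worker)
    t.start()
    start = time.perf_counter()
    for _ in range(round_trips):
        os.write(a_w, b"x")
        os.read(b_r, 1)           # block until the worker answers
    elapsed = time.perf_counter() - start
    t.join()
    for fd in (a_r, a_w, b_r, b_w):
        os.close(fd)
    return elapsed / round_trips


print("%.1f microseconds per round trip" % (pingpong() * 1e6))
```

A workload that does almost nothing between handoffs ends up dominated by this per-handoff cost.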

DS

Bill Todd

Aug 30, 2004, 6:25:26 AM

"Maynard Handley" <nam...@name99.org> wrote in message
news:name99-3F92F1.16021829082004@localhost...

Since the discussion prior to your rant above was quite explicit in its
differentiation among the various flavors of SMT on POWER5, EV8, Montecito,
and P4/Xeon, plus noting the differences in chip resources where applicable
that could affect the utility of the specific SMT implementation, I'm afraid
whatever point you thought you were making is unclear.

- bill

Nick Maclaren

Aug 30, 2004, 6:37:52 AM

In article <cgut0k$dj0$1...@nntp.webmaster.com>,

David Schwartz <dav...@webmaster.com> wrote:
>
> If you look at the way they created the test, the 'chat' test is really
>just a measure of how fast you can do context switches. With HT (and this
>ridiculously unrealistic type of workload), you need half as many context
>switches. Only an idiot would design a chat application such that a context
>switch would be needed every time the server wanted to change which client
>it was working on behalf of.

Eh? Why? That is precisely what you want to do to get security,
without having to be very clever. I agree that this is an unusual
requirement, but it is not unreasonable.

Regards,
Nick Maclaren.

Bill Todd

Aug 30, 2004, 7:41:02 AM

"Nick Maclaren" <nm...@cus.cam.ac.uk> wrote in message
news:cgv020$bvu$1...@pegasus.csx.cam.ac.uk...

> In article <cgut0k$dj0$1...@nntp.webmaster.com>,
> David Schwartz <dav...@webmaster.com> wrote:
> >
> > If you look at the way they created the test, the 'chat' test is really
> >just a measure of how fast you can do context switches. With HT (and this
> >ridiculously unrealistic type of workload), you need half as many context
> >switches. Only an idiot would design a chat application such that a context
> >switch would be needed every time the server wanted to change which client
> >it was working on behalf of.
>
> Eh? Why? That is precisely what you want to do to get security,
> without having to be very clever.

While there may be a legitimate differentiation between an 'idiot' and
someone who is merely not 'very clever', the underlying sentiments do not
seem all that different.

I'm not in the habit of completely ignoring performance in favor of
dirt-simple coding when I create software, unless performance is truly
unimportant. And that's especially true for production software, where any
effort expended may be repaid by benefits for literally millions of users.

File servers, which have far more stringent security requirements than chat
rooms, often not only eschew per-request context-switching but may run
entirely in the kernel to achieve optimal performance, even given the
resulting need to roll their own security mechanisms. So embedding
relatively simple security mechanisms in a chat-room server to achieve
significantly better performance hardly seems impractical.

- bill

Alex Johnson

Aug 30, 2004, 8:56:20 AM

Thu wrote:
> IBM always refers to 16 processors as the number of cores. So a
> maximum p570 is 8 Power5 Dual core chips, 16 cores and with SMT 32
> threads.
>

> Here's their spec submission for a fully loaded p570:
> http://www.spec.org/cpu2000/results/res2004q3/cpu2000-20040712-03234.html

Of course! How did I miss that?

> Here are the best non-clustered results in terms of performance for
> Itanium and power5/power4+ on a per core basis.

That's a good read. Thanks for the research you put in.

> A few things to note:
> Montecito's dual thread implementation is not SMT, it is the much
> simpler HMT. Performance increase expected from this is much less
> than SMT.

Montecito's implementation is SoEMT. Maybe HMT means the same thing,
but I am unfamiliar with the abbreviation. What is "H" MT?

> I'll do another comparison using specjbb2000

That's another interesting read. Unlike the TPC numbers, the ordering
of the competitors is not the same at each number of cores. At one
scale POWER4+ is leading by a mile, at the next SPARC64V by a mile, and
at one point Itanium 2 is ahead. Quite strange results. But then the
larger systems use lower speed parts. Another oddity.

Greg Lindahl

Aug 30, 2004, 11:31:27 AM

In article <PyrYc.14774$af6....@newssvr22.news.prodigy.com>,
Robert Redelmeier <red...@ev1.net.invalid> wrote:

>Well, IRC does mean ridiculously short timeslices (little work)
>before a blocking syscall that yields the CPU. Just shovelling
>data from one port to another.

I don't know which IRC server you've worked with, but that's not how
it works; when it wakes up, it does as much as it can (non-blocking)
before it sleeps in select(). I believe the chat benchmark in question
is more aimed at a multi-threaded chat server, which does a lot less
work at a time.
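
The "do as much as it can, then sleep in select()" pattern looks roughly like the sketch below. It is not taken from any real ircd. Socket pairs stand in for accepted client connections, and `make_clients` and `serve_once` are names invented here. It assumes messages are small enough that a non-blocking send never fills the peer's buffer.

```python
import selectors
import socket


def make_clients(n):
    """Create n socket pairs as stand-ins for accepted connections."""
    server_side, client_side = [], []
    for _ in range(n):
        s, c = socket.socketpair()
        s.setblocking(False)      # the server never blocks on one client
        server_side.append(s)
        client_side.append(c)
    return server_side, client_side


def serve_once(sel, conns, timeout=1.0):
    """Sleep in select(), then drain every ready socket before sleeping again."""
    handled = 0
    for key, _ in sel.select(timeout):
        src = key.fileobj
        while True:
            try:
                data = src.recv(4096)
            except BlockingIOError:
                break             # nothing left on this socket right now
            if not data:
                break             # peer closed; real code would unregister it
            for dst in conns:
                if dst is not src:
                    dst.send(data)    # relay to every other client
            handled += 1
    return handled


servers, clients = make_clients(3)
sel = selectors.DefaultSelector()
for s in servers:
    sel.register(s, selectors.EVENT_READ)
clients[0].sendall(b"hello")
print(serve_once(sel, servers), "message(s) relayed in one wakeup")
```

One wakeup services every client with pending data, so the number of trips through the scheduler tracks how often the server goes idle, not how many clients it talks to.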

-- greg

Stephen Sprunk

Aug 30, 2004, 11:20:18 AM

"Nick Maclaren" <nm...@cus.cam.ac.uk> wrote in message
news:cgv020$bvu$1...@pegasus.csx.cam.ac.uk...

ircd, the oldest chat system still running with half a million or so current
users, does all operations in a single thread because that removes the need
for context switches, synchronization, and message sequence tracking. AFAIK
there's not been a security breach in over a decade.

I have no idea how IM systems operate, since (with the exception of Jabber)
they're not open source. However, I can't imagine that AIM servers have 100
million threads, one per user. That is clearly unreasonable with current OS
designs.

S

--
Stephen Sprunk "Those people who think they know everything
CCIE #3723 are a great annoyance to those of us who do."
K5SSS --Isaac Asimov

Nick Maclaren

Aug 31, 2004, 5:10:52 AM

In article <v2KYc.31833$JG7....@hydra.nntpserver.com>,

"Stephen Sprunk" <ste...@sprunk.org> writes:
|>
|> ircd, the oldest chat system still running with half a million or so current
|> users, does all operations in a single thread because that removes the need
|> for context switches, synchronization, and message sequence tracking. AFAIK
|> there's not been a security breach in over a decade.

And CICS did the same. But, in order to deliver that security, they
have to impose a lot of constraints. One slip, and you have a
security breach - been there, seen that :-(

|> I have no idea how IM systems operate, since (with the exception of Jabber)
|> they're not open source. However, I can't imagine that AIM servers have 100
|> million threads, one per user. That is clearly unreasonable with current OS
|> designs.

What is unreasonable is that there are no current designs that can
handle it. Nobody is claiming that every system should be able to
work that way, but the fact that none can is not good.

Regards,
Nick Maclaren.

Bill Davidsen

Sep 1, 2004, 5:41:45 PM

Okay, enlighten me, how do you handle client requests without a context
switch? Having a thread do a blocking read on a socket actually scales
better than select() on many systems (certainly Linux). Let's skip the
writes for a little while until I think about the issues. Having a mix
of slow and fast connections and comments going to multiple clients
makes it worthy of some thought.
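
For reference, the thread-per-connection design with blocking reads is sketched below, reads only. This is a minimal illustration, not code from any real server: socket pairs stand in for accepted connections, `handle` and `start_handlers` are made-up names, and received data just goes into a shared list.

```python
import socket
import threading


def handle(conn, log, lock):
    """One thread per connection: sleep in a blocking recv() until data or EOF."""
    while True:
        data = conn.recv(4096)
        if not data:
            break                 # peer closed the connection
        with lock:
            log.append(data)      # shared state needs explicit locking
    conn.close()


def start_handlers(conns):
    """Start one handler thread per connection; return the threads and the log."""
    log, lock = [], threading.Lock()
    threads = [threading.Thread(target=handle, args=(c, log, lock))
               for c in conns]
    for t in threads:
        t.start()
    return threads, log


pairs = [socket.socketpair() for _ in range(3)]
threads, log = start_handlers([s for s, _ in pairs])
for i, (_, c) in enumerate(pairs):
    c.sendall(b"msg%d" % i)
    c.close()
for t in threads:
    t.join()
print(sorted(log))
```

Each incoming message wakes a different thread, so the scheduler is involved once per message, and anything shared between clients has to be locked. That is the trade against the single select() loop.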

Unless I misremember, both Apache and sendmail use threads just because
of the select() scaling issues.

--
-bill davidsen (davi...@tmr.com)
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me