400/450MHz SA-110

Steven M. Ottens

Aug 13, 1999, 3:00:00 AM

Hello all,

Just saw on the cybervillage that Intel is going to release a 400/450MHz
SA-110. Is it possible to use it in an RPC or the Imago? In other words: does
the 400/450MHz version have 26-bit support? If not, why is it called SA-110?

Question for Millipede: if the 400/450MHz SA-110 does support 26-bit, will
the Imago be equipped with it?

Greetings
--
/ Steven M. Ottens
O'dd- Ottens' Dutch Designs
\ http://www.futuretrain.com/odd

Richard Jozefowski

Aug 13, 1999, 3:00:00 AM

In article <ff9bfd304...@lx.student.wau.nl>, Steven M. Ottens

<URL:mailto:ste...@lx.student.wau.nl> wrote:
>
> Just saw on the cybervillage that intel is going to release a 400/450MHz
> SA-110. Is it possible to use it for a RPC or the Imago. In other words: has
> the 400/450MHz version 26bit support? If not, why is it called SA110?
>
> Question for Millipede: When there the SA110 400/450 does support 26bit will
> the Imago be equiped with it?

The little I do know about future processors is under NDA. All I'll say is:
If there's a faster version available that's suitable then we'll endeavour
to support it as soon as possible.

Hope that helps.

--
Richard

Gary Partis

Aug 14, 1999, 3:00:00 AM

to comp.sys.acorn.misc

Hi,

Steven M. Ottens <ste...@lx.student.wau.nl> wrote in article
<ff9bfd304...@lx.student.wau.nl>...
> Hello all,

>
> Just saw on the cybervillage that intel is going to release a 400/450MHz
> SA-110. Is it possible to use it for a RPC or the Imago. In other words: has
> the 400/450MHz version 26bit support? If not, why is it called SA110?

Try comp.sys.arm on Usenet. Some of the questions there are what we (Acorn
users) would call simplistic, but it is surely the place to find out about
the ARM chip!

--
Gary Partis, North Shields, Tyne & Wear, UK
Fast Fax : 0870 056 1096
Secure Fax: 0191 280 1306
http://www.partis.demon.co.uk

Want regular laughs in your in box, then go to
http://www.partis.demon.co.uk/funny.htm and
follow the instructions!

Chris Downs

Aug 16, 1999, 3:00:00 AM

On Fri, 13 Aug 1999 16:03:15 +0100, Steven M. Ottens
<ste...@lx.student.wau.nl> wrote:

>Hello all,
>
>Just saw on the cybervillage that intel is going to release a 400/450MHz
>SA-110. Is it possible to use it for a RPC or the Imago. In other words: has
>the 400/450MHz version 26bit support? If not, why is it called SA110?

You wouldn't be able to make use of it in a RPC even if it was 26-bit.
The 33MHz bus in the RPC would be too much of a bottleneck for it to
be worth it.

Contact: chris...@arg000net.co.uk
IRC: #argonet, #acorn as "Xeno"

Steffen Huber

Aug 16, 1999, 3:00:00 AM

Chris Downs wrote:

>
> On Fri, 13 Aug 1999 16:03:15 +0100, Steven M. Ottens
> <ste...@lx.student.wau.nl> wrote:
>
> >Hello all,
> >
> >Just saw on the cybervillage that intel is going to release a 400/450MHz
> >SA-110. Is it possible to use it for a RPC or the Imago. In other words: has
> >the 400/450MHz version 26bit support? If not, why is it called SA110?
>
> You wouldn't be able to make use of it in a RPC even if it was 26-bit.

Judging by the "SA-110" bit, it surely supports 26bit modes. Intel won't
redesign the chip and call it the same.

However, I have a feeling that the news item was a hoax. On the other
hand, maybe Intel has a production line free (e.g. from the old
Pentium-II now that the whole world will buy AMD ;-)) with a 0.25 process
capability, so they just did a "simple" design shrink. Even so, I
doubt that doubling the frequency would be possible.

I have still not found a reference to the new SA-110 from the usual
sources (electronics weekly, microprocessor report, ee times), so for
the moment I will doubt the soon-to-be-existence of a 400 MHz SA-110.

> The 33MHz bus in the RPC would be too much of a bottleneck for it to
> be worth it.

People said that with StrongARM, too. Never underestimate the
efficiency of a decent 1st level cache - it was described before that
even GNAT gets nearly linearly faster when overclocking the current
StrongARM, and GNAT is very memory-intensive and should be a cache
nightmare.

So long, Steffen

--
Steffe...@icongmbh.de hub...@lcs.wn.bawue.de
GCC for RISC OS - http://www.arcsite.de/hp/gcc/

J.Bland

Aug 16, 1999, 3:00:00 AM

In message <37b80cfe...@news.geccs.gecm.com>
chris...@b0gu5-3ma1l.gecm.com (Chris Downs) wrote:

> On Fri, 13 Aug 1999 16:03:15 +0100, Steven M. Ottens
> <ste...@lx.student.wau.nl> wrote:
>
>>Hello all,
>>
>>Just saw on the cybervillage that intel is going to release a 400/450MHz
>>SA-110. Is it possible to use it for a RPC or the Imago. In other words:
>>has the 400/450MHz version 26bit support? If not, why is it called SA110?
>
> You wouldn't be able to make use of it in a RPC even if it was 26-bit.

> The 33MHz bus in the RPC would be too much of a bottleneck for it to
> be worth it.

Is that 33MHz bus in the J233s? In all the others it's 16MHz. With the slow
bus you'd be about right, and with the faster one you're still throttled by
slow memory. Even so, despite all the "it'll be throttled by the bus" talk
when I overclocked my SA by 27% I still got an average performance increase
of 20%. If you want to run rc5 it'd be great, otherwise yes the diminishing
returns would eventually mean it'd be not much better than the ones we've
got.

--
John Bland Webmaster and Ph.D. Research Student,
M.Phys (Hons) Grad.Inst.P Condensed Matter Physics Department,
J.B...@liv.ac.uk The University of Liverpool
http://www.sliced.uk.eu.org/~shrike http://www.liv.ac.uk/~olmsg01/physics/
"The Sleeper Must Awaken" - Dune Messiah

Simon E. John

Aug 16, 1999, 3:00:00 AM

In article <50a08d3249%J.B...@pc084085.melgrove.liv.ac.uk>,
J.Bland <J.B...@liv.ac.uk> wrote:

[snip]

> > > Just saw on the cybervillage that intel is going to release a 400/450MHz
> > > SA-110. Is it possible to use it for a RPC or the Imago. In other words:
> > > has the 400/450MHz version 26bit support? If not, why is it called
> > > SA110?

> > You wouldn't be able to make use of it in a RPC even if it was 26-bit.
> > The 33MHz bus in the RPC would be too much of a bottleneck for it to
> > be worth it.

> Is that 33MHz bus in the J233s? In all the others it's 16MHz. With the slow
> bus you'd be about right, and with the faster one you're still throttled by
> slow memory. Even so, despite all the "it'll be throttled by the bus" talk
> when I overclocked my SA by 27% I still got an average performance increase
> of 20%. If you want to run rc5 it'd be great, otherwise yes the diminishing
> returns would eventually mean it'd be not much better than the ones we've
> got.

Erm, I thought it was only the A7000+ that had a 33MHz bus - all the RiscPCs,
even the new RiscPC233T, have a 16MHz bus AFAIK.

I have a 202MHz SA running at 258MHz and getting a linear 27% speed increase
in practically every operation and using most testing proggies like SICK etc.

I doubt that a 450MHz SA would be useful though - unless we get the 64MHz bus
(why 64 not 66 or 100 as PCs?) of the Millipede etc.

--
Simon E. John, BA(Hons)

Email: si...@suit-u-sir.com | 258MHz StrongARM RiscPC
Website: www.suit-u-sir.com | 50Mb RAM, 4.3Gb HD, Zip
PGP 'bot: send...@suit-u-sir.com | 32x CD, USR 56K V.90/X2

43rd Law of Computing: Anything that can go wr...

Andy McMullon

Aug 16, 1999, 3:00:00 AM

In article <37b80cfe...@news.geccs.gecm.com>, Chris Downs

<URL:mailto:chris...@b0gu5-3ma1l.gecm.com> wrote:
> On Fri, 13 Aug 1999 16:03:15 +0100, Steven M. Ottens
> <ste...@lx.student.wau.nl> wrote:
>
> >Hello all,
> >

> >Just saw on the cybervillage that intel is going to release a 400/450MHz
> >SA-110. Is it possible to use it for a RPC or the Imago. In other words: has
> >the 400/450MHz version 26bit support? If not, why is it called SA110?
>
> You wouldn't be able to make use of it in a RPC even if it was 26-bit.
> The 33MHz bus in the RPC would be too much of a bottleneck for it to
> be worth it.

But it is just what the new motherboard needs!

64MHz bus, I think?

--
Andy: skyp...@bigfoot.com / http://www.mcfamily.demon.co.uk

Richard Jozefowski

Aug 16, 1999, 3:00:00 AM

In article <4932981...@suit-u-sir.com>, Simon E. John

<URL:mailto:si...@suit-u-sir.com> wrote:
>
> I doubt that a 450MHz SA would be useful though - unless we get the 64MHz bus
> (why 64 not 66 or 100 as PCs?) of the Millipede etc.

The current StrongARM is rated with a memory clock of 66MHz. Running it at
64MHz makes it a lot easier to retain compatibility with the podule bus and
PC card etc which use 16MHz, 8MHz, 2MHz etc timings.

The design is such that it is easy to adapt if, say, there were a new
processor capable of 100MHz memory clock. In this case I guess we'd opt for
running it at 96MHz.
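
To illustrate the divider argument, here is a minimal sketch (not anything taken from the Millipede design; the function name and the list of legacy rates are assumptions based only on the figures mentioned in this post). It checks which memory clocks divide down to 16MHz, 8MHz and 2MHz by a whole number:

```python
# Legacy timings mentioned in the post (podule bus, PC card etc.), in MHz.
LEGACY_CLOCKS_MHZ = (16, 8, 2)


def divides_evenly(memory_clock_mhz, targets=LEGACY_CLOCKS_MHZ):
    """Return True if every target rate is an integer division of the memory clock."""
    return all(memory_clock_mhz % target == 0 for target in targets)


for clock in (64, 66, 96, 100):
    verdict = "integer dividers" if divides_evenly(clock) else "fractional dividers needed"
    print(f"{clock} MHz: {verdict}")
```

Under these assumptions 64MHz and 96MHz come out as whole-number multiples of all three legacy rates, while 66MHz and 100MHz do not, which matches the choice described above.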

--
Richard

Kira L. Brown

Aug 16, 1999, 3:00:00 AM

In message <50a08d3249%J.B...@pc084085.melgrove.liv.ac.uk>
J.Bland <J.B...@liv.ac.uk> wrote:

> Is that 33MHz bus in the J233s?

J233 (and all other RiscPCs) clock their busses at 16Mtransfers/second,
using an actual clocking rate of 32Mcycles/sec.

A7000+ clocks at 64MHz and thus gets 32Mtransfers/sec

kira.

--
This is a tagline.

Oliver Booth

Aug 16, 1999, 3:00:00 AM

In article <4932A483D9%kbr...@neutralino.demon.com.uk>, Kira L. Brown

<URL:mailto:kbr...@neutralino.demon.com.uk> wrote:
> In message <50a08d3249%J.B...@pc084085.melgrove.liv.ac.uk>
> J.Bland <J.B...@liv.ac.uk> wrote:
>
> > Is that 33MHz bus in the J233s?
>

> J233 (and all other RiscPC's) clock their busses at 16Mtransfers/second,

> using an actual clocking rate of 32Mcycles/sec.
>
> A7000+ clocks at 64MHz and thus gets 32Mtransfers/sec
>
> kira.
>

So what exactly does a J233 Motherboard have that a standard Risc PC
motherboard doesn't?

--
Oliver Booth. Accountancy and Finance Degree Student at Manchester
Metropolitain University. Website: http://www.ashwood98.freeserve.co.uk
Email: Oli...@ashwood98.freeserve.co.uk . "Darren" on that #acorn IRC thing.

Kira L. Brown

Aug 16, 1999, 3:00:00 AM

In message <ant16050...@ashwood98.freeserve.co.uk>
Oliver Booth <O_B...@ashwood98.freeserve.co.uk> wrote:

> So what exactly does a J233 Motherboard have that a standard Risc PC
> motherboard doesn't?

A better VIDC than most, better decoupling, and it doesn't tend to have
as many timing problems.

J.Bland

Aug 16, 1999, 3:00:00 AM

>
> [snip]

>
> > >> Just saw on the cybervillage that intel is going to release a
> > >> 400/450MHz SA-110. Is it possible to use it for a RPC or the Imago. In
> > >> other words: has the 400/450MHz version 26bit support? If not, why is
> > >> it called SA110?
>
> > > You wouldn't be able to make use of it in a RPC even if it was 26-bit.
> > > The 33MHz bus in the RPC would be too much of a bottleneck for it to
> > > be worth it.
>

> >Is that 33MHz bus in the J233s? In all the others it's 16MHz. With the
> >slow bus you'd be about right, and with the faster one you're still
> >throttled by slow memory. Even so, despite all the "it'll be throttled by
> >the bus" talk when I overclocked my SA by 27% I still got an average
> >performance increase of 20%. If you want to run rc5 it'd be great,
> >otherwise yes the diminishing returns would eventually mean it'd be not
> >much better than the ones we've got.
>
>Erm, I though it was only the A7000+ that had a 33MHz bus - all the
>RiscPC's, even the new RiscPC233T have a 16MHz bus AFAIK.
>

I think I confused the slightly faster VIDC with the bus. Or maybe I just
dreamt it ;).

>I have a 202MHz SA running at 258MHz and getting a linear 27% speed increase
>in practically every operation and using most testing proggies like SICK
>etc.
>

I would contest a 27% increase across the board. I did a number of real-life
tests ranging from BASIC apps, through benchmarkers like SI etc., to
games/demos such as Quake with fps indicators, and it averaged ~20% at
258MHz. Some operations are linear, some definitely aren't, but at the speeds
we're restricted to with the current overclocked SAs the bus isn't
restricting the machine anywhere near as much as you'd be led to believe.

<snip>

Shrike

John Duffell

Aug 16, 1999, 3:00:00 AM

In message <4932981...@suit-u-sir.com>

Simon E. John <si...@suit-u-sir.com> wrote:

> In article <50a08d3249%J.B...@pc084085.melgrove.liv.ac.uk>,
> J.Bland <J.B...@liv.ac.uk> wrote:
>

> [snip]
>
> > > > Just saw on the cybervillage that intel is going to release a 400/450MHz
> > > > SA-110. Is it possible to use it for a RPC or the Imago. In other words:
> > > > has the 400/450MHz version 26bit support? If not, why is it called
> > > > SA110?
>
> > > You wouldn't be able to make use of it in a RPC even if it was 26-bit.
> > > The 33MHz bus in the RPC would be too much of a bottleneck for it to
> > > be worth it.
>
> > Is that 33MHz bus in the J233s? In all the others it's 16MHz. With the slow
> > bus you'd be about right, and with the faster one you're still throttled by
> > slow memory. Even so, despite all the "it'll be throttled by the bus" talk
> > when I overclocked my SA by 27% I still got an average performance increase
> > of 20%. If you want to run rc5 it'd be great, otherwise yes the diminishing
> > returns would eventually mean it'd be not much better than the ones we've
> > got.
>
> Erm, I though it was only the A7000+ that had a 33MHz bus - all the RiscPC's,
> even the new RiscPC233T have a 16MHz bus AFAIK.
>

> I have a 202MHz SA running at 258MHz and getting a linear 27% speed increase
> in practically every operation and using most testing proggies like SICK etc.
>

> I doubt that a 450MHz SA would be useful though - unless we get the 64MHz bus
> (why 64 not 66 or 100 as PCs?) of the Millipede etc.

It'll be the doubling up, 8MHz -> 16MHz -> 32MHz -> 64MHz
I'm not sure why they don't round to 66. Sometimes I wish the standard
numbering system was like hex. because it's so much easier and more
logical.
When I rule the world...

John

Remove m from com to reply

Matthias Seifert

Aug 16, 1999, 3:00:00 AM

J.Bland <J.B...@liv.ac.uk> wrote:
> In message <37b80cfe...@news.geccs.gecm.com>

> chris...@b0gu5-3ma1l.gecm.com (Chris Downs) wrote:

> > On Fri, 13 Aug 1999 16:03:15 +0100, Steven M. Ottens
> > <ste...@lx.student.wau.nl> wrote:
> >
> >>Hello all,
> >>

> >>Just saw on the cybervillage that intel is going to release a
> >>400/450MHz SA-110. Is it possible to use it for a RPC or the Imago. In
> >>other words: has the 400/450MHz version 26bit support? If not, why is
> >>it called SA110?
> >
> > You wouldn't be able to make use of it in a RPC even if it was 26-bit.
> > The 33MHz bus in the RPC would be too much of a bottleneck for it to
> > be worth it.

> Is that 33MHz bus in the J233s? In all the others it's 16MHz.

Erm, no, it's actually 32 MHz. With all RPCs.

> With the slow bus you'd be about right, and with the faster one you're
> still throttled by slow memory. Even so, despite all the "it'll be
> throttled by the bus" talk when I overclocked my SA by 27% I still got
> an average performance increase of 20%. If you want to run rc5 it'd be
> great, otherwise yes the diminishing returns would eventually mean it'd
> be not much better than the ones we've got.

I don't think so. My overclocked SA is almost linear faster - even with
Photodesk and other memory intensive applications. Maybe a 400 MHz SA will
not be twice as fast as a 200 MHz one (in the actual RPC), but nonetheless
it will surely be faster.

But of course inside an Imago such a thing would be even more fun. :-)

--
_ _ | Acorn Risc PC, StrongARM @ 287 MHz
| | | _, _|__|_ |) ' _, , | 256+2 Mbyte RAM, >40 Gbyte HD
| | | / | | | |/\ | / | / \ | ------------------------------------
| | |_/\/|_/|_/|_/| |/|/\/|_/ \/ | http://www.deutschlandwetter.de

dgs

Aug 16, 1999, 3:00:00 AM

In article <50a08d3249%J.B...@pc084085.melgrove.liv.ac.uk>,
J.Bland <J.B...@liv.ac.uk> wrote:

> > You wouldn't be able to make use of it in a RPC even if it was 26-bit.
> > The 33MHz bus in the RPC would be too much of a bottleneck for it to
> > be worth it.
>

> Is that 33MHz bus in the J233s? In all the others it's 16MHz. With the slow

> bus you'd be about right, and with the faster one you're still throttled by
> slow memory. Even so, despite all the "it'll be throttled by the bus" talk
> when I overclocked my SA by 27% I still got an average performance increase
> of 20%.

Quite. ISTR us being told with some certainty that a StrongARM
upgrade without L2 cache on the basic RPC bus would "choke the bus"
and not be worth it. The reality (substantial speed increases) was
rather different.

I suppose with a 400MHz+ part this lack of improvement might be far
more the case than with a 200MHz+ part :-)

With the Imago, rather than the Risc PC, things might be rather more
interesting...

--
d...@argonet.co.uk

Manchester Acorn User Group - http://www.acorn.manchester.ac.uk/

RPC x86 Card Info Pages - http://acorn.cybervillage.co.uk/pccard/

dgs

Aug 16, 1999, 3:00:00 AM

In article <4932981...@suit-u-sir.com>,

Simon E. John <si...@suit-u-sir.com> wrote:

> I have a 202MHz SA running at 258MHz and getting a linear 27% speed increase
> in practically every operation and using most testing proggies like SICK etc.
>
> I doubt that a 450MHz SA would be useful though - unless we get the 64MHz bus
> (why 64 not 66 or 100 as PCs?) of the Millipede etc.

Quite - the Millipede board is the only sensible way to do this, I
very much doubt that there is any point doing it for lesser systems.

But the benefits with that board could be huge! (OK, so it won't be
cheap, at around 1000ukp, but still...)

J.Bland

Aug 17, 1999, 3:00:00 AM

In message <493297271...@t-online.de>
Matthias Seifert <M.Se...@t-online.de> wrote:

> J.Bland <J.B...@liv.ac.uk> wrote:
> > In message <37b80cfe...@news.geccs.gecm.com>
> > chris...@b0gu5-3ma1l.gecm.com (Chris Downs) wrote:
>
> > > On Fri, 13 Aug 1999 16:03:15 +0100, Steven M. Ottens
> > > <ste...@lx.student.wau.nl> wrote:
> > >
> > >>Hello all,
> > >>
> > >>Just saw on the cybervillage that intel is going to release a
> > >>400/450MHz SA-110. Is it possible to use it for a RPC or the Imago. In
> > >>other words: has the 400/450MHz version 26bit support? If not, why is
> > >>it called SA110?
> > >

> > > You wouldn't be able to make use of it in a RPC even if it was 26-bit.
> > > The 33MHz bus in the RPC would be too much of a bottleneck for it to
> > > be worth it.
>
> > Is that 33MHz bus in the J233s? In all the others it's 16MHz.
>

> Erm, no, it's actually 32 MHz. With all RPCs.

Unless this is the old "32MHz in PC terminology" cobblers I must disagree.
It's 16MHz.

>
> > With the slow bus you'd be about right, and with the faster one you're
> > still throttled by slow memory. Even so, despite all the "it'll be
> > throttled by the bus" talk when I overclocked my SA by 27% I still got

> > an average performance increase of 20%. If you want to run rc5 it'd be
> > great, otherwise yes the diminishing returns would eventually mean it'd
> > be not much better than the ones we've got.
>
> I don't think so. My overclocked SA is almost linear faster - even with
> Photodesk and other memory intensive applications. Maybe a 400 MHz SA will
> not be twice as fast as a 200 MHz one (in the actual RPC), but nonetheless
> it will surely be faster.
>
> But of course inside an Imago such a thing would make even more fun. :-)
>

By "ones we have" I meant overclocked. The bus and/or memory is definitely
having an effect; this is why, across a number of quantitatively measured
tests, my SA is on average 7% slower than it should be given a linear
increase in clock speed. This would become more apparent with further
increases. Not actually having a 400MHz part to stick in an RPC and try,
it's hard to say at what point you lose performance gains. I'd guess
somewhere 300MHz+.

Simon E. John

Aug 17, 1999, 3:00:00 AM

In article <ant16050...@ashwood98.freeserve.co.uk>,
Oliver Booth <O_B...@ashwood98.freeserve.co.uk> wrote:

[snip]

> > > Is that 33MHz bus in the J233s?

> > J233 (and all other RiscPC's) clock their busses at 16Mtransfers/second,

> > using an actual clocking rate of 32Mcycles/sec.

> > A7000+ clocks at 64MHz and thus gets 32Mtransfers/sec

> So what exactly does a J233 Motherboard have that a standard Risc PC
> motherboard doesn't?

AFAIK, just RISC OS 3.71 and a 233MHz Rev-S StrongARM plus Java, Browse,
EasiWriter, Universal !Boot.....

--
Simon E. John, BA(Hons)

Email: si...@suit-u-sir.com | 258MHz StrongARM RiscPC
Website: www.suit-u-sir.com | 50Mb RAM, 4.3Gb HD, Zip
PGP 'bot: send...@suit-u-sir.com | 32x CD, USR 56K V.90/X2

I canna do it cap'n, I just doont ha' the pooer!

Matthias Seifert

Aug 17, 1999, 3:00:00 AM

J.Bland <J.B...@liv.ac.uk> wrote:
> In message <493297271...@t-online.de>
> Matthias Seifert <M.Se...@t-online.de> wrote:

> > J.Bland <J.B...@liv.ac.uk> wrote:

[...]

> > > Is that 33MHz bus in the J233s? In all the others it's 16MHz.
> >
> > Erm, no, it's actually 32 MHz. With all RPCs.

> Unless this is the old "32MHz in PC terminology" cobblers I must
> disagree. It's 16MHz.

How?

Well, the TRM of the RPC tells the following tale: "DRAM [...] control and
timing is controlled by a state machine running at 32 MHz. [...] S-cycles
run at 16 MHz, and N-cycles take 2.5 times the S-cycle time [...]. This
means that S-cycles take 2 cycles of the 32 MHz clock, and N-cycles take 5
cycles of the 32 MHz clock."

Do you see it? They speak about a "32 MHz clock" - or how would you
describe that "5 cycles of the 32 MHz clock" with your 16 MHz theory?

[...]

> > My overclocked SA is almost linear faster - even with Photodesk and
> > other memory intensive applications. Maybe a 400 MHz SA will not be
> > twice as fast as a 200 MHz one (in the actual RPC), but nonetheless it
> > will surely be faster.

[...]

> By ones we have I meant overclocked. The bus and/or memory is definitely
> having an effect, this is why across a number of quantitatively measured
> tests my SA is, on average, 7% slower than it should be given a linear
> increase in clock speed.

Didn't I write "is almost linear faster"?
                   ^^^^^^

> This would become more apparent with further increases.

Depending on the things you are doing of course. Code that runs entirely
in the cache(s) will surely still be (almost?) linearly faster.

> Not actually having a 400MHz part to stick in an rpc and try it it's
> hard to say at what point you lose performance gains. I'd guess
> somewhere 300MHz+.

Erm, you mean that above 300 MHz (or so) there will be no performance
increase at all? How should this be the case? It really would be strange
if overclocking to nearly 290 MHz would give _almost_ linear speed
increase but anything above 300 MHz will give no increase at all...

Marko Lukat

Aug 17, 1999, 3:00:00 AM

Matthias Seifert wrote:
[..]

> > Unless this is the old "32MHz in PC terminology" cobblers I must
> > disagree. It's 16MHz.
>
> How?
>
> Well, the TRM of the RPC tells the following tale: "DRAM [...] control and
> timing is controlled by a state machine running at 32 MHz. [...] S-cycles
> run at 16 MHz, and N-cycles take 2.5 times the S-cycle time [...]. This
> means that S-cycles take 2 cycles of the 32 MHz clock, and N-cycles take 5
> cycles of the 32 MHz clock."
>
> Do you see it? They speek about a "32 MHz clock" - or how would you
> describe that "5 cycles of the 32 MHz clock" with your 16 MHz theory?

How do you explain that !SICK reports MCLK (which I refer to as the bus
clock, correct me if I'm wrong here) as being 16MHz (*cache idws)?

Marko

Richard Jozefowski

Aug 17, 1999, 3:00:00 AM

In article <4932e450a...@t-online.de>, Matthias Seifert

<URL:mailto:M.Se...@t-online.de> wrote:
>
> Well, the TRM of the RPC tells the following tale: "DRAM [...] control and
> timing is controlled by a state machine running at 32 MHz. [...] S-cycles
> run at 16 MHz, and N-cycles take 2.5 times the S-cycle time [...]. This
> means that S-cycles take 2 cycles of the 32 MHz clock, and N-cycles take 5
> cycles of the 32 MHz clock."
>
> Do you see it? They speek about a "32 MHz clock" - or how would you
> describe that "5 cycles of the 32 MHz clock" with your 16 MHz theory?

The DRAM state machine is indeed driven by a 32MHz clock, but the ARM
memory clock (MCLK) is driven at 16MHz (ish). I say "ish" because, as you
rightly point out, it is stretched by half a 16MHz period during the N
cycle. This extra half cycle is required in order to meet the DRAM random
access time specification, two cycles at 16MHz not being quite enough.

The upshot is that the memory bus can be said to be run at 16MHz. It's the
MCLK that counts, not the clock rate of the state machine. After all, it
would be possible to run the DRAM state machine at 64MHz, taking 4 clocks
for S-cycles and 10 for N-cycles - I'm afraid this would be a marketing
benefit only!
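
For anyone who wants to check the arithmetic, here is a rough sketch using only the figures from the TRM quote (32MHz state machine, 2 ticks per S-cycle, 5 ticks per N-cycle). The function and constant names are made up for illustration:

```python
def cycle_timing(state_clock_mhz, clocks):
    """Return (duration in ns, equivalent rate in MHz) for a cycle of `clocks` ticks."""
    duration_ns = clocks * 1000.0 / state_clock_mhz
    rate_mhz = state_clock_mhz / clocks
    return duration_ns, rate_mhz


STATE_CLOCK_MHZ = 32  # DRAM state machine clock, from the TRM quote
S_CLOCKS = 2          # ticks per sequential cycle
N_CLOCKS = 5          # ticks per non-sequential cycle

s_ns, s_mhz = cycle_timing(STATE_CLOCK_MHZ, S_CLOCKS)
n_ns, n_mhz = cycle_timing(STATE_CLOCK_MHZ, N_CLOCKS)
print(f"S-cycle: {s_ns} ns ({s_mhz} MHz)")
print(f"N-cycle: {n_ns} ns ({n_mhz} MHz)")

# Doubling the state machine clock without changing the DRAM timing just
# doubles the tick counts; the cycle durations stay exactly the same.
print(cycle_timing(64, 2 * S_CLOCKS), cycle_timing(64, 2 * N_CLOCKS))
```

The point of the last line is that a faster state machine clock on its own changes the numbers on the box, not the time each memory cycle takes.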

--
Richard

Stuart Tyrrell

Aug 17, 1999, 3:00:00 AM

In message <ant17085...@milliped.demon.co.uk>
Richard Jozefowski <ric...@millipede.co.uk> wrote:

> The upshot is that the memory bus can be said to be run at 16MHz.
> It's the MCLK that counts, not the clock rate of the state machine.
> After all, it would be possible to run the DRAM state machine at
> 64MHz, taking 4 clocks for S-cycles and 10 for N-cycles - I'm afraid
> this would be a marketing benefit only!

Haven't PC motherboard manufacturers taken advantage of that marketing
benefit though?

OK, things are confused now with SDRAM, but back in the days of
FPM/EDO we were seeing 33MHz motherboards advertised which were no
more so than the RPC.

Stuart.
--
Stuart Tyrrell Developments Stu...@stdevel.demon.co.uk
PO Box 183, OLDHAM. OL2 8FB http://www.stdevel.demon.co.uk
Tel: 01706 848 600 Orange: 0976 255 256 dFax: 0870 164 1604
** NEW Acorn Trackball UKP34.95 Use PS/2 devices only UKP 24.95 **

Andreas Joos

Aug 17, 1999, 3:00:00 AM

In message <4932981...@suit-u-sir.com>

Simon E. John <si...@suit-u-sir.com> wrote:

>> Is that 33MHz bus in the J233s? In all the others it's 16MHz. With the slow

>> bus you'd be about right, and with the faster one you're still throttled by
>> slow memory. Even so, despite all the "it'll be throttled by the bus" talk
>> when I overclocked my SA by 27% I still got an average performance increase
>> of 20%. If you want to run rc5 it'd be great, otherwise yes the diminishing
>> returns would eventually mean it'd be not much better than the ones we've
>> got.
>

> Erm, I though it was only the A7000+ that had a 33MHz bus - all the RiscPC's,
> even the new RiscPC233T have a 16MHz bus AFAIK.

AFAIK RiscPCs run with a 32MHz bus, but with 1 wait state every 2nd cycle
(4-2-2-2 bursts) for a 16-byte access (maybe only 4 wait states at the
start of each sequential access, and not every 16 bytes).

Bye, Andreas Joos

J.Bland

Aug 17, 1999, 3:00:00 AM

In message <4932e450a...@t-online.de>
Matthias Seifert <M.Se...@t-online.de> wrote:

> J.Bland <J.B...@liv.ac.uk> wrote:
> > In message <493297271...@t-online.de>
> > Matthias Seifert <M.Se...@t-online.de> wrote:
>
> > > J.Bland <J.B...@liv.ac.uk> wrote:
>
> [...]
>

> > > > Is that 33MHz bus in the J233s? In all the others it's 16MHz.
> > >

> > > Erm, no, it's actually 32 MHz. With all RPCs.
>

> > Unless this is the old "32MHz in PC terminology" cobblers I must
> > disagree. It's 16MHz.
>
> How?
>

> Well, the TRM of the RPC tells the following tale: "DRAM [...] control and
> timing is controlled by a state machine running at 32 MHz. [...] S-cycles
> run at 16 MHz, and N-cycles take 2.5 times the S-cycle time [...]. This
> means that S-cycles take 2 cycles of the 32 MHz clock, and N-cycles take 5
> cycles of the 32 MHz clock."
>
> Do you see it? They speek about a "32 MHz clock" - or how would you
> describe that "5 cycles of the 32 MHz clock" with your 16 MHz theory?
>

> [...]
>

The bus is clocked at 32MHz; what you actually get out of it is 16MHz (or
6.4 for N). You can quite happily state that "the bus runs at" 16 or 32, but
as we're never getting 32MHz out of it, it seems a bad misnomer to use that
number. Or maybe everything I've ever seen about the RiscPC has been lying
through its teeth and the 64MHz bus of the Phoebe wasn't that much of an
improvement after all.

> > > My overclocked SA is almost linear faster - even with Photodesk and
> > > other memory intensive applications. Maybe a 400 MHz SA will not be
> > > twice as fast as a 200 MHz one (in the actual RPC), but nonetheless it
> > > will surely be faster.
>
> [...]
>
> > By ones we have I meant overclocked. The bus and/or memory is definitely
> > having an effect, this is why across a number of quantitatively measured
> > tests my SA is, on average, 7% slower than it should be given a linear
> > increase in clock speed.
>
> Didn't I write "is almost linear faster"?
> ^^^^^^

Almost isn't quantitative ;).

> > This would become more apparent with further increases.
>
> Depending on the things you are doing of course. Code that runs entirely
> in the cache(s) will surely still be (almost?) linearly faster.
>

As I'd already said.

> > Not actually having a 400MHz part to stick in an rpc and try it it's
> > hard to say at what point you lose performance gains. I'd guess
> > somewhere 300MHz+.
>
> Erm, you mean that above 300 MHz (or so) there will be no performance
> increase at all? How should this be the case? It really would be strange
> if overclocking to nearly 290 MHz would give _almost_ linear speed
> increase but anything above 300 MHz will give no increase at all...
>

I mean that at 257.6MHz my tests showed that the improvement across a range
of procedures from the 202.4MHz setting was a 20% increase against the
theoretical 27%. That is already about a quarter of the expected gain lost,
definitely not insignificant I would think. And this can only get worse as
you go higher.
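
The figures above can be put side by side with a small sketch. It uses only the two clock settings and the measured 20% from this post; the function name and the dictionary keys are invented for illustration:

```python
def overclock_summary(base_mhz, new_mhz, measured_gain):
    """Compare a measured speed-up with the ideal (linear) one.

    measured_gain is a fraction, e.g. 0.20 for a 20% increase.
    """
    ideal_gain = new_mhz / base_mhz - 1.0
    return {
        "ideal_gain": ideal_gain,
        # Share of the ideal gain that was actually realised.
        "gain_realised": measured_gain / ideal_gain,
        # How much slower the machine is than a perfectly scaling one.
        "shortfall_vs_linear": 1.0 - (1.0 + measured_gain) / (1.0 + ideal_gain),
    }


result = overclock_summary(202.4, 257.6, 0.20)
for name, value in result.items():
    print(f"{name}: {value:.1%}")
```

Which of the three numbers you quote depends on what you compare against, which is probably why the percentages in this thread look further apart than the measurements really are.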

If bus bandwidth had such little effect as you appear to be stating we
wouldn't be needing all these whizzy new motherboards.

Simon E. John

Aug 17, 1999, 3:00:00 AM

In article <4932a0...@argonet.co.uk>,
dgs <d...@argonet.co.uk> wrote:

[snip]

> > I doubt that a 450MHz SA would be useful though - unless we get the 64MHz
> > bus (why 64 not 66 or 100 as PCs?) of the Millipede etc.

> Quite - the Millipede board is the only sensible way to do this, I
> very much doubt that there is any point doing it for lesser systems.

> But the benefits with that board could be huge! (OK, so it won't be
> cheap, at around 1000ukp, but still...)

But cheaper and more effective than 32 SAs - plus we'd be able to run RISC
OS instead of RiscBSD.

I wonder if it would have any form of FPU?...... ;o)

--
Simon E. John, BA(Hons)

Email: si...@suit-u-sir.com | 258MHz StrongARM RiscPC
Website: www.suit-u-sir.com | 50Mb RAM, 4.3Gb HD, Zip
PGP 'bot: send...@suit-u-sir.com | 32x CD, USR 56K V.90/X2

I'm givin' ya all she's got cap'n.

Matthias Seifert

Aug 17, 1999, 3:00:00 AM

Marko Lukat <marko...@tao-group.com> wrote:
> Matthias Seifert wrote:
> [..]

> > > Unless this is the old "32MHz in PC terminology" cobblers I must
> > > disagree. It's 16MHz.
> >
> > How?
> >
> > Well, the TRM of the RPC tells the following tale: "DRAM [...] control
> > and timing is controlled by a state machine running at 32 MHz. [...]
> > S-cycles run at 16 MHz, and N-cycles take 2.5 times the S-cycle time
> > [...]. This means that S-cycles take 2 cycles of the 32 MHz clock, and
> > N-cycles take 5 cycles of the 32 MHz clock."
> >
> > Do you see it? They speek about a "32 MHz clock" - or how would you
> > describe that "5 cycles of the 32 MHz clock" with your 16 MHz theory?

> How do you explain that !SICK reports MCLK (which I refer to as the bus
> clock, correct me if I'm wrong here) as being 16MHz (*cache idws)?

> Marko

Well, if you take another look you surely will see the addendum '(reported
by OS)'. :-)

Matthias Seifert

Aug 17, 1999, 3:00:00 AM

J.Bland <J.B...@liv.ac.uk> wrote:

[...]

> If bus bandwidth had such little effect as you appear to be stating we
> wouldn't be needing all these whizzy new motherboards.

And if bus bandwidth had such huge effect as you appear to be stating we
wouldn't be using a SA at all. ;-)

But don't get me wrong. Surely the bus bandwidth has a big effect and I
have no doubt that the Imago will be a lot faster than the RPC. Maybe an
Imago with SA/233 would even be faster than a RPC with SA/450 (in most
cases). But nonetheless I'm very sure that a RPC with a SA/450 would be
considerably faster than a RPC with SA/233 or even an overclocked SA/287.
And the speed increase over a SA/233 must be anything between 0% (for simple
copying of large RAM areas [byte by byte]) and 93% (for code that runs
entirely in the caches). How close you get to 0% or 93% depends on what
you are using your RPC for of course...

And I would like to get a SA/450 for my RPC _and_ an additional Imago
(with SA/450). :-)
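
The 0% to 93% range above can be sketched with a simple Amdahl-style model. This is only an illustration: it assumes run time splits cleanly into a part that scales with the core clock (code running from cache) and a part that does not (waiting on the bus). The function name and the sample fractions are made up for the example; 450/233 is roughly 1.93, which is where the 93% upper bound comes from.

```python
def overall_speedup(cached_fraction: float, clock_ratio: float) -> float:
    """Estimate speedup when only the cache-resident share of run time
    scales with the core clock.

    cached_fraction: share of the original run time spent running from
                     cache (0.0 to 1.0).
    clock_ratio:     new core clock divided by old core clock.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    memory_bound = 1.0 - cached_fraction        # unchanged: limited by the bus
    core_bound = cached_fraction / clock_ratio  # shrinks with the faster clock
    return 1.0 / (memory_bound + core_bound)


ratio = 450 / 233  # SA/450 versus SA/233
for fraction in (0.0, 0.5, 0.9, 1.0):
    gain = (overall_speedup(fraction, ratio) - 1.0) * 100
    print(f"{fraction:.0%} of time in cache -> {gain:.0f}% faster")
```

Real code will not split this neatly (the SA's write buffer and read-ahead blur the line), so the model gives bounds rather than a prediction.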

Matthias Seifert

Aug 17, 1999, 3:00:00 AM

Andreas Joos <Gr...@AmiUni.Au.S.Shuttle.de> wrote:
> In message <4932981...@suit-u-sir.com>
> Simon E. John <si...@suit-u-sir.com> wrote:

[...]

> > Erm, I thought it was only the A7000+ that had a 33MHz bus - all the
> > RiscPC's, even the new RiscPC233T have a 16MHz bus AFAIK.

> AFAIK RiscPCs run with a 32 MHz bus, but with 1 waitstate every 2nd cycle
> (4-2-2-2 bursts) for a 16 byte access

Quite, it's 5-2-2-2 (and thus it can't be easily translated to 16 MHz).

> (maybe only 4 waitstates for every start of a sequential access, and not
> every 16 bytes).

Indeed, so you can even have 5-2-2-2-2-2-2-2-2-2-2-2-2-2-2-2.
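
To put the 5-2-2-2 pattern into numbers, here is a rough sketch that turns a burst pattern into a data rate. It assumes a 32 MHz state machine clock and one 32-bit word per transfer (four transfers for a 16 byte access, as quoted above); the function and constant names are invented for the example.

```python
STATE_CLOCK_HZ = 32_000_000  # DRAM state machine clock quoted from the TRM
BYTES_PER_TRANSFER = 4       # assumption: one 32-bit word per memory cycle


def burst_rate_mb_per_s(clocks_per_transfer):
    """Return the data rate in MB/s (10**6 bytes per second) for one burst.

    clocks_per_transfer lists the state machine clocks each transfer takes,
    e.g. [5, 2, 2, 2] for one N-cycle followed by three S-cycles.
    """
    total_clocks = sum(clocks_per_transfer)
    seconds = total_clocks / STATE_CLOCK_HZ
    total_bytes = BYTES_PER_TRANSFER * len(clocks_per_transfer)
    return total_bytes / seconds / 1_000_000


short_burst = [5, 2, 2, 2]   # 16 byte access
long_burst = [5] + [2] * 15  # 64 byte sequential access

print(f"5-2-2-2:          {burst_rate_mb_per_s(short_burst):.1f} MB/s")
print(f"5 then 15 x 2:    {burst_rate_mb_per_s(long_burst):.1f} MB/s")
print(f"S-cycles only:    {burst_rate_mb_per_s([2]):.1f} MB/s")
```

The longer the sequential run, the closer the rate gets to the S-cycle-only ceiling, which is why a single MHz figure describes the bus so poorly.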

Richard Jozefowski

Aug 17, 1999, 3:00:00 AM

In article <4933198...@suit-u-sir.com>, Simon E. John

<URL:mailto:si...@suit-u-sir.com> wrote:
> In article <4932a0...@argonet.co.uk>,
> dgs <d...@argonet.co.uk> wrote:
>
> [snip]
>
> > > I doubt that a 450MHz SA would be useful though - unless we get the 64MHz
> > > bus (why 64 not 66 or 100 as PCs?) of the Millipede etc.
>
> > Quite - the Millipede board is the only sensible way to do this, I
> > very much doubt that there is any point doing it for lesser systems.
>
> > But the benefits with that board could be huge! (OK, so it won't be
> > cheap, at around 1000ukp, but still...)
>
> But cheaper and more effective than a 32 SA's - plus we'd be able to run RISC
> OS instead of RiscBSD.
>
> I wonder if it would have any form of FPU?...... ;o)
>

Not that can of worms again. I'll pretend I didn't read that!

--
Richard

Matthias Seifert

Aug 17, 1999, 3:00:00 AM

Richard Jozefowski <ric...@millipede.co.uk> wrote:
> In article <4932e450a...@t-online.de>, Matthias Seifert
> <URL:mailto:M.Se...@t-online.de> wrote:

[...]

> > Do you see it? They speak about a "32 MHz clock" - or how would you
> > describe that "5 cycles of the 32 MHz clock" with your 16 MHz theory?

> The DRAM state machine is indeed driven by a 32MHz clock, but the
> ARM memory clock (MCLK) is driven at 16MHz (ish). I say "ish" because,
> as you rightly point out, it is stretched by half a 16MHz period during
> the N cycle. This extra half cycle is required in order to meet the
> DRAM random access time specification, two cycles at 16MHz not being
> quite enough.

> The upshot is that the memory bus can be said to be run at 16MHz. It's
> the MCLK that counts, not the clock rate of the state machine. After
> all, it would be possible to run the DRAM state machine at 64MHz, taking
> 4 clocks for S-cycles and 10 for N-cycles - I'm afraid this would be a
> marketing benefit only!

Well, I didn't start this MHz thing. I would never do this, as
comparing MHz of RAM makes as much sense as comparing MHz of processors.
It would make much more sense to talk about data transfer rates instead...
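
The point that the state machine clock is arbitrary can be checked with a little arithmetic. The sketch below takes the cycle lengths from the TRM quote (S-cycle = one 16 MHz period, N-cycle = 2.5 S-cycle times) and works out how many state machine clocks each needs at a given clock rate. Names are illustrative only.

```python
from fractions import Fraction

S_CYCLE_NS = Fraction(125, 2)             # 62.5 ns, one period of 16 MHz
N_CYCLE_NS = S_CYCLE_NS * Fraction(5, 2)  # 2.5 S-cycle times, per the TRM


def clocks_needed(cycle_ns, state_clock_mhz):
    """Number of state machine clocks spanning a memory cycle of cycle_ns."""
    period_ns = Fraction(1000, state_clock_mhz)
    return cycle_ns / period_ns


for mhz in (16, 32, 64):
    s = clocks_needed(S_CYCLE_NS, mhz)
    n = clocks_needed(N_CYCLE_NS, mhz)
    print(f"{mhz} MHz state machine: S-cycle = {s} clocks, N-cycle = {n} clocks")
```

At 16 MHz the N-cycle comes out as a non-whole number of clocks, which would explain why the state machine needs at least a 32 MHz clock even though the memory cycles themselves are no faster.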

oliverc

Aug 17, 1999, 3:00:00 AM

In message <37B91A1F...@tao-group.com>
Marko Lukat <marko...@tao-group.com> wrote:

> Matthias Seifert wrote:
> [..]
> > > Unless this is the old "32MHz in PC terminology" cobblers I must
> > > disagree. It's 16MHz.
> >
> > How?
> >
> > Well, the TRM of the RPC tells the following tale: "DRAM [...] control
> > and timing is controlled by a state machine running at 32 MHz. [...]
> > S-cycles run at 16 MHz, and N-cycles take 2.5 times the S-cycle time
> > [...]. This means that S-cycles take 2 cycles of the 32 MHz clock, and
> > N-cycles take 5 cycles of the 32 MHz clock."
> >
> > Do you see it? They speak about a "32 MHz clock" - or how would you
> > describe that "5 cycles of the 32 MHz clock" with your 16 MHz theory?
>
> How do you explain that !SICK reports MCLK (which I refer to as the bus
> clock, correct me if I'm wrong here) as being 16MHz (*cache idws)?

I just tried this "*cache idws" thingy, and hey presto, my riscpc ran
at half the speed....

Incidentally, why would anybody want to do this so badly that they actually
implement a *command to do it?

Cheers,
Oli

--
Another gem from the ZapEmail default signatures file.
Worse things happen in C.

J.Bland

Aug 18, 1999, 3:00:00 AM

In message <4933247e8...@t-online.de>
Matthias Seifert <M.Se...@t-online.de> wrote:

> J.Bland <J.B...@liv.ac.uk> wrote:
>
> [...]
>
> > If bus bandwidth had such little effect as you appear to be stating we
> > wouldn't be needing all these whizzy new motherboards.
>
> And if bus bandwidth had such huge effect as you appear to be stating we
> wouldn't be using a SA at all. ;-)
>

Not at all, the SAs we have seem to do ok, but they are close to the limit of
performance that the hardware can support. Increase them by even moderate
amounts and you find the cpu being choked. It's a fact.

> But don't get me wrong. Surely the bus bandwidth has a big effect and I
> have no doubt that the Imago will be a lot faster than the RPC. Maybe an
> Imago with SA/233 would even be faster than a RPC with SA/450 (in most
> cases). But nonetheless I'm very sure that a RPC with a SA/450 would be
> considerably faster than a RPC with SA/233 or even an overclocked SA/287.
> And the speed increase over a SA/233 must be anything between 0% (for simple
> copying of large RAM areas [byte by byte]) and 93% (for code that runs
> entirely in the caches). How close you get to 0% or 93% depends on what
> you are using your RPC for of course...
>
> And I would like to get a SA/450 for my RPC _and_ an additional Imago
> (with SA/450). :-)
>

A 450 will be a *lot* faster than a 233, *fairly* faster than a 287. No
real-life code sits permanently in cache particularly when multitasking so
bus/memory will strip off a lot of the performance. Given a choice between
233 and 450 I'd take 450 cos it *will* be faster, but unless it's cheap it
probably wouldn't be worth it for most people except just to say "I have a
450MHz cpu".

Steffen Huber

Aug 18, 1999, 3:00:00 AM

J.Bland wrote:
> In message <4933247e8...@t-online.de>
> Matthias Seifert <M.Se...@t-online.de> wrote:
> > J.Bland <J.B...@liv.ac.uk> wrote:
[...]
> > > If bus bandwidth had such little effect as you appear to be stating we
> > > wouldn't be needing all these whizzy new motherboards.
> > And if bus bandwidth had such huge effect as you appear to be stating we
> > wouldn't be using a SA at all. ;-)
> Not at all, the SAs we have seem to do ok, but they are close to the limit of
> performance that the hardware can support. Increase them by even moderate
> amounts and you find the cpu being choked. It's a fact.

Where does this fact come from?

The only facts available to me are:
- upclocking an SA from 200 to <any higher speed> in a current RPC leads
to a speedup
- the speedup is almost linear, even when using memory-intensive things
like Photodesk and GNAT

So, if you want to prove your opinion, come up with some contradicting
facts. Like e.g. what the cache hit/miss ratio is in a typical RISC OS
desktop environment. Which applications didn't scale with the SA
frequency.

It is interesting to note how little effect SDRAM had in the PC world.
When it comes to benchmarks - big difference, because the benchmark tries
to measure the raw speed without the effect of the various cache levels.
When it comes to real world desktop performance - well, a little faster...

Yes, there are situations where faster memory in a Risc PC could help a
lot. Things like Games come to my mind, where you want to shift a lot of
memory around. However, if you are doing things cleverly, you can do
real work during the memory reads/writes due to the SA architecture, so
you would gain not much by having the reads/writes completed faster.

BTW, even the ARM710 was too fast considering the memory speed in a
Risc PC...yet it was much faster than the ARM610 (which was also too
fast ;-)).

So long, Steffen

--
Steffe...@icongmbh.de hub...@lcs.wn.bawue.de
GCC for RISC OS - http://www.arcsite.de/hp/gcc/

Marko Lukat

Aug 18, 1999, 3:00:00 AM

Matthias Seifert wrote:
[..]

> > > Do you see it? They speak about a "32 MHz clock" - or how would you
> > > describe that "5 cycles of the 32 MHz clock" with your 16 MHz theory?
>
> > How do you explain that !SICK reports MCLK (which I refer to as the bus
> > clock, correct me if I'm wrong here) as being 16MHz (*cache idws)?
>

> > Marko
>
> Well, if you take another look you surely will see the addendum '(reported
> by OS)'. :-)

The CPU clock ??? I thought you actually measure it ... ;)

Marko

J.Bland

Aug 18, 1999, 3:00:00 AM

> >> And if bus bandwidth had such huge effect as you appear to be stating we
> >> wouldn't be using a SA at all. ;-)
> >Not at all, the SAs we have seem to do ok, but they are close to the limit
> >of performance that the hardware can support. Increase them by even
> >moderate amounts and you find the cpu being choked. It's a fact.
>
> Where does this fact come from?
>
> The only facts available to me are:
> - upclocking an SA from 200 to <any higher speed> in a current RPC leads
> to a speedup
> - the speedup is almost linear, even when using memory-intensive things
> like Photodesk and GNAT
>
> So, if you want to prove your opinion, come up with some contradicting
> facts. Like e.g. what the cache hit/miss ratio is in a typical RISC OS
> desktop environment. Which applications didn't scale with the SA
> frequency.

It's at times like these I wish I still had that rough piece of paper I
jotted all the figures down on when I was upclocking for the first time.

But anyway, my feeble memory remembers the following;

BASIC, no screen output: *almost* linear
BASIC, screen output:    less than linear
ARCQuake:                much less than linear
FastQuake:               also much less than linear

There were others (desktop apps too), with a range of responses. The overall
effect was that at 256.7MHz there was an *average* loss of performance of 35%
compared to the "linear" response expected. Whether this is bus or memory or
even a dodgy SA I wouldn't like to say for definite, but *something* was
slowing it, and I call 35% significant. I would estimate that this loss would
increase with clock speed. It could be 35% at all higher frequencies for all
I know, cos I don't have them to play with, but I doubt it.

I did the tests, I worked out the numbers and I believe them. Feel free to
believe what you want, I'm saying no more.

Shrike

ps And I did them at a range of frequencies too, those results would be damn
interesting right now, but I'm not tearing my rpc to bits to do em again ;)
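
For anyone trying to reproduce the 35% figure: taking the numbers quoted earlier in the thread (a measured 20% gain against a theoretical 27%), the shortfall can be expressed in several ways, and they give quite different percentages. The sketch below computes each. The thread does not say which definition was used, so treating 35% as "gap divided by measured gain" is only one possible reading; the function and key names are made up for the example.

```python
def shortfall(measured_gain, theoretical_gain):
    """Express the gap between measured and theoretical speed gain in
    several ways. Gains are fractions, e.g. 0.20 for a 20% increase."""
    gap = theoretical_gain - measured_gain
    return {
        "percentage_points": gap * 100,
        "relative_to_theoretical_gain": gap / theoretical_gain,
        "relative_to_measured_gain": gap / measured_gain,
        "overall_speed_vs_ideal": (1 + measured_gain) / (1 + theoretical_gain),
    }


result = shortfall(measured_gain=0.20, theoretical_gain=0.27)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Whichever definition is preferred, comparing results between posters only works if everyone uses the same one.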

Matthias Seifert

Aug 18, 1999, 3:00:00 AM

J.Bland <J.B...@liv.ac.uk> wrote:
> In message <4933247e8...@t-online.de>
> Matthias Seifert <M.Se...@t-online.de> wrote:

> > J.Bland <J.B...@liv.ac.uk> wrote:
> >
> > [...]
> >
> > > If bus bandwidth had such little effect as you appear to be stating
> > > we wouldn't be needing all these whizzy new motherboards.

> >
> > And if bus bandwidth had such huge effect as you appear to be stating
> > we wouldn't be using a SA at all. ;-)
> >

> Not at all, the SAs we have seem to do ok, but they are close to the
> limit of performance that the hardware can support. Increase them by
> even moderate amounts and you find the cpu being choked. It's a fact.

Well, you could say the same about the ARM610 or ARM710 (i.e. even if you
upclock these you will not get an exact linear speed increase) - and
nonetheless the (actual) SA _is_ considerably faster...

Dave Cooper

Aug 18, 1999, 3:00:00 AM

In article <37BA6C38...@icongmbh.de>, Steffen Huber

<Steffe...@icongmbh.de> wrote:
>
> BTW, even the ARM710 was too fast considering the memory speed in a
> Risc PC...yet it was much faster than the ARM610 (which was also too
> fast ;-)).
>
> So long, Steffen
>

I remember being disappointed with the 710 upgrade. I think it was supposed
to be about 25 percent but I hardly noticed it.

The upgrade to StrongArm was stunning in comparison.

But let's hope there is a 'usable' 450 SA-110 to come.

Regards, Dave C.

--
__ __ __ __ __ ___ ________________________________________________
|__||__)/ __/ \|\ ||_ | / StrongArm Risc Pc (586x100PcCard)in ZFC & MAUG
| || \\__/\__/| \||__ |/ RINGS:- Acorn,SF Review & Classical Music.Also
_________________________/ Satellite,SF,DTP & AV.Email: d...@argonet.co.uk
Homepage (inc.free photos) http://www.argonet.co.uk/users/dac/index.html

Matthias Seifert

Aug 18, 1999, 3:00:00 AM

Erm, what? You talked about MCLK, which is not measured but reported by
the OS (as !SICK tells by itself). The CPU clock on the other hand _is_
measured by !SICK (and it does this quite well I have to add).

Ah, now I see what you mean: You try to argue that the SA with the cache
set to "idws" runs synchronous with the RAM and thus the measured CPU
clock should be identical to MCLK...

Well, I don't know if this should work. At least I don't believe that the
SA really is able to access the RAM synchronous. And even if this is the
case, the routine which measures the clock speed isn't meant for
measuring RAM performance and thus may well get these things wrong...

At least !SICK doesn't report 16 MHz but 15.92 MHz (with my machine) which
is neither 16 MHz nor 32 MHz... ;-)

What I wonder about is the fact that it runs that slow at all; as the
Icache is still enabled I would have guessed that it runs at full speed (as
the 'processor clock measuring routine' sits completely in the Icache)... 8-}

J.Bland

Aug 18, 1999, 3:00:00 AM

In message <493362b12...@t-online.de>
Matthias Seifert <M.Se...@t-online.de> wrote:

> J.Bland <J.B...@liv.ac.uk> wrote:
> > In message <4933247e8...@t-online.de>
> > Matthias Seifert <M.Se...@t-online.de> wrote:
>
> > > J.Bland <J.B...@liv.ac.uk> wrote:
> > >
> > > [...]
> > >
> > > > If bus bandwidth had such little effect as you appear to be stating
> > > > we wouldn't be needing all these whizzy new motherboards.
> > >
> > > And if bus bandwidth had such huge effect as you appear to be stating
> > > we wouldn't be using a SA at all. ;-)
> > >
>
> > Not at all, the SAs we have seem to do ok, but they are close to the
> > limit of performance that the hardware can support. Increase them by
> > even moderate amounts and you find the cpu being choked. It's a fact.
>
> Well, you could say the same about the ARM610 or ARM710 (i.e. even if you
> upclock these you will not get an exact linear speed increase) - and
> nonetheless the (actual) SA _is_ considerably faster...
>

A SA isn't the same as an ARM710 with a faster clock though is it, it's
different in its architecture and caching. We're not replacing a SA with an
improved SA, just one with a higher clockrate.

Shrike

Matthias Seifert

Aug 19, 1999, 3:00:00 AM

J.Bland <J.B...@liv.ac.uk> wrote:

[...]

> > Well, you could say the same about the ARM610 or ARM710 (i.e. even if
> > you upclock these you will not get an exact linear speed increase) -
> > and nonetheless the (actual) SA _is_ considerably faster...
> >

> A SA isn't the same as an ARM710 with a faster clock though is it, it's
> different in its architecture and caching.

Then again, in several aspects a SA is _worse_ than an ARM710...

Marko Lukat

Aug 19, 1999, 3:00:00 AM

Matthias Seifert wrote:
[..]

> Erm, what? You talked about MCLK, which is not measured but reported by
> the OS (as !SICK tells by itself). The CPU clock on the other hand _is_
> measured by !SICK (and it does this quite well I have to add).

Erm, the 's' parameter disables core clock switching for the SA, which means
that the core logic is tied to MCLK instead of being switched between the
high-speed clock DCLK and the memory clock MCLK (SA-110 reference manual).
This - of course - only happens after the next cache miss. (Option
'f' re-enables switching; note that 'fs' resolves to 's', but parameter
handling wasn't one of Acorn's strengths :)

> Ah, now I see what you mean: You try to argue that the SA with the cache
> set to "idws" runs synchronous with the RAM and thus the measured CPU
> clock should be identical to MCLK...

's' does not mean synchronous.

> Well, I don't know if this should work. At least I don't believe that the
> SA really is able to access the RAM synchronous. And even if this is the

(see above)

> What I wonder about is the fact that it runs that slow at all, as the
> Icache is still enabled I had guessed that it runs with full speed (as the
> 'processor clock measuring routine' sits completely in the Icache)... 8-}

... think again :)

Marko

Matthias Seifert

Aug 19, 1999, 3:00:00 AM

Marko Lukat <marko...@tao-group.com> wrote:
> Matthias Seifert wrote:
> [..]
> > Erm, what? You talked about MCLK, which is not measured but reported by
> > the OS (as !SICK tells by itself). The CPU clock on the other hand _is_
> > measured by !SICK (and it does this quite well I have to add).

> Erm, the 's' parameter disables core clock switching for the SA, which means
> that the core logic is tied to MCLK instead of being switched between the
> high-speed clock DCLK and the memory clock MCLK (SA-110 reference manual).
> This - of course - only happens after the next cache miss. (Option
> 'f' re-enables switching; note that 'fs' resolves to 's', but parameter
> handling wasn't one of Acorn's strengths :)

And what would 'sf' do? ;-)

> > Ah, now I see what you mean: You try to argue that the SA with the
> > cache set to "idws" runs synchronous with the RAM and thus the
> > measured CPU clock should be identical to MCLK...

> 's' does not mean synchronous.

Ah.

But nonetheless the measured CPU clock is (most probably) quite useless to
determine MCLK characteristics...

Marko Lukat

Aug 19, 1999, 3:00:00 AM

Matthias Seifert wrote:
[..]

> > Erm, the 's' parameter disables core clock switching for the SA, which
> > means that the core logic is tied to MCLK instead of being switched
> > between the high-speed clock DCLK and the memory clock MCLK (SA-110
> > reference manual). This - of course - only happens after the next cache
> > miss. (Option 'f' re-enables switching; note that 'fs' resolves to 's',
> > but parameter handling wasn't one of Acorn's strengths :)
>
> And what would 'sf' do? ;-)

Knowing you a bit leads me to the conclusion that you're not taking
this seriously ;)

Marko

David J. Ruck

Aug 19, 1999, 3:00:00 AM

In article <493397465...@t-online.de>, Matthias Seifert
<URL:mailto:M.Se...@t-online.de> wrote:

> Marko Lukat <marko...@tao-group.com> wrote:
> Ah, now I see what you mean: You try to argue that the SA with the cache
> set to "idws" runs synchronous with the RAM and thus the measured CPU
> clock should be identical to MCLK...

Um, no, the MCLK on the SA is 66MHz (well, 64MHz), but the large amount
of logic on the SA card interfaces this onto the Risc PC's bus.

Which isn't 32MHz or 16MHz; it's variable depending on what you are talking
to, with a maximum transfer rate of 16M transfers per second.

It's not to be confused with the DRAM subsystem, which uses a 32MHz clock, but
that's hanging off IOMD, not the main system bus.

Cheers
---druck

David Watson

Aug 19, 1999, 3:00:00 AM

In message <ant17192...@milliped.demon.co.uk>
Richard Jozefowski <ric...@millipede.co.uk> wrote:

Having a FPU would be nice though. Isn't one of the advantages of having one
that you can perform 2 calculations at the same time? One fixed point and
one float? Now, how can that be a bad thing? It would also make the porting
of programs from other platforms easier. Is it true that the real reason that
we never get FPUs is that they are against the RISC philosophy?

Dave
--
____ _ __
/ __/__ ____(_) /____ __ _ ___ ____
_\ \/ _ \/ __/ / __/ -_) ' \/ _ `/ _ \
/___/ .__/_/ /_/\__/\__/_/_/_/\_,_/_//_/
/ / http://come.to/daves-place
Move any faster and you'll break into a standstill.

mike

Aug 20, 1999, 3:00:00 AM

J.Bland wrote:
>
> > >> And if bus bandwidth had such huge effect as you appear to be stating we
> > >> wouldn't be using a SA at all. ;-)
> > >Not at all, the SAs we have seem to do ok, but they are close to the limit
> > >of performance that the hardware can support. Increase them by even
> > >moderate amounts and you find the cpu being choked. It's a fact.
> >

> > Where does this fact come from?
> >
> > The only facts available to me are:
> > - upclocking an SA from 200 to <any higher speed> in a current RPC leads
> > to a speedup
> > - the speedup is almost linear, even when using memory-intensive things
> > like Photodesk and GNAT
> >
> > So, if you want to prove your opinion, come up with some contradicting
> > facts. Like e.g. what the cache hit/miss ratio is in a typical RISC OS
> > desktop environment. Which applications didn't scale with the SA
> > frequency.
>
> It's at times like these I wish I still had that rough piece of paper I
> jotted all the figures down on when I was upclocking for the first time.
>
> But anyway, my feeble memory remembers the following;
>

[I think I have missed the start of this thread]

Perhaps, if the next generation of RISC OS machines have PCI slots, the
Processor -> Main memory speed would be even less important. A dedicated
graphics card (2d or 3d or both) would perform most of its operations
within its own local memory. The processor is then freed up to do what
it is good at, running code.

A possible solution for the Risc PC would be a podule with a standard
graphics chip set. Games and graphics intensive apps could take
advantage of this. A subset of RISC OS screen redrawing functions could
be patched to use the card. A faster (400MHz+) processor, the slow
memory bandwidth and the accelerated graphics could work together fine.

Could the direct graphics writes of older apps be trapped and
redirected to the RAM on the hardware? Perhaps the thing could use a
pass through like on my 3dfx card. Only the desktop and specially written
software would use the new hardware. Can anybody tell me, is anything
being done in this direction?

RISC OS 4 sales show us that many RISC OS users are willing (and able!)
to pay a high price to keep their machine afloat.
--
Cheers

Mike
...Congratulations, you've discovered the secret message
Fido 2:2502/40.2
www.unmusic.freeserve.co.uk - Home recording, books
and furious political thought.

Marko Lukat

Aug 20, 1999, 3:00:00 AM

David J. Ruck wrote:
[..]

> Um, no the MCLK on the SA is 66MHz (well 64MHz), but the large amount
> of logic on the SA card interfaces this onto the Risc PC's bus.

Hard to believe. When I switch the SA core logic to MCLK it runs with
a bit less than 16MHz. Where are the missing 48MHz then?

Marko

Richard Jozefowski

Aug 20, 1999, 3:00:00 AM

In article <37BD1A04...@tao-group.com>, Marko Lukat

Have I had a bad day or is this thread getting tiresome?

On the Risc PC the MCLK frequency is nominally 16MHz. I say nominally
because the period is 62.5ns for S-cycles, but 93.75ns for one of the two
N-cycles.

It is *NOT* 64MHz. The StrongARM is run in asynchronous mode, using the
motherboard MCLK. The motherboard MCLK remains the same whether you plug in
an ARM610 or StrongARM. If the StrongARM card did derive MCLK locally it
would need to drive this onto the motherboard, disabling the normal
motherboard MCLK. This is not possible.

QED (I hope).

--
Richard
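
A quick way to see why 16MHz is only "nominal": the sketch below averages the MCLK rate over a burst, using the periods given above (62.5ns normal, 93.75ns stretched). It assumes an N-cycle access takes two MCLK periods, one of them stretched, which is how the earlier "two cycles at 16MHz not being quite enough" remark reads; the function name is invented for the example.

```python
S_PERIOD_NS = 62.5           # normal MCLK period (16 MHz)
STRETCHED_PERIOD_NS = 93.75  # the stretched period within an N-cycle access

# Assumption: one N-cycle access = one normal period + one stretched period.
N_ACCESS_NS = S_PERIOD_NS + STRETCHED_PERIOD_NS


def average_mclk_mhz(s_cycles_after_n):
    """Average MCLK frequency over one N-cycle access followed by the given
    number of S-cycles."""
    periods = 2 + s_cycles_after_n
    total_ns = N_ACCESS_NS + s_cycles_after_n * S_PERIOD_NS
    return periods / total_ns * 1000.0


print(f"N-cycle access length: {N_ACCESS_NS} ns "
      f"({N_ACCESS_NS / S_PERIOD_NS} S-cycle times)")
for s_cycles in (0, 3, 15, 255):
    print(f"N + {s_cycles} S-cycles: {average_mclk_mhz(s_cycles):.2f} MHz average")
```

The average only approaches 16MHz for long sequential runs; anything with frequent non-sequential accesses sits noticeably below it.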

Matthias Seifert

Aug 21, 1999, 3:00:00 AM

David Watson <9404...@udcf.gla.ac.uk> wrote:
> In message <ant17192...@milliped.demon.co.uk>
> Richard Jozefowski <ric...@millipede.co.uk> wrote:

> > In article <4933198...@suit-u-sir.com>, Simon E. John
> > <URL:mailto:si...@suit-u-sir.com> wrote:
> > > In article <4932a0...@argonet.co.uk>,
> > > dgs <d...@argonet.co.uk> wrote:

> > > > But the benefits with that board could be huge! (OK, so it won't
> > > > be cheap, at around 1000ukp, but still...)

> > > But cheaper and more effective than a 32 SA's - plus we'd be able to
> > > run RISC OS instead of RiscBSD.

> > > I wonder if it would have any form of FPU?...... ;o)
>
> > Not that can of worms again. I'll pretend I didn't read that!

> Having a FPU would be nice though. Isn't one of the advantages of having
> one that you can perform 2 calculations at the same time? One fixed
> point and one float?

Well, not really. One of the advantages of having a FPU is that FP
calculations will be executed faster. :-)

And at least with ARM based processors I don't think that we will get the
parallel execution of FP and integer commands (soon)...

> Now, how can that be a bad thing?

Even if it would mean that you could perform 2 calculations at the same
time, it wouldn't help much if there is (almost) no program that makes use
of this.

> It would also make the porting of programs from other platforms easier.

And would make Linux faster.

> Is it true that the real reason that we never get FPUs is that they are
> against the RISC philosophy?

How could it? We already have/had FPUs several times: There was an add-on
for the A4xx, the FPA10 for the ARM3 (mainly used in the A5000), there is
the ARM7500FE which is used in the A7000+ (and in the RiscStation) and
there even was a FPA11 combined with an ARM700 (but probably only _very_
few of them were actually produced). And there will be the ARM10 with an
integrated FPU.

The main reason for the little success of FPUs with RISC OS is that
almost all programs are developed to not rely on FP commands and thus will
not be sped up at all by a FPU. Thus the last time people had to choose
between higher FP performance or higher integer performance (i.e. between
ARM700+FPA11 and StrongARM), by far most of them chose the higher
integer performance.

Tony Houghton

Aug 21, 1999, 3:00:00 AM

In <4934cd931...@t-online.de>, Matthias Seifert <M.Se...@t-online.de> wrote:

> And at least with ARM based processors I don't think that we will get the
> parallel execution of FP and integer commands (soon)...

ARM1020 sounds like it can do that sort of thing and is generally quite
groovy. But I think you're right that "we" (as in RISC OS users) won't get
that any time soon :-(.

--
TH * http://homepages.tcp.co.uk/~tonyh/
Supporting CUT: http://www.unmetered.org.uk/

Stephen Crocker

Aug 22, 1999, 3:00:00 AM

Before being shot for writing message <slrn7rthu...@tonyh.tcp.co.uk>
to...@tcp.co.uk (Tony Houghton) wrote:

> In <4934cd931...@t-online.de>, Matthias Seifert <M.Se...@t-online.de> wrote:
>
> > And at least with ARM based processors I don't think that we will get the
> > parallel execution of FP and integer commands (soon)...
>
> ARM1020 sounds like it can do that sort of thing and is generally quite
> groovy. But I think you're right that "we" (as in RISC OS users) won't get
> that any time soon :-(.

Unless we start feeding ROL steroids...

--
x^ ( ) _________ // Email: mailto:cr...@crok.demon.co.uk
< U O |_|_|_|_|_| O || WWW: http://www.crok.demon.co.uk
\, |/|\ _________ [ ]
. |/^\ . 2 . /__\
... "Virtual" means never knowing where your next byte is coming from.

Alan P Dawes

Aug 22, 1999, 3:00:00 AM

In article <4934cd931...@t-online.de>,

Matthias Seifert <M.Se...@t-online.de> wrote:
> > Is it true that the real reason that we never get FPUs is that they are
> > against the RISC philosophy?

> How could it? We already have/had FPUs several times: There was an add-on
> for the A4xx, the FPA10 for the ARM3 (mainly used in the A5000), there is
> the ARM7500FE which is used in the A7000+ (and in the RiscStation) and
> there even was a FPA11 combined with an ARM700 (but probably only _very_
> few of them were actually produced). And there will be the ARM10 with an
> integrated FPU.

But FPA10 and FPA11 are not full FPUs, they still need a 'softloaded'
floating point emulator called FPE400 to handle the more complex FP
instructions. I suppose this is as near a 'RISC philosophy' FPU as you can
get. I don't know if the same applies to the ARM7500FE but would assume
that it has an FPA11 integrated into the chip along with the ARM processor etc.

Alan

--
--. --. --. --. : : --- --- ----------------------------
|_| |_| | _ | | | | |_ | alan....@argonet.co.uk
| | |\ | | | | |\| | |
| | | \ |_| |_| | | |__ | Using an Acorn RiscPC

Matthias Seifert

Aug 23, 1999, 3:00:00 AM

Alan P Dawes <alan....@argonet.co.uk> wrote:
> In article <4934cd931...@t-online.de>,
> Matthias Seifert <M.Se...@t-online.de> wrote:
> > > Is it true that the real reason that we never get FPUs is that they
> > > are against the RISC philosophy?

> > How could it? We already have/had FPUs several times: There was an
> > add-on for the A4xx, the FPA10 for the ARM3 (mainly used in the
> > A5000), there is the ARM7500FE which is used in the A7000+ (and in the
> > RiscStation) and there even was a FPA11 combined with an ARM700 (but
> > probably only _very_ few of them were actually produced). And there
> > will be the ARM10 with an integrated FPU.

> But FPA10 and FPA11 are not full FPUs, they still need a 'softloaded'
> floating point emulator called FPE400 to handle the more complex FP
> instructions.

But you didn't write this in your previous posting...

If you want, you can see it the other way round: The FPAs are full FPUs
and the FPE extends their functions by more complex ones...

> I suppose this is as near a 'RISC philosophy' FPU as you can get.

I don't think that this has to be a bad thing.

> I don't know if the same applies to the ARM7500FE but would assume that
> it has an FPA11 integrated into the chip along with the ARM processor etc.

Right.

And (probably) even the ARM10 will still need a FPE module...

Simon E. John

Aug 23, 1999, 3:00:00 AM

In article <493595f7f1...@argonet.co.uk>,

Alan P Dawes <alan....@argonet.co.uk> wrote:

[snip]

> But FPA10 and FPA11 are not full FPUs, they still need a 'softloaded'
> floating point emulator called FPE400 to handle the more complex FP
> instructions. I suppose this is as near a 'RISC philosophy' FPU as you can
> get. I don't know if the same applies to the ARM7500FE but would assume
> that it has an FPA11 integrated into the chip along with the ARM processor
> etc.

IIRC FPE400 is not an FPE but simply a patch to re-route calls to the FPE to
the FPU for apps that assumed the presence of the FPE.

The ARM7500FE presumably just has this included in RISC OS 3.71.

The ARM7500FE isn't as simple as an ARM700+FPA11, it uses the ARM250
philosophy of having IOMD etc. on-chip.

--
Simon E. John, BA(Hons)

Email: si...@suit-u-sir.com | 258MHz StrongARM RiscPC
Website: www.suit-u-sir.com | 50Mb RAM, 4.3Gb HD, Zip
PGP 'bot: send...@suit-u-sir.com | 32x CD, USR 56K V.90/X2

Santa's a lucky guy, he knows where all the bad girls live!

Tony Houghton

Aug 23, 1999, 3:00:00 AM

In <4935ffb7d...@t-online.de>, Matthias Seifert <M.Se...@t-online.de> wrote:

> Alan P Dawes <alan....@argonet.co.uk> wrote:
>
> > But FPA10 and FPA11 are not full FPUs, they still need a 'softloaded'
> > floating point emulator called FPE400 to handle the more complex FP
> > instructions.
>
> But you didn't write this in your previous posting...
>
> If you want, you can see it the other way round: The FPAs are full FPUs
> and the FPE extends their functions by more complex ones...
>
> > I suppose this is as near a 'RISC philosophy' FPU as you can get.
>
> I don't think that this has to be a bad thing.

The trouble is that to meet the IEEE [1] standard and implement the full
instruction set, some calls have to be routed through the emulator with
most of the overhead associated with it. That is a bad thing. If the
instruction set had been designed with "RISC philosophy" and ignored IEEE
to begin with, maybe that would work as well as other current FPUs.

[1] There's an IEE and IEEE, one's American or international and the
other's English or European, I can't remember which is which ATM.
Presumably the American one applies here.

Philip Blundell

Aug 23, 1999, 3:00:00 AM

In article <slrn7s321...@tonyh.tcp.co.uk>,

Tony Houghton <to...@tcp.co.uk> wrote:
>In <4935ffb7d...@t-online.de>, Matthias Seifert <M.Se...@t-online.de> wrote:
>> I don't think that this has to be a bad thing.
>
>The trouble is that to meet the IEEE [1] standard and implement the full
>instruction set, some calls have to be routed through the emulator with
>most of the overhead associated with it. That is a bad thing. If the
>instruction set had been designed with "RISC philosophy" and ignored IEEE
>to begin with, maybe that would work as well as other current FPUs.

Only very rare cases require software completion. There are some
corner cases for which IEEE 754 requires specific behaviour (denormal
numbers for example) that are not encountered often enough to merit
the silicon to handle them. There is no performance hit for common
operations.

The Alpha floating point unit uses a similar philosophy.

p.
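
To make the "denormal" corner case mentioned above concrete, here is a small Python sketch. It is not from the thread: it uses ordinary IEEE 754 doubles on whatever machine runs it, and the helper name `is_denormal` is made up for illustration. It only shows what kind of value falls into the rarely hit range that a hardware unit might leave to software completion.

```python
import sys

# Smallest positive *normal* double in IEEE 754 binary64 (2**-1022).
SMALLEST_NORMAL = sys.float_info.min

def is_denormal(x: float) -> bool:
    """Hypothetical helper: True for non-zero values below the normal range."""
    return x != 0.0 and abs(x) < SMALLEST_NORMAL

ordinary = 1.5 * 2.0          # an everyday result, well inside the normal range
tiny = SMALLEST_NORMAL / 4.0  # result drops below the normal range

print(is_denormal(ordinary))
print(is_denormal(tiny))
```

Almost all arithmetic in typical programs produces values like `ordinary`, which is why handling values like `tiny` in software need not cost anything in the common case.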

Philip Blundell

Aug 23, 1999, 3:00:00 AM

In article <493595f7f1...@argonet.co.uk>,

Alan P Dawes <alan....@argonet.co.uk> wrote:

>get. I don't know if the same applies to the ARM7500FE but would assume
>that it has an FPA11 integrated into the chip along with the ARM
>processor etc.

Yes. You can find a 7500FE data sheet on the ARM Ltd web site, I
think.

p.

Matthias Seifert

Aug 23, 1999, 3:00:00 AM

Tony Houghton <to...@tcp.co.uk> wrote:
> In <4935ffb7d...@t-online.de>, Matthias Seifert
> <M.Se...@t-online.de> wrote:

> > Alan P Dawes <alan....@argonet.co.uk> wrote:
> >
> > > But FPA10 and FPA11 are not full FPUs, they still need a 'softloaded'
> > > floating point emulator called FPE400 to handle the more complex FP
> > > instructions.
> >
> > But you didn't write this in your previous posting...
> >
> > If you want, you can see it the other way round: The FPAs are full FPUs
> > and the FPE extends their functions by more complex ones...
> >
> > > I suppose this is as near a 'RISC philosophy' FPU as you can get.
> >
> > I don't think that this has to be a bad thing.

> The trouble is that to meet the IEEE [1] standard and implement the full
> instruction set, some calls have to be routed through the emulator with
> most of the overhead associated with it. That is a bad thing. If the
> instruction set had been designed with "RISC philosophy" and ignored IEEE
> to begin with, maybe that would work as well as other current FPUs.

And what's the problem? If you want it RISC-like, use it RISC-like (i.e.
only the commands the FPA supports), and if you want IEEE standard, you
can have this too (with the help of the FPE)...

Ian Bannister

Aug 24, 1999, 3:00:00 AM

In article <4935fa3...@suit-u-sir.com>, Simon E. John

<si...@suit-u-sir.com> wrote:
>
> IIRC FPE400 is not an FPE but simply a patch to re-route calls to the FPE
> to the FPU for apps that assumed the presence of the FPE.

It is still correct to refer to it as an emulator. It does after all
provide emulation of floating point instructions not handled by the FPU. It
is just that these instructions will make use of the FPU as they call the
simpler functions.
There was no assumption of an FPE in the code written for ARM on Acorn
machines. The code uses genuine FP instructions and it was up to the
hardware/software combination to route these correctly.

--
|-*- Ian Bannister -| ____________________________
||---------------------/ \
|| <-* banni...@argonet.co.uk *->
||---------------------\____________________________/

Alan P Dawes

Aug 25, 1999, 3:00:00 AM

In article <4935ffb7d...@t-online.de>,

Matthias Seifert <M.Se...@t-online.de> wrote:
> Alan P Dawes <alan....@argonet.co.uk> wrote:

> > But FPA10 and FPA11 are not full FPUs, they still need a 'softloaded'
> > floating point emulator called FPE400 to handle the more complex FP
> > instructions.

> But you didn't write this in your previous posting...

What previous posting are you referring to? Apart from this one, I have
only sent one posting to this 'thread'.

Matthias Seifert

Aug 25, 1999, 3:00:00 AM

Alan P Dawes <alan....@argonet.co.uk> wrote:
> In article <4935ffb7d...@t-online.de>,
> Matthias Seifert <M.Se...@t-online.de> wrote:
> > Alan P Dawes <alan....@argonet.co.uk> wrote:

> > > But FPA10 and FPA11 are not full FPUs, they still need a 'softloaded'
> > > floating point emulator called FPE400 to handle the more complex FP
> > > instructions.

> > But you didn't write this in your previous posting...

> What previous posting are you referring to? (Except for this one), I have
> only sent the one posting to this 'thread'.

Oh, yes, I messed it up... Sorry.

David Watson

Aug 31, 1999, 3:00:00 AM

In message <4934cd931...@t-online.de>
Matthias Seifert <M.Se...@t-online.de> wrote:

> David Watson <9404...@udcf.gla.ac.uk> wrote:
> > Having a FPU would be nice though. Isn't one of the advantages of having
> > one that you can perform 2 calculations at the same time? One fixed
> > point and one float?
>
> Well, not really. One of the advantages of having a FPU is, that FP
> calculations will be executed faster. :-)
>

> And at least with ARM based processors I don't think that we will get the
> parallel execution of FP and integer commands (soon)...

Ah, I must be thinking of the x86 family et al. I had always thought that
games like Quake (in the pre 3DFX card days) used this parallel execution
to make themselves whizz along. I was therefore very surprised by the fact
that ArcQuake is actually playable at all.

> > Now, how can that be a bad thing?
>
> Even if it would mean that you could perform 2 calculations at the same
> time, it wouldn't help much if there is (almost) no program that makes use
> of this.

..Chicken .. Egg .. Chicken .. Egg .. Chicken .. Egg .. Chicken .. Egg ..

It's the old story. However if the ARM10 is to have an integrated FPU (and
RiscOS can use this processor) then things may change a little.

> > It would also make the porting of programs from other platforms easier.

> And would make Linux faster.

Two *very* useful consequences!

> > Is it true that the real reason that we never get FPUs is that they are
> > against the RISC philosophy?

> How could it? We already have/had FPUs several times:

Yeah, but RISC means *Reduced* Instruction Set. Adding an FPU *adds* more
instructions. Hence not such a reduced instruction set :-)

> The main reason for the little success of FPUs with RISC OS is, that
> almost all programs are developed to not rely on FP commands and thus will
> not be speeded up at all by a FPU.

So how did the FPU become so important in systems that use x86, MIPS, etc
processors? Until the Pentium the FPU was an option and not the norm. I'm
sure that a lot of the fixed point 'work arounds' that programmers have used
to avoid FPs are significantly slower than more direct floating operations.

> Thus the last time people had to choose
> between higher FP performance or higher integer performance (i.e. between
> ARM700+FPA11 and StrongARM), by far most of them have chosen the higher
> integer performance.

Would I be right in thinking that the StrongARM is faster at processing
FP instructions than an ARM7 anyway because it has more brute force
processing power? Certainly under normal use this should be the case.
Tasks such as MP3 encoding are very floating point intensive but are still
faster on a SA (AFAIK).

Food for thought,

Dave Watson

--
____ _ __
/ __/__ ____(_) /____ __ _ ___ ____
_\ \/ _ \/ __/ / __/ -_) ' \/ _ `/ _ \
/___/ .__/_/ /_/\__/\__/_/_/_/\_,_/_//_/
/ / http://come.to/daves-place

Overall, I'd rather lay in a hammock with a couple of girls than be dead. -MASH

Matthias Seifert

Sep 1, 1999, 3:00:00 AM

David Watson <9404...@udcf.gla.ac.uk> wrote:
> In message <4934cd931...@t-online.de>
> Matthias Seifert <M.Se...@t-online.de> wrote:

[...]

> > Even if it would mean that you could perform 2 calculations at the
> > same time, it wouldn't help much if there is (almost) no program that
> > makes use of this.

> ..Chicken .. Egg .. Chicken .. Egg .. Chicken .. Egg .. Chicken ..
> Egg ..

> It's the old story. However if the ARM10 is to have an integrated FPU
> (and RiscOS can use this processor) then things may change a little.

Well, we have already had FP support for quite a while - you could get the
FPA for the ARM3 (which 'some' users did) and an A7000+ (or now the
RiscStation) comes with hardware FP support as standard. And what has
happened on the software side so far? Nothing. Your theory seems logical, but
reality looks a bit different...

[...]

> > > Is it true that the real reason that we never get FPUs is that they
> > > are against the RISC philosophy?

> > How could it? We already have/had FPUs several times:

> Yeah, but RISC means *Reduced* Instruction Set. Adding an FPU *adds* more
> instructions. Hence not such a reduced instruction set :-)

Erm, ever seen the instruction sets of other 'RISC' processors? I guess
that the ARM instruction set would be considerably 'reduced' even with FP
support. :-)

> > The main reason for the little success of FPUs with RISC OS is, that
> > almost all programs are developed to not rely on FP commands and thus
> > will not be speeded up at all by a FPU.

> So how did the FPU become so important in systems that use x86, MIPS,
> etc processors? Until the Pentium the FPU was an option and not the
> norm.

Well, until the 486DX in fact which was a few years before the Pentium...

> I'm sure that a lot of the fixed point 'work arounds' that programmers
> have used to avoid FPs are significantly slower than more direct
> floating operations.

Well, if they are just 'work arounds' then you're probably right. But if
they design the code completely with integer and fixed point in mind, I
guess the programs are faster. It is still the case that FP instructions
are slower than (or at best as fast as) integer instructions - I have never
seen an FP instruction that was faster than the comparable integer
instruction. Thus most time critical code (even on x86) is developed with as
little FP as possible (unless it makes use of the parallel execution of FP
and integer instructions).

Of course this is only true for cases where it _is_ actually possible to
avoid FP. :-)
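
As a rough illustration of what "designing with integer and fixed point in mind" means, here is a minimal Python sketch of 16.16 fixed-point arithmetic. The format choice and the helper names (`to_fixed`, `fixed_mul`, `to_float`) are assumptions made for this example, not anything taken from a particular RISC OS program.

```python
FRAC_BITS = 16            # 16.16 format: 16 integer bits, 16 fraction bits
ONE = 1 << FRAC_BITS      # the value 1.0 in fixed point

def to_fixed(x: float) -> int:
    """Convert a float to 16.16 fixed point (done once, outside any hot loop)."""
    return int(round(x * ONE))

def fixed_mul(a: int, b: int) -> int:
    """Multiply two fixed-point values using only an integer multiply and a shift."""
    # The raw product carries 32 fraction bits, so shift back down to 16.
    return (a * b) >> FRAC_BITS

def to_float(a: int) -> float:
    """Convert back to float, for display only."""
    return a / ONE

a = to_fixed(1.5)
b = to_fixed(2.25)
print(to_float(a + b))            # addition is a plain integer add
print(to_float(fixed_mul(a, b)))  # multiplication needs one extra shift
```

The price is a fixed range and precision, which is why this only works where FP can genuinely be avoided.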

And that FP became _that_ important with x86 (and there mainly with games)
was 'a bit' forced by Intel, because their FP unit was considerably faster
than those of AMD and Cyrix - but now there is the Athlon...

> > Thus the last time people had to choose between higher FP performance
> > or higher integer performance (i.e. between ARM700+FPA11 and
> > StrongARM), by far most of them have chosen the higher integer
> > performance.

> Would I be right in thinking that the StrongARM is faster at processing
> FP instructions than an ARM7 anyway because it has more brute force
> processing power?

No, you wouldn't.

A StrongARM @202.7 MHz reaches about 2884 kWhetstones/sec whereas even an
ARM3/FPA10 @25 MHz reaches about 2975 kWhetstones/sec. (As !SICK tells me.)

> Certainly under normal use this should be the case.

But it isn't.

> Tasks such as MP3 encoding are very floating point intensive but are still
> faster on a SA (AFAIK).

But even 'FP intensive' code has to do some integer calculation and must
access the RAM from time to time... :-)

Tony van der Hoff

Sep 1, 1999, 3:00:00 AM

In article <551c5b3a49%9404...@udcf.gla.ac.uk>, David Watson
<9404...@udcf.gla.ac.uk> writes
>
Since there appears to be a moratorium on thread-drift during this silly
season...

>..Chicken .. Egg .. Chicken .. Egg .. Chicken .. Egg .. Chicken .. Egg ..
>

Chicken .. Egg ... <a very long period of time> ... Egg .. Dinosaur ..
Egg .. Dinosaur.

It is self-evident that the egg evolved considerably earlier than the
chicken. The analogy is therefore void.

>Food for thought,
>
Chicken curry, or curried eggs?

--
Tony van der Hoff | Mailto:to...@mk-net.demon.co.uk
| Mailto:avand...@iee.org
Buckinghamshire, England | http:www.mk-net.demon.co.uk

Jim Lesurf

Sep 1, 1999, 3:00:00 AM

In article <493a96818...@t-online.de>,

Matthias Seifert <M.Se...@t-online.de> wrote:
> David Watson <9404...@udcf.gla.ac.uk> wrote:
> > In message <4934cd931...@t-online.de>
> > Matthias Seifert <M.Se...@t-online.de> wrote:

> > Would I be right in thinking that the StrongARM is faster at processing
> > FP instructions than an ARM7 anyway because it has more brute force
> > processing power?

> No, you wouldn't.

> A StrongARM @202.7 MHz reaches about 2884 kWhetstones/sec whereas even an
> ARM3/FPA10 @25 MHz reaches about 2975 kWhetstone/sec. (As !SICK tells me.)

<ahem> I happen to have a 710+FPA11 in one of our machines. Can't quote
any specific performance figures. However 'intense' fp routines like
bog-standard FFTs, etc, compiled from 'C' typically run about three to five
times faster on this than on a 200MHz StrongARM. Can't recall the clock
speed of the 710, but I think it is about 33MHz. So the FPA is *very*
effective at speeding up fp. I'd give real money for StrongARM
replacements with an equivalent FPA.

Slainte,

Jim

--
Electronics http://www.st-and.ac.uk/~www_pa/Scots_Guide/intro/electron.htm
MMWaves http://www.st-and.ac.uk/~www_pa/Scots_Guide/MMWave/Index.html
Barbirolli Soc. http://www.st-and.demon.co.uk/JBSoc/JBSoc.html
TechWriter http://www.st-and.demon.co.uk/TechWrite/Tips1.html
Dutton CDs http://www.duttonlabs.demon.co.uk/index.html

Philip Blundell

Sep 1, 1999, 3:00:00 AM

In article <551c5b3a49%9404...@udcf.gla.ac.uk>,

David Watson <9404...@udcf.gla.ac.uk> wrote:
>In message <4934cd931...@t-online.de>
> Matthias Seifert <M.Se...@t-online.de> wrote:
>

>> David Watson <9404...@udcf.gla.ac.uk> wrote:
>> > Having a FPU would be nice though. Isn't one of the advantages of having
>> > one that you can perform 2 calculations at the same time? One fixed
>> > point and one float?
>>
>> Well, not really. One of the advantages of having a FPU is, that FP
>> calculations will be executed faster. :-)
>>
>> And at least with ARM based processors I don't think that we will get the
>> parallel execution of FP and integer commands (soon)...
>
>Ah, I must be thinking of the x86 family et al. I had always thought that
>games like Quake (in the pre 3DFX card days) used this parallel execution
>to make themselves whizz along.

That's certainly partly true. Both the x86 family and the ARM with
FPA can execute floating point instructions in parallel with integer
operations to some extent.

>Would I be right in thinking that the StrongARM is faster at processing
>FP instructions than an ARM7 anyway because it has more brute force
>processing power?

Depends on the instruction mix, and the software you use on the
StrongARM to emulate floating point. I'm sure you could construct
examples to make either one come out ahead.

p.

Darren Salt

Sep 1, 1999, 3:00:00 AM

In message <493aa94...@st-and.demon.co.uk>
Jim Lesurf <jc...@st-and.demon.co.uk> wrote:

> In article <493a96818...@t-online.de>,
> Matthias Seifert <M.Se...@t-online.de> wrote:

[snip]

>> A StrongARM @202.7 MHz reaches about 2884 kWhetstones/sec whereas even an
>> ARM3/FPA10 @25 MHz reaches about 2975 kWhetstone/sec. (As !SICK tells me.)

> <ahem> I happen to have a 710+FPA11 in one of our machines. [...]

Unless you mean ARM700 and FPA11, then I don't believe you :-)

Reading the small print is education; not reading it is experience.

Geoff Crossland

Sep 1, 1999, 3:00:00 AM

In message <493a96818...@t-online.de>
Matthias Seifert <M.Se...@t-online.de> wrote:

> David Watson <9404...@udcf.gla.ac.uk> wrote:
> > In message <4934cd931...@t-online.de>
> > Matthias Seifert <M.Se...@t-online.de> wrote:

<snip>

> > > > Is it true that the real reason that we never get FPUs is that they
> > > > are against the RISC philosophy?
>
> > > How could it? We already have/had FPUs several times:
>
> > Yeah, but RISC means *Reduced* Instruction Set. Adding an FPU *adds* more
> > instructions. Hence not such a reduced instruction set :-)
>
> Erm, ever seen the instruction sets of other 'RISC' processors? I guess
> that the ARM instruction set would be considerably 'reduced' even with FP
> support. :-)

Of course, technically, all the FP instructions
come under the bracket of coprocessor instructions,
so adding FP doesn't give us ARM users any more
instructions at all ;-).
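
The point about FP instructions living in coprocessor space can be sketched in Python by classifying 32-bit ARM instruction words. The bit positions below follow the classic ARM coprocessor encoding (bits 27..24 for the instruction class, bits 11..8 for the coprocessor number); the function name and the synthetic test word are assumptions for illustration, and the idea that the FPA answers to coprocessor number 1 is stated from memory rather than from a data sheet.

```python
def coprocessor_number(word: int):
    """Return the coprocessor number if `word` lies in ARM coprocessor space, else None."""
    top3 = (word >> 25) & 0b111     # bits 27..25
    top4 = (word >> 24) & 0b1111    # bits 27..24
    # 110x = coprocessor load/store (LDC/STC), 1110 = CDP / MRC / MCR
    if top3 == 0b110 or top4 == 0b1110:
        return (word >> 8) & 0b1111  # bits 11..8 select the coprocessor
    return None

MOV_PC_R14 = 0xE1A0F00E                           # ordinary data-processing instruction
CDP_CP1 = (0xE << 28) | (0xE << 24) | (1 << 8)    # synthetic CDP-class word for coprocessor 1

print(coprocessor_number(MOV_PC_R14))
print(coprocessor_number(CDP_CP1))
```

Seen this way, the core instruction set really is unchanged: the ARM only has to recognise "this word is for a coprocessor" and either a coprocessor claims it or the Undefined Instruction trap fires.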

> > > The main reason for the little success of FPUs with RISC OS is, that
> > > almost all programs are developed to not rely on FP commands and thus
> > > will not be speeded up at all by a FPU.
>
> > So how did the FPU become so important in systems that use x86, MIPS,
> > etc processors? Until the Pentium the FPU was an option and not the
> > norm.
>
> Well, until the 486DX in fact which was a few years before the Pentium...

I suppose people needed accurate floats for doing
their accounts on their Lotus 1-2-3 or whathaveyous.
It's also incredibly important in engineering (I'm
aware of at least one company which delayed moving
its design work over to CAD software because it
used only single, not double, precision).

> > I'm sure that a lot of the fixed point 'work arounds' that programmers
> > have used to avoid FPs are significantly slower than more direct
> > floating operations.
>
> Well, if they are just 'work arounds' then you're probably right. But if
> they design the code completely with integer and fixed point in mind, I
> guess the programs are faster. It is still the case that FP instructions
> are slower than (or at best as fast as) integer instructions - I have never
> seen an FP instruction that was faster than the comparable integer
> instruction. Thus most time critical code (even on x86) is developed with
> as little FP as possible (unless it makes use of the parallel execution of
> FP and integer instructions).

Addition and subtraction FP instructions, on newer
Intel processors, do take around three times as long
as integer ones, but a good compiler (or a better
asm coder) should schedule the instructions, keeping
the FPUs and integer units busy (you can cut them
down to one clock this way). Division and (scream!)
roots, trig and so forth are still killers (unless
you mask the time delay with gobs of other integer
ops, and even then only in some cases).

Shifts (pretty necessary for fixed point work) take
far, far, far too long on Pentia (where far=one
cycle).

> Of course this is only true for cases where it _is_ actually possible to
> avoid FP. :-)
>
> And that FP became _that_ important with x86 (and there mainly with games)
> was 'a bit' forced by Intel, because their FP unit was considerably faster
> than those of AMD and Cyrix - but now there is the Athlon...

<fx: jumps up and down>

Wey!

AMD rock. :-))

> > > Thus the last time people had to choose between higher FP performance
> > > or higher integer performance (i.e. between ARM700+FPA11 and
> > > StrongARM), by far most of them have chosen the higher integer
> > > performance.
>
> > Would I be right in thinking that the StrongARM is faster at processing
> > FP instructions than an ARM7 anyway because it has more brute force
> > processing power?
>
> No, you wouldn't.
>
> A StrongARM @202.7 MHz reaches about 2884 kWhetstones/sec whereas even an
> ARM3/FPA10 @25 MHz reaches about 2975 kWhetstone/sec. (As !SICK tells me.)

How much practical difference is this 3% going to
make? Am I misinterpreting the benchmark?

<snip>

--

Geoff Crossland

gcrossland @ eccentricity . freeserve . co . uk

James Reynolds

Sep 2, 1999, 3:00:00 AM

In article <493AE47F5C%ne...@youmustbejoking.demon.com.uk>, Darren Salt

<ne...@youmustbejoking.demon.com.uk> wrote:
>
> In message <493aa94...@st-and.demon.co.uk>
> Jim Lesurf <jc...@st-and.demon.co.uk> wrote:
>
> > In article <493a96818...@t-online.de>,

> > Matthias Seifert <M.Se...@t-online.de> wrote:
> [snip]

> >> A StrongARM @202.7 MHz reaches about 2884 kWhetstones/sec whereas even
> >> an ARM3/FPA10 @25 MHz reaches about 2975 kWhetstone/sec. (As !SICK tells
> >> me.)
>

> > <ahem> I happen to have a 710+FPA11 in one of our machines. [...]
>
> Unless you mean ARM700 and FPA11, then I don't believe you :-)
>
>

This may be a stupid question but is it possible to run an ARM700 + FPA11 in
the second processor slot and use a re-write of the FPE module to cope with
this? I know it can be done with a PC card but the speed is not that great.

--
j-r...@argonet.co.uk

Andy McMullon

Sep 2, 1999, 3:00:00 AM

Wow!

http://developer.intel.com/solutions/archive/issue20/stories/top3.htm

--
Andy: skyp...@bigfoot.com / http://www.mcfamily.demon.co.uk

Torben AEgidius Mogensen

Sep 2, 1999, 3:00:00 AM

James Reynolds <j-r...@argonet.co.uk> writes:

>This may be a stupid question but is it possible to run an ARM700 + FPA11 in
>the second processor slot and use a re-write of the FPE module to cope with
>this? I know it can be done with a PC card but the speed is not that great.

In theory, yes. In practice it will probably not give you a great deal
of speed-up:

You will still need to trap the FP instructions on the main processor
and communicate to the other processor that it should handle it. This
is probably easiest done by changing the value of an uncached memory
location that the other processor is constantly watching. The
instruction itself is written to another uncached location, followed
by a return instruction (MOV PC, R14). The ARM700+FPA11 will then jump
to this instruction and execute it.

This is fine as long as the instructions only operate on FP
registers. If you want to transfer an FP register to another register
or vice-versa, a few extra instructions (and another uncached memory
location) are needed to do this. If the FP instruction accesses
memory, the address has to be uncached. This may require flushing the
cache with the attendant cost this gives.

However, the advantage over using an x86 processor is that you
(mostly) don't have to decode the FP instructions fully.

Still, the overhead is probably so big that for a 202MHz SA110 with a
33MHz ARM700+FPA11 the win is close to zero.

Torben Mogensen (tor...@diku.dk)
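
The hand-off described in the post above can be modelled in a few lines of Python. This is a single-threaded toy under loud assumptions: the `Mailbox` class stands in for the two uncached memory locations, the tuple format for a trapped instruction is invented, and only two register-to-register operations (named after the FPA mnemonics ADF and MUF) are modelled. It shows the ordering of the steps, not real timing.

```python
class Mailbox:
    """Stand-in for the two uncached memory locations in the scheme above."""
    def __init__(self):
        self.flag = 0      # location the second processor keeps polling
        self.slot = None   # location holding the trapped FP instruction

def main_cpu_trap(mailbox, instruction):
    """Main processor: store the instruction first, then raise the flag."""
    mailbox.slot = instruction
    mailbox.flag = 1

def second_cpu_poll(mailbox, fp_regs):
    """Second processor: one pass of its polling loop. Returns True if work was done."""
    if mailbox.flag != 1:
        return False
    op, dest, src1, src2 = mailbox.slot   # hypothetical (op, Fd, Fn, Fm) tuple
    if op == "ADF":
        fp_regs[dest] = fp_regs[src1] + fp_regs[src2]
    elif op == "MUF":
        fp_regs[dest] = fp_regs[src1] * fp_regs[src2]
    else:
        raise ValueError(f"operation not modelled: {op}")
    mailbox.flag = 0   # signal completion back to the main processor
    return True

mailbox = Mailbox()
fp_regs = [0.0] * 8
fp_regs[1], fp_regs[2] = 1.5, 2.0

idle = second_cpu_poll(mailbox, fp_regs)        # nothing posted yet
main_cpu_trap(mailbox, ("ADF", 0, 1, 2))
handled = second_cpu_poll(mailbox, fp_regs)     # picks up and executes the request
print(idle, handled, fp_regs[0])
```

Every trapped instruction pays for the trap, two uncached writes, at least one poll and the wait for the flag to clear, which is the overhead the post expects to eat most of the gain.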

Matthias Seifert

Sep 2, 1999, 3:00:00 AM

James Reynolds <j-r...@argonet.co.uk> wrote:
> In article <493AE47F5C%ne...@youmustbejoking.demon.com.uk>, Darren Salt
> <ne...@youmustbejoking.demon.com.uk> wrote:
> >
> > In message <493aa94...@st-and.demon.co.uk>
> > Jim Lesurf <jc...@st-and.demon.co.uk> wrote:
> >
> > > In article <493a96818...@t-online.de>,
> > > Matthias Seifert <M.Se...@t-online.de> wrote:
> > [snip]
> > >> A StrongARM @202.7 MHz reaches about 2884 kWhetstones/sec whereas
> > >> even an ARM3/FPA10 @25 MHz reaches about 2975 kWhetstone/sec. (As
> > >> !SICK tells me.)
> >
> > > <ahem> I happen to have a 710+FPA11 in one of our machines. [...]
> >
> > Unless you mean ARM700 and FPA11, then I don't believe you :-)
> >
> >

> This may be a stupid question but is it possible to run an ARM700 +
> FPA11 in the second processor slot and use a re-write of the FPE module
> to cope with this?

No, because both ARMs would act as 'master' (or however this is called in
this case)...

> I know it can be done with a PC card but the speed is not that great.

I guess that the increase with a combination of ARM700+FPA11 (or an
ARM7500FE) wouldn't be any better as the main problem is the overhead that
is needed to communicate with the second processor. On the other hand one
could feed the FPA directly with the FP instruction and wouldn't need to do
the conversion of the code...

Well, if at all I would go for a completely different (and much faster)
FPU on a dedicated processor card (and some logic that does the
conversion 'on the fly'). But how many such cards could be sold...?

Matthias Seifert

Sep 2, 1999, 3:00:00 AM

Geoff Crossland <gcros...@eccentricity.freeserve.co.uk> wrote:
> In message <493a96818...@t-online.de>
> Matthias Seifert <M.Se...@t-online.de> wrote:

> > David Watson <9404...@udcf.gla.ac.uk> wrote:

[...]

> > > Would I be right in thinking that the StrongARM is faster at
> > > processing FP instructions than an ARM7 anyway because it has more
> > > brute force processing power?
> >
> > No, you wouldn't.
> >

> > A StrongARM @202.7 MHz reaches about 2884 kWhetstones/sec whereas even
> > an ARM3/FPA10 @25 MHz reaches about 2975 kWhetstone/sec. (As !SICK
> > tells me.)

> How much practical difference is this 3% going to make? Am I
> misinterpreting the benchmark?

Well, you aren't misinterpreting the benchmark, but (probably) the
conclusion drawn from it.

The question was not how much faster an ARM3/FPA is at executing FP
instructions; the question was whether a SA (emulating FP instructions)
would be faster than a combination of ARM700/FPA. I only mentioned the
figure for an ARM3/FPA combination as this is (probably) the lowest figure
you will get with an FPA - and even this is faster than the SA (well, with
RISC OS 4 the SA will be a bit faster).
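
For anyone wanting to check the arithmetic behind the two Whetstone figures quoted in this thread, here is a short Python calculation. The only inputs are the numbers given above; the per-MHz comparison is an extra way of looking at them, not a claim made by any poster.

```python
sa_kwhet, sa_mhz = 2884, 202.7     # StrongARM, FP emulated in software
fpa_kwhet, fpa_mhz = 2975, 25.0    # ARM3 + FPA10

overall = fpa_kwhet / sa_kwhet                          # ratio of raw scores
per_mhz = (fpa_kwhet / fpa_mhz) / (sa_kwhet / sa_mhz)   # ratio of scores per MHz

print(f"raw score advantage: {(overall - 1) * 100:.1f}%")
print(f"per-MHz advantage:   {per_mhz:.1f}x")
```

The raw scores differ by only about 3%, but normalised by clock speed the hardware FPA does roughly eight times as much FP work per MHz, which is why the same FPA next to a faster core would be a different story.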

In practice you won't notice this enhanced FP performance as there are very
few RISC OS programs that make use of FP instructions...

Steven M. Ottens

Sep 3, 1999, 3:00:00 AM

In message <493b6d297...@t-online.de>
Matthias Seifert <M.Se...@t-online.de> wrote:

Yeah, that is because there were only a few FPA equipped RISC OS computers,
and still not all of them have one. But with all the A7000+ machines and
clones we get a lot of people with an FPA. Why not write software which uses
it?

just an idea
--
/ Steven M. Ottens
O'dd- Ottens' Dutch Designs
\ http://www.futuretrain.com/odd

oliverc

Sep 3, 1999, 3:00:00 AM

In message <2445aa3b4...@lx.student.wau.nl>

I'm starting to get a little confused - we've always had an FPE module,
and programs use this (on occasion), if one has an FPA, I would assume
that it would come with a new FPE module which would pass all FP
instructions to the FPA instead?

Oli

--
This user hasn't supplied a signature file for ZapEmail.
Computer: a device designed to speed and automate errors.

Geoff Crossland

Sep 3, 1999, 3:00:00 AM

In message <9c10d83b49%oli...@connect4free.net>
oliverc <oli...@connect4free.net> wrote:

<snip>

> I'm starting to get a little confused - we've always had an FPE module,
> and programs use this (on occasion), if one has an FPA, I would assume
> that it would come with a new FPE module which would pass all FP
> instructions to the FPA instead?

I presume an FPU would come up on the ARM as a
coprocessor and thus get any FP instructions before
an Undefined Instruction happened. You'd still have
a stub FPEm around, for RMEnsuring against.

Having said that, the software (from WSS?) which
uses an x86 CoPro to provide ARM FPU facilities
would have to work by picking up aborts and
translating instructions; I recall reading somewhere
that some or all of the ARM 3 FPUs did not directly
support the ARM FPU instruction set.

Matthias Seifert

Sep 4, 1999, 3:00:00 AM

oliverc <oli...@connect4free.net> wrote:

[...]

> I'm starting to get a little confused - we've always had an FPE module,
> and programs use this (on occasion), if one has an FPA, I would assume
> that it would come with a new FPE module which would pass all FP
> instructions to the FPA instead?

Competition puzzle: Name 5 RISC OS applications (no ports from other
platforms) which make use of FP instructions... ;-)

The Draw module has defined the relevant SWI names (e.g.
"Draw_ProcessPathFP"), but as the PRM says: "They may be supported in some
future version of RISC OS, but if you try to use them in current versions
you'll get an error back."

Peter Bell

Sep 4, 1999, 3:00:00 AM

In message <493bf9254...@t-online.de>
Matthias Seifert <M.Se...@t-online.de> wrote:

> Competition puzzle: Name 5 RISC OS applications (no ports from other
> platforms) which make use of FP instructions... ;-)

I guess that any app which goes to the trouble of Rmensuring the
FPEmulator would use FP (without dumping the code to check).

I found that the FPE is loaded by the first five I looked at, which
were:

Eureka, PipeDream, FireWorkz, Schema and S-Base.

Ah, PipeDream was ported to the RO environment (many years ago), so
substitute DataPower - that started on RO, didn't it. Just for good
measure, add Impact.

--
-------------------------------------------------------------------
Peter Bell - pe...@foursqre.demon.co.uk - FourSquare Computing Ltd
5 Drome Path, Winnersh, Wokingham, Berkshire RG41 5HB, UK.
Tel. +44 (0) 118 989 0982 Fax. +44 (0) 118 989 0929

Matthias Seifert

Sep 4, 1999, 3:00:00 AM

Geoff Crossland <gcros...@eccentricity.freeserve.co.uk> wrote:
> In message <9c10d83b49%oli...@connect4free.net>
> oliverc <oli...@connect4free.net> wrote:

> <snip>

> > I'm starting to get a little confused - we've always had an FPE module,
> > and programs use this (on occasion), if one has an FPA, I would assume
> > that it would come with a new FPE module which would pass all FP
> > instructions to the FPA instead?

> I presume an FPU would come up on the ARM as a coprocessor and thus get
> any FP instructions before an Undefined Instruction happened. You'd
> still have a stub FPEm around, for RMEnsuring against.

Not quite right.

> Having said that, the software (from WSS?) which uses an x86 CoPro to
> provide ARM FPU facilities would have to work by picking up aborts and
> translating instructions; I recall reading somewhere that some or all of
> the ARM 3 FPUs did not directly support the ARM FPU instruction set.

Well, all "FPUs" for ARM processors (i.e. FPA10 for ARM3 and FPA11 for
ARM700 [and ARM7500FE]) do not support the _full_ FP instruction set.
(That's why they are not called "floating point unit" but "floating point
accelerator".) They support most of the common commands and use the FPE to
emulate the others.
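
The division of labour described above (hardware for the common instructions, the FPE for the rest) can be sketched as a simple dispatch in Python. The split between the two tables is illustrative only and is an assumption of this example, not a list taken from an FPA data sheet; the mnemonics are used just as labels.

```python
import math
import operator

# Assumed for illustration: operations the accelerator executes itself.
HARDWARE_OPS = {
    "ADF": operator.add,
    "SUF": operator.sub,
    "MUF": operator.mul,
    "DVF": operator.truediv,
}

# Assumed for illustration: operations that bounce to the software emulator,
# standing in for the FPE's Undefined Instruction handler.
SOFTWARE_OPS = {
    "POW": math.pow,
    "SIN": lambda a, _b: math.sin(a),
}

def execute(op, a, b=0.0):
    """Return (who handled it, result) for one FP operation."""
    if op in HARDWARE_OPS:
        return "FPA", HARDWARE_OPS[op](a, b)
    if op in SOFTWARE_OPS:
        return "FPE", SOFTWARE_OPS[op](a, b)
    raise ValueError(f"unknown operation: {op}")

print(execute("MUF", 3.0, 4.0))
print(execute("POW", 2.0, 10.0))
```

The program issuing the instruction sees the same result either way; only the time taken differs, which is why the same binary runs with or without the hardware fitted.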

Charlie Baylis

Sep 4, 1999, 3:00:00 AM

In article <493bf9254...@t-online.de>,

Matthias Seifert <M.Se...@t-online.de> wrote:
> Competition puzzle: Name 5 RISC OS applications (no ports from other
> platforms) which make use of FP instructions... ;-)

Browse, SciCalc, Sleuth2, Eureka, ArcFax.

Charlie

--
New RISC OS mp3 player: http://www.fish.zetnet.co.uk/

Tony Houghton

Sep 4, 1999, 3:00:00 AM

In <493bf9254...@t-online.de>, Matthias Seifert <M.Se...@t-online.de> wrote:

> Competition puzzle: Name 5 RISC OS applications (no ports from other
> platforms) which make use of FP instructions... ;-)

Well you can count ArtToSpr, Notice Board Pro and all the apps in
Picture Book 2 :-). They only use a little bit, just enough to calculate
scaling factors, then it's converted to fixed point for rendering.
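
A minimal sketch of that pattern, assuming a 16.16 fixed-point format and a nearest-neighbour row scaler. Both are assumptions for illustration, as are the names FRAC_BITS, to_fixed and scale_row; this is not the code from any of the applications named above. Floating point is used once to work out the step, and the inner loop is integer-only.

```python
FRAC_BITS = 16            # assumed 16.16 fixed-point format
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0


def to_fixed(x):
    """Convert a float to 16.16 fixed point, truncating towards zero."""
    return int(x * ONE)


def scale_row(src, dst_width):
    """Resample one row of pixels to dst_width using nearest-neighbour."""
    # The only floating point operation: the scaling factor.
    step = to_fixed(len(src) / dst_width)
    pos = 0
    out = []
    for _ in range(dst_width):
        # Integer-only inner loop: shift to get the source index.
        out.append(src[pos >> FRAC_BITS])
        pos += step
    return out
```

For example, scale_row([10, 20, 30, 40], 8) doubles each pixel, and scale_row([10, 20, 30, 40], 2) keeps every second one.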

--
TH * http://homepages.tcp.co.uk/~tonyh/
Supporting CUT: http://www.unmetered.org.uk/

David Watson

Sep 7, 1999, 3:00:00 AM

to

In message <493a96818...@t-online.de>
Matthias Seifert <M.Se...@t-online.de> wrote:

> David Watson <9404...@udcf.gla.ac.uk> wrote:
> > In message <4934cd931...@t-online.de>
> > Matthias Seifert <M.Se...@t-online.de> wrote:

> > It's the old story. However if the ARM10 is to have an integrated FPU
> > (and RiscOS can use this processor) then things may change a little.
>
> Well, we've already had FP support for quite a while - you could get the FPA
> for the ARM3 (which 'some' users did) and an A7000+ (or now the
> RiscStation) comes with hardware FP support as standard. And what happened
> on the software side so far? Nothing. Your theory seems logical, but
> reality looks a bit different...

The point of having an FPA is that you can run processes that require
FP operations faster, or so it seems at first. However, the StrongARM
RPC is still the fastest RISC OS machine around, while the fastest RISC OS
box with an FPA is the RiscStation (et al). Optimising a program for FP
operations would therefore probably still result in a slower program
overall: the lack of integer speed on the FPA machines would negate any
gain in FP performance over the SA. However, if a future flagship RISC OS
machine were to have an FPA, the story of program development might be
different.

> > > > Is it true that the real reason that we never get FPUs is that they
> > > > are against the RISC philosophy?
>
> > > How could it? We already have/had FPUs several times:
>
> > Yeah, but RISC means *Reduced* Instruction Set. Adding an FPU *adds* more
> > instructions. Hence not such a reduced instruction set :-)
>
> Erm, ever seen the instruction sets of other 'RISC' processors? I guess
> that the ARM instruction set would be considerably 'reduced' even with FP
> support. :-)

1 CISC instruction = many RISC instructions
So, is it because of the more compact architecture in RISC chips that
these 'many' instructions are executed faster than the 1 CISC one? Also,
do RISC processors still have more registers than CISC equivalents? I
believe that this was one of the reasons for RISC's high performance.

> > Until the Pentium the FPU was an option and not the norm.

> Well, until the 486DX in fact which was a few years before the Pentium...

I see the 486DX as an 'option' in the 486 family. For the Pentium family
there was not a non-FPU option.

> > I'm sure that a lot of the fixed point 'work arounds' that programmers
> > have used to avoid FPs are significantly slower than more direct
> > floating operations.

<snip: interesting stuff on integer vs float performance>

> And that FP became _that_ important with x86 (and there mainly with games)
> was 'a bit' forced by Intel, because their FP unit was considerably faster
> than those of AMD and Cyrix - but now there is the Athlon...

He he he. What will Intel's next trick be? A quick word in Bill Gates' ear
and "whoops", Win2000 doesn't work with Athlon processors.

(Kidding! Only kidding, Bill. Put those lawyers away... No, really. I was only
kidding! Bill? Aaargh..... bankrupt!)

> > Would I be right in thinking that the StrongARM is faster at processing
> > FP instructions than an ARM7 anyway because it has more brute force
> > processing power?

> No, you wouldn't.

Well, wasn't the first time :-)

> > Certainly under normal use this should be the case.

> But it isn't.

I meant in normal day-to-day operation. If floats were used as needed (not
worked around or used excessively) a SA RPC would still be marginally faster
in my opinion. But with future machines possibly having a fast processor
AND FPA things may be different.

> > Tasks such as MP3 encoding is very floating point intensive but is still
> > faster on a SA (AFAIK).

> But even 'FP intensive' code has to do some integer calculation and must
> access the RAM from time to time... :-)

See my point above :-)

It's nice to have a discussion rather than an argument! Pity others in this
news group don't always seem to agree with this idea. (Not that I get into
arguments all the time :-) )

Dave Watson
--
____ _ __
/ __/__ ____(_) /____ __ _ ___ ____
_\ \/ _ \/ __/ / __/ -_) ' \/ _ `/ _ \
/___/ .__/_/ /_/\__/\__/_/_/_/\_,_/_//_/
/ / http://come.to/daves-place

Slam a revolving door today.

David Watson

Sep 7, 1999, 3:00:00 AM

to

In message <OSLqsHAh...@mk-net.demon.co.uk>

Tony van der Hoff <to...@mk-net.demon.co.uk> wrote:

> In article <551c5b3a49%9404...@udcf.gla.ac.uk>, David Watson

> <9404...@udcf.gla.ac.uk> writes

> >Food for thought,

> Chicken curry, or curried eggs?

Your mind must be really screwed up :-)

Dave (had an omelette tonight) Watson

Matthias Seifert

Sep 8, 1999, 3:00:00 AM

to

David Watson <9404...@udcf.gla.ac.uk> wrote:

[...]

> > > Yeah, but RISC means *Reduced* Instruction Set. Adding an FPU *adds*
> > > more instructions. Hence not such a reduced instruction set :-)
> >
> > Erm, ever seen the instruction sets of other 'RISC' processors? I
> > guess that the ARM instruction set would be considerably 'reduced'
> > even with FP support. :-)

> 1 CISC instruction = many RISC instructions

Usually - but the ARM has already proved that this doesn't have to be so. :-)

> So, is it because of the more compact architecture in RISC chips that
> these 'many' instructions are executed faster than the 1 CISC one?

Well, even on CISC processors most instructions take 1 cycle. The
advantage of RISC is that the compact design leaves space for bigger
caches and/or makes high clock rates easier.

> Also, do RISC processors still have more registers than CISC
> equivalents? I believe that this was one of the reasons for RISC's high
> performance.

Well, AFAIK even the 68000 had (approx.) as many registers as an ARM...

> > > Until the Pentium the FPU was an option and not the norm.

> > Well, until the 486DX in fact which was a few years before the
> > Pentium...

> I see the 486DX as an 'option' in the 486 family. For the Pentium family
> there was not a non-FPU option.

Well, AFAIK there were only the 486SX25(?), 486SX33 and 486SX40 - all
others, i.e. with clock rates >40 MHz, were only available as DX versions.

[...]

> > > Would I be right in thinking that the StrongARM is faster at
> > > processing FP instructions than an ARM7 anyway because it has more
> > > brute force processing power?
>
> > No, you wouldn't.

> Well, wasn't the first time :-)

> > > Certainly under normal use this should be the case.

> > But it isn't.

> I meant in normal day-to-day operation. If floats were used as needed
> (not worked around or used excessively) a SA RPC would still be
> marginally faster in my opinion. But with future machines possibly
> having a fast processor AND FPA things may be different.

Well, _if_ we had programs that made use of FP instead of bypassing
it by using fixed point routines (e.g. Draw, Photodesk, Artworks) I guess
that a RiscStation would be faster than an SA RPC in those cases (but not
only because of FP performance).

Torben AEgidius Mogensen

Sep 8, 1999, 3:00:00 AM

to

David Watson <9404...@udcf.gla.ac.uk> writes:

>1 CISC instruction = many RISC instructions
>So, is it because of the more compact architecture in RISC chips that
>these 'many' instructions are executed faster than the 1 CISC one? Also,
>do RISC processors still have more registers than CISC equivalents? I
>believe that this was one of the reasons for RISC's high performance.

If you look at a complex instruction in isolation and compare it to a
sequence of simpler instructions, it is most often possible to
implement the complex instruction so it runs (slightly) faster than
the sequence of simple instructions. However, some of the hardware
needed to make the complex instruction fast may interact badly with
the hardware that executes simple instructions (e.g., the decode stage
or the memory interface may be more complex). Since, even on a CISC,
you execute mostly simple instructions, the tradeoff turns in favour
of using only simple instructions.

Furthermore, by avoiding complexity in e.g. the memory interface
(at most one use of the MMU per instruction allows simpler restart
mechanisms after memory faults), you can use the saved transistors to
implement other speed-enhancing features: a faster multiplication array,
more cache, more registers etc. Additionally, by keeping the
instructions simple, you can use a hardwired implementation instead of
microcode.

Some of the advantages have been reduced in the last decade as a
product of vastly increased transistor budgets: Even complex
instructions can be hardwired and there is room for fast
multiplication arrays and large caches even in complex processors.
Furthermore, the logic needed for out-of-order superscalar processing
is of similar complexity for CISC and RISC, and this is taking an ever
increasing portion of the total number of transistors.

A RISC may actually have fairly complex instructions. The ARM, for
example, allows a single instruction to test a condition, shift a
register, add this to another register, use the result as an address
for a load or store operation and finally write the address back to a
register. The main difference between this complex instruction and a
complex CISC instruction is that all these operations (mainly) use
different parts of the CPU where a complex CISC instruction may, e.g.,
use the memory interface at three or more different addresses.
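
As a concrete case, an instruction of the form `LDRNE R0, [R1, R2, LSL #2]!` does all of the steps listed above. Below is a small Python model of its effect, as a sketch only: the function name, the dictionary-based register file and memory, and the single Z flag argument are assumptions for illustration, and it ignores real-hardware details such as alignment and the case where the base and destination registers are the same.

```python
MASK32 = 0xFFFFFFFF


def ldrne_shifted_writeback(regs, memory, z_flag, rd, rn, rm, shift):
    """Model LDRNE Rd, [Rn, Rm, LSL #shift]! on a toy register file."""
    # 1. Test the condition: NE means "execute only if Z is clear".
    if z_flag:
        return False
    # 2. Shift the offset register.
    offset = (regs[rm] << shift) & MASK32
    # 3. Add it to the base register to form the address.
    address = (regs[rn] + offset) & MASK32
    # 4. Use the result as the address for the (single) memory access.
    regs[rd] = memory[address]
    # 5. Write the address back to the base register.
    regs[rn] = address
    return True
```

Note that only step 4 touches memory; the other steps use the condition logic, the shifter, the ALU and the register file.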

As for the number of registers, this is somewhat tied to compiler
technology. If you do register allocation per basic block, only
FORTRAN style numeric code can effectively use more than 4-8 registers
most of the time. When graph-colouring introduced practical "global"
(per-procedure) register allocation, exploiting 16-32 registers was
possible, but for most modern code, which is quite call-intensive,
little more than this can be effectively exploited. Interprocedural
register allocation can exploit more registers, but few mainstream
compilers do this.
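
To show what "graph-colouring" means here, a minimal sketch in Python. It is a plain greedy colouring of an interference graph, highest degree first, and not the full simplify-and-spill algorithm production compilers use. The function name colour_registers and the example variable names are assumptions for illustration. Two variables interfere if they are live at the same time, so they must not share a register.

```python
def colour_registers(interference, num_regs):
    """Greedily assign registers 0..num_regs-1 to variables.

    interference maps each variable to the set of variables live at the
    same time. Returns (assignment, spilled), where spilled lists the
    variables that did not fit and would have to live in memory.
    """
    assignment = {}
    spilled = []
    # Visit the most constrained variables first (ties broken by name).
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for var in order:
        # Registers already taken by interfering neighbours.
        used = {assignment[n] for n in interference[var] if n in assignment}
        free = [r for r in range(num_regs) if r not in used]
        if free:
            assignment[var] = free[0]
        else:
            spilled.append(var)
    return assignment, spilled


# Hypothetical procedure: a, b and c are all live together, d only with a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

With three registers everything fits and d shares a register with b; with two registers one of the variables has to be spilled.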

Torben Mogensen (tor...@diku.dk)

Tau Press MD

Sep 8, 1999, 3:00:00 AM

to

In message <7r5bb8$k...@grimer.diku.dk>

tor...@diku.dk (Torben AEgidius Mogensen) wrote:

> David Watson <9404...@udcf.gla.ac.uk> writes:
>
> >1 CISC instruction = many RISC instructions
> >So, is it because of the more compact architecture in RISC chips that
> >these 'many' instructions are executed faster than the 1 CISC one? Also,
> >do RISC processors still have more registers than CISC equivalents? I
> >believe that this was one of the reasons for RISC's high performance.
>

> [snip]

>
> Furthermore, by avoiding complexity in e.g. the memory interface
> (maximum one use of the MMU per instruction causes simpler restart
> mechanisms on memory failures), you can use the saved transistors to
> implement other speed-saving features: A faster multiplication array,
> more cache, more registers etc. Additionally, by keeping the
> instructions simple, you can use hardwired implementation instead of
> microcode.
>

> [...]

I think you underplay the significance of microcode here -- the point being
that in CISC processors the complex instructions are effectively
"interpreted" into another language, the microcode, hence you have another
timing overhead which RISC chips do not.

--
Steve Turnbull, Managing Director m...@tau-press.com
Tau Press Ltd, Magazine Publishing www.tau-press.com
Tau Press, Media House, Adlington Park, Macclesfield, SK10 4NP, UK

Matthias Seifert

Sep 8, 1999, 3:00:00 AM

to

Tau Press MD <m...@tau-press.com> wrote:
> In message <7r5bb8$k...@grimer.diku.dk>
> tor...@diku.dk (Torben AEgidius Mogensen) wrote:

[...]

> > Additionally, by keeping the instructions simple, you can use
> > hardwired implementation instead of microcode.
> >
> > [...]

> I think you underplay the significance of microcode here -- the point
> being that in CISC processors the complex instructions are effectively
> "interpreted" into another language, the microcode, hence you have
> another timing overhead which RISC chips do not.

I thought Torben already meant this...

On the other hand, more complex instructions have much more potential for
optimisation. We have seen this with the MUL and MLA instructions of the
ARM, but 'unfortunately' most other ARM instructions already use 1 cycle
to complete...

Torben AEgidius Mogensen

Sep 8, 1999, 3:00:00 AM

to

Tau Press MD <m...@tau-press.com> writes:

>I think you underplay the significance of microcode here -- the point being
>that in CISC processors the complex instructions are effectively
>"interpreted" into another language, the microcode, hence you have another
>timing overhead which RISC chips do not.

It is possible to avoid microcode on CISCs. It just costs a good deal
more to do so than on RISCs. A common technique with CISCs these days
is to translate the CISC instructions to sequences of
microinstructions during decode. This is different from traditional
microcode in the same way that compilers are different from
interpreters. Granted, you add a pipeline stage (or 2) for the
translation, but this doesn't affect the instruction throughput, only
the latency. This means that mispredicted branches are more expensive,
which is why good branch-prediction is more important on PIII than it
is on Alpha or StrongARM. However, with a large transistor budget you
can implement quite good predictors, so this again leads to the
conclusion that, given unlimited resources, RISC doesn't have that
much of an advantage over CISC. Where RISC shines is in low-budget
processors, where the low budget (by today's standards) can be caused by
limits in technology (as it was in the 80's) or by economic
considerations.
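
The latency-versus-throughput point can be put into numbers with a standard back-of-envelope model: average cycles per instruction equals the base rate plus the fraction of branches, times the misprediction rate, times the refill penalty. The sketch below is in Python, and every figure in it (branch fraction, misprediction rates, penalties) is an invented assumption, not a measurement of any real processor.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average cycles per instruction including branch misprediction stalls."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Assumed workload: one instruction in five is a branch.
BRANCHES = 0.2

# Short pipeline, modest predictor (assumed figures).
short_pipe = effective_cpi(1.0, BRANCHES, 0.10, 3)

# Two extra translation stages: same base throughput, larger penalty.
long_pipe = effective_cpi(1.0, BRANCHES, 0.10, 5)

# Same long pipeline with a better predictor bought with more transistors.
long_pipe_good_predictor = effective_cpi(1.0, BRANCHES, 0.05, 5)
```

The extra stages leave the base rate of one instruction per cycle untouched and only raise the cost of each misprediction, which a better predictor can then win back.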

Torben Mogensen (tor...@diku.dk)

Alan P Dawes

Sep 8, 1999, 3:00:00 AM

to

In article <fd0903d49%9404...@udcf.gla.ac.uk>,

David Watson <9404...@udcf.gla.ac.uk> wrote:
> 1 CISC instruction = many RISC instructions
> So, is it because of the more compact architecture in RISC chips that
> these 'many' instructions are executed faster than the 1 CISC one?

There isn't really a simple answer to this.

The original 'raison d'etre' for the development of RISC processors in the
1980s was that when programs were analysed it was found that for over 96%
of the time they were only executing simple instructions. A cpu optimised
for complex instructions, on which all instructions (simple or complex)
took the same number of processor cycles, was therefore wasting a lot of
processor time. A RISC processor instead runs the simple subset of commands
as quickly as possible (in one cycle each if possible) and synthesizes the
more complex ones on the few occasions when they are needed. Although a
synthesized instruction is slower than one 'hard coded' into the processor,
the program as a whole executes faster.

Thus the answer 10 years ago would be that although a complex instruction
on a CISC cpu was executed faster than on a RISC cpu, since the majority
of instructions when running a program are simple, overall a RISC cpu
executes the program faster.
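
A worked version of that argument, as a sketch in Python. The 96% figure is the one quoted above, here treated as the fraction of executed instructions that are simple; the cycle counts for each kind of instruction are invented assumptions chosen only to show the shape of the calculation.

```python
def average_cycles(simple_fraction, simple_cycles, complex_cycles):
    """Weighted average cycles per instruction for a given instruction mix."""
    complex_fraction = 1.0 - simple_fraction
    return simple_fraction * simple_cycles + complex_fraction * complex_cycles


SIMPLE_FRACTION = 0.96  # from the analysis quoted above

# Assumed 1980s CISC: every instruction, simple or complex, takes 4 cycles.
cisc = average_cycles(SIMPLE_FRACTION, 4, 4)

# Assumed RISC: simple ones take 1 cycle, a complex operation has to be
# synthesized from about 10 simple instructions.
risc = average_cycles(SIMPLE_FRACTION, 1, 10)
```

Under these assumptions the CISC averages 4 cycles per operation and the RISC about 1.36, even though each complex operation on its own is much slower on the RISC.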

But things have moved on since the 1980s. Computer functionality and
complexity of programs have vastly increased, and the >96% figure for simple
instructions no longer holds for some programs. Both CISC and RISC CPU
architectures have developed a long way since then, with the designers of
each learning from the other. Modern CISC cpus are much more efficient,
using processor cycles as effectively as possible depending on the type of
instruction, so simple instructions may now take no more cycles on a
modern CISC processor than on a RISC one, while complex ones still execute
faster, thus negating the 'speed' advantage of RISC on normal programs.

But RISC processors are simpler to design, test and fabricate, use less
power and run cooler, and so can be clocked faster than the equivalent CISC
cpu. They are also cheaper, and so more cost effective in many
applications, which is why ARM is doing well.

Alan

--
--. --. --. --. : : --- --- ----------------------------
|_| |_| | _ | | | | |_ | alan....@argonet.co.uk
| | |\ | | | | |\| | |
| | | \ |_| |_| | | |__ | Using an Acorn RiscPC