M2
10 custom processors
bus bandwidth: 528 Mbytes/sec
100 mill pixels/sec rendering speed
1 million polygons/sec
700K polygons/sec w/ all features
CPU: power PC 602
speed: 66 MHz RISC
instruction/data caches: 64 Kbytes total (32K/32K)
Floating Point Math capability: 132 Mill floating point ops/sec
Main memory: SDRAM/ROM 48 Mbits
Bus: 64-bit
Cache coherent memory system
Graphic res: 640x480 and 320x240x24 or 16-bit color depth
Full motion video: MPEG-1 video built in
MPEG engine supports JPEG decompression
any enlightenment would be appreciated.
Keyz
[juicy specs on the M2 deleted]
What it means in English is, it makes the Saturn & PlayStation look
like Game Boys in comparison.
Paul
--
.-----------------------------------------------------------------.
!Email pa...@rance.demon.co.uk 2:254/516.2@Fidonet !
! !
! WWW page is http://metro.turnpike.net/P/paulr/index.html !
`-----------------------------------------------------------------'
>You know about this already... I think this CPU is about as powerful as a
>slow 486. Maybe 3-4 times as fast as the CPU in the 3DO now. (?)
I don't think the PowerPC can be compared to a slow 80486. If
anything, you should compare a 66 MHz PPC with a 66 MHz Pentium
(approximately). Both processors have about the same integer
performance, if I recall correctly. I'm not sure how the PPC 602
compares with the PPC 603, but the PPC 603 destroys a Pentium when it
comes to floating point arithmetic.
>: Floating Point Math capability: 132 Mill floating point ops/sec
>Math that deals with really big or really small numbers. Floating point
>math is almost never used in PC games because it's slow. Perhaps the M2
>is different somehow... Hopefully somebody else can clear this up.
Floating point gives you much better precision (you don't
discard fractional components). I believe that using floating-point,
while slow, makes it so the programmers don't have to worry about a
jittery image caused by integer round off.
>: Cache coherent memory system
>Not sure. Sounds like it uses those caches I mentioned in an efficient way.
It's a way of ensuring that a memory write will be seen by all
subsequent accesses to the modified memory. If one sub-system does
a write to cache and the cache doesn't immediately write the changed
data to main memory (i.e., it doesn't write-through), another
sub-system could try reading the same memory but not get the recently
written data, since the correct data is still in the cache and not in
the main memory.
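
The stale-read problem described above can be sketched with a toy model. This is only an illustration under stated assumptions: the class name `WriteBackCache`, the address 0x100, and the dictionary standing in for main memory are all made up for the example and say nothing about how the M2 hardware actually works.

```python
# Toy model of a write-back cache with no coherence mechanism.
# All names here are hypothetical and chosen only for illustration.
class WriteBackCache:
    def __init__(self, memory):
        self.memory = memory   # shared main memory: address -> value
        self.lines = {}        # locally cached values, possibly newer than memory

    def write(self, addr, value):
        # Write-back: update only the cache; main memory stays stale until flush().
        self.lines[addr] = value

    def read(self, addr):
        # Serve from the cache if present, otherwise fall through to main memory.
        return self.lines.get(addr, self.memory[addr])

    def flush(self):
        # Copy every cached line back to main memory.
        self.memory.update(self.lines)


memory = {0x100: 1}
cpu = WriteBackCache(memory)

cpu.write(0x100, 42)
stale = memory[0x100]   # another sub-system reading main memory directly
cpu.flush()
fresh = memory[0x100]   # after the write-back, the same read sees the new data
print(stale, fresh)
```

A cache coherent system is one where the hardware guarantees the second kind of read without the programmer having to call something like `flush()` by hand.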
--
Milton W. Kuo
seg0...@bayou.uh.edu
I'd bet on rerendering; it's convenient to be able to draw something, then
draw something else on top of it. :)
Also, the current system's spec is "64 million pixels/second", but they
mean 16 million, plus interpolation.
>: CPU: power PC 602
>You know about this already... I think this CPU is about as powerful as a
>slow 486. Maybe 3-4 times as fast as the CPU in the 3DO now. (?)
s/slow/blindingly fast/.
A PPC601 can *emulate* a 486 fast enough to be usable. As usual, Motorola
processors are detectably to dramatically faster than the Intel product.
>: Floating Point Math capability: 132 Mill floating point ops/sec
>Math that deals with really big or really small numbers. Floating point
>math is almost never used in PC games because it's slow. Perhaps the M2
>is different somehow... Hopefully somebody else can clear this up.
That means it's much much much faster than the '486. Floating point math
is useful when you are okay with losing some precision on very large or
very small numbers, but want to be able to store anything from .00001 to
2 trillion. (Actually the range is much, much wider.) The M2 is fast
enough to use it sanely; as is a 68040 based system, for instance. The
486 isn't. A 586 nearly is, apart from that division bug.
>: Bus: 64-bit
>Hard to explain... I think I'll skip it. :) Basically, it's how many
>wires run between each processor.
Actually, not exactly; basically, it's how many bits of information "at a time"
can go from one processor to another. Sort of. It's worse in real life.
>It can play back movies that are recorded onto discs. (You won't have to
>buy a separate decoder to do this.)
It can also have small mpeg streams mixed in with audio, and decode them,
and use them as texture maps. Dream about it.
-s
--
Peter Seebach - se...@solon.com -- se...@intran.xerox.com
All the arrogant jerks who object to stereotypes are all alike.
C/Unix proto-wizard -- C/Unix questions? Send mail for help.
Copyright 1995 Peter Seebach. Not for distribution through Microsoft Network.
KK> M2
[impressive specs deleted]
KK> any enlightenment would be appreciated.
How about, KICK-ASS GAMES MACHINE? :)
.\\arco
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: E-mail [ kr...@xs4all.nl ] :: For the ASCII-impared, the ::
:: [ mar...@euronet.nl ] :: .\\ is supposed to be an M. ::
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
[after I had written]
: >"use up" 100 million pixels. Obviously, that would be impossible to
: >display, so I don't quite know what this spec tells us.
: I'd bet on rerendering; it's convenient to be able to draw something, then
: draw something else on top of it. :)
No doubt... Also perhaps it would be handy when you can only "render" a
smallish percentage of the time.
: >You know about this already... I think this CPU is about as powerful as a
: >slow 486. Maybe 3-4 times as fast as the CPU in the 3DO now. (?)
: s/slow/blindingly fast/.
What's that mean? :) Must be one of those programmer things... Someday...
: A PPC601 can *emulate* a 486 fast enough to be usable. As usual, Motorola
: processors are detectably to dramatically faster than the Intel product.
So you're saying that a PPC602'66 is as fast or even faster than a 486'66?
I recall a chart showing "specInt"s or something that seemed to be
showing that the 602 was quite a bit slower than the (say) Pentium. Like
40 specInts as compared to 200 for the Pentium. Did I just imagine that?
: >: Floating Point Math capability: 132 Mill floating point ops/sec
: That means it's much much much faster than the '486. Floating point math
: is useful when you are okay with losing some precision on very large or
: very small numbers, but want to be able to store anything from .00001 to
: 2 trillion. (Actually the range is much, much wider.) The M2 is fast
: enough to use it sanely; as is a 68040 based system, for instance. The
: 486 isn't. A 586 nearly is, apart from that division bug.
The 602 does more MFLOPS than the Pentium'60 then? Question: Can the
602 do a floating point operation at the same time it's doing "something
else"?
: It can also have small mpeg streams mixed in with audio, and decode them,
: and use them as texture maps. Dream about it.
I do... I'm trying to get help though. :)
Dave Nagy
>No doubt... Also perhaps it would be handy when you can only "render" a
>smallish percentage of the time.
Exactly; specs should be higher than you need.
>: s/slow/blindingly fast/.
>What's that mean? :) Must be one of those programmer things... Someday...
ed lingo.
replace "slow" with "blindingly fast" in previous comment.
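
For anyone who has not met ed's substitute command, here is a rough Python equivalent. It is only a sketch: the sample sentence is taken from the quoted post, and `re.sub` with `count=1` stands in for ed's default behavior of replacing the first match on the line.

```python
import re

original = "I think this CPU is about as powerful as a slow 486."

# Like ed's s/slow/blindingly fast/, replace only the first match on the line.
edited = re.sub(r"slow", "blindingly fast", original, count=1)
print(edited)
```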
>: A PPC601 can *emulate* a 486 fast enough to be usable. As usual, Motorola
>: processors are detectably to dramatically faster than the Intel product.
>So you're saying that a PPC602'66 is as fast or even faster than a 486'66?
>I recall a chart showing "specInt"s or something that seemed to be
>showing that the 602 was quite a bit slower than the (say) Pentium. Like
>40 specInts as compared to 200 for the Pentium. Did I just imagine that?
That sounds wayyyyy too good for the Pentium. It's not that fast a chip;
maybe 1.5x as fast as a 486 at the same speed. Well, Dunno For Sure, but
as I recall, a 486/66 was a tad faster than a 68040/40, and that's QUITE
a bit slower than a 66 MHz PPC.
>The 602 does more MFLOPS than the Pentium'60 then? Question: Can the
>602 do a floating point operation at the same time it's doing "something
>else"?
That, I don't remember. I'd guess it can; everything else with an FPU
can, sort of. I'd bet that it can do at least a handful of them at the
same time as other things.
>: It can also have small mpeg streams mixed in with audio, and decode them,
>: and use them as texture maps. Dream about it.
>I do... I'm trying to get help though. :)
Glad to help, as much as I can.
1. MUCH wider range. Say, 3.4e+38 instead of 4.3e+9.
2. Can represent numbers smaller than 1.
The latter is the big key; you can scale by "75%".
Basically, floats give you better accuracy on most math, because they
can represent intermediate states. They still have rounding error, but
it's less severe.
Also, for at least some chips, you can do both at once; basically, floating
point may not take much CPU time. Not sure if this is one of those chips.
Historically, floating point is hugely expensive and slow; recent chips,
especially the 680x0 and Power PC, have been changing this.
I'm no expert, but here's what I think this stuff means. Someone will
correct me when I say something really stupid. (I hope.)
: M2
"Mark 2" The 2nd version of the 3DO hardware spec. May not be the name
used at launch.
: 10 custom processors
Specialized hardware that does various "stuff" so that the main CPU
(central processing unit, a PowerPC 602 in this case) doesn't have to.
The number "10" may include the 3-4 processors that the 3DO has now.
They'll be included with the M2 for compatibility.
: bus bandwidth: 528 Mbytes/sec
That's how much info can be moved "between" the processors in any given
second. I'm sure it's far more complicated than this... A higher number
is good because it implies that the processors won't have to wait for
data to get to them.
: 100 mill pixels/sec rendering speed
Not too sure about this one... The highest resolution picture an M2 is
likely to generate is about 640x480. That's about 300,000 pixels. The
M2 would have to "draw" over 300 of these pictures a second in order to
"use up" 100 million pixels. Obviously, that would be impossible to
display, so I don't quite know what this spec tells us.
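
The arithmetic behind that puzzle can be checked in a few lines. This is a back-of-the-envelope sketch only: the 60 frames per second figure is an assumption for the sake of the example, and the fill rate is simply the quoted spec taken at face value.

```python
WIDTH, HEIGHT = 640, 480
FILL_RATE = 100_000_000    # the quoted "100 mill pixels/sec"
FRAMES_PER_SECOND = 60     # assumed display rate, not part of the spec list

pixels_per_frame = WIDTH * HEIGHT                      # 307200
full_screens_per_second = FILL_RATE / pixels_per_frame
overdraw_at_60fps = FILL_RATE / (pixels_per_frame * FRAMES_PER_SECOND)

print(pixels_per_frame)
print(round(full_screens_per_second, 1))
print(round(overdraw_at_60fps, 1))
```

Read this way, the spec is less about drawing hundreds of screens per second and more about how many times each visible pixel could be redrawn within a single frame.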
: 1 million polygons/sec
: 700K polygons/sec w/ all features
700K means 700 thousand. "Polygon" refers to the little "faces" or
"facets" that 3D objects are composed of. For example, a cube is
composed of 6 (square) faces; a top, a bottom, and four walls. One of the
characters in a 3D fighting game might be composed of anywhere from 50
to 5000 polygons, depending on how detailed "he" was.
"All the features" means that the polygons are textured and anti-aliased
and shaded, etc...
: CPU: power PC 602
You know about this already... I think this CPU is about as powerful as a
slow 486. Maybe 3-4 times as fast as the CPU in the 3DO now. (?)
: speed: 66 MHz RISC
The CPU runs at 66 million "cycles" per second. RISC means "Reduced
Instruction Set Computer". (don't ask)
: instruction/data caches: 64 Kbytes total (32K/32K)
Caches are little areas of REALLY fast memory. The CPU doesn't have to
go out and get stuff from the slower main memory if the info is already
in one of these areas.
: Floating Point Math capability: 132 Mill floating point ops/sec
Math that deals with really big or really small numbers. Floating point
math is almost never used in PC games because it's slow. Perhaps the M2
is different somehow... Hopefully somebody else can clear this up.
: Main memory: SDRAM/ROM 48 Mbits
It's got 4 megabytes (millions of bytes) of system memory and 2 megabytes
of ROM (Read Only Memory). The ROM is full of stuff that "every" game
needs. You can't change it. The RAM is empty space that can be filled
with stuff from the disc.
: Bus: 64-bit
Hard to explain... I think I'll skip it. :) Basically, it's how many
wires run between each processor.
: Cache coherent memory system
Not sure. Sounds like it uses those caches I mentioned in an efficient way.
: Graphic res: 640x480 and 320x240x24 or 16-bit color depth
The number of "dots" it can display at once. For example, 640 across and
480 down. The depth refers to how many colors the hardware has to choose
from for each dot on the screen. 24-bits gives you any of 16 million
colors, 16-bits gives you, uhhh, any of 65,000. Sorta.
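
The color counts follow directly from the number of bits per pixel. A minimal sketch (the helper name `colors` is made up for the example):

```python
def colors(bits_per_pixel):
    # Each extra bit doubles the number of distinct values a pixel can hold.
    return 2 ** bits_per_pixel

for bits in (8, 16, 24):
    print(bits, colors(bits))
```

The "sorta" is fair: some 16-bit modes spend only 15 bits on color (5 each for red, green and blue), which gives 32,768 colors rather than 65,536. Whether the M2 does this is not stated in the spec list.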
: Full motion video: MPEG-1 video built in
: MPEG engine supports JPEG decompression
It can play back movies that are recorded onto discs. (You won't have to
buy a separate decoder to do this.)
Dave Nagy
So what makes Floating Point operations more effective than Integer based?
Anyone?
Chris C
------------------------------------------------+
In order to understand one must learn how others
see the world and then learn how not to see the
world how one wants it be (S)
------------------------------------------------+
^^^ the P6 is supposed to get around 200. The
current pentiums score in the low 100's I believe.
-------------> What it means is VAPORWARE. Easy to have the best
specs on a product when it is nonexistent and when mock-ups are
run on another machine. Most importantly, they will never... never
be able to make the proper price-point with those specs...
I now happily own NO game system but it's cute that you
guys keep hoping for this thing. Don't hold your breath...
> M2
> 10 custom processors
> bus bandwidth: 528 Mbytes/sec
> 100 mill pixels/sec rendering speed
> 1 million polygons/sec
> 700K polygons/sec w/ all features
> CPU: power PC 602
> speed: 66 MHz RISC
> instruction/data caches: 64 Kbytes total (32K/32K)
> Floating Point Math capability: 132 Mill floating point ops/sec
> Main memory: SDRAM/ROM 48 Mbits
> Bus: 64-bit
> Cache coherent memory system
> Graphic res: 640x480 and 320x240x24 or 16-bit color depth
> Full motion video: MPEG-1 video built in
> MPEG engine supports JPEG decompression
> any enlightenment would be appreciated.
> Keyz
According to Trip, the M2 does 45 INT specmark compared to Sony and Saturn
which do about 15, and Ultra which also does 45.
>: >: Floating Point Math capability: 132 Mill floating point ops/sec
Trip claims 264 MFLOPS for M2, which would seem to dwarf the numbers of all
the rivals.
> So what makes Floating Point operations more effective than Integer based?
> Anyone?
A floating point number can vary between representing very high values and
smaller values at very high precision. I.e. the decimal point can 'float'.
Integer math (fixed point really) has a tradeoff between range and precision.
Ultimately, for gaming, it means 'smoother and more precise action' in the
same or a smaller amount of CPU time, compared to integer math.
Expect the M2 to deliver silky smooth games at wild speeds :)
Are
P.S.
The precision and range of fixed point can be improved by using multi-
precision calculations, meaning many smaller numbers represent a single
larger and/or more precise number, but that will be slow.
: You know about this already... I think this CPU is about as powerful as a
: slow 486. Maybe 3-4 times as fast as the CPU in the 3DO now. (?)
Um...Not quite. The PowerPC 602 has a 64-bit data bus that runs at
66 MHz. Even the 486DX4 100 has a 32-bit data bus that runs at 33 MHz
(severe bottlenecking). With a 528 Mbyte per second memory arch the MPC
602 should have no bottlenecks to memory, and outperform the most powerful
Pentium currently available. The little 12.5 MHz CPU in the
3DO doesn't even compare (528 Mbytes/sec vs. 50 Mbytes/sec).
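
The 528 Mbytes/sec figure is just bus width times bus clock. A small sketch, assuming one full-width transfer per clock (a theoretical peak; the helper name `peak_bandwidth_mbytes` is made up for the example):

```python
def peak_bandwidth_mbytes(bus_bits, clock_mhz):
    # One transfer of bus_bits per clock cycle; real buses rarely sustain this.
    return bus_bits // 8 * clock_mhz

m2_peak = peak_bandwidth_mbytes(64, 66)    # 528
dx4_peak = peak_bandwidth_mbytes(32, 33)   # 132
print(m2_peak, dx4_peak)
```

By the same arithmetic the 486DX4's external bus peaks at 132 Mbytes/sec, a quarter of the M2 number. How close either gets to its peak depends on memory latency and access patterns, which this calculation ignores.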
: : speed: 66 MHz RISC
: The CPU runs at 66 million "cycles" per second. RISC means "Reduced
: Instruction Set Computer". (don't ask)
RISC means Reduced Instruction Set Computer. It uses fewer, and less
specific instructions than a CISC (Complex Instruction Set Computer)
computer like a Pentium, thus making it more versatile. Each of the
instructions in a RISC CPU is simpler than those in a CISC and takes
fewer clock cycles to execute.
The MPC 602 can execute instructions out of order to make
better use of its power, and I believe it can execute 3 instructions
simultaneously.
--
-Shawn Rader
>10 custom processors
Each of the custom processors has a specific job to do such as:
DSP-digital signal processor, required for 3DO's patented 3D sound
technology and CD quality sound.
VP-video processor, required to display the picture elements or pixels
on the TV.
Graphic animation processors, these will be dedicated processors used to
calculate various special effects, like texture mapping, the ability to
lay a graphic such as skin or clothing onto a 3D wire frame, and shading,
the ability to smooth the sharp edges off a polygon rendered object such
as a ball.
The custom 602 CPU, for running the operating system and many other
nonspecialized duties.
Math coprocessors or floating point units, used to calculate vector
specific angles in ray tracing, scaling, light sourcing and polygon
rotation.
To put it simply, each of the 10 custom processors works simultaneously to
provide you with a true realtime rendered 3D experience, very much like
players on a football team.
>bus bandwidth: 528 Mbytes/sec
This is one of the most important specifications that will determine the
realtime rendering speed of the system. The 64-bit memory bus is like a
highway shuttling data from RAM memory back to the processors to be
used. Without this very high speed a very fast CPU or processor will
spend most of its time waiting for work. Very much like the Maytag
repairman waiting for a repair call or like using a 1200 bps modem on
the Internet.
>100 mill pixels/sec rendering speed
With a graphic resolution of 640x480 x 60 frames per second, that is about
18.4 million pixels or picture elements per second. The additional
capacity can be used for fogging, translucency or other special effects.
>1 million polygons/sec
This is a measure of 3D rendering ability. 250,000 is considered good.
At 1 million or more, sophisticated rendering techniques can be used.
>700K polygons/sec w/ all features
This will have to do with how real the rendered objects are. Each
special effect will add to the realism.
>CPU: power PC 602
This is a stripped-down PowerPC 603 designed by IBM, Motorola, Matsushita and
3DO that retains specific features that are essential for multimedia
applications. It uses a superscalar architecture and is able to execute
multiple instructions per clock cycle.
>speed: 66 MHz RISC
How fast any processor executes can sometimes be relative to
the clock speed; however, the RISC design uses a simpler instruction
set, generally with fewer than 100 instructions. The ARM used in the
M1 has fewer than 50. Keeping only these more frequently used
instructions has a lot to do with designing faster and more cost
effective silicon. As a result, with fewer transistors and fewer, simpler
instructions to execute, you have a faster and cheaper unit. Adding the
superscalar architecture effectively doubles the speed, so at 66 MHz you can
execute up to 132 million instructions per second.
>instruction/data caches: 64 Kbytes total (32K/32K)
Each cache provides the processor with a list of data or instructions
which does not have to be accessed from RAM but remains very close to the
CPU. This is like short term memory. Useful for data address locations or
simple algorithms or math formulas. It's like remembering a phone number
you just called without having to look it up again.
>Floating Point Math capability: 132 Mill floating point ops/sec
The 100 MFLOPS benchmark was often cited as the minimum number of
floating point operations per second for realtime 3D rendering. Whenever
a polygon is constructed on a computer, all the components of its
structure are described in vector or mathematical coordinates. This
will require mathematical calculation for all rotations or movement.
Trip Hawkins recently doubled the single precision floating point
figure to 264 MFLOPS. I believe that he may be taking into account
the 602's integer units acting as an additional floating point unit
between instructions.
>Main memory: SDRAM/ROM 48 Mbits
This is the amount of RAM and ROM available for the program to reside in
or use.
>Bus: 64-bit
Unlike Atari, the M2 will have a very fast 64-bit bus without bottlenecks.
Simply, this 64-bit highway will have all cars running full speed with no
traffic jams.
>Cache coherent memory system
Whenever data or instructions can be cached, the processor will spend less
time waiting and more of its time working on its task. If a processor
has to wait to get data that it uses for a graphic, the graphic will be
painted slower; it's that simple. This is important for frequently used
code such as data addresses or program kernels.
>Graphic res: 640x480 and 320x240 24 or 16-bit color depth
This has to do with the number of dots or pixels horizontal and vertical.
Color depth is given in bits: about 16.7 million colors for 24-bit,
65,536 colors for 16-bit and 256 colors for 8-bit.
>Full motion video: MPEG-1 video built in
MPEG stands for Moving Picture Experts Group. This is a digital
compression standard for coding and decoding full motion video.
The algorithm will code a picture element whenever there is a change in a
picture. For example, a fax machine will move slowly across lines that
contain written text and code those lines, while quickly skipping over
blank lines. The fax machine is a primitive form of graphic compression.
MPEG video is said to be slightly better than VHS; however, different
coders have slightly different characteristics. 3DO's is above average.
>MPEG engine supports JPEG decompression
JPEG is another, less sophisticated form of compression, meant for still images.
>any enlightenment would be appreciated.
>Keyz
Hope I helped clear up a few things on those specs.
Note: The recently introduced Sega Saturn is using twin Hitachi SH2 RISC
chips for its main engine. In this parallel scheme one of the SH2s will
have to double as the main operating engine and then serve in other tasks.
Contention has been noted in the glitching of graphics found in Virtua
Fighter and Daytona. Contention is an inherent weakness, where the data for
jobs gets missed due to bus restrictions. Also, many of the jobs carried
out by this engine in the future will continue to have glitching due to the
nature of the architecture.
Aloha,
Mike Sone
: M2
: 10 custom processors
: bus bandwidth: 528 Mbytes/sec
: 100 mill pixels/sec rendering speed
: 1 million polygons/sec
: 700K polygons/sec w/ all features
: CPU: power PC 602
: speed: 66 MHz RISC
: instruction/data caches: 64 Kbytes total (32K/32K)
^^^^^^^^^
Sorry, it is 64 Kbits of cache: it has a 4 Kbyte I-cache and a
4 Kbyte D-cache. But I like your spec better :) .
: Floating Point Math capability: 132 Mill floating point ops/sec
: Main memory: SDRAM/ROM 48 Mbits
: Bus: 64-bit
: Cache coherent memory system
: Graphic res: 640x480 and 320x240x24 or 16-bit color depth
: Full motion video: MPEG-1 video built in
: MPEG engine supports JPEG decompression
: any enlightenment would be appreciated.
: Keyz
--
------------------------------------------------------------------------
John Shamilian
j...@molson.ho.att.com
------------------------------------------------------------------------
: : You know about this already... I think this CPU is about as powerful as a
: : slow 486. Maybe 3-4 times as fast as the CPU in the 3DO now. (?)
: Um...Not quite. The PowerPC 602 has a 64-bit data bus that runs at
: 66 MHz. Even the 486DX4 100 has a 32-bit data bus that runs at 33 MHz
: (severe bottlenecking). With a 528 Mbyte per second memory arch the MPC
: 602 should have no bottlenecks to memory, and outperform the most powerful
: Pentium currently available. The little 12.5 MHz CPU in the
: 3DO doesn't even compare (528 Mbytes/sec vs. 50 Mbytes/sec).
Thanks for the info. I was basing my (wrong) guess on a "chart" posted
by someone quite a while ago. Since I don't even know what a Spec Int
_is_, I'm sure I interpreted it wrong. :P
: The MPC 602 can execute instructions out of order to make
: better use of its power, and I believe it can execute 3 instructions
: simultaneously.
I thought it was two... But you know how confused I get. :)
Dave Nagy
: : : You know about this already... I think this CPU is about as powerful as a
: : : slow 486. Maybe 3-4 times as fast as the CPU in the 3DO now. (?)
: : Um...Not quite. The PowerPC 602 has a 64-bit data bus that runs at
: : 66 MHz. Even the 486DX4 100 has a 32-bit data bus that runs at 33 MHz
: : (severe bottlenecking). With a 528 Mbyte per second memory arch the MPC
: : 602 should have no bottlenecks to memory, and outperform the most powerful
: : Pentium currently available.
It will hold its own on floating point but cannot match "the most powerful"
Pentiums' integer performance (remember, Pentiums are being clocked at
120 MHz these days). Most Pentium systems have large secondary caches also.
: Bottlenecks to memory depend on a lot more than bus bandwidth. High
: bandwidth mostly helps with long sequential access. Random access
: depends more on memory latency.
Yes, but most programs are not very random at all. The instruction execution
is more or less sequential and the local variable heaps are clustered on top
of the stack. This is why caches work at all. The size of the cache is very
important also.
: Frankly, depending on the application, pentiums will easily outperform this
: processor.
Maybe the most high-end Pentiums but not most Pentiums. Pentiums are not
very superscalar; they are constantly stalling from branches and data
dependencies. The PPC6XX series is far better on a mix of integer/floating
point instructions.
: : : : speed: 66 MHz RISC
: : : The CPU runs at 66 million "cycles" per second. RISC means "Reduced
: : : Instruction Set Computer". (don't ask)
: : RISC means Reduced Instruction Set Computer. It uses fewer, and less
: : specific instructions than a CISC (Complex Instruction Set Computer)
: : computer like a Pentium, thus making it more versatile.
: Haha. I suggest you read the article on cisc vs risc in the latest
: Microprocessor Report. Apparently decoupled superscalar design
: has made the differences less significant than they used to be.
Yes, adding extra hardware can speed things up and so can upping the
clock speed. The major difference is that RISC is easier to design
and a true RISC can have very high clock speeds (the PPC6XX is not
a true RISC; the DEC Alpha comes closer to a true RISC). RISC has
penalties too: you need more memory and more memory bandwidth because
you need more instructions to do the same as a CISC instruction set.
: : Each of the
: : instructions in a RISC CPU is simpler than those in a CISC and takes
: : fewer clock cycles to execute.
: Yup, but good branch prediction can go a long way toward removing
: the disadvantages of a long pipeline.
Which the Pentium doesn't have. But good branch prediction doesn't do
a darn thing for data dependencies; for that you need speculative execution
and register renaming, which the Pentium doesn't have.
If I recall correctly, the 602, while not nearly as powerful as the other
60X PPCs (single-issue vs. double-issue, i.e. it only issues one instruction
per clock cycle to the various execution pipelines), does something
like twice the Specints that an '040 does, and 30% more Specints than a
'486DX2. That's a pretty fast CPU for an inexpensive game machine.
/>: Floating Point Math capability: 132 Mill floating point ops/sec
/
/>Math that deals with really big or really small numbers. Floating point
/>math is almost never used in PC games because it's slow. Perhaps the M2
/>is different somehow... Hopefully somebody else can clear this up.
/
/ Floating point gives you much better precision (you don't
/discard fractional components). I believe that using floating-point,
/while slow, makes it so the programmers don't have to worry about a
/jittery image caused by integer round off.
Actually, a 32 bit float trades off accuracy for range. A float32 can
represent much larger and much smaller numbers than an int32, but with
fewer significant digits.
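
The trade-off is easy to see by squeezing values through a 32-bit float. This is a minimal sketch: the helper `to_float32` is made up for the example, and it relies on Python's `struct` module to round-trip through IEEE 754 single precision.

```python
import struct

def to_float32(x):
    # Round-trip a number through IEEE 754 single precision (32 bits).
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

big_int = 2**24 + 1              # 16,777,217 fits easily in an int32
as_float = to_float32(big_int)   # float32 keeps only 24 significant bits

print(big_int, as_float)         # the float comes back as 16777216.0
print(to_float32(3.0e38))        # far outside int32 range, still representable
print(to_float32(0.75))          # fractions are fine; 0.75 is stored exactly
```

So an int32 holds every whole number up to about 2.1 billion exactly, while a float32 starts skipping whole numbers above 16,777,216 but reaches far larger and far smaller magnitudes.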
--
Jason Nyberg (nyb...@ctron.com) My thoughts, my opinions.
Cabletron Systems, Inc. Merrimack NH
On the surface, in the air, under water, I'll be there!
Ahh, that's twice as fast as a 33MHz '040, and 25% faster than a 66MHz
'486DX2.
{Long ago I, Dave Nagy, wrote the following passage...}
: />You know about this already... I think this CPU is about as powerful as a
: />slow 486. Maybe 3-4 times as fast as the CPU in the 3DO now. (?)
{And much discussion followed.}
: If I recall correctly, the 602, while not nearly as powerful as the other
: 60X PPCs (single-issue vs. double-issue, i.e. it only issues one instruction
: per clock cycle to the various execution pipelines), does something
: like twice the Specints that an '040 does, and 30% more Specints than a
: '486DX2. That's a pretty fast CPU for an inexpensive game machine.
Could you or someone else post the SpecInt "chart" that compares all
the various processors? Somebody on this group has one, and I'd be
interested in seeing how I mis-interpreted it originally.
Dave Nagy
The most impressive point concerning the PowerPC 602 is its price to
performance ratio. With a Specint92 of over 40 (Mr. Hawkins stated a 45
spec rating), this will be the important factor for all present and future
3DO owners. Present CISC CPUs providing this level of performance cost
$300 or more, while the RISC MIPS R4300i for Nintendo will run about
$50. At present there has been no announced price point for the
602. However, the sections in the 603 that are used for graphic and video
instructions have been stripped out of the 602. Those sections can
account for up to half of the integer performance of a CPU. If we take
this into account, the performance will be far ahead of a 100 MHz Pentium
for logical operations such as instructions. The internal 32-bit registers
are supported by the external 64-bit data and address bus (multiplexed);
superscalar, multipipelined operations across the 8 special graphics and
sound chips will add up to well over 1 billion operations per second. This is a
very impressive number considering it can take as little as one operation
to execute a very simple instruction, while several is the normal
requirement. The MIPS R4200 and R4300i will retain the instructions
for generating graphics and matrix calculations, so it can be assumed
that graphics will be tasked to the CPU in the U64. The M2 will be a monster
graphics machine when you look at the architecture.
>: : speed: 66 MHz RISC
>: The CPU runs at 66 million "cycles" per second. RISC means "Reduced
>: Instruction Set Computer". (don't ask)
>RISC means Reduced Instruction Set Computer. It uses fewer, and less
>specific instructions than a CISC (Complex Instruction Set Computer)
>computer like a Pentium, thus making it more versatile. Each of the
>instructions in a RISC CPU is simpler than those in a CISC and takes
>fewer clock cycles to execute.
>The MPC 602 can execute instructions out of order to make
>better use of its power, and I believe it can execute 3 instructions
>simultaneously.
>--
>-Shawn Rader
That is correct, Shawn: both the floating point and integer units are
superscalar and able to execute up to 2 instructions per cycle. Again,
instructions pertaining to graphics have been stripped from the 602 and
replaced by the 3DO graphic specialized processors. In essence the M2
represents a new microprocessor technology whose overall graphic
performance far surpasses that of any single CPU. The
R3000 CPU in the PlayStation I believe is rated at about 9-13 Specint 92.
At 66 MHz the 602 will do a total of 264 MFLOPS alone, but don't forget to
consider the entire architecture of the M2.
Aloha,
Mike Sone
Sure thing, Chris.
When writing programs that involve lots of floating-point calculations
(which includes all 3-D graphics games), it is much easier and more
natural to use floating-point representation than to use fraction
values based on integer data types. Unfortunately, most
general-purpose CPUs, excluding the ones used in high-end workstations,
lack sufficient floating-point processing power for realtime 3D games.
So, it is typical for programmers to use integer-based representation
to do floating-point math. This technique is used in all current
console-based games and almost all PC-based games.
This is somewhat awkard but effective for CPUs that have lopsided
integer vs. floating-point performance. What you lose is precision,
ease of programming, and extra CPU cycles for float-to-fractions
conversions (how much depends on the architecture).
However, these limitations are removed if you use a processor or
special hardware that cranks out respectable floating-point
performance. And this is what 3DO is trying to achieve with PPC602.
In a generic 3D graphics engine, every vertex in your 3D world has to
be multiplied by a 4-by-4 transformation matrix. That means each point
has to go through 16 multiplys and 12 adds to get the result. I'm not
a 3D graphics expert, so I don't know about any optimization that can
be done, but this is the textbook scenerio. So when you are talking
about a complicated 3D world which you have to update 20-30 times a
second, you're talking about a significant amount of floating-point
calculations. And this is just the transformation part to get
everything projected on the screen correctly, that's excluding color
calculations like shading, lighting, etc.
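To make the operation count above concrete, here is a minimal sketch of the
textbook 4-by-4 transform in Python. The function name and the sample
translation matrix are made up for illustration; this is not 3DO code, just
the plain matrix-times-vertex loop with counters added.

```python
def transform_vertex(m, v):
    """Multiply a 4x4 matrix m (list of 4 rows) by a 4-component vertex v.

    Returns (result, multiplies, adds) so the operation count is visible.
    """
    result = []
    multiplies = adds = 0
    for row in m:
        acc = row[0] * v[0]          # first product needs no add
        multiplies += 1
        for k in range(1, 4):
            acc += row[k] * v[k]     # one multiply and one add per term
            multiplies += 1
            adds += 1
        result.append(acc)
    return result, multiplies, adds


# Hypothetical example: translate the point (1, 2, 3) by (10, 20, 30).
translate = [
    [1.0, 0.0, 0.0, 10.0],
    [0.0, 1.0, 0.0, 20.0],
    [0.0, 0.0, 1.0, 30.0],
    [0.0, 0.0, 0.0, 1.0],
]
point, n_mul, n_add = transform_vertex(translate, [1.0, 2.0, 3.0, 1.0])
print(point, n_mul, n_add)
```

Each of the 4 output components takes 4 multiplies and 3 adds, which is
where the 16 and 12 come from.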
William Hsu
The 3DO Company, Product Engineering
|>> So what makes Floating Point operations more effective than Interger based?
|>> Anyone?
|>Sure thing, Chris.
<massive info deleted>
|>William Hsu
|>The 3DO Company, Product Engineering
Thanks a lot, it's nice to get information straight from the source ;-)
Well, I'm not going to say you're full of it, but as you suspected you
are definitely wrong about that. Accessing external memory is done to
fill the cache and that is done by issuing an address after which the
data streams into the CPU in large blocks without another address having
to be done.
The bandwidth is very close to 528 Mbytes/sec, as 3DO has stated.
Wayne.
I think I know which chart you're referring to... :)
While a fast memory subsystem can help a processor reach its full potential,
that potential happens to be a hard ceiling which can't be exceeded. A good
cache system can get hit-rates well above 90% in some applications, making
a non-optimal memory system less of a problem. As much as I'd like to claim
otherwise, I don't think that the 602 will perform nearly as well as a 90-120
MHz Pentium... Let's face it, the 'x86 line may not be the most elegant set
of processors, but Intel has pumped tons of money into the architecture. Not
to say that it doesn't have problems... (e.g. asymmetrical superscalar arch,
poor floating-point performance, bubbles in the exec. pipelines, etc...)
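On the cache hit-rate point above, a rough sketch of the usual
average-access-time arithmetic shows why a high hit rate hides a slow memory
system. The 1-cycle hit time and 20-cycle miss penalty are invented numbers
for illustration only, not measurements of the 602 or the M2.

```python
def average_access_cycles(hit_rate, hit_cycles, miss_penalty_cycles):
    """Average cycles per memory access for a single-level cache."""
    return hit_cycles + (1.0 - hit_rate) * miss_penalty_cycles


# Hypothetical numbers: 1-cycle cache hit, 20-cycle penalty on a miss.
for rate in (0.80, 0.90, 0.95, 0.99):
    print(rate, average_access_cycles(rate, 1, 20))
```

With these assumed figures, going from a 90% to a 95% hit rate cuts the
average from about 3 cycles to about 2, without touching the memory system.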
/: The MPC 602 can execute instructions out of order to make
/: better use of it's power, and I believe it can execute 3 instructions
/: simultaniously.
/
/I thought it was two... But you know how confused I get. :)
The PPC602 (like the 601, 603, and 603e) has 4 execution units: branch,
load/store, integer, and FP. The 601/603/603e can "issue" (start down an
exec. pipeline) 2 instructions per clock cycle, while the 602 has had its
instruction scheduling system trimmed down, allowing it to issue only a
single instruction per clock cycle. This is going to make it quite a bit
slower than its better-equipped siblings. Remember though, that those
siblings are some of the fastest processors you can get in a home computer
right now! The 602 may not be as fast as them, but it's faster than anything
you saw in a home computer a year or two ago.
BTW, the 602's 64 bit data bus is multiplexed with the 32 bit address
bus, which means that the 602 will have to issue an address on one clock
cycle, and pull the data down on the next (AFAIK), giving the CPU an
effective data-bandwidth of 64 bits X 33MHz = 264 Mbytes/sec. (If anyone
knows I'm full of it here, please let me know!) Since no-one knows the
system architecture, I can't say what kind of data-bandwidth the rest of
the system can achieve...
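For anyone who wants to check the bus arithmetic above, here is a small
sketch. The helper name is made up, and the "address one cycle, data the
next" model is only the assumption stated in the paragraph, not a confirmed
description of the 602's bus.

```python
def bus_bandwidth_mbytes(bus_bits, transfers_mhz):
    """Peak Mbytes/sec for a bus moving bus_bits per transfer."""
    return bus_bits // 8 * transfers_mhz


peak = bus_bandwidth_mbytes(64, 66)         # data on every 66 MHz cycle
multiplexed = bus_bandwidth_mbytes(64, 33)  # address and data alternate
print(peak, multiplexed)
```

The first figure matches the 528 Mbytes/sec quoted for the M2 bus; the
second is the halved figure you get if every data cycle needs its own
address cycle.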
Also, as someone else mentioned a while ago, the 132 Mflops number is probably
derived from the fact that a multiply/accumulate operation could be counted
as two floating point operations, since we know that the 602 can't issue
more than 1 instruction/clock-cycle. This case is no different from the
rest: take any SpecInt/SpecFP/MIPS/MFLOPS numbers with a big grain of salt.
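The guess above about where 132 Mflops comes from is easy to write down. This
is only the peak-rate arithmetic under the stated assumptions (one
instruction issued per cycle, a multiply/accumulate counted as two
operations); the function name is hypothetical.

```python
def peak_mflops(clock_mhz, instructions_per_cycle, flops_per_instruction):
    """Theoretical peak, assuming every issued instruction is floating point."""
    return clock_mhz * instructions_per_cycle * flops_per_instruction


print(peak_mflops(66, 1, 2))  # multiply/accumulate counted as 2 operations
print(peak_mflops(66, 1, 1))  # plain add or multiply counted as 1
```

Real code never sustains the peak, which is the grain-of-salt point.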
: I think I know which chart you're referring to... :)
: While a fast memory subsystem can help a processor reach its full potential,
: that potential happens to be a hard ceiling which can't be exceeded. A good
Just for the record, SPECint, SPECfp, and the other SPEC numbers are
ratings of SYSTEMs, not processors. They are composed of a suite of benchmarks
whose accumulated results make up the SPEC numbers. They include things like
gcc, a compiler, which has nothing to do with what 3DO is using the chip for.
Most of the numbers I have seen published on Intel processors include a lot of
secondary cache, a lot of DRAM and in some cases more than one SCSI disk
subsystem.
: cache system can get hit-rates well above 90% in some applications, making
: a non-optimal memory system less of a problem. As much as I'd like to claim
: otherwize, I don't think that the 602 will perform nearly as well as a 90-120
: MHz Pentium... Let's face it, the 'x86 line may not be the most elegant set
: of processors, but Intel has pumped tons of money into the architecture. Not
: to say that it doesn't have problems... (I.e. asymmetrical superscalar arch,
: poor floating-point performance, bubbles in the exec. pipelines, etc...)
: /: The MPC 602 can execute instructions out of order to make
: /: better use of it's power, and I believe it can execute 3 instructions
: /: simultaniously.
: /
: /I thought it was two... But you know how confused I get. :)
: The PPC602 (like the 601, 603, and 603e) has 4 execution units: branch,
: load/store, integer, and FP. The 601/603/603e can "issue" (start down an
: exec. pipeline) 2 instructions per clock cycle, while the 602 has had it's
: instruction scheduling system trimmed down, allowing it to issue only a
: single instruction per clock cycle. This is going to make it quite a bit
: slower than it's better-equipped siblings.
I wouldn't say quite a bit slower. By making it simpler you can construct it
more cheaply and also up the clock rate more easily. I would expect to see
higher clock speeds fairly soon, they even hinted at that in the "roadmap"
for the end of the year.
: BTW, the 602's 64 bit data bus is multiplexed with the 32 bit address
: bus, which means that the 602 will have to issue an address on one clock
: cycle, and pull the data down on the next (AFAIK) giving the CPU an
: effective data-bandwidth of 64 bits X 33MHz = 264 Mbytes/sec. (If anyone
: knows I'm full of it here, please let me know!).
Sorry to say, but memory subsystems are not usually that fast. For instance,
a Motorola CPU-to-memory subsystem using 70ns DRAM and a 66MHz memory
bus gets:
                                   Number of Cycles
Single Cycle Access (Read/Write)   8/8
Read Burst Mode                    8-4-4-4
Write Burst Mode                   8-4-4-4
So, for a read burst it takes 8 cycles for the first 64 bits and
4 cycles for each 64 bits thereafter. (This is pretty much how most code
gets faulted into cache.)
Now this may be faster on the 3DO because of the extra mode 3DO had them put
into the design of the chip so that it does not have to do virtual to physical
mapping, and a couple of other considerations like not having to deal with a
PCI bus.
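To put the cycle counts above into Mbytes/sec, here is a rough sketch. It
assumes each beat of the burst moves 64 bits (8 bytes) on a 66 MHz bus, as in
the example; the function name is made up, and the 1-1-1-1 case is only the
idealized upper bound, not a claim about any real memory.

```python
def burst_bandwidth_mbytes(cycles_per_beat, bytes_per_beat, clock_mhz):
    """Sustained Mbytes/sec for a burst whose beats take the given cycles."""
    total_bytes = bytes_per_beat * len(cycles_per_beat)
    return total_bytes * clock_mhz / sum(cycles_per_beat)


print(burst_bandwidth_mbytes([8, 4, 4, 4], 8, 66))  # 70ns DRAM burst
print(burst_bandwidth_mbytes([8], 8, 66))           # single access
print(burst_bandwidth_mbytes([1, 1, 1, 1], 8, 66))  # idealized peak
```

The 8-4-4-4 burst moves 32 bytes in 20 cycles, roughly a fifth of the
idealized peak.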
: Since no-one knows the
: system architecture, I can't say what kind of data-bandwidth the rest of
: the system can acheive...
: Also, as someone else mentioned a while ago, the 132Mflops # is probably
: derived from the fact that a multiply/accumulate operation could be counted
Yes, and in doing your matrix transformation this instruction is used all over
the place.
: as two floating point operations, since we know that the 602 can't issue
: more than 1 instruction/clock-cycle.
I don't understand your point here. It is 2 operations and it is 1 instruction.
The goal of a good programmer is to perform all the needed operations in as
few instructions as possible.
By the way, if you had to do this in a scaled integer form you would probably
need to do something like:
    shift a     ; scale
    shift b     ; scale
    mult a,b    ; multiply
    shift a     ; scale
    add c,a     ; accumulate
    shift c     ; scale
6 integer instructions to equal 1 floating point instruction.
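For comparison, here is what a scaled-integer multiply/accumulate looks like
in a higher-level language. This is a sketch assuming a 16.16 fixed-point
format, which is my choice for illustration and not something stated in the
thread; the helper names are hypothetical. Python integers never overflow, so
on a real 32-bit CPU the intermediate product would need a 64-bit register
pair.

```python
FRAC_BITS = 16              # assumed 16.16 fixed-point format
ONE = 1 << FRAC_BITS


def to_fixed(x):
    """Convert a float to a 16.16 integer (rounded)."""
    return int(round(x * ONE))


def to_float(f):
    """Convert a 16.16 integer back to a float."""
    return f / ONE


def fixed_mac(c, a, b):
    """Return c + a*b where all three are 16.16 integers."""
    product = a * b          # product carries 32 fractional bits
    product >>= FRAC_BITS    # rescale to 16 fractional bits (truncates)
    return c + product


exact = to_float(fixed_mac(to_fixed(0.5), to_fixed(1.5), to_fixed(2.25)))
lossy = to_float(fixed_mac(0, to_fixed(0.1), to_fixed(0.1)))
print(exact, 0.5 + 1.5 * 2.25)
print(lossy, 0.1 * 0.1)
```

The first case comes out exact because every value fits in 16 fractional
bits; the second shows the round-off that the floating-point version avoids.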
: This case is no different from the
: rest, take any SpecInt/SpecFP/MIPS/MFLOPS numbers with a big grain of salt.
True.
>: BTW, the 602's 64 bit data bus is multiplexed with the 32 bit address
>: bus, which means that the 602 will have to issue an address on one clock
>: cycle, and pull the data down on the next (AFAIK) giving the CPU an
>: effective data-bandwidth of 64 bits X 33MHz = 264 Mbytes/sec. (If anyone
>: knows I'm full of it here, please let me know!).
>
>Sorry, to say but memory subsytems are no usually that fast for instance
>Motorola CPU to memory subsystems using 70ns DRAM and a 66MHz memory
>bus gets:
>
> Number of Cycles
>Single Cycle Access (Read/Write) 8/8
>Read Burst Mode 8-4-4-4
>Write Burst Mode 8-4-4-4
>
>So, for a read burst it takes 8 cycles for the first 64-bits and
>4 cycles there after. ( This is pretty much hoe most code
>gets faulted into cache )
>
>Now this may be faster on the 3DO beacuse of the extra mode 3DO had them put
>into the design of the chip so that it does not have to do virtual to physical
>mapping and a couple of other considerations like not having to deal with a
>PCI bus.
John,
You are obviously an engineer; however, your information is outdated. 3DO
is using Synchronous DRAM in their design. Byte magazine has a series of
articles (starting on page 185) this month on all the new RAM technologies.
If you don't keep up in this business, your posts will sound as obsolete
as talking about ferrite core memories from the 1960s.
Wayne.
>Sorry, to say but memory subsytems are no usually that fast for instance
>Motorola CPU to memory subsystems using 70ns DRAM and a 66MHz memory
>bus gets:
>
> Number of Cycles
>Single Cycle Access (Read/Write) 8/8
>Read Burst Mode 8-4-4-4
>Write Burst Mode 8-4-4-4
>
>So, for a read burst it takes 8 cycles for the first 64-bits and
>4 cycles there after. ( This is pretty much hoe most code
>gets faulted into cache )
>
Actually, the M2 is not using a standard DRAM/Static RAM setup. It has
a bus speed of 528 MB/sec, and the specs seem to indicate that it uses
SDRAM (Synchronous DRAM). At this bus speed, it should be capable of
fetching 64 bits per cycle. The 602's architecture may be inherently
incapable of this, but it won't be the memory slowing it down.
I have some technical/background information about the type of RAM that
the M2 uses (SDRAM) and the RAM the Ultra 64 uses (Rambus). I posted this
information once before, but I can post it again if it got lost in the
shuffle...
I pulled this off of Chris Long's 3DO page; the entry itself was pulled off
this newsgroup several months ago. Check out the name of the original
author... :)
PS: although they aren't attributed as such, I believe the "int" and "fp"
columns are SPECint and SPECfp numbers...
"""""""""""""""""""""""""""""
From nyb...@ctron.com (Jason W. Nyberg) Tue Feb 14 11:06:44 1995
Path: dziuxsolim.rutgers.edu!uunet!noc.near.net!ctron-news.ctron.com!boost!nyberg
From: nyb...@ctron.com (Jason W. Nyberg)
Newsgroups: rec.games.video.3do
Subject: PPC602 comparison
Date: 14 Feb 1995 16:06:44 GMT
Organization: Cabletron_Inc.
Lines: 33
Sender: nyberg@boost (Jason W. Nyberg)
Distribution: world
Message-ID: <3hqkek...@ctron-news.ctron.com>
NNTP-Posting-Host: boost.ctron.com
EE Times says that the PPC 602 turns out 40 SPECint92 at 66MHz.
This is an excerpt from the PPC FAQ:
/Processor  Clock    Cache       int    fp     System
/---------  -------  ----------  -----  -----  ---------------------
/MPC601     50 MHz   0/32k       41.7   51.0   IBM RS/6000 N40
/           66 MHz   0/32k       62.6   72.2   IBM RS/6000 250
/           66 MHz   0/32k       63.7   67.8   IBM RS/6000 40P
/           66 MHz   256k/32k    75.1   77.0   IBM RS/6000 40P
/           80 MHz   0/32k       78.8   90.4   IBM RS/6000 250
/           80 MHz   0.5M/32k    88.1   98.7   IBM RS/6000 41T & 41W
/           80 MHz   1M/32k      90.5   100.8  IBM RS/6000 C10
/MPC601+    100 MHz  ?/32k       105    125    ? estimate
/MPC603     66 MHz   1M/8k/8k    60     70     Motorola estimate
/           80 MHz   1M/8k/8k    75     85     Motorola estimate
/MPC604     100 MHz  1M/16k/16k  160    165    Motorola estimate
/MPC620     133 MHz  ?/32k/32k   225    300    estimate
/i486DX2    66 MHz   256k/8k     32.2   16.0   Compaq Deskpro
/i486DX4    100 MHz  256k/16k    51.4   26.6   Micronics M4P PCI
/Pentium    66 MHz   256k/8k/8k  65.1   63.6   Compaq Systempro/XL
/Pentium    90 MHz   512k/8k/8k  90.1   72.7   Intel XPRESS
/Pentium    100 MHz  512k/8k/8k  100.0  80.6   Intel XPRESS
/68040      33 MHz   ?           18     13     Mac Q950
/68040      33 MHz   ?           20.3   ?      Mac Q800
Twice the speed of a 33MHz '040, 25% faster than a 486DX2... The 602
may not be near the top of the PPC heap, but it's no slouch.
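The "twice" and "25%" figures can be checked against the table. A quick
sketch, using the 40 SPECint92 figure attributed to EE Times and the "int"
column values from the FAQ excerpt; the dictionary keys are just labels I
made up.

```python
spec_int92 = {
    "PPC602 66 MHz": 40.0,       # EE Times figure quoted above
    "68040 33 MHz Q800": 20.3,   # from the FAQ table
    "i486DX2 66 MHz": 32.2,      # from the FAQ table
}


def speedup(faster, slower):
    """Ratio of two SPECint92 ratings."""
    return spec_int92[faster] / spec_int92[slower]


print(round(speedup("PPC602 66 MHz", "68040 33 MHz Q800"), 2))
print(round(speedup("PPC602 66 MHz", "i486DX2 66 MHz"), 2))
```

Keep in mind these are system-level ratings taken from different machines,
so the ratios are only a rough guide.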
As you well know, the superscalar architecture common to both the PowerPC
and the Pentium is basically an additional processor in the unit tied
together on a common bus. The very fast 528 Mbytes/sec bus in the M2 unit
will allow it to be considered as a processor in itself, while the
advantages of having an instruction set that is customized for fast
execution make RISC more important.
Let's not forget that the Pentium retains the instruction set for
compiling in BASIC. The RISC instruction set for the PowerPC 602 will
have a C compiler. As you well know, running a program in BASIC is very,
very slow. So the advantage of the additional instructions is worthless
when considering the comparison between RISC and CISC processors. So
integer performance when running a less efficient language is worthless.
The integer performance of the 602 is based on the 603 with the
exception of the registers. The 603 is a very powerful processor in
comparison with the Pentium. But let's look at the M2 architecture more
in depth. The intent in M2 is to produce some very fast graphics. So it
would be better to design separate graphics and sound hardware and
pipeline them back to the CPU (603). Now the 602 can run up to one
instruction per clock cycle into its integer, floating point, branch
processing, and load/store units. Why were the registers removed? Actually
they weren't; we now have to consider the registers in the pipelined
processors that share the 528 Mbytes/sec bus. These units will also execute
instructions for effects like texture mapping, shading, lighting,
MIP mapping, video and sound. The instructions for graphics are now
being piped to these units. So when considering M2 we now have to
include these as additional units in the overall design. Moreover,
considering that these units are able to execute instructions issued
from the 602, they must be included in the comparison.
>
>: Bottlenecks to memory depend on a lot more than bus bandwidth. High
>: bandwidth mostly helps with long sequential access. Random access
>: depends more on memory latency.
>
>Yes, but most programs are not very random at all. The instruction execution
>is more or less sequential and the local variable heaps are clustered on top
>of the stack. This is why caches work at all. The size of the cache is very
>important also.
Again, let's not forget that the high bandwidth on the multiplexed 64 bit
data and address bus is there to service the M2's overall superscalar
architecture. The key ingredient in the mix of units in the M2 is the
integration of the graphical instructions into the 602's subset of
pipelined instructions. This would allow for very fast graphics
performance, superior to the 603 alone.
>
>: Frankly, depending on the application, pentiums will easily outperform this
>: processor.
>
>Maybe the most high-end pentiums but not most pentiums. Pentiums are not
>very superscaler they are constently stalling from branches and data
>dependencies. PPC6XX series are far better on a mix if integer/floating
>point instructions.
For the applications that must be emulated in software, such as 3D graphics
and sound, the Pentium and 603 couldn't keep up even at twice the clock
speed. The matrix calculations for 2D graphics like windows are supported
by them, but not the specialized 3D algorithms such as texture and MIP
mapping or Gouraud shading. As you well understand, the applications that tax
system performance the most are graphically intense games and CAD
programs for the most part. You really couldn't see any difference in a
word processing application.
>
>: : : : speed: 66 MHz RISC >
>: : : The CPU runs at 66 million "cycles" per second. RISC means "Reduced
>: : : Instruction Set Computer". (don't ask)
>: : RISC means Reduced Instruction Set Computer. It uses fewer, and less
>: : specific instructions than a CISC (Complex Intruction Set Computer)
>: : computer like a Pentium, thus making it more versatile.
>
>: Haha. I suggest you read the article on cisc vs risc in the latest
>: microprocessor reprot. Apparently decoupled superscalar design
>: has made the differences less significant than they used to be.
>
>yes adding extra hardware can speed things up and so can uping the
>clock speed. The major difference is that risc is easier to design
>and a true risc can have very high clock speeds.( the PPC6XX is not
>a true risc the DEC alpha comes closer to a true risc ). RISC has
>penalties too you need more memory and more memmory bandwidth because
>you need more instructions to do the same as a CISC instruction set.
I agree with your above comment as a whole; however, the additional
instructions in CISC are not really used that often. High bus bandwidth
will shuttle the increased instruction load in M2 with lots of room to spare.
But when instructions-per-second counts are to be compared, then all
10 processors must be included. Consider this: the RISC approach
will execute 1 instruction per clock cycle. The M2 will have 10
processors including the 602. So at 66 MHz the possible number of
instructions per second will be about 660 million.
Just on the basis of instructions per second the M2 is way ahead. Add the
fact that these units are specialized to run the various algorithms that
are required, and the M2 is a wonder all its own.
>
>: : Each of the
>: : instructions in a RISC CPU are simpler than those in a CISC and take
>: : fewer clock cycles to execute.
>
>: Yup, but good branch prediction can go a long way toward removing
>: the disadvantages of a long pipeline.
>
>Which the pentium doesn't have. But good branch prediction doesn't do
>a darn thing for data dependencies, for that you need specualtive execution
>and register renaming, which the pentium doesn't have.
>
>--
>------------------------------------------------------------------------
>John Shamilian
>j...@molson.ho.att.com
>------------------------------------------------------------------------
Yes, the very nature of the register units in these processors shows the
limited usefulness of the 602 in applications other than what the 602
has been designed for. In my opinion the 602 would be almost useless for
today's market in the general purpose processor industry if not
integrated into the overall architecture of M2. The question now is,
who else besides 3DO will choose to integrate onto the base 602 for
graphical applications?
Aloha,
Mike Sone
>Mike Sone (mi...@waena.mrtc.maui.com) wrote:
>: Lets not forget that the pentium retains the instruction set for
>: compiling in BASIC. The RISC instruction set for the PowerPC 602 will
>: have a C complier. As you well know running a program in Basic is very
>: very slow. So the advantage of the additional instructions is worthless
>: when considering the comparison between RISC and CISC processor. So
>: interger performance when running a less efficent language is worthless.
>Does this make any sense to you "experts"? It sounds kinda strange to
>me--but I'm a computer science novice.
> Dave Nagy
I wouldn't call myself an expert, but it makes no sense to me at all.
Care to expand on it Mike?
Simon Powers
--
G: Never mind all that, take a card. All opinions are my own
D: Card? What do I do with the card? and not BT's.
G: You can keep it I've got 51 left.
Duck Soup
I'll take a stab at what he might be saying:
The Pentium processor must compile BASIC code to maintain backwards
compatibility. The BASIC compiler in DOS is a CISC compiler, i.e. it
generates CISC machine code. CISC is much slower than RISC in many
aspects, but easier to generate code in (and hand code). The Pentium
also comes with a RISC instruction set that allows some heavy
optimizations. You can get C compilers that will generate RISC code
for the Pentium (ever see that sticker on the box of games that says
"optimized for Pentium"?). So you can get a large performance gain.
Peace,
-Rich
--
Rich Barrette rbar...@zatharusta.cs.ohiou.edu
http://www.ohiou.edu/~rbarrett/webaholics/ver2/
: j...@molson.ho.att.com (jhs) writes:
: >: BTW, the 602's 64 bit data bus is multiplexed with the 32 bit address
: >: bus, which means that the 602 will have to issue an address on one clock
: >: cycle, and pull the data down on the next (AFAIK) giving the CPU an
: >: effective data-bandwidth of 64 bits X 33MHz = 264 Mbytes/sec. (If anyone
: >: knows I'm full of it here, please let me know!).
: >
: >Sorry, to say but memory subsytems are no usually that fast for instance
: >Motorola CPU to memory subsystems using 70ns DRAM and a 66MHz memory
: >bus gets:
: >
: > Number of Cycles
: >Single Cycle Access (Read/Write) 8/8
: >Read Burst Mode 8-4-4-4
: >Write Burst Mode 8-4-4-4
: >
: >So, for a read burst it takes 8 cycles for the first 64-bits and
: >4 cycles there after. ( This is pretty much hoe most code
: >gets faulted into cache )
: >
: >Now this may be faster on the 3DO beacuse of the extra mode 3DO had them put
: >into the design of the chip so that it does not have to do virtual to physical
: >mapping and a couple of other considerations like not having to deal with a
: >PCI bus.
: John,
: You are obviously an engineer, however, your information is outdated. 3DO
: is using Synchronous DRAM in their design. Byte magazine has a series of
: articles (starting on page 185) this month on all the new RAM technologies.
From what I understand, SDRAM is simply a synchronous DRAM, that is, the memory
uses the system clock. If used wisely, it can achieve better "burst-like
throughput" by possibly overlapping row and column addresses. But just because
you are synchronized with the clock doesn't mean you can fetch 64 bits in one
clock cycle continuously.
If anyone has detailed info on SDRAM, post it here; I would be interested.
I find most magazine articles are very vague in their descriptions.
> In article <davenagyD...@netcom.com>,
> David Nagy <dave...@netcom.com> wrote:
> >Mike Sone (mi...@waena.mrtc.maui.com) wrote:
> >: Lets not forget that the pentium retains the instruction set for
> >: compiling in BASIC. The RISC instruction set for the PowerPC 602 will
The 586 retains the X86 instruction set to remain compatible with
previous X86 processors, regardless of how the software was written. It also
adds a few of its own instructions to better facilitate optimization
and such, as well as adding a few new features.
> >: have a C complier. As you well know running a program in Basic is very
> >: very slow. So the advantage of the additional instructions is worthless
Running a program written in BASIC is not inherently *that*
slow. Running a program through an interpreter is the slow-down
here. Compiled BASICs are decent speed-wise, I hear (not that I use
them tho, I stick with C & Modula 3 for big things), so even tho
compiled BASIC programs are probably slower than compiled C programs,
you really can't make a large generalization like that.
> >: when considering the comparison between RISC and CISC processor. So
> >: interger performance when running a less efficent language is worthless.
> >
> >Does this make any sense to you "experts"? It sounds kinda strange to
> >me--but I'm a computer science novice.
I'm in school for Computer Engineering right now, so I ain't no
expert yet, and have just started with X86 machines at all.
> I'll take a stab at what he might be saying:
> The pentium processor must compile basic code to maintain backwards
> compatibilty. The Basic compiler in DOS is a CISC compiler, i.e. it
The 586 must compile BASIC code because it is backwards compatible
with the previous X86 line. The 586 is not a compiler however, it just
does whatever the software says to do, and the compiler says to compile.
> generates CISC machine code. CISC is much slower than RISC in many
Certain BASIC compilers generate CISC code, primarily because that's
the general family that the X86 line belongs in.
> aspects, but easier to generate code in (and hand code). The pentium
YMMV.
> also comes with a RISC instruction set that allows some heavy
"also comes with a RISC instruction set". Cute.
> optimizations. You can get C compilers that will generate RISC code
> for the pentium (ever see that sticker on the box of games that says
> "optimized for pentium")? So you can get a large performance gain.
And nah, it's not that the code gets compiled into a completely
different instruction set, it's that the compiler lines the code up
differently and uses additional 586-specific commands to allow the (I
think) quad issue (could be double issue, but I thought it was quad)
dual pipeline to remain full as much of the time as possible. I think
there are commands/code flows that help with predictive branching as
well. Anyway, the basic commands are mostly the same, and any command
the 486 can perform, so can the 586. Well, there may be a few little
exceptions, but there shouldn't be. :)
Oh yea, register set. I think the 586-optimized code does something
with the register set too. :)
--
"This year will go down in history. For the first time, a civilized
nation has full gun registration! Our streets will be safer, our
police more efficient, and the world will follow our lead into the
future!"-Adolf Hitler,1935 * Sean Kellner * skel...@ddt.eng.uc.edu
This makes no sense. I'm not sure what he's trying to say, so he may
have had a valid point. But what is stated there is just not true.
CPUs deal with machine language. BASIC interpreters and C compilers
produce machine language for the CPU to execute.
The reason running a BASIC program is slower than running a C program
is largely due to the fact that BASIC is interpreted, i.e. every time
you run a BASIC program it is converted from BASIC to machine language
as it runs. Once compiled, the C program is machine language. If you
change the C source you have to recompile the program.
I have no idea what that has to do with the integer performance of a
CPU.
-Honus
--
hon...@cmu.edu, the NetBill project
Information Networking Institute, Carnegie Mellon University
"I'm not quite clear about what you just spoke--
Was that a parable, or a very subtle joke?" - Brad Roberts, Crash Test Dummies
No. Mike's babbling again.
Mike Sone (mi...@waena.mrtc.maui.com) wrote:
: Lets not forget that the pentium retains the instruction set for
: compiling in BASIC. The RISC instruction set for the PowerPC 602 will
: have a C complier. As you well know running a program in Basic is very
: very slow. So the advantage of the additional instructions is worthless
: when considering the comparison between RISC and CISC processor. So
: interger performance when running a less efficent language is worthless.
Does this make any sense to you "experts"? It sounds kinda strange to
me--but I'm a computer science novice.
Dave Nagy
>In article <davenagyD...@netcom.com>,
>David Nagy <dave...@netcom.com> wrote:
>>Mike Sone (mi...@waena.mrtc.maui.com) wrote:
>>: Lets not forget that the pentium retains the instruction set for
>>: compiling in BASIC. The RISC instruction set for the PowerPC 602 will
>>: have a C complier. As you well know running a program in Basic is very
>>: very slow. So the advantage of the additional instructions is worthless
>>: when considering the comparison between RISC and CISC processor. So
>>: interger performance when running a less efficent language is worthless.
>I'll take a stab at what he might be saying:
>
>The pentium processor must compile basic code to maintain backwards
>compatibilty.
Any processor could compile BASIC code, as long as it had a BASIC compiler.
>The Basic compiler in DOS is a CISC compiler, i.e. it
>generates CISC machine code. CISC is much slower than RISC in many
>aspects, but easier to generate code in (and hand code).
I wouldn't agree with that at all. RISC is much simpler to use and write.
The x86 instruction set is a complete pig.
>The pentium
>also comes with a RISC instruction set that allows some heavy
>optimizations. You can get C compilers that will generate RISC code
>for the pentium (ever see that sticker on the box of games that says
>"optimized for pentium")? So you can get a large performance gain.
No. The Pentium uses the x86 instruction set. It does not have a RISC
instruction set, although it uses RISC ideas in its internal design. If
it says optimised for the Pentium, it means they've compiled it with a
compiler that can pair instructions up to move through its twin pipelines
together (+ all the other optimisations you can do).
>
>Peace,
>-Rich
>Mike Sone (mi...@waena.mrtc.maui.com) wrote:
>: Lets not forget that the pentium retains the instruction set for
>: compiling in BASIC. The RISC instruction set for the PowerPC 602 will
>: have a C complier. As you well know running a program in Basic is very
>: very slow. So the advantage of the additional instructions is worthless
>: when considering the comparison between RISC and CISC processor. So
>: interger performance when running a less efficent language is worthless.
>Does this make any sense to you "experts"? It sounds kinda strange to
>me--but I'm a computer science novice.
It sounds like a bunch of complete horseapples. And I design CMOS
processors, so I think I am qualified to comment. :)
I'm also familiar with the architecture of the PPC's and of the Intel
devices.
> Dave Nagy
--
| Chrispy | Dodge Omni GLH Turbo | Amiga Forever! |
| School: Drexel University | 190hp, 2350lbs, 2.2L | (Commo-who?) |
| Workplace: Unisys Corporation | 0-60 in 6.2 s, cheap |_________________|
| (CMOS Processor Designer) | to insure, looks like hell :) |
You're right, I had forgotten about burst-mode access, i.e. one request
leads to several cycles of data in a row. The CPU will have access to
somewhat less than max bandwidth, and anything else that accesses memory
could presumably still use separate addr. and data lines, for full band-
width access. I've been told by someone who knows that the M2 is going
to use a DMA based architecture, like the M1.
Simon the x86 code was first developed with programmers in mind. The
early rational for including instructions that would assist in compiling
code that was written in Basic was to increase the availabilty or number
of innovative applications for the PC. As programmers became more
skilled, basic was more or less abandoned in favor of C. Some of the
instructions that were once often used in basic was less often used in C.
This is the kind of rational that both 3DO and Sony have used in their
C development systems. Although many of the programs will run faster in
assembly, C has a much bigger following. So if software is written in C
it becomes alot more portable. You'll end up seeing alot of games
ported each way when C is used.
Now when the instruction set is reduced, careful consideration of which
instructions will be retain or thrown out had to be made. Instructions
that were once useful for Basic was now not considered a waste. To many
operations were required for one instruction to be executed. For the ARM
the instructions were reduced to less than 40 while the x86 instruction set
has more than a hundred. The PowerPC 602 has over a hundred instructions
but if you eliminate the speech and handwriting recognition instruction
it falls into the RISC set category.
Now whenever code is written for a specific type of processor the
assembly code becomes a little different for it because of the instruction
set that it has. However, a program in a language such as C is compiled
into the assembly that is native to a given processor type. To level
performance comparisons, the Spec suite uses common algorithms that
are written in C and runs them against a benchmark having known
performance. A processor that has retained instructions that were once used
to optimize the performance of a program written in Basic finds that those
instructions are now not even being measured. The same may be said for the
PowerPC 602 having a set of instructions set aside for speech and
handwriting recognition.
These differences in processor types make performance comparisons a bit
unclear. If specific performance is to be compared for a purpose such as
3D performance or some other specialized function then more specific
instruction types must be targeted, say integer and floating point
performance. But what about the performance features that are left out?
Now that becomes apples and oranges.
In my opinion the dollar cost per specint or specfp is also very important.
This is where the RISC and CISC processors diverge. Anyway, all I wanted
to say was: what exactly are you comparing? Processors, or a specific
performance for a given programming language? Maybe I didn't follow the
thread closely enough.
Aloha,
Mike Sone
Now THAT, I think I understood! Thanks Mike...
DaveNagy
It sounds rational enough, the problem is that it simply isn't
true. The CPU was developed without cooperation with developers
of a "basic compiler" or even interpreter, for that matter.
None of the instructions are high level enough to be considered
specific to any given language. An example of CISC vs. RISC
technology would be the difference between a CPU that implements
instructions with a load / modify / store cycle (such as an
instruction to implement adding a number to a value stored in
memory) vs a CPU that allows only one of these three operations,
forcing three instructions to be used to fetch, modify and then
store the value.
The latter approach, typical of RISC designs, is MUCH easier to
streamline and offers significant performance benefits as a
result. This trade-off has NOTHING to do with Basic vs. C or any
other language.
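To make the load / modify / store distinction concrete, here is a toy sketch in Python. The opcode names (ADDM, LOAD, ADDI, STORE) and the little interpreter are invented for illustration and do not correspond to any real instruction set; the only point is that one memory-to-memory instruction and three simpler instructions leave memory in the same state.

```python
def run(program, memory):
    """Execute a toy program on a toy machine; return the instruction count."""
    regs = {}
    for op, *args in program:
        if op == "ADDM":      # CISC-style: memory[addr] += immediate (load, modify, store in one)
            addr, imm = args
            memory[addr] += imm
        elif op == "LOAD":    # RISC-style: register <- memory[addr]
            reg, addr = args
            regs[reg] = memory[addr]
        elif op == "ADDI":    # RISC-style: register <- register + immediate
            reg, imm = args
            regs[reg] += imm
        elif op == "STORE":   # RISC-style: memory[addr] <- register
            reg, addr = args
            memory[addr] = regs[reg]
        else:
            raise ValueError(f"unknown opcode {op}")
    return len(program)

# Hypothetical programs that both add 5 to the value at address 0.
cisc_program = [("ADDM", 0, 5)]
risc_program = [("LOAD", "r1", 0), ("ADDI", "r1", 5), ("STORE", "r1", 0)]

mem_cisc, mem_risc = [10], [10]
n_cisc = run(cisc_program, mem_cisc)
n_risc = run(risc_program, mem_risc)
print(n_cisc, mem_cisc, n_risc, mem_risc)
```

The instruction count says nothing about speed by itself: the argument above is that each of the three simple steps is easier to pipeline than the single compound one.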
CPU designs to assist in language-specific code execution are
rare, even today. LISP machines were perhaps the most prevalent,
and those hardly made a dent in any market except AI research.
--
-------------------------------------------------------------------------
Blake W. Stone bst...@arcane.com
Object Addict - Arcane Systems Ltd. 'Twas brillig, and the slithy toves
Publishers of ThreadKit Did gyre and gimble in the wabe...
Correct, and SpecINT, SpecFP, etc. are all extremely compiler dependent,
i.e. one may be able to fit an entire benchmark into cache while another
can't. Take all Spec numbers with a grain of salt, they are by no means
definitive. Remember that the 602 benchmark figures that we know about are
simulated, and Motorola has been known to be a little optimistic about
their numbers when it comes to simulating specs... Still, the 602 ought
to be a damn fine processor for a console.
/: The PPC602 (like the 601, 603, and 603e) has 4 execution units: branch,
/: load/store, integer, and FP. The 601/603/603e can "issue" (start down an
/: exec. pipeline) 2 instructions per clock cycle, while the 602 has had its
/: instruction scheduling system trimmed down, allowing it to issue only a
/: single instruction per clock cycle. This is going to make it quite a bit
/: slower than its better-equipped siblings.
/
/I wouldn't say quite a bit slower. By making it simpler you can construct it
/more cheaply and also up the clock rate more easily. I would expect to see
/higher clock speeds fairly soon, they even hinted at that in the "roadmap"
/for the end of the year.
I personally would say quite a bit slower, and my guess is reflected in
the numbers... Cheaper doesn't mean faster, cheaper means cheaper.
Also, the 602 in the M2 will be 66MHz. Period. It won't get any faster,
at least not in this generation of 3DO's hardware, and not unless 3DO wants
to risk the whole "standard" thing... Remember, we're talking 3DO system
here. I don't know about you, but I don't want a 602 in any PC I buy...
/: BTW, the 602's 64 bit data bus is multiplexed with the 32 bit address
/: bus, which means that the 602 will have to issue an address on one clock
/: cycle, and pull the data down on the next (AFAIK) giving the CPU an
/: effective data-bandwidth of 64 bits X 33MHz = 264 Mbytes/sec. (If anyone
/: knows I'm full of it here, please let me know!).
/
/Sorry to say, but memory subsystems are not usually that fast. For instance,
/Motorola CPU-to-memory subsystems using 70ns DRAM and a 66MHz memory
/bus get:
/
/                                    Number of Cycles
/Single Cycle Access (Read/Write)    8/8
/Read Burst Mode                     8-4-4-4
/Write Burst Mode                    8-4-4-4
/
/So, for a read burst it takes 8 cycles for the first 64 bits and
/4 cycles for each 64 bits thereafter. (This is pretty much how most code
/gets faulted into cache.)
/
/Now this may be faster on the 3DO because of the extra mode 3DO had them put
/into the design of the chip so that it does not have to do virtual to physical
/mapping and a couple of other considerations like not having to deal with a
/PCI bus.
Actually the M2 will be using really fast SDRAM, from all accounts. This
(from what I've heard) will mean that there won't be any wait states, you
just send an address and pick up the data on the next cycle. As far as the
602's multiplexed bus: I've already been corrected by a 3do engineer about
that: I hadn't accounted for burst access which will be used to fill cache
lines, giving the 602 several times more effective bandwidth to memory.
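For anyone who wants to check the bandwidth figures being thrown around here, a rough sketch in Python follows. It assumes a 66 MHz bus, a 64-bit (8-byte) data path and four-beat bursts. The 8-4-4-4 timing is the 70ns DRAM figure quoted above; the 2-1-1-1 timing is only a guess at what a zero-wait-state burst behind one multiplexed address cycle might look like, not a published M2 number. The function name is made up.

```python
BUS_HZ = 66_000_000   # assumed bus clock (66 MHz)
BYTES_PER_BEAT = 8    # 64-bit data path

def bandwidth_mb_per_s(cycles_per_beat):
    """Average Mbytes/sec (10^6 bytes) for a transfer with the given cycle count per 64-bit beat."""
    total_cycles = sum(cycles_per_beat)
    total_bytes = BYTES_PER_BEAT * len(cycles_per_beat)
    return total_bytes * BUS_HZ / total_cycles / 1e6

peak = bandwidth_mb_per_s([1])                        # one beat every cycle
single_multiplexed = bandwidth_mb_per_s([2])          # address cycle, then data cycle
dram_burst = bandwidth_mb_per_s([8, 4, 4, 4])         # 70ns DRAM timing quoted above
sdram_burst_guess = bandwidth_mb_per_s([2, 1, 1, 1])  # hypothetical zero-wait burst

print(peak, single_multiplexed, dram_burst, sdram_burst_guess)
```

Under these assumptions the one-beat-per-cycle case reproduces the 528 Mbytes/sec in the published spec list, and the address-then-data case reproduces the 264 Mbytes/sec estimate; the burst cases fall on either side depending on how many wait states the memory needs.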
/: Since no-one knows the
/: system architecture, I can't say what kind of data-bandwidth the rest of
/: the system can achieve...
/
/: Also, as someone else mentioned a while ago, the 132Mflops # is probably
/: derived from the fact that a multiply/accumulate operation could be counted
/
/Yes, and in doing your matrix transformation this instruction is used all over
/the place.
Correct, and that's a good thing. I was just pointing out one more thing
to watch out for when comparing "specs."
/: as two floating point operations, since we know that the 602 can't issue
/: more than 1 instruction/clock-cycle.
/
/I don't understand your point here. It is 2 operations and it is 1 instruction.
/The goal of a good programmer is to perform all the needed operations in as
/few instructions as you can.
The multiply/accumulate operation is one instruction which happens to do two
things. Other instructions (plain ol' multiply, add, subtract, etc.) can only
be issued once per clock cycle, for a maximum of 66 million adds per second,
as opposed to 132 million that the mflops "spec" implies.
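The arithmetic behind that, as a quick Python sketch. It assumes a 66 MHz clock and exactly one floating point instruction issued every cycle with no stalls, which is a best case rather than anything measured; the function name is made up.

```python
CLOCK_HZ = 66_000_000          # assumed 602 clock
INSTRUCTIONS_PER_CYCLE = 1     # single-issue, per the discussion above

def peak_mflops(flops_per_instruction):
    """Best-case millions of floating point operations per second."""
    return CLOCK_HZ * INSTRUCTIONS_PER_CYCLE * flops_per_instruction / 1e6

mac_rate = peak_mflops(2)   # multiply/accumulate counted as two operations
add_rate = peak_mflops(1)   # plain add, subtract or multiply

print(mac_rate, add_rate)
```

So the headline number only applies to code that is wall-to-wall multiply/accumulates; anything else tops out at half of it under these assumptions.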
/By the way, if you had to do this in a scaled integer form you would probably
/need to do something like :
/
/shift a ; scale
/shift b ; scale
/mult a,b ; multiply
/shift a ; scale
/add c,a ; accumulate
/shift c ; scale
/
/6 integer instructions to equal 1 floating point instruction
The 602 has a relatively fast floating point unit, and ought to compare
favorably to the Pentium's. And beat the living crap out of the P5's when
you consider price.
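The scaled-integer sequence quoted above can be sketched in Python to show where the extra shifts come from. This assumes a 16.16 fixed point format, which is a common choice but not one named in the thread, and the helper names are invented. Python integers never overflow, so the sketch does not model the overflow handling a real 32-bit integer unit would also need.

```python
FRAC_BITS = 16            # assumed 16.16 fixed point format
ONE = 1 << FRAC_BITS

def to_fixed(x):
    """Convert a float to 16.16 fixed point."""
    return int(round(x * ONE))

def to_float(f):
    """Convert 16.16 fixed point back to a float."""
    return f / ONE

def fixed_mac(acc, a, b):
    """acc + a*b in fixed point. The raw product carries 32 fractional
    bits, so it has to be shifted back down before it can be accumulated."""
    return acc + ((a * b) >> FRAC_BITS)

def float_mac(acc, a, b):
    """The same multiply/accumulate in floating point: no rescaling."""
    return acc + a * b

acc_fixed = fixed_mac(to_fixed(1.5), to_fixed(2.25), to_fixed(-0.5))
acc_float = float_mac(1.5, 2.25, -0.5)

print(to_float(acc_fixed), acc_float)
```

The values here were picked so both paths agree exactly; in general the fixed point result is rounded at each shift, which is the round-off the floating point version avoids.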
/: This case is no different from the
/: rest, take any SpecInt/SpecFP/MIPS/MFLOPS numbers with a big grain of salt.
/
/True.
I'm sorry it makes no sense. What I was pointing to is the fact that the
machine code that is executed by one cpu is not quite the same for another.
As a matter of fact processors can be made so that they have a very
limited functionality. This is the case with the coprocessors that are
designed to optimize numeric functions and perhaps some of M2's
coprocessors.
The debate of M2's performance based solely on the CPU is unfair to the
overall scheme of the design.
>
>The reason running a BASIC program is slower than running a C program
>is largely due to the fact that BASIC is interpreted, i.e. every time
>you run a BASIC program it is converted from BASIC to machine language
>as it runs. Once compiled, the C program is machine language. If you
>change the C source you have to recompile the program.
correct
>
>I have no idea what that has to do with the integer performance of a
>CPU.
>
Now that's a good topic. Does anyone know how much difference integer
operations will make to the overall performance of the M2 meeting its specs?
Or would anyone care to comment on why or why not the Spec's can be met?
I'm curious.
Aloha,
Mike Sone
Hi Mike. While I often don't understand everything you post, and
: I'm sorry it makes no sense. What I was pointing to is the fact that the
: machine code that is executed by one cpu is not quite the same for another.
: As a matter of fact processors can be made so that they have a very
: limited functionality. This is the case with the coprocessors that are
: designed to optimize numeric functions and perhaps some of M2's
: coprocessors.
That's certainly true. At least 3DO isn't pulling an Atari and saying
that their blitter "can do 100 MIPS". I (as a newbie) would _think_ that
the specs on the 602 would be a little more... interpretable. I guess
Specints aren't that valuable a gauge of "real world" performance, but it
appears that most folks regard the 602 as being a pretty decent
all-around CPU with "P5-like" performance.
: The debate of M2's performance based solely on the CPU is unfair to the
: overall scheme of the design.
I should say... To an untrained eye, (mine) it would appear that the CPU
is only accounting for a small portion of the M2's "apparent" power. I'd
say it appears to be ~10 times as capable (at 3D rendering) as a 66MHz P5
PC, which of course relies on the CPU for just about everything.
The M2 is probably no more "dependent" on its CPU than the Jag is on its
little 68000. (A scary thought considering how much more powerful the PPC
chip is...)
: Now that's a good topic. Does anyone know how much difference integer
: operations will make to the overall performance of the M2 meeting its specs?
: Or would anyone care to comment on why or why not the Spec's can be met?
: I'm curious.
If I understand you,(doubtful) can't you just find some commonly used
software-only rendering benchmark--like the Utah teapot or something--
and compare the M2's reported performance to it? Say, if a PowerMac can
render 150K poly/sec... (I'll let someone who knows what they're talking
about finish that sentence) :)
DaveNagy
Yeah, but your TV will not support that resolution. 640x480 will blow your
mind when you see it.
:/ Jive Baby can always be reached at :/
:/ Jive...@aol.com :/
:/ or Mi...@intrepid.chm.jhu.edu :/
>Can M2 have better resolution than 640x480? I want better Res!
>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>I've been wondering for a bit and reading this ultra 64 specs from
>EGM (i dunnda if its true) Ultra 64 can go to 1200X1200!!
> I dont think my computer can go that high!
>But i dunno.. Egm also says its 100 000 polygons a sec. isn't it more
>like 900000?
The Ultra 64 cannot do 1200x1200 on a normal television... It simply
isn't possible. 640 x 480 is about the best you can do on a TV. I have seen
claims that the U64 can do anywhere from 100,000 to 1,300,000 polys per
sec (the 1,300,000 rumor started shortly after the M2 info was released. :)
). Since Nintendo has not seen fit to release ANY polygon performance
info, and since EGM is usually full of shit, you should take this info
with a grain of salt.
You have a Sony XBR too eh? Anyway, as far as anything over 640*480, I
wouldn't care much about. In fact, I would be surprised if the 640*480
was used that often. Why? Well, if they're using 32,000 colors at 320*200
and it uses a certain amount of memory, what do you think of 640*480? You
have an Amiga so I'm sure you'll remember how big the hi-res pics got with
lots of colors (if you had an AGA Amiga that is). And with just 4 megs
of ram... Well... it might be used sometimes, but we'll see.
--
Smile
Rob C.
: You have a Sony XBR too eh?
Yup. :) An older one that has the RGB connector but no S-video inputs.
Lemme know if anybody hears of a fairly cheap S-video-to-RGB adaptor. I
know that some older Mitsubishi sets had an optional cartridge doohickey
that would perform that function...
: Anyway, as far as anything over 640*480, I wouldn't care much about.
Uh huh. I believe one can digitize video at a frame-size of 320x480 and
have it look essentially "perfect". (as good as video ever looks)
: In fact, I would be surprised if the 640*480
: was used that often. Why? Well, if they're using 32,000 colors at 320*200
: and it uses a certain amount of memory, what do you think of 640*480? You
: have an Amiga so I'm sure you'll remember how big the hi-res pics got with
: lots of colors (if you had an AGA Amiga that is). And with just 4 megs
: of ram... Well... it might be used sometimes, but we'll see.
It seems like this would be mostly a video RAM issue (chip RAM on an
Amiga). Would a game that produced 640x400 graphics in fact use more
memory than an "identical" program that used 320x200? Aside from the Ram
used to store the actual "picture" I mean. The texture maps and object
polygon descriptions wouldn't HAVE to be any bigger...
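A back-of-the-envelope check on the display memory side of this, in Python. It assumes 16 bits per pixel for the "32,000 colors" modes and takes the "4 megs" figure from the post being answered as 4 x 1024 x 1024 bytes; neither is a confirmed M2 number, and the function name is made up. Textures and polygon data are left out on purpose, since their size does not depend on the display resolution.

```python
def framebuffer_bytes(width, height, bits_per_pixel, buffers=1):
    """Bytes needed for the displayed image only (no textures, no geometry)."""
    return width * height * bits_per_pixel // 8 * buffers

RAM_BYTES = 4 * 1024 * 1024   # the "4 megs" mentioned above, taken as an assumption

low_res = framebuffer_bytes(320, 200, 16)
high_res = framebuffer_bytes(640, 480, 16)
high_res_24_double = framebuffer_bytes(640, 480, 24, buffers=2)

for name, size in [("320x200x16", low_res),
                   ("640x480x16", high_res),
                   ("640x480x24, double buffered", high_res_24_double)]:
    print(f"{name}: {size} bytes, {100 * size / RAM_BYTES:.1f}% of RAM")
```

Going from 320x200 to 640x480 at the same depth multiplies the picture memory by 4.8, and everything else stays the same size.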
Dave Nagy
I investigated this same issue a few years ago, as I also have an older
Sony XBR-series TV with analog RGB in, but no S-Video input. [Nit -
this is NOT "SVHS" input - SVHS is a videotape recording format. It's
true that most SVHS recorders have S-video outputs, but the two aren't
the same thing.]
I've been told that Sony does make (or did make) an S-Video-to-RGB
converter for use with these TVs. However, it was only sold in Japan,
and has an external power-supply brick set up for 100 volt AC.
Philips (or its Signetics division) makes a very flexible set of video
format conversion chips, including one which can do the S-Video-to-RGB
conversion. They even sell an evaluation board, with the chips, an
on-board microcontroller to initialize the chips, and the necessary
jacks and connectors. I don't know what the price is these days. It's
an engineering-evaluation board, not a consumer product.
I don't know of any other solutions at this point.
--
Dave Platt dpl...@3do.com
USNAIL: The 3DO Company, Systems Software group
600 Galveston Drive
Redwood City, CA 94063
Same TV, I bought mine the year BEFORE the SVHS standard came out (it
figures...). If YOU find out about any RGB to SVHS, lemme know. Come to
think of it, if you or anyone out there find a SVHS to RGB, I'd take that
too (I'd love to hook the 3DO onto the RGB :( .
--
Smile
Rob C.
Thanks for the info.
--
Smile
Rob C.
> In fact, I would be surprised if the 640*480
> was used that often. Why? Well, if they're using 32,000 colors at 320*200
> and it uses a certain amount of memory, what do you think of 640*480?
For doing videos, a 320x480 resolution picture is probably just fine. For
doing video games it is not high enough to give a totally realistic looking
image. I'll explain why...
When you are rendering a frame in a game, your software decides what color
to make a certain pixel. Now say that object moves 1/2 pixel to the right.
What would happen on a filmed movie is that the colors would be different
for both the original pixel and the one to the right. Your game software
and hardware are not sophisticated enough to do this, so the pixel to the
right is rendered as the same color it was when it was at the original
location. This is yet another form of aliasing. It is noticeable with
distant objects in Doom, for instance.
The way around it is to render the frame at a higher resolution, then
average the pixels to go down to the lower resolution. If you did this,
then why not just display it at the higher resolution (if you can afford
the display buffer RAM)?
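A minimal sketch of that render-high-then-average idea, in Python. It uses a plain 2x2 box average on a grayscale image stored as a list of rows; that filter is just the simplest choice and says nothing about what any real console does. The function name is made up.

```python
def downsample_2x(image):
    """Average each 2x2 block of a grayscale image (list of rows) into one pixel."""
    out = []
    for y in range(0, len(image) - 1, 2):
        row = []
        for x in range(0, len(image[y]) - 1, 2):
            total = (image[y][x] + image[y][x + 1] +
                     image[y + 1][x] + image[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out

# A hard vertical edge that falls halfway through a low-res pixel.
hi_res = [
    [0, 0, 0, 255],
    [0, 0, 0, 255],
    [0, 0, 0, 255],
    [0, 0, 0, 255],
]
lo_res = downsample_2x(hi_res)
print(lo_res)
```

The low-res pixel that the edge cuts through comes out as an in-between value instead of snapping to one side, which is the half-pixel behaviour the filmed image has and the naive renderer lacks.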
In the future, we will see machines that render so incredibly fast that
they will be rendering at a higher resolution than the TV can display,
but you will notice the difference anyway. For today, the best thing
that can be done is to field render at 60hz so that you are using *time*
to do the averaging for you. That's why even if a picture no longer looks
jerky, it is STILL better to render at a higher frame rate.
Wayne.
[...]
: Philips (or its Signetics division) makes a very flexible set of video
: format conversion chips, including one which can do the S-Video-to-RGB
: conversion. They even sell an evaluation board, with the chips, an
: on-board microcontroller to initialize the chips, and the necessary
: jacks and connectors. I don't know what the price is these days. It's
: an engineering-evaluation board, not a consumer product.
: I don't know of any other solutions at this point.
Well Dave, I guess the only thing left to do would be for you to make
sure that one of the M2 models has an RGB signal available somewhere. :)
You know...as a favor to us. ;)
Dave Nagy
: For doing videos, a 320x480 resolution picture is probably just fine. For
: doing video games it is not high enough to give a totally realistic looking
: image. I'll explain why...
: When you are rendering a frame in a game, your software decides what color
: to make a certain pixel. Now say that object moves 1/2 pixel to the right.
: What would happen on a filmed movie is that the colors would be different
: for both the original pixel and the one to the right. Your game software
: and hardware are not sophisticated enough to do this, so the pixel to the
: right is rendered as the same color it was when it was at the original
: location. This is yet another form of aliasing. It is noticeable with
: distant objects in Doom, for instance.
Exactly... But the M2 [sounds like it] has all kinds of hardware in it to
prevent/minimize this kind of stuff from happening. I have NO idea if
it's hip enough to do this, but some hardware can do sub-pixel averaging
and other kinds of tricks to make the picture "look" as though it's of
higher resolution than it is...
As for the "distant texture twinkle" that you see in Doom, etc, that's
preCISEly the effect that MIP mapping and filtering can get rid of. I
think it was Jason that gave a nice explanation of how that might work.
: The way around it is to render the frame at a higher resolution, then
: average the pixels to go down to the lower resolution. If you did this,
: then why not just display it at the higher resolution (if you can afford
: the display buffer RAM)?
I think that they have ways of achieving similar results without using
that method. For one thing, it gives them a chance to remove "illegal"
color values and transitions that might give NTSC fits.
I dunno though... Your method _would_ work. It might be a memory
issue. Perhaps by doing the anti-aliasing a pixel at a time, they save
space.
Calling Dave Platt! We'd LOVE an explanation of the M2's anti-aliasing
methods and trade-offs. (suitably vague, of course)
: In the future, we will see machines that render so incredibly fast that
: they will be rendering at a higher resolution than the TV can display,
: but you will notice the difference anyway.
About six months in the future, if we're lucky. :)
: For today, the best thing
: that can be done is to field render at 60hz so that you are using *time*
: to do the averaging for you. That's why even if a picture no longer looks
: jerky, it is STILL better to render at a higher frame rate.
Motion blurring effects too? You ARE greedy. :) On my home renderer, I
can get a good subtle blurring effect by rendering 8x more frames than
necessary and then averaging them together. The M2's alpha-channel
hardware would probably eat that averaging stuff for lunch, so let's see,
how many polys could we do?
Let's say 600k polys/sec are actually do-able on the M2 with all the
trimmings. An eighth of that would be 75k/sec. If we want an output of
25 motion-blurred frames/sec, we'll be able to render frames with... 3000
polygons in them.
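The same sum in Python, for anyone who wants to plug in other guesses. The 600k polys/sec, 8 sub-frames and 25 frames/sec are the figures guessed at above, not measurements, and the function name is made up.

```python
def polys_per_output_frame(polys_per_sec, output_fps, subframes_per_frame):
    """Polygons available per sub-frame when each output frame is an
    average of several fully rendered sub-frames."""
    renders_per_sec = output_fps * subframes_per_frame
    return polys_per_sec // renders_per_sec

renders_per_sec = 25 * 8
budget = polys_per_output_frame(600_000, 25, 8)
print(renders_per_sec, budget)
```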
Jeez, that's still a lot. I wonder if that's possible? I don't see why
not. You wouldn't have to store the extra frames, you'd just average
them into the "frame in progress" you're working on. Could this
operation be done 200 times a second?
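The "average into the frame in progress" idea can be sketched as a running average, so only one accumulation buffer is ever kept. This is plain Python on toy two-pixel frames; treating each step as a blend with weight 1/(n+1) is one way an alpha-blend unit could be used for this, and nothing here is a claim about how the M2 hardware actually works. The names are made up.

```python
def accumulate(frame_in_progress, subframe, n_seen):
    """Fold one more sub-frame into a running average of n_seen earlier ones.
    Equivalent to a blend of the new sub-frame with alpha = 1 / (n_seen + 1)."""
    alpha = 1.0 / (n_seen + 1)
    return [old + (new - old) * alpha
            for old, new in zip(frame_in_progress, subframe)]

# Toy sub-frames: the first pixel changes over time, the second is static.
subframes = [[0, 80], [8, 80], [16, 80], [24, 80]]

blurred = subframes[0]
for n_seen, sub in enumerate(subframes[1:], start=1):
    blurred = accumulate(blurred, sub, n_seen)

print(blurred)
```

The moving pixel ends up at the mean of its four sub-frame values while the static pixel is untouched, and none of the individual sub-frames has to be stored once it has been folded in.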
Inquiring minds want to know.
Dave Nagy
As a master's student in Computer Engineering at Michigan,
who knows probably too much about the x86 architecture, and has
looked at 68k and Alpha architectures, this sounds like a completely
off the wall claim...
There are some very CISC-ish instructions in the x86 instruction set
(BOUND and the string operations come to mind), that may be helpful
in executing BASIC or C high level functions, but to my knowledge
most optimizing compilers ignore them nowadays, in favor of
using the more RISC-like (meaning 1-cycle throughput instructions,
like MOV, and basic arithmetic and logical) instructions.
I think this is probably the source of the original poster's confusion.
(BTW, as an aside and off-topic, you could conceivably put a
high-level language interpreter in Alpha's pal-code, although
doing so would probably not provide any benefit over doing it
the traditional way, and may impact performance, especially in
multi-tasking systems)
Bernard Yeh
ber...@engin.umich.edu
Who, me? :)
I don't know the nitty gritty details, but I do know that the chroma (color)
bandwidth is about half as much as the luma (brightness) bw. Which means
that your TV might be capable of displaying 600 alternating black and white
(or red and black, or blue and black, etc.) lines, but not 600 alternating
red and blue lines. The brightness can change more quickly than the color.
This is why the interpolation scheme the Opera uses is so elegant. It
decreases aliasing without consuming the memory which would otherwise be required
to store a higher resolution bitmap, while pushing the signal to near the
edge of the NTSC envelope.
: On the surface, in the air, under water, I'll be there!
^^^^^^^^^^^^
Is that, that Dead Guy song?
No, just a few hobbies of mine rolled into a cryptic poem... :)