
Design of future silicon to handle Java JVM


Roedy Green

May 30, 1996

I have high hopes for the eventual speed of CPUs tailored specifically for
the Java runtime environment.

I could see the top 8 or so elements on the operand stack living in
registers that independently rename themselves, allowing stack swaps and
inserts to be done in a single cycle. That stack cache would lazily
back itself to some fast static RAM or even dynamic RAM. All the sorts of
things that RISC machines do with parallel execution pipelines will also
eventually be possible.

The nice thing about Java code is that everyone learns threads right off the
bat. This means that as Java accelerators come on the market with 2 CPUs
on them, later models can carry 16, 32, 64 ... To get more speed, just add
more processors. Java is nicely designed so that the read-only code,
read-only data, and read-write data could be put onto separate busses or
even onto a dozen little local busses. Code is nice and compact, with high
locality of reference cutting the needed RAM access bandwidth and the need
for a totally global pool of RAM. It isn't quite Occam, but we are in much
better shape than ever before.
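
For instance, the thread-per-task style is idiomatic from day one (a
minimal sketch; the class names are invented):

    // One worker per task; on a multi-CPU Java accelerator each
    // thread could land on its own processor.
    class Worker implements Runnable {
        private final int id;
        Worker(int id) { this.id = id; }
        public void run() { System.out.println("worker " + id + " done"); }
    }

    public class SpawnDemo {
        public static void main(String[] args) {
            for (int i = 0; i < 4; i++)
                new Thread(new Worker(i)).start();
        }
    }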

Ro...@bix.com <Roedy Green of Canadian Mind Products> contract programming
-30-

TheWebBook

May 31, 1996

In article <4olaqb$m...@news2.delphi.com>, Roedy Green <ro...@BIX.com>
writes:

>I could see the top 8 or so elements on the operand stack living in
>registers that independently rename themselves, allowing stack swaps and
>inserts to be done in a single cycle.

Check out the ShBoom chip at http://www.ptsc.com/. It's the silicon I'm
using in my WebBook for the JVM.

Chuck


Chuck Durrett
The WebBook Company
810.644.2725
Birmingham, Michigan, USA

"Thanks for listening."

Phillip Bogle

May 31, 1996

But a good question is why a stack-based VM was chosen in the first place.
JIT compilers have to go through a fair amount of work to reconstruct
the underlying expression trees and then do register allocation. Chip
designers tried stack-based instruction sets a while back, and they were a
dismal failure compared with register-based CPUs, as seen by the fact
that there are essentially no real chips that don't use registers.

Given these facts, I wonder why the Java designers didn't simply use, say,
16 or 32 virtual registers. Architectures which support fewer physical
registers would have to spill them to memory, but that's very well
understood.
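
To make the reconstruction problem concrete, here is roughly what javac
emits for a tiny method, with the register form a JIT must recover
sketched alongside (the register mnemonics are invented for illustration):

    class StackVsRegisters {
        static int f(int a, int b, int c) {
            return a * b + c;
            // javac emits pure stack code:     a JIT must recover:
            //   iload_0      push a
            //   iload_1      push b
            //   imul         a*b on TOS          mul r3, r0, r1
            //   iload_2      push c
            //   iadd         a*b + c on TOS      add r3, r3, r2
            //   ireturn
        }
    }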

-phil

Roedy Green <ro...@BIX.com> wrote in article <4olaqb$m...@news2.delphi.com>...

> I have high hopes for the eventual speed of CPUs tailored specifically
> for the Java runtime environment.
>
> I could see the top 8 or so elements on the operand stack living in
> registers that independently rename themselves, allowing stack swaps and

Wayne Morellini

Jun 1, 1996

Phillip Bogle (phi...@saranac.microsoft.com) wrote:
: But a good question is why a stack-based VM was chosen in the first place.

: JIT compilers have to go through a fair amount of work to reconstruct
: the underlying expression trees and then do register allocation. Chip
: designers tried stack-based instruction sets a while back, and they were a
: dismal failure in comparison with register-based CPUs, as seen by the fact
: that there are essentially no real chips that don't use registers.

: Given these facts, I wonder why the Java designers didn't simply use, say,
: 16 or 32 virtual registers. Architectures which support fewer physical
: registers would have to spill them to memory, but that's very well
: understood.

I have not studied the Java document myself, but as far as stack-based
processors go, I am familiar with a few that get astonishing results compared
to the less efficient register-based processor designs. It really depends on
what you're running: if you want to run normal programming languages on
stack processors, you're looking at a performance hit, though mind you,
you're probably looking at a bigger one running the stack-based Forth
language on a register-based machine without good stack support. The funny
thing is that Unix and C are at present made to run more efficiently on
register-based machines, and register-based machines are made to run C and
Unix more efficiently; there is the catch-22 situation.

The actual inventor of Forth has been designing stack-based processors since
the early '80s, and virtually every design he has done has had efficiencies
above all available commercial register-based processors.

First, the Novix was outperforming the original RISC chips in the early
'80s, using less on-chip circuitry, a simpler design, and probably (though I
have never read the actual specs on this) a larger design rule.

This was developed by Harris Semiconductor (which went on to become one of
the biggest manufacturers in the States before dropping advancement on the
digital side; the exact details I forget) as their RTX200x series
micro-controller, in a big marketing push.

Around 1990 he went on to design the ShBoom chip (the original design of
the ShBoom-II mentioned by Chuck Durrett in the earlier post). It originally
did 50 MIPS out of DRAM (but I think the internal speed was 200 MHz, at 1
instruction per cycle); I believe I read in some early-1990s report that
he had got the prototype up to 400 MHz, significantly faster than any
processor on the market and probably faster than any other prototype at the
time. The chip was meant to run both Forth and C, I believe. It was
originally commissioned as a workstation processor for somebody wishing to
resurrect the Apollo workstation company. As he was basically designing
these things on his own, and the deal to buy out Apollo fell through, he was
stuck with a chip that he couldn't get manufactured.

After this, MOSIS had a custom chip prototyping service available with
40-pin chips. He had designed his own silicon CAD program using his
techniques and set about designing a chip that would fit that footprint.
This thing was due to come out years before it did, with a higher spec than
it has: 7000 transistors, 100 MIPS (max), approx 100 mW on a 1.2 micron
process, 4 instructions per memory load (as with ShBoom). Note the estimate
of speed capability on 1.2 micron is somewhere between 150-250 MHz (I forget
which); it was just a matter of experimenting and learning, which is why the
chip hasn't reached those speeds yet. It has I/O, video, and a memory
co-processor in those 7000+ transistors as well. The chip is the MuP21.

Presently there are lots of derivatives being planned; one that we don't
know much about is being done for a set-top box (maybe it will run Java?).
The next official one off the block will hopefully be the F21, which is also
experiencing delays due to debugging. It is an improved version of the
MuP21, with a serial network, higher-resolution video output, D/A (or was
that A/D?), etc. Apparently (I read a post from a guy commissioning the
design) the prototype is clocking 400 MHz internally (0.8 micron design
rules), though nothing like this speed is available through the memory
interface. The design has larger stacks and is supposed to be more
efficient than the previous effort.

The stated limit of his designs is supposed to be 1 GHz, but judging from
the speeds he is getting from these old design rules, I wonder if it is
higher now.

This guy virtually designs these in his garage (not literally, but that is
the mentality of it) on low budgets. Now imagine what he could do if he had
some real money behind him and some professional trainees.

I am waiting for his next 32-bit design, and probably a few other people are
too.

See, the problem isn't stacks, it's what you run on them and the
implementation of the stacks; if you ran stack-based software on a good
stack system you would probably notice amazing efficiencies.

A full stack instruction set will fit in 8 bits, allowing four instructions
for every memory cycle (allowing cheap DRAM to be used, or maximising
fast SRAM); such designs fit in low transistor counts that minimise power
consumption and maximise speed. These people are also searching for RAM
manufacturers that will put the core on RAM chips. The 8-bit instruction
size should fit all the core instructions needed to build any operation, and
even be able to support the C language as well.
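
A sketch in Java of the fetch/decode arrangement being described (opcode
values and the execute() dispatch are invented): one 32-bit memory read
delivers four 8-bit instructions, so each bus cycle goes four times as far.

    class PackedFetch {
        static final int[] code = { 0x01020304 };   // toy code image
        static int pc = 0;

        static void execute(int op) {               // hypothetical dispatch
            System.out.println("executing op " + op);
        }

        public static void main(String[] args) {
            int word = code[pc++];                  // one memory cycle
            execute((word >>> 24) & 0xFF);          // four 8-bit
            execute((word >>> 16) & 0xFF);          //   instructions per
            execute((word >>>  8) & 0xFF);          //   32-bit fetch
            execute( word         & 0xFF);
        }
    }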

For the latest on Charles Moore's efforts in integer Forth engines see:

http://www.dnai.com/~jfox

Follow all the links for info.

There are other stack-based designs done by other people scattered around
the world; you'll have to look for them or ask in the comp.lang.forth
newsgroup.

Hope this helps, Phil.

Wayne.

way...@cq-pan.cqu.edu.au

: -phil

--

Matt Kennel

Jun 2, 1996

Phillip Bogle (phi...@saranac.microsoft.com) wrote:
: But a good question is why a stack-based VM was chosen in the first place.
: JIT compilers have to go through a fair amount of work to reconstruct
: the underlying expression trees and then do register allocation. Chip
: designers tried stack-based instruction sets a while back, and they were a
: dismal failure in comparison with register-based CPUs, as seen by the fact
: that there are essentially no real chips that don't use registers.

: Given these facts, I wonder why the Java designers didn't simply use, say,
: 16 or 32 virtual registers. Architectures which support fewer physical
: registers would have to spill them to memory, but that's very well
: understood.

Why should a good *virtual* machine be architected after good hardware?

Why is there the notion that the hardware architecture of 'JVM optimized'
chips would necessarily ape the structure of the Java virtual machine?

Does the hardware architecture of a SPARC look like the 'c+stdio'
virtual machine?

A Java chip is one that runs the JVM fast.


Roedy Green

Jun 2, 1996

Chuck Moore, the designer of Forth and various stack based CPU chips gave
me his card:

Chuck Moore
Computer Cowboys
410 Star Hill Road
Woodside CA 94062
(415) 851-4362

The Java JVM is designed to wear well over the years.

Roughly speaking, elements near the top of the stack deserve prime real
estate. Local variables with low slot numbers deserve prime real estate.
Classes with lower class numbers deserve better real estate. With
registers, you have to make a decision on exactly how many you will
have. The JVM architecture leaves things open to throw more and more
transistors at the same design.

You can think of the JVM design as having a huge register set. It is much
like the old PDP-10s, where the first locations in RAM were actually
registers.

The big advantage of the JVM is compact code.

Wayne Morellini

Jun 3, 1996

Roedy Green (ro...@BIX.com) wrote:
: Chuck Moore, the designer of Forth and various stack based CPU chips gave
: me his card:

: Chuck Moore
: Computer Cowboys
: 410 Star Hill Road
: Woodside CA 94062
: (415) 851-4362

He also has some web pages on the http://www.dnai.com/~jfox site.

Wayne.

: Ro...@bix.com <Roedy Green of Canadian Mind Products> contract programming
: -30-

--


Phillip Bogle

Jun 3, 1996

way...@cq-pan.cqu.edu.au (Wayne Morellini) mentioned several real
machines that achieve very high performance using stack based
architectures:

>
> For the latest on Charles Moores efforts in integer Forth engines see:
>
> http://www.dnai.com/~jfox

It seems I've been misled by only looking at the mainstream chips, and I
didn't realize this was such an area of active development. As noted,
stack-based architectures have the advantage that they permit compact
instruction codes.

The argument I'd always heard against stack-based instruction sets is that
you waste a lot of cycles shuffling things around on the stack, and
getting at items other than the few items at the top of the stack.

As long as we're designing virtual machines, couldn't you get many of the
same compactness benefits by having register-based instructions, but then
also having very compact common-case instructions where the registers are
implicit?

m...@caffeine.engr.utk.edu (Matt Kennel) wrote in article
<4or04l$7...@gaia.ns.utk.edu>...

> Why should a good *virtual* machine be architected after good hardware?
>
> Why is there the notion that the hardware architecture of 'JVM optimized'
> chips would necessarily ape the structure of the Java virtual machine?
>
> Does the hardware architecture of a SPARC look like the 'c+stdio'
> virtual machine?
>
> A Java chip is one that runs the JVM fast.

I was under the impression, probably mistaken, that Sun's Java chips would
execute the Java byte codes directly, which seems a bad idea because the
byte codes are designed for compactness, not efficiency of execution. Can
someone point me to more information on this subject?

There are two issues with a virtual machine-- (1) how easy (and fast) is
it to translate the VM codes to native machine code, and (2) how quickly
does the executed code run. If a register-based format is easier to
translate to the majority of real machines, and the resulting native code
runs no slower than that translated from stack-based code, then I would
argue that, yes, there is a good reason why the virtual machine
architecture should mimic real machine architectures.

The question is, when Sun designed the byte codes, were they optimizing
for their VM interpreter, or for JIT native-code compilers? Since no JITs
existed at the time, they would have needed a lot of foresight to really
optimize for the latter case.

-phil


Roedy Green

Jun 4, 1996

Phillip Bogle wrote:
>The argument I'd always heard against stack-based instruction sets is that
>you waste a lot of cycles shuffling things around on the stack, and
>getting at items other than the few items at the top of the stack.
>
>As long as we're designing virtual machines, couldn't you get many of the
>same compactness benefits by having register-based instructions, but then
>also having very compact common-case instructions where the registers are
>implicit.

You can write the same program with giant long expressions and few local
variables (which is usually fast but hard to read), or unravelled with many
local variables.

The stack mainly handles evaluating expressions. I can foresee tools that
will optimise byte codes by replacing stack-juggling ops with local
variables, or vice versa.

If you have been exposed to Forth, you know how you almost never need local
variables if you have even two stacks. Smart architectures could make
those stack juggling operations almost free. Right now stack operations
are quite expensive since stacks are implemented in RAM, and stack juggling
means copying many elements, especially burying something deep in the
stack.

The nice thing about the stack is the natural ordering. The elements on
the top of the stack are almost certainly going to be needed before ones
buried in it. I think this may give some powerful hints to parallel and
speculative execution.

If you have only a little iron, throw it at the top 2 stack elements. If
you have a lot throw it at the top 32. Local variables also act like
registers. Again we should presume the low numbered ones are backed by
better iron.

I have always wanted to experiment with a machine where ALL the registers
were actually stacks, backed by little independent processors to handle
overflow to RAM.

The big advantage of stack machines is the compact code. The operands you
want are nearly always within the top 4.

In Forth and PostScript, if you order parameters cleverly, a routine can
avoid the overhead of having to build parameter lists for the routines it
calls. It just feeds the same list it was handed, and lets the callee
consume one or more of the parameters off the stack. I don't know if this
sort of optimisation will ever be possible for Java. I will need to think
hard about just what sorts of behind-the-scenes methods could be used to
implement Java's call mechanism.
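
The JVM already gets part of this for free: arguments are just values left
on the operand stack, and the invoke instruction lets the callee consume
them in place. A sketch (the bytecode comments are roughly what javac
produces):

    class StackArgs {
        static int f(int x) { return x + 1; }
        static int g(int a, int b) { return a * b; }

        static int h(int x, int y) {
            return g(f(x), y);
            //   iload_0          push x
            //   invokestatic f   pops x, leaves f(x) on the stack
            //   iload_1          push y
            //   invokestatic g   pops both, leaves the result
            //   ireturn
        }
    }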

Wayne Morellini

Jun 5, 1996

Phillip Bogle (phi...@saranac.microsoft.com) wrote:
: way...@cq-pan.cqu.edu.au (Wayne Morellini) mentioned several real
: machines that achieve very high performance using stack based
: architectures:

: >
: > For the latest on Charles Moores efforts in integer Forth engines see:
: >
: > http://www.dnai.com/~jfox

: It seems I've been mislead by only looking at the mainstream chips and
: didn't realize it was such an area of active development. As noted
: stack-based architectures have the advantage that they permit compact
: instruction codes.

: The argument I'd always heard against stack-based instruction sets is that
: you waste a lot of cycles shuffling things around on the stack, and
: getting at items other than the few items at the top of the stack.

Well, this requires efficient code design to avoid. But if you were
trying to run C or something through a Forth-like stack machine, I imagine
it would be a problem, because C is not coded efficiently for Forth
machines.

: As long as we're designing virtual machines, couldn't you get many of the
: same compactness benefits by having register-based instructions, but then
: also having very compact common-case instructions where the registers are
: implicit?

Undoubtedly you would get speed benefits from this, but trying to do it in
less than 5 bits per instruction could prove difficult. The minimum Forth
instruction set will fit in 5 bits; in fact, if you have an instruction to
dig down into the stack and retrieve or save data, you can use the stack
locations as registers (not really needed in Forth). As the stacks can be a
separate on-chip memory area, doing this is not too bad (no clash with main
memory bus access).

With register-based machines you either have to have fields within the
instructions to access the registers (even when common-case instructions are
used, other instructions have to access that register and require these
fields) or an instruction that specifically accesses them. Therefore
bandwidth is still wasted somewhere; mind you, I think common-case has
merit. The accumulator-based machines assumed that the accumulator was
always involved and results could be shifted out to memory variables (like a
slow array of registers), but then you lose that nice quick
interchangeability of data between variables (in registers) that you have
with a large set of registers.

Probably the benefit of stack-based machines is that the stack provides an
automatic way to handle and control data: you put the data on the
stack (like a register, but without the fields specifying which register to
put it in) and the order of the data decides which data shall be used
next. You operate on the data (without needing to specify the field again,
as the order on the stack decides which is to be used next). As you use it,
the data in stack locations is automatically consumed or new data is added,
and the stack locations left are automatically available for the next
instruction. In other words, you don't need the field-handling
parts of the instruction, as the order the data is loaded automatically
decides who gets what. But then again we have the problem of what happens
if we want the same data to be used again or passed back, requiring it to be
stored somewhere (the return stack can be used as a temporary store within a
sub-routine, but not across those boundaries): it is stored to a register,
to memory, or swapped down the stack using special stack instructions (hence
some overhead). So this type of stack machine needs careful coding (but
modern long-word-architecture processors require special compilers to set up
their coding sequences efficiently as well).
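
The JVM has exactly these special stack instructions (dup, swap, and
friends). For example, javac compiles a chained assignment with dup rather
than re-fetching the value (a sketch; slot numbers assume a static method
with no arguments):

    class DupDemo {
        static void init() {
            int x, y;
            x = y = 0;
            //   iconst_0   push the constant 0
            //   dup        copy it; no register field needed
            //   istore_1   y = 0
            //   istore_0   x = 0
        }
    }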

Correct me if I'm wrong I am not up on current trends.

By the way, don't tell your boss about the performance of these stack-based
processors; I'm sure most people in the Forth community wouldn't really want
a Microsoft stack-based processor for Windows as the future standard we all
had to follow :) Mind you, the MicroUnity project Microsoft has invested in
might make all this a bit irrelevant.

Wayne Morellini
way...@cq-pan.cqu.edu.au

: m...@caffeine.engr.utk.edu (Matt Kennel) wrote in article
: <4or04l$7...@gaia.ns.utk.edu>...

: > Why should a good *virtual* machine be architected after good hardware?
: >
: > Why is there the notion that the hardware architecture of 'JVM optimized'
: > chips would necessarily ape the structure of the Java virtual machine?
: >
: > Does the hardware architecture of a SPARC look like the 'c+stdio'
: > virtual machine?
: >
: > A Java chip is one that runs the JVM fast.

: I was under the impression, probably mistaken, that Sun's Java chips would
: execute the Java byte codes directly, which seems a bad idea because the
: byte codes are designed for compactness, not efficiency of execution. Can
: someone point me to more information on this subject?

: There are two issues with a virtual machine-- (1) how easy (and fast) is
: it to translate the VM codes to native machine code, and (2) how quickly
: does the executed code run. If a register-based format is easier to
: translate to the majority of real machines, and the resulting native code
: runs no slower than that translated from stack-based code, then I would
: argue that, yes, there is a good reason why the virtual machine
: architecture should mimic real machine architectures.

: The question is, when Sun designed the byte-codes, were they optimizing
: for their VM interpreter, or for JIT native-code compilers. Since no JITs
: existed at the time, they would have needed a lot of foresight to really
: optimize for that case.

: -phil


--

Pete Lynch

Jun 5, 1996

For a well-known example of a stack-based processor implementing a VM
for a language with inbuilt threading and monitors, try the
Inmos Transputer and the Occam language, now owned by SGS-Thomson.
David May took the same approach as Sun/Gosling, i.e. there will be
only one semantics. There was reasonable support for serialising
structures for data transfer, but no equivalent of OO. ISTR that the
T4xx was significantly faster than the same-generation Intel.


James F'jord Lynn

Jun 5, 1996

Phillip Bogle <phi...@saranac.microsoft.com> wrote:
>I was under the impression, probably mistaken, that Sun's Java chips would
>execute the Java byte codes directly, which seems a bad idea because the
>byte codes are design for compactness, not efficiency of execution. Can
>someone point me to more information on this subject?

It would be a bad idea to make a Java PC based around a Java chip, because
the bytecodes are designed for compactness. However, that's not what the
Java chips are for. They are for controlling appliances and such: VCRs,
telephones, etc. In this case, having compact bytecode is more beneficial
than having word- or dword-aligned bytecodes. The reason is that
to make a telephone cheap, you can't go throwing megs of flash ROM into it.
You put as little as possible in, and having tight opcodes helps you
cram those extra few instructions in. At BNR, some of the devices I was
working with had only 128K of code space, which was very little (these
weren't telephones), and it was expanded to 512K in the next version,
causing a price hike. In these systems, having a compact CISC-style
bytecode is much more reasonable than a word-aligned RISC-style one.


--
Life - F'jord of Timelord, James Lynn
Java - http://www.undergrad.math.uwaterloo.ca/~j2lynn/java.html <NEW LOCATION>
SuperButton v1.1 and MessageBox v1.0 available there

Brian N. Miller

Jun 5, 1996

At first peek, the Harris RTX 2010
(http://www.semi.harris.com/datasheets/rh/hs-rtx2010rh/)
looks like a likely candidate for the JVM. It's got these
promising features, quoting from the data sheet:

- Direct execution of Forth language
- 42 peak (8 sustained) million Forth ops per second at 8 MHz.
- Two on-chip 256 word stacks (parameter stack and return stack).
- Configurable stack partitioning and under/overflow control.
- Pair of registers for top two parameter stack elements.
- Register for top of return stack.
- Multitasking capabilities.
- Single cycle instruction execution (2 cycles for those that access RAM).
- Single cycle multiply.
- Single cycle subroutine call.
- Free subroutine return (incorporated in previous instruction).
- On-chip interrupt controller.
- Eight 16-bit general purpose registers.
- Tiny package and pinout.
- 1 Meg address space. Yuk! :(

Here's an encouraging quote about an older (slower?) version:
(http://eli.wariat.org/~rj/dreams/node28.html)

On the RTX-2000, the machine cycle is 100 ns, and 25 ns static
RAM would do the job just fine. Harris has indicated that with
the RAM on the same chip as the processor, 40 ns RAM would do.
They say it is straightforward to put 8 K cells of such RAM on
the same chip as the processor. This would support 8 tasks with
a 500 ns context switching overhead running an object oriented
program. The cost for a commercial grade chip should be around
$65.00 in quantity. We do not feel that there is another processor
around that can come anywhere near this price/performance ratio
for a real-time micro-controller running object oriented
applications.

WebBook (http://webbook.com/) chose the PSC1000 ShBoom
(http://www.ptsc.com/) which claims:

- Dual stack (register stack and data stack).
- Top 18 words of data stack cached on chip.
- Top 16 words of register stack cached on chip.
- Stacked ALU.
- 8 bit instructions and no operands.
- 1 word (4 instructions) instruction cache.
- Single cycle instructions.
- 32 bit RISC
- 100 peak native MIPS at 100MHz
- Separate on-chip I/O processor.
- No traditional I-cache or D-cache.
- Instruction pre-fetch.

The AMD 29K is popular inside PostScript printers.
http://www.amd.com/html/products/EPD/29k/29kover.html claims for
the Am29050:

- 192 addressable general purpose registers, 4-ported.
- Addressable register file can be mapped to top n words of
application stack.
- 3 address instruction set
- 32 bit RISC
- Burst mode RAM reads.
- I/O data registers.
- 1K branch target cache.
- 55 native sustained MIPS at 40 MHz.
- 4 Gig VM address space with demand paging.


Roedy Green

Jun 6, 1996

Wayne Morellini wrote:
>Probably the benefit of stack based machines is that the stack provides an
>automatic way to handle and control data, meaning you put the data on the
>stack (like a register but without the fields specifying which register to
>put it in) the order of the data decides which data shall be used
>next. You operate on the data (without needing to specify the field again,
>as the order on the stack decides which is to be used next). As you use it
>the data on stack locations are automatically used or new data is added.
>The stack locations left are automatically available for the next
>instruction.

Thank you for putting that idea out so clearly.

The net effect is that all garden-variety JVM operators work on the top
two stack elements, so you don't waste ANY bits specifying the operands.
Operands are all implied by the postfix order in which the stack is built.
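
Concretely (a sketch; byte counts are from the JVM instruction encoding,
and the RISC comparison is generic):

    class CompactCode {
        static int sum(int a, int b) {
            return a + b;
            //   iload_0    1 byte
            //   iload_1    1 byte
            //   iadd       1 byte -- no operand fields at all
            //   ireturn    1 byte
            // Four bytes of method body; a typical 32-bit RISC
            // spends four bytes on the add instruction alone.
        }
    }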

The top N elements of the stack can be implemented in hardware. Potential
hardware stack underflow/overflow can be averted in the background,
asynchronously. The nice thing about the JVM architecture is that N
is an elastic limit, not something compilers have to keep firmly in mind.

I suspect we will see some very clever scheduling of operations inside
future Java cpus. The deeper in the stack the result of the computation
gets buried, the lower its priority to access the arithmetic units --
calculation by procrastination.

Even BEFORE the CPU has analysed the following instruction stream, it knows
the probable order that results will be needed. Couple that with a very
compact op-coding scheme, and the Java CPU can look very far ahead to
schedule out-of-order computation.

These RISCy ideas may sound silly when discussing a super-CISCY design like
the JVM. My gut feeling is it is only a matter of time till we have
hardware optimised to the transistor level to handle each JVM op-code. The
hardware for the more popular instructions like "add two ints" will be
duplicated. On top of that, every app is going to have a CPU per thread at
its disposal.

When there was a USSR, I had a conversation with Professor Barinov at
Moscow University. He had access only to 80286-level chip fabrication. The
West would not even let him buy a 386 or 486, even if he could afford it.
He also had to contend with high RAM prices since hard currency was not
easy to come by. He figured the only possible way he could build a CPU to
compete with those in the West was to use a stack design, similar to some
of Chuck Moore's Forth chips.

With Java, the economics are changing. Specialised Java chips will have an
even bigger market than the Intel Pentium. This is not the same unfair
game the implementors of specialised LISP machines faced.

Java may be a pig now, but watch it pull ahead over the next few years.

Achim Gratz

Jun 6, 1996

>>>>> "Brian" == Brian N Miller <b...@indica.bbt.com> writes:

Brian> The AMD 29K is popular inside PostScript printers.
Brian> http://www.amd.com/html/products/EPD/29k/29kover.html
Brian> claims for the Am29050:

The 29K line had its EOL (end of life) announced in 1995. It is a
rather interesting architecture though.

--
Achim Gratz.

--+<[ It's the small pleasures that make life so miserable. ]>+--
WWW: http://www.inf.tu-dresden.de/~ag7/
E-Mail: gr...@ite.inf.tu-dresden.de
Phone: +49 351 4575 - 325

Brian N. Miller

Jun 6, 1996

In article <4p5dlv$j...@news2.delphi.com>, Roedy Green <ro...@BIX.com> writes:
|
|The net effect is all garden variety JVM operators always work on the top
|two stack elements. ...
|Operands are all implied by the postfix order the stack is built.
|
|The top N elements of the stack can be implemented as hardware.
|...

|I suspect we will see some very clever scheduling of operations inside
|future Java cpus.
|...

|Even BEFORE the CPU has analysed the following instruction stream, it knows
|the probable order that results will be needed.

Using a stack instead of GPRs might actually reduce the potential for
scheduling optimisation, because each stack-based
instruction tends to be heavily dependent on the results of
the previous one. Stack operations almost always alter TOS, and
the next instruction almost always relies on the availability of new
data on TOS. So stack operations might over-constrain instruction
ordering.
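
To see the constraint, consider (a + b) * (c + d): the two adds are
data-independent, yet the stack encoding threads both through TOS (a
sketch; the bytecode comments are roughly what javac produces):

    class TosBottleneck {
        static int f(int a, int b, int c, int d) {
            return (a + b) * (c + d);
            //   iload_0 ; iload_1 ; iadd    first sum on TOS
            //   iload_2 ; iload_3 ; iadd    second sum on TOS
            //   imul ; ireturn              multiply and return
            // Every op consumes its predecessor's TOS result, so a
            // naive implementation issues strictly in order; the
            // hardware must rediscover that the two adds could have
            // run in parallel.
        }
    }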

|It is only a matter of time till we have
|hardware optimised to the transistor level to handle each JVM op-code.

See http://www.ptsc.com/ and http://www.sun.com/sparc/java/index.html

sirm...@epix.net

Jun 6, 1996

Roedy Green (ro...@BIX.com) wrote:
: I have high hopes for the eventual speed of CPUs tailored specifically for
: the Java runtime environment.

: I could see the top 8 or so elements on the operand stack living in
: registers that independently rename themselves, allowing stack swaps and
: inserts to be done in a single cycle. That stack cache would lazily
: back itself to some fast static RAM or even dynamic RAM. All the sorts of
: things that RISC machines do with parallel execution pipelines will also
: eventually be possible.

: The nice thing about Java code is that everyone learns threads right off
: the bat. This means that as Java accelerators come on the market with 2
: CPUs on them, later models can carry 16, 32, 64 ... To get more speed,
: just add more processors. Java is nicely designed so that the read-only
: code, read-only data, and read-write data could be put onto separate
: busses or even onto a dozen little local busses. Code is nice and compact,
: with high locality of reference cutting the needed RAM access bandwidth
: and the need for a totally global pool of RAM. It isn't quite Occam, but
: we are in much better shape than ever before.

: Ro...@bix.com <Roedy Green of Canadian Mind Products> contract programming
: -30-

But don't caches become stale and memory access conflicts multiply
geometrically as you add so many processors?

John Ahlstrom

Jun 7, 1996

Brian N. Miller (b...@indica.bbt.com) wrote:
: In article <4p5dlv$j...@news2.delphi.com>, Roedy Green <ro...@BIX.com> writes:
-- snip snip
: |...

: |I suspect we will see some very clever scheduling of operations inside
: |future Java cpus.
: |...
: |Even BEFORE the CPU has analysed the following instruction stream, it knows
: |the probable order that results will be needed.

: Using stack instead of GPRs might actually reduce the potential for
: scheduling optimisation. This might be because each stack-based
: instruction might tend to be heavily dependent on the results of
: the previous one. Stack operations almost always alter TOS, and
: the next instruction almost always relies on availability of new
: data on TOS. So stack operations might over-constrain instruction
: ordering.

: |It is only a matter of time till we have
: |hardware optimised to the transistor level to handle each JVM op-code.


It would be nice if some Burroughs A-Series compiler jocks
kicked in on this thread.

--
John Ahlstrom jahl...@cisco.com
408-526-6025 Using Java to Decrease Entropy

May 1996 -- "First free Webzine dies for lack of advertiser support."
Remember to click 3 ads whenever you visit a free site.

Any neural system sufficiently complex to generate the axioms of arithmetic
is too complex to be understood by itself.
Kaekel's Conjecture

Renny Bosch

Jun 7, 1996

Roedy Green wrote:

>
> Wayne Morellini wrote:
> >Probably the benefit of stack based machines is that the stack provides an
> >automatic way to handle and control data, meaning you put the data on the
> >stack (like a register but without the fields specifying which register to
> >put it in) the order of the data decides which data shall be used
> >next. You operate on the data (without needing to specify the field again,
> >as the order on the stack decides which is to be used next). As you use it
> >the data on stack locations are automatically used or new data is added.
> >The stack locations left are automatically available for the next
> >instruction.
>
> Thank you for putting that idea out so clearly.
>
> The net effect is all garden variety JVM operators always work on the top
> two stack elements so you don't waste ANY bits specifying the operands.
> Operands are all implied by the postfix order the stack is built.
>
> The top N elements of the stack can be implemented as hardware. Potential
> hardware stack underflow/overflow can be averted in the backgound
> asynchronously. The nice thing about the JVM architecture is just what N
> is an elastic limit, not something compilers have to keep firmly in mind.
> . . . .

>
> Java may be a pig now, but watch it pull ahead over the next few years.
>
> Ro...@bix.com <Roedy Green of Canadian Mind Products> contract programming
> -30-

Are you familiar with the ShBoom processor (made by Patriot Scientific)?
It has a stack architecture, and claims to be well suited to Java.

See http://www.ptsc.com
--
_\/_
Renny Bosch /\
Davies Bosch Associates /
Newport Beach, California __(__

Roedy Green

Jun 8, 1996

sirmango wrote:
>But don't caches become stale and memory access conflicts multiply
>geometrically as you add so many processors?

Granted, there are all kinds of problems in exploiting multiple CPUs.
Traditional approaches rapidly hit the problem of diminishing returns.

However, human ability to handle complexity is growing rapidly.

To get a feel for what sorts of optimisation are possible, SINGLE STEP for a
few days through applications written in C++ or Java. While you are
clicking away, all kinds of ideas will occur to you about how regularities
could be exploited. What looks like busywork? What is essential to the
algorithm?

Look far enough into the future that you don't have strongly preconceived
ideas about what is possible. The typical loop COULD implement each
iteration simultaneously on a separate processor without RAM conflicts.
Nearly always you are processing completely different objects. Sometimes
you are summing. Here the accumulator is shared. That may be handled by
sending messages to a CPU in charge of the accumulator or to a local
accumulator that consolidates results for the master accumulator (much as
votes are tallied in parallel, polling station by polling station.)
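
A sketch of the local-accumulator idea in Java (class and method names
invented): each thread totals its own slice, and only the small per-thread
subtotals ever touch the master accumulator.

    class SliceSummer extends Thread {
        private final int[] data;
        private final int from, to;
        long subtotal;                          // local accumulator

        SliceSummer(int[] data, int from, int to) {
            this.data = data; this.from = from; this.to = to;
        }

        public void run() {
            for (int i = from; i < to; i++) subtotal += data[i];
        }

        static long sum(int[] data, int nThreads) throws InterruptedException {
            SliceSummer[] workers = new SliceSummer[nThreads];
            int chunk = data.length / nThreads;
            for (int i = 0; i < nThreads; i++) {
                int from = i * chunk;
                int to = (i == nThreads - 1) ? data.length : from + chunk;
                workers[i] = new SliceSummer(data, from, to);
                workers[i].start();
            }
            long total = 0;                     // master accumulator
            for (SliceSummer w : workers) {
                w.join();                       // tally, station by station
                total += w.subtotal;
            }
            return total;
        }
    }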

Consider the Java heapsort I posted yesterday. It works analogously to an
office hierarchy, with employees fighting each other for dominance. The
comparisons could be handled by one CPU per "employee", each fighting
independently with its near neighbours with relatively little RAM conflict.
The transformations of sequential Java-think applications to
parallel-think are just too much for the human mind to wrap around just
now. However, eventually we will have optimisers that CAN do just that.
We will specify our algorithms as if they were sequential. Optimisers will
then extract every last drop of parallelism out of them. This will likely
flow from mathematical work in automated theorem proving. Java is most
likely to be the springboard for the process.

Why Java? Java is also a social phenomenon. It is bringing together people
from a wide array of academic and non-academic backgrounds to
cross-fertilise ideas. It means an abundance of fresh approaches. The JVM
instruction set is simple enough to admit automated analysis. The Byte
Code Verifier is already a modern miracle in the depth of what it can
determine just by studying the byte codes. The stakes are high. Whoever
can implement the JVM most efficiently is in a position to take the whole
market. Java takes down the "tariff trade barriers" that lock people into
one vendor.

I think we will eventually evolve to the idea of objects being intelligent
beasts with their own private CPUs, each interacting fully simultaneously
potentially with any other object in the known universe.

Much of our programming will be done with COLLECTIONS as the atomic units.
These collections may be geographically dispersed. The parallelism comes
mainly from processing all the elements of the collection at once in
SQL-think.

A present day model for future parallel architectures would be the
Comp.Lang.Java newsgroup itself. You broadcast a request for information
to a subset of humanity likely to know the answer. Those that know respond
either publicly, or privately through email. Some of those people may have
done some research on their own to help you. In return, you answer queries
about your own domain of expertise. Very busy people generating
important/popular information, such as Mr. Gates, don't interact directly,
but through proxies who have mirrored copies of some subset of their
knowledge base.

Looked at from a distance, the Internet is humanity's second major attempt
at building a parallel processing computer out of a hybrid of hardware and
wetware. The first was the phone system.

Brad Rodriguez

Jun 8, 1996

Brian N. Miller wrote:
> At first peek, the Harris RTX 2010
> (http://www.semi.harris.com/datasheets/rh/hs-rtx2010rh/)
> looks like a likely candidate for the JVM. It's got these
> promising features, quoting from the data sheet:

Except that the RTX series is essentially "dead" at Harris, unless you
want to design your own chip (I believe Harris put the RTX CPU into their
standard cell library). BTW, I think the RTX 2010 is an RTX 2000 minus
the single-cycle hardware multiplier.

Better candidates might be Patriot Scientific's Shboom, the SC32 from
Silicon Composers, or Chuck Moore's MuP21 (or its variants).
--
Brad Rodriguez b...@headwaters.com Computers on the Small Scale
This brain for rent -- inquire within.
Contributing Editor, The Computer Journal... http://www.psyber.com/~tcj
Director, Forth Interest Group........... http://www.forth.org/fig.html
1996 Rochester Forth Conference: June 19-22 in Toronto, Ontario
http://maccs.dcss.mcmaster.ca/~ns/96roch.html

Nik Shaylor - Firenze - Italia

Jun 9, 1996

I haven't got the details, but as I recall the Transputer has this sort of
architecture. BTW, does anyone know how it is doing these days?

Nik Shaylor

Dan Young

Jun 10, 1996

John Ahlstrom wrote:
>
> Brian N. Miller (b...@indica.bbt.com) wrote:
> : In article <4p5dlv$j...@news2.delphi.com>, Roedy Green <ro...@BIX.com> writes:
> -- snip snip
>
> : Using stack instead of GPRs might actually reduce the potential for
> : scheduling optimisation. This might be because each stack-based
> : instruction might tend to be heavily dependent on the results of
> : the previous one. Stack operations almost always alter TOS, and
> : the next instruction almost always relies on availability of new
> : data on TOS. So stack operations might over-constrain instruction
> : ordering.
>
> : |It is only a matter of time till we have
> : |hardware optimised to the transistor level to handle each JVM op-code.
>
> : See http://www.ptsc.com/ and http://www.sun.com/sparc/java/index.html
>
> It would be nice if some Burroughs A-Series compiler jocks
> kicked in on this thread.
>

The Unisys A Series architecture is stack-based, but the Program Code Unit
(PCU) decodes instructions into 'micro-ops' which map down to a GPR kind of
architecture. Because programmers never directly address registers at the
Instruction Architecture Level, the hardware designers have a great deal of
freedom in trading off cost vs. performance.

A lot of the techniques used in the Intel Pentium Pro (out-of-order and
speculative execution, for example) were pioneered on the A Series at least
15 years ago.

---
Dan Young
Unisys Canada
Internet: yo...@po1.ny.unisys.com
Net^2: 462-4470 Off-Net: (416)495-4470

Alastair Mayer

Jun 10, 1996, to Dan Young

Dan Young wrote:
> John Ahlstrom wrote:
> > Brian N. Miller (b...@indica.bbt.com) wrote:
> > : Using stack instead of GPRs might actually reduce the potential for
> > : scheduling optimisation. This might be because each stack-based
[snip]

> >
> > It would be nice if some Burroughs A-Series compiler jocks
> > kicked in on this thread.

> The Unisys A Series architecture, is stack-based, but the Program Code Unit
> (PCU) decodes instructions into 'micro-ops' which map down to a GPR kind of
> architecture. Because, at the Instruction Architecture Level, programmers
> never directly address registers the hardware designers have a great deal of
> freedom in trading off cost vs. performance.

> A lot of the techniques used in the Intel Pentium Pro; out-of-order, and
> speculative execution for example were pioneered on the A Series at least 15
> years ago.

Assuming that the Unisys A-series is the successor to the Burroughs B6700
(and B5500 before that) line, some of this stuff goes back more than 25 years.

Stack-based machines are ideal for executing high-level (block oriented)
languages -- the B6700 (etc) is essentially an Algol machine. (Although Algol
is more purely stack-based: all memory allocation is static. You can also
nest procedure declarations.)

Hence the Java VM is stack-oriented. However, that's not to say that the
micro-machine which interprets the stack-oriented op-codes (eg Java bytecode,
Burroughs code, etc) at the hardware (or microprogram) level should be oriented
that way. This is where approaches such as Unisys's (and possibly Sun's Java
chips - we'll see) come into play.

Hmm, I wonder if Intel will revive their i432 Ada chip for Java?

-- Alastair
---------------------------------------------------------------------
Alastair Mayer The above is not intended to represent
mailto:a...@bix.com the opinion of US West, CTG, BIX, etc...

Michael D. Nahas

Jun 10, 1996

Anyone looking at implementing a stack-based machine,
such as building a processor for the Java Virtual Machine,
should look at the AT&T CRISP. It was a stack-based
CPU optimized for executing C code on a stack. The
CPU's internal registers were limited to things such
as the stack pointer and the program counter. Almost
all operations were stack to stack.
The top 128 bytes of the stack were cached in
a write-back manner in the implementation described in
the paper I have. This design made function calls very
fast, but context switches slow.
The two papers I've read on the CRISP are
"Register Allocation for Free: The C Machine Stack Cache"
by David R. Ditzel and H. R. McLellan, and
"Compiling for the CRISP Microprocessor"
by Sumit Bandyopadhyay, Vimal Begwani, and Robert Murray.
There is a book out which is a collection of papers
on the CRISP processor. Try searching under the name
"Thomas G. Szymanski".

Mike


--
/********************************************************\
* Michael David Nahas na...@virginia.edu *
* Graduate Student in CS University of Virginia *
\********************************************************/

Paul DeMone

Jun 11, 1996

Alastair Mayer wrote:
> Stack-based machines are ideal for executing high-level (block oriented)
> languages -- the B6700 (etc) is essentially an Algol machine. (Although Algol
> is more purely stack-based: all memory allocation is static. You can also
> nest procedure declarations.)

The B6700 is an example of a machine with (what were then thought to be)
theoretical advantages over its contemporary antithesis, the CDC 6600; it
had much more compact programs, needed less instruction fetch bandwidth,
etc. Unfortunately, in real life the "proto-RISC" CDC 6600 could run even
Algol programs much faster than the B6700. The problems with stack
machines are that
1) intermediate computation results that might be reused are lost, and
2) the top of stack becomes a bottleneck, especially in superscalar
implementations.

A more contemporary example of the failure of a stack machine is the
SGS (formerly Inmos) T9000 transputer. This 3.3 million transistor
chip occupied 200 mm2 in 1.0 micron CMOS. At 50 MHz this superscalar chip
cranked out only 36 "VAX MIPS" (yes, I know, but I haven't seen any
better metric for this device). The MIPS R3000, available years
earlier, could exceed this at a lower clock rate using 1/30th of the
logic.

Designing processors "optimized" for specific languages has tantalized
computer architects for many years. Unfortunately the economics and
volumes of general purpose machines pushes them ahead so fast as to
negate the modest advantages of a language specific machine. As a grad
student I had a Symbolics 3600 Lisp machine to play with. This was a
great machine but even then 68020-based general purpose workstations
were becoming available that could run Lisp programs faster for much
less money.

I suspect with the rapidly improving technology of just-in-time (JIT)
Java compilers that the market window for Java chips is sliding shut if
it ever really existed in the first place.

all opinions strictly my own.

--
Paul W. DeMone The 801 experiment SPARCed an ARMs race to put
Kanata, Ontario more PRECISION and POWER into architectures with
pde...@tundra.com MIPSed results but ALPHA's well that ends well.

Satan's CPU? The P666 of course - check out Appendix H(ell) under NDA


YCYANG

Jun 12, 1996

From John Ahlstrom (jahl...@cisco.com) article:

: It would be nice if some Burroughs A-Series compiler jocks
: kicked in on this thread.

Did I miss something here? From my understanding of the
Java virtual machine, a recompile of the Java class code
is possible and could take advantage of the parallelism
within the code sequence.

Is there really a need to build a Java-native CPU? I
mean, a GPR machine is superior in terms of parallelism.
A fast register relocation and code conversion is possible,
and an interesting topic.

Besides, a CPU optimized for Java does not mean a CPU optimized
for the applications. Direct implementation of the instruction
set and programming model MAY NOT be a good idea.

Any comment is welcome.

--
Y.C. Yang, NCTU, Taiwan.


John R. Mashey

Jun 12, 1996

A slight commercial: there will be an interesting panel on this topic at
Hot Chips in August: see
http://www.hot.org/hotchips
--
-john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: ma...@sgi.com
DDD: 415-933-3090 FAX: 415-967-8496
USPS: Silicon Graphics 6L-005, 2011 N. Shoreline Blvd, Mountain View, CA 94039-7311

John Wilson

Jun 12, 1996

m821...@athletes.EE.NCTU.edu.tw (YCYANG) wrote:

>Is there really the need to build a Java-native CPU? I
>mean, GPR machine is superior in terms of parallelism.
>A fast register-relocation and code convert is possible
>and an interesting topic.
>
>Besides, CPU optimized to Java does not mean CPU optimized
>to the applications. Direct implement of the instruction
>set and programming model MAY NOT be a good idea.
>

Focusing on the instruction set of a machine designed to run Java
optimally seems to me to be a mistake. As you observe, it is quite
possible to convert the code to an instruction set optimized for
execution speed rather than compactness.

Bill Joy (in one of the keynotes at JavaOne) claimed that future high
performance JVM implementations might well take advantage of the fact
that some compile time variables are run time constants, implying JIT
style JVM implementations.

However, I do think that the memory management aspect of processor
design could be optimized for the JVM. Perhaps it is time to revisit
the designs of Capability machines (the Chicago Magic Number machine,
the Cambridge CAP machine, the Plessey PP250, the IBM System/38, the
Intel 432) to see if we can rediscover hardware memory management
techniques that optimise JVM object access and support garbage
collection.
__________________________________________________________
John Wilson - The Wilson Partnership
5 Market Hill
Whitchurch Phone +44 1296 641072
Aylesbury Fax +44 1296 641874
Buckinghamshire Mobile +44 973 222089
HP22 4JB
UK
__________________________________________________________

Zalman Stern

Jun 12, 1996

In article <4plb2g$5...@news.csie.nctu.edu.tw>,
m821...@athletes.EE.NCTU.edu.tw (YCYANG) wrote:
> Did I miss something here? From my understanding of the
> JAVA virtual machine, a recompile of the Java class code
> is possible and could take advantage of the parallelism
> within the code sequence.

The translated code has to end up somewhere. If one is targeting really
low-end embedded systems (which picoJava is and microJava may be), memory
will be the dominant cost factor. If doing a Java CPU saves a meg or two
of RAM, it's well worthwhile in this market. There is also a greater
overhead the first time the code is executed with a compile-and-go
implementation. Java chips eliminate this problem.

Zalman Stern, Caveman Programmer, Macromedia Video Products, (415) 378 4539
3 Waters Dr. #100, San Mateo CA, 94403, zal...@macromedia.com
If you squeeze my lizard, I'll put my snake on you -- Lemmy

David Collier-Brown

Jun 12, 1996

John Wilson wrote:
> Focusing on the instruction set of a machine designed to run Java
> optimally seems to me to be a mistake. As you observe, it is quite
> possible to convert the code to an instruction set optimized for
> execution speed rather than compactness.

The ease with which one can transform from a regular form (like
a stack machine) to something which is semantically very different (an
ancient accumulator-and-index-registers machine) was once well known.

It strikes me I have a copy of Welsh and McKeague (sp?) ``Structured
Systems Programming'' sitting on the shelf at home, with a very good
example of converting a stack-like P-code into ICL 1900 machine code
via a simple register-history mechanism, all written in Concurrent
Pascal.

I suspect an interpretable intermediate language is a **good thing**
for future Java compilers, subject to the assumption that the
intermediate language is designed for ease of compilation, not just
execution.

--dave
--
David Collier-Brown, | Always do right. This will gratify some people
185 Ellerslie Ave., | astonish the rest. -- Mark Twain
Willowdale, Ontario | dav...@hobbes.ss.org, unicaat.yorku.ca
N2M 1Y3. 416-223-8968 | http://java.science.yorku.ca/~davecb

John Ahlstrom

Jun 12, 1996

Paul DeMone (pde...@tundra.com) wrote:

: Alastair Mayer wrote:
: > Stack-based machines are ideal for executing high level (block oriented)
: > languages -- the B6700 (etc) is essentially an Algol machine. (Although Algol
: > is more purely stack-based: all memory allocation is static. You can also
: > nest procedure declarations.)

: The B6700 is an example of a machine with (then thought) theoretical
: advantages over its contemporary antithesis the CDC6600; it had much
: more compact programs, needed less instruction fetch bandwidth etc.
: Unfortunately in real life the "proto-RISC" CDC6600 could run even
: Algol programs much faster than the B6700. The problem with stack
: machines is that
: 1) intermediate computation results that might be reused are lost and
: 2) the top of stack becomes a bottleneck, especially in superscalar
: implementations.

I agree with most of your overall conclusions but your observations
of the 6600-6700 comparison are suspect.

The reports that I have seen compare the 5500 to the 6600 not the 6700 to
the 6600. Do you have 6700 to 6600 comparisons?

The 6600 CPU had 10 (or was it more?) functional units, a CPU cycle
of 100 nsec, and many more interleaved memories than the 6700.

The 6700 CPU had 1 functional unit and a CPU cycle time of 200 nsec.

I don't know the transistor counts of the two machines, but that would
be a fascinating comparison.

-- snip snip


: I suspect with the rapidly improving technology of just-in-time (JIT)
: Java compilers that the market window for Java chips is sliding shut if
: it ever really existed in the first place.

I agree

: all opinions strictly my own.

: --
: Paul W. DeMone The 801 experiment SPARCed an ARMs race to put
: Kanata, Ontario more PRECISION and POWER into architectures with
: pde...@tundra.com MIPSed results but ALPHA's well that ends well.

: Satan's CPU? The P666 of course - check out Appendix H(ell) under NDA

Alastair Mayer

Jun 12, 1996, to John Ahlstrom

John Ahlstrom wrote:

>
> Paul DeMone (pde...@tundra.com) wrote:
> : The B6700 is an example of a machine with (then thought) theoretical
> : advantages over its contemporary antithesis the CDC6600; it had much
> : more compact programs, needed less instruction fetch bandwidth etc.
> : Unfortunately in real life the "proto-RISC" CDC6600 could run even
> : Algol programs much faster than the B6700. The problem with stack
[snip]

>
> I agree with most of your overall conclusions but your observations
> of the 6600-6700 comparison are suspect.
>
> The reports that I have seen compare the 5500 to the 6600 not the 6700 to
> the 6600. Do you have 6700 to 6600 comparisons?
>
> The 6600 cpu had 10 (or was it more) functional units and a CPU cycle
> of 100 nsec and many more interleaved memories than the 6600.

Don't know about the CDC 6600, but the Cyber 170 series (later model,
essentially same architecture) had one CPU (60-bit, later 64-bit, word)
and up to 10 PPUs - peripheral processor units, with a more limited
instruction set mainly for I/O. Don't recall cycle time, 100ns or
less is the right ballpark.



> The 6700 cpu had 1 functional unit and a cpu cycle time of 200 nsec

The 6700 machine could take up to 4 CPUs. It had separate IO processors,
not as tightly integrated as CDC's PPUs (I'm a little more fuzzy on this
part of the architecture.) 48-bit word. Using multiple processors
in a single user program was up to the programmer, who really could only
multithread his program, not guarantee simultaneous access to multiple
CPUs.

Technology level comparable to the Cyber series would be more on the
order of a B6800 or B6900 (later models; the 'bigger faster' B6700 was
the B7700). I don't recall the basic cycle time -- instruction time
was highly variable: up to something like 3 seconds (yes, seconds) if
you did something like deliberately invoke the LLLU (linked-list-lookup)
operator on a circular list where it wouldn't find the target, before
the "instruction time-out" interrupt kicked in. (Me, do a thing like
that? :-) Talk about your complex instruction sets...

> I don't know the transistor counts of the two machines, but that would
> be a fascinating comparison.

It may not be meaningful. I don't know about the 6600, but I've
heard that in some of the later machines that Seymour Cray designed,
a surprisingly high percentage of the gates are in there just to
slow down certain signal paths to keep everything synchronized.

[more snipped]

Bryan O'Sullivan

unread,
Jun 12, 1996, 3:00:00 AM6/12/96
to

In article <4po1ti$n...@news.csie.nctu.edu.tw>
m821...@athletes.EE.NCTU.edu.tw (YCYANG) writes:

m> Well, if the targeted market is at "really low-end embedded
m> systems", there has been so many low-cost micro-controllers.

This misses the point; such microcontrollers and other embedded frobs
don't run JVM bytecodes.

m> Besides I doubt the saving of 1-2 meg RAM..

This can make a big difference if you are trying to cut design and
component costs on a low-cost device.

<b

--
Let us pray:
What a Great System. b...@eng.sun.com
Please Do Not Crash. b...@serpentine.com
^G^IP@P6 http://www.serpentine.com/~bos

Bohdan Tashchuk

unread,
Jun 12, 1996, 3:00:00 AM6/12/96
to

In <4pjs0g$i...@kannews.ca.newbridge.com> Paul DeMone <pde...@tundra.com> writes:
>I suspect with the rapidly improving technology of just-in-time (JIT)
>Java compilers that the market window for Java chips is sliding shut if
>it ever really existed in the first place.

There never was a "market window" for Java chips.

The only thing that exists is a "market HYPE window" that Sun is using
in a transparent attempt to manipulate its stock price. Virtually ANY
association with "the Internet" is worth lots of $$$s in today's market.

--
Bohdan The Failed Clinton Presidency - America Held Hostage - Day 1240

John Bayko

unread,
Jun 13, 1996, 3:00:00 AM6/13/96
to

In article <31BC77...@virginia.edu>,

Michael D. Nahas <na...@virginia.edu> wrote:
>Anyone looking at implementing a stack based machine,
>such as building a processor for the Java Virtual Machine,
>should look at the AT&T CRISP. It was a stack based
>CPU optimized for performing C code on a stack. [...]

It's more of a memory based machine, with a cache for stack memory
only - operations take operands from memory and store the results to
memory, except when that memory is in the top whatever many words of
the stack.
The JVM is based on a stack, plus a pointer to memory for other
variables (an index register, which the CRISP/Hobbit lacks). Almost
all operations involve the stack directly - for example, the ADD
instruction pops two stack elements, adds them, then puts the result
back on the stack in a single operation.
In the CRISP, you'd have to specify the operands relative to the
stack pointer, perform the operation (one instruction), but then
readjust the stack pointer, which is another operation (at least -
more if you have to fill cache from memory) - and takes two instruction
words at best, or 8 bytes. On the JVM, it's a single byte.
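
To make that concrete, here's a rough sketch (bytecode mnemonics as in
Sun's published JVM spec; exact javac output may differ slightly):

    // Java source
    int sum(int a, int b) {
        return a + b;
    }

    // compiles to four one-byte JVM instructions:
    //   iload_1   push a
    //   iload_2   push b
    //   iadd      pop both, add, push the result
    //   ireturn   return the top of stack
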
The ShBoom is a recent, nicely designed stack oriented CPU which also
has a general register set and support for stack frames (unlike many
Forth chips), *and* byte coded instructions - pretty much everything a
Java Virtual Machine needs, and Patriot Scientific knows that, and is
promoting it as a Java platform.

--
John Bayko (Tau).
ba...@cs.uregina.ca
http://www.cs.uregina.ca/~bayko

YCYANG

unread,
Jun 13, 1996, 3:00:00 AM6/13/96
to

From Zalman Stern (zal...@macromedia.com) article:
: In article <4plb2g$5...@news.csie.nctu.edu.tw>,

: m821...@athletes.EE.NCTU.edu.tw (YCYANG) wrote:
: > Did I miss something here? From my understanding of the
: > JAVA virtual machine, a recompile of the Java class code
: > is possible and could take advantage of the parallelism
: > within the code sequence.

: The translated code has to end up somewhere. If one is targeting really
: low-end embedded systems (which picoJava is and microJava may), memory
: will be the dominant cost factor. If doing a Java CPU saves a meg or two
: of RAM, its well worthwhile in this market.

Well, if the targeted market is at "really low-end embedded systems",
there have been so many low-cost micro-controllers. I mean, if the
compactness of code density is so important, then why not use a
simplified 4-bit controller to do the job? Besides, I doubt the
saving of 1-2 meg RAM..

: There is also a greater
: overhead the first time the code is executed with a compile-and-go
: implementation. Java chips eliminate this problem.

Well, you are right on this one. Anyway, if you are on a very
low-bitrate channel, the recompile time would be a small fraction
of the total time.

Richard A. O'Keefe

unread,
Jun 13, 1996, 3:00:00 AM6/13/96
to

jahl...@cisco.com (John Ahlstrom) writes:
>I agree with most of your overall conclusions but your observations
>of the 6600-6700 comparison are suspect.

>The reports that I have seen compare the 5500 to the 6600 not the 6700 to
>the 6600. Do you have 6700 to 6600 comparisons?

>The [CDC] 6600 cpu had 10 (or was it more) functional units and a CPU cycle
>of 100 nsec and many more interleaved memories than the [B] 6700.

>The [B] 6700 cpu had 1 functional unit and a cpu cycle time of 200 nsec

From the Linpack User's Guide, 1979.
Table on p1.22:
ID  Facility         Computer        Compiler
D   UCSD             B6700           H [Odd!]
V   Northwestern U   CDC 6600        FTN 4.6, OPT=2
X   UoTex., Aust.    CDC 6600/6400   RUN

[The B6700 entry is odd, because there was only one compiler, _not_
called "H", but it did have a number of optimisation-related options.]
The timing data are in appendix B.

Time, N=100 (sec)
D 38.2
V 1.44
X 1.93

Before drawing any conclusions about stack-based -vs- GPR from this,
one should note that
(a) When I was an undergraduate, the university I was at had a B6700.
A CDC 6600 would have been *way* outside our price bracket.
(b) We were able to have four job streams running at the same time
(it might have been five; I can't recall if RJE was always running)
and that does mean four large compilers running at the same time,
on a machine which had very little memory by today's standards (it
_couldn't_ have more than 6Mb, and we had only a fraction of that;
I think about 1Mb) and maybe 20Mb of disc (I'm not sure of the
exact figure, but it was very hard to keep files on disc because
the operators had to keep purging "old" files)
(c) This is one comparison, for one program, for one language.
The B6700 was designed to support Fortran, COBOL, Algol, and to
some extent PL/I.
(d) The B6700 did bounds checking on every array access; even in Fortran.
You _couldn't_ switch it off. (Well, you could use 'VECTORMODE', but
I don't think the Linpack figure was obtained that way.)
There was also no notion of pointer arithmetic. This means that a
loop like
DO 10 I = 1, N
X(I) = A*Y(I) + B
10 CONTINUE
_couldn't_ be strength-reduced to pointer increments; the body of
the loop had to be
VALC I; VALC Y; VALC A; MULT; VALC B; ADD; VALC I; NAMC X; INDX; STOD
        ^implicit index operation here            explicit here^
Loop unrolling would have bought very little, because there is nothing
in the loop _body_ that can be shared with any other iteration.
(e) Integers and floating point numbers used the *same* representation
and the *same* ALU. In fact, integer arithmetic tended to be more
expensive than floating point, because
REAL X, Y, Z
X = Y + Z
=> VALC Y, VALC Z, ADD, NAMC X, STOD
but
INTEGER I, J, K
I = J + K
=> VALC J, VALC K, ADD, NTGR, NAMC I, STOD
i.e. there is an explicit "round to integer" instruction there.

It would not be at all surprising if these factors completely overwhelmed
any stack/register effects.

An interesting data point is that when the university in question got a
Prime 400, which had 16-bit (not 40-bit) integers and 32-bit (not 47-bit)
floats and which never checked an array index, and had lots of semiconductor
memory instead of creaking old ferrite cores,
- the B6700 *compiled* Fortran programs a *lot* faster than the P400
(yet the B6700 compiler was a large Algol program, and the P400
compiler was a small assembly code program which did _not_ do a lot
of optimising)
- the P400 *ran* Fortran programs quite a bit faster than the B6700.

Ever since, I have been suspicious of single benchmarks...

--
Fifty years of programming language research, and we end up with C++ ???
Richard A. O'Keefe; http://www.cs.rmit.edu.au/~ok; RMIT Comp.Sci.

Roger Peppe

unread,
Jun 13, 1996, 3:00:00 AM6/13/96
to

On Sun, 09 Jun 1996 14:22:18 GMT, Nik Shaylor - Firenze - Italia <nsha...@tcp.co.uk> wrote:
> I have not got the details, but as I recall the Transputer has this sort of
> architecture. BTW anyone know how it is these days?

it was going to turn into the T9000 with all whizbang
options, but Inmos were over-ambitious, so now it's dead.

one of the saddest stories of the last decade IMO.

i've been using the texas instruments TMS320C40, supposedly another
CSP machine, which is so crap by comparison it's not true.

the transputer architecture was based around a three level stack;
when the stack was empty, a context switch could happen.
hence almost zero time context switch (no registers to save)

are there any other simple and powerful parallel CPUs around
that might provide the same sort of functionality ?

rog. (all my own opinions)


David Shepherd

unread,
Jun 13, 1996, 3:00:00 AM6/13/96
to

Roger Peppe (r...@ohm.york.ac.uk) wrote:

: On Sun, 09 Jun 1996 14:22:18 GMT, Nik Shaylor - Firenze - Italia <nsha...@tcp.co.uk> wrote:
: > I have not got the details, but as I recall the Transputer has this sort of
: > architecture. BTW anyone know how it is these days?

: it was going to turn into the T9000 with all whizbang
: options, but Inmos were over-ambitious, so now it's dead.

The transputer is far from dead - large numbers of the T4xx and T8xx
still sell and the transputer architecture is now implemented in
the ST20 microcore family.

--
--------------------------------------------------------------------------
david shepherd
(speaking purely in a private capacity)
tel/fax: +44 1454 611638/617910 email: d...@bristol.st.com
"whatever you don't want, you don't want negative advertising"


Alexander Anderson

unread,
Jun 13, 1996, 3:00:00 AM6/13/96
to

In article <DsxvE...@uns.bris.ac.uk>, David Shepherd
<d...@pact.srf.ac.uk> writes

>The transputer is far from dead - large numbers of the T4xx and T8xx
>still sell and the transputer architecture is now implemented in
>the ST20 microcore family.


True, but SGS-Thomson, who took over from Inmos, didn't do it any
favours, to put it mildly.


I know that the Transputer/Occam mailing list based at Oxford is
full of talk about interfacing Java and the transputer.


The transputer is a great processor, and conception.


Sandy
--
// Alexander Anderson Computer Systems Student //
// sa...@almide.demon.co.uk Middlesex University //
// Home Fone: +44 (0) 171-794-4543 Bounds Green //
// http://www.mdx.ac.uk/~alexander9 London U.K. //

Magnus Redin

unread,
Jun 13, 1996, 3:00:00 AM6/13/96
to

d...@pact.srf.ac.uk (David Shepherd) writes:

>Roger Peppe (r...@ohm.york.ac.uk) wrote:

>: it was going to turn into the T9000 with all whizbang
>: options, but Inmos were over-ambitious, so now it's dead.

> The transputer is far from dead - large numbers of the T4xx and T8xx
> still sell and the transputer architecture is now implemented in the
> ST20 microcore family.

Z80 is also shipping in large numbers and is implemented in microcore
libraries...

Where are the new Transputers? It's dead but not cold and stiff yet.
Anyone for CPR?

Regards,
--
--
Magnus Redin Lysator Academic Computer Society re...@lysator.liu.se
Mail: Magnus Redin, Björnkärrsgatan 11 B 20, 584 36 LINKÖPING, SWEDEN
Phone: Sweden (0)13 260046 (answering machine) and (0)13 214600

Bill Mangione-Smith

unread,
Jun 13, 1996, 3:00:00 AM6/13/96
to

In article <4po1ti$n...@news.csie.nctu.edu.tw> m821...@athletes.EE.NCTU.edu.tw (YCYANG) writes:

From Zalman Stern (zal...@macromedia.com) article:
: In article <4plb2g$5...@news.csie.nctu.edu.tw>,
: m821...@athletes.EE.NCTU.edu.tw (YCYANG) wrote:
: > Did I miss something here? From my understanding of the
: > JAVA virtual machine, a recompile of the Java class code
: > is possible and could take advantage of the parallelism
: > within the code sequence.

: The translated code has to end up somewhere. If one is targeting really
: low-end embedded systems (which picoJava is and microJava may), memory
: will be the dominant cost factor. If doing a Java CPU saves a meg or two
: of RAM, its well worthwhile in this market.

Well, if the targeted market is at "really low-end embedded systems",
there have been so many low-cost micro-controllers.

I'm not familiar with a micro-controller that runs Java, though I would
be amused to hear of one. Were you thinking of the 8051 or the 6805?
Remember, to Sun Java *is* essential, and they think there is an embedded
market in the future.

Bill

David E. Fox

unread,
Jun 14, 1996, 3:00:00 AM6/14/96
to

In article <4pjs0g$i...@kannews.ca.newbridge.com>, Paul DeMone wrote:
>Alastair Mayer wrote:

>Algol programs much faster than the B6700. The problem with stack

>machines is that
>1) intermediate computation results that might be reused are lost and

On this one point:

The only stack-based machines I've used are HP programmable
calculators. However, in programming them I've noticed that one
can write very short subroutines to compute common subexpressions.

For example, if one notes that 5*sqrt(2) is used in several places,
write a subroutine that computes that, and then call it when
needed. The tradeoff is of course the number of instructions in
the subroutine vs. the subroutine call overhead (which, on HP
calculators isn't much, at least not in instructions).

Forth, of course, might engender a similar methodology. Other
HLLs don't.

So, if you reuse a particular intermediate result, write a
subroutine to generate it.
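
In Java terms the same trick would look something like this (a
hypothetical sketch; the name is made up for illustration):

    // factor the shared subexpression into one routine...
    static double fiveRootTwo() {
        return 5.0 * Math.sqrt(2.0);
    }
    // ...and call fiveRootTwo() wherever the value is needed,
    // paying a call per use but coding the expression only once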

>2) the top of stack becomes a bottleneck, especially in superscalar
>implementations.

No argument there - I'm not that familiar with this stuff to make
a comment. :)

>Designing processors "optimized" for specific languages has tantalized
>computer architects for many years. Unfortunately the economics and

I wouldn't mind a processor optimized for Linux and gcc :) though.

> Paul W. DeMone The 801 experiment SPARCed an ARMs race to put
> Kanata, Ontario more PRECISION and POWER into architectures with
> pde...@tundra.com MIPSed results but ALPHA's well that ends well.
>
>Satan's CPU? The P666 of course - check out Appendix H(ell) under NDA
>


--
------------------------------------------------------------------------
David E. Fox                  Tax        Thanks for letting me
df...@belvdere.vip.best.com   the        change magnetic patterns
ro...@belvedere.sbay.org      churches   on your hard disk.
-----------------------------------------------------------------------


Bert THOMPSON

unread,
Jun 14, 1996, 3:00:00 AM6/14/96
to

ze...@fasttech.com (Bohdan Tashchuk) writes:

|In <4pjs0g$i...@kannews.ca.newbridge.com> Paul DeMone <pde...@tundra.com> writes:
|>I suspect with the rapidly improving technology of just-in-time (JIT)
|>Java compilers that the market window for Java chips is sliding shut if
|>it ever really existed in the first place.

|There never was a "market window" for Java chips.

|The only thing that exists is a "market HYPE window" that Sun is using
|in a transparent attempt to manipulate its stock price. Virtually ANY
|association with "the Internet" is worth lots of $$$s in todays market.

Whoa! And I thought I took the cake for cynicism. 8^)
I think you're probably right about Sun trying to manipulate
their stock prices --- so is every other sane company.

However, I think there is definitely a market for Java chips.

Java is a well-designed language which, with small modifications, would
serve well in an embedded domain. (This has already been argued at
length in this forum. Recall also that Java was originally designed
for embedded use.)

Let's pretend that Java -is- good for embedded development.
Why then are Java chips useful?
- You get native speed without the overhead of a JIT.
- Java engine can be put in on-chip ROM, needing little
RAM. (Off-chip RAM == Much more cost for both
the RAM chip and the extra packaging.)
- You get free support for mobile code. Very little
bundling/unbundling of mobile code required.
(There are rumblings of an explosion in embedded
networking --- not just mobile phones and factory-
floor LANs.)

-If- Java catches on as a language for embedded work, -then-
Java chips will be useful. In fact, if chip makers start releasing
Java chips, then it will induce people to use Java for embedded stuff.

Personally, I could tolerate programming a 10mW 20MHz Java chip
with a dozen I/O lines and a few K of onboard RAM and PROM.
It ain't far off.

I have my own conspiracy theory on why Sun seem to be promoting only
Web-related Java. They probably already have a convincing lead
in developing other areas of application of Java.

Bert
---------------------------------------------------------------------------
Bert Thompson a...@cs.mu.oz.au
"This sentence is true."
---------------------------------------------------------------------------

Guy Harris

unread,
Jun 14, 1996, 3:00:00 AM6/14/96
to

David E. Fox <df...@belvdere.vip.best.com> wrote:
>For example, if one notes that 5*sqrt(2) is used in several places,
>write a subroutine that computes that, and then call it when
>needed. The tradeoff is of course the number of instructions in
>the subroutine vs. the subroutine call overhead (which, on HP
>calculators isn't much, at least not in instructions).

Umm, I don't think the original poster's concern was with the *code
space* overhead of not being able to reuse results, I think it was with
the *CPU time* overhead of not being able to reuse results, in which
case the tradeoff is the number of instructions to re-evaluate the
expression vs. the number of instructions needed to reload the cached
result.

One could, presumably, on a pure stack machine, pop the intermediate
result off the stack into a temporary variable, and then push it back on
the stack and, to reuse it, push the temporary variable.
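
(In JVM terms that's just a store/load pair around a local-variable
slot -- a rough sketch, with hypothetical names:

    static double x, y;
    static void f(double p, double q) {
        double t = 5 * Math.sqrt(2);   // evaluate once, dstore to a local
        x = t + p;                     // each reuse is just a dload
        y = t + q;                     // of that slot, not a re-evaluation
    }

so the "reload" cost is one cheap instruction per use.)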

(Of course, the compiler might well note that "5*sqrt(2)" takes only one
instruction to evaluate, namely the instruction to load/push/whatever
7.0710678115; generating code to call "sqrt()" with an argument of 2 in
that particular expression seems somewhat pointless, unless your
"sqrt()" implementation has a side-effect, or your compiler has no way
of determining that it doesn't.)

Mark Rosenbaum

unread,
Jun 14, 1996, 3:00:00 AM6/14/96
to

In article <4pjs0g$i...@kannews.ca.newbridge.com>,

Paul DeMone <pde...@tundra.com> wrote:
>Designing processors "optimized" for specific languages has tantalized
>computer architects for many years. Unfortunately the economics and
>volumes of general purpose machines pushes them ahead so fast as to
>negate the modest advantages of a language specific machine. As a grad
>student I had a Symbolics 3600 Lisp machine to play with. This was a
>great machine but even then 68020-based general purpose workstations
>were becoming available that could run Lisp programs faster for much
>less money.

I think the critical point here is volume. If the market for java chips
is large enough then it would justify the engineering effort to keep them
up to date with other processors.


Mark Rosenbaum Otey-Rosenbaum & Frazier
m...@netcom.com Consultants in High Performance and
(303) 727-7956 Scalable Computing and Applications
Boulder CO

Phil Koopman

unread,
Jun 14, 1996, 3:00:00 AM6/14/96
to

Paul DeMone <pde...@tundra.com> wrote:
> The problem with stack
>machines is that
>1) intermediate computation results that might be reused are lost and

Not true with a decent compiler. See:
http://www.cs.cmu.edu/~koopman/stckcomp/

>2) the top of stack becomes a bottleneck, especially in superscalar
>implementations.

There is no reason that this should be the case -- one could use
register renaming on the stack elements just as with registers. Of
course contemporary stack chips don't do this because they aren't
targeted for that kind of transistor budget.

-- Phil

Phil Koopman -- koo...@cs.cmu.edu -- http://www.cs.cmu.edu/~koopman

Alain Raynaud

unread,
Jun 17, 1996, 3:00:00 AM6/17/96
to

Bert THOMPSON wrote:
>
> Java is a well-designed language which, with small modifications, would
> serve well in an embedded domain. (This has already been argued at
> length in this forum. Recall also that Java was originally designed
> for embedded use.)

I fail to understand what the point of a chip running Java is.

After all, java is a language (just like C or C++). Why does it make sense
to have a chip run Java and not run C++ ?

Why did everybody dismiss the idea of running a language through a compiler
before running it on any architecture ?


Ok, from what I got, the main reason is that you want to be able to
run java on the fly. From a network (does the word Internet ring
any bell ?).

I see Java as a language, plus a virtual machine. Why not translate
a virtual machine to APIs, and make this a standard ? Why go all
the way to bother to run this "native", when it doesn't make sense ?

I'm really interested in understanding all this.

PS: don't get me wrong, java chips are good for me. It will let my
company sell more of its product...

Alain.
--
-----------------------------------------------------------------------
Alain RAYNAUD META SYSTEMS
Batiment Hermes
4, rue Rene Razel
Tel: (33) 1 69 35 10 00 91400 Saclay - FRANCE
E-Mail: al...@meta-systems.fr Fax: (33) 1 69 35 10 10
-----------------------------------------------------------------------

Bryan O'Sullivan

unread,
Jun 17, 1996, 3:00:00 AM6/17/96
to

In article <31C5D2...@meta-systems.fr> Alain Raynaud
<al...@meta-systems.fr> writes:

a> I fail to understand what the point of a chip running Java is.

It saves you the cost of either interpreting or compiling to some
other native instruction set; this is significant in low-cost embedded
markets.

a> Why did everybody dismiss the idea of running a language through a
a> compiler before running it on any architecture ?

Nobody dismissed it. Interpreters, compilers and native silicon all
have their places on the price-performance curve, and some are clearly
better suited for adoption than others on particular segments of this
curve.

a> Ok, from what I got, the main reason is that you want to be able to
a> run java on the fly.

The main reason is that you want something that is both cheap and
pretty fast. If what you use is to be fast, you can't realistically
use interpretation. If it is to be cheap, both interpretation and
compilation are bad (the latter being worse) because of the cost
incurred in embedding an interpreter or compiler in ROM, NVRAM, or
whatever.

This has already been done to death. If the above does not make
things clear, please go back and read the rest of this thread.

John Bayko

unread,
Jun 17, 1996, 3:00:00 AM6/17/96
to

In article <31C5D2...@meta-systems.fr>,

Alain Raynaud <al...@meta-systems.fr> wrote:
>
>I fail to understand what the point of a chip running Java is.
> [...]

>Ok, from what I got, the main reason is that you want to be able to
>run java on the fly. From a network (does the word Internet ring
>any bell ?).

An example - NorTel announced that it wants to incorporate Java
chips in future telephones. They've been investigating for years the
idea of 'customer upgradable software' for low end equipment - for
example, if a customer wants to upgrade the phone to one which
supports call display, just call an upgrade number and download the
new software into the old phone, instead of having to bring the phone
in for trade-in.
Java chips allow this sort of upgrade to plug-in phones, cellular
phones - and more important, allows standard applets to be made
available to non-phone company providers. Want to order a pizza? Call
the pizza company's server, it'll download a cool menu (literally) to
your phone and allow you to order exactly what you want, and make the
customer aware of any special promotions.
And it does this with a $25 chip and 8-line LCD available to any
phone manufacturer, instead of a PC with Windows and Netscape which
takes longer just to boot up than the entire ordering process.

YCYANG

unread,
Jun 18, 1996, 3:00:00 AM6/18/96
to

From John Bayko (ba...@BOREALIS.CS.UREGINA.CA) article:

: >Ok, from what I got, the main reason is that you want to be able to
: >run java on the fly. From a network (does the word Internet ring
: >any bell ?).

: Java chips allow this sort of upgrade to plug-in phones, cellular
: phones - and more important, allows standard applets to be made
: available to non-phone company providers. Want to order a pizza? Call
: the pizza company's server, it'll download a cool menu (literally) to
: your phone and allow you to order exactly what you want, and make the
: customer aware of any special promotions.

Well, in my opinion, any microprocessor-based arch. could
do the same thing. You're describing an application of a
flexible, extensible environment. However, no one MUST
implement the environment using a Java-native machine.

Roedy Green

unread,
Jun 18, 1996, 3:00:00 AM6/18/96
to

Alain Raynaud wrote:
>I fail to understand what the point of a chip running Java is.

Here are some advantages, primarily in small machines.

1. ALL the code is compact. With any sort of JIT or compiler the code will
be 10 or more times larger. You don't need to waste ROM or RAM on a JIT
and space for the fluffed up code.

2. You don't have the overhead of a JIT, either in time or space. The code
is ready to go.

3. hardware may be optimised to the specific job rather than kludged around
some arthritic architecture. This means it can run light with low power
and silicon real estate.

4. there are opportunities in high speed Java chips for increased
parallelism and look-ahead.

My own guess is the future will belong to machines that are hybrids, they
will have special Java hardware, but will rely on JITs to take advantage of
stats dynamically gathered to generate highly optimised code, where
pipelining is planned mostly in advance rather than dynamically. JITs may
even refine code DURING execution.


Ro...@bix.com <Roedy Green of Canadian Mind Products> contract programming
-30-

Michael Koster

unread,
Jun 18, 1996, 3:00:00 AM6/18/96
to

ba...@BOREALIS.CS.UREGINA.CA (John Bayko) writes:

>In article <31C5D2...@meta-systems.fr>,
> Alain Raynaud <al...@meta-systems.fr> wrote:
>>
>>I fail to understand what the point of a chip running Java is.
>> [...]
>>Ok, from what I got, the main reason is that you want to be able to
>>run java on the fly. From a network (does the word Internet ring
>>any bell ?).

> An example - NorTel announced that it wants to incorporate Java
>chips in future telephones. They've been investigating for years the
>idea of 'customer upgradable software' for low end equipment - for

(...)

> And it does this with a $25 chip and 8-line LCD available to any
>phone manufacturer, instead of a PC with Windows and Netscape which
>takes longer just to boot up than the entire ordering process.

Kind of a nice alternative to the recently proposed $1400 WiNtel
PC, for those that don't have a burning desire to compute...

Michael.

Jeff Fox

unread,
Jun 18, 1996, 3:00:00 AM6/18/96
to

In article <31c1878c...@usenet.interramp.com> koo...@cs.cmu.edu
(Phil Koopman) writes:

Very true. In the F21 chip we are trying to keep the transistor budget
very small (<15k) so the stacks are on chip register arrays with
circular stack pointers. As you know stack locations on these chips
are not in addressable memory so they are not really designed for
supporting things like the stack frames in Java.
But the number of locations in the stack or the ability to address
stack locations as memory locations is a matter of adding a few more
transistors. On these designs it could not add more than a few cents
at most to the chip cost.
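
For the curious, the circular-stack-pointer idea amounts to something
like this (a minimal sketch in Java, purely illustrative -- not F21's
actual design):

    // on-chip stack as a small circular buffer: the pointer wraps
    // modulo the cell count, so a too-deep push silently overwrites
    // the oldest entry instead of trapping to memory
    final class CircularStack {
        private final int[] cells = new int[8]; // say, 8 on-chip cells
        private int top = 0;                    // circular stack pointer

        void push(int v) { top = (top + 1) & 7; cells[top] = v; }
        int pop()        { int v = cells[top]; top = (top - 1) & 7; return v; }
    }
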
As you also know one advantage of the on chip stack registers is that
stack access is not the bottleneck. On these chips stack access is
several times faster than memory access and we take advantage of this
by packing up to 4 instructions into a memory word on this cpu. But
the 2.5ns stack access is certainly not the bottleneck on this .8u
chip.
The Minimal Instruction Set Computer approach is not the stack
machine design today that maps most directly to a Java engine. But
this thread has also speculated about future very low power
very low cost chips designed for Java and I think the MISC
approach may find a niche there.

Jeff Fox
jf...@netcom.com Ultra Technology Inc.
jf...@dnai.com
je...@itvcorp.com
http://www.dnai.com/~jfox

Roedy Green

unread,
Jun 19, 1996, 3:00:00 AM6/19/96
to

John Bayko wrote:
>Java chips allow this sort of upgrade to plug-in phones, cellular
>phones - and more important, allows standard applets to be made
>available to non-phone company providers. Want to order a pizza? Call
>the pizza company's server, it'll download a cool menu (literally) to
>your phone and allow you to order exactly what you want.

The biggest timesaver will be the Bustel button. You hit a white button
with a red heart on your phone to send the other party your electronic
business card. You hit it twice and key in a pin number to send credit
card info. It will slash ordering time and ensure accurate deliveries and
allow fully automated entry of the information into the callee's database,
AND give you more time to flirt.

With Java-phones, your rapid-dial phone list, phone number, and phone
features can follow you around.

Kenny Ranerup

unread,
Jun 20, 1996, 3:00:00 AM6/20/96
to

>>>>> Roedy Green writes:
> Here are some advantages, primarily in small machines.
> 1. ALL the code is compact. With any sort of JIT or compiler the
> code will be 10 or more times larger. You don't need to waste ROM
> or RAM on a JIT and space for the fluffed up code.

+ some other points about JIT.

All the points you make are under the assumption that the binary code
must be portable between different machines and OSes. Then both JIT
and Java chips make sense. But if you don't need binary compatibility,
why compile into JVM code at all? You'll only lose performance and
possibly memory space.

Kenny Ranerup

Stefan Monnier

unread,
Jun 20, 1996, 3:00:00 AM6/20/96
to

In article <874toau...@organon.serpentine.com>,
Bryan O'Sullivan <b...@serpentine.com> wrote:
] a> Why did everybody dismiss the idea of running a language through a
] a> compiler before running it on any architecture ?
] Nobody dismissed it. Interpreters, compilers and native silicon all
] have their places on the price-performance curve, and some are clearly
] better suited for adoption than others on particular segments of this
] curve.

Of course nobody dismissed it, but it looks like people tend to forget that the
processor is supposed to run the JVM instructions, not the Java language. So
even though the compiler from Java to byte-codes is simple, there *is* a
compilation step already. The only case where you can say that the compilation
is avoided is when you receive the byte-code from somewhere else and you
couldn't receive something else instead. In other words, it's only interesting
for WWW.

Now why should an embedded processor run an instruction set that was never
designed to be run by a processor, but by a virtual machine ?

This Java thing is really the opposite of Computer Architecture. Rather than
design an ISA to get better performance (whatever your performance metric is),
just take a random inadapted ISA and just build a processor for it.


Stefan

Alex Colvin

unread,
Jun 20, 1996, 3:00:00 AM6/20/96
to

>With Java-phones, your rapid-dial phone list, phone number, and phone
>features can follow you around.

Hah! not only that, but businesses will insert their numbers into your list,
market research companies will grab your rapid-dial list, and MakeMoneyFast
will send them chain letters...

--
Alex Colvin
alex....@dartmouth.edu


Bryan O'Sullivan

unread,
Jun 20, 1996, 3:00:00 AM6/20/96
to

In article <4qbcdo$h...@info.epfl.ch> "Stefan Monnier"
<stefan....@lia.di.epfl.ch> writes:

s> So even though the compiler from Java to byte-codes is simple,
s> there *is* a compilation step already.

I am, perhaps, failing to see your point. Yes, there is already a
compilation step. However, it only happens once; it happens somewhere
other than on the end-user's machine, toaster, or frob; and for
non-trivial programs, .class files are just about always more compact
than the source code that describes them.

s> In other words, it's only interesting for WWW.

Joe User will *always* get the bytecodes from somewhere else, whether
it be over the Web, down the POTS from his phone provider, or
whatever.

s> Now why should an embedded processor run an instruction set that
s> was never designed to be run by a processor, but by a virtual
s> machine ?

I don't think anyone is claiming that the JVM maps particularly well
to hardware, because such a claim would be silly. However, it maps
well enough that hardware that runs bytecodes directly will perform
just fine for the kind of markets at which picoJava is targeted.
Note, by the way, that Sun has made no claims that higher-performance
members of the Java processor family will execute bytecodes directly.

s> This Java thing is really the opposite of Computer Architecture.

This isn't true; the constraints and tradeoffs to be made are
considerably different to those considered during the design of "speed
at any price" processors, but that's all.

John Bayko

unread,
Jun 20, 1996, 3:00:00 AM6/20/96
to

In article <4q53q5$s...@news.csie.nctu.edu.tw>,

YCYANG <m821...@athletes.EE.NCTU.edu.tw> wrote:
>
>: Java chips allow this sort of upgrade to plug-in phones, cellular
>: phones - and more important, allows standard applets to be made
>: available to non-phone company providers. Want to order a pizza?
>
>Well, in my opinion, any microprocessor-based arch. could
>do the same thing. You're describing an application of a
>flexible, extensible environment. However no one MUST
>implement the environment using Java-Native machine.

No, but Java/JVM provides a standard (important), and a Java
processor makes it cheaper than emulating a JVM or recompiling Java
bytecode - both of those require extra ROM or RAM at least. If your
goal is low cost and high volume, companies even work on reducing the
number of screws in such appliances.

David Hopwood

unread,
Jun 20, 1996, 3:00:00 AM6/20/96
to

[comp.sys.unisys removed from Newsgroups]

In article <87lohix...@organon.serpentine.com>,
Bryan O'Sullivan <b...@serpentine.com> wrote:

>In article <4qbcdo$h...@info.epfl.ch> "Stefan Monnier"
><stefan....@lia.di.epfl.ch> writes:
>
>s> So even though the compiler from Java to byte-codes is simple,
>s> there *is* a compilation step already.
>
>I am, perhaps, failing to see your point. Yes, there is already a
>compilation step. However, it only happens once; it happens somewhere
>other than on the end-user's machine, toaster, or frob; and for
>non-trivial programs, .class files are just about always more compact
>than the source code that describes them.

Yes, but only by a factor of about 2 (according to 'du', the JDK classfiles
are 1600K, and the corresponding source 3023K).

>s> In other words, it's only interesting for WWW.
>
>Joe User will *always* get the bytecodes from somewhere else, whether
>it be over the Web, down the POTS from his phone provider, or
>whatever.
>
>s> Now why should an embedded processor run an instruction set that
>s> was never designed to be run by a processor, but by a virtual
>s> machine ?
>
>I don't think anyone is claiming that the JVM maps particularly well
>to hardware, because such a claim would be silly.

Precisely. More to the point, for non-network embedded use there is no
disadvantage to compiling for a specific architecture.

David Hopwood
david....@lmh.ox.ac.uk

calvin

unread,
Jun 20, 1996, 3:00:00 AM6/20/96
to

The same thing goes for why we should have MPEG decoder chips. I got
a software MPEG player.

Or Modem hardware compression for that matter.

Why is Intel putting in all the 3D and Multimedia in the Pentium Pro
chip?

Why graphics accelerator boards, can't we do that with software?

There ya go.

b...@serpentine.com (Bryan O'Sullivan) wrote:

>In article <31C5D2...@meta-systems.fr> Alain Raynaud
><al...@meta-systems.fr> writes:

>a> I fail to understand what the point of a chip running Java is.

>It saves you the cost of either interpreting or compiling to some
>other native instruction set; this is significant in low-cost embedded
>markets.

>a> Why did everybody dismiss the idea of running a language through a
>a> compiler before running it on any architecture ?

>Nobody dismissed it. Interpreters, compilers and native silicon all
>have their places on the price-performance curve, and some are clearly
>better suited for adoption than others on particular segments of this
>curve.

>a> Ok, from what I got, the main reason is that you want to be able to
>a> run java on the fly.

>The main reason is that you want something that is both cheap and
>pretty fast. If what you use is to be fast, you can't realistically
>use interpretation. If it is to be cheap, both interpretation and
>compilation are bad (the latter being worse) because of the cost
>incurred in embedding an interpreter or compiler in ROM, NVRAM, or
>whatever.

>This has already been done to death. If the above does not make
>things clear, please go back and read the rest of this thread.

> <b

Chris Pirih, proverbs at wolfenet dot com

unread,
Jun 20, 1996, 3:00:00 AM6/20/96
to

In article <4qcfkm$e...@sue.cc.uregina.ca>,

ba...@BOREALIS.CS.UREGINA.CA (John Bayko) wrote:
| No, but Java/JVM provides a standard (important), and a Java
| processor makes it cheaper than emulating a JVM or recompiling Java
| bytecode - both of those require extra ROM or RAM at least. If your
| goal is low cost and high volume, companies even work on reducing the
| number of screws in such appliances.

So Sun makes a Java microprocessor, and we end up with yet another
architecture for embedded processors. What makes Java more attractive
in that domain than, say, ARM or MIPS?

---
chris

Kym Horsell

unread,
Jun 21, 1996, 3:00:00 AM6/21/96
to

In article <4padgl$a...@cronkite.cisco.com>,
John Ahlstrom <jahl...@cisco.com> wrote:
>It would be nice if some Burroughs A-Series compiler jocks
>kicked in on this thread.

Someone have the number of the home?

--
R. Kym Horsell
KHor...@EE.Latrobe.EDU.AU k...@CS.Binghamton.EDU
http://WWW.EE.LaTrobe.EDU.AU/~khorsell http://CS.Binghamton.EDU/~kym

Torben AEgidius Mogensen

unread,
Jun 21, 1996, 3:00:00 AM6/21/96
to

Roedy Green <ro...@BIX.com> writes:

>Alain Raynaud wrote:
>>I fail to understand what the point of a chip running Java is.

>Here are some advantages, primarily in small machines.

>1. ALL the code is compact. With any sort of JIT or compiler the code will
>be 10 or more times larger. You don't need to waste ROM or RAM on a JIT
>and space for the fluffed up code.

Two points here: 1) 10x blow-up is being very pessimistic. A good JIT
compiler could make it between 2-5 times on average. 2) You can make a
JIT compiler work with a relatively small code area, which works like
a cache of compiled code. This way, the memory usage can be (almost)
as little as you like, larger space just gives you better speed (up to
a point).
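
The cache idea, sketched in (much later) Java syntax purely to
illustrate -- a real JIT would key on code addresses and hold machine
code, and the names here are made up:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // bounded cache of translated fragments; when the code area fills,
    // the least-recently-used fragment is evicted and simply gets
    // re-translated the next time it is needed
    final class CodeCache extends LinkedHashMap<String, byte[]> {
        private final int maxFragments;
        CodeCache(int maxFragments) {
            super(16, 0.75f, true);       // true = access order (LRU)
            this.maxFragments = maxFragments;
        }
        protected boolean removeEldestEntry(Map.Entry<String, byte[]> e) {
            return size() > maxFragments;
        }
    }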

>2. You don't have the overhead of a JIT, either in time or space. The code
>is ready to go.

If you download the Java byte code, you have plenty of time to do the
compilation during the waits. If you use the cache idea described
above, you will translate the code in small pieces as they are
needed. Hence, there will only be a marginal slow-down when the
program is started.

>3. hardware may be optimised to the specific job rather than kludged around
>some arthritic architecture. This means it can run light with low power
>and silicon real estate.

Not all other architectures are 'arthritic'. Also, I doubt that the
power and silicon real estate are smaller than for simple RISCs.

>4. there are opportunities in high speed Java chips for increased
>parallelism and look-ahead.

As there are in any other processor doing JIT-compilation or
interpretation of Java byte code.

>My own guess is the future will belong to machines that are hybrids, they
>will have special Java hardware, but will rely on JITs to take advantage of
>stats dynamically gathered to generate highly optimised code, where
>pipelining is planned mostly in advance rather than dynamically.

It sounds costly to provide both Java and 'normal' instruction sets.

>JITs may even refine code DURING execution.

This is just an extension of the cache idea above.

Torben Mogensen (tor...@diku.dk)

( )

unread,
Jun 21, 1996, 3:00:00 AM6/21/96
to

Presumably, memory footprint (assuming that the cost of the microprocessors
is on par). Directly executing Java bytecodes, compared to compiled Java
code, will always take less memory (by how much...is up for debate).

As you'd agree, the Java microprocessor is not slated to fight the
specint/specfp war, or else one wouldn't employ a stack machine.
We have the SPARC family for that battlefront.

What the Java microprocessor is slated for is to execute the Java language
in as compact a system as possible, where "system" is defined as the
combination of JavaOS + embedded microprocessor + memory for
any given application domain.

What might make it more attractive than ARM/MIPS/ShBoom/other processors
is (in theory at least) the compactness of the system.

Also in theory at least, by virtue of its compactness, it is also expected
to be lowest cost.

What may make the competitor's microprocessor more attractive, on the
other hand, is if:
a) they find no compelling need to develop their application sw in Java
b) their overall system cost (javaos + their microprocessor + memory)
actually comes out lower (such as the microprocessor costing only $0.50
(yes, 50 cents) and this compensating for the cost of the extra (say) $10
in memory needed by JIT or fully compiled binary);
c) the cost model of the entire system is $0.80 (no OS needed,
tightly execute very small binary microcode such as 100 bytes to
1Kbytes to 64Kbytes...all burned into an onchip ROM as deployed
in many of the current embedded systems using 4bit, 8bit, and 16bit
microcontrollers). And the choice of language is not important,
and ability to upgrade sw from the network is irrelevant.
d) they want to write their sw in Java language but some other
microprocessor gives them alternate tangible benefits that
override the cost advantage of the compactness of java system,
such as already design win into a product...leading to other products,
some functionality attribute that might be missing from java
microprocessor and time to market wont allow the customer to
make use of the pico-java core and roll his own controller,
or finally some performance metric that he needs that is not
met by the java microprocessor and is met by the competing processor.

As for objection a), see the July issue of Embedded Systems for
an interesting article on using Java in that space.

For objections b and c, I have no idea what cost the java system will
be offered at. All the above numbers are pure speculation on my part.
But one can also make intelligent guesses by looking at competition
pricing. I am sure Sun will make it attractive for a broad
segment of the market by offering/enabling several configurations,
footprints, and cost points (they want to succeed, don't they?).
By "enabling" I mean Sun has announced licensing its pico-java core
to anyone as an embedded core-ware (for a licensing fee of course...I
have no idea how much).

And further for c) no one has claimed that Java microprocessor is
for every single application domain. Instead, the claim as I understand
it, is that it will enable 'new' application domains, those that
have hitherto not existed. For instance, see the announcements from Nortel.
And the main purpose of doing java silicon is so that Sun can offer
a complete and integrated system solution. This certainly is
not intended to preclude other solutions.

For objections d) I am sure that SME (Sun Microelectronics) will
have to remain vigilant in anticipating customer needs and
fighting for design wins. That's what competition is all about.

Not speaking for Sun,
Zahir.

Stuart Lynne

unread,
Jun 21, 1996, 3:00:00 AM6/21/96
to

In article <4qce5i$5...@news.ox.ac.uk>,
David Hopwood <lady...@sable.ox.ac.uk> wrote:

>>I don't think anyone is claiming that the JVM maps particularly well
>>to hardware, because such a claim would be silly.
>
>Precisely. More to the point, for non-network embedded use there is no
>disadvantage to compiling for a specific architecture.

Can you think of a reason why anything should be built that wouldn't
benefit from *some* ability to do field upgrades? And at that point
it would be nice to have cross-widget compatibility. My Java Toast
applet should run on different manufacturers' toasters. Download might
be network or serial or AC carrier (X10, CEBus, LonWorks).

--
Stuart Lynne <s...@wimsey.com> 604-933-1000 <http://www.wimsey.com>
PGP Fingerprint: 28 E2 A0 15 99 62 9A 00 88 EC A3 EE 2D 1C 15 68

Paul Houle

unread,
Jun 21, 1996, 3:00:00 AM6/21/96
to

Torben AEgidius Mogensen wrote:

> >My own guess is the future will belong to machines that are hybrids, they
> >will have special Java hardware, but will rely on JITs to take advantage of
> >stats dynamically gathered to generate highly optimised code, where
> >pipelining is planned mostly in advance rather than dynamically.
>
> IT sounds costly to provide both Java and 'normal' instruction sets.
>
> >JITs may even refine code DURING execution.
>
> This is just an extension of the cache idea above.
>

It probably wouldn't be that expensive to produce a dual-mode chip
that can run two instruction sets. After all, we are getting the capability
of laying out more and more transistors on a chip... That ability is growing
faster than our ability to decide where to put transistors on a chip. I
don't see any reason that you couldn't have a chip that has, say, a
pentium or PPC601 chip on one side and a specialized java chip on the other
side. This way you could build a machine that runs Wintel or MacOS along
with Java/OS.

Of course, you could always write a PPC or pentium virtual machine
in Java, creating a cross-platform solution for running Wintel and MacOS
apps... ;-)

John Bayko

unread,
Jun 21, 1996, 3:00:00 AM6/21/96
to

In article <4qdq09$7...@ratty.wolfe.net>,

Chris Pirih, proverbs at wolfenet dot com <_@_._> wrote:
>In article <4qcfkm$e...@sue.cc.uregina.ca>,
> ba...@BOREALIS.CS.UREGINA.CA (John Bayko) wrote:
>| No, but Java/JVM provides a standard (important), and a Java
>| processor makes it cheaper than emulating a JVM or recompiling Java
>| bytecode - both of those require extra ROM or RAM at least. If your
>| goal is low cost and high volume, companies even work on reducing the
>| number of screws in such appliances.
>
>So Sun makes a Java microprocessor, and we end up with yet another
>architecture for embedded processors. What makes Java more attractive
>in that domain than, say, ARM or MIPS?

You must have missed the example about 3 messages upthread of this
one... the example was a consumer telephone that allows a customer to
dial up a pizza ordering system which downloads Java bytecode that
provides an interactive menu for the customer. The advantage is that
Java is a standard - no matter who makes the phones, the applet runs
on all of them.

Foulk MGMN

unread,
Jun 21, 1996, 3:00:00 AM6/21/96
to

>2) the top of stack becomes a bottleneck, especially in superscalar
>implementations.

The A-Series caches the top of stack so it is not a bottleneck. It also
has an associative memory cache for lower stack entries that are
referenced frequently. The Pentium processors also do some of this with
their memory cache.

>Umm, I don't think the original poster's concern was with the *code
>space* overhead of not being able to reuse results, I think it was with
>the *CPU time* overhead of not being able to reuse results, in which
>case the tradeoff is the number of instructions to re-evaluate the
>expression vs. the number of instructions needed to reload the cached
>result.

The intermediate result can be left on the stack to retrieve at a later
time. Also, most stack machine compilers optimize expression evaluation to
reduce the instances of having to hold an intermediate result.

>Designing processors "optimized" for specific languages has tantalized
>computer architects for many years. Unfortunately the economics and

In the seventies Burroughs released the B1000 series that did this. The
machine was bit addressable. Every language compiled to byte code that was
interpreted by a microcoded interpreter that was optimized to the
language's unique requirements. There were rumors that Unisys was going to
use this architecture for all platforms for the merged companies.

The B1000 solution would work well for Java. It would give the speed while
giving the flexibility to add new features to the compiled code. The
processor itself would be simpler than the current microprocessors.
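
(The idea in software terms -- a toy dispatch loop with hypothetical
opcodes, just to illustrate what the microcode would be doing:

    // a toy byte-code interpreter: fetch an opcode, dispatch, repeat
    static int run(byte[] code, int[] stack) {
        int pc = 0, sp = 0;
        while (true) {
            switch (code[pc++]) {
            case 0x01: stack[sp++] = code[pc++]; break;       // push immediate
            case 0x02: sp--; stack[sp - 1] += stack[sp]; break; // add top two
            case 0x03: return stack[--sp];                    // return TOS
            default:   throw new IllegalStateException("bad opcode");
            }
        }
    }

On the B1000 the equivalent loop lived in microcode, one interpreter
per language.)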

The V-Series was designed around the COBOL language. The assembler even
looks like COBOL. I never did understand why a COBOL shop would buy an
A-Series when the V-Series ran COBOL much better.

The A-Series was designed around ALGOL. The newest A-Series processors
reside on a single chip and scream.

So there is no problem with designing a processor around Java. The
question is how powerful you want to make it.

C. Mitford Stanley
foulk...@acm.org

Bernd Paysan

unread,
Jun 21, 1996, 3:00:00 AM6/21/96
to

Kenny Ranerup wrote:

>
> >>>>> Roedy Green writes:
> > Here are some advantages, primarily in small machines.
> > 1. ALL the code is compact. With any sort of JIT or compiler the
> > code will be 10 or more times larger. You don't need to waste ROM
> > or RAM on a JIT and space for the fluffed up code.
>
> + some other points about JIT.
>
> All the points you make are under the assumption that the binary code
> must be portable between different machines and OSes. Then both JIT
> and Java chips make sense. But if you don't need binary compatibility
> why compile into JVM code at all? You'll only loose performance and
> possibly memory space.

I still fail to see why Java code should be much shorter than other
embedded control ISAs (Hitachi SH or 68k, to name typical representatives
of the more powerful architectures). I know why Forth code is shorter (only
if byte code is used): the difference is the use of many small subroutines,
which is possible because Forth is a hand-tweaked stack language (with
all the advantages, like low code size, and disadvantages, like strange syntax).

Java is a register language (like C, the "portable assembler" for the
PDP-11) compiled for a stack VM, so programmers will write large (as
compared to Forth) subroutines, won't factor out common parts, and even
if they do, the overhead (build stack frame, access input variables,
store results in stack frame) is not negligible. A Forth subroutine
("word") doesn't have this overhead, because there is no stack frame and
the return address is on its own stack.
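
A small sketch of the overhead I mean (JVM mnemonics as in Sun's spec;
a hypothetical routine, for illustration only):

    // a tiny factored-out routine in Java...
    static int scale(int x) { return x * 3 + 1; }

    // each call site pays roughly: iload <x> / invokestatic scale,
    // and scale() itself gets a fresh frame, with x copied into local
    // slot 0, before iload_0 / iconst_3 / imul / iconst_1 / iadd /
    // ireturn can run -- overhead a Forth word simply doesn't have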

Could someone give me some references showing that (measured) Java code
is much shorter than other code for EC architectures? I believe this is
a myth. Another myth concerning stack architectures: Forths for desktop
systems don't produce more compact code than C compilers, because Forths
on desktop systems don't use byte code for performance reasons, and for
R&D man-power reasons they don't optimize native code (if they even
generate it) as well as C compilers do. So if there is a difference, it
comes from different factoring, and Java programmers certainly will
factor like C programmers, not like Forth programmers.

Java is certainly inspired by Forth. The opcode names give a hint, and
Sun uses Forth bytecode for OpenBoot. Java should be C-like to be widely
accepted, and a stack oriented bytecode will be portable and have a
reasonable performance when interpreted. And the current Java
application, Web applets, is what's the best to do with this concept.
For the original purpose, embedded control, Java fails to be interactive
in a small environment (which is the success key of Forth or BASIC
microcontrollers); it's not interactive at all. The native method system
isn't seamlessly included in the development kit (or I fail to see that
it is), which is true for Forth (included assembler) and even BASIC
(when did you last sys xxxx?).

Java checks array ranges and other things, but this is not needed for
EC. If you must guarantee function, neither an access out of bounds nor
an "out of bounds" trap is allowed.

--
Bernd Paysan
"Late answers are wrong answers!"
http://www.informatik.tu-muenchen.de/~paysan/

Bernd Paysan

unread,
Jun 21, 1996, 3:00:00 AM6/21/96
to

Stefan Monnier wrote:
> This Java thing is really the opposite of Computer Architecture. Rather than
> design an ISA to get better performance (whatever your performance metric is),
> just take a random inadapted ISA and just build a processor for it.

So you say the guys that implemented recent Intel ISA machines are _not_
computer architects (they are more a sort of computer microarchitects ;-).

Chris Goellner

unread,
Jun 22, 1996, 3:00:00 AM6/22/96
to

Chris Pirih, proverbs at wolfenet dot com (_@_._) wrote:
: In article <4qcfkm$e...@sue.cc.uregina.ca>,
: ba...@BOREALIS.CS.UREGINA.CA (John Bayko) wrote:
: | No, but Java/JVM provides a standard (important), and a Java
: | processor makes it cheaper than emulating a JVM or recompiling Java
: | bytecode - both of those require extra ROM or RAM at least. If your
: | goal is low cost and high volume, companies even work on reducing the
: | number of screws in such appliances.

: So Sun makes a Java microprocessor, and we end up with yet another
: architecture for embedded processors. What makes Java more attractive
: in that domain than, say, ARM or MIPS?

The appeal is that Java and JavaOS seem to provide a complete package
that could be relatively processor or vendor independent. I have a
friend who does embedded systems programming. He has different source
trees and compilers for each chip he works with. I've seen him spend weeks
porting something to a new chip. If he could work with Java and JavaOS,
he could produce more stable code for a variety of chips more easily. He
would also be able to work in Java rather than assembly or straight C.

Trevor Pering

unread,
Jun 22, 1996, 3:00:00 AM6/22/96
to

In article <4qhjab$7...@mailgate.ustc.ciba.com>,

Chris Goellner <goel...@usbw.ciba.com> wrote:
>
>The appeal is that Java and JavaOS seem to provide a complete package
>that could be relatively processor or vendor independent. I have a
>friend who does embedded systems programming. He has different source
>trees and compilers for each chip he works with. I've seen him spend weeks
>porting something to a new chip. If he could work with Java and JavaOS,
>he could produce more stable code for a variety of chips more easily. He
>would also be able to work in Java rather than assembly or straight C.

Now he will have one more source tree to deal with for a while, a Java
one. His other installed bases won't magically go away overnight, and
he will still have to support code for those systems.

If everybody, however, woke up one morning and decided to write
everything for the ARM processor, using a standard ARM API to a
completely specified ARM OS, then the results would be pretty much the
same. In the meanwhile, they could write ARM interpreters and
intermediate compilers for the other systems. (I'm not saying
anything about the ARM being better/worse, just that there is nothing
special about Java.)

To me, it seems that it's a matter of standardization. If it is the
right time for such a standard (no more breakthroughs are to be made)
and the standard is right (Java is not too bloated for the niche it is
filling) then it will catch on.

Trevor


Chris Goellner

unread,
Jun 23, 1996, 3:00:00 AM6/23/96
to

Trevor Pering (per...@waskosim.EECS.Berkeley.EDU) wrote:
: In article <4qhjab$7...@mailgate.ustc.ciba.com>,
: Chris Goellner <goel...@usbw.ciba.com> wrote:
: >
: > [...]

: [...]
: If everybody, however, woke up one morning and decided to write

It sort of seems like they have, with all of the Java talk
going around.

: everything for the ARM processor, using a standard ARM API to a
: completely specified ARM OS, then the results would be pretty much the
: same. In the meanwhile, they could write ARM interpreters and
: intermediate compilers for the other systems. (I'm not saying
: anything about the ARM being better/worse, just that there is nothing
: special about Java.)

: To me, it seems that it's a matter of standardization. If it is the
: right time for such a standard (no more breakthroughs are to be made)
: and the standard is right (Java is not too bloated for the niche it is
: filling) then it will catch on.
: Trevor

No, there isn't really anything special other than that we are seeing it
happen now. ARM and some of the other guys don't seem to want to do
what SUN is doing. I agree with you, though: if SUN or Microsoft
screws up Java it'll be just another Forth/Smalltalk-type system.
I also think the Java chips from SUN put a new spin on this.

BTW, does anyone know when we're going to see this JavaOS?

Gil Gameiro

unread,
Jun 23, 1996, 3:00:00 AM6/23/96
to

Stefan Monnier wrote:
> This Java thing is really the opposite of Computer Architecture. Rather than
> design an ISA to get better performance (whatever your performance metric is),
> just take a random inadapted ISA and just build a processor for it.

You have no idea what you are talking about, ISA has nothing to do with the
processor code, it is just a Hardware BUS!

The "Java thing" as you say, is just to create a processor that runs bytecode as a
native instruction set so you don't have to interpret it first.
And you may not know it, but Sun spent a reasonable amount of time running
statistics on which kind of instructions were executed very often and which were
not before they came out with the byte code.

Remember that these guys had RISC architectures far more powerfull that a 386 when
we were still playing with 286.

How many RISC/CISC processors do you know that have a built in instruction for the
switch/case statement? Well Java byte code does! Makes me laught when I see the ASM
generated for a 386 code that has a bunch of CMP/JNE/CMP/JNE/CMP/JNE for every
case!

Common, give the a little more credit or are you that much smarter?

Bye now,

Vivek Sadananda Pai

unread,
Jun 23, 1996, 3:00:00 AM6/23/96
to

In article <31CDC1...@mbay.net>, Gil Gameiro <bi...@mbay.net> writes:
|> Stefan Monnier wrote:
|> > This Java thing is really the opposite of Computer Architecture. Rather than
|> > design an ISA to get better performance (whatever your performance metric is),
|> > just take a random inadapted ISA and just build a processor for it.
|>
|> You have no idea what you are talking about, ISA has nothing to do with the
|> processor code, it is just a Hardware BUS!

When you have comp.arch in the "Newsgroups:" line, you might
want to do a little checking. ISA is most definitely the name
of the bus on a lot of PC machines. It's also short for
"Instruction Set Architecture", or what you call "the processor code"

-Vivek

Guy Harris

unread,
Jun 23, 1996, 3:00:00 AM6/23/96
to

Gil Gameiro <bi...@mbat.net> wrote:
>How many RISC/CISC processors do you know that have a built in instruction
>for the switch/case statement?

VAX. However, it does a dense-case switch, i.e. a fetch from a table
using the value on which the switch is being done as the table index.

>Well, Java byte code does! Makes me laugh when I see the ASM
>generated for 386 code that has a bunch of CMP/JNE/CMP/JNE/CMP/JNE for
>every case!

Just because there's a built-in instruction to do a case statement, that
doesn't mean the hardware that implements that instruction won't do the
moral equivalent of such a compare chain. I.e., do *not* assume that
"it's a single instruction" means "it's fast" (most readers of
"comp.arch" presumably already know that, but others might not).

UNIX C compilers I've seen generate different code for switch statements
based on the density of the cases.

For sparse sets of a few cases, they generate compare chains.

For dense sets, they generate indexed jumps, along the lines of the VAX
instruction.

For sparse sets with more cases, they generate linear table searches
through a jump table; I think I *might* have seen compilers that
generate hashed searches in a jump table.
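
The JVM byte code, for what it's worth, makes the same density
distinction explicit in its instruction set: javac compiles dense
switches to a tableswitch (an indexed jump through a table of offsets)
and sparse ones to a lookupswitch (a sorted list of match/offset
pairs). A small Java example to see it, as a sketch; exactly where a
given compiler draws the density line is its own heuristic:

    class SwitchDemo {
        // Dense, consecutive cases: javac emits a tableswitch,
        // an O(1) indexed jump.
        static int dense(int x) {
            switch (x) {
                case 0:  return 10;
                case 1:  return 11;
                case 2:  return 12;
                case 3:  return 13;
                default: return -1;
            }
        }

        // Sparse cases: javac emits a lookupswitch, a sorted
        // match/offset list searched at run time.
        static int sparse(int x) {
            switch (x) {
                case 1:     return 1;
                case 100:   return 2;
                case 10000: return 3;
                default:    return -1;
            }
        }
    }

Compiling this and running javap -c SwitchDemo shows which instruction
was chosen for each method.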

Bernd Paysan

unread,
Jun 23, 1996, 3:00:00 AM6/23/96
to

John Bayko wrote:
> You must have missed the example about 3 messages upthread of this
> one... the example was a consumer telephone that allows a customer to
> dial up a pizza ordering system which downloads Java bytecode that
> provides an interactive menu for the customer. The advantage is that
> Java is a standard - no matter who makes the phones, the applet runs
> on all of them.

Somebody above said "memory footprint". Now you say "standard", and your
example may require a lot of library code (e.g. widget classes). So you
have to carry all the bloat of the default Java library around with your
handheld...

Please, if any of you refer to "memory footprint", state measurements,
don't just postulate something. Yes, stack processors might have more
compact code than GPR processors. But the Java compiler isn't very
sophisticated, and it's local-variable oriented, so I fail to see the
justification for this generalization (stack = short code, JavaVM =
stack machine -> JavaVM = short code // wrong!!). Even the conclusion
stack emulation = fast emulation -> JavaVM = fast emulation has been proven
experimentally wrong. JITs for Java AFAIK produce code more than 10
times faster and still are not optimized to death, while a quite good
JIT for gforth (a tuned virtual machine for Forth) is just a factor of 3
ahead of the interpreted VM, and compared with optimized C, gforth's VM is
only about 6 times slower (see Anton Ertl's dissertation, found at
http://www.complang.tuwien.ac.at/papers/).

Stephen Molloy

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to

In <31CDC1...@mbay.net> Gil Gameiro <bi...@mbay.net> writes:
>
>Stefan Monnier wrote:
>> This Java thing is really the opposite of Computer Architecture. Rather than
>> design an ISA to get better performance (whatever your performance metric is),
>> just take a random inadapted ISA and just build a processor for it.
>
>You have no idea what you are talking about, ISA has nothing to do with the
>processor code, it is just a Hardware BUS!

"ISA" has at least two meanings that I can think of: #1 is
instruction set architecture (which the original post was
referring to) and #2 the 8 MHz PC ISA bus (which you were
referring to). Get a clue.


Ian Dall

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to

In article <31CDC1...@mbay.net> Gil Gameiro <bi...@mbay.net> writes:


> How many RISC/CISC processors do you know that have a built-in instruction for the
> switch/case statement? Well, Java byte code does! Makes me laugh when I see the ASM
> generated for 386 code that has a bunch of CMP/JNE/CMP/JNE/CMP/JNE for every
> case!

a) The ns32x32 has a case instruction.

b) The case statement is replaceable with a few instructions on any reasonably
useful processor (RISC or CISC). Essentially it is a table lookup
of branch addresses followed by a branch to that address. Actually,
this has problems if the table is sparse. Does the JVM have a smart
way of handling sparse jump tables? That might be interesting.
(A sketch of one possibility appears below.)

c) Implementing a switch as a sequence of tests and branches may actually
be faster than you think. I imagine a jump table screws any speculative
execution. Of course, for a sufficiently large switch statement where
all cases are actually used, there will be a point where the jump
table is faster.

I won't comment on the JVM since I don't know much about it. I just wanted
to point out that on the face of it a hardware implementation of a
switch statement doesn't seem to be either new or much of a win.
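
PS: a glance at the JVM spec suggests the answer to (b) is yes:
besides tableswitch (a plain indexed jump table for dense cases) there
is a lookupswitch opcode whose match/offset pairs must be sorted by
match value, so an implementation is free to binary-search them. A
rough dispatch sketch in Java; the names and array layout are
illustrative, not taken from any real VM:

    class Lookup {
        // matches[] is sorted ascending; offsets[i] is the jump
        // offset paired with matches[i].
        static int target(int key, int[] matches, int[] offsets,
                          int defaultOffset) {
            int lo = 0, hi = matches.length - 1;
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;   // unsigned shift avoids overflow
                if (matches[mid] < key)      lo = mid + 1;
                else if (matches[mid] > key) hi = mid - 1;
                else return offsets[mid];    // hit: take this branch
            }
            return defaultOffset;            // miss: take the default
        }
    }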

--
Ian Dall I'm not into isms, but hedonism is the most
harmless I can think of. -- Phillip Adams

internet: Ian....@dsto.defence.gov.au

Chris Pirih, proverbs at wolfenet dot com

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to

In article <4qkq9r$6...@bayonne.netapp.com>, g...@netapp.com (Guy Harris) wrote:

| Gil Gameiro <bi...@mbat.net> wrote:
| >How many RISC/CISC processors do you know that have a built-in instruction
| >for the switch/case statement?
| >Well, Java byte code does! Makes me laugh when I see the ASM
| >generated for 386 code that has a bunch of CMP/JNE/CMP/JNE/CMP/JNE for
| >every case!
|
| UNIX C compilers I've seen generate different code for switch statements
| based on the density of the cases.

The Microsoft C compiler, widely regarded as not-very-state-of-the-art,
generates jump tables for switch statements as appropriate. It's also
reasonably clever with its CMPs (and SUBs) too. I wonder which compiler
Gil Gameiro is using, and whether he has optimizations turned on.

---
chris

Amos Omondi

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to

Gil Gameiro <bi...@mbay.net> wrote:

>You have no idea what you are talking about, ISA has nothing to do with the
>processor code, it is just a Hardware BUS!
>

>Come on, give them a little more credit, or are you that much smarter?
>
>Bye now,


Ross Alexander

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to

Gil Gameiro <bi...@mbay.net> writes:

>Stefan Monnier wrote:

>> This Java thing is really the opposite of Computer Architecture.
>> Rather than design an ISA to get better performance (whatever your
>> performance metric is), just take a random inadapted ISA and just
>> build a processor for it.

>You have no idea what you are talking about, ISA has nothing to do
>with the processor code, it is just a Hardware BUS!

ISA = instruction set architecture. Or industry standard architecture,
depending on context. The first case has everything to do with the
processor code.

[...]

>How many RISC/CISC processors do you know that have a built-in
>instruction for the switch/case statement? Well, Java byte code does!
>Makes me laugh when I see the ASM generated for 386 code that has
>a bunch of CMP/JNE/CMP/JNE/CMP/JNE for every case!

Various dinosaur architectures from the 60's had built-in instructions
for switch/case; they're dead anyway.

regards,
Ross
--
Ross Alexander, ve6pdq -- (403) 675 6311 -- r...@cs.athabascau.ca

Brian Case

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to

Bernd Paysan wrote:
[snip]

> I still fail to see why Java code should be much shorter than other
> embedded control ISAs (Hitachi SH or 68k, to name typical representatives of
> the more powerful architectures).
[snip]
> Could someone give me references showing why or that (measured) Java code
> is much shorter than other code for EC architectures? I believe this is
> a myth.

I also believe it is a myth. I write for an industry newsletter that covers
microprocessors. A friend and I wrote companion articles on Java a few
months ago. My friend measured the sizes of Java and x86 binaries for the
same code. The x86 code is shorter. Remember: Java has the constant pool.
Yes, the byte code is dense, but not *that* dense, and the constant pool
is less efficient than the ways other languages keep initialized data and
symbols around (because the constant pool must keep more information).

Note: this conclusion is based on a single data point (I think; my friend
might have measured more programs, but I'm not sure), but even so, I think
the conclusion is valid in this case.

> Java checks array ranges and other things, but this is not needed for
> EC. If you must guarantee function, neither an access out of bounds, nor
> a "out of bound" trap is allowed.

RIGHT! What good is it if your car's dashboard says: "Array bounds error in
engine controller. Halted."

Also, note that Java is not the only way to do downloadable content. Java
isn't the only way to build a cell phone that can download a pizza shop
menu or whatever. This ability hasn't suddenly been "invented." Yes, Java
provides a standard way. But what is the advantage of having the ability
to run my electric razor applet or my toaster applet on my PC? None that
I can see. So, what's the advantage of using the standard environment for
these very specialized applications? I think a smart designer of an embedded
system that needs to download executable content would find a way to do so
with the smallest cost; to me the Java VM plus the class libraries isn't
the way with the smallest cost. Now, note that I don't say the designer won't
use the Java language; that's a decision I think would be well justified.
But for specialized applications, it's best to compile all the way to a
statically-linked binary expressed in machine code for a common, cheap,
available-in-many-implementations embedded microprocessor.

Oops, wrote more than I intended again....

This isn't a flame; it's my opinion. If I'm wrong, please set me straight
in a polite way. Thanks.

bcase

Matt Kennel

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to

Chris Pirih, proverbs at wolfenet dot com (_@_._) wrote:
: In article <4qcfkm$e...@sue.cc.uregina.ca>,
: ba...@BOREALIS.CS.UREGINA.CA (John Bayko) wrote:
: | No, but Java/JVM provides a standard (important), and a Java
: | processor makes it cheaper than emulating a JVM or recompiling Java
: | bytecode - both of those require extra ROM or RAM at least. If your
: | goal is low cost and high volume, companies even work on reducing the
: | number of screws in such appliances.

: So Sun makes a Java microprocessor, and we end up with yet another
: architecture for embedded processors. What makes Java more attractive
: in that domain than, say, ARM or MIPS?

Nothing except the possibility of raw technical superiority in
fair competition.

I.e. this is good.

: ---
: chris

--
Matthew B. Kennel/m...@caffeine.engr.utk.edu/I do not speak for ORNL, DOE or UT
Oak Ridge National Laboratory/University of Tennessee, Knoxville, TN USA/
*NO MASS EMAIL SPAM* It's an abuse of Federal Government computer resources
and an affront to common civility. On account of egregiously vile spamation,
my software terminates all email from "interramp.com" and "cris.com" without
human intervention.

Zalman Stern

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to

All the major development systems I use on the Mac and Windows have
announced or shipped support for compiling Java to JVM byte codes. (Well,
no one is doing an MPW-based Java environment for the Mac so far as I
know; I suppose Metrowerks would if there were demand for such.) Most have
also announced or shipped significant development environments for Java.
(Debuggers, just-in-time compilation based runtimes, editors, etc.) The
situation in the UNIX world is pretty good as well with most workstation
vendors supporting Java. So if nothing else, a plethora of development
environment options is an advantage for Java in the embedded market.
(Contrast this to writing your code using a traditional
compile/edit/download-to-SBC/boot/etc. environment and I expect you'll see
some productivity arguments to be made for Java.)

Yes, this is a case of hype begetting the win, but who cares? JavaSoft is
pushing Java at a bunch of different markets and if they can penetrate a
few of them, that will be enough momentum to make the language and its
infrastructure a viable technology. Building native JVM CPUs is a
necessary step to cracking the embedded market. Whether it will be
sufficient or not is another story, for which the ending has yet to be
written.

Zalman Stern, Caveman Programmer, Macromedia Video Products, (415) 378 4539
3 Waters Dr. #100, San Mateo CA, 94403, zal...@macromedia.com
If you squeeze my lizard, I'll put my snake on you -- Lemmy

Chris Pirih

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to

In article <31CECA...@best.com>, Brian Case <bc...@best.com> wrote:

| Bernd Paysan wrote:
| > Java checks array ranges and other things, but this is not needed for
| > EC. If you must guarantee function, neither an access out of bounds nor
| > an "out of bounds" trap is allowed.
|
| RIGHT! What good is it if your car's dashboard says: "Array bounds error in
| engine controller. Halted."

That's a pretty weak argument against runtime bounds checking. An
intelligent designer might use an array-bounds exception to flash a
warning light which indicates a possible computer malfunction, then
proceed with the program as if the exception hadn't happened (see the
sketch below). The effect is the same as no bounds checking at all,
but with a clue that something is wrong. On the other hand, most
vendors probably don't want the customer to know that their widget is
malfunctioning....
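
In Java that costs only a handful of lines; a minimal sketch, where
setWarningLight() and the sensor-table shape are hypothetical
stand-ins for whatever the hardware actually provides:

    class EngineController {
        private int lastGoodReading;

        int fuelReading(int[] sensorTable, int index) {
            try {
                lastGoodReading = sensorTable[index];
            } catch (ArrayIndexOutOfBoundsException e) {
                setWarningLight(true);  // signal a possible malfunction
                // fall through: keep the last good value and carry on
            }
            return lastGoodReading;
        }

        void setWarningLight(boolean on) {
            // hypothetical hardware hook, not a real API
        }
    }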

Was this thread being crossposted to comp.sys.unisys for any
particular reason?

---
chris

Bernd Paysan

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to
Brian Case wrote:
> Also, note that Java is not the only way to do downloadable content. Java
> isn't the only way to build a cell phone that can download a pizza shop
> menu or whatever. This ability hasn't suddenly been "invented." Yes, Java
> provides a standard way. But what is the advantage of having the ability
> to run my electric razor applet or my toaster applet on my PC? None that
> I can see. So, what's the advantage of using the standard environment for
> these very specialized applications? I think a smart designer of an embedded
> system that needs to download executable content would find a way to do so
> with the smallest cost; to me the Java VM plus the class libraries isn't
> the way with the smallest cost.

These things have been done. At the Garching accelerator laboratory a network of
highly configurable microcontrollers (Z80 and Z280, some 386EX modules, too) is
used with "Open Network Forth". The way to download software is just to download
Forth source, which is resource-saving (the text interpreter is in the 1k
range) and fast enough, and the standardization problem didn't apply (and if it
had, there is ANS Forth). Certainly it is not at all hacker-safe, but that wasn't
an issue. Don't load code you don't trust, anyway.

> Now, note that I don't say the designer won't
> use the Java language; that's a decision I think would be well justified.

Today there are too few Java compilers out there for EC machines.

> But for specialized applications, it's best to compile all the way to a
> statically-linked binary expressed in machine code for a common, cheap,
> available-in-many-implementations embedded microprocessor.

Yeah.

Bernd Paysan

unread,
Jun 24, 1996, 3:00:00 AM6/24/96
to
Ross Alexander wrote:
> Various dinosaur architectures from the 60's had builtin instructions
> for switch/case; they're dead anyway.

x86 and PA-RISC have (different styles of) branch-by-table instructions, and
both are far from dead. Certainly nobody reasonable would put a
key->vector search function into hardware today.

Amos Shapir

unread,
Jun 25, 1996, 3:00:00 AM6/25/96
to

g...@netapp.com (Guy Harris) writes:

>Gil Gameiro <bi...@mbat.net> wrote:

>>Well, Java byte code does! Makes me laugh when I see the ASM
>>generated for 386 code that has a bunch of CMP/JNE/CMP/JNE/CMP/JNE for
>>every case!

>Just because there's a built-in instruction to do a case statement, that
>doesn't mean the hardware that implements that instruction won't do the
>moral equivalent of such a compare chain. I.e., do *not* assume that
>"it's a single instruction" means "it's fast" (most readers of
>"comp.arch" presumably already know that, but others might not).

Exactly. The RISC revolution was helped greatly by the fact that on
some VAX models, the add-compare-and-branch instruction (which is the
VAX's built-in instruction for a FOR or DO loop) ended up being
slower than the equivalent chain of add, cmp, and br instructions.

The main advantage of Java byte code, however, is that it's designed to
be small, not necessarily efficient, since it's meant to be
transported verbatim over networks.

--
Amos Shapir Net: am...@nsof.co.il
Paper: nSOF Parallel Software, Ltd.
Givat-Hashlosha 48800, Israel
Tel: +972 3 9388551 Fax: +972 3 9388552 GEO: 34 55 15 E / 32 05 52 N

Richard Tobin

unread,
Jun 25, 1996, 3:00:00 AM6/25/96
to

In article <31CDC1...@mbay.net> bi...@mbat.net writes:

> Remember that these guys had RISC architectures far more powerful
> than a 386 when we were still playing with 286.

Speak for yourself...

> How many RISC/CISC processors do you know that have a built-in
> instruction for the switch/case statement? Well, Java byte code does!
> Makes me laugh when I see the ASM generated for 386 code that has
> a bunch of CMP/JNE/CMP/JNE/CMP/JNE for every case!

How a switch is coded depends on the nature of the cases. If the
cases are (mostly) consecutive integers (which might well include,
say, type tests) then a reasonable compiler will produce a jump table.
If the cases are more variable, the processor is going to have to do
compare-and-jump, even if it's hidden inside an instruction. Even on
a conventional processor it might be implemented as a table of values
and destinations, with a short loop doing the compare-and-jump.

There are several choices for implementing a bytecoded language like
Java. You can interpret the bytecodes (like most existing Java
implementations; the toy dispatch loop below shows where the overhead
lives), which has good space efficiency but may not be very fast. You
can compile the bytecodes to native code, possibly on the fly, which
gives good speed but needs more memory. Or you can build hardware,
which should give good speed and memory use. Whether there is some
market where the advantages of the last option make it worthwhile I'm
not sure; evidently Sun thinks so. A major *disadvantage* of hardware
is its inflexibility: I might want to run a version of Java bytecode
extended to better support other languages.
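
For the curious, the core of the first option really is tiny. A toy
sketch in Java, using an invented three-opcode stack machine rather
than the real JVM instruction set:

    class ToyVM {
        static final int PUSH = 0, ADD = 1, HALT = 2;

        // One switch dispatch per bytecode: that per-instruction
        // overhead is what JITs and Java chips try to eliminate.
        static int run(int[] code) {
            int[] stack = new int[64];
            int sp = 0, pc = 0;
            while (true) {
                switch (code[pc++]) {
                    case PUSH: stack[sp++] = code[pc++]; break;
                    case ADD:  stack[sp - 2] += stack[sp - 1]; sp--; break;
                    default:   return stack[--sp];  // HALT
                }
            }
        }

        public static void main(String[] args) {
            // computes 2 + 3 and prints 5
            System.out.println(run(new int[]{PUSH, 2, PUSH, 3, ADD, HALT}));
        }
    }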

-- Richard

--
Don't worry, it's only Usenet.

C Goellner

unread,
Jun 25, 1996, 3:00:00 AM6/25/96
to

m...@caffeine.engr.utk.edu (Matt Kennel) writes:

>
> Chris Pirih, proverbs at wolfenet dot com (_@_._) wrote:
> : In article <4qcfkm$e...@sue.cc.uregina.ca>,
> : ba...@BOREALIS.CS.UREGINA.CA (John Bayko) wrote:
> : | [...]
>
> : So Sun makes a Java microprocessor, and we end up with yet another
> : architecture for embedded processors. What makes Java more attractive
> : in that domain than, say, ARM or MIPS?
>
> Nothing except the possibility of raw technical superiority in
> fair competition.

You forgot to include market penetration through media propaganda.

--
Chris Goellner
goel...@usbw.ciba.com
Ciba Geigy Polymers NA
UNIX Systems Admin

Karl Zimmerman

unread,
Jun 25, 1996, 3:00:00 AM6/25/96
to

In article <31CECA...@best.com> Brian Case <bc...@best.com> writes:
>Bernd Paysan wrote:
>[snip]
>> I still fail to see why Java code should be much shorter than other
>> embedded control ISAs (Hitachi SH or 68k, to name typical representatives of
>> the more powerful architectures).
[snip]
>I also believe it is a myth. I write for an industry newsletter that covers
>microprocessors.

I don't recall _Sun_ ever saying that compactness was a big advantage over
other assembly code. They _have_ said that it's an advantage over downloading
_source_.

>> Java checks array ranges and other things, but this is not needed for
>> EC. If you must guarantee function, neither an access out of bounds nor
>> an "out of bounds" trap is allowed.
>
>RIGHT! What good is it if your car's dashboard says: "Array bounds error in
>engine controller. Halted."

Hmm. But these checks exist to provide security in the JVM when one
_can't_ guarantee that the programmer will guarantee function. I'd rather
have a car dashboard that said "ERROR! The loaded code is unsafe! Don't
drive this car!" _before_ I start the car than have one that says
"HA HA HA! The Mad Hacker what Hacks at Midnight has taken control of
your car!" when I'm on the road!

>Also, note that Java is not the only way to do downloadable content. Java
>isn't the only way to build a cell phone that can download a pizza shop
>menu or whatever. This ability hasn't suddenly been "invented."

[snip]


> Now, note that I don't say the designer won't
>use the Java language; that's a decision I think would be well justified.

>But for specialized applications, it's best to compile all the way to a
>statically-linked binary expressed in machine code for a common, cheap,
>available-in-many-implementations embedded microprocessor.

No argument there. But when reading about Java's history, one gets the
distinct impression that it _hadn't_ found its niche in consumer electronics
yet; Sun had been kicking it around for years (as Oak & precursors) before
it realized that with a bit of fine-tuning (addition of a byte-code analyzer,
possibly elimination of pointers) it could work as pretty safe Web-loadable
content.

Now, I could be dead wrong, but I _don't_ think that Web-loadable
electric razor software (or car dashboard software) was on Sun's mind when
they were investigating Oak's potential in consumer electronics; but
reliability and protection against brain-dead programming errors may
well have been.

--
Karl Zimmerman zim...@bostech.com
Contracting Software Engineer
My opinions are not necessarily those of my employers.

Timothy Jehl~

unread,
Jun 25, 1996, 3:00:00 AM6/25/96
to

In article <4ql50e$t...@fang.dsto.defence.gov.au>, Ian....@dsto.defence.gov.au (Ian Dall) writes:
>
> In article <31CDC1...@mbay.net> Gil Gameiro <bi...@mbay.net> writes:
>
>

> > How many RISC/CISC processors do you know that have a built in instruction for the
> > switch/case statement? Well Java byte code does! Makes me laught when I see the ASM
> > generated for a 386 code that has a bunch of CMP/JNE/CMP/JNE/CMP/JNE for every
> > case!
>

> a) The ns32x32 has a case instruction.
>
> b) The case statement is replaceable with a few instructions on any reasonably
> useful processor (RISC or CISC). Essentially it is a table lookup
> of branch addresses followed by a branch to that address. Actually,
> this has problems if the table is sparse. Does the JVM have a smart
> way of handling sparse jump tables? That might be interesting.
>
> c) Implementing a switch as a sequence of tests and branches may actually
> be faster than you think. I imagine a jump table screws any speculative
> execution. Of course, for a sufficiently large switch statement where
> all cases are actually used, there will be a point where the jump
> table is faster.
>
> I won't comment on the JVM since I don't know much about it. I just wanted
> to point out that on the face of it a hardware implementation of a
> switch statement doesn't seem to be either new or much of a win.
>
> --
> Ian Dall I'm not into isms, but hedonism is the most
> harmless I can think of. -- Phillip Adams
>
> internet: Ian....@dsto.defence.gov.au

Just for what it's worth, the 80960HA has a group of conditional
arithmetic instructions which are very useful for simple switch
statements. I'm not a C programmer, but something like

IF a = 7 then b = c+d else b = a+f

is very easy to implement on an 80960HA microprocessor without taking
a branch and paying the accompanying pipeline-flush penalty caused
by a misprediction.
However, these structures aren't nearly as useful if the clauses
start to get complex.

TJ

Paul DeMone

unread,
Jun 25, 1996, 3:00:00 AM6/25/96
to

Interesting point but media propaganda can go only so far for so long.
Look at the PowerPC consortium - a mighty collection of computer industry
heavyweights with a lot of uncritical support by the trade rags. Remember
the Apple RISC vs CISC performance vs time graphs in their ads showing
CISC flattening to horizontal while PowerPC goes into an exponential climb?
These ads must have made excellent motivational pinups at Intel.

In the early days PPC promised twice the performance of x86 at the same
price. Later it became a bit more performance at about half the price.
And now it is almost the same performance at a bit lower price. Business Week
had an interesting article a few months ago exposing the PPC's feet of
clay - about the first mainstream critical examination that I have seen.
IBM and Motorola will have to work hard to stay at the "me too" position
with Intel. To some extent PPC is giving non-embedded RISC a bad name
and hindering the real challengers like DEC, HP, and SGI/MIPS.

IMHO the Java chip craze will pass but the language will live on for many
applications where platform independence is important. Some Java chips will
probably get made, but then they will be tossed into Darwinian competition
against the entrenched survivors in the embedded processor market like MIPS,
ARM, 960, 68K, and x86; and all the hype in the world won't help them then.
A vapor chip is easy to promote, but eventually it will have to grow pins
and acquire numbers like cost, performance (?), and power dissipation.

Paul

all opinions strictly my own.
--
Paul W. DeMone The 801 experiment SPARCed an ARMs race to put
Kanata, Ontario more PRECISION and POWER into architectures with
pde...@tundra.com MIPSed results but ALPHA's well that ends well.

Satan's CPU? The P666 of course - check out Appendix H(ell) under NDA


Tony Finch

unread,
Jun 25, 1996, 3:00:00 AM6/25/96
to

In article <4qphm8$18...@chnews.ch.intel.com>,
Timothy Jehl~ <tj...@sedona.intel.com> wrote:
>
> Just for what it's worth, the 80960HA has a group of conditional
> arithmetic instructions which are very useful for simple switch
> statements. I'm not a C programmer, but something like
>
> IF a = 7 then b = c+d else b = a+f
>
> is very easy to implement on an 80960HA microprocessor without taking
> a branch and paying the accompanying pipeline-flush penalty caused
> by a misprediction.
> However, these structures aren't nearly as useful if the clauses
> start to get complex.

On the ARM, all instructions are conditional, so even fairly
complicated conditional expressions can be compiled without branches
(including the AND and OR boolean operators), and if the statements
controlled by the IF are simple enough that a branch would be slower,
then the whole thing can be done without a branch.
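
Java itself has no way to express predication, but a rough analogue
of "evaluate both sides instead of branching" exists; a sketch, and
whether a JIT actually emits branch-free code for it is up to the JIT,
not the language:

    class Branchless {
        // && and || imply conditional control flow; the
        // non-short-circuit & and | evaluate both operands and hand
        // the compiler straight-line code to work with.
        static boolean inRangeBranchy(int a, int lo, int hi) {
            return a >= lo && a <= hi;
        }

        static boolean inRangeStraight(int a, int lo, int hi) {
            return (a >= lo) & (a <= hi);
        }
    }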

Tony.
--
"What it all amounts to is that english
is chiefly a matter of marksmanship."

Kenny Ranerup

unread,
Jun 25, 1996, 3:00:00 AM6/25/96
to

>>>>> Bernd Paysan writes:

> I still fail to see why Java code should be much shorter than other
> embedded control ISAs (Hitachi SH or 68k, to name typical representatives
> of the more powerful architectures). I know why Forth code is
> shorter (only if byte code is used): the difference is the use of many
> small subroutines, which is possible because Forth is a
> hand-tweaked stack language (with all the advantages, like low code
> size, and disadvantages, like strange syntax).

> Could someone give me references showing why or that (measured) Java
> code is much shorter than other code for EC architectures? I believe
> this is a myth.

I haven't seen any evidence either. Maybe this is just a
misunderstanding of what people are comparing. If you compare a
typical 32-bit high-end processor like Sparc, Mips, PowerPC etc. with
JVM code, the JVM code is definitely smaller. But if you compare JVM
with embedded ISAs that are designed for compact code you probably
won't see any big advantage for JVM.

The code and execution efficiency of a stack architecture also depends
on the ratio of subroutine calls to straight-line code. Stack
architectures aren't very good at shuffling data between registers
but can be excellent at subroutine calling efficiency.
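
The subroutine-call point is easy to see at the bytecode level: for a
trivial method, javac produces pure stack code. The javap -c
disassembly quoted in the comment is the standard result for a method
of this shape:

    class StackCode {
        // javap -c shows the body compiles to just:
        //     iload_0
        //     iload_1
        //     iadd
        //     ireturn
        // Arguments arrive in local slots 0 and 1 and flow through
        // the operand stack; the bytecode names no registers and
        // does no caller/callee-save shuffling.
        static int add(int a, int b) {
            return a + b;
        }
    }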

Kenny Ranerup
