On 4/20/2013 12:37 AM, Paul Rubin wrote:
> rickman <gnu...@gmail.com> writes:
>> I'm still considering the implications of that, but I'm sure it will
>> end up with multiple clocks per instruction, just not a variable
>> number. Likely three or four clocks defining "phases" of the
>> instruction. It will be a bit more complex in some ways than the pure
>> stack design, but should be faster and may be smaller.
>
> Have you read Koopman's stuff about stack hardware, and looked at
> Bernd's b16 design?
I read Koopman's book a long time ago. This was the basis for the
fundamental instruction organization I have been working with. Because
of the high frequency of use of the call/jump/literal I optimized these
instructions. I think the uCore has done the same thing with a 1 bit op
code for the literal instruction.
I have looked at Bernd's b16. It uses 5 bit instructions, which is
something I wanted to get away from since my design is optimized for
FPGAs, which have constraints on the memory width.
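To make the width constraint concrete, here is a rough sketch of how many 5 bit opcodes fit in a memory word and how many bits are left over. The list of widths is an assumption for illustration (9, 18 and 36 are common FPGA block RAM widths; 8, 16 and 32 are included for comparison), and the function name is made up for this example.

```python
def slots_and_leftover(word_bits, op_bits=5):
    """Return (whole opcodes that fit in one word, unused bits)."""
    return divmod(word_bits, op_bits)

# Assumed widths: typical block RAM configurations plus power-of-two sizes.
for width in (8, 9, 16, 18, 32, 36):
    slots, spare = slots_and_leftover(width)
    print(f"{width:2d}-bit word: {slots} x 5-bit opcodes, {spare} bit(s) left over")
```

The leftover bits are the awkward part: they either go unused or need some special short encoding, which is extra decode logic either way.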
> I thought one of the defining characteristics of MISC was that by giving
> up any access to the interior of the stack, you can do most operations
> in a single cycle with no pipelining. You sometimes use extra
> instructions juggling the stack, but you make up for that in higher ipc,
> higher clock frequencies, and less chip area (so more parallelism
> through multiple cores, if your problem can use that).
I think that is Chuck's idea in his chip designs. I would not say MISC
requires any particular architectural feature. I think I understand the
theoretical tradeoffs. But Chuck and Bernd are designing ASICs while I
am designing in FPGAs, so different optimizations work best.
> Looking at the F18A die photo though, it appears dominated by memory
> arrays: stacks, ram, and rom. The ALU is a tiny sliver in the middle.
> Therefore I get the idea that by going to 6 bit instruction and possibly
> doubling the ALU's size, the cpu could have been much more powerful at
> relatively little cost in silicon.
I think you are presuming that a 32 or 36 bit processor is "more
powerful" in a meaningful way than the 18 bit processor Chuck designed.
18 bits is enough for many, many apps including high quality audio.
Given the 5 bit instruction size you might think he would have used a
20 bit word size, but he seems to be a real miser on transistors and
wanted to keep it as small as absolutely possible. Remember, a bigger
word size
means a bigger RAM too...
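A back-of-the-envelope sketch of that last point: storage bits scale linearly with word width. The figure of 128 words per node (64 RAM plus 64 ROM) is my assumption about the F18A and ignores the stacks; treat it as a placeholder and plug in your own numbers.

```python
def storage_bits(words, word_bits):
    """Total memory bits for a given depth and word width."""
    return words * word_bits

WORDS_PER_NODE = 128  # assumed: 64 words RAM + 64 words ROM, stacks ignored
baseline = storage_bits(WORDS_PER_NODE, 18)

for width in (18, 20, 32, 36):
    bits = storage_bits(WORDS_PER_NODE, width)
    growth = bits / baseline - 1  # relative increase over the 18-bit case
    print(f"{width:2d}-bit words: {bits} bits per node ({growth:+.0%} vs 18-bit)")
```

Since the die is dominated by memory arrays, that growth applies to most of the area, not just to the ALU.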
I think it would be very easy to get a *lot* more power by going to
newer, finer pitch processes.
> I wonder if experienced GA144
> programmers ever get over the pain of seeing 4-7 instruction slots
> burned every time they use the constant "1". But, supposedly Chuck and
> company did a fair amount of simulation of various options before going
> forward with what they have, so maybe they know something I don't.
You need to truly understand the F18 if you want to judge the tradeoffs
in the design. To do that you need to read the programming tricks
manual... which hasn't been written yet as far as I know. The closest
thing to it is Chuck's web blog which has a lot of info if you want to
spend the time to distill it out. It takes some serious reading to
absorb all that is in there. I would recommend starting with his essay,
"The Map is Not the Territory". Then keep that in mind as you learn...
http://www.colorforth.com/map.htm
>> The part I am ill-equipped to handle is writing a Forth compiler to
>> generate good code for this design. Manually coding using the
>> primitive instructions would not be so hard, but designing a Forth
>> optimizing compiler might be a stretch, certainly for me.
>
> I'd expect the user program to map just about directly to machine
> instructions, similar to Chuck's chips. Arrayforth seems like
> more of an assembler than a compiler.
I don't follow what you are saying. Are you saying the coding should be
done in assembler, which is what Chuck does, or are you saying that the
machine should map closely to Forth?
I am saying that Forth can be compiled to this CPU architecture but it
is not so straightforward. I think it will require an optimizing
compiler to get good usage of the CPU. Of course, the user can code in
the assembly language and build word definitions just like any other
Forth. But if they want to write purely in Forth, it will require some
work on a compiler.
--
Rick