Low-level instruction analysis

Igor Stasenko

Nov 25, 2008, 7:43:10 PM

to Moebius project discussion

I'm thinking about how a native code generator should be designed, with
a combined optimizer that converts basic 'virtual' CPU instructions
into complex real CPU ones.

Here is what I mean:

Suppose we need to read a word from some memory location, but first the
address needs to be computed. Abstract code could look like:

a := memory at: (b + (offset*4)).

It's a very common operation in our code: reading a slot
of object 'b', designated by some offset.

We know that on x86 CPUs it can be encoded as a single instruction:

mov a, [b + offset*4]

But we can't support such a complex instruction in the low-level 'virtual'
CPU instruction set, because different architectures may offer
different ways to encode such an operation.

A virtual CPU has only the most basic instructions: we can add two
words, multiply two words, or read memory at a location designated by
an address:

temp1 := offset*4.
temp2 := b+temp1.
a := memory at: temp2.

So we should perform some kind of analysis to see whether it is possible
on the target architecture to encode these three separate instructions as
one or two native instructions.
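
One way to picture that analysis is a small peephole pass over the virtual code. The sketch below is in Python rather than Smalltalk so it is easy to run. The 4-tuple instruction format, the `load_indexed` opcode and the function name are assumptions made up for illustration, and it assumes the temporaries are not used anywhere else.

```python
# Hypothetical virtual instructions as 4-tuples: (op, dest, src1, src2).
#   ("mul",  "temp1", "offset", 4)       temp1 := offset * 4
#   ("add",  "temp2", "b", "temp1")      temp2 := b + temp1
#   ("load", "a", "temp2", None)         a := memory at: temp2

X86_SCALES = (1, 2, 4, 8)  # scale factors an x86 addressing mode can encode


def fuse_indexed_loads(code):
    """Replace mul/add/load triples with one ("load_indexed", ...) tuple.

    Simplification: does not check whether temp1/temp2 are read later.
    """
    out = []
    i = 0
    while i < len(code):
        if i + 2 < len(code):
            op1, t1, index, scale = code[i]
            op2, t2, base, addend = code[i + 1]
            op3, dest, addr, _ = code[i + 2]
            if (op1 == "mul" and op2 == "add" and op3 == "load"
                    and addend == t1 and addr == t2
                    and scale in X86_SCALES):
                # Corresponds to: mov dest, [base + index*scale]
                out.append(("load_indexed", dest, base, index, scale))
                i += 3
                continue
        out.append(code[i])  # no match: keep the instruction unchanged
        i += 1
    return out


virtual = [
    ("mul", "temp1", "offset", 4),
    ("add", "temp2", "b", "temp1"),
    ("load", "a", "temp2", None),
]
fused = fuse_indexed_loads(virtual)
print(fused)
```

A scale that the target cannot encode (3, say) simply fails the match, and the three basic instructions are kept as they are.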

That's where I begin questioning:
suppose I create an abstract class which represents a target
architecture, with subclasses that each define a specific target
architecture. What interface do we need in such a class?
How to formalize it? Any ideas?
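
To make the question concrete, here is one possible shape for such an interface, sketched in Python. Everything in it is hypothetical: the class names, the single `emit_indexed_load` hook, and especially the second target, whose mnemonics are invented and do not belong to any real CPU. Both targets assume `dest` is a different register from `base`.

```python
from abc import ABC, abstractmethod


class TargetArchitecture(ABC):
    """Hypothetical interface: one hook per abstract operation."""

    @abstractmethod
    def emit_indexed_load(self, dest, base, index, scale):
        """Return native code for: dest := memory at: base + (index * scale)."""


class X86Target(TargetArchitecture):
    def emit_indexed_load(self, dest, base, index, scale):
        if scale in (1, 2, 4, 8):
            # The scale fits the addressing mode: one instruction.
            return [f"mov {dest}, [{base} + {index}*{scale}]"]
        # Otherwise multiply first, using dest as scratch (dest != base).
        return [f"imul {dest}, {index}, {scale}",
                f"mov {dest}, [{base} + {dest}]"]


class SimpleLoadStoreTarget(TargetArchitecture):
    """Made-up machine without scaled addressing; mnemonics are invented."""

    def emit_indexed_load(self, dest, base, index, scale):
        # Uses dest as scratch, so dest must differ from base.
        return [f"muli {dest}, {index}, {scale}",
                f"add {dest}, {dest}, {base}",
                f"load {dest}, [{dest}]"]


for target in (X86Target(), SimpleLoadStoreTarget()):
    print(type(target).__name__, target.emit_indexed_load("a", "b", "offset", 4))
```

The generic part of the compiler only ever asks for "an indexed load"; how many native instructions that costs is decided inside each subclass.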

--
Best regards,
Igor Stasenko AKA sig.

Klaus D. Witzel

Nov 26, 2008, 6:30:56 AM

to Moebius project discussion

I think that only a small subset of this is needed

- http://www.google.com/search?q=fundamentally+hardware+architecture+for+Object-oriented+programming+wikipedia

Then look up the entire sentence on that Wikipedia page. Also a small
subset of this:

- http://en.wikipedia.org/wiki/Burroughs_large_systems_instruction_set

> How to formalize it? Any ideas?

We would need to have

- operands (with specialized number subclasses)
- references (with specialized pointer classes, e.g. arrays[typed]
vs. structs)

Address arithmetic should be deferred until it is time to choose an
instruction from the target set (for example by catching messages sent
to references).
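
A minimal sketch of that deferral, written in Python with operator overloading standing in for "catching messages sent to references". The class names `Reg`, `Scaled` and `Ref` are hypothetical, and the only instruction chosen at the end is an x86-style `mov`. A real selector would also check that the scale is encodable on the target.

```python
class Scaled:
    """Result of 'register * constant': remembered, nothing emitted."""

    def __init__(self, reg, scale):
        self.reg = reg
        self.scale = scale


class Reg:
    """An operand held in a (virtual) register."""

    def __init__(self, name):
        self.name = name

    def __mul__(self, constant):
        return Scaled(self.name, constant)


class Ref:
    """A reference that records address arithmetic instead of emitting it."""

    def __init__(self, base, index=None, scale=1, disp=0):
        self.base = base
        self.index = index
        self.scale = scale
        self.disp = disp

    def __add__(self, other):
        if isinstance(other, int):
            # Constant offsets accumulate in the displacement.
            return Ref(self.base, self.index, self.scale, self.disp + other)
        if isinstance(other, Scaled) and self.index is None:
            return Ref(self.base, other.reg, other.scale, self.disp)
        return NotImplemented

    def load_into(self, dest):
        """Only at this point is a native instruction chosen."""
        addr = self.base
        if self.index is not None:
            addr += f" + {self.index}*{self.scale}"
        if self.disp:
            addr += f" + {self.disp}"
        return f"mov {dest}, [{addr}]"


slot = Ref("b") + Reg("offset") * 4   # builds a description, emits nothing
print(slot.load_into("a"))
```

Because the arithmetic is only recorded, a different target could walk the same `Ref` and emit separate multiply and add instructions instead.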

Igor Stasenko

Nov 26, 2008, 6:51:20 AM

to moebius-proje...@googlegroups.com

2008/11/26 Klaus D. Witzel <klaus....@cobss.com>:

I'm starting to like it, especially that opcode behavior is polymorphic,
depending on operand type.
I took a look at tinyCC (a C compiler), which also encodes the types of
operands and only then produces instructions.
This way it can optimize boolean expressions quite simply, as well
as track/inline functions which return a boolean as a result.
For instance, what I don't like in my current design is that I cannot
optimally inline methods like:

isNil
^ self == 0

and then write code like any Smalltalker does:

self isNil ifTrue: [ blabla]..

Instead, in native code, I am forced to write:

a equal: b ifTrue: [ blabla]
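
The operand-type idea can be sketched as follows: a comparison returns a "result lives in CPU flags" operand, and whoever consumes it decides what to emit. This is a Python illustration under stated assumptions, not tinyCC's actual code. The names `Flags` and `CodeGen` are hypothetical, nil is taken to be 0 as in the example above, and a boolean held in a register is assumed to be 0 or 1.

```python
class Flags:
    """Operand kind: the outcome of a comparison, still in the CPU flags."""

    def __init__(self, cond):
        self.cond = cond  # x86 condition suffix, e.g. "e" for equal


class CodeGen:
    INVERT = {"e": "ne", "ne": "e"}

    def __init__(self):
        self.out = []

    def compare_equal(self, a, b):
        """Emit the compare, but do not turn the result into a value yet."""
        self.out.append(f"cmp {a}, {b}")
        return Flags("e")

    def if_true(self, value, else_label):
        """Fall through when value is true, otherwise jump to else_label."""
        if isinstance(value, Flags):
            # The consumer is a branch: jump on the inverted condition.
            self.out.append(f"j{self.INVERT[value.cond]} {else_label}")
        else:
            # The boolean is already in a register: test it against zero.
            self.out.append(f"test {value}, {value}")
            self.out.append(f"jz {else_label}")

    def as_value(self, value):
        """Materialize a Flags operand as 0/1 in eax when a value is needed."""
        if isinstance(value, Flags):
            self.out.append(f"set{value.cond} al")
            self.out.append("movzx eax, al")
            return "eax"
        return value


gen = CodeGen()
is_nil = gen.compare_equal("eax", 0)   # inlined body of isNil
gen.if_true(is_nil, "skip")            # ifTrue: consumes the flags directly
print(gen.out)
```

With this split, `self isNil ifTrue: [...]` costs a compare and one conditional jump, and the boolean is only materialized when something actually stores or returns it.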

The second thing is that it uses a value stack model instead of
an unlimited set of registers.
The COLA JIT uses a value stack as well (in contrast to Exupery).
I also think that a value stack is more natural to Smalltalk than temp
registers, because of the nature and elegance of message-passing
syntax. :)
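
For readers who have not seen the value stack model, here is a rough Python sketch of the idea. It is not tinyCC's or COLA's implementation. The class and method names are made up, and it assumes the left operand's register may be overwritten by the add.

```python
class ValueStackGen:
    """Compile-time stack of operand descriptors, not run-time values."""

    def __init__(self):
        self.vstack = []  # entries: ("const", n) or ("reg", name)
        self.out = []     # emitted x86-style instructions

    def push_const(self, n):
        self.vstack.append(("const", n))

    def push_reg(self, name):
        # The value is assumed to be in this register already.
        self.vstack.append(("reg", name))

    def add(self):
        right_kind, right = self.vstack.pop()
        left_kind, left = self.vstack.pop()
        if left_kind == "const" and right_kind == "const":
            # Both operands known at compile time: fold, emit nothing.
            self.vstack.append(("const", left + right))
            return
        if left_kind == "const":
            # Addition commutes, so keep the register operand on the left.
            left, right = right, left
        self.out.append(f"add {left}, {right}")
        self.vstack.append(("reg", left))


gen = ValueStackGen()
gen.push_const(2)
gen.push_const(3)
gen.add()              # folded to the constant 5, no code emitted
gen.push_reg("eax")
gen.add()              # emits a single add with an immediate
print(gen.out, gen.vstack)
```

The operands of a message send sit on the stack in evaluation order, which is why the model maps so directly onto `receiver selector: argument` expressions.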
