issue: RPN, TAC+SSA, and x86-style ASM

cr88192

unread,

Feb 23, 2008, 5:08:55 AM2/23/08

to

well, recently I came up with an idea, but it has turned into a bit of a
mess on me...

I have an existing C compiler I wrote before, and it is organized about like
this:
Preprocessor;
Parser;
Upper Compiler, converts syntax trees to RPNIL;
Lower Compiler, converts RPNIL to assembler;
Assembler;
Linker.

so, the upper-compiler ends up being fairly sparse, mostly being a fairly
dumb converter.

the output, I call RPNIL, is a vaguely Forth or PostScript like language
representing the intermediate version of the language.

some of the basic types-related processing, constant folding, some branch
elimination, ... has been done before any of this RPNIL code is generated
(likewise, the various high level constructions have been decomposed into
labels and conditional jumps).

meanwhile, the lower-compiler does much of the "heavy lifting", implementing
many things I had not really originally intended (nearly the entire types
tower, operator overloading, inlining, ...), and goes all the way down to
machine code (when this compiler project was started, it was originally
assumed that this lower end would be simple and stupid, but it ended up
being one of the largest and complicated pieces of the project).

so, an idea was considered:
I can split this compiler into 2 major stages:
an upper-lower compiler, which would do most of the high-level processing
(the it would then compile from RPNIL to some other representation);
a lower-lower compiler, which would take this representation and convert it
into machine code.

starting out, I had made the (incorrect) assumption that this would be an
easy process or reorganizing the code, but looking at it, it is unlikely to
be anywhere near this simple (the code is both inconsistently structured and
heavily interconnected).

if I were to follow things how I began to split them, the lower-lower
compiler (I had started calling it VRM), I would end up with an abstracted
x86-ASM like language.

basically, it would differ from x86 ASM largely by glossing over many of the
differences between x86 and x86-64, implementing more types, and essentially
merging/orthogonalizing the SSE registers and GPRs, for example, in allowing
64 and 128 bit integers as built-in types, ... but, it would be a lot
lower-level than RPNIL in that it would not automatically handle the calling
conventions, abstract types (such as typed pointers, structures, arrays,
...).

as this had been imagined, it would follow the x86-style "operator dest,
source" instruction format. the exact register handling would be handled by
a register allocator, and register names would consist of an id and some
letters giving the type.

for example (this stage of the idea):
push x0f4 ;push float vector to stack
div x5o, x3o ;divide 128 bit integers
mul x1f4, x7f4 ;SIMD multiply ('mulps')
sub x2g, x4g ;SSE subtract ('subsd')
pop x3g ;pop double into x3

push r1q ;push qword in r0
mov r2q, x6q ;mov qword in x6 to r2
...

a subsequent idea was then that I could further generalize things, and
switch over to a TAC+SSA based representation:
mov.i r16 3
mov.i r17 4
mov.i r18 2
mul r20 r16 r17
add r23 r18 r20

phi r24 r23 r26
L0:
mov.i r25 1
sub r26 r24 r25
cmp_lez r27 r26
jmp_t r27 L0
...

which would allow some more "interesting" features, and can be fairly easily
converted into the two-register form internally by using dependency analysis
(we can take note that, if a value is never used again, we will not see it
again, thus a "true" three-address form is rare, and will involve
duplicating the register).

in the existing RPN-compiler, much of this dependency tracking is an
implicit part of the stack machinery, since when we pop two items from the
stack, we can safely assume we will not see them again (thus, dup or similar
is needed if we want to keep the value around). internally, a two-oprand
register-based approach took hold, primarily because this was much easier to
work with (the stack serving, in large part, to push and pop abstract
literals and registers).

but, it seems a little silly to have an RPN-based language that compiles
into SSA based language.

may as well directly compile from syntax trees into SSA, as this would seem
to make more sense. however, doing this would end up with me rewriting damn
near the entire compiler (as opposed to just a decent chunk of it...).

technically, it would be a lot easier to stick with a two-operand form
(similar to x86), and sticking with RPNIL as the main IL (even simpler, is
to implement this lower portion more as a mass of library code that is
directly driven by the RPNIL compiler, which is a lot closer to the original
plan, and what I had originally started doing before all this came up).

however, TAC and SSA may make sense if they may later allow a lot more
features and optimizations (for example, constructions being, for whatever
reason, either not possible or extremely awkward with RPN).

or, I can do the basic library (serving as a codegen), and possibly later
write an SSA style frontend (and rework the upper compiler to target it
directly).

anyone have any thoughts?...

Rod Pemberton

unread,

Feb 23, 2008, 6:53:30 AM2/23/08

to

"cr88192" <cr8...@hotmail.com> wrote in message
news:c05b0$47bfeebb$ca83b482$97...@saipan.com...

> well, recently I came up with an idea, but it has turned into a bit of a
> mess on me...
>
> I have an existing C compiler I wrote before, and it is organized about
like
> this:
> Preprocessor;
> Parser;
> Upper Compiler, converts syntax trees to RPNIL;
> Lower Compiler, converts RPNIL to assembler;
> Assembler;
> Linker.
>

[wants/needs restructuring...., but very far along in project]
> anyone have any thoughts?...
>

Obviously, only you really know your code and the complexities of changing
it... But, at this point, I'd think you'd want to preserve RPNIL and
perhaps even concentrate it. Then, add SSA to allow you to select between
the two or gradually migrate away from RPNIL.

First, I think you should rework the compiler so that the upper compiler
_only_ "converts syntax trees to RPNIL."
Second, migrate the upper compiler's "easy-lifting" i.e., "basic
types-related processing, constant folding, some branch elimination", into
the "heavy-lifting" area of the lower compiler as RPNIL.
Third, now that all processing is done in RPNIL, insert an SSA processing
layer between the "heavy-lifting" area in RPNIL and the RPNIL to assembly.

I.e.,

(purified) Upper Compiler, *only* converts syntax trees to RPNIL
(modified, split) Lower Compiler part A, "heavy-lifting and merged upper
compiler easy-lifting" in RPNIL
(new) convert RPNIL to SSA
(new) perform SSA only transforms
(new) convert SSA to RPNIL
(kept, split) Lower Compiler part B, converts RPNIL to assembler

Now that you have both layers, if you want to gradually eliminate RPNIL, you
can. Or, you have your choice of which is better.

Rod Pemberton

cr88192

unread,

Feb 23, 2008, 7:55:32 AM2/23/08

to

"Rod Pemberton" <do_no...@nohavenot.cmm> wrote in message
news:fpp1at$dd7$1...@aioe.org...

> "cr88192" <cr8...@hotmail.com> wrote in message
> news:c05b0$47bfeebb$ca83b482$97...@saipan.com...
>> well, recently I came up with an idea, but it has turned into a bit of a
>> mess on me...
>>
>> I have an existing C compiler I wrote before, and it is organized about
> like
>> this:
>> Preprocessor;
>> Parser;
>> Upper Compiler, converts syntax trees to RPNIL;
>> Lower Compiler, converts RPNIL to assembler;
>> Assembler;
>> Linker.
>>
> [wants/needs restructuring...., but very far along in project]
>> anyone have any thoughts?...
>>
>
> Obviously, only you really know your code and the complexities of changing
> it... But, at this point, I'd think you'd want to preserve RPNIL and
> perhaps even concentrate it. Then, add SSA to allow you to select between
> the two or gradually migrate away from RPNIL.
>

ok.

> First, I think you should rework the compiler so that the upper compiler
> _only_ "converts syntax trees to RPNIL."

this is what it mostly does (none of the logic from the lower compiler
exists in the upper compiler, and the only real communication between them
is spewing everything into a buffer and passing it along).

originally, I had intended it to do more, making the compiler more
retargetable by easily swapping out the lower end of the compiler, but the
lower end ended up being large and complicated enough that it is no longer
reasonable...

> Second, migrate the upper compiler's "easy-lifting" i.e., "basic
> types-related processing, constant folding, some branch elimination", into
> the "heavy-lifting" area of the lower compiler as RPNIL.

not terribly reasonable, as a certain amount has to be done in the upper
compiler just to make the language hold together (C and similar are annoying
in this way).

some of the constructions (such as destructuring 'if' statements and loops,
can't really be sent much lower, since these will not map nicely to RPN
anyways...).

if/when I implement C++ or similar, likely a bit more will need to happen in
said upper compiler...

> Third, now that all processing is done in RPNIL, insert an SSA processing
> layer between the "heavy-lifting" area in RPNIL and the RPNIL to assembly.
>
> I.e.,
>
> (purified) Upper Compiler, *only* converts syntax trees to RPNIL
> (modified, split) Lower Compiler part A, "heavy-lifting and merged upper
> compiler easy-lifting" in RPNIL
> (new) convert RPNIL to SSA
> (new) perform SSA only transforms
> (new) convert SSA to RPNIL
> (kept, split) Lower Compiler part B, converts RPNIL to assembler
>

I am not sure what point there would be in going twice through RPNIL, since
it is high level enough that going back to it would probably eliminate most
of what could be done by going through SSA (RPNIL is at a similar level of
abstraction to the likes of MSIL/CIL, only that the compiler recieves
textual input rather than bytecode).

RPNIL was originally intended as a "general purpose" output for compilers
anyways, me reasoning that RPN was a good solid model, and easy to target (I
figured SSA would be much harder to target since one needs to keep track of
names and similar).

2 3 + 4 *

vs:
mov.i r0 2
mov.i r1 3
add r2 r0 r1
mov.i r3 4
mul r4 r2 r3

so, yeah, RPN has a clear advantage in this respect...

> Now that you have both layers, if you want to gradually eliminate RPNIL,
> you
> can. Or, you have your choice of which is better.
>

probably, it will go like this:
Syntax Trees (currently implemented as XML DOM trees) to RPNIL;
RPNIL to SSA;
further SSA based processing;
SSA to ASM.

as such, I could probably loosen up the SSA handling, making SSA be the
intermediate representation between these stages. then the question would be
one of using a textual or binary representation between stages (I am more
inclined towards textual...)

probable syntax:
likely line-oriented, and based on the "<opr> <args*>" layout (very easy to
parse);
types will be sticky, so registers will remember what they are (vs endlessly
having to restate them);
any conversions will need to be explicit, and operator types will need to
agree;
...

most forms with a target will look like:
<opr> <dest> <args*>

for example:
mul r2 r0 r1 //r2=r0*r1

>
> Rod Pemberton
>

Reinder Verlinde

unread,

Feb 23, 2008, 11:35:17 AM2/23/08

to

In article <c05b0$47bfeebb$ca83b482$97...@saipan.com>,
"cr88192" <cr8...@hotmail.com> wrote:

> well, recently I came up with an idea, but it has turned into a bit of a
> mess on me...
>
> I have an existing C compiler I wrote before, and it is organized about like
> this:
> Preprocessor;
> Parser;
> Upper Compiler, converts syntax trees to RPNIL;
> Lower Compiler, converts RPNIL to assembler;
> Assembler;
> Linker.
>
> so, the upper-compiler ends up being fairly sparse, mostly being a fairly
> dumb converter.
>
> the output, I call RPNIL, is a vaguely Forth or PostScript like language
> representing the intermediate version of the language.
>

> [lots of info removed]

>
> or, I can do the basic library (serving as a codegen), and possibly later
> write an SSA style frontend (and rework the upper compiler to target it
> directly).
>
>
> anyone have any thoughts?...

Many people have thoughts, but I am not sure what you are looking for.
What I miss in your description is what exactly you want to achieve.
Possible answers could be:

- getting to learn how to write a 'real' compiler chain
- getting a better C compiler
- getting a faster C compiler
- fixing known code-generation problems in your existing code

I guess that, in all cases, you will want to look at
<http://www.llvm.org/> and <http://clang.llvm.org/>. It might lead to
one or more of:

- ideas about organizing your code
- ideas about using LLVM as your intermediate SSA language
- ideas about ditching your code, and start contributing to LLVM

Reinder

cr88192

unread,

Feb 23, 2008, 4:15:19 PM2/23/08

to

"Reinder Verlinde" <rei...@verlinde.invalid> wrote in message
news:reinder-68E55E...@fl-69-69-134-194.sta.embarqhsd.net...

actually, the goal is for dynamic compilation and blurring the line between
compile time and runtime (basically, using C like we would many script
languages such as Python or Scheme, but still retaining the power and
performance of C).

for example, if I had lexical closures and eval in C (both eventual planned
features), then that would be worthwhile.

another recent goal/feature was the addition of machinery to aid with
Persistent Storage, DSM (Distributed Shared Memory), and code possible
mobility (AKA: watch as lexically-scoped code objects migrate over the
network).

...

> I guess that, in all cases, you will want to look at
> <http://www.llvm.org/> and <http://clang.llvm.org/>. It might lead to
> one or more of:
>
> - ideas about organizing your code
> - ideas about using LLVM as your intermediate SSA language
> - ideas about ditching your code, and start contributing to LLVM
>

I already know about LLVM, and though an interesting project, I don't
entirely agree with it in some ways, nor do I really believe everyone needs
to settle on the same implementation.

I also might have acted differently had I known about clang to begin with,
but once I learned about it, I figured, "well, I already have my own
implementation"...

also, it is written in C++, and follows an uber-centralized class-heirarchy
based design (aka: 'OOP'), which I personally find disagreeable as well (I
prefer a modular aproach, and not an inheritence-based approach).

after all, it is good if I can rip out and replace parts of a project as
needed, rather than having to view it as some integrated whole...

having a single integrated lower compiler hinders this, in making it to
where I can't really "swap out" things for other things...

to what are my ends, I also feel I better achieve my goals than LLVM does
for me (after all, I already have a decently working framework in < 1 year).

actually, I had also vaguely considered using the LLVM IR language as the
SSA syntax, but for the time being figured it would be too much of a hassle
to parse (allowing the low-level code generator to be swapped out could be a
useful feature though, esp since LLVM supports more targets than just x86
and x86-64, ...).

of course, right now I have some doubts as such that adding SSA will make
the code any simpler though (one of the original considerations for
considering splitting the upper and lower compilers). rather, it has merits
in its own right, but these merits are not necessarily simplicity...

my lower compiler is as such a 30 kloc mass of highly-interconnected code,
and a 10/20 or 15/15 split would be nice, but more likely, such a split (and
addition of SSA) would turn out as 20/25 or similar...

it may also make sense to have multiple possible backends around as well
(for example, I had once considered the possibility of a CPS-based RPNIL
backend, ...).

or such...

> Reinder

Phil Carmody

unread,

Feb 24, 2008, 7:09:15 AM2/24/08

to

"cr88192" <cr8...@hotmail.com> writes:
> actually, the goal is for dynamic compilation and blurring the line
> between compile time and runtime (basically, using C like we would
> many script languages such as Python or Scheme, but still retaining
> the power and performance of C).

Didn't Francois Bellard write tcc for that purpose?

Phil
--
Dear aunt, let's set so double the killer delete select all.
-- Microsoft voice recognition live demonstration

cr88192

unread,

Feb 25, 2008, 5:27:57 AM2/25/08

to

"Phil Carmody" <thefatphi...@yahoo.co.uk> wrote in message
news:87ir0e5...@nonospaz.fatphil.org...

> "cr88192" <cr8...@hotmail.com> writes:
>> actually, the goal is for dynamic compilation and blurring the line
>> between compile time and runtime (basically, using C like we would
>> many script languages such as Python or Scheme, but still retaining
>> the power and performance of C).
>
> Didn't Francois Bellard write tcc for that purpose?
>

I looked at tcc as well...

well, ok, I use a good deal of NIH practices (apparently almost my official
practice...).

one can learn things, have a good time, doing it oneself...

after all, one may not have much else of value in their lives...

robert...@yahoo.com

unread,

Feb 25, 2008, 6:04:32 PM2/25/08

to

The front-end/back-end split is usually such that there are no
lexical, syntactic or semantic issues left for the back end to deal
with. Almost universally it's considered a good idea to begin code
generation with no ambiguities left in the source (obviously excluding
anything that's defined to be ambiguous until runtime). If you've got
a significant amount of that sort of stuff left in your back end, you
may want to consider getting that moved up first.

cr88192

unread,

Feb 25, 2008, 7:46:45 PM2/25/08

to

<robert...@yahoo.com> wrote in message
news:558037f6-299d-462b...@j28g2000hsj.googlegroups.com...

<snip>

>
> The front-end/back-end split is usually such that there are no
> lexical, syntactic or semantic issues left for the back end to deal
> with. Almost universally it's considered a good idea to begin code
> generation with no ambiguities left in the source (obviously excluding
> anything that's defined to be ambiguous until runtime). If you've got
> a significant amount of that sort of stuff left in your back end, you
> may want to consider getting that moved up first.
>

there are no lexical or syntactic issues.
but, however, it does handle most of the semantics.

the reason is that I had started out with an abstract stack machine
(basically, vaguely PostScript style, where we have a stack and operations
that work on pretty much anything they are given), and during development,
ever more things were done as the codegen became more developed.

now, the thing is very complicated...

the idea is then to split this level up, so that the lower level deals
almost purely with architectural issues (register allocation and operations,
...), and the RPNIL compiler becomes more the "middle end" rather than the
"lower end".

so, the upper compiler:
does preprocessing and parsing;
does a some amount of AST level operations (folding constant expressions,
eliminating dead branches, ...);
converts ASTs into a more or less stack-based representation.

the 'RPNIL' compiler:
deals with said abstract stack-based computational model;
decomposes complex types and operations into basic architectural types (raw
pointers, memory, and register operations);
also handles things like type conversions, function calls, the type tower,
operator overloading, and inlining;
...

the 'VRM' layer:
does things like register allocation, figures out good ways to perform
register/memory and memory/memory operations;
emits assembler.

assembler and so on: ...

partly all this is because, originally, my compiler mutated out of an
interpreter for a dynamically typed scripting language of mine.

as such, everything RPNIL or lower, was originally handled in what was
originally the interpreter or JIT compiler (the major change was that of
going over to using a textual representation rather than raw bytecode,
originally due to issues that would have been involved trying to hack the
two layers together at the bytecode level due to representational
differences, it was much easier just to dump the output into a textual
buffer and re-parse in terms of the RPNIL compiler's bytecode...).

the upper compiler was originally a severely hacked-over version of the
script language's upper compiler (and, thus, operates at a similar level of
abstraction, which may not be perfectly well suited to a language like C,
but oh well...).

later on, this upper compiler was rewritten to use DOM trees instead of
S-Expressions (and be, in general, a little less hackish), but otherwise
does not do much differently than before.

or such...