Question on writing a Scheme compiler

Joel Smith

unread,

May 24, 2003, 7:37:51 PM5/24/03

to

Hello,

I have just started writing my own scheme compiler and vm as a learning
exercise and I have hit a bit of a problem. I can't figure out how to
optimize an expression, such as (+ 1 2), to use my vm's builtin %add opcode.

What I want is for the compiler to inline the operation by using the %add
opcode rather than have the overhead of calling the procedure +. Obviously
I could hard-code the compiler to inline expressions of the form (+ ...)
but that means that if the user redefines + the compiler will still
compile to %add.

How can I inline fixnum addition while still allowing all variables within
the environment to be settable. I have looked at other compilers which
seem to solve this problem but I can't figure out how they do it.

Any help would be gratefully received!

Happy Hacking,

Joel.

Jeffrey Mark Siskind

unread,

May 25, 2003, 12:32:15 AM5/25/03

to

There are two general approachs. One is to have declarations. The
other is to
have whole-program analysis. There are various forms of both.
Declarations
can take the form of:

- actual declarations like (declare (inline +))
- specially named variants of the procedures that are inlined like ($+
x y)
- compile switches like -inline+
- module systems
...

Whole-program analysis can make inferences like:

- there is no syntactic occurrence of (set! + ...) anywhere in the
program
- there are no reachable syntactic occurences of (set! + ...) anywhere
in the
program
- at any occurence of (+ ...) in the program the value of + must be
identical
to the value defined in the RnRS environment
- at any reachable occurrence of (+ ...) in the program the value of +
must be
identical to the value defined in the RnRS environment
- at any reachable occurrence of (+ ...) in the program where the
result of
that expression flows somewhere that affects the observable behavior
of the
program the value of + must be identical to the value defined in the
RnRS
environment
- at any reachable occurrence of (+ ...) in the program where the
result of
that expression flows somewhere that affects the observable
behaviour of the
program the value of + must produce results that do not affect the
observable
behaviour on all possible arguments that can occur at that program
point
...

In principle, there is no limit to the level of sophistication that
one can
bring to bear on deciding when to inline builtins. There are practical
limits
however. Sound decisions are possible when you perform whole-program
analysis.
But complete decisions, for all but the simplest of models are
undecidable.
So compilers do what is called conservative approximation, that is
being sound, but incomplete. This issue is just how conservative, how
approximate, and how incomplete.

There are many Scheme compilers that use declarations to inline
primitives.
The only Scheme compiler that I know of that uses whole-program
analysis to
inline primitives is Stalin. You can get Stalin from my home page. And
you can
get the paper Flow-Directed Lightweight Closure Conversion from my
home page
that discusses somewhat about how inlining is done.

Joe Marshall

unread,

May 27, 2003, 11:22:31 AM5/27/03

to

qo...@purdue.edu (Jeffrey Mark Siskind) writes:

> There are two general approachs. One is to have declarations. The
> other is to have whole-program analysis. There are various forms of
> both.

There is another approach that I might characterize as `optimistic
inlining'. Basically, you go ahead and inline the + procedure and
make a notation of what needs to be re-compiled should someone bash on
+. The major drawbacks and advantages to this approach should be
fairly obvious.