For the past few months, I've been working on a from-scratch
implementation of Racket's macros, modules, and top-level. If you're
interested, see the "linklet" branch of
https://github.com/mflatt/racket
The word "linklet" refers to the simplified (relative to modules)
notion of compilation, linking, and evaluation that's built into the
revised runtime system. A `linklet` is almost a `lambda`, but it
imports and exports variables instead of values. To put it another way,
a `linklet` is almost a `unit`, except that it doesn't support mutual
dependencies, and its compilation protocol supports cross-linklet
optimization (for cross-module optimization).
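
As a rough illustration of "variables instead of values", here is a toy
model in plain Racket. It is not the linklet API from the branch: the
representation (a box per variable, a hash table per instance) and every
name in it (`counter-linklet`, `step`, `count`, `bump!`) are assumptions
made up for this sketch.

```racket
#lang racket/base

;; Toy model only: a "variable" is a mutable box, and an "instance" is a
;; hash table from names to boxes. None of these names come from the
;; linklet branch.

;; A linklet-like procedure: it takes an instance to import from and
;; returns a fresh instance holding the variables it exports.
(define (counter-linklet imports)
  (define step (hash-ref imports 'step))   ; imported variable (a box)
  (define count (box 0))                   ; exported variable (a box)
  (define (bump!)
    ;; Reads the imported variable's current content on every call.
    (set-box! count (+ (unbox count) (unbox step))))
  (hasheq 'count count 'bump! (box bump!)))

;; "Linking": build the import instance, then instantiate.
(define step-instance (hasheq 'step (box 1)))
(define counter (counter-linklet step-instance))

((unbox (hash-ref counter 'bump!)))            ; count becomes 1
(set-box! (hash-ref step-instance 'step) 10)   ; mutate the imported variable
((unbox (hash-ref counter 'bump!)))            ; count becomes 11
(unbox (hash-ref counter 'count))
```

Because the import is a variable rather than a value, the second call to
`bump!` sees the updated `step`. A plain `lambda` that received the number
1 as an argument would not.
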
The macro expander, all module handling, and support for top-level
evaluation are implemented in Racket on top of linklets. That part is
in the "pkgs/expander" directory:
https://github.com/mflatt/racket/tree/linklet/pkgs/expander
The expander can build itself, and a tool in the "expander" directory
can extract the expander's implementation into a single linklet, which
is then embedded into the Racket runtime.
In the new implementation, 20k lines of well-organized Racket code
replace 35k lines of less organized C code. The expander's embedded
bytecode is roughly the same size as the compiled C code that it
replaces.
Although this new implementation works well enough to build the Racket
distribution, run DrRacket, and pass the core Racket tests, its
performance is not yet good enough to replace the current
implementation. Roughly, the new implementation uses 1.5x the memory and
takes 1.5x to 3x as long to expand/compile/build programs. In absolute
terms, that seems pretty good for a 3-month-old, from-scratch
reimplementation of about 15% of `racket`, but it's not good enough to
impose on Racket users.
I'll continue trying to get this variant of Racket into shape.
Meanwhile, I enthusiastically welcome anyone who is interested in
helping to improve this expander and its performance. Bug reports are
welcome.
The new implementation of the expander is hopefully much easier to read
and modify than the old one, not just because it's written in Racket,
but because it's better organized. As an orientation, start with the
original repo,
https://github.com/mflatt/expander
which has "pico", "micro", "mini", and "demi" branches that build up to
the full expander. (Unfortunately, the jump from "demi" to the full
expander has become especially large.)
In case anyone gets as far as investigating performance, here are some
things to note relative to the old implementation:
* The old expander and compiler front-end are fused. When you use
`eval` or `compile`, no fully expanded form is actually
generated; instead, bits of expanded code are converted to an
intermediate compiler representation as expansion proceeds, and that
fusion of expansion and compilation might be a significant shortcut.
* The new `compile` or `eval` not only fully expands a form before
compiling, it has an extra layer of compilation to convert from
modules (or top-level evaluation) to linklets.
* If you use `expand` on a module with the old expander, then it
always compiles the module body as well as expanding it. Compilation
is needed when the module contains a submodule that requires the
enclosing module, and so the old `expand` always compiles, just
in case. As a result, `expand` on a module can sometimes be faster
with the new implementation, because the new implementation is
lazier about compiling a module body.
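
For anyone who wants to poke at these differences, a minimal sketch along
the following lines times `expand` and `compile` separately on the same
module form. The module `m` and its binding `x` are made up for
illustration, and a real measurement would need a much larger module and
repeated runs.

```racket
#lang racket/base

;; Hypothetical example module; `m` and `x` are made up for illustration.
(define module-form
  '(module m racket/base
     (provide x)
     (define x (+ 2 3))))

(define result
  (parameterize ([current-namespace (make-base-namespace)])
    ;; Give the datum the namespace's lexical context before expanding.
    (define stx (namespace-syntax-introduce (datum->syntax #f module-form)))
    (define expanded (time (expand stx)))   ; returns a fully expanded syntax object
    (define compiled (time (compile stx)))  ; returns a compiled form
    (eval compiled)                         ; declares module `m`
    (list (syntax? expanded)
          (compiled-expression? compiled)
          (dynamic-require ''m 'x))))

result
```

Running the same file under the old and new implementations gives a crude
per-phase comparison. It does not by itself say which of the factors in
the list is responsible for a difference.
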
These differences in structure account for some of the initial
performance differences. It's not clear how much the overall
performance difference depends on these factors, or how much it depends
on (or can be compensated by) other factors like different data
structures, algorithms, and the base performance of C versus Racket. I
can't help hoping that I've done something dumb in terms of performance
--- maybe the same dumb thing in the old and new expanders --- where
others will see and fix the problem just as soon as the code is
readable enough.