Chibi will always be a tiny Scheme implementation,
easy to use from C, with features added by building
on a small core.
However, I want a _fast_ Scheme implementation.
There's no reason we shouldn't be able to have an
implementation as fast as, say, OCaml, which is to
say faster than C in some cases. This doesn't have
to be Chibi, or even my own implementation, but
the other Scheme compilers out there are difficult
to work with. I don't think they have to be. Unfortunately
(or fortunately?) I don't have time to sit down and
write a full optimizing compiler from scratch (or rather
flesh out my other, unpublished compiler). I have a
busy job, and spend maybe half a day on the weekends
and an hour or two in the odd evening working on my
hobbies, which lately are mostly devoted to R7RS.
That means that to make progress I need a plan.
The plan is to apply the ship of Theseus approach to
Chibi, to turn it into a smart, optimizing native compiler. I will
replace individual components of Chibi one at a
time, such that each change is useful and complete
in itself, is a step towards the final goal, and Chibi
remains small and suitable as a C extension language
in the meantime. The major steps are:
1. Simple native (x86) backend. A compiler setting
enables compiling directly to machine code instead
of to the VM (i.e. ahead of time, not a JIT). The
infrastructure for this is complete, and some simple
examples work, so I hope to get this published soon.
This is the basis for all future speed - it's just not
worth spending that much time tuning the VM if we
can go native. This will be a very naive compiler,
with greedy register allocation and no peephole
optimizations.
2. Rewrite the x86 backend in Scheme. This will be
a separate module, bootstrapped by the version above.
This will make it easier to include low-level optimizations
such as peephole optimizations and smart register
allocation.
3. Rewrite the frontend compiler in Scheme. This will
also be a separate module, bootstrapped by the C
compiler. This will make mid-level optimizations and
macro extensions easier.
4. (optional) Rewrite most/all of the runtime in Scheme.
Use a restricted language or special optimizations to rewrite
the GC.
5. (optional) Write an ELF module and hooks needed to
generate host OS libraries and executables. At this point
we're self-hosting - we just need the C Chibi for bootstrapping.
The C-generated library and the Scheme-generated library
are both usable directly from C programs. The VM is
still usable (crucial for non-x86 platforms until we add
new native backends).
6. (dream) Rewrite all the libc calls in Scheme, and
use them to build an OS.
In parallel with these steps I'll continue to work on high-level
optimizations, which work directly with the AST and are
thus applicable to both the VM and the native compiler.
There are, very broadly speaking, five tiers of Scheme
compiler speeds:
* fast native compilers: Ikarus, Larceny, Gambit, Bigloo,
MIT Scheme, Stalin, Racket
* slower native compilers: Chicken
* fast VMs: Gauche
* VMs: just about everything else, including Chibi
* tree interpreters: TinyScheme, Scheme9
Yes, compared to other native compilers Chicken lags
noticeably (though it probably has the fastest call/cc).
Yes, compared to other VMs Gauche stands out from
the pack.
At 0.4 Chibi was lagging at the slow end of the VMs,
but I hope to shift it to the high end soon. After
step 1 I hope for it to reach Chicken speeds, after
step 2 it should more closely match the fast native compilers,
and with step 3 it should be better at "smart" optimizations.
--
Alex
The important part:
"A compiler setting enables compiling directly to machine code"
That is, if you compile
make CPPFLAGS=-DSEXP_USE_NATIVE_X86
then you get the native compiler, otherwise you get the
VM. This is already how it works - vm.c is only compiled
in conditionally, and with the above setting you get x86.c
(which works for simple examples but isn't checked in yet).
In both cases the compiler (eval.c) is shared.
So the VM will always be supported.
--
Alex
> 1. Simple native (x86) backend. A compiler setting enables
> compiling directly to machine code instead of to the VM (i.e.
> ahead of time, not a JIT). The infrastructure for this is complete,
> and some simple examples work, so I hope to get this published
> soon. This is the basis for all future speed - it's just
> not worth spending that much time tuning the VM if we can go
> native. This will be a very naive compiler, with greedy register
> allocation and no peephole optimizations.
How does this work? Are definitions compiled as they are entered, or do
you compile everything available (that is not already compiled) when an
expression is evaluated?
> 2. Rewrite the x86 backend in Scheme. This will be a separate
> module, bootstrapped by the version above. This will make it
> easier to include low-level optimizations such as peephole
> optimizations and smart register allocation.
I don't understand your claim (snipped here) that this step will make
things much faster. The cost of compilation has to be added to the cost
of execution. Why would this step-2 compiler, whether compiled by the
step-1 compiler or by itself, be expected to run faster than the step-1
compiler, which is written in hand-optimized C? Certainly it'll be
easier to maintain, but I would expect that the slower compilation would
tend to outweigh the faster execution. This is why many JITs have two
compilers, a faster one that generates mediocre code, used normally, and a
slower one that generates good code, used only when it's apparent that the
mediocre code is executing too often.
--
We pledge allegiance to the penguin John Cowan
and to the intellectual property regime co...@ccil.org
for which he stands, one world under http://www.ccil.org/~cowan
Linux, with free music and open source
software for all. --Julian Dibbell on Brazil, edited
It works the same way the VM works - definitions are compiled
as they are entered.
I'll worry about whole-module optimizations later.
>> 2. Rewrite the x86 backend in Scheme. This will be a separate
>> module, bootstrapped by the version above. This will make it
>> easier to include low-level optimizations such as peephole
>> optimizations and smart register allocation.
>
> I don't understand your claim (snipped here) that this step will make
> things much faster. The cost of compilation has to be added to the cost
> of execution.
Indeed, it's likely to make compilation slower, but it
will make it easier to generate faster code. In particular,
the register allocation and peephole optimizations will
be written in Scheme.
--
Alex