Ahead-of-time compilation (as well as tree-shaking, etc.) for Forth

Alex Dowad

unread,

Jul 7, 2016, 3:49:00 PM7/7/16

to

Greetings all,

I'm sure there is a wealth of information on this topic out there on the web, but my Google power may not be strong enough. If anyone could be so kind as to give me a few links to quality material, that would be appreciated.

I've built a Forth implementation and am interested in getting it to compile Forth programs to standalone binaries. The binaries should not go through the process of building up word definitions and adding them to the dictionary at run-time; all that should happen at compile-time (including executing immediates like IF), and just the completed word definitions should go into the output binary.

Better yet is if only the word definitions which are actually needed go into the output. The dictionary itself should also be omitted, meaning that words like FIND cannot be used at run-time (only at compile-time, either at the top level or in an immediate word).

Better still is if the output is fully relocatable and thus can be linked freely with libraries (which might be written in a different language). It turns out this is non-trivial to do, unless you make some undesirable compromises:

Fixing up the addresses in the code field of a word's definition (this is DTC) and making them relocatable is trivial. Fixing up BRANCHes (which could possibly cross over into a different word's code field) is also more or less trivial.

Dealing with ALLOTed addresses may be trickier, depending on what compromises you want to make. VARIABLE, VALUE, ARRAY, and so on all rely on ALLOT, compiling an ALLOTed address as the argument of LIT. It would be best if they work seamlessly at both compile-time and run-time.

It is easy enough to record how much memory is ALLOTed and arrange for an equivalent amount of memory to be reserved at run-time. But since we want the code to be relocatable, either you must emit a relocation for every place where an ALLOTed address appears in a word definition, or significant extra indirection (and cost) will be incurred each time those addresses are accessed at runtime.

Again, it's easy to add a special primitive which tells the compiler to emit a relocation, and use that primitive in VARIABLE, VALUE, DEFER, etc. It's unfortunate that ALL words, including custom ones, which include addresses will have to use that special primitive, but that's not so bad.

What is much trickier is that ALLOTed addresses may also be stored in a VARIABLE, an ARRAY, a table of some kind, etc. If you allow VARIABLEs to be initialized at compile-time, and then carry the values through to run-time by storing the values in the binary, those values could be pointers. Those pointers will not be valid at run-time, unless they are relocated with the rest of the code.

But when looking at raw data, it's very hard to know what is and is not a pointer. Unless the programmer writes some special incantation every time (s)he stores a pointer somewhere, in the general case, the compiler will not be able to identify all the pointers in pre-initialized data and emit relocations for all of them.

A similar problem exists with tick; it's not hard to deal with written inside a colon definition, but very hard to deal with when exec tokens are stored in arbitrary places and used at run-time.

A few possible solutions:

- No pre-initialized data. All data (VARIABLEs, ARRAYs, etc etc) must all be initialized at run-time, using code inside a colon definition. Perhaps they can also be used at compile-time, but the values do not carry through.
- Abandon the idea that the output should be relocatable. Emit code which only works when loaded at a specific address.
- Add an offset to addresses when accessing them at run-time from within @, etc. The offset will have to be kept in a global variable. Likewise in EXECUTE.
- Abandon the idea that the same VARIABLE should work at both compile-time and run-time. Use a completely different word, with different semantics.
- ???

I'm sure I'm not the first person to run into these difficulties. What are other Forth AOT compilers "out there" doing? (My guess: probably emitting non-relocatable code.)

Thank you,
Alex Dowad

Gerry Jackson

unread,

Jul 7, 2016, 4:25:55 PM7/7/16

to

I believe one solution to identifying addresses, although I haven't
tried it myself, is to compile the program twice to different addresses
and compare the two results. As they are compiled to different addresses
anything different between the two must be an address (unless you're
compiling time varying data which is, presumably, unlikely).

--
Gerry

Alex Dowad

unread,

Jul 7, 2016, 4:48:47 PM7/7/16

to

On Thursday, 7 July 2016 22:25:55 UTC+2, Gerry wrote:
> I believe one solution to identifying addresses, although I haven't
> tried it myself, is to compile the program twice to different addresses
> and compare the two results. As they are compiled to different addresses
> anything different between the two must be an address (unless you're
> compiling time varying data which is, presumably, unlikely).
>
> --
> Gerry

Wow, that is creative.

Rod Pemberton

unread,

Jul 7, 2016, 7:47:57 PM7/7/16

to

On Thu, 7 Jul 2016 12:48:58 -0700 (PDT)
Alex Dowad <alexin...@gmail.com> wrote:

> I'm sure there is a wealth of information on this topic out there on
> the web, but my Google power may not be strong enough. If anyone
> could be so kind as to give me a few links to quality material, that
> would be appreciated.

I'm probably not going to be able to help you that much. However, I
have questions the answers to which may help others to help you.

> I've built a Forth implementation and am interested in getting it to
> compile Forth programs to standalone binaries.

Are these binaries to have a built-in ITC or DTC interpreter, or
are they without an interpreter as STC? (answered below)

> Better yet is if only the word definitions which are actually needed
> go into the output.

So, you're wanting to specify a Forth word to use as a stand-alone
program, have that word's code traced all the way down to primitives,
and have the correct sequence of primitives emitted as binary code?

> Better still is if the output is fully relocatable and thus can be
> linked freely with libraries (which might be written in a different
> language). It turns out this is non-trivial to do, unless you make
> some undesirable compromises:

So, you're not wanting a built-in Forth address interpreter for ITC/DTC
code, but you're actually wanting compiled code like a C compiler
produces? (This would be STC in Forth parlance.)

You might consider going the Forth-to-C translation method and then
compile the C code. There are a couple of standalone programs which do
this.

Isn't there a Forth word like SEE to trace words? Can't you
redefine : colon and ; semicol to emit info, binary code, or C code?
And, can you add a bit, like for immediacy, to indicate a word trace
instead of interpretation or compilation?

Rod Pemberton

Elizabeth D. Rather

unread,

Jul 7, 2016, 8:57:48 PM7/7/16

to

On 7/7/16 9:48 AM, Alex Dowad wrote:
> Greetings all,
>
> I'm sure there is a wealth of information on this topic out there on the web, but my Google power may not be strong enough. If anyone could be so kind as to give me a few links to quality material, that would be appreciated.
>
> I've built a Forth implementation and am interested in getting it to compile Forth programs to standalone binaries. The binaries should not go through the process of building up word definitions and adding them to the dictionary at run-time; all that should happen at compile-time (including executing immediates like IF), and just the completed word definitions should go into the output binary.
>
> Better yet is if only the word definitions which are actually needed go into the output. The dictionary itself should also be omitted, meaning that words like FIND cannot be used at run-time (only at compile-time, either at the top level or in an immediate word).
>
> Better still is if the output is fully relocatable and thus can be linked freely with libraries (which might be written in a different language). It turns out this is non-trivial to do, unless you make some undesirable compromises:

Pretty much all these capabilities have been implemented at various
times. Most OS-hosted Forths (such as SwiftForth) can call external
libraries and generate bootable binaries.

On most OS-hosted platforms (e.g., PCs) there is so much address space
available that they don't bother stripping unused words from the
binaries. However, interactive cross-compilers such as SwiftX include a
facility for doing this when generating a ROMable or flashable target.

I recommend that you download trial versions of SwiftForth and SwiftX
and read the documentation. You may also benefit from looking at the
proposed Cross-compiler standard
http://newsgroups.derkeiler.com/Archive/Comp/comp.lang.forth/2007-06/msg00283.html
to see the how the words managing read-only and writable memory and
other issues can be handled.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Cecil Bayona

unread,

Jul 7, 2016, 9:09:33 PM7/7/16

to

Dead links, you get a 404 error.

--
Cecil - k5nwa

Elizabeth D. Rather

unread,

Jul 7, 2016, 10:35:18 PM7/7/16

to

So sorry! Try again now.

Alex Dowad

unread,

Jul 8, 2016, 12:02:11 AM7/8/16

to

On Friday, 8 July 2016 01:47:57 UTC+2, Rod Pemberton wrote:
> > I've built a Forth implementation and am interested in getting it to
> > compile Forth programs to standalone binaries.
>
> Are these binaries to have a built-in ITC or DTC interpreter, or
> are they without an interpreter as STC? (answered below)

My preferred strategy is to emit DTC, with a little startup stub which sets up the registers used as data stack and retstack pointers, as well as that used for a Forth instruction pointer, and then executes NEXT to kick things off. The entry point of the binary will point to that startup stub. It will also add a little bit of exit code, and the startup stub will push the location of that exit code onto the retstack when starting, so the program exits gracefully.

> > Better yet is if only the word definitions which are actually needed
> > go into the output.
>
> So, you're wanting to specify a Forth word to use as a stand-alone
> program, have that word's code traced all the way down to primitives,
> and have the correct sequence of primitives emitted as binary code?

Almost; specify a Forth word to use as a stand-alone program, and emit code for it and all words which it (recursively) uses, except for specific words which are marked as having an external definition (which must be supplied by linking against other object files).

> > Better still is if the output is fully relocatable and thus can be
> > linked freely with libraries (which might be written in a different
> > language). It turns out this is non-trivial to do, unless you make
> > some undesirable compromises:
>
> So, you're not wanting a built-in Forth address interpreter for ITC/DTC
> code, but you're actually wanting compiled code like a C compiler
> produces? (This would be STC in Forth parlance.)

No, I'd like to emit DTC code, but also emit all the needed ELF relocations so that a standard linker (not one specifically designed for Forth) can use that code.

Thanks, AD

Lars Brinkhoff

unread,

Jul 8, 2016, 2:03:53 AM7/8/16

to

Alex Dowad writes:
> I'm sure there is a wealth of information on this topic out there on
> the web, but my Google power may not be strong enough. If anyone could
> be so kind as to give me a few links to quality material, that would
> be appreciated.

Some of what you ask about is routinely done by typical Forth
metacompilers/cross compilers/target comppilers.

Here are some articles on the subject in Forth Dimensions:

Volume 4, issue 6, page 19: A Techniques Tutorial; Meta Compiling I.
Volume 5, issue 2, page 23: Techniques Tutorial: Meta Compiling II.
Volume 5, issue 3, page 31: Techniques Tutorial: Meta Compiling III.
Volume 12, issue 2, page 31: Metacompile by Defining Twice.
Volume 12, issue 6, page 31: Metacompilation Made Easy.
Volume 14, issue 3, page 13: Principles of Metacompilation, Part One.
Volume 14, issue 4, page 14: Principles of Metacompilation, Part Two.
Volume 14, issue 5, page 26: Principles of Metacompilation (III).
Volume 19, issue 5, page 9: Easy Target Compilation.
Volume 19, issue 6, page 9: Building a Remote Target Compiler.

Get them here:
http://www.forth.org/fd/contents.html

Paul Rubin

unread,

Jul 8, 2016, 3:05:47 AM7/8/16

to

Alex Dowad <alexin...@gmail.com> writes:
> Almost; specify a Forth word to use as a stand-alone program, and emit
> code for it and all words which it (recursively) uses

If the program is written in a style that uses xt's extensively then
it's hard to tell what calls what.

Alex Dowad

unread,

Jul 8, 2016, 3:42:38 AM7/8/16

to

That's a very good point, Paul... after having written that post, I realized the same thing (in the shower, as is apt to happen).

Lars Brinkhoff

unread,

Jul 8, 2016, 3:57:27 AM7/8/16

to

You may want to include all ticked words as additional entrypoints.

Chris Curl

unread,

Jul 8, 2016, 8:49:13 AM7/8/16

to

If you can build your system to use relative address offsets, instead of
absolute offsets, then it seems to me that it should "just work".

Or you could take an approach similar to Java where it builds pseudo code
and then creates the real code when it loads. Of course, there is some
startup overhead.

I have been experimenting with the a different approach. I have built a
virtual Forth machine. My interactive development work is done 100%
in that virtual environment. I have a built a disassembler for my Forth
VM binary that can create out assembler code for my VM. To me, that approach
lends itself to being able to be able take my representation of the
machine and do whatever I want with it. The idea is that I can then
build different disassemblers for my VM that spit out assembler source
code (or binary code) for whatever target system I want to build for.

For what it's worth, my project is here. It is a work in progress.

https://github.com/CCurl/Forth_C

Chris Curl

unread,

Jul 8, 2016, 8:58:38 AM7/8/16

to

Of course, the output of the disassembler could also be some other
language that could be used to generate binaries for the target system.

If nothing else, it is a "thinking outside the box" approach that might
spark some ideas ...

I would be interested to hear how you solve your issue.

Alex Dowad

unread,

Jul 8, 2016, 9:53:25 AM7/8/16

to

Dear Chris, you are very right about using relative offsets.

Compiling to bytecode and from there to machine code is fine, but you still need to extract certain information to generate efficient machine code, such as which values will be used as memory addresses. Whether the conversion happens in one step or two, that doesn't change.

hughag...@gmail.com

unread,

Jul 8, 2016, 10:11:16 AM7/8/16

to

On Friday, July 8, 2016 at 5:49:13 AM UTC-7, Chris Curl wrote:
> If you can build your system to use relative address offsets, instead of
> absolute offsets, then it seems to me that it should "just work".

That is what I'm doing in Straight Forth. It has a 64-bit cell (numbers have an assumed 2^32 unity). Addresses are 32-bit however --- they aren't really addresses though --- they are offsets from a base-pointer.

Richard Owlett

unread,

Jul 8, 2016, 10:31:56 AM7/8/16

to

On 7/7/2016 9:35 PM, Elizabeth D. Rather wrote:
> [snip]

>>>
>>> I recommend that you download trial versions of SwiftForth and
>>> SwiftX
>>> and read the documentation. You may also benefit from looking
>>> at the
>>> proposed Cross-compiler standard
>>> http://newsgroups.derkeiler.com/Archive/Comp/comp.lang.forth/2007-06/msg00283.html
>>>
>>>
>>> to see the how the words managing read-only and writable
>>> memory and
>>> other issues can be handled.
>>>
>>> Cheers,
>>> Elizabeth
>>>
>>>
>> Dead links, you get a 404 error.
>
> So sorry! Try again now.
>
> Cheers,
> Elizabeth
>

http://www.forth.com/downloads/ANS/XCapp5.doc still gives 404 error.
But http://www.forth.com/downloads/ANS/XCapp5.pdf DOES work.

I hadn't been aware of
http://newsgroups.derkeiler.com/Archive/Comp/comp.lang.forth
Thanks for that link.

Alex Dowad

unread,

Jul 8, 2016, 10:50:12 AM7/8/16

to

Thanks, this is interesting!

I am thinking that I may _just emit code which only works when loaded at a specific address_.

I'm writing a book which shows how to build a Forth interpreter, then a compiler, then use the compiler to build a simple (POSIX-ish) OS kernel and minimal userspace. The bootloader is just asm, not Forth, but will be included too.

I do have a working Forth compiler written in Ruby, and by linking in different implementations of the primitives, can produce binaries which either run on Linux, or (when stripped of ELF headers) bare-metal. But it's ugly, and has undesirable compromises, and definitely isn't anything one would want to read about in a book.

That "compiler" is actually a linker as well, and you have to tell it the external libraries to link against. I was thinking it would simplify things to use GNU ld instead.

The overriding goal for my 2nd compiler is simplicity and ease of understanding. Emitting code for a fixed address will help achieve that. I can still use GNU ld with a linker script to make it locate Forth object files at the addresses they were compiled for.

Thanks all, and if you have any further ideas, please post. I am looking forward to digging into the back issues of Forth Dimensions which Lars Brinkoff kindly listed.

AD

Cecil Bayona

unread,

Jul 8, 2016, 11:17:46 AM7/8/16

to

Charles Childers built retro Forth in a similar manner, generating code
for a virtual Forth CPU, once you build a virtual machine for a
particular CPU then the generated Forth Virtual Code will work on any of
them. He has virtual machines for a variety of CPUs available. A virtual
machine is simple compared to the development system so moving to a new
CPU is much easier.

< http://forthworks.com/retro/ >

--
Cecil - k5nwa

Chris Curl

unread,

Jul 8, 2016, 11:34:03 AM7/8/16

to

Interesting. I have no delusions about being the first one to do something
like this. And the performance of my little VM is adequate for how I am using
it ... at least for now. But it is certainly not nearly as fast as a system
running in the machine's native CPU instruction set.

The performance consideration is why I am trying to set mine up in such a way
that I will be able to write a disassembler for the VM that translates my VM
byte code into code that can be compiled into a native program for whatever
platform I want to target.