I have some questions about compiler backends and target languages,
specifically regarding the Forth programming language.
While crawling the web, I was unable to find a compiler that uses
Forth as its target language. But in my opinion, compiling a
higher-level language into Forth code is a good compromise: Forth is a
standardized, widespread lower-level programming language, and it is
platform independent. There are even CPUs which are capable of
executing Forth directly.
So, does anyone know about a compiler-project or similar software that
uses Forth as its destination? And if not - would it be wrong to compile
code into sequences of Forth definitions and words? Why?
Maybe I'm just looking too "foolish Forthy" into this topic. At least,
it's a simple, stack-based virtual machine which is needed to execute a
program in a particular (maybe self-defined) lower level language a
compiler compiles to.
Is my question only a different look on a well-known and widely used
philosophy of program execution, or is it quite legitimate?
Thanks for all replies in advance!
Regards,
Jan
> While crawling the web, I was unable to find a compiler that uses
> Forth as its target language.
Forth is quite different from other languages. The typical
implementation would be a translator for the source language, written in
Forth itself. IIRC there exist tiny-C compilers/interpreters written in
Forth.
DoDi
FORTH proper is generally not used.
however, loosely FORTH-like backends are, actually, fairly common...
for example, I use a backend which was, to some extent, inspired by
PostScript...
likewise, both the Java VM and the .NET VM use RPN-based bytecode formats.
> Maybe I'm just looking too "foolish Forthy" into this topic. At least,
> it's a simple, stack-based virtual machine which is needed to execute a
> program in a particular (maybe self-defined) lower level language a
> compiler compiles to.
>
> Is my question only a different look on a well-known and widely used
> philosophy of program execution, or is it quite legitimate?
there are varying levels of "FORTH-ness" in various RPN-based backends...
for example, JBC (Java ByteCode) and MSIL/CIL (the bytecode used in .NET),
are both based around RPN.
granted though, these variants typically make "amends" to allow efficient
compilation, for example, it is common to use direct labels and jumps, and
to restrict the way the stack is used.
for example:
in JBC, both the source and destination of a jump are required to have the
same stack layout.
AFAIK, MSIL does not allow a jump with items still on the stack (everything
needs to be forced out to variables). (I may be wrong on this point).
my IL, OTOH, while still not allowing items to remain on the stack, makes
use of a "union" feature to allow merging stack-items (from several control
paths) together into a single target stack item (basically, it is analogous
to the phi operation in SSA).
however, I may eventually begin either to phase out this approach (using a
temporary variable instead), or I "could" implement the JBC approach, but
this is less likely. I have worked out "better" ways to approach JBC, as
discovered in a recent effort to compile JBC to C, where most likely it
would be first compiled into TAC (AKA: 3-Address form) or SSA form... (from
the design, SSA would likely be "easy", but I actually use plain TAC, as
this maps better to C).
note that full unrestricted RPN (such as is found in plain Forth or
PostScript) would be difficult to work with or optimize, and would place
notable restrictions on the way the code may be compiled or run... (in
particular, it would likely require the use of multiple stacks, as well as a
custom calling convention).
restricting the ways the stack is used, however, allows efficient
compilation (but, at the cost of disallowing certain kinds of
constructions).
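As a rough illustration of the kind of restriction described above (the
JBC-style rule that every path reaching an instruction must arrive with the
same stack layout), here is a minimal sketch in Python. The instruction set,
the STACK_EFFECT table and the function name are all hypothetical, invented
for this example, and the check only tracks stack depth, not the types of
the stack items that a real verifier would also compare.

```python
# Hypothetical mini stack IL: each opcode maps to (items popped, items pushed).
STACK_EFFECT = {
    "push": (0, 1),  # push a constant
    "add": (2, 1),   # pop two values, push their sum
    "jmp": (0, 0),   # unconditional jump to the index in arg
    "jz": (1, 0),    # pop a value, jump to arg if it is zero
    "ret": (0, 0),   # end of the routine
}


def check_stack_depths(program):
    """Return the stack depth on entry to each reachable instruction.

    program is a list of (opcode, arg) tuples. Raises ValueError if two
    control paths reach the same instruction with different depths, if an
    instruction would underflow the stack, or if control runs off the end.
    """
    depth_at = {}
    worklist = [(0, 0)]  # (instruction index, stack depth on entry)
    while worklist:
        pc, depth = worklist.pop()
        if not 0 <= pc < len(program):
            raise ValueError(f"control flow leaves the program at {pc}")
        if pc in depth_at:
            if depth_at[pc] != depth:
                raise ValueError(
                    f"stack depth mismatch at {pc}: "
                    f"{depth_at[pc]} vs {depth}"
                )
            continue  # already visited with a consistent depth
        depth_at[pc] = depth
        op, arg = program[pc]
        pops, pushes = STACK_EFFECT[op]
        if depth < pops:
            raise ValueError(f"stack underflow at {pc}")
        after = depth - pops + pushes
        if op == "ret":
            continue
        if op == "jmp":
            worklist.append((arg, after))
        elif op == "jz":
            worklist.append((arg, after))     # branch taken
            worklist.append((pc + 1, after))  # fall through
        else:
            worklist.append((pc + 1, after))
    return depth_at


# Both branches leave exactly one item on the stack before index 5.
good = [
    ("push", 1), ("jz", 4),
    ("push", 10), ("jmp", 5),
    ("push", 20),
    ("ret", None),
]

# The fall-through path pushes an extra item before the merge at index 3.
bad = [("push", 1), ("jz", 3), ("push", 10), ("ret", None)]

good_depths = check_stack_depths(good)
try:
    check_stack_depths(bad)
    bad_error = None
except ValueError as exc:
    bad_error = str(exc)
```

With a rule like this in place, the depth at every instruction is known
statically, which is what lets a code generator map stack slots onto
registers or temporaries ahead of time.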
other comments:
RPN is a very convenient representation for upper-end compiler output, and
is also a relatively powerful, simple, and general-purpose representation.
however, it is not the ideal representation for (directly) producing
compiled code (at least, with current processors and calling conventions),
which demands a different model.
as a simple example, go and look at the SysV / AMD64 calling convention, and
an RPN-based codegen, and see if you can spot the problem...
a more abstract form of this model is SSA, however, explicit use of SSA is
something I have not been able to do effectively thus far. however, SSA'isms
have steadily been working their way into the rest of my compiler, such
that, although the input itself remains as RPN, RPN plays a diminishing role
in most of the actual code-generation process (basically, the RPN aspect
becomes increasingly virtualized).
so, I have yet to be able to make a "leap", but the transitions happen a
step at a time...
as I see it, however, RPN will likely remain a convenient representation for
producing upper-end compiler output, which may keep it as being a reasonable
representation, even despite SSA-form being used for the actual code
generation.
IME, RPN can typically be produced by using a straightforward process to
unwind the ASTs, whereas directly producing SSA from ASTs would seem to
require "a little more work"... (and, also "some algo" which I am not
presently able to imagine...).
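The "straightforward process to unwind the ASTs" mentioned above can be
sketched as a post-order walk: emit the operands first, then the operator.
This is only a minimal sketch in Python; the node classes (Num, Var, BinOp)
and the function name to_rpn are hypothetical and not taken from any
compiler discussed in this thread.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


Node = Union[Num, Var, BinOp]


def to_rpn(node):
    """Flatten an expression tree into a list of RPN tokens (post-order)."""
    if isinstance(node, Num):
        return [str(node.value)]
    if isinstance(node, Var):
        return [node.name]
    if isinstance(node, BinOp):
        # Operands first, operator last: this is all RPN emission needs.
        return to_rpn(node.left) + to_rpn(node.right) + [node.op]
    raise TypeError(f"unknown node type: {type(node).__name__}")


# (a + 2) * b
tree = BinOp("*", BinOp("+", Var("a"), Num(2)), Var("b"))
rpn_text = " ".join(to_rpn(tree))  # "a 2 + b *"
```

No temporaries or register names have to be invented during the walk, which
is one reason RPN is cheap to produce from a front end.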
hell, maybe RPN has bent my mind somehow that I can't really see "the SSA
way of doing things", I don't know...
similarly, RPN is likely to be a much better general representation for
things like portable bytecode, ... than would be something like SSA (where
somehow, I suspect the exact SSA-form representation of a program is likely
to vary somewhat with the specific design of the codegen, whereas the RPN
form can likely remain far more generic...).
or, in other words, for a portable IL, a JBC or MSIL-like representation may
well be better than, say, an LLVM-like representation...
as well, it seems to be not "that" difficult to unwind RPN into SSA-form...
but, then again, maybe other people know some things I don't...
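For straight-line code, unwinding RPN into three-address form can be
sketched by simulating the stack at compile time: the stack holds names
instead of values, and every operator emits one instruction into a fresh
temporary. Because each temporary is assigned exactly once, the output is
already in SSA form for this simple case (control-flow merges, which would
need phi operations, are not handled). The function name rpn_to_tac and the
t1, t2, ... naming scheme are assumptions made for this sketch.

```python
def rpn_to_tac(tokens):
    """Convert a list of RPN tokens into three-address code.

    Returns (code, result), where code is a list of strings such as
    "t1 = a + 2" and result names the value left on the stack.
    """
    operators = {"+", "-", "*", "/"}
    stack = []   # symbolic stack: holds operand names, not values
    code = []
    counter = 0
    for tok in tokens:
        if tok in operators:
            if len(stack) < 2:
                raise ValueError(f"not enough operands for {tok!r}")
            right = stack.pop()
            left = stack.pop()
            counter += 1
            temp = f"t{counter}"  # fresh name, assigned exactly once
            code.append(f"{temp} = {left} {tok} {right}")
            stack.append(temp)
        else:
            stack.append(tok)  # constant or variable name
    if len(stack) != 1:
        raise ValueError("expression must leave exactly one value")
    return code, stack[0]


code, result = rpn_to_tac("a 2 + b *".split())
# code   == ["t1 = a + 2", "t2 = t1 * b"]
# result == "t2"
```

The run-time stack disappears entirely in the output; only the names that
were on the symbolic stack remain, which matches the description of the RPN
aspect becoming "virtualized".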
There have been research projects that translate Forth into C (see
http://www.complang.tuwien.ac.at/projects/forth.html), but none that I
know of that have done the reverse. It's possible to target a stack
based machine with any language, but using Forth as an IL would be a
retrograde step imho when there are ILs that match the source language
better. An IL like .NET's is stack based and strongly typed, and is
better suited for languages like C.
--
Regards
Alex McDonald
Correction on my previous reply.
http://groups.google.ca/group/comp.lang.forth/browse_thread/thread/9416057d5275aa17/98fc97704cda1b07
is a "tiny C to Forth" compiler. And this one is from 1993:
http://portal.acm.org/citation.cfm?id=199200.316994
The Tamarin JIT, which is used for ActionScript and JavaScript,
targets Forth. You may have to dig into the code to find out many
details.
http://www.mozilla.org/projects/tamarin/
http://www.bluishcoder.co.nz/2008/05/extending-tamarin-tracing-with-forth.html
Jason
Some years ago MPE did a C to Forth compiler as part of an EU funded
project. They did release it into the wild, and it can be downloaded from:
http://www.mpeforth.com/arena/c2forth110.zip
> Maybe I'm just looking too "foolish Forthy" into this topic. At least,
> it's a simple, stack-based virtual machine which is needed to execute a
> program in a particular (maybe self-defined) lower level language a
> compiler compiles to.
You should look for papers by Jaanus Pöial of the programming group at
the University of Tartu. He was using Forth as a common back end.
--
Peter Knaggs
http://www-personal.umich.edu/~williams/archive/forth/hatforth/hatforth.html
might be of interest too
Andreas
Now that takes me back a couple decades. :)
I worked on a "low cost computer" for schools in what was then called the
3rd world that was based on translating several languages, incl C and FORTRAN,
to PS.
The "object code" (making heavy use of some of those nice associative array
and block-nesting features of PS) was then run directly on low-cost printers.
The C compiler was self-hosting, but there turned out not to be enough
interest to get the FORTRAN compiler ('77 of course) running on a printer. ;)
I have built a C to forth compiler. It was a commercial
undertaking. Ask privately for commercial conditions.
jacob navia
email:
jacob at jacob point remcomp point fr