ARMD: Emacs-based embedded language with compile-time register stack

Daniel Engeler

unread,

Apr 3, 2007, 4:46:29 PM4/3/07

to

I wrote ARMD, a language for embedded development. ARMD is inspired by
the Forth stack, but uses it differently. Forth uses the stack for
data at run-time, ARMD for registers at compile-time.

Consider the expression "10 20 + print" in Forth:

Source Code Stack
-----------------------------------
10 push 1 10
20 push 2 10 20
+ branch + 30
print branch print empty

In ARMD:

Source Code Stack
-----------------------------------
10 mov r0, #10 r0
20 mov r1, #20 r0 r1
+ add r0, r0, r1 r0
print branch print empty

In ARMD, the stack is used for register allocation at compile-time.
Similar to Forth, tokens are compiled in-order:

- A number (such as "30") generates code which loads the number into a
register. The register is determined at compile-time by the stack.
For example, if the stack is (r0 r1), the code "mov r2, #30" is
compiled, after which the stack is (r0 r1 r2).

- An operator (such as "+") generates code which performs the
operation directly on the registers. For example, if the stack is
(r6 r2 r4), the code "add r0, r2, r4" is compiled, after which the
stack is (r6 r0).

- A function (such as "print") with N arguments generates code which
moves the top N registers into r0 to rN-1, then branches to the
function. For example, if the stack is (r3 r1), "print" has N=1
arguments, the generated code is "mov r1, r0", "branch print", after
which the stack is (r3).

Another example, consider the expression "10 foo 20 foo +" with the
function "foo" (with 1 argument and 1 result):

Source Code Stack Comment
------------------------------------------------
10 mov r0, #10 r0 r0 = 10
foo branch foo r0 r0 = result of foo
20 mov r1, #20 r0 r1 r1 = 20
mov r4, r0 r4 r1 first reordering
mov r0, r1 r4 r0 second reordering
foo branch foo r4 r0
+ add r0, r4, r0 r0

After "20", the stack is (r0 r1). The function "foo" requires its
argument in r0, so we must move r1 into r0. But r0 is already used, so
it must be saved into r4 (first reordering). Only then can r1 be moved
into r0 (second reordering). In the last line, note how "+" operates
directly on the register stack, which requires no reordering.

ARMD uses no other optimization technique. I wild-guess (without
having any measurements, but from observing GCC assembly) that it is
about half as efficient as optimized C, since no stack-juggling is
necessary, but register reordering eats cycles and cache lines.

ARMD is implemented in 4000 lines of Emacs Lisp. It includes a simple
ARM assembler and a monitor for target communication over a serial
port to interactively download and execute code.

I wonder if someone else has implemented such a compile-time register
stack before, or an embedded language inside Emacs.

-- Daniel Engeler

Andreas Kochenburger

unread,

Apr 3, 2007, 5:31:35 PM4/3/07

to

Daniel Engeler wrote:
> I wrote ARMD, a language for embedded development. ARMD is inspired by
> the Forth stack, but uses it differently. Forth uses the stack for
> data at run-time, ARMD for registers at compile-time.

> I wonder if someone else has implemented such a compile-time register

> stack before, or an embedded language inside Emacs.

I have no reference at hand. But it reminds me of a data flow analysis
technique for compiler optimization. What you have implemented here
looks like an operands stack.

--

Andreas
-------
MinForth: http://minforth.net.ms/

Anton Ertl

unread,

Apr 3, 2007, 5:34:53 PM4/3/07

to

"Daniel Engeler" <eng...@gmail.com> writes:
>I wonder if someone else has implemented such a compile-time register
>stack before,

You did, in D. The oldest reference in the Forth world I know is Tom
Almy's ForthCMP [almy86]. The oldest reference in computer science that
I know is from 1965 [kanner+65].

> or an embedded language inside Emacs.

The first implementation of Vmgen (which is the language used for
specifying Gforth primitives) was written in Elisp. I later rewrote
it in Forth, which was very beneficial.

So, now that you have written D in Forth, and ARMD in Elisp, what did
you like better, and why?

Welcome back. BTW, if you wanted to give an URL for ARMD, you forgot
to include it.

- anton

@Article{almy86,
author = "Thomas Almy",
title = "Compiling {Forth} for Performance",
journal = jfar,
year = "1986",
volume = "4",
number = "3",
pages = "379--388",
annote = "A batch Forth compiler for the 8086 and the Z80. It
produces machine code in executable files. It uses
peephole optimization and keeps up to two values from
the top of the stack in registers."
}

@Article{kanner+65,
author = "H. Kanner and P. Kosinski and C. L. Robinson",
title = "The structure of yet another {ALGOL} compiler",
journal = cacm,
volume = "8",
number = "7",
pages = "427--438",
month = jul,
year = "1965",
coden = "CACMA2",
ISSN = "0001-0782",
bibdate = "Sun Sep 18 23:35:40 1994",
bibsource = "ftp://ftp.ira.uka.de/pub/bibliography/Compiler/bevan.bib
and
ftp://ftp.ira.uka.de/pub/bibliography/Compiler/Compiler.Lins.bib",
abstract = "A high-speed ``top down'' method of syntax analysis
which completely eliminates ``back-up'' of the source
string has been implemented in a convenient
macro-language. A technique of simulation at compile
time of the use of a conventional run-time stack
enables the generation of code for expressions which
minimizes stores, fetches and stack-pointer motion at
run time, while properly treating recursion and side
effects of procedures. Block structure and recursion
are handled without need for interpretive methods at
run time. The ``context problem'' in the transmission
to recursive procedures of parameters ``called by
name'' is solved in a manner which permits the handling
of common cases of simple expressions and array
identifiers with particular efficiency.",
checked = "19940407",
sjb = "Contains two good pieces of advice: (1) Do not bother
to mechanism those operations which are easily
performed by humans. (2) Do not perform at run time any
bookkeeping operations that can reasonably be performed
at compile time. The former led to the decision to
writing the lexer/parser as set of recursive routines
and the latter to the removal of any form of ``go to''
interpreter \cite{Irons:Feurzig:cacm:1961}. Notes that
the ALGOL report uses syntax to distinguish between
arithmetic and boolean expressions but that this causes
problems for their syntax analyser. The solution to the
problems was to unify the syntax and make
differentiating between the two types of expression a
typing problem. Rest of the paper details solutions to
the following areas: labels and multiple assignments;
run time lists for {\bf own} variables; dealing with
block structure using the symbol table; code generation
for expressions; dealing with switches and
procedures.",
annote = "Discusses several technical issues in Algol~60
implementation; some of which are specific to the
language, but others are still interesting today
(e.g., how to deal with common prefixes in syntax
analysis). They apparently generate code for an
accumulator machine with several index registers
without stack addressing modes. The code generation
logically works with a stack in memory; however, the
compiler emulates most stack pointer updates at
compile-time and translates stack accesses into
indexed accesses."
}

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2007: http://www.complang.tuwien.ac.at/anton/euroforth2007/

pablo reda

unread,

Apr 3, 2007, 11:20:42 PM4/3/07

to

Hi Daniel

Please note one thing

if you stack twp numbers and then + , this is equal to load the
result.

and add the concept of variable..

you publish your code ?

Daniel Engeler

unread,

Apr 4, 2007, 4:19:25 PM4/4/07

to

>> I wonder if someone else has implemented such a compile-time
>> register stack before,
> You did, in D.

:-) That's right. The difference is that D uses a fixed register
stack, being either empty, (r0), (r0 r1), (r0 r1 r2), or (r0 r1 r2
r3). For every operators such as "+", there were up to 4 variants, for
every possible stack. For example, "+" has 4 variants:

stack before -> stack after: instructions
---------------------------------------------------
empty -> (r0) : pop r0; pop r1; add r0, r0, r1
(r0) -> (r0) : pop r1; add r0, r0, r1
(r0 r1) -> (r0) : add r0, r0, r1
(r0 r1 r2) -> (r0 r1) : add r1, r1, r2
(r0 r1 r2 r3) -> (r0 r1 r2) : add r2, r2, r3

ARMD in contrast can have any registers on the stack, and operators
are "immediate" in that they decide at compile-time what registers to
manipulate. For example:

(r4 r2) -> (r0) : add r0, r4, r2
(r0 r2 r5 r4) -> (r0 r2 r1) : add r1, r5, r4

> The first implementation of Vmgen (which is the language used for
> specifying Gforth primitives) was written in Elisp. I later rewrote
> it in Forth, which was very beneficial.

Please elaborate.

> So, now that you have written D in Forth, and ARMD in Elisp, what
> did you like better, and why?

1. Forth unifies the application ("led", "blink") and the compiler
(":", "create"). But Forth neglects the development ("edit the source
of this function", "test this function"). While developing D, my PC
was no more than a text editor and serial terminal.

ARMD integrates the compiler and the development in Emacs. For
example, I can define a function and test it easily by simply
executing these Elisp expressions, which write code to the target via
the serial port and execute it immediately:

(armd-defun _foo (x) (_+ x 1))
(assert (= (armd-eval (_foo 4)) 5))

(The underscore "_" in target function names is for namespace
separation.)

ARMD separates the application and the compiler, but I don't see this
as a disadvantage.

Also, building a compiler in Elisp is simpler than in D/Forth, because
Elisp list manipulation and debugging is more powerful and comfortable
than a simple Forth that's still building itself.

2. D was stack-based like Forth, while ARMD is register/variable based
like C. I never managed Forth stack juggling. ARMD variables are cheap
and handy, which is more comfortable to a C programmer.

> Welcome back. BTW, if you wanted to give an URL for ARMD, you forgot
> to include it.

I'll document the code and then post it in this newsgroup.
Unfortunately it's not easy to use because you first have to generate
the bootloader and serial monitor for the target in ARMD and flash
this on the target (probably through JTAG).

Daniel Engeler

Anton Ertl

unread,

Apr 7, 2007, 9:10:04 AM4/7/07

to

"Daniel Engeler" <eng...@gmail.com> writes:
>>> I wonder if someone else has implemented such a compile-time
>>> register stack before,
>> You did, in D.
>
>:-) That's right. The difference is that D uses a fixed register
>stack, being either empty, (r0), (r0 r1), (r0 r1 r2), or (r0 r1 r2
>r3). For every operators such as "+", there were up to 4 variants, for
>every possible stack. For example, "+" has 4 variants:
>
>stack before -> stack after: instructions
>---------------------------------------------------
>empty -> (r0) : pop r0; pop r1; add r0, r0, r1
>(r0) -> (r0) : pop r1; add r0, r0, r1
>(r0 r1) -> (r0) : add r0, r0, r1
>(r0 r1 r2) -> (r0 r1) : add r1, r1, r2
>(r0 r1 r2 r3) -> (r0 r1 r2) : add r2, r2, r3

I count 5.

>ARMD in contrast can have any registers on the stack, and operators
>are "immediate" in that they decide at compile-time what registers to
>manipulate. For example:
>
>(r4 r2) -> (r0) : add r0, r4, r2
>(r0 r2 r5 r4) -> (r0 r2 r1) : add r1, r5, r4

That's basically the difference between a implementation that uses
static stack caching (with the straight-forward n+1 stack
representations), and an implementation that internally represents the
registers explicitly. I think that Tom Almy's compiler was
stack-caching, but had more stack representations.

An example of an implementation that represents the registers
explicitly is VFX Forth. I am pretty sure that Mops is another such
implementation, and maybe jForth, too. RAFTS goes beyond that by
having register allocation in a separate, later phase from
stack-to-(pseudo-)register mapping, with instruction selection and
instruction scheduling in between.

>> The first implementation of Vmgen (which is the language used for
>> specifying Gforth primitives) was written in Elisp. I later rewrote
>> it in Forth, which was very beneficial.
>
>Please elaborate.

The problem that triggered the rewrite was that Elisp behaved
differently wrt case sensitivity in every Emacs version. The rewrite
solved that. Additional benefits were:

- The Forth version was much faster (the Elisp version took ~30s).

- The Forth version was easier to maintain. The functional style in
which the Elisp version was written turned out to require lots of
changes when various extensions were done (additional parameters had
to be added to a lot of functions). It might well be that this was
just due to my not having mastered this programming style yet.

>1. Forth unifies the application ("led", "blink") and the compiler
>(":", "create"). But Forth neglects the development ("edit the source
>of this function", "test this function"). While developing D, my PC
>was no more than a text editor and serial terminal.
>
>ARMD integrates the compiler and the development in Emacs. For
>example, I can define a function and test it easily by simply
>executing these Elisp expressions, which write code to the target via
>the serial port and execute it immediately:

I am quite happy to work in an environment where the editor is
separate from the execution environment. I like to restart the
execution environment for every test run, but I want to have the
editor session essentially forever.

However, if you want a more integrated environment, there are various
approaches for that, from forth.el's run-forth command (inherited by
gforth.el) to various kinds of Forth editors; they are not Emacs,
however; Forthmacs is probably close, but unfortunately long
unmaintained AFAIK.

>ARMD separates the application and the compiler, but I don't see this
>as a disadvantage.

I see it as quite important not to have such a separation. It is the
basis for macros and run-time code generation.

Elizabeth D Rather

unread,

Apr 7, 2007, 1:43:21 PM4/7/07

to

Anton Ertl wrote:
> "Daniel Engeler" <eng...@gmail.com> writes:
...

>> 1. Forth unifies the application ("led", "blink") and the compiler
>> (":", "create"). But Forth neglects the development ("edit the source
>> of this function", "test this function"). While developing D, my PC
>> was no more than a text editor and serial terminal.
>>
>> ARMD integrates the compiler and the development in Emacs. For
>> example, I can define a function and test it easily by simply
>> executing these Elisp expressions, which write code to the target via
>> the serial port and execute it immediately:
>
> I am quite happy to work in an environment where the editor is
> separate from the execution environment. I like to restart the
> execution environment for every test run, but I want to have the
> editor session essentially forever.

For many years our products included an integrated editor. Its
capabilities evolved, and it became quite powerful. However, we found
most of our customers had favorite programmers' editors that they
preferred to use, and for the last 10+ years we've used that model. It
works very well. We have established links between SwiftForth or SwiftX
(our cross-compiler) and about 20 popular Windows editors, so that they
can interact, but the model that Anton is describing is the one most
users prefer.

SwiftX is also connected to the target by some means (serial, parallel,
USB, etc., depending on the target), and has the capability of compiling
and downloading code automatically much as Daniel describes. The host
environment maintains a fully interactive link with the target: you can
not only download and execute code, but also examine its memory,
exercise its I/O ports, etc.

...

>
>> ARMD separates the application and the compiler, but I don't see this
>> as a disadvantage.
>
> I see it as quite important not to have such a separation. It is the
> basis for macros and run-time code generation.

I completely agree with Anton as far as development on a platform such
as Windows, but for embedded systems we prefer to have a powerful
compiler on the host, and either none or a simple compiler in the
target. A really powerful compiler has too big a footprint for small
embedded systems. The compromise we reached in SwiftX was to have all
aspects of the compiler available to the programmer on the host, for the
purpose of building macros and custom compiler directives that may be
useful for the application.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310-491-3356
5155 W. Rosecrans Ave. #1018 Fax: +1 310-978-9454
Hawthorne, CA 90250
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Stephen Pelc

unread,

Apr 10, 2007, 7:50:56 AM4/10/07

to

On 3 Apr 2007 13:46:29 -0700, "Daniel Engeler" <eng...@gmail.com>
wrote:

>ARMD is implemented in 4000 lines of Emacs Lisp. It includes a simple
>ARM assembler and a monitor for target communication over a serial
>port to interactively download and execute code.
>
>I wonder if someone else has implemented such a compile-time register
>stack before, or an embedded language inside Emacs.

MPE's VFX Forth code generators use a compile-time stack, and
also perform constant folding and so on. These compilers are
available for a range of architectures, including i32 (see
VFX Forth for Windows) and ARM.

A typical VFX Forth code generator is 5000 lines of Forth,
of which about 50% is common VFX code and 50% is architecture
specific.

Stephen

--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads