Recent discussions have inspired me to think about how a
(non-standard) Forth system would look that is designed to be as
copy-pasteable as possible, i.e., where you could copy and paste code
from interpreted to compiled code and vice versa; and what would not
be possible. And ideally we would manage to do that without
multiple-behaviour words.
I look at the Forth-2012 core words as a departure point. I expect
that all problems are already present in the core words.
Non-parsing words with default interpretation and compilation
semantics are already copy-pasteable, with the exception of ]
(discussed below): ! # #> #S * */ */MOD + +! , - . / /MOD 0< 0= 1+ 1-
2! 2* 2/ 2@ 2DROP 2DUP
2OVER 2SWAP < <# = > >BODY >NUMBER ?DUP @ ABORT ABS ACCEPT ALIGN
ALIGNED ALLOT AND BASE BL C! C, C@ CELL+ CELLS CHAR+ CHARS COUNT CR
DECIMAL DEPTH DROP DUP EMIT ENVIRONMENT? EVALUATE EXECUTE FILL FIND
FM/MOD HERE HOLD I IMMEDIATE INVERT J KEY LSHIFT M* MAX MIN MOD MOVE
NEGATE OR OVER QUIT ROT RSHIFT S>D SIGN SM/REM SOURCE SPACE SPACES
STATE SWAP TYPE U. U< UM* UM/MOD XOR
Parsing words with default compilation semantics: ' : CHAR CONSTANT
CREATE VARIABLE WORD. These are not copy-pasteable. By converting them
from parsing words to non-parsing words that take a string, most of
them can be made copy-pasteable; e.g.:
  5 "foo" constant
To make this practical, we need literal strings, which explains why
Forth has traditionally used parsing words instead. Since we want
copy-pasteability and don't want the multiple-behaviour S", literal
strings have to be recognized by the text interpreter.
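
As a rough sketch of the string-taking variant: Gforth's NEXTNAME
supplies the name for the next defining word, so a non-parsing
CONSTANT can be a one-liner there. The name CONSTANT$ is made up for
this illustration, NEXTNAME is Gforth-specific, and S" merely stands in
for the literal strings that the text interpreter would recognize:

```forth
\ Sketch for Gforth; CONSTANT$ is a hypothetical name, not a standard word.
\ NEXTNAME ( c-addr u -- ) makes the next defining word use this name
\ instead of parsing one from the input stream.
: constant$ ( x c-addr u -- )  nextname constant ;

5 s" foo" constant$                        \ interpreted
: make-bar ( -- )  7 s" bar" constant$ ;   \ the same phrase, compiled
make-bar
foo . bar .
```

The phrase in front of CONSTANT$ is the same in both places; the only
remaining difference is hidden in the dual behaviour of S".
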
The conversion above makes no sense for WORD. Many uses of WORD are
for defining parsing words, and these words should be changed to
take the string(s) from the stack. For the uses in user-defined
text interpreters, a way to make WORD copy-pasteable to the regular text
interpreter would be to have another input stream from which WORD
takes its input (no, I have not worked this idea out).
Code using >IN also may behave differently when copy-and-pasted, and
the additional input stream may solve that, too.
For ":" the conversion is not sufficient to make it copy-pasteable,
but I'll discuss that later.
( parses, but is copy-pasteable.
Words where the compilation semantics parse: ." ABORT" S" ['] [CHAR].
Most of these can be converted into normal words that take strings on
the stack. E.g., ." foo" becomes "foo" TYPE. S" becomes
unnecessary, because the text interpreter recognizes literal strings.
And we can eliminate some words in the process; not only do we not
need .", but "x" CHAR and "x" [CHAR] do the same, so [CHAR] is
unnecessary. Of course, given that Forth-2012 has 'x', CHAR is
unnecessary, too.
There is, however, a difference between what standard ['] FOO does and
what "FOO" ' would do: standard ['] looks the string up during
compilation, ' during run-time, when the search order and the contents
of the wordlists may be different. The simplest solution is to let
the text interpreter recognize literal xts, e.g., `FOO.
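
The timing difference can be seen in standard Forth with a small
sketch. The names FOO, EARLY and LATE are illustrative only, and
evaluating "' foo" at run-time is used here as a stand-in for a
string-taking ':

```forth
\ Standard Forth sketch; FOO, EARLY and LATE are illustrative names.
: foo ( -- )  ." old " ;
: early ( -- )  ['] foo execute ;             \ FOO looked up while compiling EARLY
: late  ( -- )  s" ' foo" evaluate execute ;  \ FOO looked up each time LATE runs
: foo ( -- )  ." new " ;                      \ redefinition hides the first FOO
early late   \ prints: old new
```

EARLY keeps referring to the first FOO, while LATE finds whatever FOO
is visible when it runs; a literal xt recognized by the text
interpreter would behave like EARLY.
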
Return-stack words: >R R> R@. They can be given default
interpretation semantics (in addition to the default compilation
semantics); in order to work, the return stack must not be used by the
text interpreter across the interpretation of words.
Control structure words: +LOOP BEGIN DO ELSE IF LEAVE LOOP REPEAT THEN
UNTIL WHILE.
One existing solution is to turn on compilation when encountering a
control structure start, and revert back to interpretation upon
control structure end. This works much better in the present context,
because the copy-pasteability reduces the differences that you would
normally encounter with this approach. The compilation needs to be to
a separate dictionary section, so that we can perform dictionary
allocation inside the control structure.
Another solution is to behave somewhat like the control structures do
at run-time, during interpretation time.
Neither of these solutions is satisfactory wrt the requirement to
avoid dual-behaviour words.
Another solution is the Postscript way: replace the control structure words
with words that take xts, e.g.:
  5 [: dup 0> ;] [: dup . 1- ;] while
where WHILE performs the first xt, and if it returns true, performs
the second xt, and then repeats the process. LEAVE needs to be
adapted to deal with the next enclosing LOOP (somewhat like THROW for
CATCH). Locals are interesting in connection with this solution.
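
A minimal sketch of such an xt-taking loop word in standard Forth
follows. The name XT-WHILE is hypothetical (chosen so that the standard
WHILE stays usable), and named helper words replace the quotations so
that the example does not depend on [: and ;]:

```forth
\ Sketch in standard Forth (2>R 2R@ 2R> are CORE EXT).
\ XT-WHILE is a hypothetical name, not an existing word.
: xt-while ( i*x xt-cond xt-body -- j*x )
  2>r                          \ R: xt-cond xt-body
  begin
    2r@ drop execute           \ run xt-cond, which must leave a flag
  while
    2r@ nip execute            \ run xt-body
  repeat
  2r> 2drop ;

: positive? ( n -- n flag )  dup 0> ;
: show-dec  ( n -- n-1 )     dup . 1- ;
5 ' positive? ' show-dec xt-while drop   \ prints: 5 4 3 2 1
```

XT-WHILE itself is an ordinary word with default semantics, so the
call works unchanged in interpreted and compiled code; in standard
Forth the ticks would still have to become ['] inside a colon
definition, which is where literal xts or quotations come in.
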
None of these solutions allows copying and pasting incomplete control
structures, but I don't think that that's a solvable problem.
DOES> ... ; can be replaced by [: ... ;] SET-DOES>, which is already
copy-pasteable in Gforth.
For words that assume a surrounding colon definition, such as EXIT and
RECURSE (and UNLOOP, which only makes sense if followed by an EXIT), I
don't see a sensible way to make them copy-pasteable.
[ and ] could be changed to reduce/increase the compilation level. In
particular, when you have ] FOO [ inside a colon definition, FOO is
POSTPONEd; when you copy it into interpreted code, FOO is compiled.
Sub-0 compilation levels are just interpreter instances with a
separate context (I have not worked this out in detail). This
behaviour of [ and ] leaves the framework of compilation and
interpretation semantics introduced by Forth-94. They always perform
this behaviour, no matter what the state.
LITERAL moves a number from one compilation level to the next higher
one. In particular, for sub-0 levels, it copies the number from the
data stack at that level to the data stack at the next level (not
worked out, either).
POSTPONE is unnecessary, thanks to ] ... [.
[: and ;] (not core, but needed above) nest an anonymous colon
definition inside another, or produce an anonymous colon definition
when used interpretively. This does not satisfy the requirement of no
dual-behaviour words.
To make : and ; work copy-pasteably, they would have to behave like [:
and ;]. Or we just replace
  : foo ... ;
with
  "foo" [: ... ;] alias
and forget about : and ;. However, then EXIT and RECURSE are not very
useful in connection with xt-based control structures.
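
For illustration, a string-taking alias can be approximated in Gforth
today. This is only a sketch under some assumptions: ALIAS$ is a
made-up name, NEXTNAME and the parsing ALIAS are Gforth words, and
:NONAME with S" stands in for [: ... ;] and literal strings:

```forth
\ Sketch for Gforth; ALIAS$ is a hypothetical name.
\ Gforth's ALIAS ( xt "name" -- ) normally parses the name;
\ NEXTNAME ( c-addr u -- ) supplies it from the stack instead.
: alias$ ( c-addr u xt -- )  -rot nextname alias ;

s" square" :noname ( n -- n*n )  dup * ;  alias$
7 square .
```
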
Overall, there are a number of problems to be solved, but a wholly
copy-pasteable Forth would also simplify some issues that we have in
standard Forth.
- anton
--
M. Anton Ertl
http://www.complang.tuwien.ac.at/anton/home.html