
Exorcizing STATE with Literals


JennyB

Jun 26, 2012, 2:41:49 AM
Inspired by the sub-thread 'Alternatives to S" and TO' here https://groups.google.com/forum/#!topic/comp.lang.forth/tJG_Wpb_4l4

(Google Groups won't let me quote any more exactly, and I think this deserves its own thread for all readers.)

Anton Ertl wrote:


Another take on your statement is that we should get rid of words like
S" and TO (and not design more of the kind) which some implementations
implement with STATE-smartness, for which other implementations
provide implementations with better properties, but with repercussions
that are far more trouble than these words are worth.

Instead, we would only have normal words (like +) and IMMEDIATE words
like ( and IF (the latter maybe compile-only), but no access to STATE
and no words that have independent interpretation and compilation
semantics.

We would have to replace S" and TO with something else. The classic
approach would be the ' ['] solution, but that's not very popular.

What is popular is the solution taken for numbers: The text
interpreter deals with them as appropriate and everyone is happy. We
don't use NUM 123 and [NUM] 123 for numbers, we let the text
interpreter deal with them.

Probably most of the parsing words (the main driver behind STATE-smart
words) can be eliminated by having a few additions to the text
interpreter. E.g., instead of writing ." bla" or .( bla) one could
write "bla" TYPE, and have code that works both interpretively and
compiled as intended.

There are a few parsing words where the replacement requires its own
treatment in the text interpreter, in particular TO. Instead of

TO v

we could write

->v

and the text interpreter would do the appropriate thing with V.

Albert van der Horst used to call this feature "denotations", but
lately I have seen him use "prefixes".

Many will cry out against this complication of the text interpreter.
Some may also find the idea bad because up to now the text interpreter
is a black box so any additions there are only done by the compiler
vendor. However:

1) I think that this feature adds relatively little complexity to the
text interpreter, and it gets rid of a huge amount of baggage
elsewhere (STATE-smart words and the restrictions that their defenders
want to put upon us, or the complexities of the alternatives and their
ramifications).

2) One could make it possible for "mere" users to add further
prefixes/denotations (or, as Gforth calls them, recognizers) to the
text interpreter, just like it is possible to add further words now.
Of course some people will fear giving this much power to the users,
and it's certainly a feature that should be used sparingly, but that's
a management issue.

/endquote

That makes sense. The tricky definitions are those that, like literals, parse something from the input stream and then have to do something different with it according to the current state of STATE. It's tricky to POSTPONE literals, and it makes no sense to ' them.

Bernd then showed how this is done in Gforth using recognizers:

We have only one recognizer chain, and instead of passing a flag, we
return a vtable, which contains three methods: how to interpret the
thing, how to compile the thing, and how to serialize the thing
(serializing is for POSTPONE,). "Thing" is the rest the recognizer
returns (token, number, float, whatever).

/endquote

A recognizer is like a wordlist for literals: if the parsed text fits a certain pattern, it tells you what to do with it; otherwise it passes the text on to the next recognizer in the chain. So, if we have a recognizer chain called FIND-LITERAL whose last entry always matches and simply throws for unrecognized text, and the vtable methods are INTERPRETED, COMPILED and POSTPONED, then the text interpreter might look like:

... FIND DUP IF \ ... no surprises here
ELSE
find-literal STATE @ IF compiled ELSE interpreted THEN
THEN

With all the STATE-smartness handled in the FIND-LITERAL chain, the specifications for ' and FIND become simplified.

' always returns the definition's execution token, even if its semantics are
compile-only
FIND always returns the execution token and a flag to say whether it is
normally executed while compiling

A programmer can still write classical STATE-smart words and take the consequences if they are ticked or POSTPONEd, or they can write words like S" and TO as prefixes, which cannot be ticked but can be supplied with a sensible POSTPONED method.
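
To make the recognizer idea concrete, here is a minimal sketch in standard Forth. Everything in it is an assumption made for illustration: the three-cell vtable layout (interpret, compile, postpone), the defining word VTABLE:, the sample recognizers HEX-LITERAL? and NUM-LITERAL?, and the fixed array LITERAL-CHAIN. It is not how Gforth implements recognizers, and it only handles single-cell literals. The THROW after the loop plays the part of the always-matching last entry in the chain.

```forth
\ Hypothetical vtable layout: interpret-xt, compile-xt, postpone-xt.
: vtable: ( xt-int xt-comp xt-post "name" -- )
  create rot , swap , , ;
: interpreted ( i*x vtable -- j*x )  @ execute ;
: compiled    ( i*x vtable -- j*x )  cell+ @ execute ;
: postponed   ( i*x vtable -- j*x )  2 cells + @ execute ;

\ Methods for single-cell numeric literals.
: num-interpret ( n -- n ) ;                  \ just leave it on the stack
: num-compile   ( n -- )   postpone literal ;
: num-postpone  ( n -- )                      \ compile "n num-compile"
  postpone literal  ['] num-compile compile, ;
' num-interpret ' num-compile ' num-postpone vtable: num-vtable

\ A recognizer has the effect ( c-addr u -- data vtable | 0 ).
: hex-literal? ( c-addr u -- n vtable | 0 )   \ matches $1F style text
  dup 2 < if 2drop 0 exit then
  over c@ [char] $ <> if 2drop 0 exit then
  1 /string                                   \ skip the $ prefix
  base @ >r hex  0 0 2swap >number  r> base !
  nip if 2drop 0 exit then                    \ leftover characters: no match
  drop num-vtable ;

: num-literal? ( c-addr u -- n vtable | 0 )   \ number in the current BASE
  0 0 2swap >number
  nip if 2drop 0 else drop num-vtable then ;

create literal-chain  ' hex-literal? , ' num-literal? ,
2 constant #recognizers

: find-literal ( c-addr u -- n vtable )
  #recognizers 0 do
    2dup literal-chain i cells + @ execute
    ?dup if 2swap 2drop unloop exit then      \ matched: discard the text
  loop
  2drop -13 throw ;                           \ nothing matched: undefined word

\ The fragment of the text interpreter that deals with non-words.
: literal-token ( c-addr u -- i*x )
  find-literal state @ if compiled else interpreted then ;
```

With BASE set to decimal, S" $1F" FIND-LITERAL INTERPRETED leaves 31 on the stack, and the same FIND-LITERAL result followed by COMPILED inside a definition compiles 31 as a literal. Adding a new kind of literal means writing one more recognizer and one more vtable; the text interpreter itself does not change.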

Is it possible to exorcise STATE entirely?
Quite easily. Just have a FIND-replacement that returns a vtable.

' EXECUTE
' COMPILE,
:NONAME POSTPONE LITERAL POSTPONE COMPILE, ;
CREATE DEFAULT-XT

' EXECUTE
' EXECUTE
' COMPILE,
CREATE IMMEDIATE-XT

: FIND-XT \ c-addr u -- xt vtable | 0
... FIND CASE
1 OF IMMEDIATE-XT ENDOF
-1 OF DEFAULT-XT ENDOF
DUP
ENDCASE ;

And the text interpreter becomes:

... FIND-XT DUP 0= IF FIND-LITERAL THEN
STATE @ IF COMPILED ELSE INTERPRETED THEN ;

Even that use of STATE can be eliminated:

DEFER DO-IT
: [ ['] INTERPRETED IS DO-IT ;
: ] ['] COMPILED IS DO-IT ;

... FIND-XT DUP 0= IF FIND-LITERAL THEN DO-IT
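
The tables above leave their three xts on the stack without storing them, so here is one way the whole scheme could be spelled out in standard Forth. This is a sketch under assumptions, not a definitive implementation: the cell order (interpret, compile, postpone), the wrapper COMPILE-XT, and the names INTERPRET-MODE and COMPILE-MODE are all made up here. The last two stand in for [ and ] so that the sketch can be loaded without disturbing the host system's own text interpreter.

```forth
\ Assumed layout of a vtable: interpret-xt, compile-xt, postpone-xt.
: interpreted ( i*x xt vtable -- j*x )  @ execute ;
: compiled    ( i*x xt vtable -- j*x )  cell+ @ execute ;
: postponed   ( i*x xt vtable -- j*x )  2 cells + @ execute ;

\ COMPILE, has no defined interpretation semantics, so tick a wrapper.
: compile-xt ( xt -- )  compile, ;

\ Postponing an ordinary word: compile code that will later compile xt.
: postpone-default ( xt -- )  postpone literal  postpone compile, ;

create default-xt    ' execute , ' compile-xt , ' postpone-default ,
create immediate-xt  ' execute , ' execute ,    ' compile-xt ,

\ The deferred word replaces the test of STATE.
defer do-it
: interpret-mode ( -- )  ['] interpreted is do-it ;   \ what [ would do
: compile-mode   ( -- )  ['] compiled    is do-it ;   \ what ] would do
interpret-mode
```

Given an xt and its vtable on the stack, DO-IT either executes or compiles the word depending on which mode was selected last. The mode still exists; it has only moved from a variable into the deferred word.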









Rod Pemberton

Jun 26, 2012, 4:00:20 AM
"JennyB" <jenny...@googlemail.com> wrote in message
news:dd38576b-2fbc-48d2...@googlegroups.com...
>
> Inspired by the sub-thread 'Alternatives to S" and TO' here
> [new Google Groups link...]

The first message of the subthread "Alternatives to S" and TO" in "?EXEC".
http://groups.google.com/group/comp.lang.forth/msg/2ba0eacb9ea02224

Usenet msg-id 2012Jun2...@mips.complang.tuwien.ac.at

> (Google Groups won't let me quote any more exactly,
> and I think this deserves its own thread for all readers.

Ok.

<OT>
I think they buried it because they really weren't wanting comments from
certain people here. They just wanted to discuss it among those who were
still posting in the ?EXEC thread. They had a good conversation going.
There is nothing wrong with that. There is no real way to get a private
conversation on Usenet except to go to private email. I (and most likely
others) still read it. We just didn't comment. The issues on STATE
awareness and those Forth words have been brought up quite a few times
recently, i.e., last six months, in various c.l.f. threads, including a few
times by me. Sometimes an issue needs to be brought up at the correct time
for anyone to take notice.
</OT>

> [SNIP]

No comments.


Rod Pemberton



Alex McDonald

Jun 26, 2012, 6:18:50 AM
On Jun 26, 7:41 am, JennyB <jennybr...@googlemail.com> wrote:
> Inspired by the sub-thread 'Alternatives to S" and TO' here https://groups.google.com/forum/#!topic/comp.lang.forth/tJG_Wpb_4l4
You may be missing an IMMEDIATE on [ . Here's my STATEless interpreter
(lightly edited to remove extraneous stuff for doubles etc. BL WORD
COUNT is due to other internal considerations; it could & should be a
string from PARSE-NAME.);

POSTPONE, compiles the compilation semantics of the XT.

: compiler ( addr 0 | xt -1 | xt 1 -- ??? )
  if
    postpone,          \ smart compile time
  else
    count number       \ parse number
    postpone literal   \ compile it
  then ;

: interpreter ( addr 0 | xt -1 | xt 1 -- ??? )
  if
    execute            \ interpret
  else
    count number       \ parse number
  then ;

defer parser ' interpreter is parser

: parse-source ( -- )
  begin bl word dup c@
  while
    find
    parser
  repeat drop ;

: ] ( -- )
  ['] compiler is parser ;

: [ ( -- )
  ['] interpreter is parser ; immediate

Albert van der Horst

Jun 26, 2012, 11:17:46 AM
In article <dd38576b-2fbc-48d2...@googlegroups.com>,
JennyB <jenny...@googlemail.com> wrote:
<SNIP>
>

>Albert van der Horst used to call this feature "denotations", but
>lately I have seen him use "prefixes".

Things like $808D "aaap" #123,456,890 0x4556 are denotations, a
generalization of numbers. They must generate e.g. a (double)
constant and compilation behaviour should be to compile that
constant. This is the inevitable limited smartness we are
accustomed to for classic Forth numbers.
We want
- interpretation and compilation behaviour to be evident
- naturally denotations cannot be ticked, postponed or
compiled

A possible implementation of this are prefix words like " # 0x
that are actually in the dictionary. In ciforth there are prefixes
for numbers: 0 1 2 3 4 5 6 7 8 9 with the same behaviour
`` (NUMBER) '' filled in.
Having prefixes recognized as a separated word is a powerful
technique but it is just one technique to implement denotations.

An interesting application where a user invents their own prefixes
is colorforth, using : prefix for definitions, [ for interpreted
words, ] for compiled words etc. So there is merit to not
restricting this technique to implementors.

>Many will cry out against this complication of the text interpreter.
>Some may also find the idea bad because up to now the text interpreter
>is a black box so any additions there are only done by the compiler
>vendor. However:
>
>1) I think that this feature adds relatively little complexity to the
>text interpreter, and it gets rid of a huge amount of baggage
>elsewhere (STATE-smart words and the restrictions that their defenders
>want to put upon us, or the complexities of the alternatives and their
>ramifications).
>
>2) One could make it possible for "mere" users to add further
>prefixes/denotations (or, as Gforth calls them, recognizers) to the
>text interpreter, just like it is possible to add further words now.
>Of course some people will fear giving this much power to the users,
>and it's certainly a feature that should be used sparingly, but that's
>a management issue.
>
>/endquote
>
>That makes sense. The tricky definitions are those that, like literals, parse
>something from the input stream and then have to do something different with
>it according to the current state of STATE. It's tricky to POSTPONE literals,
>and it makes no sense to ' them.
>

You make a good case. I started using prefixes in ciforth because,
on balance, I judged them to add the least total complexity.

<SNIP>

Groetjes Albert


--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Rod Pemberton

Jun 26, 2012, 8:16:41 PM
"Alex McDonald" <bl...@rivadpm.com> wrote in message
news:f9370dfa-c718-41fb...@h10g2000yqn.googlegroups.com...

> [...]
> Here's my STATEless interpreter [...]

Just to clarify a bit, use of "STATEless" can be confusing. It can be
confused with "stateless", or have the different meanings conjoined. Your
interpreter is "STATEless" as in "without using STATE". However, it's not
stateless. It still has states: 'parser as compiler' or 'parser as
interpreter'. Each works differently and exclusively of the other, i.e.,
each is a state.


Rod Pemberton




BruceMcF

Jun 26, 2012, 9:40:14 PM
On Jun 26, 8:16 pm, "Rod Pemberton" <do_not_h...@notemailnot.cmm>
wrote:
> "Alex McDonald" <b...@rivadpm.com> wrote in message
Or, more generally there, the state of a deferred word rather than the
state of a variable. It could even be Forth94 compliant, if there is a
dummy variable that is loaded with the result of comparing the
deferred word xt to the compiling xt given that Forth94 portable
source cannot rely on being able to write to the STATE variable.

A stateless approach would be to have the compiler loop performed by ]
and when exited return to the interpreter loop ~ that is not
compatible with Forth94.
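
The dummy-variable idea in the first paragraph could be sketched roughly as follows. This is only an illustration under assumptions: MY-STATE stands in for STATE, INTERPRETER and COMPILER are empty placeholders for the real handlers, SYNC-STATE, ENTER-COMPILE and LEAVE-COMPILE are made-up names, and DEFER@ is taken from Forth 2012 rather than Forth94.

```forth
variable my-state            \ hypothetical read-only mirror of the mode
0 my-state !

defer parser
: interpreter ( -- ) ;       \ placeholder for the real interpreting handler
: compiler    ( -- ) ;       \ placeholder for the real compiling handler

\ Recompute the flag from whatever PARSER currently points at.
: sync-state ( -- )
  ['] parser defer@  ['] compiler =  my-state ! ;

: enter-compile ( -- )  ['] compiler    is parser  sync-state ;  \ role of ]
: leave-compile ( -- )  ['] interpreter is parser  sync-state ;  \ role of [
leave-compile
```

Programs that only read the flag keep working, because it is derived from the deferred word every time the mode changes and is never the thing that decides the mode.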

JennyB

Jun 27, 2012, 2:14:19 AM
On Tuesday, 26 June 2012 11:18:50 UTC+1, Alex McDonald wrote:

> You may be missing an IMMEDIATE on [ .

Fast fingers, slow brain. Also, the stack of FIND-XT should be:

FIND-XT \ c-addr u -- xt vtable | c-addr u 0

FIND-XT ?DUP 0= IF FIND-LITERAL THEN \ c-addr u -- ? vtable

> Here's my STATEless interpreter
> (lightly edited to remove extraneous stuff for doubles etc.

Yes, it's the extraneous stuff that is the problem. This scheme replaces all that with FIND-LITERAL INTERPRETED or FIND-LITERAL POSTPONED , and allows the user to add their own literal types.

Alex McDonald

Jun 27, 2012, 4:51:19 AM
On Jun 27, 1:16 am, "Rod Pemberton" <do_not_h...@notemailnot.cmm>
wrote:
> "Alex McDonald" <b...@rivadpm.com> wrote in message
The reference was capitalised as STATEless to alert you to the fact
that it referred to the variable.

Alex McDonald

Jun 27, 2012, 4:57:37 AM
That's the same scheme as I have with two number conversion "chains";
one when interpreting & one when compiling.

BruceMcF

Jun 27, 2012, 10:22:50 AM
On Jun 27, 2:14 am, JennyB <jennybr...@googlemail.com> wrote:
> On Tuesday, 26 June 2012 11:18:50 UTC+1, Alex McDonald  wrote:
> > You may be missing an IMMEDIATE on [ .
>
> Fast fingers, slow brain. Also, the stack  of FIND-XT should be:
>
>  FIND-XT   \ c-addr u -- xt vtable | c-addr u 0
>
>  FIND-XT ?DUP 0= IF FIND-LITERAL THEN  \ c-addr u -- ? vtable

Does this make the null pointer assumption? Anyway, a constant height
stack gives:

FIND-XT \ c-addr u -- xt vtable TRUE | c-addr u 0

... FIND-XT 0= IF FIND-LITERAL THEN \ c-addr u -- ? vtable
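
For what it is worth, a FIND-XT with that constant-height stack effect could be built on the standard FIND roughly like this. It is a sketch under assumptions: NAME-BUF and >COUNTED are made-up helpers needed because FIND wants a counted string, names are assumed to be at most 255 characters, and DEFAULT-XT and IMMEDIATE-XT are empty stand-ins for the real vtables, since only their addresses matter here.

```forth
\ Stand-ins for the two vtables; each gets one cell so the addresses differ.
create default-xt    0 ,
create immediate-xt  0 ,

create name-buf 256 chars allot     \ scratch space for a counted string

: >counted ( c-addr u -- c-addr' )  \ assumes u <= 255
  dup name-buf c!  name-buf char+ swap move  name-buf ;

: find-xt ( c-addr u -- xt vtable true | c-addr u false )
  2dup >counted find        ( c-addr u c-addr' 0 | c-addr u xt 1 | c-addr u xt -1 )
  dup 0= if 2drop false exit then   \ not found: give the text back
  0> if immediate-xt else default-xt then
  2swap 2drop true ;                \ found: discard the text
```

Either way three cells come back, so the caller can test the flag with a plain 0= and no ?DUP, and no particular value of xt or vtable has to be reserved to mean "not found".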

JennyB

Jun 28, 2012, 4:02:04 PM
Much better. Thanks, Bruce