Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

How fig-Forth escapes an infinite loop ...

32 views
Skip to first unread message

Rod Pemberton

unread,
Oct 4, 2011, 9:10:13 PM10/4/11
to

This might be useful for those coding their own Forth interpreter.

It could be called:
"How fig-Forth secretly escapes an infinite loop."

Or, alternately:
"One more almost undocumented, hidden, 'feature' of a threaded Forth
interpreter ..."

First, let me provide the background. A few of you may recall that my Forth
interpreter in C was parsing a word at a time, instead of reading a line and
parsing it into words. I was attempting to fix this. So, my QUIT resembles
this (with corrections and without various kludges):

: QUIT RESET [ BEGIN QUERY INTERPRET STATE @ 0= IF ." OK" ENDIF AGAIN ;

That's similar to fig-Forth, but without some stuff, e.g., BLK.

My QUERY just processed one word at a time. So, I was using QUERY
everywhere QUERY and WORD would normally be used. To implement
line-processing, I needed an additional loop somewhere to repeat the process
of parsing words from a line of text. Where should I locate it? What type
of loop is it? Again, I looked to fig-Forth (paraphrased):

: QUIT
( whatever )
BEGIN
( whatever )
QUERY INTERPRET
( whatever )
." OK"
( whatever )
AGAIN
;

: INTERPRET
BEGIN
-FIND
( whatever )
AGAIN
;

Ah, the additonal loop is in INTERPRET, but wait a minute ... The astute
among you will notice what I noticed. In QUIT, INTERPRET returns after
being called to print "OK". However, INTERPRET also has an infinite loop -
that BEGIN AGAIN sequence. How does INTERPRET to return to QUIT with an
infinite loop in it? Well, I searched for the answer, alot. ABORT ? No,
that would call QUIT. QUIT ? No, that wouldn't return to the location to
print "OK." R> DROP ? There doesn't seem to be any of those in the main
code ... Just an R> somewhere? I didn't see that either. Most R> usage
seemed to be in matched sequences with >R . Sigh ... TWO nested infinite
loops! And, one of them exits? Ok, what is going on here? If R> DROP was
somewhere but not in INTERPRET, then it could be in one of the words
INTERPRET uses. No! It's not there either ... Let me check x86 and CP/M
fig-Forth. OMG, the assembly does the same thing!

After searching a bit, I noticed that fig-Forth word called X that seems to
do what is needed.

fig-Forth's definition of X:

"
: X BLK @ ( END-OF-TEXT IS NULL *)
IF ( DISC ) 1 BLK +! 0 IN ! BLK @ 7 AND 0=
IF ( SCR END ) ?EXEC R> DROP ENDIF ( disc dependent )
ELSE ( TERMINAL ) R> DROP
ENDIF ; ! IMMEDIATE -->
"

See, X contains the required R> DROP needed by INTERPRET and it mentions
"TERMINAL". But, X is not used *anywhere* in the fig-Forth code. Since
it's not used, it seemed I was back to square one ... Although, the
fig-Forth description seemed to confirm it is used for that ... somehow ...

fig-Forth's description of X:

"
X
This is a psuedonym for the "null" or dictionary entry for
a name of one character of ascii null. It is the
execution procedure to terminate interpretation of a line
of text from the terminal or within a disc buffer, as both
buffers always have a null at the end.
"

What?! ...

It turns out that X is how fig-Forth escapes *ONE* of the infinite loops,
the one in INTERPRET. It's just that the explanation of what is being done
is not widespread. Fortunately, I located the following. This is how X
works:

"The other subtlety relates to how the loops are terminated. Note that the
INTERPRET loop shown above never terminates! We all know that it really
does terminate, and the mechanism is pretty kludgey. What happens is that
there is a null character at the end of every line of text in the input
stream, and at the end of every BLOCK of text from mass storage. The text
interpreter picks up this null character just like a normal word. The
dictionary contains an entry which matches this "null word". The associated
code is executed, and it plays around with the return stack in such a way
that the INTERPRET loop is exited without its ever knowing about it."

- from 1984 document by Mitch Bradley
- posted to comp.lang.forth in 1993 by Stephan Bevan
http://groups.google.com/group/comp.lang.forth/msg/c15f99be974e6687

Another copy appears to be here:
ftp://ftp.taygeta.com/pub/forth/Archive/various/interpreter.txt

HTH,


Rod Pemberton



Elizabeth D. Rather

unread,
Oct 4, 2011, 9:53:18 PM10/4/11
to
On 10/4/11 3:10 PM, Rod Pemberton wrote:
> This might be useful for those coding their own Forth interpreter.
>
> It could be called:
> "How fig-Forth secretly escapes an infinite loop."
>
> Or, alternately:
> "One more almost undocumented, hidden, 'feature' of a threaded Forth
> interpreter ..."
...
*Sigh* I remember that trick. It was in very early Forths, probably as
long ago as NRAO. Awful. Excessively "cute" and obscure. Once you're
done enjoying your "eureka!" moment, forget you ever saw that!

A much cleaner solution is to have BEGIN ... WHILE ... REPEAT loops that
compile or interpret depending on STATE, with the loops terminating when
the current input source is exhausted, whereupon the system (or TERMINAL
task) simply waits for more input in the BEGIN ... AGAIN loop in QUIT.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Mark Wills

unread,
Oct 5, 2011, 2:48:42 AM10/5/11
to
On Oct 5, 2:10 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> - posted to comp.lang.forth in 1993 by Stephan Bevanhttp://groups.google.com/group/comp.lang.forth/msg/c15f99be974e6687
>
> Another copy appears to be here:ftp://ftp.taygeta.com/pub/forth/Archive/various/interpreter.txt
>
> HTH,
>
> Rod Pemberton

Rod,

It's more like this:

: QUIT
BEGIN
RESET-RETURN-STACK
INTERPRET
." OK"
AGAIN
;

: INTERPRET ( -- )
BLK @ DUP
IF
\ get input from the block buffer
BLOCK DUP
IF
TIB ! 1024 SPAN !
ELSE
TRUE ABORT" Block load error"
THEN
ELSE
\ get input from the keyboard
DROP
TIB @ 80 DUP C/L ! EXPECT
THEN
BEGIN
32 WORD DUP \ read a word from the terminal input buffer
WHILE
2DUP FIND DUP
IF \ word was found
1+
IF \ immediate word. clean stack & execute
NIP NIP EXECUTE
ELSE \ not immediate
STATE @
IF \ compiling
, 2DROP
ELSE \ interpreting
NIP NIP EXECUTE
THEN
THEN
ELSE \ not a valid word, maybe it's a number?
2DROP 2DUP NUMBER 0=
IF \ valid number
STATE @
IF
COMPILE LIT , 2DROP ELSE NIP NIP
THEN
ELSE \ not a number, so error
DROP CR ." Error:" TYPE ." not found "
STATE @ \ compiling?
IF \ yes, display name of word containing error
." in " LATEST @ 2+ DUP @ $F AND SWAP 2+ SWAP TYPE CR
BLK @ IF
\ display block # if processing a block
." In block " BLK @ .
THEN
THEN
ABORT
THEN
THEN
REPEAT
2DROP
;

Obviously your code won't be identical, but it should give you some
idea of the structure. I should probably factor my INTERPRET into
smaller chunks, but it works. I chopped some code that wasn't
relevant, but you hopefully get the idea.

As Elizabeth said, you get a line of input, either from a block/file
or from the keyboard.

My Forth system has one foot in FIG and one foot in Forth-83 (e.g I
don't think Forth-83 has NUMBER, it has CONVERT or something like
that) but anyway, you'll probably get the general gist.

The key to it is the loop in INTERPRET and the way EXPECT and WORD
behave. There's no special magic. You don't need to manipulate the
return stack etc and jump through hoops.

EXPECT ( addr n -- ) will get n bytes of input from the input device
and store them beginning at address addr. EXPECT will return early if
return (or some sort of EOL) is detected. The number of characters
received is stored in SPAN.

So normally, we'd fill the terminal input buffer (TIB) with EXPECT:

TIB @ 80 EXPECT ( get address of TIB and ask EXPECT to get 80
characters)

When you have some data in the TIB, you call repeatedly. WORD looks at
the address of the input buffer, and adds the offset >IN to it, and
searches from that point for a string delimited by the delimeter
character supplied.

E.g.

32 WORD will look return a word seperated by a space.

WORD ( char_delimeter -- address length)

When word has exausted all its input (e.g. all the characters received
by EXPECT) it returns 0 0 and you drop out of interpret at that point,
back into QUIT. My QUIT resets the return stack, but shouldn't really
be necessary, I guess.

HTH.

I'm not sure if the description I have given is 'classical' but I
figure it's somewhere close. It works, anyway ;-)

Mark

Mark Wills

unread,
Oct 5, 2011, 2:52:22 AM10/5/11
to
> Mark- Hide quoted text -
>
> - Show quoted text -

Woops:

When you have some data in the TIB, you call repeatedly.

Should be:

When you have some data in the TIB, you call WORD repeatedly.

Rod Pemberton

unread,
Oct 5, 2011, 4:18:44 AM10/5/11
to
"Elizabeth D. Rather" <era...@forth.com> wrote in message
news:DMCdnQpPY8sSJRbT...@supernews.com...
> On 10/4/11 3:10 PM, Rod Pemberton wrote:
>
> > [old trick]
>
> *Sigh* I remember that trick. It was in very early Forths, probably as
> long ago as NRAO. Awful. Excessively "cute" and obscure.

lol! :-)

Well, I'm glad there is at least one person around who still knows what was
going on. If I hadn't found that post, I'd have been asking what was up ...

> Once you're done enjoying your "eureka!" moment, forget you
> ever saw that!
>

It's "ugly," but I can't. My interpreter is predominantly fig-Forth like,
so far ... That includes many definitions. So, I need to keep the quirks
in so everything works. Admittedly, I haven't done that anywhere near 100%.
That's because I'm removing (hopefully) uneeded stuff, like BLK or words
that didn't survive to the present, and I'm also keeping things compatible
with the C code. So, words, like COUNT, ENCLOSE, LATEST/LAST, CFA, PFA, etc
have been modified to match my compiler's C methods, e.g., ASCIZ and
structs. I've got ANS notes in there, but I'm nowhere near that yet. I've
also used a hanful of things from other Forth spec.'s that make sense to me,
e.g., -1 for TRUE, CELL, CELLS, CELL+, BYTE, CHAR, CHARS, etc.

> A much cleaner solution is to have BEGIN ... WHILE ... REPEAT loops that
> compile or interpret depending on STATE, with the loops terminating when
> the current input source is exhausted, whereupon the system (or TERMINAL
> task) simply waits for more input in the BEGIN ... AGAIN loop in QUIT.
>

Yes, a version of that is in the Mitch Bradley document. The problem is if
I do that now, I could introduce an unforeseen complication that'll cause me
more work later. I'd much rather do that after it's complete.

Thanks.


Rod Pemberton



Rod Pemberton

unread,
Oct 5, 2011, 4:20:36 AM10/5/11
to
"Mark Wills" <markrob...@yahoo.co.uk> wrote in message
news:dfb02b06-6253-4a45...@m37g2000yqc.googlegroups.com...
> On Oct 5, 2:10 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
>
> > [QUIT and INTERPRET]
Well, my Forth is much less developed, for sure. I have none of these:
BLK BLOCK SPAN ABORT ABORT" C/L 2DUP 2DROP ." 2+ $F \ (

I'm not sure what C/L or $F are ... ." and ( and maybe \ are in the
future. 2x words will be near the very end. I'm hoping I can get away
without BLK or BLOCK. I'd rather use a modern method for loading code
instead of old style disk sectors, e.g., ANS method.

Rightly or wrongly, I followed the advice of Andrew Haley in 2009 on the
order to check STATE and FIND and how to implement INTERPRET. So, my
INTERPRET and sub-words is setup more like his first post here:
http://groups.google.com/group/comp.lang.forth/msg/68915cdba3b07323
http://groups.google.com/group/comp.lang.forth/msg/dd2d8e7342aa17c6

I'm still considering switching the STATE and FIND check order. I like the
fig-Forth ordering better, but prefer the more complete selection of actions
in "Haley's" method.

> Obviously your code won't be identical, but it should give you some
> idea of the structure. I should probably factor my INTERPRET into
> smaller chunks, but it works. I chopped some code that wasn't
> relevant, but you hopefully get the idea.

Except for me disliking the order that the STATE and FIND checks occur,
"Haley's" method is fairly clean. Six states? Also, the Mitch Bradley
document I linked to goes into an discussion on this too.

> [explanation of stuff]
>

I do have most of that stuff working, except WORD, which I'm reworking into
things. I posted the issue I found before completing my fixes which should
be soon, hopefully ...

Thanks.


Rod Pemberton


Elizabeth D. Rather

unread,
Oct 5, 2011, 5:04:25 AM10/5/11
to
On 10/4/11 10:18 PM, Rod Pemberton wrote:
> "Elizabeth D. Rather"<era...@forth.com> wrote in message
> news:DMCdnQpPY8sSJRbT...@supernews.com...
>> On 10/4/11 3:10 PM, Rod Pemberton wrote:
>>
>>> [old trick]
>>
>> *Sigh* I remember that trick. It was in very early Forths, probably as
>> long ago as NRAO. Awful. Excessively "cute" and obscure.
>
> lol! :-)
>
> Well, I'm glad there is at least one person around who still knows what was
> going on. If I hadn't found that post, I'd have been asking what was up ...
>
>> Once you're done enjoying your "eureka!" moment, forget you
>> ever saw that!
>>
>
> It's "ugly," but I can't. My interpreter is predominantly fig-Forth like,
> so far ... That includes many definitions. So, I need to keep the quirks
> in so everything works.

That's not a "quirk" it's a booby trap.

> Admittedly, I haven't done that anywhere near 100%.
> That's because I'm removing (hopefully) uneeded stuff, like BLK or words
> that didn't survive to the present, and I'm also keeping things compatible
> with the C code. So, words, like COUNT, ENCLOSE, LATEST/LAST, CFA, PFA, etc
> have been modified to match my compiler's C methods, e.g., ASCIZ and
> structs. I've got ANS notes in there, but I'm nowhere near that yet. I've
> also used a hanful of things from other Forth spec.'s that make sense to me,
> e.g., -1 for TRUE, CELL, CELLS, CELL+, BYTE, CHAR, CHARS, etc.

Ok, so you're already not pure fig. That was an atrocious model, in any
case. It really isn't necessary to replicate all fig's mistakes.

>> A much cleaner solution is to have BEGIN ... WHILE ... REPEAT loops that
>> compile or interpret depending on STATE, with the loops terminating when
>> the current input source is exhausted, whereupon the system (or TERMINAL
>> task) simply waits for more input in the BEGIN ... AGAIN loop in QUIT.
>
> Yes, a version of that is in the Mitch Bradley document. The problem is if
> I do that now, I could introduce an unforeseen complication that'll cause me
> more work later. I'd much rather do that after it's complete.

You need to understand the code to make it work. If you understand it,
you will have fewer "unforeseen complications" and a far easier test
cycle. Write clean, simple code. That's always the best strategy.

Elizabeth D. Rather

unread,
Oct 5, 2011, 5:14:03 AM10/5/11
to
...
> Well, my Forth is much less developed, for sure. I have none of these:
> BLK BLOCK SPAN ABORT ABORT" C/L 2DUP 2DROP ." 2+ $F \ (

You can get away with omitting blocks. They were still important for
compatibility purposes in 1994, much less now. C/L is "characters per
line", for the old block editor. Omit. But ABORT and ABORT" are useful
(you can define them in terms of CATCH/THROW if you want to). And 2DUP
and 2DROP are incredibly useful for dealing with pairs of things, not
necessarily double numbers. ." is used all over the place. 2+ should be
replaced by CELL+ almost everywhere.

...
> Rightly or wrongly, I followed the advice of Andrew Haley in 2009 on the
> order to check STATE and FIND and how to implement INTERPRET. So, my
> INTERPRET and sub-words is setup more like his first post here:
> http://groups.google.com/group/comp.lang.forth/msg/68915cdba3b07323
> http://groups.google.com/group/comp.lang.forth/msg/dd2d8e7342aa17c6
>
> I'm still considering switching the STATE and FIND check order. I like the
> fig-Forth ordering better, but prefer the more complete selection of actions
> in "Haley's" method.

Andrew's code is generally clean and well-thought-out, which I cannot
say for figForth.

Mark Wills

unread,
Oct 5, 2011, 7:05:39 AM10/5/11
to
On Oct 5, 9:20 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "Mark Wills" <markrobertwi...@yahoo.co.uk> wrote in message
> INTERPRET and sub-words is setup more like his first post here:http://groups.google.com/group/comp.lang.forth/msg/68915cdba3b07323http://groups.google.com/group/comp.lang.forth/msg/dd2d8e7342aa17c6
>
> I'm still considering switching the STATE and FIND check order.  I like the
> fig-Forth ordering better, but prefer the more complete selection of actions
> in "Haley's" method.
>
> > Obviously your code won't be identical, but it should give you some
> > idea of the structure. I should probably factor my INTERPRET into
> > smaller chunks, but it works. I chopped some code that wasn't
> > relevant, but you hopefully get the idea.
>
> Except for me disliking the order that the STATE and FIND checks occur,
> "Haley's" method is fairly clean.  Six states?  Also, the Mitch Bradley
> document I linked to goes into an discussion on this too.
>
> > [explanation of stuff]
>
> I do have most of that stuff working, except WORD, which I'm reworking into
> things.  I posted the issue I found before completing my fixes which should
> be soon, hopefully ...
>
> Thanks.
>
> Rod Pemberton

Rod,

$F is just the value 0xf (15 decimal).

I have code in NUMBER that can accept hex and binary numbers without
me having to change BASE. If NUMBER sees a $ it temporarily changes
BASE to 16 then restores it afterwards.

E.g.

DECIMAL ok
$FF . 255 ok
%1001 . 9 ok

I think it's quite a common feature, as one will often want to express
some values in HEX and HEX and DECIMAL etc just get in the way of the
source code.

I also have $. to print a value as an unsigned hex value. Useful when
doing bit-mangling. So I can do

%11110 $. 1E ok

Whilst still leaving BASE unchanged.

:-)

Coos Haak

unread,
Oct 5, 2011, 7:23:25 PM10/5/11
to
Op Tue, 4 Oct 2011 21:10:13 -0400 schreef Rod Pemberton:

<snip>
>: INTERPRET
> BEGIN
> -FIND
> ( whatever )
> AGAIN
> ;

This would not loop forever and does not need the antique X word:

: INTERPRET
BEGIN BL WORD DUP C@
WHILE <processing>
REPEAT
DROP
;

This way, it ends at the end of the line.

--
Coos

CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html
0 new messages