Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

ANN: Conditions and restarts in ANS Forth

228 views
Skip to first unread message

Alexander Shpilkin

unread,
Feb 23, 2019, 1:21:30 PM2/23/19
to
Sometimes, when I'm not making a fool of myself on c.l.f or ForthHub, I
do actually write code. My Forth system project, <https://github.com/
alexshpilkin/after>, seems to have stalled, but I thought I could release
some developments from it in an ANS-compatible form to let people discuss
and experiment. So, here goes...

This may be considered my response to Mitch Bradley's implied
challenge[1] to come up with a Forth idea that wasn't tried before (or at
least that's how I took it... =) ). I have tried to implement in ANS
Forth a condition and restart system in the vein of Common Lisp[2],
Dylan[3], and Racket[4]. If that seems too theoretical, the C2 wiki also
points out[5] that the DOS Abort/Retry/Ignore prompt is _unimplementable_
on top of exception systems in every modern language except those three.
The best high-level overview of the idea known to me is in Practical
Common Lisp[6], and its originator Kent Pitman also wrote an in-depth
discussion[7] of the issues and design choices. This text will instead
proceed from the lowest level up, describing parts of the code in
corresponding sections. The complete development is also available in a
GitHub repository, <https://github.com/alexshpilkin/studies>.

There's nothing particularly non-obvious about this code; given an active
imagination, one could even claim it does nothing but mix the well-known
ideas of error callbacks and non-local exits in a particular way, any
obscure Lisp influences notwithstanding. I still think it's remarkable
how this particular combination of these ideas achieves a decoupling of
error policy and mechanisms in a way that conventional exception systems
can't.

\ From folklore
: CELL- [ 1 CELLS ] LITERAL - ;
\ From folklore
: +CONSTANT CREATE , DOES> @ + ;
\ From folklore
: ," [CHAR] " PARSE HERE OVER ALLOT SWAP CMOVE ;
\ From Wil Baden's TOOLBELT 2002
: ANDIF POSTPONE DUP POSTPONE IF POSTPONE DROP ; IMMEDIATE
\ From Gforth, for correct backtraces
: NOTHROW ['] FALSE CATCH 2DROP ;

## Frame stack

These first two parts are essentially workarounds for inflexibilities in
ANS Forth. These should be completely straightforward to implement with
carnal knowledge of the system, but are rather awkward and inefficient in
the portable version I give.

In ANS Forth, it's impossible to access earlier frames on the return
stack except by popping until the desired element. This part implements
what's essentially a parallel return stack, with >F and F> in place of >R
and R>, but also FP@ ( -- fp ) to save the current frame pointer and @F
( fp n -- ) to fetch the nth last cell pushed before fp was saved. There
is no supposition here that the frame pointer fp is in fact an address
usable with the usual memory access words---it could just as well be an
offset from FP0, for example. The client code is also not supposed to
know that the frame stack is distinct from the return stack, so this can
be turned into actually usable code as a part of a Forth system.

CREATE FSTACK 32 CELLS ALLOT FSTACK CELL- CONSTANT FP0
VARIABLE FP FP0 FP !
: FP@ ( -- fp ) FP @ ;
: >F ( x -- ) FP @ CELL+ DUP FP ! ! ;
: F> ( -- x ) FP @ DUP CELL- FP ! @ ;
: @F ( fp n -- x ) CELLS - @ ;

## Stack-preserving THROW and CATCH

It puzzles me that THROW and CATCH[8], the only non-local control
transfers provided by ANS Forth, as well as both of the early proposals
(ALERT/EXCEPT/RESUME[9] of Guy and Rayburn, which uses a more convenient
syntax, and EXCEPTION/TRAP[10] of Roye, which is essentially THROW/CATCH
by another name) insist on restoring the data stack pointer. This is
perhaps suitable if they are to be used as an error-handling construct
exactly as envisioned by the authors, but completely inappropriate if an
arbitrary amount of data needs to be passed during the transfer, as here.

To add insult to injury, ANS-style stack-restoring THROW and CATCH are
straightforwardly, if a tad inefficiently, implementable in terms of
stack-preserving ones using DEPTH, while the opposite requires no small
amount of contortions. It _is_ possible, though, using auxiliary global
storage, so that is what this part does. It also takes the opportunity
to save and restore the frame pointer, to make the "parallel return
stack" actually track the real one.

VARIABLE CATCHDEPTH VARIABLE STASHDEPTH
CREATE STASH 32 CELLS ALLOT

: THROW ( -* )
DEPTH CATCHDEPTH @ - DUP STASHDEPTH ! >R STASH BEGIN
R@ 0 > WHILE TUCK ! CELL+ R> 1- >R REPEAT DROP R> DROP
1 THROW ;

: CATCH ( ... xt -- ... f )
CATCHDEPTH @ >R DEPTH 1- CATCHDEPTH ! FP @ >R CATCH
R> FP ! R> CATCHDEPTH ! ( ... error ) DUP 1 = IF DROP
STASHDEPTH @ DUP >R CELLS STASH + BEGIN R@ 0 > WHILE
CELL- DUP @ SWAP R> 1- >R REPEAT DROP BEGIN
R@ 0 < WHILE DROP R> 1+ >R REPEAT R> DROP TRUE
ELSE DUP IF THROW THEN THEN ;

The data stack depth can be saved on the return stack and restored back
manually by MARK and TRIM, but only if it has become greater in the
meantime.

: MARK POSTPONE DEPTH POSTPONE >R ; IMMEDIATE
: (TRIM) >R BEGIN DEPTH R@ U> WHILE DROP REPEAT R> DROP ;
: TRIM POSTPONE R> POSTPONE (TRIM) ; IMMEDIATE

A final word of warning: my implementations of THROW ( -* ) and CATCH
( ... xt -- ... f ) use ANS throw code 1 and pass through any others.
However, this is to be understood more as a debugging aid than a serious
attempt at interoperability: an ANS exception passing through a condition
handling construct will still break things. Actual interoperability is
possible, but would disrupt the logical layering of code that this
presentation uses.

## SIGNAL, RESPOND, and PASS

This part and the next one implement the logic for doing something when
an unusual situation arises and for unwinding the stack if necessary.
The general approach parallels the original 32-bit Windows SEH[11],
though I hope my implementation is not that convoluted.

When a program needs to indicate that something unusual happened, it puts
one _or several_ items describing the situation on the data stack and
calls SIGNAL. This does not in itself relinquish control; instead, a
callback routine is called normally. That routine can either perform a
non-local exit or return to the signaller (in case this was only a
warning and it got ignored, for example).

The callbacks are called _responses_ and are installed using RESPOND
(... xt response-xt -- ... ). A stack of responses is maintained, and a
response can give up and call the one up the stack using PASS. (In a
real implementation PASS would do a tail call.) The stack of callbacks is
maintained as linked frames on the frame stack: the response xt is on
top, and the pointer to the previous response frame is below it. The
pointer to the top frame is stored in (would-be user) variable RESPONSE,
saved and restored by every RESPOND block. When a response is invoked,
the pointer rf to its own response frame is put on the data stack on top
of the information provided by the signaller. The response can use it to
retrieve additional data from the frame or pass it to PASS ( rf -* ).

(The slightly backwards frame structure, with the xt above the link, is
to make SIGNAL completely oblivious of the links. I don't know if it's
worth it.)

Unlike other systems, this one does not implement the Common Lisp
"condition firewall" or give any other special treatment to signalling
inside a response. A response just regular code executing in the dynamic
scope of the signaller.

VARIABLE RESPONSE

: SIGNAL RESPONSE @ DUP 0 @F EXECUTE ;

: RESPOND ( ... xt response-xt -- ... )
RESPONSE @ >F >F FP@ RESPONSE ! CATCH ( ... f )
F> DROP F> RESPONSE ! IF THROW THEN NOTHROW ;

: (PASS) 1 @F DUP 0 @F EXECUTE ;
: PASS ( rf -* ) POSTPONE (PASS) POSTPONE EXIT ; IMMEDIATE

## ESCAPE and RESUME

A response does not have to perform a non-local exit, but if an error was
signalled, it will probably want to, whether to retry the operation from
some predetermined point, to return a substitute value from it, or to
abort it. The non-local exit provided by THROW and would be suitable
here, were it not so indiscriminate: THROW passes control to the closest
CATCH, and that's it.

Thus, ESCAPE and RESUME implement on top of Forth THROW and CATCH a
capability that is closer to their Common Lisp[12] and MACLISP[13]
counterparts: an exit point established with RESUME ( ... xt -- ... f )
is uniquely identified by its _tag_, which is passed to xt on top of the
data stack, and ESCAPE ( tag -* ) requires the tag of the exit it should
take.

The tag is implemented as the frame pointer at the moment RESUME is
entered. The actual implementation of ESCAPE is then simply THROW, while
RESUME, after performing a CATCH, compares the tag on top of the data
stack with the current frame pointer and either exits to user code or
reTHROWs. Unfortunately, this protocol has to more or less monopolize
THROW and CATCH. A notable exception is cleanup code that must be
executed every time control goes out of a protected block of code: this
cannot be implemented with ESCAPE and RESUME and must instead be done
using THROW and CATCH themselves:

protected-xt CATCH ( cleanup code ) IF THROW THEN

The cleanup code cannot rely on the state of the data stack, of course,
but things can be passed to it on the return stack instead.

: ESCAPE ( tag -* ) POSTPONE THROW ; IMMEDIATE

: RESUME ( ... xt -- ... f )
FP@ SWAP 0 >F CATCH ( ... 0 | tag -1 ) F> DROP
DUP ANDIF OVER FP@ <> THEN IF DROP THROW THEN NOTHROW
DUP IF NIP THEN ;
\ the 0 ensures different RESUMEs have different tags

_Obscure prior work note:_ The Common Lisp implementation of this is more
limited, as described in X3J13 issue EXIT-EXTENT[14]. Let a RESUME
called A be established first, then inside it a RESUME called B, then
inside that a cleanup CATCH as above called C, then let the code inside
that escape to A, so the cleanup code of C is now executing. The issue is
whether that cleanup code can escape to B, which is closer in the call
stack than the original A it is supposed to continue to. Common Lisp
allows systems to prohibit this (proposal MINIMAL in the issue writeup),
but this implementation allows it (proposal MEDIUM). I think the latter
is cleaner, because exit scope is dynamic scope, and dynamic scope is
dynamic scope is dynamic scope, but apparently the Common Lisp
implementers disagreed.

## Class system

What is implemented up to this point is not a complete condition and
restart system, but only a foundation for one: while there is a mechanism
to receive data about an unusual situation and accept or decline to
handle it, there is no agreed protocol for doing so. Similarly, while
there is a way to escape to a chosen resumption point, possibly passing
some data on the stack, there is no protocol for performing the choice.
This is what remains to be done: in SysV ABI terms, the "personality"[15].

Following common terminology, I call unusual situations that are handled
by the system _conditions_, the responses to them _handlers_, and the
ways of recovery from those situations _restarts_. Condition data is put
on the data stack before calling SIGNAL, and each handler uses the
contents of its frame to decide whether to handle or PASS. A restart is
simply a condition whose handler, by convention, immediately escapes to
the restart code, leaving the restart data intact on the data stack. In
both cases, more specific data should be placed below less specific, so
that the handler or restart code can still work even if it is only
prepared to handle a less specific kind of condition or restart.

What we need, then, is a way of organizing the various kinds of
conditions and restarts in a hierarchy by specificity. After thinking
about this for some time, I couldn't come up with anything better than
just doing a simple single-inheritance, prototype-based[16] object
system. The way it is used is a bit unusual, though: the objects are
statically allocated and represent not the conditions or restarts
themselves, but the kinds of condition or restart, so I call it a class
system instead of an object system.

To signal a particular condition, put the data corresponding to that
condition on the data stack, topped off by the condition class, then
execute SIGNAL. The innermost handler for which the signalled condition
class EXTENDS the handled one will handle it. The protocol for restarts
is similar and will be described later.

We see that very little is required of classes: we must be able to tell
whether one class EXTENDS another one, and strictly speaking that's it.
I've also included rudimentary support for _slots_, that is, values
associated with a class that can be overriden by an extending class. A
slot that stores an xt can be used like a method in a traditional object
system. When used in a condition or restart class, methods will want to
inspect the data left below the class itself, but, contrary to the usual
Forth convention, they should _not_ consume it: a method cannot know if
there are more items on the stack than it expects, and the user cannot
know how many items a method expects, unless they know in advance what
the exact class is, which would defeat the point of classes.

(Of course, there could be a standard DUPLICATE method or a SIZE slot. I
don't think that would improve things compared to just not consuming the
data.)

HERE CELL+ DUP , 1 CELLS , CONSTANT TOP

: CLONE ( c "name" -- ) DUP >R CREATE HERE >R DUP @
TUCK - SWAP [ 2 CELLS ] LITERAL + DUP , R> + , BEGIN
DUP R@ < WHILE DUP @ , CELL+ REPEAT @ CELL+ , R> DROP
DOES> DUP @ + ;

: EXTENDS ( c1 c2 -- ) OVER @ OVER @ MIN ROT SWAP - @ = ;

: METHOD ( xt "name" -- ) ( ... c -- ... c )
CREATE , DOES> >R DUP R> @ EXECUTE @ EXECUTE ;

## HANDLE

The protocol for signalling conditions has already been described above:
put the condition data and condition class on the stack, then call
SIGNAL. To handle subclasses of a class c using handler-xt during
execution of xt, call HANDLE ( ... xt handler-xt c -- ... ). The handler
receives the pointer to its own frame on top of the class; the last cell
put on the frame stack before HANDLE is accessible using 4 @F (and so on
for earlier cells). Returning from the handler causes execution to
resume after SIGNAL.

Slot 1 (or >UNHANDLED) of every condition specifies the default handler
for that condition, that will be executed if no handler on the stack
accepts responsibility for it. The default handler, unlike the usual
ones, is invoked with the condition class on top of the stack. Slot 2
(>DISPLAY) should contain the xt that prints the condition data on the
console in a user-readable form (leaving the stack intact). Additionally,
the description of every condition should specify whether a handler is
allowed to return or whether it has to invoke a restart.

The base class for bug and error conditions is ?. Any classes extending
it should have names ending with ? and describing the object or action
that encountered a problem, not the problem itself, because there is
usually less arbitrary choice in naming the former: for example, I/O?,
not I/O-ERROR?; NAME?, not NAME-NOT-FOUND?.

: (HANDLE) ( ... c fp -- ... ) 2DUP 2 @F EXTENDS IF
DUP 3 @F EXECUTE ELSE PASS THEN ;
: HANDLE ( ... xt c handler-xt -- ... ) >F >F
['] (HANDLE) RESPOND F> DROP F> DROP ;

1 CELLS +CONSTANT >UNHANDLED ' >UNHANDLED METHOD UNHANDLED
2 CELLS +CONSTANT >DISPLAY ' >DISPLAY METHOD DISPLAY

: DEFAULT-RESPONSE ( с rf ) DROP UNHANDLED ;
' DEFAULT-RESPONSE >F FP@ RESPONSE !

: UNHANDLED-? ." Unhandled " DISPLAY ABORT ;
: DISPLAY-? ." bug" ;
TOP CLONE ? ' UNHANDLED-? , ' DISPLAY-? ,

## RESTART

For now, a handler that wants to escape to code up the call stack can be
established by first establishing a RESUME, saving the tag to the handler
frame, and using it to ESCAPE when needed. For handlers that _always_
want to escape before doing anything else, this operation is packaged
into a single word, RESTART ( ... xt c -- ... f ). Restarts are simply
conditions that are by convention always handled using RESTART. The base
class for restarts is RESTART?. It extends ? so as to also serve as a
bug condition indicating that an unavailable restart was signalled.

_Prior work note:_ The idea to make a restart a kind of condition is
originally from Dylan; in Common Lisp, restarts are disjoint from
conditions. However, the functionality provided by RESTART is desirable
in any system, but awkward to implement using restarts as the sole stack
unwinding mechanism, while the reverse is straigtforward. This
implementation uses the simpler Dylan approach.

: ((RESTART)) ( ... c hf -* ) 4 @F ESCAPE ;
: (RESTART) ( ... xt c tag -- ... )
>F ['] ((RESTART)) HANDLE F> DROP ;
: RESTART ( ... xt c -- ... f ) ['] (RESTART) RESUME ;

3 CELLS +CONSTANT >NAME
5 CELLS +CONSTANT >DESCRIBE ' >DESCRIBE METHOD DESCRIBE

: DISPLAY-RESTART? ." bug: no restart " DUP >NAME 2@ TYPE ;
HERE ," RESTART?" DUP HERE SWAP -
: DESCRIBE-RESTART? ." Unspecified restart" ;
? CLONE RESTART? ( ... c -- restart? ) ? >UNHANDLED @ ,
' DISPLAY-RESTART? , ( c-addr len ) , , ' DESCRIBE-RESTART? ,

## Example

The [example](crexam.fth) demonstrates a DOS-style Abort/Retry/Ignore
prompt, with even more modularity than in DOS: the simulated SYSTEM
provides the prompt, but has no knowledge of the available recovery
options; the simulated SHELL and I/O subsystem provide the options, and
the simulated APPLICATION has no knowledge of either. A serious attempt
at an interactive debugger would probably look a lot like SYSTEM, except
it could use system-specific knowledge to implement a full debugging
REPL. Alternatively, the list of restarts could be incorporated into the
a return stack trace.

Unlike the rest of the code, the example doesn't limit itself to the ANS
CORE wordset; it uses quotations from Forth 200x.

\ An error is not a bug, but an unhandled one is
: DISPLAY-ERROR? ." error" ;
? CLONE ERROR? ? >UNHANDLED @ , ' DISPLAY-ERROR? ,

: I/O?-DEVICE ( i/o? -- i/o? dev# ) OVER ;
: DISPLAY-I/O?
." input/output error on device " I/O?-DEVICE . ;
ERROR? CLONE I/O? ( dev# -- i/o? ) ERROR? >UNHANDLED @ ,
' DISPLAY-I/O? ,

VARIABLE BUFFER

: (READ-BYTE) ( simulate an error and garbage data )
42 BUFFER ! 123 I/O? SIGNAL ;

HERE ," IGNORE?" DUP HERE SWAP -
: DESCRIBE-IGNORE? ." Ignore the error and proceed" ;
RESTART? CLONE IGNORE? RESTART? >UNHANDLED @ ,
RESTART? >DISPLAY @ , ( c-addr len ) , , ' DESCRIBE-IGNORE? ,

HERE ," RETRY?" DUP HERE SWAP -
: DESCRIBE-RETRY? ." Retry the operation" ;
RESTART? CLONE RETRY? RESTART? >UNHANDLED @ ,
RESTART? >DISPLAY @ , ( c-addr len ) , , ' DESCRIBE-RETRY? ,

: READ-BYTE
BEGIN
MARK
[: MARK ['] (READ-BYTE) IGNORE? RESTART TRIM ;]
RETRY? RESTART
WHILE
TRIM
REPEAT R> DROP ;

: APPLICATION READ-BYTE ." Read byte: " BUFFER @ . CR ;

HERE ," ABORT?" DUP HERE SWAP -
: DESCRIBE-ABORT? ." Stop and return to shell" ;
RESTART? CLONE ABORT? RESTART? >UNHANDLED @ ,
RESTART? >DISPLAY @ , ( c-addr len ) , , ' DESCRIBE-ABORT? ,

: SHELL
MARK
['] APPLICATION ABORT? RESTART IF ." Aborted " CR THEN
TRIM ;

: MORE-RESPONSES 0 @F ['] DEFAULT-RESPONSE <> ;
: NEXT-RESPONSE 1 @F ;
: RESTART-RESPONSE DUP 0 @F ['] (HANDLE) = ANDIF DUP 2 @F
RESTART? EXTENDS THEN ;
: RESTART. DUP >NAME 2@ TYPE ." -- " DESCRIBE DROP ;
: LIST-RESTARTS ." Restarts:" CR RESPONSE @ BEGIN
RESTART-RESPONSE IF DUP 2 @F 2 SPACES RESTART. CR THEN
DUP MORE-RESPONSES WHILE NEXT-RESPONSE REPEAT DROP ;

: SYSTEM
['] SHELL ? [:
( hf ) DROP ." Signalled " DISPLAY CR LIST-RESTARTS
PAD DUP 84 ACCEPT CR EVALUATE SIGNAL
;] HANDLE ;

CR SYSTEM

## Design issues

This is a proof of concept. The organization and naming of the words
should generally be sound, but I can only design a limited part of a
system without actually using it first. The following is a list of
design issues I can see but have decided not to sidestep in this sketch:

* The condition hierarchy needs to be worked out. There need to be
base classes for warnings, errors, bugs, implementation limitations.
The distinction between bugs and (runtime) errors is particularly
important, and many languages with exception systems (C++, Java,
Python...) get it wrong. It is not clear to me if errors should be a
particular case of bugs, as in the example, or if the default handler
for an error should signal a bug; similarly for restarts. It seems
that a handler for bugs would usually be installed by an interactive
toplevel of some sort, so it would make sense for it to also be an
absolute barrier for any conditions or restarts. On the other hand,
there can also be fatal conditions that are not in any sense bugs, like
SystemExit in Python or ThreadInterruptedException in Java.

* Restarts being conditions still feels more like an accident than a
coherent choice; I can't think of a case where a hierarchy of restarts
could possibly be useful.

* The set of methods on a condition class is a sketch. Unfortunately, no
methods can be added to a class once it has been defined, so things like
DISPLAY will have to be baked in.

* An actual implementation of frames using the return stack will need to
carefully specify which words that create frames (like HANDLE) make them
contiguous with whatever user code put there before.

* Even if frame pointers are not actual addresses, a specification might
want to declare that they are disjoint from any addresses pointing to
user-created areas. (ANS Forth doesn't do this even for xts, though.)

* I don't know how to frame pointers should be passed around. Wrapping
code like (HANDLE) might want to consume the relevant data on the frame
and advance the frame pointer it passes to underlying code to a point
lower on the stack. This probably composes better, but means that, for
example, a handler will not be able to call PASS.

* Restarts currently don't have user-accessible frames that could be used
without passing control to the restart, like LIST-RESTARTS does in the
example. This may or may not be a good idea. Restart code or methods
will also want to access the _condition_ it's handling; the calling
convention requires some thought.

* It's not clear to me that the approach the example uses to enumerate
available restarts is the best one. A separate stack containing only the
restarts is also an option, but then a restart barrier, like a handler
for ?, will need to manipulate it explicitly to ensure introspection
works correctly.

* Words are needed for inspecting and walking the response stack (at
least equivalent to those used in the example). I'm not aware of any
general approaches to looping over arbitrary structures in Forth (unlike
its more Lispish cousins like Joy or Factor), so this will have to be
postponed for now.

* The layout of slots and frames is too fragile and requires far too much
boilerplate in user code. It might make sense to use some sort of
structure lexicon to simplify this, but there is a certain appeal to
defining classes in one pass instead of compiling default data first and
overwriting it later.

* The syntax in general is downright miserable to use without quotations
and still awkward with them, similar to ANS Forth exceptions. A syntax
more in line with usual Forth control structures, like that of the ALERT/
EXCEPT/RESUME proposal, might be better, even if it's unimplementable in
terms of ANS words. On the other hand, the requirement to pack things
into a separate xt serves nicely to enforce return stack separation
that's in any case required by the words. (In this sense, it's the
syntax of DO/LOOP that is the mistake.)

[1]: https://github.com/ForthHub/discussion/issues/
79#issuecomment-454218065
[2]: http://www.lispworks.com/documentation/lw71/CLHS/Body/09_.htm
[3]: https://opendylan.org/books/drm/Conditions
[4]: https://docs.racket-lang.org/reference/exns.html
[5]: http://wiki.c2.com/?AbortRetryIgnore
[6]: http://www.gigamonkeys.com/book/beyond-exception-handling-
conditions-and-restarts.html
[7]: http://www.nhplace.com/kent/Papers/Condition-Handling-2001.html
[8]: https://www.complang.tuwien.ac.at/forth/dpans-html/dpans9.htm
[9]: http://soton.mpeforth.com/flag/jfar/vol3/no4/article3.pdf
[10]: http://soton.mpeforth.com/flag/jfar/vol5/no2/article4.pdf
[11]: http://bytepointer.com/resources/
pietrek_crash_course_depths_of_win32_seh.htm
[12]: http://clhs.lisp.se/Body/s_catch.htm
[13]: http://www.maclisp.info/pitmanual/contro.html#5.13.1
[14]: http://clhs.lisp.se/Issues/iss152_w.htm
[15]: https://stackoverflow.com/q/16597350
[16]: http://wiki.c2.com/?PrototypeBasedProgramming

minf...@arcor.de

unread,
Feb 23, 2019, 5:01:10 PM2/23/19
to
I admit not having read everything...
Just what came to my mind (and what is implemented in MinForth):

MF exception frames are chained data structures living in a separate
stack, not in Forth's return stack. MF is implemented in C and that separate
stack is actually the C return stack.

Exception frames contain also setjmp records. Forth exceptions trigger C
longjmp()s thereby unwinding both stacks, which is the "trick" to "restart"
the whole system.

Individual exception frames of the exception frame chain can be accessed
and exceptions executed by transversing the chain via pointers.

Alexander Shpilkin

unread,
Feb 23, 2019, 8:03:06 PM2/23/19
to
On Sat, 23 Feb 2019 14:01:09 -0800, minforth wrote:
> I admit not having read everything...

I regret not writing a short summary, but I was a bit fed up writing the
long version already, to be honest. If people are interested, I will; in
the meantime, you can ponder the example (last code section) and/or run
it in any Forth with quotations[*] (it's crexam.fth in the repo). It
does an interesting thing that's natural with the condition-and-restart
system, but not with ANS THROW/CATCH.

> Just what came to my mind (and what is implemented in MinForth):
>
> MF exception frames are chained data structures living in a separate
> stack, not in Forth's return stack. MF is implemented in C and that
> separate stack is actually the C return stack.
>
> Exception frames contain also setjmp records. Forth exceptions trigger C
> longjmp()s thereby unwinding both stacks, which is the "trick" to
> "restart" the whole system.

Sure. Choosing separate or same return stack is a decision that's
invisible outside the implementation as far as I can see, but you'll
indeed need to have a separate one if you want to propagate exceptions
across Forth->C->Forth calls. (Or you could do what lua_pcall does and
say the invervening C code will have to unwind itself by hand.)

Though my impression from looking at the MinForth code for five minutes
is that no tricky C->Forth->C->Forth->THROW business is happening and the
stack is always unwound down to main().

To be clear: the CL-style "restarts" in my system are unrelated to this.

> Individual exception frames of the exception frame chain can be accessed
> and exceptions executed by traversing the chain via pointers.

Again, sure. The code that implements THROW will have to walk some sort
of exception stack to find the closest CATCH.

My issue is having to access frames _from Forth_, because the handlers of
my system are more like error callbacks than exception handlers: when you
encounter an error, you _don't_ unwind the stack before executing the
handler (and the handler can return to the signalling code---think
warning handlers). Which means that if the handler wants to remember
something from the moment it was established, it needs to be a "downward
funarg"[1]/"Pascal closure" in that it needs to access some values that
aren't on top of _any_ stack. (Not the data one, not the return one, not
the exception one...) Hence the Forth-visible frames. Stack-allocated
Ertl-style closures[2] would be a less verbose way to do the same.

[*]: I could implement the example without quotations, but the syntax
would really, really suck, so I didn't.
[1]: https://stackoverflow.com/q/581182
[2]: http://www.complang.tuwien.ac.at/anton/euroforth/ef18/papers/ertl.pdf

--
Alex

minf...@arcor.de

unread,
Feb 24, 2019, 3:01:50 AM2/24/19
to
Am Sonntag, 24. Februar 2019 02:03:06 UTC+1 schrieb Alexander Shpilkin:
> On Sat, 23 Feb 2019 14:01:09 -0800, minforth wrote:
> > I admit not having read everything...
>
> I regret not writing a short summary, but I was a bit fed up writing the
> long version already, to be honest. If people are interested, I will; in
> the meantime, you can ponder the example (last code section) and/or run
> it in any Forth with quotations[*] (it's crexam.fth in the repo). It
> does an interesting thing that's natural with the condition-and-restart
> system, but not with ANS THROW/CATCH.
>
> > Just what came to my mind (and what is implemented in MinForth):
> >
> > MF exception frames are chained data structures living in a separate
> > stack, not in Forth's return stack. MF is implemented in C and that
> > separate stack is actually the C return stack.
> >
> > Exception frames contain also setjmp records. Forth exceptions trigger C
> > longjmp()s thereby unwinding both stacks, which is the "trick" to
> > "restart" the whole system.
>
> Sure. Choosing separate or same return stack is a decision that's
> invisible outside the implementation as far as I can see, but you'll
> indeed need to have a separate one if you want to propagate exceptions
> across Forth->C->Forth calls. (Or you could do what lua_pcall does and
> say the invervening C code will have to unwind itself by hand.)
>
> Though my impression from looking at the MinForth code for five minutes
> is that no tricky C->Forth->C->Forth->THROW business is happening and the
> stack is always unwound down to main().

MinForth's code base is not public, you can't inspect it.

You might have looked at an ~20 y old VM-based ancestor though, which had been
an abandoned test program even then. There are no similarities to today.

a...@littlepinkcloud.invalid

unread,
Feb 24, 2019, 5:50:43 AM2/24/19
to
Alexander Shpilkin <ashp...@gmail.com> wrote:
>
> ## Stack-preserving THROW and CATCH
>
> It puzzles me that THROW and CATCH[8], the only non-local control
> transfers provided by ANS Forth, as well as both of the early
> proposals (ALERT/EXCEPT/RESUME[9] of Guy and Rayburn, which uses a
> more convenient syntax, and EXCEPTION/TRAP[10] of Roye, which is
> essentially THROW/CATCH by another name) insist on restoring the
> data stack pointer.

It shouldn't puzzle you. CATCH and THROW are a structured exception-
handling mechanism, with all that implies. The problem with not
restoring the stack pointer is that the depth of the stack becomes
unknown at the point of the CATCH, presenting the programmer, in
particular a maintenance programmer, with a program that it is
impossible to reason about. Also, it is pointless to invent a CATCH
mechanism that leaves stuff on the stack because instead of this:

[: blah foo leaving? if nn throw then ;] catch if leave-actions stuff then

you can do this:

[: blah foo leaving? if leave-actions nn throw then ;] catch if stuff then

In other words, move the code that needs the stack at the point of the
THROW to the point of the THROW.

> This is perhaps suitable if they are to be used as an error-handling
> construct exactly as envisioned by the authors, but completely
> inappropriate if an arbitrary amount of data needs to be passed
> during the transfer, as here.

Kernighan: "Everyone knows that debugging is twice as hard as writing
a program in the first place. So if you're as clever as you can be
when you write it, how will you ever debug it?"

Andrew.

Alexander Shpilkin

unread,
Feb 25, 2019, 9:34:19 AM2/25/19
to
Ah. I was wrong, then. Point for making a fool of myself on c.l.f :) I
know next to nothing about commercial systems, unfortunately--- there's
usually no reason for me to seek out an implementation I can't inspect.
(The Chez Scheme papers might be an exception, as its developers have
been extraordinarily open in documenting its internals.)

I was actually trying to guess an answer to a question, but maybe I
should have just asked: does MinForth use setjmp/longjmp in order to
support tricky situations like unwinding across (opaque) C code, or is it
just a shortcut to avoid doing it explicitly inside the VM?

(The whole point of a condition system like I posted vs a conventional
exception system is that signalling an error doesn't, in itself, have to
perform a non-local exit like a THROW or a longjmp()... But implementing
non-local exits is still an interesting topic.)

--
Alex

minf...@arcor.de

unread,
Feb 25, 2019, 10:08:50 AM2/25/19
to
MinForth is not commercial. Its sources are not public by contractual
reasons, and "the world" has no need for yet another Forth.

> I was actually trying to guess an answer to a question, but maybe I
> should have just asked: does MinForth use setjmp/longjmp in order to
> support tricky situations like unwinding across (opaque) C code, or is it
> just a shortcut to avoid doing it explicitly inside the VM?
>

The classic high-level THROW/CATCH mechanism unwinds only two Forth stacks
by stepping down one exception frame after the other.
When you have a separate exception frame stack, you can pick whatever frame
you like to reset the system to its associated earlier state.
MinForth does not use a separate exception frame stack, but threads exception
frames into C local variable stack frames.
It sounds more complicated than it is, and prevents C stack overflows naturally.

> (The whole point of a condition system like I posted vs a conventional
> exception system is that signalling an error doesn't, in itself, have to
> perform a non-local exit like a THROW or a longjmp()... But implementing
> non-local exits is still an interesting topic.)
>

As I said, when you can address individual exception frames, you can use
them directly as exit door without stepping down the whole ladder one by one.

Alexander Shpilkin

unread,
Feb 25, 2019, 1:32:08 PM2/25/19
to
On Mon, 25 Feb 2019 07:08:48 -0800, minforth wrote:
> Am Montag, 25. Februar 2019 15:34:19 UTC+1 schrieb Alexander Shpilkin:
>> On Sun, 24 Feb 2019 00:01:49 -0800, minforth wrote:
>>> MinForth's code base is not public, you can't inspect it.
>>>
>>> You might have looked at an ~20 y old VM-based ancestor though, which
>>> had been an abandoned test program even then. There are no
>>> similarities to today.
>>
>> Ah. I was wrong, then. Point for making a fool of myself on c.l.f :)
>> I know next to nothing about commercial systems, unfortunately---
>> there's usually no reason for me to seek out an implementation I can't
>> inspect. (The Chez Scheme papers might be an exception, as its
>> developers have been extraordinarily open in documenting its
>> internals.)
>>
> MinForth is not commercial. Its sources are not public by contractual
> reasons, and "the world" has no need for yet another Forth.

Should have said "closed-source" or something similar... In any case,
I'd say the issue is whether the code is publicly available knowledge or
not, not whether people make money off it.

>> I was actually trying to guess an answer to a question, but maybe I
>> should have just asked: does MinForth use setjmp/longjmp in order to
>> support tricky situations like unwinding across (opaque) C code, or is
>> it just a shortcut to avoid doing it explicitly inside the VM?
>>
>>
> The classic high-level THROW/CATCH mechanism unwinds only two Forth
> stacks by stepping down one exception frame after the other.
> When you have a separate exception frame stack, you can pick whatever
> frame you like to reset the system to its associated earlier state.

Ah, right, ANS Forth does do that. "Exit all at once" is easily
implementable as a convention on top of "exit step-by-step", of course,---
that's what ESCAPE/RESUME do in my code, for example. And you're going
to want to interrupt an all-at-once exit midway in any case to support
cleanup (finally / UNWIND-PROTECT / however you want to call it). So I
don't think it's a big loss to implement all-at-once exits in terms of
step-by-step exits, although the former does make its own sense API-wise.

Just to be clear as to what I mean, here's a quick implementation of all-
at-once exits (called ESCAPE/RESUME) in terms of ANS THROW/CATCH and a
constant amount of storage:

VARIABLE TAG 1 TAG !

: ESCAPE ( ... tag -* ) THROW ;

: RESUME ( ... xt -- ... f ) ( xt: ... tag -- ... )
TAG @ DUP 1+ TAG ! SWAP CATCH TAG @ 1- DUP TAG !
OVER 0= IF DROP EXIT THEN OVER <> IF THROW THEN DROP
DROP TRUE ;

\ there's one more data stack item when CATCH is executed than
\ when RESUME is executed, so two DROPs are needed

Because ANS THROW can only pass one cell to the corresponding CATCH, this
ESCAPE can't pass anything to the corresponding RESUME, but otherwise it
works fine. The version in my original article first patches THROW/CATCH
to preserve the data stack (and introduces additional words MARK/TRIM to
save and restore it explicitly), so it doesn't have that limitation.

> MinForth does not use a separate exception frame stack, but threads
> exception frames into C local variable stack frames.
> It sounds more complicated than it is, and prevents C stack overflows
> naturally.

Er.... If what I've seen is not the MinForth code, then I have to admit
I have no idea what you mean here. That's why I originally went looking
for the code. Can you draw a picture? :)

>> (The whole point of a condition system like I posted vs a conventional
>> exception system is that signalling an error doesn't, in itself, have
>> to perform a non-local exit like a THROW or a longjmp()... But
>> implementing non-local exits is still an interesting topic.)
>>
> As I said, when you can address individual exception frames, you can use
> them directly as exit door without stepping down the whole ladder one by
> one.

There are two different things your usage of "address" could mean. Do
you only mean that there's a means to specify which non-local exit you
want to take? (What my code snippet above does.) Or do you mean that,
in addition to that, I can go and look up some values from when a
particular exit was established, _without_ taking it? (What my c-and-r
system requires---"downward funargs".)

--
Alex

minf...@arcor.de

unread,
Feb 25, 2019, 3:30:06 PM2/25/19
to
To cut it short, and because it is non-portable anyhow, imagine MF exception
frames as C jmp_buf structures, but pepped up by additional data representing
actual Forth system states.

Such jmp_buf-plus structures can either be used as exception frames in normal
CATCH/THROW ways, or - when stored in a global variable - be used as extra
global exit doors, e.g. to cold re-start the system.

m...@iae.nl

unread,
Feb 27, 2019, 2:57:04 AM2/27/19
to
On Saturday, February 23, 2019 at 11:01:10 PM UTC+1, minf...@arcor.de wrote:
[..]
> MF exception frames are chained data structures living in a separate
> stack, not in Forth's return stack. MF is implemented in C and that separate
> stack is actually the C return stack.
>
> Exception frames contain also setjmp records. Forth exceptions trigger C
> longjmp()s thereby unwinding both stacks, which is the "trick" to "restart"
> the whole system.

This suggests that the whole Forth system is perfectly
rewound to a previous point in time.

How does that work?

For the Forths I know, it would involve restoring the content of
their integer, floating-point, return, locals, and system stack,
recovering a copy of the user area, restoring the content of
transient areas (like HERE and PAD), restoring all the variables
and values of the running application, resetting the state of
the text, graphics and editor consoles, resetting all open files
(i.e. the keyboard and numeric pad), reallocating all allotted
and allocated memory, restoring the state of the graphics
subsystem (which is only partly written in Forth), resetting
the OS random number generator and clock, properly rewinding
and resetting running tasks, and unmentionable things I may
have forgotten here.

-marcel

minf...@arcor.de

unread,
Feb 27, 2019, 4:02:45 AM2/27/19
to
Be more reasonable, please. Does your Forth system include a time machine?

none albert

unread,
Feb 27, 2019, 8:05:03 AM2/27/19
to
In article <e9c435be-b931-4c5b...@googlegroups.com>,
I think the idea that on an error you must have a perfect working
system to handle the error is preposterous.
That is why original Unix systems had the idea that on a kernel
panic you must "dump core". Unpractical nowadays.
That is why ciforth on an error tries to print the unique (!) error
number that identifies the situation and tries to print the text
"ciforth" , on the error channel of course. "Tries" because the program
may be almost dead.
Once that succeeded, maybe other information may get out, but maybe
it doesn't.

The more layers around error handling, the less the chance to
solve the real problems, when not if they occurr.

The there is the qualitative difference between types of errors,
like a sector not reading and filling a zero in a database
where it should not be accepted. No time to go into that now.

>-marcel
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

m...@iae.nl

unread,
Feb 27, 2019, 1:39:49 PM2/27/19
to
I agree, however, Forth being an interpreter, some subset
of the above list is required.

-marcel

minf...@arcor.de

unread,
Feb 27, 2019, 3:04:12 PM2/27/19
to
Obviously. A protection system f.ex. needs exact definitions about what has
to be preserved after a trigger event, has to be reset to zero conditions,
what has to alarmed or logged, and of course maximum allowed reaction times.
Et cetera.

A multi-user multi-tasking data flow management application requires totally
different exit doors, and might even need rerouting strategies for channels
and dump interrupted data packages, which can be rather complex clean-up
tasks.

Just two real-world examples where Forth's simple CATCH/THROW mechanism
is not sufficient.

Alexander Shpilkin

unread,
Feb 28, 2019, 9:55:05 AM2/28/19
to
On Sun, 24 Feb 2019 04:50:37 -0600, aph wrote:
> Alexander Shpilkin <ashp...@gmail.com> wrote:
>>
>> ## Stack-preserving THROW and CATCH
>>
>> It puzzles me that THROW and CATCH[8], the only non-local control
>> transfers provided by ANS Forth, as well as both of the early proposals
>> (ALERT/EXCEPT/RESUME[9] of Guy and Rayburn, which uses a more
>> convenient syntax, and EXCEPTION/TRAP[10] of Roye, which is essentially
>> THROW/CATCH by another name) insist on restoring the data stack
>> pointer.
>
> It shouldn't puzzle you. CATCH and THROW are a structured exception-
> handling mechanism, with all that implies. The problem with not
> restoring the stack pointer is that the depth of the stack becomes
> unknown at the point of the CATCH, presenting the programmer, in
> particular a maintenance programmer, with a program that it is
> impossible to reason about.

Well, yes, I understand the need to restore the stack to a known depth at
_some_ point. Indeed, even the program I posted defines and uses words
MARK and TRIM that do just that; and of course both of the cited
proposals make this argument as well.

In principle, I'd even agree that this manipulation can be bundled into
THROW and CATCH (although I think that it's not the simplest way). The
problem (that I haven't yet formulated when I wrote the quoted text) is
that they make it impossible to pass more than one stack item from before
the THROW to after the CATCH, because it makes it needlessly difficult to
build other control structures upon them. For a simpler example than the
condition system, see my Monday article[1] (or the same code on
GitHub[2]); it's uncommented, but should be easy to follow.

(I was somewhat frustrated by working around the data stack manipulation
when I wrote the original description, thus failed to make my point
clear. I still think that the workaround is the most complicated part of
the whole development, especially compared to a non-portable definition
using RP@ and friends (locals and floats omitted):

VARIABLE CATCHER
: CATCH CATCHER @ >R RP@ CATCHER ! EXECUTE R> CATCHER
FALSE ;
: THROW CATCHER @ RP! R> CATCHER ! TRUE ;
: MARK POSTPONE SP@ POSTPONE >R ; IMMEDIATE
: TRIM POSTPONE R> POSTPONE SP! ; IMMEDIATE

Then again, judging by the relative silence in response to my original
announcement, maybe this wasn't the only point I failed to make clear...)

Perhaps what we disagree about are the expectations about THROW and
CATCH: you consider them to be an exception handling mechanism, while I
want a non-local exit construct. Implementing the former in terms of the
latter is easy, but the converse might be hard. As to why wanting what I
want is reasonable... Well, I'd say it's unfair for a language
(especially Forth) to provide an exception facility, but not non-local
exits used to build it.

Finally and tangentially... Now that I think about it, no, I can't list
what, exactly, a "structured" exception handling mechanism must do to
qualify as such, only a (very) fuzzy feeling. (Ironically, while there
_is_ a technical definition of what it means for _local_ control flow to
be structured---reducibility of the flow graph[3]---Forth control flow
doesn't qualify... Though I wouldn't call it _un_structured, either.)
Again, that is not to say that I don't understand the need to restore the
data stack at _some_ point.

> Also, it is pointless to invent a CATCH mechanism that leaves stuff on
> the stack because instead of this:
>
> [: blah foo leaving? if nn throw then ;] catch if leave-actions stuff
> then
>
> you can do this:
>
> [: blah foo leaving? if leave-actions nn throw then ;] catch if stuff
> then
>
> In other words, move the code that needs the stack at the point of the
> THROW to the point of the THROW.

(I'm guessing the first version needs something akin to TRIM between
LEAVE-ACTIONS and STUFF, and a corresponding MARK at the beginning.)

Huh. Yes, I guess that works, except that this transformation is non-
local. This is why it's problematic for a condition system: its very
point compared to an exception system is how cleanly it localizes the
different parts of error handling exactly where they need to be.

A more down-to-earth argument would be to suppose LEAVE-ACTIONS needs to
either access the return stack or communicate something to STUFF. Not
insurmountable with a generous supply of VARIABLEs, but given that I've
already shown how to implement non-restoring THROW and CATCH in terms of
restoring (ANS) ones, it's not exactly a question of expressive power,
merely of naturality. And I think there are cases where the non-
restoring version is more natural.

>> This is perhaps suitable if they are to be used as an error-handling
>> construct exactly as envisioned by the authors, but completely
>> inappropriate if an arbitrary amount of data needs to be passed during
>> the transfer, as here.
>
> Kernighan: "Everyone knows that debugging is twice as hard as writing a
> program in the first place. So if you're as clever as you can be when
> you write it, how will you ever debug it?"

I... suppose you're saying that what I propose is too complicated or
difficult to reason about? It might be, but I'm not convinced so far.
Again, the issue is not exactly whether the stack is saved and restored
or not, but whether I can pass several things from the THROW to the CATCH
(and thus build other things upon them). I would be happy, for example,
with ANS-like THROW and CATCH specified by

CATCH ( i*x xt -- j*y 0 | i*z k*w k err )
THROW ( j*x k*y k err -- j*x k*y k | i*x k*y k err )

I just don't think that would be simpler that not restoring the data
stack in the first place and leaving that responsibility to the
programmer, who might or might not want to then build ANS-like exceptions
on top.

[1]: <q51cb4$hfc$3...@gioia.aioe.org>
[2]: https://github.com/alexshpilkin/studies/blob/master/escres.fth
[3]: https://cs.stackexchange.com/q/88109

--
Alex

minf...@arcor.de

unread,
Feb 28, 2019, 10:10:18 AM2/28/19
to
Am Donnerstag, 28. Februar 2019 15:55:05 UTC+1 schrieb Alexander Shpilkin:
> I would be happy, for example,
> with ANS-like THROW and CATCH specified by
>
> CATCH ( i*x xt -- j*y 0 | i*z k*w k err )
> THROW ( j*x k*y k err -- j*x k*y k | i*x k*y k err )
>
> I just don't think that would be simpler that not restoring the data
> stack in the first place and leaving that responsibility to the
> programmer, who might or might not want to then build ANS-like exceptions
> on top.


Who is bound to restrict himself slavishly to an _OPTIONAL_ ANS wordset???

a...@littlepinkcloud.invalid

unread,
Feb 28, 2019, 11:33:27 AM2/28/19
to
Taking a few items from the data stack and putting them in a block of
memory is so trivial that it's hard for me to take this seriously as a
problem. On a multi-tasking system such a parameter block (or a
pointer to it) would probably have to be in the user area. Problem
solved, ten minutes' work, nothing to see.

> Perhaps what we disagree about are the expectations about THROW and
> CATCH: you consider them to be an exception handling mechanism, while I
> want a non-local exit construct.

Not exactly, no. I would say instead that if your normal (i.e. non-
execeptional) control flow needs a nonlocal exit then you're doing
something very unwise. I have been there in Forth, done that, and
lived to regret it.

Having said all of that, I'll grant you that there are sometimes
places where it's useful to do something like this, such as, say, a
recursive-descent parser with backtracking. But that doesn't require
much more than to save the input stream state at the point of the TRY.

>>> This is perhaps suitable if they are to be used as an error-handling
>>> construct exactly as envisioned by the authors, but completely
>>> inappropriate if an arbitrary amount of data needs to be passed during
>>> the transfer, as here.
>>
>> Kernighan: "Everyone knows that debugging is twice as hard as writing a
>> program in the first place. So if you're as clever as you can be when
>> you write it, how will you ever debug it?"
>
> I... suppose you're saying that what I propose is too complicated or
> difficult to reason about?

I am. What you're proposing might be lovely when it works, but if
there are any bugs in this area then -- to speak in the vernacular --
you're up shit creek without a paddle. It's the same reason that
structured programming in general insists on properly nested control
structures: sure, it might occasionally be neat to break the rules,
but if you do it can be very hard to reason about what happens.

In other words: TRY and CATCH are restrictive, by design. Their
limitation is a feature, not a bug.

Andrew.
0 new messages