
Memoizing recursive words


Arnold Doray

Dec 3, 2011, 2:07:40 PM
Here is my attempt at a set of Forth words to memoize recursive words. It
works for functions with 1 argument and 1 value.

The key words are :: used to define a recursive word and recurse* which
is used instead of recurse.

For example, you'd define the Fibonacci series thus:

3 :: fib
dup 1 > if dup 1- recurse* swap 2 - recurse* + then ;

This sets up a memoization table with 3 slots. :: leaves the address of
the memoization table on the stack, which the programmer can use to free
memory. The memoization table is a palimpsest/circular buffer, so old
entries are wiped out.

I've included 2 other examples ( mutually recursive functions and doubly
recursive functions ) in the listing below.

The code has been tested on GForth and VFX. I've only used ANS words.

It's my first real Forth program, so I'd greatly appreciate your
criticisms.

Cheers
Arnold

\
\ Memoizes 1 parameter recursive functions with 1 return value.
\ Copyright 2011 by Arnold Doray
\ v 0.0
\ Requires the Memory Allocation and Exception wordsets
\


\ unsigned minimum. Most Forths have this, but it's not in the standard.
: umin ( u1 u2 -- u3 )
2dup u> if swap then drop ;

\ temporarily holds addr of memoization table
\ Used at compile time only
variable mt

\ checks if memory allocation is OK
: cannot-allocate? ( ior -- )
if -1 throw then ;

\ gets max number of entries in memoization table
: mt-size ( addr -- n )
@ ;

\ address of read/write top of memoization table
: rwtop ( addr -- addr' )
1 cells + ;

\ start address of memoization table body
: >mt-body ( addr -- addr' )
2 cells + ;

\ total length of memoization table in cells.
\ there are <size> key/value entries.
: mt-length ( size -- length )
1+ 2* cells ;

\ start address of next key on the memoization table
: next-key ( addr -- addr' )
2 cells + ;

\ start address of a value given the entry's address
: the-value ( addr -- addr' )
1 cells + ;

\ start address of a key given the entry's address
: the-key ( addr -- addr' ) ;

\ puts the mt's size and rwtop on the stack
: rwtop-n-size ( addr -- rwtop size )
dup rwtop @ swap mt-size ;

\ finds the limit to read to
: rtop ( addr -- rtop )
rwtop-n-size umin ;

\ write pointer to circular buffer
: wtop ( addr -- wtop )
rwtop-n-size mod ;

\ increments the top
: +rwtop ( addr -- )
rwtop 1 swap +! ;

\ start address of entry n in the memoization table
: entry ( addr n -- addr' )
2* cells swap >mt-body + ;

\ creates and initializes the memoization table
\ the memoization table format is <size><read/write top> ( <key><value> )*
: create-mt ( size -- addr )
dup mt-length allocate ( size addr ior )
cannot-allocate? \ throws an exception on failure
dup >R ! \ initialize size
0 R@ rwtop ! \ init read/write top to 0
R> ;

\ finds the value given a key. O(size) operation
: assoc ( key addr size -- value false | key true )
dup 0= if 2drop true exit then \ termination
>R \ save size
2dup @ = \ key found?
if
swap R> 2drop
the-value @ false exit
then
next-key R> 1- recurse ;

\ gets a value from a memoization table
: get-value ( key addr -- value false | key true )
dup >mt-body swap rtop assoc ;

\ sets a value to the memoization table
: set-value ( key value addr -- )
>R
R@ dup wtop entry >R
R@ the-value !
R> the-key !
R> +rwtop ;

\ Quotes words in a macro
: `
postpone postpone ; immediate

\ the addr of the memoization table found at compile time
: mt,
mt @ postpone literal ; immediate

\ macro for memoized recursion. Use this instead of RECURSE.
: recurse*
postpone mt, ` get-value
` if
` dup postpone recurse
` tuck
postpone mt, ` set-value
` then ; immediate

\ To define memoized recursive words.
\ The address of the memoization table is
\ placed on the stack, when the recursive
\ word is defined. Use this address to free the table
\ Usage: <mt size> :: <name> <definition> ;
: :: ( size -- addr )
create-mt dup mt ! : ;

\ ------------------------------------------------------- EXAMPLE USAGE

\ Example 1 : Basic recursion
\ The Fibonacci Series
\ Needs only a small fixed sized memoization table

\ Use a memoization table with 3 slots
3 :: fib
dup 1 > if dup 1- recurse* swap 2 - recurse* + then ;

\ calculate the 45th fibonacci number
45 fib cr .

\ calculate the 46th fibonacci number
46 fib cr .

\ free the memoization table
free drop

\ Example 2 : Mutually recursive functions
\ Hofstadter's Male and Female sequences
\ Probably needs an O(log(n)) sized memoization table

defer F

10 :: M
dup 0= if 1+ exit then
dup 1- recurse* F - ;

10 :: (F)
dup 0= if exit then
dup 1- recurse* M - ;

' (F) is F

200 F cr .
200 M cr .

free drop
free drop

\ Example 3 : Doubly-recursive sequences
\ Hofstadter-Conway $10,000 sequence
\ Probably needs an O(n) sized table.

30 :: a
dup 2 > if
dup 1- recurse* recurse* swap
dup 1- recurse* - recurse*
+
else
drop 1
then ;

100 a cr .

free drop
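For readers tracing the macros above, this is roughly what a single use of
recurse* compiles into the surrounding definition (a hand-expanded sketch
for illustration, not part of the tested listing):

```forth
\ Sketch: the code recurse* lays down, with <mt> standing for the
\ table address compiled as a literal by mt,
\   <mt> get-value          \ ( key -- value false | key true )
\   if                      \ not in the table: key is still on the stack
\     dup recurse           \ compute the value the slow way
\     tuck <mt> set-value   \ ( key value <mt> -- ) remember it
\   then
```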

A. K.

Dec 4, 2011, 7:21:30 AM
On 03.12.2011 20:07, Arnold Doray wrote:
> Here is my attempt at a set of Forth words to memoize recursive words. It
> works for functions with 1 argument and 1 value.
>
> The key words are :: used to define a recursive word and recurse* which
> is used instead of recurse.
>
> For example, you'd define the Fibonacci series thus:
>
> 3 :: fib
> dup 1 > if dup 1- recurse* swap 2 - recurse* + then ;
>
> This sets up a memoization table with 3 slots. :: leaves the address of
> the memoization table on the stack, which the programmer can use to free
> memory. The memoization table is a palimpsest/circular buffer, so old
> entries are wiped out.
>
> I've included 2 other examples ( mutually recursive functions and doubly
> recursive functions ) in the listing below.
>
> The code has been tested on GForth and VFX. I've only used ANS words.
>
> It's my first real Forth program, so I'd greatly appreciate your
> criticisms.
>
> Cheers
> Arnold
<snip>

Thank you for this interesting work! It made me also look up how
memoizing lookup tables are used in other languages.

I won't criticize. My remarks are tinted by my own personal coding taste
(others may object) but here they are:
- it is well documented, good!
- it seems a bit overfactorized
- the syntax does not seem 'natural' to me
- I wouldn't use allocate (in some machines it is slow) but allot the table
- the ` back-accent is awkward on some keyboards and can be confused
with ' tick

Regards
Andreas


Arnold Doray

Dec 4, 2011, 9:20:51 AM
On Sun, 04 Dec 2011 13:21:30 +0100, A. K. wrote:

> Thank you for this interesting work! It made me also look up how
> memoizing lookup tables are used in other languages.
>
>

Thank you for your comments Andreas.

Until I tried out these examples, I wasn't aware that different recursive
functions require very different table sizes in order to calculate the
function in O(n) time, making it effectively an iteration. E.g., for the
Fibonacci series you only need to remember 3 values, but for other series
you need more: O(log(n)), O(n) or beyond. This might be a good measure of
the complexity of the recursive function.

I wonder if there is any literature on this?

Incidentally, I tried to calculate the Ackermann function A(m,n) with m=4
but failed at A(4,1). Not because of memoization, but because of a return
stack overflow.

> - the syntax does not seem 'natural' to me - I wouldn't use allocate (in
> some machines it is slow) but allot the table
>

About using ALLOCATE -- I wanted to create an anonymous array, sort of an
equivalent of :NONAME but for arrays. I tried ALLOT at first, but (a) this
needs a name to attach to, as far as I can see, and (b) the space is
created in the dictionary (I believe!), so I'm not sure how to free it up
when the function is no longer used. Some memoization tables can be quite
large, so I don't want to fill up the dictionary with junk.

ALLOCATE seems to solve both these problems. I also don't have to worry
about other technicalities like alignment. Latency is an issue, but it's
only one-off during compile time. It isn't an issue when the functions
are used. Or are you saying that arrays created using ALLOCATE are slower
to use compared to those created with ALLOT?

Or is there a better way to create fast anonymous arrays in Forth?

( Yes, I could have used a named array, but this has other problems. )

> - the ` back-accent is awkward on some keyboards and can be confused
> with ' tick

Too true!

The symbol has to be unobtrusive since I'm using it to delay execution of
words in the macro. The traditional Lisp quote ' is taken up in standard
Forth. The double quote " is free, but could be confusing. ` was the best
I could come up with!

What I am after is a simple way to create a C-like macro, which is what
RECURSE* really is. Is there a better way to do this in Forth?

Thanks,
Arnold

Josh Grams

Dec 4, 2011, 11:03:17 AM
Arnold Doray wrote:
> On Sun, 04 Dec 2011 13:21:30 +0100, A. K. wrote:

> About using ALLOCATE -- I wanted to create an anonymous array, sort of an
> equivalent of :NONAME but for arrays. I tried ALLOT at first, but (a) this
> needs a name to attach to, as far as I can see. and (b) the space is
> created on the dictionary (I believe!) so I'm not sure how to free it up
> when the function is no longer used. Some memoization tables can be quite
> large, so I don't want to fill up the dictionary with junk.
>
> ALLOCATE seems to solve both these problems. I also don't have to worry
> about other technicalities like alignment. Latency is an issue, but it's
> only one-off during compile time. It isn't an issue when the functions
> are used. Or are you saying that arrays created using ALLOCATE are slower
> to use compared to those created with ALLOT?

No, it's all the same RAM. It doesn't matter whether you get the
address from ALLOT or ALLOCATE. And if you have a function which is
slow enough that it needs to be memoized, I doubt you'll notice the time
taken to ALLOCATE memory...

The problem I see with ALLOCATE is that you're hard-coding the address
into the word, but if you free the table, the word is still there, so
you have to be careful never to call it again.
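One hedged way around that dangling address (a sketch only; the names
FIB-TABLE, FREE-MT and MT@ are made up and not part of Arnold's listing) is
to compile the address of a variable rather than the table itself, so that
freeing the table can also invalidate it:

```forth
\ Sketch: indirect the table address through a variable so a freed
\ table can be detected instead of dereferenced.
variable fib-table                 \ holds the ALLOCATEd address, or 0
: free-mt ( -- )   fib-table @ free throw  0 fib-table ! ;
: mt@ ( -- addr )  fib-table @ dup 0= if -9 throw then ;  \ -9: invalid address
```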

>> - the ` back-accent is awkward on some keyboards and can be confused
>> with ' tick
>
> Too true!
>
> The symbol has to be unobtrusive since I'm using it to delay execution of
> words in the macro. The traditional Lisp quote ' is taken up in standard
> Forth. The double quote " is free, but could be confusing. ` was the best
> I could come up with!

Backtick is the usual nickname for POSTPONE. I don't know much about
keyboard layouts from other countries, but I'd be surprised to find that
it was more awkward than typing eight characters. And if you have
trouble telling ` from ' then you need to get a font which doesn't
suck.

> What I am after is a simple way to create a C-like macro, which is what
> RECURSE* really is. Is there a better way to do this in Forth?

Nope, that's pretty much the way to do it.


The other thing which jumps out at me is that I'd use the Forth200x
+FIELD (or FIELD:) or gforth's FIELD to define your structure fields.
That would make the structure stand out better and make all that
boilerplate more compact and look less excessive.

Also I would move the definition of `mt` down to the bottom, since
that's the only place where it is actually used.

And you left out a `the-key` in `assoc`.

Looks good to me. I'm sure there are people here who wouldn't bother
naming the structure fields at all and just hard-code offsets, since
it's such a small piece of code. But that's personal preference.

--Josh

A. K.

Dec 4, 2011, 11:25:47 AM
On 04.12.2011 17:03, Josh Grams wrote:
> Arnold Doray wrote:

> No, it's all the same RAM. It doesn't matter whether you get the
> address from ALLOT or ALLOCATE.

There ARE systems where it matters, believe me. ;-)

And there are even systems that do not provide the memory allocation
wordset at all.

Elizabeth D. Rather

Dec 4, 2011, 1:24:54 PM
On 12/4/11 4:20 AM, Arnold Doray wrote:
...
>
> About using ALLOCATE -- I wanted to create an anonymous array, sort of an
> equivalent of :NONAME but for arrays. I tried ALLOT at first, but (a) this
> needs a name to attach to, as far as I can see. and (b) the space is
> created on the dictionary (I believe!) so I'm not sure how to free it up
> when the function is no longer used. Some memoization tables can be quite
> large, so I don't want to fill up the dictionary with junk.
>
> ALLOCATE seems to solve both these problems. I also don't have to worry
> about other technicalities like alignment. Latency is an issue, but it's
> only one-off during compile time. It isn't an issue when the functions
> are used. Or are you saying that arrays created using ALLOCATE are slower
> to use compared to those created with ALLOT?
>
> Or is there a better way to create fast anonymous arrays in Forth?
>
> ( Yes, I could have used a named array, but this has other problems. )

You can certainly get an anonymous array by using HERE <n> ALLOT (which
leaves the address of the start of the space on the stack just as
ALLOCATE does). It is true that this space cannot be released as
ALLOCATEd space can, but in general, this isn't a problem. In
particular, since you're hard-coding the address, you appear to be
treating it as permanent.

Alignment is not significantly different between the two approaches. If
you have reason to believe your data space is unaligned (e.g. you have
just explicitly ALLOTed an odd number of AU) you can just say ALIGN and
everything will be fine.
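As a concrete sketch of that suggestion (the word name here is illustrative
only):

```forth
\ Sketch: an anonymous array carved from data space with HERE/ALLOT,
\ aligned first so the cells are safely addressable.
: anonymous-cells ( n -- addr )   \ reserve n cells, return start address
  align here swap cells allot ;

10 anonymous-cells   \ leaves the array's address on the stack, like ALLOCATE
```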

I'm also unclear as to what problems you see with named arrays.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

BruceMcF

Dec 4, 2011, 1:51:32 PM
On Dec 4, 11:03 am, Josh Grams <j...@qualdan.com> wrote:
> No, it's all the same RAM.  It doesn't matter whether you get the
> address from ALLOT or ALLOCATE.

In some cases it is NOT the same RAM.

And if you've saved a dictionary as a system it can make a substantial
difference, as ALLOT will be in the saved system and ALLOCATE will be
called either as needed or in the start up word when the system is
loaded.

BUFFER is a word that addresses that issue without the overhead of
ALLOCATE ~ it would be implemented with ALLOT in an implementation
that does not have one built in, but for an implementation with a
native BUFFER, it would be space that would not be embedded within a
saved system but reserved for use.
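For a system without a native version, the usual fallback definition is a
one-liner like this (a sketch following the Forth 200x proposal, which
reserves n address units of data space under a name):

```forth
\ Sketch: a plain-ALLOT fallback for BUFFER: on systems lacking it.
: buffer: ( n "name" -- )  create allot ;

256 buffer: scratch    \ scratch ( -- addr ) returns the region's address
```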

Arnold Doray

Dec 4, 2011, 2:14:59 PM
On Sun, 04 Dec 2011 08:24:54 -1000, Elizabeth D. Rather wrote:

>
> You can certainly get an anonymous array by using HERE <n> ALLOT (which
> leaves the address of the start of the space on the stack just as

Thanks! That was very useful.

> ALLOCATE does). It is true that this space cannot be released as
> ALLOCATEd space can, but in general, this isn't a problem. In
> particular, since you're hard-coding the address, you appear to be
> treating it as permanent.
>

With ALLOCATE, you give the programmer the *option* of freeing up the
space, while with ALLOT the programmer has no choice but to live with
junk in the dictionary. The memoization tables could be large, especially
for the more interesting recursive functions.

>
> I'm also unclear as to what problems you see with named arrays.
>

Besides the problem of junk in the dictionary that can't be freed up, the
problem is creating the name itself. The options are

(a) asking the programmer to do it, eg:

100 memoize fib-mt
:: fib .... ;

Which creates an unnecessary name, fib-mt, besides being more verbose
compared to:

100 :: fib ... ;

The latter is a more elegant solution, IMHO: you don't need to name the
structure; all you need is the address.

(b) Putting in some plumbing to create a named array automatically when
the function is defined. This is quite beyond me at this stage. I would
essentially need to re-write my own version of :

Cheers,
Arnold

BruceMcF

Dec 4, 2011, 3:09:36 PM
On Dec 4, 2:14 pm, Arnold Doray <thinksqua...@gmail.com> wrote:
> (b) Putting in some plumbing to create a named array automatically when
> the function is defined. This is quite beyond me at this stage. I would
> essentially need to re-write my own version of :

If it was useful to name the array ~ say, for debugging ~ you could
save that spot in the parsing, create the named array in a side
wordlist, then come back and define the word with the original ":".

WORDLIST CONSTANT Memoize-Arrays

: :: ( ... create array ... -- addr size )
>IN @ GET-CURRENT ( addr size #in current-wid )
Memoize-Arrays SET-CURRENT 2SWAP 2DUP 2CONSTANT
2>R SET-CURRENT >IN ! : 2>R ( r: addr size ) ;

But that could also be added later, after first going with an
anonymous array embedded in the definition of the word to be defined.

: :: ( ... create array .. -- addr size ) 2>R : 2R> ;

Elizabeth D. Rather

Dec 4, 2011, 5:50:28 PM
On 12/4/11 8:51 AM, BruceMcF wrote:
> On Dec 4, 11:03 am, Josh Grams<j...@qualdan.com> wrote:
>> No, it's all the same RAM. It doesn't matter whether you get the
>> address from ALLOT or ALLOCATE.
>
> In some cases it is NOT the same RAM.

It certainly cannot be "the same" in the sense of ALLOCATEd space being
intermingled with ALLOTted space, since the whole point of ALLOCATE is
to be able to FREE space when it's no longer needed, and there is no
provision for freeing or rearranging ALLOTted space. Whether there are
two separately managed regions of data space in the same bank of RAM is
entirely an implementation issue, though.

> And if you've saved a dictionary as a system it can make a substantial
> difference, as ALLOT will be in the saved system and ALLOCATE will be
> called either as needed or in the start up word when the system is
> loaded.
>
> BUFFER is a word that addresses that issue without the overhead of
> ALLOCATE ~ it would be implemented with ALLOT in an implementation
> that does not have one built in, but for an implementation with a
> native BUFFER, it would be space that would not be embedded within a
> saved system but reserved for use.

Do you mean BUFFER: ?

BruceMcF

Dec 4, 2011, 6:40:24 PM
On Dec 4, 5:50 pm, "Elizabeth D. Rather" <erat...@forth.com> wrote:
> > BUFFER is a word that ...

> Do you mean BUFFER: ?

Yes, that's the one, http://www.forth200x.org/buffer.html

Arnold Doray

Dec 4, 2011, 11:00:48 PM
On Sun, 04 Dec 2011 12:09:36 -0800, BruceMcF wrote:

> On Dec 4, 2:14 pm, Arnold Doray <thinksqua...@gmail.com> wrote:
>> (b) Putting in some plumbing to create a named array automatically when
>> the function is defined. This is quite beyond me at this stage. I would
>> essentially need to re-write my own version of :
>
> If it was useful to name the array ~ say, for debugging ~ you could save
> that spot in the parsing, create the named array in a side wordlist,
> then come back and define the word with the original ":".
>
> WORDLIST CONSTANT Memoize-Arrays
>
> : :: ( ... create array ... -- addr size )
> >IN @ GET-CURRENT ( addr size #in current-wid )
> Memoize-Arrays SET-CURRENT 2SWAP 2DUP 2CONSTANT 2>R SET-CURRENT >IN !
> : 2>R ( r: addr size ) ;

Thanks Bruce. This is really useful to know.

Cheers,
Arnold

Arnold Doray

Dec 5, 2011, 6:17:12 AM
On Sun, 04 Dec 2011 16:03:17 +0000, Josh Grams wrote:


> The problem I see with ALLOCATE is that you're hard-coding the address
> into the word, but if you free the table, the word is still there, so
> you have to be careful never to call it again.

Yes, this bothers me too. In fact there is a related problem: My solution
only memoizes when recurse* is called. So, a call to fib(45) doesn't
memoize that value, only arguments less than 45. I realized early on that
both issues can be solved this way:

: (F) ... ; \ does the actual function calculation

: F
<mt-addr> mt-exists?
dup <mt-addr> get-value
if
(F)
then
save-value-to-mt ;

This would solve the mt check efficiently and memoize the last function
call. As a bonus, you'd then memoize *any* function, not just recursive
ones.

Unfortunately, I can't figure out a way to do it. My attempt was:

: ::
create-mt dup mt ! \ run once when fn is defined ,
<somehow create the name> \ read fn name and create dict entry
postpone mt, ` mt-exists? \ mt-exists? called at fn's runtime
` dup postpone mt, ` get-value
` if
:noname ...
Here's where I run into trouble.
How to terminate :noname's parsing action at ::'s
runtime so that the rest of :: is executed?

compile, \ compile the xt of the :noname into fn
` then
` save-value-to-mt ;

Any pointers?

Thanks,
Arnold

Andrew Haley

Dec 5, 2011, 12:04:56 PM
Arnold Doray <thinks...@gmail.com> wrote:
> Here is my attempt at a set of Forth words to memoize recursive words. It
> works for functions with 1 argument and 1 value.
>
> The key words are :: used to define a recursive word and recurse* which
> is used instead of recurse.
>
> For example, you'd define the Fibonacci series thus:
>
> 3 :: fib
> dup 1 > if dup 1- recurse* swap 2 - recurse* + then ;
>
> This sets up a memoization table with 3 slots. :: leaves the address of
> the memoization table on the stack, which the programmer can use to free
> memory. The memoization table is a palimpsest/circular buffer, so old
> entries are wiped out.
>
> I've included 2 other examples ( mutually recursive functions and doubly
> recursive functions ) in the listing below.
>
> The code has been tested on GForth and VFX. I've only used ANS words.
>
> It's my first real Forth program, so I'd greatly appreciate your
> criticisms.

OK. Firstly, the good things. It looks pretty well-engineered, it's
heavily commented, and as far as I can see it looks like it's probably
correct. On the downside, it's big; it needs some heavy filleting.
This is a fairly common characteristic of beginners' forth programs.

There are some stylistic nits I'll point out inline.

> \ checks if memory allocation is OK
> : cannot-allocate? ( ior -- )
> if -1 throw then ;

You don't really need this: the standard idiom is ALLOCATE THROW .
>
> \ gets max number of entries in memoization table
> : mt-size ( addr -- n )
> @ ;
>
> \ address of read/write top of memoization table
> : rwtop ( addr -- addr' )
> 1 cells + ;
>
> \ start address of memoization table body
> : >mt-body ( addr -- addr' )
> 2 cells + ;

It would make sense to use the Forth 200x structures extension for
these fields.

> \ increments the top
> : +rwtop ( addr -- )
> rwtop 1 swap +! ;

This isn't really a good factor. If you want something to bump a
count, it's better to do something like:

: ++ ( addr -- )
1 swap +! ;

I'm not recommending ++ as a name. Perhaps BUMP or TALLY would be
better. But it's even easier to read as

1 over rwtop +!

> \ creates and initialies the memoization table
> \ the memoization table format is <size><read/write top> ( <key><value> )*
> : create-mt ( size -- addr )
> dup mt-length allocate ( size addr ior )
> cannot-allocate? \ throws an exception on failure
> dup >R ! \ initialize size
> 0 R@ rwtop ! \ init read/write top to 0
> R> ;

The use of ALLOCATE versus ALLOT has been addressed at some length
already.

>
> \ finds the value given a key. O(size) operation
> : assoc ( key addr size -- value false | key true )
> dup 0= if 2drop true exit then \ termination
> >R \ save size
> 2dup @ = \ key found?
> if
> swap R> 2drop
> the-value @ false exit
> then
> next-key R> 1- recurse ;

RECURSE is very non-idiomatic here: this is plain iteration.

> \ macro for memoized recursion. Use this instead of RECURSE.
> : recurse*
> postpone mt, ` get-value
> ` if
> ` dup postpone recurse
> ` tuck
> postpone mt, ` set-value
> ` then ; immediate
>
> \ To define memoized recursive words.
> \ The address of the memoization table is
> \ placed on the stack, when the recursive
> \ word is defined. Use this address to free the table
> \ Usage: <mt size> :: <name> <definition> ;
> : :: ( size -- addr )
> create-mt dup mt ! : ;

The use of IMMEDIATE and defining a special version of RECURSE is a
rather complex way of solving the problem. Also, as you've already
noticed, the definition itself is not memoized, only its internal
recursion is.

Andrew.

Arnold Doray

Dec 6, 2011, 2:24:00 AM
On Mon, 05 Dec 2011 11:04:56 -0600, Andrew Haley wrote:

Many thanks for the comments. I found them very helpful.

I have a couple of questions:

>> \ finds the value given a key. O(size) operation
>> : assoc ( key addr size -- value false | key true )
>> dup 0= if 2drop true exit then \ termination
>> >R \ save size
>> 2dup @ = \ key found?
>> if
>> swap R> 2drop
>> the-value @ false exit
>> then
>> next-key R> 1- recurse ;
>
> RECURSE is very non-idiomatic here: this is plain iteration.
>

Hmm. Iteration was my first choice, but it is more verbose:

: assoc ( key addr size -- value false | key true )
dup 0= if 2drop true exit then \ termination
0 do
2dup


loop
2drop

Arnold Doray

Dec 6, 2011, 3:03:03 AM
On Mon, 05 Dec 2011 11:04:56 -0600, Andrew Haley wrote:

Whoops. Here is my completed post:

>> \ finds the value given a key. O(size) operation
>> : assoc ( key addr size -- value false | key true )
>> dup 0= if 2drop true exit then \ termination
>> >R \ save size
>> 2dup @ = \ key found?
>> if
>> swap R> 2drop
>> the-value @ false exit
>> then
>> next-key R> 1- recurse ;
>
> RECURSE is very non-idiomatic here: this is plain iteration.
>

The iterative version is:

: next-key ( addr index -- addr' )
2* cells + ;

: assoc ( key addr size -- value false | key true )
dup 0= if 2drop true exit then
0 do
2dup i next-key = if
swap drop
the-value @ false unloop exit
then
loop
2drop true ;

I preferred the recursive definition because it is slightly less verbose
and has only one exit point.

It is also slightly more efficient since NEXT-KEY in the iterative
version requires 3 operations ( 2* cells + ) while the recursive version
only requires one operation ( 2 cells + ) -- or two, depending on the
compiler.

I notice that SwiftForth nicely does a JMP for this tail-recursive ASSOC,
while VFX, ( which seems otherwise to be an excellent compiler, IMHO )
uses a CALL.

>> \ macro for memoized recursion. Use this instead of RECURSE.
>> : recurse*
>> postpone mt, ` get-value
>> ` if
>> ` dup postpone recurse
>> ` tuck
>> postpone mt, ` set-value
>> ` then ; immediate
>>
>> \ To define memoized recursive words.
>> \ The address of the memoization table is
>> \ placed on the stack, when the recursive
>> \ word is defined. Use this address to free the table
>> \ Usage: <mt size> :: <name> <definition> ;
>> : :: ( size -- addr )
>> create-mt dup mt ! : ;
>
> The use of IMMEDIATE and defining a special version of RECURSE is a
> rather complex way of solving the problem. Also, as you've already
> noticed, the definition itself is not memoized, only its internal
> recursion is.
>

Good point. I thought long and hard about this, and every solution I
could come up with that's simple for the user involves retrofitting
RECURSE.

What I would have preferred to do is something like:

: ::
<initialize mt>
<define function>
mt-exists? \ throws exception if not
get-value if
<delegate to actual function>
<save latest value>
then ;

Where the "delegated" function does the actual work. Probably defined
with :NONAME. But it would have to itself call the outer function, so
that involves a retrofitting or re-writing of RECURSE too.

Or do you have a simpler solution in mind?

In any case, as I explained in another post, I don't have the Forth chops
to execute this idea.

The basic problem is getting the last defined XT when :: executes. Gforth
has this really neat word -- LATESTXT but it is not in the '94 spec.

And unfortunately, the various implementations I've tested (except for VFX
and Swift) don't put the :NONAME's XT on the stack until the definition
actually exits:

variable latestxt
: def :noname dup latestxt ! ;

def ." hello world" ;
latestxt @ execute

when run with VFX or Swift, this works as expected. But not with Gforth
or BigForth. These former two put "turds" onto the stack at DEF's
runtime.

Is there a portable way to know the last defined XT? I can imagine all
sorts of uses for something like LATESTXT -- adding automatic finalizers
to words, retrofitting recursion and slightly better mutual recursion,
just off the top of my head.

Cheers,
Arnold

Andrew Haley

Dec 6, 2011, 5:02:38 AM
Arnold Doray <thinks...@gmail.com> wrote:
> On Mon, 05 Dec 2011 11:04:56 -0600, Andrew Haley wrote:
>
> Many thanks for the comments. I found them very helpful.
>
> I have a couple of questions:
>
>>> \ finds the value given a key. O(size) operation
>>> : assoc ( key addr size -- value false | key true )
>>> dup 0= if 2drop true exit then \ termination
>>> >R \ save size
>>> 2dup @ = \ key found?
>>> if
>>> swap R> 2drop
>>> the-value @ false exit
>>> then
>>> next-key R> 1- recurse ;
>>
>> RECURSE is very non-idiomatic here: this is plain iteration.
>
> Hmm. Iteration was my first choice, but it is more verbose:

: foo ... recurse ;

is equivalent to

: foo begin ... again ;
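
Applied to the ASSOC above, that transformation is mechanical (an untested
sketch, same stack effect and same words as the recursive version):

```forth
\ Sketch: ASSOC with the tail call turned into BEGIN ... AGAIN.
: assoc ( key addr size -- value false | key true )
  begin
    dup 0= if 2drop true exit then   \ table exhausted
    >r 2dup @ =                      \ key found?
    if swap r> 2drop  the-value @ false exit then
    next-key r> 1-                   \ step to the next entry
  again ;
```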

Andrew.

Stephen Pelc

Dec 6, 2011, 5:46:29 AM
On Tue, 6 Dec 2011 08:03:03 +0000 (UTC), Arnold Doray
<thinks...@gmail.com> wrote:

>I notice that SwiftForth nicely does a JMP for this tail-recursive ASSOC,
>while VFX, ( which seems otherwise to be an excellent compiler, IMHO )
>uses a CALL.

The performance gain for the tail recursive CALL/JMP change is often
marginal, while the loss of reliability (it's a hidden return stack
transformation) can be significant. The discussions at MPE on this
issue were long and heated. Overall, we went for reliability. Show
me a measurement where the impact is significant, and I'll rethink
my attitude.

I haven't measured it on current desktop CPUs, but on some
architectures, JMP and CALL take the same space and execution time.

Stephen

--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads

Andrew Haley

Dec 6, 2011, 5:54:34 AM
Stephen Pelc <steph...@mpeforth.com> wrote:
> On Tue, 6 Dec 2011 08:03:03 +0000 (UTC), Arnold Doray
> <thinks...@gmail.com> wrote:
>
>>I notice that SwiftForth nicely does a JMP for this tail-recursive ASSOC,
>>while VFX, ( which seems otherwise to be an excellent compiler, IMHO )
>>uses a CALL.
>
> The performance gain for the tail recursive CALL/JMP change is often
> marginal, while the loss of reliability (it's a hidden return stack
> transformation) can be significant. The discussions at MPE on this
> issue were long and heated. Overall, we went for reliability.

That makes some sense.

> I haven't measured it on current desktop CPUs, but on some
> architectures, JMP and CALL take the same space and execution time.

Mmm, but the difference is between CALL ; RET and just JMP .
It's not the difference between a simple CALL and a JMP .

Andrew.

Arnold Doray

Dec 6, 2011, 7:36:50 AM
On Tue, 06 Dec 2011 10:46:29 +0000, Stephen Pelc wrote:

> On Tue, 6 Dec 2011 08:03:03 +0000 (UTC), Arnold Doray
> <thinks...@gmail.com> wrote:
>
>>I notice that SwiftForth nicely does a JMP for this tail-recursive
>>ASSOC, while VFX, ( which seems otherwise to be an excellent compiler,
>>IMHO ) uses a CALL.
>
> The performance gain for the tail recursive CALL/JMP change is often
> marginal, while the loss of reliability (it's a hidden return stack
> transformation) can be significant. The discussions at MPE on this issue
> were long and heated. Overall, we went for reliability. Show me a
> measurement where the impact is significant, and I'll rethink my
> attitude.
>
> I haven't measured it on current desktop CPUs, but on some
> architectures, JMP and CALL take the same space and execution time.
>
> Stephen

From the high quality of code I've seen VFX produce I was sure MPE had a
very good reason. But I don't understand what you mean by a "hidden
return stack transformation". Could you elaborate?

As I understand it, the use of CALL vs JMP in this context has less to do
with performance than with converting tail-recursive functions into an
iteration. Since CALL pushes the return address onto the stack before
making the jump, it can cause a stack overflow for deep recursions.
This is why most "C" programmers shy away from recursion,
even tail-recursive ones. But reliance on tail-call optimization is very
common in languages like Scheme / Lisp, and some problems are nicely
expressed this way. Like ASSOC.
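For concreteness, here is a hypothetical tail-recursive ASSOC in the
spirit of the one discussed, assuming a linked list of three-cell nodes
laid out as [ next | key | value ] (the layout and names are
illustrative, not the code from earlier in the thread):

```forth
\ Sketch only: each node is [ next | key | value ], list ends with 0.
: assoc ( key node -- value-addr|0 )
  dup 0= if nip exit then            \ end of list: return 0
  2dup cell+ @ = if                  \ does this node's key match?
    nip 2 cells + exit               \ yes: return address of value cell
  then
  @ recurse ;                        \ no: tail call on the next node
```

The final RECURSE is in tail position, which is exactly the case where
SwiftForth emits a JMP.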

But I'm probably being pedantic. Because Forth has so little by way of
syntax, I could do it here too if I used BEGIN ... AGAIN instead of
RECURSE as Andrew suggests.

To clarify, if I replace the terminal RECURSE with a BEGIN ... AGAIN loop,
then VFX uses a JMP instead of a CALL:

: assoc
begin
< some code >
again ;

--> VFX produces a JMP at the end of the word.

While:

: assoc
< some code>
recurse ;

--> VFX produces a CALL at the end of the word.

Can't the phrase "RECURSE ;" be replaced by just a JMP as you have done
in the BEGIN -- AGAIN construct?

Cheers,
Arnold


Andrew Haley, Dec 6, 2011, 11:28:13 AM:

Arnold Doray <thinks...@gmail.com> wrote:
>
> Can't the phrase "RECURSE ;" be replaced by just a JMP as you have done
> in the BEGIN -- AGAIN construct?

Not always, because some silly bleepers manipulate the return stack to
do control flow manipulation. Doing this isn't standard, but that
isn't enough to stop everyone.

Andrew.

Arnold Doray, Dec 6, 2011, 11:32:40 AM:

Couldn't the same error occur within a BEGIN ... AGAIN loop? What
makes RECURSE special in this respect?

Arnold

Andrew Haley, Dec 6, 2011, 11:40:29 AM:

Arnold Doray <thinks...@gmail.com> wrote:
> On Tue, 06 Dec 2011 10:28:13 -0600, Andrew Haley wrote:
>
>> Arnold Doray <thinks...@gmail.com> wrote:
>>>
>>> Can't the phrase "RECURSE ;" be replaced by just a JMP as you have done
>>> in the BEGIN -- AGAIN construct?
>>
>> Not always, because some silly bleepers manipulate the return stack to
>> do control flow manipulation. Doing this isn't standard, but that isn't
>> enough to stop everyone.
>
> Couldn't the same error occur within a BEGIN ... AGAIN loop? What
> makes RECURSE special in this respect?

It pushes an address onto the return stack and jumps; AGAIN just
jumps.

Andrew.

Arnold Doray, Dec 6, 2011, 11:51:05 AM:

On Tue, 06 Dec 2011 10:40:29 -0600, Andrew Haley wrote:

> Arnold Doray <thinks...@gmail.com> wrote:
>> On Tue, 06 Dec 2011 10:28:13 -0600, Andrew Haley wrote:
>>
>>> Arnold Doray <thinks...@gmail.com> wrote:
>>>>
>>>> Can't the phrase "RECURSE ;" be replaced by just a JMP as you have
>>>> done in the BEGIN -- AGAIN construct?
>>>
>>> Not always, because some silly bleepers manipulate the return stack to
>>> do control flow manipulation. Doing this isn't standard, but that
>>> isn't enough to stop everyone.
>>
>> Couldn't the same error occur within a BEGIN ... AGAIN loop? What
>> makes RECURSE special in this respect?
>
> It pushes an address onto the return stack and jumps; AGAIN just jumps.
>
> Andrew.

I'm not saying that RECURSE in general should be a jump.

But RECURSE ; ( ie, a RECURSE that is the *last* word in a definition )
can be compiled to a JMP. It is the same as AGAIN. AGAIN JMPs to BEGIN,
while RECURSE ; (note, recurse in terminal position) should JMP to the
start of the function.
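The hand transformation described above looks like this (a minimal
sketch; COUNTDOWN is an illustrative word, not one from the thread):

```forth
\ Terminal RECURSE: a compiler may emit CALL or JMP for the recursion.
: countdown-r ( n -- )
  dup 0= if drop exit then
  dup . 1- recurse ;

\ The same word hand-converted to a loop: AGAIN always compiles a JMP.
: countdown-l ( n -- )
  begin
    dup 0= if drop exit then
    dup . 1-
  again ;
```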

Arnold




Stephen Pelc, Dec 6, 2011, 12:18:32 PM:

On Tue, 6 Dec 2011 12:36:50 +0000 (UTC), Arnold Doray
<thinks...@gmail.com> wrote:

>From the high quality of code I've seen VFX produce I was sure MPE had a
>very good reason. But I don't understand what you mean by a "hidden
>return stack transformation". Could you elaborate?

There are two cases

: aa ... recurse ;
CALL AA becomes JMP AA
: aa ... bbb ;
CALL BBB becomes JMP BBB

On the first entry to AA the return stack contains the address just
after the call to AA. On the second entry to AA the return stack contains the
address just after the first call to AA. If AA itself performs some
(non-standard but classic) return stack manipulations, these will fail
in the second case.

When you see a word with an active return stack, it can be difficult
to predict the number of levels over which the return stack is active.
Every time that I have attempted to codify rules for this, I have
been defeated by the ingenuity of Forth programmers.

>Can't the phrase "RECURSE ;" be replaced by just a JMP as you have done
>in the BEGIN -- AGAIN construct?

Yes, it can. But for the reasons above, it requires a lot of analysis
at compile time to do this safely. I could do this safely for most
usage, but Bernd or Michael could probably ruin my construct
embarrassingly quickly.

Until someone provides a numerical example of why tail call
elimination is a really good optimisation, VFX will remain
conservative for return stack optimisations. Being an
unquantified "neat hack" isn't convincing.

Stephen

P.S. The primary duty of an exception handler is to get the error out
of the lap of the programmer and into the surprised face of the user.
Provided you keep this cardinal rule in mind, you can't go far wrong.
(Verity Stob)

Andrew Haley, Dec 6, 2011, 1:03:56 PM:

Arnold Doray <thinks...@gmail.com> wrote:
> On Tue, 06 Dec 2011 10:40:29 -0600, Andrew Haley wrote:
>
>> Arnold Doray <thinks...@gmail.com> wrote:
>>> On Tue, 06 Dec 2011 10:28:13 -0600, Andrew Haley wrote:
>>>
>>>> Arnold Doray <thinks...@gmail.com> wrote:
>>>>>
>>>>> Can't the phrase "RECURSE ;" be replaced by just a JMP as you have
>>>>> done in the BEGIN -- AGAIN construct?
>>>>
>>>> Not always, because some silly bleepers manipulate the return stack to
>>>> do control flow manipulation. Doing this isn't standard, but that
>>>> isn't enough to stop everyone.
>>>
>>> Couldn't the same error occur within a BEGIN ... AGAIN loop? What
>>> makes RECURSE special in this respect?
>>
>> It pushes an address onto the return stack and jumps; AGAIN just jumps.
>
> I'm not saying that RECURSE in general should be a jump.
>
> But RECURSE ; ( ie, a RECURSE that is the *last* word in a definition )
> can be compiled to a JMP.

No, it can't. Not in the presence of R-stack manipulation.

> It is the same as AGAIN.

No, it isn't. Think about what happens if the word contains R> .
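To make the hazard concrete, a sketch assuming a system where return
addresses sit directly on the return stack (the R> DROP is deliberately
non-standard):

```forth
\ At the base case this drops one return address before exiting.
: countdown ( n -- )
  dup 0= if drop r> drop exit then
  dup . 1- recurse ;
```

If the terminal RECURSE compiles to a CALL, the R> DROP discards the
innermost recursive return address and the word merely unwinds one
level early. If it compiles to a JMP, there are no recursive frames:
the R> DROP discards the original caller's return address, and EXIT
returns to the caller's caller.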

Andrew.

Bernd Paysan, Dec 6, 2011, 7:06:57 PM:

Andrew Haley wrote:
> Not always, because some silly bleepers manipulate the return stack to
> do control flow manipulation. Doing this isn't standard, but that
> isn't enough to stop everyone.

No, because we are not in the C camp, where compilers deliberately
break programs that are "not standard", even though they work almost
everywhere (which makes them "common practice") ;-).

The return stack in Forth is visible. It's the return stack. It's not
some arbitrary stack designed to keep temporary values. The main
reason to make return stack control flow manipulations non-standard is
that there are a few weird systems like F-PC, which use two return
stack elements instead of one to store the previous IP. This is wasted
time. It's just as wasted as trying to write the standard so that
Chuck Moore's Forth-of-the-year could be standard. Chuck doesn't want
to be standard, and next year his Forth will be different, anyways.

I've implemented tail call optimization in my b16 assembler. The
approach there is a pragmatic one: If you want to have the return stack
manipulation feature, you add a nop between the call and ;, which prevents
the optimization from triggering. Since ; is a last-slot instruction on the
b16, this nop would have been inserted anyways. The reason I added
this optimization is not that it is that much faster, but that it saves
return stack space, which is quite precious on the b16. I wouldn't do
it for that reason in bigForth.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

Andrew Haley, Dec 7, 2011, 4:37:49 AM:

Bernd Paysan <bernd....@gmx.de> wrote:
> Andrew Haley wrote:
>> Not always, because some silly bleepers manipulate the return stack to
>> do control flow manipulation. Doing this isn't standard, but that
>> isn't enough to stop everyone.
>
> No, because we are not in the C camp, where compilers deliberately
> break programs that are "not standard", even though they work almost
> everywhere (which makes them "common practice") ;-).

Speak for yourself! :-)

> The return stack in Forth is visible. It's the return stack. It's
> not some arbitrary stack designed to keep temporary values.

Maybe, maybe not. On some systems that's exactly what it is.

> The main reason to make return stack control flow manipulations
> non-standard is that there are a few weird systems like F-PC, which
> use two return stack elements instead of one to store the previous
> IP.

There are other possible variations, such as systems with a call stack
that can't be used for data. But as I've said before, allowing
R-stack manipulation effectively defeats useful optimizations. For
example, inlining is hard. These optimizations are far more useful
than R-stack manipulation, which serves only to produce obscure
write-only code. There are no upsides to R-stack manipulation, only
downsides.

Andrew.

Stephen Pelc, Dec 7, 2011, 5:36:25 AM:

On Wed, 07 Dec 2011 03:37:49 -0600, Andrew Haley
<andr...@littlepinkcloud.invalid> wrote:

>But as I've said before, allowing
>R-stack manipulation effectively defeats useful optimizations.

I'm fairly sure that all return stack flow control words can be
detected at compile time. I cannot convince myself that the number
of return stack levels that they affect is entirely predictable.

>For example, inlining is hard. These optimizations are far more useful
>than R-stack manipulation,

So far, true.

>which serves only to produce obscure write-only code. There are no
>upsides to R-stack manipulation, only downsides.

That sounds like dogma and hand-waving.

There are a very few occasions on which return stack flow control
words are useful notational conveniences. For example, the MPE
prefix assemblers depend on one such. It's done that way so that
the user can choose between prefix and postfix notation.

Andrew Haley, Dec 7, 2011, 7:23:52 AM:

Stephen Pelc <steph...@mpeforth.com> wrote:
> On Wed, 07 Dec 2011 03:37:49 -0600, Andrew Haley
> <andr...@littlepinkcloud.invalid> wrote:
>
>>But as I've said before, allowing
>>R-stack manipulation effectively defeats useful optimizations.
>
> I'm fairly sure that all return stack flow control words can be
> detected at compile time.

I'm sure they can't unless you're conservative and mark all "suspect"
uses of the R-stack as preventing inlining.

>>For example, inlining is hard. These optimizations are far more useful
>>than R-stack manipulation,
>
> So far, true.
>
>>which serves only to produce obscure write-only code. There are no
>>upsides to R-stack manipulation, only downsides.
>
> That sounds like dogma and hand-waving.

Even if it does, it's still true AFAIK. Perhaps I should have said
that in all the code I've looked at that uses R-stack manipulation I
have never seen a single use that wouldn't be improved by taking it
out.

But you're right in that one can never prove that X does not exist in
this universe. Maybe there is a use of R-stack manipulation that
makes code clearer and less obscure. (I exclude here real Forth
hardware that necessarily uses R-stack manipulation to do control
flow.)

> There are a very few occasions on which return stack flow control
> words are useful notational conveniences. For example, the MPE
> prefix assemblers depend on one such. It's done that way so that
> the user can choose between prefix and postfix notation.

Hmmm. I've a pretty good idea how that trick works, and I don't think
it needs R-stack manipulation. As usual, I'm prepared to be
convinced.

Andrew.

Stephen Pelc, Dec 7, 2011, 8:50:30 AM:

Prove it.

So here's a simple and not entirely portable implementation
that works on most desktop class CPUs and classical Forths.
The original code was presented by Bob Smith at a FORML conference.
Your version should preserve the prefix/postfix switch.

2variable aprior \ holds previous instruction and data
0 0 aprior 2!

variable <prefix> \ prefix/postfix mode
<prefix> on

: prefix? \ -- t/f ; true if in prefix mode
<prefix> @
;

: ?prefix \ n -- n' ; executes previous opcode
prefix? if
r>
aprior 2@ 2swap aprior 2!
>r
endif
;

The assembler uses this, often as below, to permit prefix or
postfix operation.

: 2op-r/m \ op1 op2 reg -- ; reg/mem16 --
create
, , ,
does>
?prefix \ do the last one
... \ lay the code
reset \ ready for next opcode
;

$0f $00 $02 2op-r/m LLDT
...

There is a derivative with fewer side effects, but it's harder
to understand. What is nice about ?PREFIX is that you just stick
it into a Forth assembler in the right places and the assembler
becomes switchable between prefix and postfix.

Anton Ertl, Dec 7, 2011, 8:29:22 AM:

steph...@mpeforth.com (Stephen Pelc) writes:
>On Wed, 07 Dec 2011 03:37:49 -0600, Andrew Haley
><andr...@littlepinkcloud.invalid> wrote:
>
>>But as I've said before, allowing
>>R-stack manipulation effectively defeats useful optimizations.
>
>I'm fairly sure that all return stack flow control words can be
>detected at compile time.

Yes.

However, the more Forth-style way IMO would be to notify the compiler
of the return-address manipulation rather than have the compiler
perform data-flow analysis.

>I cannot convince myself that the number
>of return stack levels that they affect is entirely predictable.

In the usual (among return-address manipulating programs) cases it is,
but of course there could be things like:

( n ) begin
dup while
r> drop 1- repeat

But these cases are so rare that it would not be a problem to disable
inlining and tail-call optimization for all callers (even through
multiple levels of direct calls) of such words.

The main problem is EXECUTE etc. (e.g. DEFERred words); in most cases
you cannot exclude the possibility of an EXECUTE executing an
arbitrary word, including a word that performs some return address
manipulation. So the compiler would have to disable inlining and
tail-call optimization for all words that contain EXECUTE or call
DEFERred words (and words that call such words etc.). That would
disable these optimizations for a pretty large part of the code.
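A sketch of why (the names are illustrative; NASTY is deliberately
non-standard):

```forth
defer action
: harmless ( -- ) ;
: nasty ( -- ) r> drop ;   \ non-standard return-stack manipulation
: outer ( -- ) action ;    \ tail-call site: may ACTION become a JMP?

' harmless is action       \ a JMP in OUTER would be fine here
' nasty is action          \ here it would discard the wrong return address
```

With a plain CALL in OUTER, NASTY's R> DROP removes the return address
pointing into OUTER; with a tail-call JMP it removes the return address
of OUTER's caller instead.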

>>For example, inlining is hard. These optimizations are far more useful
>>than R-stack manipulation,
>
>So far, true.

That's in the eye of the beholder.

What I find more problematic about return-address manipulation is that
it is not compatible with other (standard) features, in particular,
locals. That's why it cannot be properly added as a feature to the
standard.

There is also the issue of processors that put the return addresses in
a place that is not accessible as data, but I doubt that Forth
implementations for these CPUs are fully compliant anyway, so if we
standardized return-address manipulation, that would be just one
standard feature among several that such a system does not implement.

Another implementation issue is that compilation-through-C becomes
harder and produces much slower code if it has to support
return-address manipulation. However, AFAIK no such system is
currently maintained (not sure what became of Rob Chapman's Timbre
system), so maybe we should not worry about that too much.

The uses of return-address manipulation I have seen from Bernd can be
replaced without too much effort with quotations (nested variants of
:NONAME words) and words that take xts. Some of the things Michael
Gassanenko has done may be harder to replace (not sure).

Overall, I think that we should add quotations and use them to replace
return-address manipulating code. Then the compiler can happily
perform inlining and tail-call optimization.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2011: http://www.euroforth.org/ef11/

Arnold Doray, Dec 7, 2011, 9:30:00 AM:

On Tue, 06 Dec 2011 17:18:32 +0000, Stephen Pelc wrote:

> There are two cases
>
> : aa ... recurse ;
> CALL AA becomes JMP AA
> : aa ... bbb ;
> CALL BB becomes JMP BB
>
> On the first entry to AA the return stack contains the address just
> after the AA. On the second entry to AA the return stack contains the
> address just after the first call to AA. If AA itself performs some
> (non-standard but classic) return stack manipulations, these will fail
> in the second case.
>

The penny just dropped -- "hidden return stack transformation" is your
polite term for "messing up the return stack". :)

I tried this example:

: hijack ." hijacked" ;
: next-action ." next-action" ;

: test
if
R> drop ['] hijack >R
then
dup 0= if drop exit then
dup cr .
1- false recurse ;

10 true test next-action

It works for VFX because it uses CALL for the terminal RECURSE while
Swift hits an error after TEST terminates. Yes, both approaches are on
spec, but I agree with your assessment -- VFX's behaviour is much more
desirable because you can use this technique to implement "finalizers" --
mandatory cleanup actions after a given word like TEST. Eg, freeing
memory, closing database or socket connections, etc.

By manipulating the return stack you can create new : that always ends
with a predefined set of actions during the defined word's runtime. Which
is very useful. It's a pity this technique isn't portable.

>>Can't the phrase "RECURSE ;" be replaced by just a JMP as you have done
>>in the BEGIN -- AGAIN construct?
>
> Yes, it can. But for the reasons above, it requires a lot of analysis at
> compile time to do this safely. I could do this safely for most usage,
> but Bernd or Michael could probably ruin my construct embarrassingly
> quickly.
>
> Until someone provides a numerical example of why tail call elimination
> is a really good optimisation, VFX will remain conservative for return
> stack optimisations. Being an unquantified "neat hack" isn't convincing.

I agree.

The way to guarantee tail-call optimization is to use BEGIN ... AGAIN
instead of : ... RECURSE ;

Cheers,
Arnold

Anton Ertl, Dec 7, 2011, 10:33:14 AM:

Arnold Doray <thinks...@gmail.com> writes:
>I tried this example:
>
>: hijack ." hijacked" ;
>: next-action ." next-action" ;
>
>: test
> if
> R> drop ['] hijack >R
> then
> dup 0= if drop exit then
> dup cr .
> 1- false recurse ;
>
>10 true test next-action
>
>It works for VFX because it uses CALL for the terminal RECURSE while
>Swift hits an error after TEST terminates.

It also does not work on Forth systems where the xt is not usable as a
return address, among them most systems based on threaded code, e.g.,
Gforth, even if the systems don't perform inlining or tail-call
optimization.

> you can use this technique to implement "finalizers" --
>mandatory cleanup actions after a given word like TEST. Eg, freeing
>memory, closing database or socket connections, etc.

No, that's an unreliable technique even on systems where it works,
because it does not catch exceptions.

The way to perform cleanup actions is to wrap the protected code with
CATCH, then perform the cleanup, then THROW any exception onward. E.g.:

: hex.-helper ( u -- )
hex u. ;

: hex. ( u -- )
base @ >r
['] hex.-helper catch
r> base ! throw ;

>By manipulating the return stack you can create new : that always ends
>with a predefined set of actions during the defined word's runtime. Which
>is very useful. It's a pity this technique isn't portable.

A portable technique is to redefine ";":

: ; POSTPONE predefined-set-of-actions POSTPONE ; ; immediate

Arnold Doray, Dec 7, 2011, 12:09:57 PM:

On Wed, 07 Dec 2011 15:33:14 +0000, Anton Ertl wrote:

>>By manipulating the return stack you can create new : that always ends
>>with a predefined set of actions during the defined word's runtime.
>>Which is very useful. It's a pity this technique isn't portable.
>
> A portable technique is to redefine ";":
>
> : ; POSTPONE predefined-set-of-actions POSTPONE ; ; immediate
>
> - anton

Unfortunately this only works in the simplest of cases. When used
with :NONAME or non-tail recursive functions, it fails. Eg:

: bingo ." was his name" ;
: ;; postpone bingo postpone ; ; immediate

: def :noname ;

def ." rover " ;; <-- calls bingo at compile time!

: fib
dup 1 > if dup 1- recurse swap 2 - recurse + then ;;

10 fib <-- calls bingo many times!

Is there a better, more robust way to create finalizers in these
situations?

Cheers,
Arnold

Anton Ertl, Dec 7, 2011, 12:26:31 PM:

Arnold Doray <thinks...@gmail.com> writes:
>On Wed, 07 Dec 2011 15:33:14 +0000, Anton Ertl wrote:
>
>>>By manipulating the return stack you can create new : that always ends
>>>with a predefined set of actions during the defined word's runtime.
>>>Which is very useful. It's a pity this technique isn't portable.
>>
>> A portable technique is to redefine ";":
>>
>> : ; POSTPONE predefined-set-of-actions POSTPONE ; ; immediate
>>
>> - anton
>
>Unfortunately this only works in the simplest of cases. When used
>with :NONAME or non-tail recursive functions, it fails. Eg:
>
>: bingo ." was his name" ;
>: ;; postpone bingo postpone ; ; immediate
>
>: def :noname ;
>
>def ." rover " ;; <-- calls bingo at compile time!

Not on standard systems; e.g.:

Gforth 0.6.2, Copyright (C) 1995-2003 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
: bingo ." was his name" ; ok
: ;; postpone bingo postpone ; ; immediate ok
ok
: def :noname ; ok
ok
def ." rover " ;; ok
execute rover was his name ok

SwiftForth i386-Linux 3.2.1 28-Dec-2009
: bingo ." was his name" ; ok
: ;; postpone bingo postpone ; ; immediate ok
ok
: def :noname ; ok
ok
def ." rover " ;; ok
execute rover was his name ok

>: fib
> dup 1 > if dup 1- recurse swap 2 - recurse + then ;;
>
>10 fib <-- calls bingo many times!

Sure, that's what you asked for:

>>>always ends
>>>with a predefined set of actions during the defined word's runtime.

>Is there a better, more robust way to create finalizers in these
>situations?

It's in the part of the posting that you did not cite.
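Spelled out for the FIB example, with BINGO as the cleanup word (a
sketch of that CATCH technique; the cleanup runs exactly once per outer
call, whatever the recursion depth):

```forth
: bingo ( -- ) ." was his name" ;

: (fib) ( u -- u' )
  dup 1 > if dup 1- recurse swap 2 - recurse + then ;

: fib ( u -- u' )
  ['] (fib) catch   \ run the recursive worker under exception protection
  bingo             \ cleanup: runs once, even if (FIB) threw
  throw ;           \ re-raise any exception; 0 THROW is a no-op
```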

Andrew Haley, Dec 7, 2011, 12:49:16 PM:

Stephen Pelc <steph...@mpeforth.com> wrote:
> On Wed, 07 Dec 2011 06:23:52 -0600, Andrew Haley
> <andr...@littlepinkcloud.invalid> wrote:
>
>>Hmmm. I've a pretty good idea how that trick works, and I don't think
>>it needs R-stack manipulation. As usual, I'm prepared to be
>>convinced.
>
> Prove it.

I'm not sure what "it" is in this context.
Jeez, that's evil. :-)

> There is a derivative with fewer side effects, but it's harder
> to understand. What is nice about ?PREFIX is that you just stick
> it into a Forth assembler in the right places and the assembler
> becomes switchable between prefix and postfix.

Hmmm, that's impressively devious. Perhaps I've missed something, but
isn't ?PREFIX just

: (prefix) ( a)
body> aprior @ swap aprior !
<prefix> off execute <prefix> on ;

: ?prefix
postpone prefix? postpone if
postpone (prefix) postpone exit
postpone then ; immediate

Unfortunately that's not portable either because of the BODY> , although
I think most Forths can do it. I wonder if there's a nice portable
way.

It'd be more difficult in the case of something like

: blah x y z ?prefix a b c ;

but I don't think an assembler ever needs to do that.

Andrew.

Andrew Haley, Dec 7, 2011, 12:55:37 PM:

Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:

> There is also the issue of processors that put the return addresses
> in a place that is not accessible as data, but I doubt that Forth
> implementations for these CPUs are fully compliant anyway, so if we
> standardized return-address manipulation, that would be just one
> standard feature among several that such a system does not
> implement.

That isn't necessarily true.

> The uses of return-address manipulation I have seen from Bernd can be
> replaced without too much effort with quotations (nested variants of
> :NONAME words) and words that take xts. Some of the things Michael
> Gassanenko has done may be harder to replace (not sure).
>
> Overall, I think that we should add quotations and use them to replace
> return-address manipulating code. Then the compiler can happily
> perform inlining and tail-call optimization.

I've wanted this for, oh, ages. It would make the language
considerably more powerful, but there's not much common practice in
this area. And I'm not sure how much actual use it'd get in Forth's
user base: I'm against adding any more dusty corners to the language.

Andrew.

Andrew Haley, Dec 7, 2011, 1:08:19 PM:

\ Arnold Doray <thinks...@gmail.com> wrote:

\ > Or do you have a simpler solution in mind?

\ Yes. Starting from the top:

\ The thing to keep in mind is that syntax isn't very important, but
\ semantics is. It's better not to start out with a fixed idea of what
\ the syntax is going to look like, but to solve the problem in the most
\ natural way and let the syntax be a result of the process.
\
\ We first need some sort of lookup table. The best design depends on
\ the funtion being memoized, but I'd like the interface to the table
\ to be:
\
\ SLOT ( key table - slot-address)
\
\ where the address returned points to a slot that's a tuple of two
\ integers, the key and the value.
\
\ As a simple example, I'll use a simple modulo hash for indexing into
\ the table. Like this:

: slot ( n table - slot)
dup cell+ >r @ mod abs 2 cells * r> + ;

\ A couple of field words for the key and the value:

: hash.key ( a - a') ;
: hash.value ( a - a') cell+ ;

\ And a word that creates a table:

hex 80000000 constant min-int decimal
: ,hashtable ( n)
dup ,
0 do min-int , 0 , loop ;

\ Now, I'll need a word that looks up a key in the table, and either
\ returns the value it found or executes a provided XT, updating the
\ table. Like EXECUTE, but with a memoization table:

: memoized-execute ( n table xt - n')
>r
over swap slot ( n slot)

2dup hash.key @ = if ( We found it)
hash.value @ swap r> 2drop exit then

( n slot) over r> execute
( n slot n') dup >r
-rot ( n' n slot) 2! r> ;

\ The stack manipulation in that word is rather fiddly and sorely
\ needs tidying up, but I hope it's fairly clear what's going on.
\
\ So, we now have all the pieces we need, and can define FIB:

defer (fib)
create fib-table 29 ,hashtable

: fib ( n - n') fib-table ['] (fib) memoized-execute ;

:noname ( n - n')
dup 1 > if dup 1- fib swap 2 - fib + then ;
is (fib)

\ That's OK, but it's not great. We've had to create a dummy (FIB)
\ definition and an auxiliary FIB-TABLE.
\
\ We really need a word that creates a memoizer that wraps a definition.
\ DOES> to the rescue:

: memoize ( n)
create 0 , ,hashtable
does> ( n) dup cell+ swap @ memoized-execute ;

\ Here's how it's used:

29 memoize fib
:noname ( n - n')
dup 1 > if dup 1- fib swap 2 - fib + then ;
' fib >body !

\ and

101 memoize M
101 memoize F

:noname
dup 0= if drop 1 exit then
dup 1- F M - ;
' F >body !

:noname
dup 0= if exit then
dup 1- M F - ;
' M >body !

\ And that will do, I think. You could add a bit of syntactic sugar
\ so that it looks a bit more like a colon definition, but I wouldn't.
\
\ "Syntactic sugar causes cancer of the semicolon." -- Alan Perlis
\
\ Andrew.

BruceMcF, Dec 7, 2011, 3:41:35 PM:

On Dec 4, 11:00 pm, Arnold Doray <thinksqua...@gmail.com> wrote:
> On Sun, 04 Dec 2011 12:09:36 -0800, BruceMcF wrote:
>> : :: ( ... create array ... -- addr size )
>>  >IN @ GET-CURRENT ( addr size #in current-wid )
>>  Memoize-Arrays SET-CURRENT 2SWAP 2DUP 2CONSTANT 2>R SET-CURRENT >IN !
>>   : 2>R ( r: addr size ) ;

> Thanks Bruce. This is really useful to know.

Typo (well, typo plus lagged edit):

: :: ( ... create array ... -- addr size )
>IN @ GET-CURRENT ( addr size #in current-wid )
Memoize-Arrays SET-CURRENT 2SWAP 2DUP 2CONSTANT
2>R SET-CURRENT >IN ! : 2R> ( addr size ) ;

Hans Bezemer, Dec 7, 2011, 5:55:37 PM:

Andrew Haley wrote:
>> But RECURSE ; ( ie, a RECURSE that is the last word in a definition )
>> can be compiled to a JMP.
True.
 
> No, it can't.  Not in the presence of R-stack manipulation.
Also true. In 4tH, there is aggressive tail-call optimization, but it needs
special attention in the following cases:

: someword if something else something-else then ;
Will translate to:

  JUMP Z
  CALL SOMETHING
  JUMP to RETURN
  JUMP SOMETHING-ELSE
  RETURN

While:

: someword something ;

Will translate to:

  JUMP SOMETHING

However, if an RS-changing word is called, the tail-call optimizer has to be
disabled. The occasions where this is needed are rare, so detecting such a word
isn't done by the compiler. The programmer has to tell the tail-call
optimizer not to kick in:

: someword something rs-changer [FORCE] ;

I'd be delighted to know if there are other occasions this scheme doesn't
work. It does so far for 20,000+ lines of code.

P.S.

BTW, the first example can be further optimized by writing:

: someword if something exit then something-else ;

Which will render to:

  JUMP Z
  JUMP SOMETHING
  JUMP SOMETHING-ELSE

But it's ugly source ;-)


Hans Bezemer

Bernd Paysan, Dec 7, 2011, 6:03:07 PM:

Andrew Haley wrote:

>> Overall, I think that we should add quotations and use them to replace
>> return-address manipulating code. Then the compiler can happily
>> perform inlining and tail-call optimization.
>
> I've wanted this for, oh, ages. It would make the language
> considerably more powerful, but there's not much common practice in
> this area. And I'm not sure how much actual use it'd get in Forth's
> user base: I'm against adding any more dusty corners to the language.

MINOS heavily relies on quotations. I feel a bit guilty about using return
stack tricks in MINOS, where quotations would do that job, too, and
the MINOS harness actually provides these quotations (and all the fancy
OOP stuff to do real iterators and so on), but those return stack
manipulation parts are historical: quotations only became obvious and
essential once I did Theseus, the GUI editor.

People who edit a GUI want to enter the actions a button does, but do
not want to give these actions a name. An unnamed definition, however, is
identified only by its position, so it's right where you use it, i.e. it
has to be a quotation. For more direct usages of quotations in Forth you
can use named words, and tick those.

The definition of my quotations is fairly portable; this is how it
looks in bigForth (factored a bit more for clarity):

: comp-state@ ( -- state ) loffset @ last @ ;
: comp-state! ( state -- ) last ! loffset ! ;

: :[ ( compile-time: -- orig colon-sys )
state @ IF comp-state@ POSTPONE AHEAD true
ELSE false THEN
:noname ; immediate

: ]: ( compile-time: orig colon-sys -- ; run-time: -- xt )
POSTPONE ; >r IF ] comp-state! POSTPONE THEN r> POSTPONE ALiteral
ELSE r> THEN ( xt ) ; immediate

What differs between different Forth systems is what you need to save
and restore in comp-state@ and comp-state!.

Since all single braces are used up, I decided to have :[ ]: as
delimiters (the colon shows that this stuff inside a quotation is a
definition, i.e. compiled).

Retro uses single [ ] for quotations, but Retro chooses to ignore most of
standard Forth ;-). Another obvious syntax would be '[ ]'.

Hans Bezemer

unread,
Dec 7, 2011, 6:08:04 PM12/7/11
to
Andrew Haley wrote:
>> The performance gain for the tail recursive CALL/JMP change is often
>> marginal, while the loss of reliability (it's a hidden return stack
>> transformation) can be significant. The discussions at MPE on this
>> issue were long and heated. Overall, we went for reliability.
>
> That makes some sense.
Depends on the implementation. In assembly, yes, maybe. But when you code
such a thing in high-level C, it becomes clear:

CALL-RETURN
- Get current address
- Push on return stack
- Jump
..
- Get address from return stack
- Jump

Vs.

- Jump

Like I said before: the only thing is when the called word CHANGES the
return stack. If there is something left on the return stack by you BEFORE
you jump, you should not use tail recursion, because it isn't tail
recursion - cleaning up is something that is left to do.
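The cost difference Hans lists can be sketched in Python (an illustrative analogue, not Forth): the call form consumes one return-stack frame per step, while the jump form reuses the current frame, i.e. it is a loop. Python itself performs no tail-call optimization, which is exactly why the second form must be written as a loop by hand:

```python
def countdown_call(n):
    # CALL ... RET: one return-stack entry per recursion level
    if n == 0:
        return "done"
    return countdown_call(n - 1)

def countdown_jump(n):
    # tail call turned into a backward JMP: no return-stack growth
    while n:
        n -= 1
    return "done"
```

`countdown_call` hits Python's recursion limit near a thousand levels; `countdown_jump` handles a million iterations in constant stack space.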

>> I haven't measured it on current desktop CPUs, but on some
>> architectures, JMP and CALL take the same space and execution time.
>
> Mmm, but the difference is between CALL ; RET and just JMP .
> It's not the difference between a simple CALL and a JMP .
Correct.

Hans Bezemer

BruceMcF

unread,
Dec 7, 2011, 6:11:56 PM12/7/11
to
On Dec 7, 6:03 pm, Bernd Paysan <bernd.pay...@gmx.de> wrote:
> Since all single braces are used up, I decided to have :[ ]: as
> delimiters (the colon shows that this stuff inside a quotation is a
> definition, i.e. compiled).

Do quotations nest? If so, [: ... ;] would be reasonably pictographic
(as well as being happier emoticons than the grumpy ones you picked).

In any event, I like :[ ... ]: best of the three mentioned.

Arnold Doray

unread,
Dec 7, 2011, 7:05:53 PM12/7/11
to
On Wed, 07 Dec 2011 12:08:19 -0600, Andrew Haley wrote:

Many thanks for taking the trouble to write this. I have learnt a lot
from your code.

>
> \ And a word that creates a table:
>
> hex 80000000 constant min-int decimal :
> ,hashtable ( n)
> dup ,
> 0 do min-int , 0 , loop ;
>

Wouldn't you have to insert an ALIGN?

> \ We really need a word that creates a memoizer that wraps a definition.
> \ DOES> to the rescue:
>
> : memoize ( n)
> create 0 , ,hashtable
> does> ( n) dup cell+ swap @ memoized-execute ;
>
> \ Here's how it's used:
>
> 29 memoize fib
> :noname ( n - n')
> dup 1 > if dup 1- fib swap 2 - fib + then ;
> ' fib >body !
>
> \ and
>
> 101 memoize M
> 101 memoize F
>
> :noname
> dup 0= if drop 1 exit then
> dup 1- F M - ;
> ' F >body !
>
> :noname
> dup 0= if exit then
> dup 1- M F - ;
> ' M >body !
>

Brilliant. I would never have thought to use ' XYZ >body ! . You've
basically written the XT of the :NONAME into the first data cell of the
hashtable, which the DOES> accesses at runtime to retrieve the XT.

Yes, your solution is much better than mine. You're also right -- I was
too fixated on syntax and that constrained my thinking.

Thanks,
Arnold
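For readers following along, the patched-xt memoizer can be sketched in Python. This is an analogue, not the quoted Forth: the direct-map slot choice is an assumption (the quoted excerpt elides `memoized-execute`), and the mutable `slot` dict stands in for the first data cell written by `' fib >body !`:

```python
MIN_INT = -2**31                     # sentinel marking an empty slot, as in the Forth

def memoize(size):
    # table layout mirrors ,hashtable: <size> key/value slots, keys start empty
    keys = [MIN_INT] * size
    vals = [0] * size
    slot = {"fn": None}              # "first data cell": patched afterwards

    def wrapper(n):
        i = n % size                 # direct-map slot choice (an assumption)
        if keys[i] != n:
            v = slot["fn"](n)        # miss: call the stored "xt"
            keys[i] = n
            vals[i] = v
        return vals[i]

    wrapper.slot = slot
    return wrapper

fib = memoize(29)
# the :noname body, patched in after the fact, recursing through the wrapper
fib.slot["fn"] = lambda n: fib(n - 1) + fib(n - 2) if n > 1 else n
```

As in Andrew's version, the recursive body calls the memoizing wrapper by name, so every intermediate result lands in the table.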

Bernd Paysan

unread,
Dec 7, 2011, 7:52:29 PM12/7/11
to
BruceMcF wrote:

> On Dec 7, 6:03 pm, Bernd Paysan <bernd.pay...@gmx.de> wrote:
>> Since all single braces are used up, I decided to have :[ ]: as
>> delimiters (the colon shows that this stuff inside a quotation is a
>> definition, i.e. compiled).
>
> Do quotations nest?

Of course.

> If so, [: ... ;] would be reasonably pictographic
> (as well as being happier emoticons than the grumpy ones you picked).

Knode translates :[ into a vampire bat ;-). I actually like [: and ;],
and wonder why I haven't picked those. Idiomatic, they make a lot more
sense than my :[ ]:, because enclosing something in brackets means "at
compile time", and enclosing an entire definition with brackets thus
means "nested definition". Just that there is no name.

> In any event, I like :[ ... ]: best of the three mentioned.

[ ] is already taken, it's not available.

Andrew Haley

unread,
Dec 8, 2011, 4:37:56 AM12/8/11
to
Arnold Doray <thinks...@gmail.com> wrote:
> On Wed, 07 Dec 2011 12:08:19 -0600, Andrew Haley wrote:
>
>> \ And a word that creates a table:
>>
>> hex 80000000 constant min-int decimal :
>> ,hashtable ( n)
>> dup ,
>> 0 do min-int , 0 , loop ;
>>
>
> Wouldn't you have to insert an ALIGN?

Ah, yeah. Intel x86 processors have warped my mind.

> Brilliant. I would never have thought to use ' XYZ >body ! .

Thanks. You're very kind, but that is really just experience.

Andrew.

Gerry Jackson

unread,
Dec 8, 2011, 6:12:34 AM12/8/11
to
On 08/12/2011 00:52, Bernd Paysan wrote:
> BruceMcF wrote:
>
>> On Dec 7, 6:03 pm, Bernd Paysan<bernd.pay...@gmx.de> wrote:
>>> Since all single braces are used up, I decided to have :[ ]: as
>>> delimiters (the colon shows that this stuff inside a quotation is a
>>> definition, i.e. compiled).
>>
>> Do quotations nest?
>
> Of course.
>
>> If so, [: ... ;] would be reasonably pictographic
>> (as well as being happier emoticons than the grumpy ones you picked).
>
> Knode translates :[ into a vampire bat ;-). I actually like [: and ;],
> and wonder why I haven't picked those. Idiomatic, they make a lot more
> sense than my :[ ]:, because enclosing something in brackets means "at
> compile time", and enclosing an entire definition with brackets thus
> means "nested definion". Just that there is no name.

Yes I use [: ... ;] for that very reason.

>
>> In any event, I like :[ ... ]: best of the three mentioned.
>
> [ ] is already taken, it's not available.
>


--
Gerry

Anton Ertl

unread,
Dec 8, 2011, 10:36:11 AM12/8/11
to
steph...@mpeforth.com (Stephen Pelc) writes:
>On Wed, 07 Dec 2011 06:23:52 -0600, Andrew Haley
><andr...@littlepinkcloud.invalid> wrote:
>
>>Hmmm. I've a pretty good idea how that trick works, and I don't think
>>it needs R-stack manipulation. As usual, I'm prepared to be
>>convinced.
>
>Prove it.
...
>Your version should preserve the prefix/postfix switch.
>
>2variable aprior \ holds previous instruction and data
> 0 0 aprior 2!
>
>variable <prefix> \ prefix/postfix mode
> <prefix> on
>
>: prefix? \ -- t/f ; true if in prefix mode
> <prefix> @
>;
>
>: ?prefix \ n -- n' ; executes previous opcode
> prefix? if
> r>
> aprior 2@ 2swap aprior 2!
> >r
> endif
>;
>
>The assembler uses this, often as below, to permit prefix or
>postfix operation.
>
>: 2op-r/m \ op1 op2 reg -- ; reg/mem16 --
> create
> , , ,
> does>
> ?prefix \ do the last one
> ... \ lay the code
> reset \ ready for next opcode
>;

Andrew presented a version that uses BODY>. If you don't insist on
using the same syntax, this can also be implemented in standard Forth
(untested):

: create-?prefix ( "name" -- )
>in @ create >in ! ' , ;

: (?prefix) ( addr1 -- addr2 )
prefix? if
dup cell+ swap @ ( addr2 xt )
aprior 2@ 2swap aprior 2!
execute exit
then
cell+ ;

: does>-?prefix ( -- )
postpone does> postpone (?prefix) ;

: 2op-r/m \ op1 op2 reg -- ; reg/mem16 --
create-?prefix
, , ,
does>-?prefix \ do the last one
... \ lay the code
reset \ ready for next opcode
;

\ the rest is not affected by the changed syntax

$0f $00 $02 2op-r/m LLDT

Anton Ertl

unread,
Dec 8, 2011, 10:51:53 AM12/8/11
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>
>> There is also the issue of processors that put the return addresses
>> in a place that is not accessible as data, but I doubt that Forth
>> implementations for these CPUs are fully compliant anyway, so if we
>> standardized return-address manipulation, that would be just one
>> standard feature among several that such a system does not
>> implement.
>
>That isn't necessarily true.

Please elaborate! Do you know of any standard systems on CPUs where
the return addresses cannot be accessed as data? Or at least of CPUs
large enough for standard systems that have such a restriction?

>> Overall, I think that we should add quotations and use them to replace
>> return-address manipulating code. Then the compiler can happily
>> perform inlining and tail-call optimization.
>
>I've wanted this for, oh, ages. It would make the language
>considerably more powerful, but there's not much common practice in
>this area. And I'm not sure how much actual use it'd get in Forth's
>user base: I'm against adding any more dusty corners to the language.

Then I guess that way to go is to add it to some systems and establish
some common practice.

Given that return address manipulation is common enough that VFX even
foregoes some optimizations to support it, quotations might gain some
usage once the systems have provided them for a few years.

Anton Ertl

unread,
Dec 8, 2011, 11:15:37 AM12/8/11
to
an...@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>Andrew presented a version that uses BODY>. If you don't insist on
>using the same syntax, this can also be implemented in standard Forth
>(untested):
>
>: create-?prefix ( "name" -- )
> >in @ create >in ! ' , ;
>
>: (?prefix) ( addr1 -- addr2 )
> prefix? if
> dup cell+ swap @ ( addr2 xt )
> aprior 2@ 2swap aprior 2!
> execute exit
> then
> cell+ ;
>
>: does>-?prefix ( -- )
> postpone does> postpone (?prefix) ;
>
>: 2op-r/m \ op1 op2 reg -- ; reg/mem16 --
> create-?prefix
> , , ,
> does>-?prefix \ do the last one
> ... \ lay the code
> reset \ ready for next opcode
>;
>
>\ the rest is not affected by the changed syntax
>
>$0f $00 $02 2op-r/m LLDT

Hmm, I think that leads to an endless recursion if PREFIX? is true,
because the EXECUTE always starts such a word again from the start
(alternating between the latest and the previous one). And there are
other bugs. All of them are correct in Andrew Haley's version. So
let's see how I can correct that:

: create-?prefix ( "name" -- )
>in @ create >in ! ' , ;

: (?prefix) ( addr1 -- addr2 )
dup cell+ swap @ ( addr2 xt )
aprior 2@ 2swap aprior 2!
<prefix> off execute <prefix> on ;

: does>-?prefix ( -- )
postpone does>
postpone prefix? postpone if
postpone (?prefix) postpone exit
postpone then
postpone cell+ ; immediate

: 2op-r/m \ op1 op2 reg -- ; reg/mem16 --
create-?prefix
, , ,
does>-?prefix \ do the last one
... \ lay the code
reset \ ready for next opcode
;
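The one-instruction lookbehind that `?prefix` implements (each mnemonic lays down the previously recorded opcode, then records itself) can be modelled without any return-stack manipulation. A simplified Python sketch, with operands omitted and all names invented:

```python
out = []                 # assembled output
pending = None           # analogue of the APRIOR 2variable

def opcode(name):
    # "defining word": builds the handler for one mnemonic
    def handler():
        global pending
        prev, pending = pending, name
        if prev is not None:
            out.append(prev)   # ?PREFIX: lay down the *previous* opcode now
    return handler

def end_of_input():
    # flush the final deferred mnemonic at the end of the source
    global pending
    if pending is not None:
        out.append(pending)
        pending = None

mov, add = opcode("MOV"), opcode("ADD")
mov()                    # records MOV, emits nothing yet
add()                    # emits MOV, records ADD
end_of_input()           # emits ADD
```

Deferring each instruction by one step is what lets operands typed after a prefix-style mnemonic be collected before that mnemonic is actually assembled.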

Bernd Paysan

unread,
Dec 8, 2011, 12:04:07 PM12/8/11
to
Gerry Jackson wrote:

>> Knode translates :[ into a vampire bat ;-). I actually like [: and
>> ;], and wonder why I haven't picked those. Idiomatic, they make a
>> lot more sense than my :[ ]:, because enclosing something in brackets
>> means "at compile time", and enclosing an entire definition with
>> brackets thus means "nested definition". Just that there is no name.
>
> Yes I use [: ... ;] for that very reason.

I'd say we have a consensus on which smiley to use ;-).

Andrew Haley

unread,
Dec 8, 2011, 1:17:56 PM12/8/11
to
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>
>>> There is also the issue of processors that put the return addresses
>>> in a place that is not accessible as data, but I doubt that Forth
>>> implementations for these CPUs are fully compliant anyway, so if we
>>> standardized return-address manipulation, that would be just one
>>> standard feature among several that such a system does not
>>> implement.
>>
>>That isn't necessarily true.
>
> Please elaborate! Do you know of any standard systems on CPUs where
> the return addresses cannot be accessed as data? Or at least of
> CPUs large enough for standard systems that have such a restriction?

An 8051 has enough address space and has an internal subroutine stack
that's used for return addresses only. I don't know if the Forth, Inc
version supports all of CORE, but it could, which is the point.

>>> Overall, I think that we should add quotations and use them to replace
>>> return-address manipulating code. Then the compiler can happily
>>> perform inlining and tail-call optimization.
>>
>>I've wanted this for, oh, ages. It would make the language
>>considerably more powerful, but there's not much common practice in
>>this area. And I'm not sure how much actual use it'd get in Forth's
>>user base: I'm against adding any more dusty corners to the language.
>
> Then I guess that way to go is to add it to some systems and establish
> some common practice.
>
> Given that return address manipulation is common enough that VFX even
> foregoes some optimizations to support it, quotations might gain some
> usage once the systems have provided them for a few years.

Maybe. I'll have a look at what it would take to do it in SwiftForth.

Andrew.

Stephen Pelc

unread,
Dec 9, 2011, 9:48:14 AM12/9/11
to
On Thu, 08 Dec 2011 15:51:53 GMT, an...@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:

>Please elaborate! Do you know of any standard systems on CPUs where
>the return addresses cannot be accessed as data? Or at least of CPUs
>large enough for standard systems that have such a restriction?

In all ARM systems using the Thumb instruction set, bit 0 is set in
the value on the return stack, even though this actually refers to
an even address.

There are several CPUs, e.g. PIC24/30 in which the return address
is not necessarily the same size as a data cell.

There are 8051 systems in which the return address is not saved on
a Forth stack.

In standards and portability terms, you cannot assume the size,
location or meaning of the return address item. You have to use
new words such as Michael Gassanenko's Open Interpreter word
set.

We have all wanted return address issues to go away for many years,
but silicon designers have just not heard our pleas. We just have to
accept this.

Anton Ertl

unread,
Dec 9, 2011, 11:35:18 AM12/9/11
to
steph...@mpeforth.com (Stephen Pelc) writes:
>On Thu, 08 Dec 2011 15:51:53 GMT, an...@mips.complang.tuwien.ac.at
>(Anton Ertl) wrote:
>
>>Please elaborate! Do you know of any standard systems on CPUs where
>>the return addresses cannot be accessed as data? Or at least of CPUs
>>large enough for standard systems that have such a restriction?
>
>In all ARM systems using the Thumb instruction set, bit 0 is set in
>the value on the return stack, even though this actually refers to
>an even address.

As long as the only thing we do with a return address is to remove it
from the return stack and possibly push it back on later and jump to
it by performing an actual return, the actual format of a return
"address" is irrelevant. That usage scheme covers most of the uses
that Bernd Paysan and Michael Gassanenko are doing, and anything
beyond (like accessing and skipping literal data) is even less
portable anyway (even outside ARM-based systems).

In any case, on the ARM the return addresses may have a funny format,
but they are accessible as data.

>There are several CPUs, e.g. PIC24/30 in which the return address
>is not necessarily the same size as a data cell.

Are there fully compliant standard systems for these processors? And
are the return addresses larger than a data cell? If not, that's not
a problem.

>There are 8051 systems in which the return address is not saved on
>a Forth stack.

Yes, Andrew mentioned that. Still, my feeling is that a system
implementor will rather leave more memory to the programmer than to
provide all of the core words (hmm, ok, they could provide many of the
words as a listing to type in, then they would not consume memory
unless typed in). Does MPE have a fully compliant standard system for
the 8051? I'd rather expect an umbilical system.

BruceMcF

unread,
Dec 9, 2011, 1:18:23 PM12/9/11
to
On Dec 9, 11:35 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

> As long as the only thing we do with a return address is to
> remove it from the return stack and possibly push it back on
> later and jump to it by performing an actual return, the actual
> format of a return "address" is irrelevant.

But if the value is actually modified, it can't be done with words
named >R and R>, which would cause massive legacy code breakage if they
do not preserve cell values.

> That usage scheme covers most of the uses
> that Bernd Paysan and Michael Gassanenko are doing, and anything
> beyond (like accessing and skipping literal data) is even less
> portable anyway (even outside ARM-based systems).

>RET and RET> as single-cell operations would address cases where the return stack is accessible and no larger than a cell, but where for a variety of reasons the return stack is not actually used as the rack. In typical implementations, they'd just be synonyms for >R and R> ...

... that only leaves processors with return stack entries larger than
a single cell outside the scope of those words. To include those and
have a useful system, it does seem like you'd indeed need to go to the
Open Interpreter proposal or something of similar scope:

http://forth.sourceforge.net/wordset/open-interpreter/OI-norma.htm


Andrew Haley

unread,
Dec 9, 2011, 1:36:03 PM12/9/11
to
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> steph...@mpeforth.com (Stephen Pelc) writes:

>>There are 8051 systems in which the return address is not saved on
>>a Forth stack.
>
> Yes, Andrew mentioned that. Still, my feeling is that a system
> implementor will rather leave more memory to the programmer than to
> provide all of the core words (hmm, ok, they could provide many of the
> words as a listing to type in, then they would not consume memory
> unless typed in).

Err, type in? Is this a joke? It'd be distributed electronically,
surely.

Andrew.

Elizabeth D. Rather

unread,
Dec 9, 2011, 1:39:19 PM12/9/11
to
On 12/9/11 6:35 AM, Anton Ertl wrote:
> steph...@mpeforth.com (Stephen Pelc) writes:
>> On Thu, 08 Dec 2011 15:51:53 GMT, an...@mips.complang.tuwien.ac.at
>> (Anton Ertl) wrote:
>>
>>> Please elaborate! Do you know of any standard systems on CPUs where
>>> the return addresses cannot be accessed as data? Or at least of CPUs
>>> large enough for standard systems that have such a restriction?
>>
>> In all ARM systems using the Thumb instruction set, bit 0 is set in
>> the value on the return stack, even though this actually refers to
>> an even address.
>
> As long as the only thing we do with a return address is to remove it
> from the return stack and possibly push it back on later and jump to
> it by performing an actual return, the actual format of a return
> "address" is irrelevant. That usage scheme covers most of the uses
> that Bernd Paysan and Michael Gassanenko are doing, and anything
> beyond (like accessing and skipping literal data) is even less
> portable anyway (even outside ARM-based systems).

If actual return addresses are not kept on the Forth return stack (e.g.
on those 8051s), R> etc. won't access them.

> In any case, on the ARM the return addresses may have a funny format,
> but they are accessible as data.
>
>> There are several CPUs, e.g. PIC24/30 in which the return address
>> is not necessarily the same size as a data cell.
>
> Are there fully compliant standard systems for these processors? And
> are the return addresses larger than a data cell? If not, that's not
> a problem.
>
>> There are 8051 systems in which the return address is not saved on
>> a Forth stack.
>
> Yes, Andrew mentioned that. Still, my feeling is that a system
> implementor will rather leave more memory to the programmer than to
> provide all of the core words (hmm, ok, they could provide many of the
> words as a listing to type in, then they would not consume memory
> unless typed in). Does MPE have a fully compliant standard system for
> the 8051? I'd rather expect an umbilical system.

FORTH, Inc.'s systems for the 8051 (and other MCUs) are compliant with
the proposed cross-compiler standard.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

BruceMcF

unread,
Dec 9, 2011, 3:07:09 PM12/9/11
to
On Dec 9, 1:36 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:
> Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>> Yes, Andrew mentioned that.  Still, my feeling is that a system
>> implementor will rather leave more memory to the programmer than to
>> provide all of the core words (hmm, ok, they could provide many of the
>> words as a listing to type in, then they would not consume memory
>> unless typed in).

> Err, type in?  Is this a joke?  It'd be distributed electronically,
> surely.

I think rather the point being acknowledged is that if a full system
is on the microcontroller, it is a CORE system even if it does not
ship with the full CORE in dictionary, so long as if the rest of the
CORE is provided somehow.

"Typing it in" is just a colorfully distracting way to express it. If
it's a serial port console, you'd obviously prefer to copy the
definition in electronic form and feed it in through your console
program.

For a cross-compiler set-up, all of that is neither here nor there,
since what is on the microcontroller is the runtime code, so if any of
the return stack manipulations would occur at runtime, they have to
work on the microcontroller, whether or not the microcontroller
contains a complete CORE.

Anton Ertl

unread,
Dec 10, 2011, 7:34:35 AM12/10/11
to
"Elizabeth D. Rather" <era...@forth.com> writes:
>On 12/9/11 6:35 AM, Anton Ertl wrote:
>> Does MPE have a fully compliant standard system for
>> the 8051? I'd rather expect an umbilical system.
>
>FORTH, Inc.'s systems for the 8051 (and other MCUs) are compliant with
>the proposed cross-compiler standard.

Thank you for supporting my point.

Anton Ertl

unread,
Dec 10, 2011, 7:37:56 AM12/10/11
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>> Still, my feeling is that a system
>> implementor will rather leave more memory to the programmer than to
>> provide all of the core words (hmm, ok, they could provide many of the
>> words as a listing to type in, then they would not consume memory
>> unless typed in).
>
>Err, type in? Is this a joke? It'd be distributed electronically,
>surely.

For the point of this discussion (as well as for my opinion of quality
of the system), that makes little difference. The "provides as
listing" option used to be mentioned regularly as a way to standard
compliance in earlier years; it's probably mentioned somewhere in the
standard.

BruceMcF

unread,
Dec 10, 2011, 9:39:05 AM12/10/11
to
On Dec 10, 7:34 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> "Elizabeth D. Rather" <erat...@forth.com> writes:
>
> >On 12/9/11 6:35 AM, Anton Ertl wrote:
> >> Does MPE have a fully compliant standard system for
> >> the 8051?  I'd rather expect an umbilical system.
>
> >FORTH, Inc.'s systems for the 8051 (and other MCUs) are compliant with
> >the proposed cross-compiler standard.
>
> Thank you for supporting my point.

But if the return stack manipulation occurs at runtime, it doesn't
make any difference that the system is an umbilical system ~ the
microcontroller still has to be able to support the return stack
manipulation.

Albert van der Horst

unread,
Dec 10, 2011, 12:23:27 PM12/10/11
to
In article <4edf693c....@192.168.0.50>,
Stephen Pelc <steph...@INVALID.mpeforth.com> wrote:
>On Wed, 07 Dec 2011 06:23:52 -0600, Andrew Haley
><andr...@littlepinkcloud.invalid> wrote:
>
>>Hmmm. I've a pretty good idea how that trick works, and I don't think
>>it needs R-stack manipulation. As usual, I'm prepared to be
>>convinced.
>
>Prove it.
>
>So here's a simple and not entirely portable implementation
>that works on most desktop class CPUs and classical Forths.
>The original code was presented by Bob Smith at a FORML conference.
>Your version should preserve the prefix/postfix switch.
>
>2variable aprior \ holds previous instruction and data
> 0 0 aprior 2!

The Pentium has instructions with both address data and immediate
data plus operands that can be specified by complicated
expressions like [AX +8* BX]. Then there are prefixes.
I really wonder how you store that in a 2-variable?
Or is that a simplification for pedagogic purposes?

>
>variable <prefix> \ prefix/postfix mode
> <prefix> on
>
>: prefix? \ -- t/f ; true if in prefix mode
> <prefix> @
>;
>
>: ?prefix \ n -- n' ; executes previous opcode
> prefix? if
> r>
> aprior 2@ 2swap aprior 2!
> >r
> endif
>;
>
>The assembler uses this, often as below, to permit prefix or
>postfix operation.
>
>: 2op-r/m \ op1 op2 reg -- ; reg/mem16 --
> create
> , , ,
> does>
> ?prefix \ do the last one
> ... \ lay the code
> reset \ ready for next opcode
>;

Does this work out for error detection?

>
>$0f $00 $02 2op-r/m LLDT
>...
>
>There is a derivative with fewer side effects, but it's harder
>to understand. What is nice about ?PREFIX is that you just stick
>it into a Forth assembler in the right places and the assembler
>becomes switchable between prefix and postfix.
>
>Stephen

Groetjes Albert

--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Stephen Pelc

unread,
Dec 10, 2011, 12:34:34 PM12/10/11
to
On 10 Dec 2011 17:23:27 GMT, Albert van der Horst
<alb...@spenarnc.xs4all.nl> wrote:

>>2variable aprior \ holds previous instruction and data
>> 0 0 aprior 2!
>
>The Pentium has instructions with both address data and immediate
>data plus operands that can be specified by complicated
>expressions like [AX +8* BX]. Then there are prefixes.
>I really wonder how you store that in a 2-variable?
>Or is that a simplification for pedagogic purposes?

: ?prefix \ struct -- struct' ; executes previous opcode
prefix? if
r> \ -- struct pc
aprior 2@ 2swap aprior 2!
>r
endif
;

What is stored in APRIOR is a pointer to a data structure, the address
returned by DOES>, and the address to be executed. Just two items.
The examples are taken from a production assembler. We have used this
technique for 20+ years.

>>: 2op-r/m \ op1 op2 reg -- ; reg/mem16 --
>> create
>> , , ,
>> does>
>> ?prefix \ do the last one
>> ... \ lay the code
>> reset \ ready for next opcode
>>;
>
>Does this work out for error detection?

The error handler is a few lines that are also sensitive to the
prefix/postfix mode.

Andrew Haley

unread,
Dec 10, 2011, 12:43:54 PM12/10/11
to
Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
> In article <4edf693c....@192.168.0.50>,
> Stephen Pelc <steph...@INVALID.mpeforth.com> wrote:
>
>>So here's a simple and not entirely portable implementation
>>that works on most desktop class CPUs and classical Forths.
>>The original code was presented by Bob Smith at a FORML conference.
>>Your version should preserve the prefix/postfix switch.
>>
>>2variable aprior \ holds previous instruction and data
>> 0 0 aprior 2!
>
> The Pentium has instructions with both address data and immediate
> data plus operands that can be specified by complicated
> expressions like [AX +8* BX]. Then there are prefixes.
> I really wonder how you store that in a 2-variable?
> Or is that a simplification for pedagogic purposes?

This isn't really anything to do with the Pentium ISA, is it? All that
the 2variable stores is a code address and a data address.

Andrew.

Andrew Haley

unread,
Dec 10, 2011, 12:49:57 PM12/10/11
to
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>> Still, my feeling is that a system implementor will rather leave
>>> more memory to the programmer than to provide all of the core
>>> words (hmm, ok, they could provide many of the words as a listing
>>> to type in, then they would not consume memory unless typed in).
>>
>>Err, type in? Is this a joke? It'd be distributed electronically,
>>surely.
>
> For the point of this discussion (as well as for my opinion of quality
> of the system), that makes little difference. The "provides as
> listing" option used to be mentioned regularly as a way to standard
> compliance in earlier years; it's probably mentioned somewhere in the
> standard.

It says "may provide definitions, including definitions of words in
the Core word set, in source form only." I think the implication is
that it's machine-readable source, not a listing. On desktop systems
(not an 8051, admittedly) compilation of the source might be so fast
that a user wouldn't even notice it happening when the system was
started.

Andrew.

BruceMcF

unread,
Dec 10, 2011, 1:06:10 PM12/10/11
to
On Dec 10, 12:49 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:
> Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
It would typically be a machine readable listing rather than a printed
one, but the standard is awfully vague. But it's more a rhetorical
point than a pragmatic one, as it'd be highly unlikely that a
printed listing would be distributed without a machine readable file
of the listing, and even for systems with just a bidirectional serial
port as the console, it's highly unlikely to be just a dumb terminal at
the other end.

Andrew Haley

unread,
Dec 10, 2011, 1:37:15 PM12/10/11
to
BruceMcF <agi...@netscape.net> wrote:
> On Dec 10, 12:49 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
> wrote:
>>
>> It says "may provide definitions, including definitions of words in
>> the Core word set, in source form only." I think the implication is
>> that it's machine-readable source, not a listing. On desktop systems
>> (not an 8051, admittedly) compilation of the source might be so fast
>> that a user wouldn't even notice it happening when the system was
>> started.
>
> It would typically be a machine readable listing

OCR, you mean? :-)

> rather than a printed one, but the standard is awfully vague. But
> it's more a rhetorical point than a pragmatic one, as it'd be highly
> unlikely that a printed listing would be distributed without a
> machine readable file of the listing, and even for systems with just
> a bidirectional serial port as the console, it's highly unlikely to
> be just a dumb terminal at the other end.

Sure, but even as a rhetorical point it's failed on me.

Andrew.

BruceMcF

unread,
Dec 10, 2011, 2:28:26 PM12/10/11
to
On Dec 10, 1:37 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:
>> It would typically be a machine readable listing

> OCR, you mean?  :-)

Last listing I typed in, the listing was in one window on the computer
and the emulator of the target was in another. Since the listing was
in a file downloaded from the net, the only way I could have had a
paper copy would be printing it out myself.

> Sure, but even as a rhetorical point it's failed on me.

As I said, I think the original point lies along a red herring branch
in any event ~ as long as the return address manipulation is to be
done at runtime, then the microcontroller still needs to be able to do
it, even if programmed with an umbilical system and there is no
complete CORE system on the micro-controller itself.

So the issue of having a distinct >RR RR> operation for portable
return stack manipulation (as proposed elsewhere) does not go away
just because its an umbilical system.


Anton Ertl

unread,
Dec 11, 2011, 6:30:37 AM12/11/11
to
BruceMcF <agi...@netscape.net> writes:
>On Dec 10, 7:34 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>wrote:
>> "Elizabeth D. Rather" <erat...@forth.com> writes:
>>
>> >On 12/9/11 6:35 AM, Anton Ertl wrote:
>> >> Does MPE have a fully compliant standard system for
>> >> the 8051?  I'd rather expect an umbilical system.
>>
>> >FORTH, Inc.'s systems for the 8051 (and other MCUs) are compliant with
>> >the proposed cross-compiler standard.
>>
>> Thank you for supporting my point.
>
>But if the return stack manipulation occurs at runtime, it doesn't
>make any difference that the system is an umbilical system ~ the
>microcontroller still has to be able to support the return stack
>manipulation.

Sure, but my claim was:

|I doubt that Forth
|implementations for these CPUs are fully compliant anyway, so if we
|standardized return-address manipulation, that would be just one
|standard feature among several that such a system does not
|implement.

I have yet to see a counterexample for this claim.

Andrew Haley

unread,
Dec 13, 2011, 2:07:54 PM12/13/11
to
Turns out it's pretty easy on x86 SwiftForth, but not exactly trivial:

icode (quotation)
here $400 + call
ret end-code

: [: postpone (quotation) postpone begin ; immediate
: ;] postpone exit postpone then
postpone r> postpone code> ; immediate

This takes advantage of the fact that the return stack really is the
processor's return stack and the rather nice ICODE allows the easy
insertion of assembly language fragments. It even generates pretty
efficient code. RECURSE doesn't work properly, though; is that a
problem? Do we even know what RECURSE should do inside a quotation?

The capture of locals' values would be nice to have, but perhaps
that's rather too much.

Andrew.

Alex McDonald

unread,
Dec 13, 2011, 4:37:31 PM12/13/11
to
On Dec 13, 7:07 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:
> Andrew Haley <andre...@littlepinkcloud.invalid> wrote:
> > Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
Does Swiftforth have only one data space?

Andrew Haley

unread,
Dec 14, 2011, 4:45:29 AM12/14/11
to
Alex McDonald <bl...@rivadpm.com> wrote:
>
> Does Swiftforth have only one data space?

I don't really know what you mean. I'm pretty sure that the processor
has only one data space.

Andrew.

Alex McDonald

unread,
Dec 14, 2011, 4:58:37 AM12/14/11
to
On Dec 14, 9:45 am, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:
ANS Forth; "data space: The logical area of the dictionary that can be
accessed." Some Forths maintain separate code and data spaces. Which
type is SwiftForth?

Andrew Haley

unread,
Dec 14, 2011, 5:09:46 AM12/14/11
to
Alex McDonald <bl...@rivadpm.com> wrote:
> On Dec 14, 9:45 am, Andrew Haley <andre...@littlepinkcloud.invalid>
> wrote:
>> Alex McDonald <b...@rivadpm.com> wrote:
>>
>> > Does Swiftforth have only one data space?
>>
>> I don't really know what you mean.  I'm pretty sure that the processor
>> has only one data space.
>
> ANS Forth; "data space: The logical area of the dictionary that can be
> accessed." Some Forths maintain separate code and data spaces. Which
> type is SwiftForth?

Oh, I see. Just the one, I think.

Andrew.

Bernd Paysan

unread,
Dec 14, 2011, 8:08:34 AM12/14/11
to
Andrew Haley wrote:
> Turns out it's pretty easy on x86 SwiftForth, but not exactly trivial:
>
> icode (quotation)
> here $400 + call
> ret end-code
>
> : [: postpone (quotation) postpone begin ; immediate
> : ;] postpone exit postpone then
> postpone r> postpone code> ; immediate
>
> This takes advantage of the fact that the return stack really is the
> processor's return stack and the rather nice ICODE allows the easy
> insertion of assembly language fragments. It even generates pretty
> efficient code. RECURSE doesn't work properly, though; is that a
> problem? Do we even know what RECURSE should do inside a quotation?

Probably recurse the nested definition.

> The capture of locals' values would be nice to have, but perhaps
> that's rather too much.

Ah, no closures. Just nested anonymous definitions. What would be nice
is if the nested definitions could have their own locals.

Andrew Haley

unread,
Dec 14, 2011, 10:18:05 AM12/14/11
to
Bernd Paysan <bernd....@gmx.de> wrote:
> Andrew Haley wrote:
>> Turns out it's pretty easy on x86 SwiftForth, but not exactly trivial:
>>
>> icode (quotation)
>> here $400 + call
>> ret end-code
>>
>> : [: postpone (quotation) postpone begin ; immediate
>> : ;] postpone exit postpone then
>> postpone r> postpone code> ; immediate
>>
>> This takes advantage of the fact that the return stack really is the
>> processor's return stack and the rather nice ICODE allows the easy
>> insertion of assembly language fragments. It even generates pretty
>> efficient code. RECURSE doesn't work properly, though; is that a
>> problem? Do we even know what RECURSE should do inside a quotation?
>
> Probably recurse the nested definition.

Ouch. I think that's going to make it harder.

>> The capture of locals' values would be nice to have, but perhaps
>> that's rather too much.
>
> Ah, no closures. Just nested anonymous definitions.

Shame. :-)

But the really evil problem with locals is that it really requires
garbage collection: there's no way to know when a closure is no longer
needed.

Andrew.

Anton Ertl

unread,
Dec 14, 2011, 10:27:57 AM12/14/11
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>RECURSE doesn't work properly, though; is that a
>problem? Do we even know what RECURSE should do inside a quotation?

That would be up for discussion, but my opinion is that it should be
specified to do the following:

RECURSE calls the definition/quotation that directly contains it.

Example:

: foo
... recurse ... \ calls foo
[: ... recurse ... ;] \ calls first quotation
... recurse ... \ calls foo
[: ... recurse ... \ calls second quotation
[: ... recurse ... ;] \ calls quotation 2.1
... recurse ... ;] \ calls second quotation
... recurse ... \ calls foo
;

>The capture of locals' values would be nice to have, but perhaps
>that's rather too much.

Yes, expensive to implement. IMO using the names of locals that are
defined in definitions or quotations outside the directly containing
definition/quotation should be specified to be an ambiguous condition.
So an ambitious implementor can implement full closures (and maybe
they will catch on), but the rest will not be burdened with this
requirement.

Andrew Haley

unread,
Dec 14, 2011, 10:48:55 AM12/14/11
to
Andrew Haley <andr...@littlepinkcloud.invalid> wrote:
> Bernd Paysan <bernd....@gmx.de> wrote:
>> Andrew Haley wrote:
>>> Turns out it's pretty easy on x86 SwiftForth, but not exactly trivial:
>>>
>>> icode (quotation)
>>> here $400 + call
>>> ret end-code
>>>
>>> : [: postpone (quotation) postpone begin ; immediate
>>> : ;] postpone exit postpone then
>>> postpone r> postpone code> ; immediate
>>>
>>> This takes advantage of the fact that the return stack really is the
>>> processor's return stack and the rather nice ICODE allows the easy
>>> insertion of assembly language fragments. It even generates pretty
>>> efficient code. RECURSE doesn't work properly, though; is that a
>>> problem? Do we even know what RECURSE should do inside a quotation?
>>
>> Probably recurse the nested definition.
>
> Ouch. I think that's going to make it harder.

Oh, it's not so bad:

: [: last 2 cells + @ \ Save last code address
postpone (quotation) postpone begin
here last 2 cells + ! \ Last code address points to our quotation
; immediate

: ;] postpone exit
postpone then \ Branch over the quotation
last 2 cells + ! \ Restore last code address
postpone r> postpone code> ; immediate

And for a rather contrived test case:

: factorial ( n) ( f: - r)
[: ( n - xt) ( f: - r)
s>f
[: ( f: r - r')
f?dup if
fdup 1.0e f- recurse f*
else 1.0e0 then
;]
;]
execute execute ;

69 factorial fe. 171.12245E+96 ok

Andrew.

Andrew Haley

unread,
Dec 14, 2011, 10:50:01 AM12/14/11
to
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:

> IMO using the names of locals that are defined in definitions or
> quotations outside the directly containing definition/quotation
> should be specified to be an ambiguous condition. So an ambitious
> implementor can implement full closures (and maybe they will catch
> on), but the rest will not be burdened with this requirement.

This makes sense.

Andrew.

Elizabeth D. Rather

unread,
Dec 14, 2011, 3:43:23 PM12/14/11
to
Code and data space are integrated in SwiftForth. They are segregated
in SwiftX targets, though: code space plus initialized and
uninitialized data space.

Bernd Paysan

unread,
Dec 14, 2011, 4:38:17 PM12/14/11
to
I'd expect that closures in Forth would be more explicit than just
having a local of an outside scope inside a quotation; in particular, I'd
expect closures to take their parameters from the stack. Something like

: foo
3 4 [{: a b :} a b + ;] execute . ;

foo 7 ok

would create a closure with two values. At least I would know how to
implement this, whereas an implicit closure with out-of-scope locals...
that's beyond the analysis capabilities I want a Forth compiler to have.

The way to implement this requires two things: a pointer to the
"currently active closure", which is saved and restored by the closure's
entry and exit point. Closure values use that pointer to refer to the
data, otherwise they work like locals.

And to create the closure itself, you allocate some memory, and use
CREATE DOES> there to set up the memory area (with a nameless CREATE).
The part following DOES> is the code after :}, preceded by a >closure.

Memory? Closures need dynamic memory allocation, so you probably have
to explicitly free them in Forth.
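
In plain ANS Forth (Memory-Allocation wordset), that explicit scheme can
be approximated like this. This is only a sketch, not the CREATE DOES>
version described above: MAKE-CLOSURE2 and CALL-CLOSURE are invented
names, and the record is not itself an xt; making it one is exactly what
the nameless CREATE DOES> would buy.

```forth
\ Sketch: a closure record in ALLOCATEd memory, holding a code xt
\ and two captured cells.  The record stands in for the "currently
\ active closure"; it must be run through CALL-CLOSURE, not EXECUTE.
: make-closure2 ( a b xt -- rec )
  3 cells allocate throw >r
  r@ !                  \ rec        : code xt
  r@ cell+ !            \ rec+cell   : captured b
  r@ 2 cells + !        \ rec+2cells : captured a
  r> ;

: call-closure ( i*x rec -- j*x )
  dup 2 cells + @       \ push captured a
  over cell+ @          \ push captured b
  rot @ execute ;       \ run the code xt with a b on the stack

\ usage:
3 4 ' + make-closure2   ( rec )
dup call-closure .      \ prints 7
free throw              \ freed explicitly, as noted above
```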

Bernd Paysan

unread,
Dec 14, 2011, 5:24:43 PM12/14/11
to
Andrew Haley wrote:

> And for a rather contrived test case:
>
> : factorial ( n) ( f: - r)
> [: ( n - xt) ( f: - r)
> s>f
> [: ( f: r - r')
> f?dup if
> fdup 1.0e f- recurse f*
> else 1.0e0 then
> ;]
> ;]
> execute execute ;
>
> 69 factorial fe.

Apart from the f?dup and the floating point stack depth limits, this
works out of the box in bigForth, so I got that right by design.

This is my contrived test case variation: it uses the floating point
stack only for the output.

: factorial ( n) ( f: - r)
[: ( n - xt) ( f: - r)
1e0
[: ( f: r - r')
?dup if
dup 1 - recurse s>f f*
then
;]
;]
execute execute ;

69 factorial fe.

BTW: the intra-word forward call you are using to jump over the
quotation is something I use in my regexp compiler.

Gerry Jackson

unread,
Dec 14, 2011, 6:03:13 PM12/14/11
to
Did you ever see the implementation I wrote a couple of years or so ago?
It does some very similar things.
www.qlikz.org/forth/library/lambda.zip

--
Gerry

Andrew Haley

unread,
Dec 15, 2011, 5:50:21 AM12/15/11
to
Bernd Paysan <bernd....@gmx.de> wrote:
> Andrew Haley wrote:
>
>> The capture of locals' values would be nice to have, but perhaps
>> that's rather too much.
>
> Ah, no closures. Just nested anonymous definitions. What would be nice
> is if the nested definitions could have their own locals.

I've had a look at this, and it's not always going to be easy: you'd
have to be able to add some names to the locals list and strip them back
at the end of each quotation. I guess this is easy enough when locals
are implemented using a private wordlist -- you can just use MARKER .
But otherwise it might be rather tricky.

Is this important, do you think? I can't make my mind up.

Andrew.

BruceMcF

unread,
Dec 15, 2011, 10:45:20 AM12/15/11
to
On Dec 15, 5:50 am, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:
Couldn't it just be specified that the local names within each
quotation should be unique within each containing definition, and use
of local names not defined within a quotation is an ambiguous
condition?


Anton Ertl

unread,
Dec 15, 2011, 10:48:31 AM12/15/11
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Bernd Paysan <bernd....@gmx.de> wrote:
>> Ah, no closures. Just nested anonymous definitions. What would be nice
>> is if the nested definitions could have their own locals.
>
>I've had a look at this, and it's not always going to be easy: you'd
>have to be able to add some names to the locals list and strip them back
>at the end of each quotation. I guess this is easy enough when locals
>are implemented using a private wordlist -- you can just use MARKER .
>But otherwise it might be rather tricky.
>
>Is this important, do you think? I can't make my mind up.

IMO it's certainly important enough that I will implement it. Whether
it will be important enough that others will implement it, too, will
remain to be seen. I think that quotations without locals are much
better than no quotations, but quotations with locals are better yet.

I also dislike the thought of two desirable features that don't work
together just because of an implementation limitation.

BruceMcF

unread,
Dec 15, 2011, 11:08:29 AM12/15/11
to
On Dec 15, 10:48 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> IMO it's certainly important enough that I will implement it.
> ... I think that quotations without locals are much better than
> no quotations, but quotations with locals are better yet.

Would locals in quotations extend the set of locals of the containing
definitions/quotations, or would only the locals defined within the
quotation be visible?

Either way, the containing definition gives a bounded namespace, so
requiring that local names not be duplicated within the containing
definition would allow any implementation that supports multiple
{ ... } sections to compile code written for implementations where
locals named within a quotation are discarded at the end of that
quotation.

Anton Ertl

unread,
Dec 15, 2011, 10:56:36 AM12/15/11
to
Bernd Paysan <bernd....@gmx.de> writes:
>Andrew Haley wrote:
>
>> Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>
>>> IMO using the names of locals that are defined in definitions or
>>> quotations outside the directly containing definition/quotation
>>> should be specified to be an ambiguous condition. So an ambitious
>>> implementor can implement full closures (and maybe they will catch
>>> on), but the rest will not be burdened with this requirement.
>>
>> This makes sense.
>
>I'd expect that closures in Forth would be more explicit than just
>having a local of an outside scope inside a quotation,

But that's the meaning of "closure". E.g., the definition from
<http://en.wikipedia.org/wiki/Closure_(computer_science)>:

|a closure (also lexical closure, function closure, function value or
|functional value) is a function together with a referencing
|environment for the non-local variables of that function.[1] A closure
|allows a function to access variables outside its typical scope.

> especially I'd
>expect closures to take their parameters from the stack. Something like
>
>: foo
> 3 4 [{: a b :} a b + ;] execute . ;

That's just quotations with locals and does not need a special syntax:

: foo
3 4 [: {: a b :} a b + ;] execute . ;

A closure would be:

: foo
3 4 {: a b :}
[: a b + ;] execute ;

Hmm, reading on, I see that what you have in mind is not quotations
with locals, but that does not come out in your example. Let's try a
different one:

: foo
3 4 [{: a b :} a b + ;] ;

foo 1 swap execute . . \ should print 7 1

So it takes numbers from the stack when the xt is pushed on the stack,
not when the EXECUTE happens. Yes, you would be able to simulate
closures with that, but would already run into the problems mentioned
below.

>would create a closure with two values. At least I would know how to
>implement this, whereas an implicit closure with out-of-scope locals...
>that's beyond the analysis capabilities I want a Forth compiler to have.

You don't need any particular analysis capabilities unless you want a
particularly efficient implementation. However, a simple,
straightforward implementation has two problems:

1) It increases the costs of stuff that does not need closures (should
be manageable with a little analysis).

2) Either the closures are restricted (do not execute them after the
lifetime of the free variables has ended for other reasons), or we
have to use automatic storage reclamation for them (which does not
really fit into Forth).

>Memory? Closures need dynamic memory allocation, so you probably have
>to explicitely free them in Forth.

If you are going that far, you can implement real closures almost as
easily. But given the different usage from quotations, they should be
separated from quotations, in syntax (which you do) and in
discussions. Hmm, one might also have closures with the lifetime
restriction and without explicit freeing, and syntactically different
closures without lifetime restriction that need to be freed
explicitly.

Andrew Haley

unread,
Dec 15, 2011, 11:40:27 AM12/15/11
to
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>>Bernd Paysan <bernd....@gmx.de> wrote:
>>> Ah, no closures. Just nested anonymous definitions. What would be nice
>>> is if the nested definitions could have their own locals.
>>
>>I've had a look at this, and it's not always going to be easy: you'd
>>have to be able to add some names to the locals list and strip them back
>>at the end of each quotation. I guess this is easy enough when locals
>>are implemented using a private wordlist -- you can just use MARKER .
>>But otherwise it might be rather tricky.
>>
>>Is this important, do you think? I can't make my mind up.
>
> IMO it's certainly important enough that I will implement it. Whether
> it will be important enough that others will implement it, too, will
> remain to be seen. I think that quotations without locals are much
> better than no quotations, but quotations with locals are better yet.

I can see the sense of that. On the other hand, if it's hard to do
fewer people will do it.

> I also dislike the thought of two desirable features that don't work
> together just because of an implementation limitation.

Sure, but that's Forth all over. The nice thing about *this* feature,
as it stands, is that you get a large linguistic advantage for almost
zero cost.

I'll have a look at what it takes.

Andrew.

Andrew Haley

unread,
Dec 15, 2011, 11:44:37 AM12/15/11
to
BruceMcF <agi...@netscape.net> wrote:
> On Dec 15, 10:48 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> wrote:
>> IMO it's certainly important enough that I will implement it.
>> ... I think that quotations without locals are much better than
>> no quotations, but quotations with locals are better yet.
>
> Would locals in quotations extend the set of locals of the containing
> definitions/quotations, or would only the locals defined within the
> quotation be visible?

The latter, as discussed earlier. Capturing locals from the enclosing
scope is not very easy, and figuring out when they may be deallocated
is hard: I think you'd need a garbage collector or people would have
to remember to deallocate the closure.

Andrew.

BruceMcF

unread,
Dec 15, 2011, 12:07:25 PM12/15/11
to
On Dec 15, 11:44 am, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:
> The latter, as discussed earlier.  Capturing locals from the enclosing
> scope is not very easy, and figuring out when they may be deallocated
> is hard: I think you'd need a garbage collector or people would have
> to remember to deallocate the closure.

Why is that? That is, with anonymous quotations of a sequence of Forth
code, as opposed to with the programming concept of closure?

With a pooled namespace, release the locals at ; ~ if the quotation is
created at the interpreter, release the locals at the ;] that returns
back to the interpreter. Set a a count at 0 from the command line and
at 1 in compiled state, increment with [: ... decrement with ;] ... if
you decrement to 0, release the locals, if not, ; releases the locals
as normal.

And with the approach of removing the names at each ;] then it seems
like it would be intrinsic to the approach, each [: would create a
control-flow stack frame that gives the information where to reset the
namespace to, each ;] would clean up locals created within that
quotation.

Anton Ertl

unread,
Dec 15, 2011, 12:16:02 PM12/15/11
to
BruceMcF <agi...@netscape.net> writes:
>On Dec 15, 11:44 am, Andrew Haley <andre...@littlepinkcloud.invalid>
>wrote:
>> The latter, as discussed earlier.  Capturing locals from the enclosing
>> scope is not very easy, and figuring out when they may be deallocated
>> is hard: I think you'd need a garbage collector or people would have
>> to remember to deallocate the closure.
>
>Why is that? That is, with anonymous quotations of a sequence of Forth
>code, as opposed to with the programming concept of closure?

If you have quotations that have access to the locals of the enclosing
definitions, you have closures. And since in general you can pass the
xt around to arbitrary places and execute it arbitrarily later, these
locals would have to live an arbitrary time.

>With a pooled namespace, release the locals at ;

Yes, restricting the timespan during which the xt can be executed
would be a compromise; there are, however, interesting uses of
closures that are not compatible with that restriction.

> ~ if the quotation is
>created at the interpreter, release the locals at the ;] that returns
>back to the interpreter.

Since there are no locals outside a colon definition, we don't need to
worry about a closure referencing locals at that level.

Andrew Haley

unread,
Dec 15, 2011, 12:25:45 PM12/15/11
to
BruceMcF <agi...@netscape.net> wrote:
> On Dec 15, 11:44 am, Andrew Haley <andre...@littlepinkcloud.invalid>
> wrote:
>> The latter, as discussed earlier. Capturing locals from the enclosing
>> scope is not very easy, and figuring out when they may be deallocated
>> is hard: I think you'd need a garbage collector or people would have
>> to remember to deallocate the closure.
>
> Why is that? That is, with anonymous quotations of a sequence of Forth
> code, as opposed to with the programming concept of closure?

I don't understand the distinction. If the quotation has copies of
the locals in an enclosing scope, it's a closure.

> With a pooled namespace, release the locals at ; ~ if the quotation is
> created at the interpreter, release the locals at the ;] that returns
> back to the interpreter. Set a a count at 0 from the command line and
> at 1 in compiled state, increment with [: ... decrement with ;] ... if
> you decrement to 0, release the locals, if not, ; releases the locals
> as normal.

Why would you want to release the locals when returning to the
interpreter? What if your program never returns to the interpreter?

Andrew.

BruceMcF

unread,
Dec 15, 2011, 12:44:02 PM12/15/11
to
On Dec 15, 12:16 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> If you have quotations that have access to the locals of the enclosing
> definitions, you have closures.  And since in general you can pass the
> xt around to arbitrary places and execute it arbitrarily later, these
> locals would have to live an arbitrary time.

So why ever release the values? Just release the names that the source
for the quotations and the containing definitions use to manipulate the
values. A headerless value ought to be as easy to embed in a
definition as a headerless definition.

BruceMcF

unread,
Dec 15, 2011, 12:40:01 PM12/15/11
to
On Dec 15, 12:25 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:
> Why would you want to release the locals when returning to the
> interpreter?  What if your program never returns to the interpreter?

Release the names, that is. Because if they were used in a later
definition, they'd be used in the compile-time scope of that
definition.

I still don't see why, if you want the *value itself* to persist
outside of the context of the containing definition, you'd *implement*
those as return-stack-frame locals. It seems to make more sense to
implement them as embedded anonymous values. But when the containing
definition closes, only the containing definition and the quotations
it contains that refer to the values would be able to access them
directly.


Anton Ertl

unread,
Dec 15, 2011, 12:55:23 PM12/15/11
to
BruceMcF <agi...@netscape.net> writes:
> A headerless value ought to be as easy to embed in a
>definition as a headerless definition.

Each closure can have many instances.

: n+ {: n -- :} [: n + ;] ;

defer foo
5 n+ is foo
3 foo \ produces 8
7 n+ is foo \ now we can release the earlier "5 n+" closure
\ unless we still have some other reference to it somewhere
3 foo \ produces 10

A program that produces closures inside a loop could leak a lot of
memory.
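
For comparison, something close to N+ can be built today without
quotations, by compiling the captured value into a :noname definition at
run time. This is only a sketch: it relies on the POSTPONEd compilation
semantics of LITERAL, + and ; executing while the :noname definition is
open, which works in Gforth but is at the edge of what the standard
guarantees. And every instance permanently consumes dictionary space,
which is precisely the leak described above:

```forth
\ Sketch: N+ without quotations.  At run time :noname opens an
\ anonymous definition; the POSTPONEd words then compile the
\ captured n, a +, and the closing ; into it.
: n+ ( n -- xt )
  :noname swap POSTPONE LITERAL POSTPONE + POSTPONE ; ;

defer foo
5 n+ is foo
3 foo .        \ prints 8
7 n+ is foo    \ the "5 n+" instance is now dead dictionary space
3 foo .        \ prints 10
```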

Andrew Haley

unread,
Dec 15, 2011, 1:11:14 PM12/15/11
to
BruceMcF <agi...@netscape.net> wrote:
> On Dec 15, 12:25 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
> wrote:
>> Why would you want to release the locals when returning to the
>> interpreter? What if your program never returns to the interpreter?
>
> Release the names, that is. Because if they were used in a later
> definition, they'd be used in the compile-time scope of that
> definition.
>
> I still don't see why, if you want the *value itself* to persist
> outside of the context of the containing definition, you'd
> *implement* those as return-stack-frame locals. It seems to make
> more sense to implement them as embedded anonymous values.

If the captured values don't live on the return stack, they have to
live somewhere else. And what memory space is that? The only thing
that might work is to ALLOCATE that space, and then you have the
problem of FREEing it.

Andrew.

BruceMcF

unread,
Dec 15, 2011, 12:33:00 PM12/15/11
to
On Dec 15, 11:44 am, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:
> The latter, as discussed earlier.  Capturing locals from the enclosing
> scope is not very easy, and figuring out when they may be deallocated
> is hard: I think you'd need a garbage collector or people would have
> to remember to deallocate the closure.

Why would anonymous values embedded within a definition but jumped
over have to be deallocated? After all, the *names* still can be
forgotten when the containing definition closes.

The extra machinery would seem to be rather propagating the value to
all copies of those anonymous values, if there were to be more than
one, so that if the container that generated the imported local value
modified that value, the quotation if executed separately would
execute on the correct value ... but the solution to that would be to
declare the inheritable locals up front so that the corresponding
anonymous value is created up front and used in lieu of an ordinary
volatile local.

The prior definition of locals to behave like volatile values whose
names vanish when the definition is over supports this ~ in the
ordinary locals declaration / value establishment, if the local has
already been declared to be heritable by a quotation, you use the
embedded anonymous value instead.

The *name* can still vanish when the containing definition is
completed.

So:

Defer bar1
Defer bar2

: foo {" a b "} { a b c d e f }
c ** d + e - TO a
c d * e + f - TO b
[: a b + * ;] [: a b - * ;] ;
IS bar1 IS bar2

... would see bar1 and bar2 multiply by the same factors until foo is
executed on a new set of inputs, then it would shift (ignoring the
fact that this case would be done more easily with named values
passing the factors and also that when I've run foo, I've immediately
trampled the values passed into a and b, but rather assuming that
there is some application where the technique would be of some
purpose).

BruceMcF

unread,
Dec 15, 2011, 1:22:56 PM12/15/11
to
On Dec 15, 12:25 pm, Andrew Haley <andre...@littlepinkcloud.invalid>
wrote:
> BruceMcF <agil...@netscape.net> wrote:

>> Why is that? That is, with anonymous quotations of a sequence of Forth
>> code, as opposed to with the programming concept of closure?

> I don't understand the distinction.  If the quotation has copies of
> the locals in an enclosing scope, it's a closure.

Not according to what I understood from what Anton said ~ AFAICT, he
said it has to be able to have multiple instances. The same xt
referring to multiple instances of the same definition ~ that seems
like the hard part.

Mind, the same semantics I described would support multiple
instances ... you'd need to have an additional level of indirection to
the anonymous values, and have a word to instantiate each additional
instance and presuming that instantiated the additional anonymous
values in allocated memory, a word to free that instance when you are
done with it. The xt created would execute the handle for the imported
values, then jump to the quotation code. But the first instance would
have the template of anonymous values embedded within it, so they
would not be allocated or freed.
