Dumb Forth question from non-user

Paul Rubin

unread,

Dec 27, 2003, 12:10:31 AM12/27/03

to

Suppose I want to write a function that sums the squares of two args.
In C, I could write:

int sum_of_squares(int a, b) {
int square(int);
return square(a) + square(b);
}
int square(int n) {
return n*n;
}

What if I want to do it in Forth? Can I write something like:

: sum_of_squares ( a b -- sum ) square swap square + ;
: square ( n -- square ) dup * ;

Notice that sum_of_squares refers to square but square hasn't been
defined yet. If that's valid, how does the implementation deal with
it? Is there something like a linker which remembers the undefined
symbols and backpatches them? If it's invalid, how do you do mutually
recursive functions? Thanks.

Doug Hoffman

unread,

Dec 27, 2003, 6:49:35 AM12/27/03

to

> From: Paul Rubin <http://phr...@NOSPAM.invalid>

>
> Can I write something like:
>
> : sum_of_squares ( a b -- sum ) square swap square + ;
> : square ( n -- square ) dup * ;
>
> Notice that sum_of_squares refers to square but square hasn't been
> defined yet.

As you have surmised, the normal approach in this simple case would be to
just define square first. But there are ways to keep the order as shown,
for example:

defer square

: sum_of_squares ( a b -- sum ) square swap square + ;

: (square) ( n -- square ) dup * ;
' (square) is square

Not sure if the above is ANS. It may be implementation specific. But you
get the idea.

> Is there something like a linker which remembers the undefined
> symbols and backpatches them?

Not that I've seen. But I wouldn't be surprised if someone has done it.
That would not be a common thing to do.

> If it's invalid, how do you do mutually recursive functions?

Not sure what you mean by 'mutually recursive'. But recursion is not a
problem with Forth.

Btw, that was not a 'dumb' question. ;-)

-Doug

Paul Rubin

unread,

Dec 27, 2003, 7:11:59 AM12/27/03

to

Doug Hoffman <dhof...@journey.com> writes:
> defer square
> : sum_of_squares ( a b -- sum ) square swap square + ;
> : (square) ( n -- square ) dup * ;
> ' (square) is square
>
> Not sure if the above is ANS. It may be implementation specific. But you
> get the idea.

I don't really understand the above but I should probably just look at
some more docs. I'm not familiar with defer or the quote operator.

> > If it's invalid, how do you do mutually recursive functions?
>
> Not sure what you mean by 'mutually recursive'. But recursion is not a
> problem with Forth.

I mean two functions f and g, where f calls g, and g calls f. As opposed
to single recursion where f just calls itself.

> Btw, that was not a 'dumb' question. ;-)

Thanks ;). It's dumb in the sense that I figure the answer would be
obvious if I slog through the documentation or actually install a
Forth system and try it out. But the tutorials that I looked at
didn't say.

jonah thomas

unread,

Dec 27, 2003, 7:46:28 AM12/27/03

to

Doug Hoffman wrote:
>>From: Paul Rubin <http://phr...@NOSPAM.invalid>

>>Can I write something like:

>>: sum_of_squares ( a b -- sum ) square swap square + ;
>>: square ( n -- square ) dup * ;

>>Notice that sum_of_squares refers to square but square hasn't been
>>defined yet.

> defer square

> : sum_of_squares ( a b -- sum ) square swap square + ;
> : (square) ( n -- square ) dup * ;
> ' (square) is square

> Not sure if the above is ANS. It may be implementation specific. But you
> get the idea.

That isn't ANS, and it is common practice. I don't know why it didn't
get into the standard.

You can define it on almost any standard forth like:

: BAD-DEFER -1 ABORT" uninitialised deferred word" ;
: DEFER ( S: "name" -- )
CREATE ['] BAD-DEFER ,
DOES> @ EXECUTE ;

: (IS) ( S: "name" xt -- )
' >BODY ! ;
: [IS] ( C: "name" -- ) ( E: xt -- )
' >BODY POSTPONE LITERAL POSTPONE ! ; IMMEDIATE

IS is usually defined state-smart which can lead to some problems, I
think two different words are better. If you write code with (IS) [IS]
and want to run it on a system that has IS you can define them with

: (IS)
POSTPONE IS ;
: [IS]
POSTPONE IS ; IMMEDIATE

And they'll probably work.

Marcel Hendrix

unread,

Dec 27, 2003, 8:12:57 AM12/27/03

to

Paul Rubin <http://phr...@NOSPAM.invalid> writes Re: Dumb Forth question from non-user

> Doug Hoffman <dhof...@journey.com> writes:
>> defer square
>> : sum_of_squares ( a b -- sum ) square swap square + ;
>> : (square) ( n -- square ) dup * ;
>> ' (square) is square

>> Not sure if the above is ANS. It may be implementation specific. But you
>> get the idea.

> I don't really understand the above but I should probably just look at
> some more docs. I'm not familiar with defer or the quote operator.

A simple-minded definition for DEFER and IS goes:

\ tools
: DEFER create 0 , does> @ execute ;
: IS ' >body ! ;

defer square
: sum_of_squares ( a b -- sum ) square swap square + ;

:NONAME ( n -- square ) dup * ; IS square
3 4 sum_of_squares . 25 ok

It does more than you need:

:NONAME ( n -- cube ) dup dup * * ; IS square
3 4 sum_of_squares . 91 ok

In general the name "square" should be hidden after assignment,
there is no standard Forth for that, however.

-marcel

jonah thomas

unread,

Dec 27, 2003, 9:41:33 AM12/27/03

to

Marcel Hendrix wrote:

> defer square
> : sum_of_squares ( a b -- sum ) square swap square + ;
> :NONAME ( n -- square ) dup * ; IS square
> 3 4 sum_of_squares . 25 ok

> It does more than you need:

> :NONAME ( n -- cube ) dup dup * * ; IS square
> 3 4 sum_of_squares . 91 ok

> In general the name "square" should be hidden after assignment,
> there is no standard Forth for that, however.

You can do it, if it's worth the effort.

VARIABLE SCRATCH
HERE 0 , SCRATCH !
: SUM_OF_SQUARES ( a b -- sum )
[ SCRATCH @ ] LITERAL @ EXECUTE
SWAP [ SCRATCH @ ] LITERAL @ EXECUTE + ;

We need the variable because some systems use the data stack for a
control-flow stack while compiling, so the address has to be stored
somewhere.

Many systems let you store the address on the return stack to get past :
or any control-flow items that might use the data stack, but there are
lots of limitations on using the return stack. Even if a system lets
you use it while interpreting, typically you have to clear it by the end
of every line of input. You could have a compiling word just for this:

: 1:
>R : R> ;

but then you'd likely need another one for two items

: 2:
2>R : 2R> ;

and that way leads more complications.

In general it's probably simpler to just use the name.

If you want to hide it, you can on any system that allows plenty of
wordlists.

WORDLIST CONSTANT SCRATCH

GET-CURRENT SCRATCH SET-CURRENT
: SQUARE ( S: n -- n^2 )
DUP * ;
SET-CURRENT

GET-ORDER SCRATCH SWAP 1+ SET-ORDER
: SUM_OF_SQUARES ( a b -- sum )
SQUARE SWAP SQUARE + ;
GET-ORDER NIP 1- SET-ORDER

So you can at least hide names in a hidden wordlist. This also may be
more trouble than it's worth unless you want to hide enough words to
justify commands to simplify the process.

Tyler

unread,

Dec 27, 2003, 2:20:10 PM12/27/03

to

Paul Rubin <http://phr...@NOSPAM.invalid> wrote in message news:<7xoetuz...@ruckus.brouhaha.com>...

> Doug Hoffman <dhof...@journey.com> writes:
> > defer square
> > : sum_of_squares ( a b -- sum ) square swap square + ;
> > : (square) ( n -- square ) dup * ;
> > ' (square) is square
> >
> > Not sure if the above is ANS. It may be implementation specific. But you
> > get the idea.
>
> I don't really understand the above but I should probably just look at
> some more docs. I'm not familiar with defer or the quote operator.

Since the Forth PhDs have given you the details, let me offer the
nickel tour version:

Basically, you're manually setting up something analogous to a
C++ virtual function. <defer> says, in effect, to save a spot in
the dictionary with the following name. <'> ("tick") says
to leave the address of the following word on the stack.
Finally, <is> amends the original dictionary entry for
square with the new address you left on the stack with <'>.

> I mean two functions f and g, where f calls g, and g calls f. As opposed
> to single recursion where f just calls itself.

defer f
defer g

: foo <stuff> g <more stuff> ;
: goo <stuff> f <more stuff> ;

' foo is f
' goo is g

That's not the only way to do it, of course. You might want to defer
only one of the words, or you could define a default behavior and
assign it to f and g when you first set them up, letting later
circumstances dictate whether to update f and g's behaviors.

The mechanics of deferred execution may vary by implementation, but
it's a handy technique besides recursion. You can use it to set context
information, for example.

Cheers,
Tyler

Alaric B Snell

unread,

Dec 27, 2003, 4:30:42 PM12/27/03

to

Tyler wrote:
> Basically, you're manually setting up something analogous to a
> C++ virtual function. <defer> says, in effect, to save a spot in
> the dictionary with the following name. <'> ("tick") says
> to leave the address of the following word on the stack.
> Finally, <is> amends the original dictionary entry for
> square with the new address you left on the stack with <'>.

My FORTH, since I'm going to be doing a lot of code generation, has a
variety of compiling words instead of LITERAL and POSTPONE. I've got
things like COMPILE-BEGIN and COMPILE-END to start and stop compilation
(required since things like optimisers may defer generation of actual
code for a few instructions; COMPILE-END flushes the pipeline), then
COMPILE-PUSH and COMPILE-FPUSH for literals, COMPILE-DATA-BEGIN and
COMPILE-DATA-END for things like strings (you do COMPILE-DATA-BEGIN,
write stuff with , and C, and then to COMPILE-DATA-END; the runtime
behaviour is to push a pointer to the compiled data), then the
interesting ones: COMPILE-CALL, COMPILE-TAIL-CALL, and COMPILE-JUMP.
These take an XT off of the stack and compile code to call, tailcall, or
jump to that code - tailcall and jump will be identical on most
platforms, but I put the distinction in to allow for reversing anything
done by ENTER that would normally be undone by RETURN (there's
COMPILE-ENTER and COMPILE-RETURN for those) other than the actual return
itself.

(getting slowly towards the point) There are four variants of each of
the three call/jump instructions; plain (as above), conditional,
deferred, and deferred-conditional.

The deferred variants, when they execute, do not expect an xt on the
stack - but they leave a cell on the stack, which is a backreference,
and later one can execute FILL-[CALL|TAIL-CALL|JUMP] with the
backreference and an XT to fill it in.

Anyway - it just occured to me that I could implement my DEFER by
creating a word with interpreted semantics of printing an error, and
compiling semantics of COMPILE-DEFERRED-CALL and saving the
backreference in a linked list off of a pointer stored in the body of
the word. Then the IS word would take an XT and a word name, check the
word was indeed deferred (better safe than sorry!), and walk the linked
list of backreferences, filling them in. Oh, and replace the interpreted
semantics with the given XT. I have seperate interpreted / compiling
semantics pointers in my dictionary (if the compiling semantics pointer
is null, then compile a call to the interpreted semantics).

Then I would have the linker spoken of by the original poster, without
having had to build it into the kernel or anything ;-)

ABS

Julian V. Noble

unread,

Dec 27, 2003, 4:26:47 PM12/27/03

to

"Julian V. Noble" wrote:
>
[deleted ]
>
> Basically one has the BNF definition of an expression:
>
> expr -> term | term & expr
>
> where & stands for + or - . In words, "an expression is a term or a term
> followed by a + or - sign, followed by an expression". The definition is
> inherently recursive.
>
> As I have programmed it,
>
> : <expr> \ expr -> term | term & expr
> $pop LOCALS| op beg end | \ pop the string-stack
> end beg [CHAR] + op_find ( ptr | false) \ find a + or -
> ?DUP IF ( ptr) \ found
> DUP c@ >R \ save it
> \ $stack:
> ( ptr) end OVER 1+ R> $push \ expr' op'
> ( ptr) 1- beg op $push \ term op
>
> \ expr' op' term op
> term expr \ handle the term, then the
> \ expression
>
> ELSE \ + or - not found
> end beg op $push \ reset $stack
> term \ handle term only
> THEN
> ;
>
> : <term> \ term -> factor | factor % term
> similar stuff
> ;
>
> : <factor> \ factor -> id | fp# | ( expr ) | f^f | function
> $pop LOCALS| op beg end | \ true => success
> op end beg try_f1^f2 ?exit \ CASE or SWITCH -like structure
> op end beg try_id ?exit
> op end beg try_fp# ?exit
> op end beg try_(expr) ?exit
> op end beg try_func ?exit
> ." Not a factor!" ABORT
> ;
>
> ' <expr> defines expr \ resolve forward refs
> ' <term> defines term
> ' <factor> defines factor
>
> The only place the actual code for <expr> deviates from the above
> (which would work fine, BTW) is rather than say "expr" I skip a
> step by saying RECURSE.
>
> --
> Julian V. Noble
> Professor Emeritus of Physics
> j...@lessspamformother.virginia.edu
> ^^^^^^^^^^^^^^^^^^
> http://galileo.phys.virginia.edu/~jvn/
>
> "Science knows only one commandment: contribute to science."
> -- Bertolt Brecht, "Galileo".

Opps! I see I forgot to tell you that before you can make the forward
references to term and expr they have to be defined as stub words using
v:

v: expr
v: term
v: factor

The last thing you do is resolve the forward references, as shown, by
getting the execution tokens (xt's) of the actual subroutines <expr>,
<term> and <factor> using ' ("tick"), and then using the word "defines"
to install the xt's in the proper places in the corresponding stub words.

And the way the stub words work is that they have an xt stored within them.
When a stub word executes, it fetches that xt (of some other word) from
that storage and EXECUTEs it. (This may be as simple as a subroutine CALL,
or it could be a jump to some code address, or something else, depending
on how the particular Forth was designed. The ANS route to standardization
lets you ignore all such details of implementation.)

--
Julian V. Noble
Professor Emeritus of Physics
j...@lessspamformother.virginia.edu
^^^^^^^^^^^^^^^^^^
http://galileo.phys.virginia.edu/~jvn/

"Science knows only one commandment: contribute to science."
-- Bertolt Brecht, "Galileo".

Julian V. Noble

unread,

Dec 27, 2003, 4:14:40 PM12/27/03

to

This is only a slightly dumb question. If you realize that each new
subroutine definition extends the dictionary of available subroutines,
and that it happens as they are entered from the input device (keyboard,
file, whatever) then you see that generally you must define square
before you define sum_of_squares. (And of course you know by now that
in Forth jargon, subroutines are called "words".)

If you want to defer a subroutine definition you can do it as several
have suggested, with the words DEFER and IS. However, these words are
not part of ANS Forth, even if so ubiquitous they are almost universal.

Jonah Thomas has objected to the usual IS because it is "state-smart":
that is, it must detect whether it is being used interactively or inside
a new word definition, and must then perform two different actions
accordingly. Moreover, to be able to work inside a definition, IS must
be IMMEDIATE meaning it is always executed rather than being compiled.

Among ANS Forth purists there is a tendency to eschew state-smart words
because there are (rare) instances where they can cause trouble. Thus,
for example, there is ' (retrieve a word's execution token, or xt, while
the system is in the interactive mode) and ['] (do it and record the
xt as a LITERAL inside a word being compiled). The latter is IMMEDIATE.

I personally prefer the following locution:

\ ----------------------------------------------------------------------
\ This is an ANS Forth program requiring the CORE wordset

: use( ' \ state-smart ' for syntactic sugar
STATE @ IF POSTPONE LITERAL THEN ; IMMEDIATE

: dummy_error CR ." Dummy definition not initialized!" ABORT ;

' dummy_error CONSTANT 'BadDummy

: v: CREATE BadDummy , DOES> @ EXECUTE ; \ create dummy def'n

: defines ' >BODY STATE @
IF POSTPONE LITERAL POSTPONE !
ELSE ! THEN ; IMMEDIATE
\ ----------------------------------------------------------------------

It uses state-smart words, but I'm not such a purist.

I use it for several purposes:

1. to pass a function name as an argument to another function,
as in use( FSQRT 0e0 1e0 1e-8 )integral (my integration
routine can then use any function).

2. to perform indirect recursion (the use you asked about).

3. And I am sure I have used it for other things, but can't recall
offhand.

An example of indirect recursion in an expression parser can be found at

http://www.phys.virginia.edu/classes/551.jvn.fall01/

under the link "Forth system and example programs". You are looking
for the file ftran201.htm ( save it as text, ftran201.f, and don't
forget to download the docs). It is a hypertext file becuase I have
included links to some files you will need if you want to use ftran201.f .

Basically one has the BNF definition of an expression:

expr -> term | term & expr

where & stands for + or - . In words, "an expression is a term or a term

followed by a + or - sign, followed by an expression. The definition is

Albert van der Horst

unread,

Dec 27, 2003, 10:33:27 AM12/27/03

to

In article <BC12DC50.D6E7%dhof...@journey.com>,

Doug Hoffman <dhof...@journey.com> wrote:
>> From: Paul Rubin <http://phr...@NOSPAM.invalid>
>>
>> Can I write something like:
>>
>> : sum_of_squares ( a b -- sum ) square swap square + ;
>> : square ( n -- square ) dup * ;
>>
>> Notice that sum_of_squares refers to square but square hasn't been
>> defined yet.
>
>As you have surmised, the normal approach in this simple case would be to
>just define square first. But there are ways to keep the order as shown,
>for example:
>
>defer square
>: sum_of_squares ( a b -- sum ) square swap square + ;
>: (square) ( n -- square ) dup * ;
>' (square) is square
>
>Not sure if the above is ANS. It may be implementation specific. But you
>get the idea.

It is not "implementation specific". Implementation specific means
that the DEFER and IS could not be defined in portable ISO Forth
without knowing internals of the implementation. This is clearly not
the case:

For the above you could add at the top :

: DEFER CREATE 0 , DOES> @ EXECUTE ;
: IS ' >BODY ! ;

But the following is better suited to convince anybody that
no magic is required :

variable square-forward

( a b -- sum )

: sum_of_squares
square-forward @ EXECUTE swap square-forward @ EXECUTE + ;

: square ( n -- square ) dup * ;

' square square-forward !

This is a bit long winded, and that is the reason some prefer
a DEFER-construct.

>> Is there something like a linker which remembers the undefined
>> symbols and backpatches them?
>
>Not that I've seen. But I wouldn't be surprised if someone has done it.
>That would not be a common thing to do.
>
>> If it's invalid, how do you do mutually recursive functions?
>
>Not sure what you mean by 'mutually recursive'. But recursion is not a
>problem with Forth.

You have to do something as described above to have mutually
recursive functions. But indeed some Forths have mechanism for
automatic forward definitions. It seems one of those things
that everybody tries to implement in his Forth, and then
never it is never used.
Because of the constant awareness of compilation order
Forth folks tend not to use mutually recursive functions,
even where appropriate.

>-Doug
--
Albert van der Horst,Oranjestr 8,3511 RA UTRECHT,THE NETHERLANDS
One man-hour to invent,
One man-week to implement,
One lawyer-year to patent.

Paul Rubin

unread,

Dec 27, 2003, 7:44:17 PM12/27/03

to

alb...@spenarnc.xs4all.nl (Albert van der Horst) writes:
> But the following is better suited to convince anybody that
> no magic is required :
>
> variable square-forward
> ( a b -- sum )
> : sum_of_squares
> square-forward @ EXECUTE swap square-forward @ EXECUTE + ;
> : square ( n -- square ) dup * ;
> ' square square-forward !
>
> This is a bit long winded, and that is the reason some prefer
> a DEFER-construct.

That's not really the same, it's an indirect call to the square word,
more work for the interpreter and so forth. I hope the DEFER-based
solutions don't amount to the same thing.

> You have to do something as described above to have mutually
> recursive functions. But indeed some Forths have mechanism for
> automatic forward definitions. It seems one of those things that
> everybody tries to implement in his Forth, and then never it is
> never used. Because of the constant awareness of compilation order
> Forth folks tend not to use mutually recursive functions, even where
> appropriate.

That's basically what I wanted to know. Thanks.

jonah thomas

unread,

Dec 27, 2003, 9:52:21 PM12/27/03

to

Julian V. Noble wrote:

> Jonah Thomas has objected to the usual IS because it is "state-smart":
> that is, it must detect whether it is being used interactively or inside
> a new word definition, and must then perform two different actions
> accordingly. Moreover, to be able to work inside a definition, IS must
> be IMMEDIATE meaning it is always executed rather than being compiled.

> Among ANS Forth purists there is a tendency to eschew state-smart words
> because there are (rare) instances where they can cause trouble. Thus,
> for example, there is ' (retrieve a word's execution token, or xt, while
> the system is in the interactive mode) and ['] (do it and record the
> xt as a LITERAL inside a word being compiled). The latter is IMMEDIATE.

I wonder if DEFER and state-smart belong in the FAQ? I wonder if
they're there, I haven't looked at it for a long time.

Maybe it's time for another explanation about state-smart. Maybe if
somebody thinks my explanation leaves out something important they'll
write another.

The problem is that sometimes you want words to execute at compile-time,
and sometimes you want them to execute at run-time. And naive users
would like to see the same word *look like* it's doing the same thing
either way.

Like, CHAR gives you the numeric value for a character. CHAR A etc.
What if you want to do that while compiling, and get the value when the
routine executes? Like

CHAR A OF ... ENDOF

and so on. We like to say you can just type in the code at the keyboard
and watch it work. But one way we want to get a string from the input
and return the value of the first character, and the other way we want
to get a string from the input and compile that value as a literal. Two
different things, and we want them to look the same to the guy who has
just learned barely enough Forth to use it as a debugger for something else.

So we could give CHAR two different behaviors, and use one of them while
interpreting and the other while compiling. We could use the state
variable STATE to find out whether we're compiling or not, STATE is set
while compiling and reset while interpreting.

But then how would we write a word that would do CHAR later instead of
doing it right now? We could have a special word [COMPILE] or POSTPONE
that will compile the next word it finds instead of executing it. Then
when this command runs, it will either get a character or get a
character and compile it as a literal in some other definition, depending.

The more sophisticated the word, the more confusion you can get into.
Try to compile compiling words into other compiling words that go into
still other compiling words, and there are lots of chances to mess up.

There is 1. the state-smart word that's just been found in the input
stream. There's 2. the word that's being defined now. There's what
happens when #2 executes while interpreting, and what happens when #2
executes while compiling another 3. word. There's what happens when #3
executes while interpreting or while compiling another 4. word.

Sometimes the result you want is to do something now and compile
something into #2, or compile into #3, or compile into #4. Sometimes
the result you want is to do something when #2 executes and compile into
#3 or into #4. And so on. It can be hard to sort through all the
possibilities and get the right result.

What's even harder is that sometimes you care about whether the state
smart word is found *in the source code*. But the word itself doesn't
know whether it was just now found in the source code or not. All it
knows is whether STATE is on or off.

And further, we have two different methods in use. You can delay
execution by using POSTPONE OR [COMPILE] or you can use ' to get an
execution token to later EXECUTE . And these two methods are subtly
incompatible.

It isn't that the code is buggy. It's that there are a lot of different
things you might want it to do, and in complicated situations it doesn't
do what you'd naively expect. When it surprises you, you can go through
the details and eventually understand just what it does and why, and it
might still be hard to figure out how to make it do what you want. The
things that got set up in the first place to make it easy to use,
eventually break down and make it hard to use. And you don't have to go
very far into making compiling words before it may break down.

One way to deal with this is to avoid making your own compiling words.
If you never make a word IMMEDIATE then you won't have many problems.
But that removes a lot of the power. Extending the compiler to do what
you want is one of Forth's big strengths.

It helps to make two versions of words, one to use when interpreting and
the other to use when compiling, and don't mix them. And any word that
POSTPONEs a compiling word is itself a compiling word that you never use
while interpreting. Then at least you don't get the confusion between
what to do when interpreting and what to do when compiling.

What would be simpler, would be to just have the interpret version and
when you want to compile something, then compile it. Instead of CHAR
and [CHAR] you could have just CHAR and do

... [ CHAR A ] LITERAL ...

Instead of ' and ['] do

... [ ' ROT ] LITERAL ...

and so on. If we're going to do this sort of thing a lot we could have
a word like ]L which compiles a literal or ]2L which compiles two of
them. It's more complicated for casual users than having state-smart
words that pretend they do the same thing, but it's one simple rule to
learn. I have the idea that ColorForth does exactly this, but I don't
understand it well enough to be sure.

Jonah Thomas

unread,

Dec 27, 2003, 9:54:13 PM12/27/03

to

Paul Rubin wrote:
> alb...@spenarnc.xs4all.nl (Albert van der Horst) writes:

>>This is a bit long winded, and that is the reason some prefer
>>a DEFER-construct.

> That's not really the same, it's an indirect call to the square word,
> more work for the interpreter and so forth. I hope the DEFER-based
> solutions don't amount to the same thing.

DEFER is exactly the same thing when it's done in high-level Forth.
What would you have it do?

You might compile an address to call and later put code at that address.
Or you could leave a blank call and later when you know what you're
calling fill it in the old code.

But there's at present no portable way to do that. Forth systems vary
so much in the details, and nobody has tried to standardise a way to do
it. There are standard ways to forget all the code laid down after a
particular time, but no standard way to modify a little piece of code
after it's laid down. So turning data into code is the way to go if you
don't want to find something specific to your particular system.

Paul Rubin

unread,

Dec 27, 2003, 10:13:58 PM12/27/03

to

Jonah Thomas <j2th...@cavtel.net> writes:
> > That's not really the same, it's an indirect call to the square word,
> > more work for the interpreter and so forth. I hope the DEFER-based
> > solutions don't amount to the same thing.
>
> DEFER is exactly the same thing when it's done in high-level Forth.
> What would you have it do?

What a normal assembler does in that situation (say when you assemble
separate modules) is write a relocation table along with each module.
Then a separate linking pass puts everything together and backpatches
the undefined references.

> But there's at present no portable way to do that. Forth systems vary
> so much in the details, and nobody has tried to standardise a way to do
> it. There are standard ways to forget all the code laid down after a
> particular time, but no standard way to modify a little piece of code
> after it's laid down. So turning data into code is the way to go if you
> don't want to find something specific to your particular system.

I guess my question was partially cultural, i.e. not just "how can you
do this in Forth", but "how do Forth programmers deal with this?".
The answer seems to be "don't do that", which I guess has been working
ok for the people who actually face the question in practice, so
that's empirically a good enough answer.

John Passaniti

unread,

Dec 27, 2003, 10:24:01 PM12/27/03

to

"Paul Rubin" <http://phr...@NOSPAM.invalid> wrote in message

news:7xoetux2...@ruckus.brouhaha.com...

> Suppose I want to write a function that sums the squares of two args.
> In C, I could write:
>
> int sum_of_squares(int a, b) {
> int square(int);
> return square(a) + square(b);
> }
> int square(int n) {
> return n*n;
> }

Much evil in the C world comes from this top-down style of programming. By
simply changing to a bottom-up style (putting the definition for square
before sum_of_squares, the problem you describe simply goes away in Forth
and in C, you don't have the syntatic noise of the forward declaration of
square inside sum_of_squares.

If for some insane reason you must code top-down, at the very least place
the forward reference *outside* the function. Inside the function, you are
doing more than providing a forward reference. You are also potentially
preventing the compiler from warning you that you have changed the prototype
for the function. This happens because the forward reference is scoped by
the function. Witness the following, where I change the return type of
square in the forward declaration:

int sum_of_squares(int a, b) {
float square(int);
return square(a) + square(b);
}

int square(int n) {
return n*n;
}

Under GCC and most other C compilers, this compiles just fine and produces
no warnings from either the compiler or linker. More important is when you
run it, sum_of_squares produces garbage. If the forward reference is placed
outside the function, then an appropriate warning is generated.

> What if I want to do it in Forth? Can I write something like:
>
> : sum_of_squares ( a b -- sum ) square swap square + ;
> : square ( n -- square ) dup * ;

First answer is don't. Don't fight the language and define your words
bottom-up.

Second answer is to use DEFER/IS. This is portable and readable, but not
exactly what you want. Unlike the linker in C that will generate code to
directly call square, DEFER/IS results in an indirect function call. So,
you're introducing a level of call nesting that doesn't need to be there.

Some Forths have a means to do what you want. Some Forths (I've seen it in
cross-compiling Forths) will maintain a list of unresolved forward
references and then backpatch the call address as necessary. But that isn't
ROMable and it certainly isn't portable.

> Notice that sum_of_squares refers to square but square
> hasn't been defined yet. If that's valid, how does the
> implementation deal with it? Is there something like a
> linker which remembers the undefined symbols and
> backpatches them? If it's invalid, how do you do mutually
> recursive functions? Thanks.

DEFER/IS is the only portable way I'm aware of to handle the case of mutual
recursion. Other proprietary methods may be implemented in specific Forths.

John Passaniti

unread,

Dec 27, 2003, 10:27:32 PM12/27/03

to

"John Passaniti" <nn...@JapanIsShinto.com> wrote in message
news:l1sHb.36117$9y....@news02.roc.ny...

> DEFER/IS is the only portable way I'm aware of to handle the
> case of mutual recursion. Other proprietary methods may be
> implemented in specific Forths.

Whoops. I thought DEFER/IS were part of Standard Forth. I guess not.
Regardless, they can be defined using Standard Forth words, as was done
elsewhere in this thread.

Gary Chanson

unread,

Dec 28, 2003, 12:41:19 AM12/28/03

to

"Albert van der Horst" <alb...@spenarnc.xs4all.nl> wrote in message
news:HqK8Js.GMK...@spenarnc.xs4all.nl...

>
> You have to do something as described above to have mutually
> recursive functions. But indeed some Forths have mechanism for
> automatic forward definitions. It seems one of those things
> that everybody tries to implement in his Forth, and then
> never it is never used.

Quest32 supports automatic forward referencing and I use it fairly
regularly in larger programs. Most often, this is because it's simply
convenient to group logically related functions together and there's no
convenient way to load everything in exactly the right order. Occasionally,
it's because there's no practical way to avoid all forward references.

--

-GJC [MS Windows SDK MVP]
-Software Consultant (Embedded systems and Real Time Controls)
-gcha...@mvps.org

-Abolish public schools

Doug Hoffman

unread,

Dec 28, 2003, 6:35:56 AM12/28/03

to

> From: Paul Rubin <http://phr...@NOSPAM.invalid>
>

> I guess my question was partially cultural, i.e. not just "how can you
> do this in Forth", but "how do Forth programmers deal with this?".
> The answer seems to be "don't do that", which I guess has been working
> ok for the people who actually face the question in practice, so
> that's empirically a good enough answer.

I don't think "don't do that" is really the way to look at it. I, like Gary
Chanson, use deferred words quite a bit in larger projects for the same
reason(s) he cites. Sometimes it is just plain convenient to do so compared
to re-arranging your code.

I presume the example you gave was just for illustration since it would be
insane to not put the one definition in front of the other in such a simple
case.

-Doug

Paul Rubin

unread,

Dec 28, 2003, 7:00:19 AM12/28/03

to

Doug Hoffman <dhof...@journey.com> writes:
> I presume the example you gave was just for illustration since it
> would be insane to not put the one definition in front of the other
> in such a simple case.

Yes, of course. But I do tend to code in a top-down fashion, which
means forward references unless I reorder the code after writing it.

jonah thomas

unread,

Dec 28, 2003, 8:21:28 AM12/28/03

to

I've found for small projects there's no problem reordering the code.

I can program top-down by writing a high-level definition first, then
write a word it calls *above* it in the file, and write a word which
that word calls above *it* and so on. It isn't hard to start at the end
of the file and work toward the beginning.

But when I do it that way it may turn out that the natural way to write
the low-level definitions leaves me rewriting the later code since the
way I was thinking at the beginning just didn't turn out well. This is
because I tend to do things that look challenging, that I'm not sure
exactly how to do. When it's clear ahead of time how to do it then
there's no problem writing top down or bottom up either way.

When I'm not clear about the details it's easier to get clear by coding
than by thinking real hard first and doing it all in my head and then
coding. (With a language where coding was harder it would be easier to
think first.)

So when I start top-down usually I wind up throwing away a lot of
top-level code that didn't work out. That's OK, it helped me figure out
what I needed to do, and it wasn't tested code anyway, it was just
specifications written to look like Forth. And when I start bottom-up I
wind up throwing away some tested code that it turned out I didn't need.
That's OK too, it was low-level short routines that were easy to code
and test, and having tested code to build the next level with meant I
could test the next level as I went, so it was all easier than writing a
lot of code and testing it later.

Doug Hoffman

unread,

Dec 28, 2003, 8:28:10 AM12/28/03

to

> From: Paul Rubin <http://phr...@NOSPAM.invalid>
>

OK. In that case Forth may not be the language for you.

One thing I should point out about part of the "Forth style" of programming.
The idea is to code and test subordinate routines *before* one continues on
to code another routine that calls the first. That may have been a clumsy
way to state it but hopefully you get the idea. It really does make a lot
of sense to "test" a component of your code before use. With Forth's
interactivity and parameter passing via the stack this is a very natural way
to program and leads to fewer bugs in the finished product, IMHO.

-Doug

Ed Beroset

unread,

Dec 28, 2003, 9:26:47 AM12/28/03

to

Doug Hoffman wrote:
>>From: Paul Rubin <http://phr...@NOSPAM.invalid>
>>
>>Doug Hoffman <dhof...@journey.com> writes:
>>
>>>I presume the example you gave was just for illustration since it
>>>would be insane to not put the one definition in front of the other
>>>in such a simple case.
>>
>>Yes, of course. But I do tend to code in a top-down fashion, which
>>means forward references unless I reorder the code after writing it.
>
>
> OK. In that case Forth may not be the language for you.
>
> One thing I should point out about part of the "Forth style" of programming.
> The idea is to code and test subordinate routines *before* one continues on
> to code another routine that calls the first. That may have been a clumsy
> way to state it but hopefully you get the idea. It really does make a lot
> of sense to "test" a component of your code before use.

I find that I do some of each -- top down and bottom up, depending on
the particular portion of the code. Sometimes I work on some of the
high level words and get the logic correct on a PC first with stubs for
some of the low level code. As an example, I might want to make sure
that the task switching works as intended with the high level code, and
provide only "toy" data from a simulated EEPROM interface word. This is
often easier than doing everything on the target. It can also be
superior for test purposes, too. I can run a lot more automated testing
in a given period of time on a 3.2GHz Pentium 4 machine than on a 3MHz
embedded processor.

Ed

Gary Chanson

unread,

Dec 28, 2003, 3:49:20 PM12/28/03

to

"Paul Rubin" <http://phr...@NOSPAM.invalid> wrote in message

news:7xsmj5n...@ruckus.brouhaha.com...

Top-down design might be advantageous (but I have some doubts about even
that), but top-down implementation is a mistake. I typically start
somewhere in the middle and work out (but if you look at my code, you'd
probably think I start at the bottom).

andr...@littlepinkcloud.invalid

unread,

Dec 28, 2003, 4:44:43 PM12/28/03

to

Tyler <nl...@junglemate.com> wrote:
> Paul Rubin <http://phr...@NOSPAM.invalid> wrote in message news:<7xoetuz...@ruckus.brouhaha.com>...
>> Doug Hoffman <dhof...@journey.com> writes:
>> > defer square
>> > : sum_of_squares ( a b -- sum ) square swap square + ;
>> > : (square) ( n -- square ) dup * ;
>> > ' (square) is square
>> >
>> > Not sure if the above is ANS. It may be implementation specific. But you
>> > get the idea.
>>
>> I don't really understand the above but I should probably just look at
>> some more docs. I'm not familiar with defer or the quote operator.

> Since the Forth PhDs have given you the details, let me offer the
> nickel tour version:

> Basically, you're manually setting up something analogous to a
> C++ virtual function.

No it isn't, it's a pointer to a function. As in:

#include <stdio.h>

int (*square) (int);

int
real_square (int n)
{
return n*n;
}

int
main(void)
{
square = real_square;
printf ("%d\n", (*square)(9));
return 0;
}

No C++ needed.

Andrew.

Julian V. Noble

unread,

Dec 30, 2003, 11:36:45 AM12/30/03

to

Actually, I think this is an excellent explanation. Thanks, Jonah.

--
Julian V. Noble
Professor Emeritus of Physics
j...@lessspamformother.virginia.edu
^^^^^^^^^^^^^^^^^^
http://galileo.phys.virginia.edu/~jvn/

"God is not willing to do everything, and thereby take away our free wiil
and that share of glory that rightfully belongs to ourselves."

-- N. Machiavelli, "The Prince".

Richard Snow

unread,

Jan 2, 2004, 2:53:10 PM1/2/04

to

Gary Chanson wrote:

>
> "Paul Rubin" <http://phr...@NOSPAM.invalid> wrote in message
> news:7xsmj5n...@ruckus.brouhaha.com...
>> Doug Hoffman <dhof...@journey.com> writes:
>> > I presume the example you gave was just for illustration since it
>> > would be insane to not put the one definition in front of the other
>> > in such a simple case.
>>
>> Yes, of course. But I do tend to code in a top-down fashion, which
>> means forward references unless I reorder the code after writing it.
>
> Top-down design might be advantageous (but I have some doubts about
> even
> that), but top-down implementation is a mistake. I typically start
> somewhere in the middle and work out (but if you look at my code, you'd
> probably think I start at the bottom).
>

I have a method in my Michael language that solves this
problem for me. I have "Include files".

to include a file containing some code to be executed, I use
#filename

These can be nested up 20 levels. So I write the code
topdown, and then when the compiler complains about
an undefined token, I insert an include in tne appropriate
place. Then I compile again. I then get a message about an
"include file" not being found, I can code that file, and
try includeing my top level definition again. All of this
becomes fairly natural after a while and lets me design
top down but test bottom up.

Richard

--
Richard S. Snow
http://veteransinaction.org
http://richardsnow.is-a-geek.net (test server)

Gary Chanson

unread,

Jan 2, 2004, 9:12:32 PM1/2/04

to

"Richard Snow" <richa...@bellsouth.net> wrote in message
news:6ZjJb.11719$ml6....@bignews4.bellsouth.net...

>
> I have a method in my Michael language that solves this
> problem for me. I have "Include files".
>
> to include a file containing some code to be executed, I use
> #filename

And so do I...

> These can be nested up 20 levels. So I write the code
> topdown, and then when the compiler complains about
> an undefined token, I insert an include in tne appropriate
> place. Then I compile again. I then get a message about an
> "include file" not being found, I can code that file, and
> try includeing my top level definition again. All of this
> becomes fairly natural after a while and lets me design
> top down but test bottom up.

Yes, but what do you do when a function in file A wants to call a
function in file B and vice versa? Sometimes there isn't a convenient way
to subdivide a program into files without tripping over this problem.

Anil Rodrigues

unread,

Jan 2, 2004, 10:47:44 PM1/2/04

to

I use simple Forth in embedded controls.
As for " f calls g, and g calls f", this is something I specifically would
avoid, though I
could do it with some convolutions;
every call pushes the Return stack, traditionally, so there is the real danger
of it
quickly overflowing.

Cheers! Anil

Paul Rubin wrote:I guess my question was partially cultural, i.e. not just

Jeff Massung

unread,

Jan 2, 2004, 11:48:36 PM1/2/04

to

"Gary Chanson" <gcha...@No.Spam.TheWorld.net> wrote in
news:kypJb.20756$x34....@nwrdny02.gnilink.net:

> Yes, but what do you do when a function in file A wants to call a
> function in file B and vice versa? Sometimes there isn't a convenient
> way to subdivide a program into files without tripping over this
> problem.
>

Every time I think I've run into this problem, it really isn't. In my
experience, the reason A needs B and B needs A is because they both do C
and D. Just factor out the common code and it falls into place.

--
Best regards,
Jeff Massung jma[at]mfire.com
http://www.jm-basic.com/

Paul Rubin

unread,

Jan 2, 2004, 11:58:08 PM1/2/04

to

Jeff Massung <jma[at]nospam.mfire.com> writes:
> Every time I think I've run into this problem, it really isn't. In my
> experience, the reason A needs B and B needs A is because they both do C
> and D. Just factor out the common code and it falls into place.

There's lots of reasons for wanting mutual recursion. Writing
recursive descent parsers is a typical application.

Gary Chanson

unread,

Jan 3, 2004, 1:27:19 AM1/3/04

to

"Jeff Massung" <jma[at]nospam.mfire.com> wrote in message
news:Xns9464DDDCFCAB0...@216.168.3.50...

> "Gary Chanson" <gcha...@No.Spam.TheWorld.net> wrote in
> news:kypJb.20756$x34....@nwrdny02.gnilink.net:
>
> > Yes, but what do you do when a function in file A wants to call a
> > function in file B and vice versa? Sometimes there isn't a convenient
> > way to subdivide a program into files without tripping over this
> > problem.
> >
>
> Every time I think I've run into this problem, it really isn't. In my
> experience, the reason A needs B and B needs A is because they both do C
> and D. Just factor out the common code and it falls into place.

Sometime that does work. Sometimes it isn't convenient.

Anton Ertl

unread,

Jan 16, 2004, 11:47:08 AM1/16/04

to

jonah thomas <j2th...@cavtel.net> writes:
>I wonder if DEFER and state-smart belong in the FAQ?

Maybe.

> I wonder if
>they're there, I haven't looked at it for a long time.

Currently they are not.

>Maybe it's time for another explanation about state-smart.

It's quite good, but too long for full inclusion in the FAQ. If you
put it on a web page, I will put a few introductory remarks and a link
to it and to some other stuff on the topic in the FAQ.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html