question about forth internals


p da

Apr 17, 2025, 2:37:45 PM
to 4th-co...@googlegroups.com
Maybe this will be off-topic but I was reading about forth internals and there's something I didn't fully understand.

In most Forths, words are executed by what is called the "inner loop", or NEXT, with the help of EXIT (compiled by ;), which returns from a nested word, and ENTER, which enters a new colon word and starts a new nested run of NEXT. When a word inside a colon definition is to be executed, ENTER is called. It pushes the current IP (which points to the next word to execute in the calling word) and sets up the registers to start a new thread in the called colon definition. That thread executes word after word until it reaches EXIT, which pops the IP from the return stack and so resumes the calling word at its next word. The calling word in turn ends with an EXIT that unnests to its own caller, and so on.

But my question is: what about the first one? EXIT always pops IP, so it assumes an IP is on the return stack. That does not seem to be always the case, because when a word is executed from the main loop there is no ENTER routine pushing IP to the return stack. As far as I know, executing a word by typing it at the Forth "terminal" just means looking the word up in the dictionary and EXECUTEing it, so I supposed it would be EXECUTE that pushes IP onto the return stack. But the EXECUTE implementations I have looked at don't do that, and neither do INTERPRET or QUIT. So when a colon word is invoked by typing it at the Forth REPL (or executed with ' WORD EXECUTE), where does it return to when it executes EXIT?

Thanks in advance for any hint.

regards.


The Beez

Apr 18, 2025, 7:59:43 AM
to 4tH-compiler
Note there is a huge difference between 4tH and Forth. 4tH is more like a virtual processor with opcodes, which are executed in quick succession, while Forth is usually indirectly threaded. However, the effect is essentially the same - we jump around till we hit the machine code (in 4tH: compiled C).

Forth has an inner and an outer interpreter. The text interpreter looks like this:
: interpret ( -- )
  BEGIN ( )
    bl word dup c@                     \ scan next token
  WHILE ( c-addr )                     \ another token found
    find dup 1 = IF                    \ lookup in dictionary
      drop execute                     \ immediate
    ELSE
      dup -1 = IF                      \ everything else
        drop state @
        IF compile, ELSE execute THEN
      ELSE                             \ word not found, number?
        drop 0 0 rot count over c@ [char] - = dup >r
        IF 1- swap char+ swap THEN >number
        IF #notfound throw THEN drop drop r>
        IF negate THEN state @
        IF postpone LITERAL THEN       \ compile literal
      THEN
    THEN
  REPEAT drop ;

If FIND returns -1, what happens next depends on STATE. If STATE is zero, the word is EXECUTEd, and that's it. We're done. ENTER is usually defined like:

PUSH IP ; onto the "return address stack"
W+offset -> IP ; W still points to the Code Field, so W+offset is the address of the body
JUMP NEXT

You can clearly see who is dumping the address onto the return stack. EXIT is:

POP IP ; from the "return address stack"
JUMP NEXT

And there the return address is recovered. Subroutine threaded code is even simpler. It's just a list of CALLs. CALL (as a machine instruction) will dump a return address on the stack, which is retrieved by RET. The text interpreter is just an assembler of call threads.
In short, every high-level Forth word dumps its return address and recurses into subsequent Forth words until an EXIT is encountered.
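
To make the top-level case concrete, here is a toy sketch in C of an indirect-threaded inner interpreter. It is not taken from any real Forth, and every name in it (do_enter, outer_thread, run_demo and so on) is made up for the illustration. It assumes, as is the case in many classic implementations, that the outer interpreter is itself a thread running under NEXT. Under that assumption IP already points into the outer interpreter's thread when EXECUTE jumps to a colon word, so ENTER pushes that IP and the final EXIT returns there, although EXECUTE itself pushes nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy indirect-threaded inner interpreter. Every name is made up for this sketch. */
typedef struct Word Word;
struct Word {
    void (*code)(Word *w);   /* code field: routine run when the word executes */
    Word **body;             /* parameter field: thread of a colon word, else NULL */
};

static Word   **ip;          /* IP: next cell of the current thread */
static Word   **rstack[16];  /* return stack holds saved IPs */
static intptr_t dstack[16];  /* data stack */
static int      rsp, dsp, running;

/* NEXT: fetch the word IP points at, advance IP, run the word's code. */
static void next_loop(void) {
    while (running) {
        Word *w = *ip++;
        w->code(w);
    }
}

/* ENTER saves the caller's IP; EXIT restores it. */
static void do_enter(Word *w) { assert(rsp < 16); rstack[rsp++] = ip; ip = w->body; }
static void do_exit(Word *w)  { (void)w; assert(rsp > 0); ip = rstack[--rsp]; }
static void do_lit(Word *w)   { (void)w; dstack[dsp++] = (intptr_t)*ip++; } /* push next cell */
static void do_one(Word *w)   { (void)w; dstack[dsp++] = 1; }
static void do_plus(Word *w)  { (void)w; dsp--; dstack[dsp - 1] += dstack[dsp]; }
static void do_bye(Word *w)   { (void)w; running = 0; }

/* EXECUTE pushes nothing on the return stack: it only runs the xt's code field. */
static void do_execute(Word *w) {
    (void)w;
    Word *xt = (Word *)dstack[--dsp];
    xt->code(xt);
}

static Word w_exit    = { do_exit,    NULL };
static Word w_lit     = { do_lit,     NULL };
static Word w_one     = { do_one,     NULL };
static Word w_plus    = { do_plus,    NULL };
static Word w_bye     = { do_bye,     NULL };
static Word w_execute = { do_execute, NULL };

/* : TWO 1 1 + ; */
static Word *two_body[] = { &w_one, &w_one, &w_plus, &w_exit };
static Word  w_two      = { do_enter, two_body };

/* Stand-in for the outer interpreter's own thread:  ' TWO EXECUTE BYE */
static Word *outer_thread[] = { &w_lit, &w_two, &w_execute, &w_bye };

static intptr_t run_demo(void) {
    rsp = dsp = 0;
    running = 1;
    ip = outer_thread;       /* IP is already valid before TWO is ever entered */
    next_loop();
    return dstack[dsp - 1];  /* result left by TWO */
}
```

Calling run_demo() runs TWO through EXECUTE and leaves the return stack balanced: the IP that ENTER saved is simply the position in outer_thread right after EXECUTE.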

But frankly, are you sure you're at the correct forum? These are questions concerning the classic Forth architecture, not the Forth language or the 4tH compiler. As a matter of fact, I've never written a classic Forth architecture since 4tH's architecture is radically different. 4tH implements the Forth language, not the architecture. So to be honest, I'm not the best person to ask those questions. It's not quite my expertise.

Hans Bezemer

p da

Apr 18, 2025, 9:37:08 AM
to 4th-co...@googlegroups.com
On Fri, Apr 18, 2025 at 1:59 PM The Beez <the.bee...@gmail.com> wrote:

 

> But frankly, are you sure you're at the correct forum? These are questions concerning the classic Forth architecture, not the Forth language or the 4tH compiler. As a matter of fact, I've never written a classic Forth architecture since 4tH's architecture is radically different. 4tH implements the Forth language, not the architecture. So to be honest, I'm not the best person to ask those questions. It's not quite my expertise.

Many thanks for your reply, which is very complete and accurate. Special thanks because this is well off-topic, and because, as you said, classic Forth implementation is not your expertise. Thank you very much.

I have a doubt about what you explained here regarding the execution of non-colon words, but I'd prefer not to continue the off-topic, so simply: thanks.

Finally, I have an on-topic question about 4tH rather than Forth, but I'd prefer to start a new thread for it in a separate email.

Thanks a lot and regards.



The Beez

Apr 24, 2025, 7:40:40 AM
to 4tH-compiler
Some of the main differences between 4tH and Forth:
1. The 4tH compiler is essentially the text interpreter. The 4tH VM is essentially the address interpreter;
2. You cannot switch between the 4tH text interpreter and the 4tH address interpreter;
3. Consequently, 4tH does not have a STATE variable;
4. None of 4tH's immediate words can be overridden or changed - nor can you add new ones from within the bare 4tH system (the preprocessor can do some, though);
5. All dictionary related words have to be resolved during 4tH's compile time - since after compilation, the dictionary is gone;
6. There is no user controlled parsing during 4tH's compile time - since after compilation, the source is gone.

That means INTERPRET breaks down like this (each part happens either at compilation, at execution, or not at all):

: interpret ( -- )
  BEGIN ( )
    bl word dup c@                     \ scan next token
  WHILE ( c-addr )                     \ another token found
    find dup 1 = IF                    \ lookup in dictionary
      drop execute                     \ immediate
    ELSE
      dup -1 = IF                      \ everything else
        drop state @
        IF compile, ELSE execute THEN
      ELSE                             \ word not found, number?
        drop 0 0 rot count over c@ [char] - = dup >r
        IF 1- swap char+ swap THEN >number
        IF #notfound throw THEN drop drop r>
        IF negate THEN state @
        IF postpone LITERAL THEN       \ compile literal
      THEN
    THEN
  REPEAT drop ;

The address interpreter in exec_4th.c is quite simple:

for (; (Object->ErrLine < Object->CodeSiz); Object->ErrLine++)
  switch (Object->CodeSeg [Object->ErrLine].Word) {
    ..
  }

Yes, the IP is part of the 4tH applet - which makes it easy to trap the location of any errors, since it is saved in the applet itself. Its value is set upon creation (comp_4th.c, load_4th.c). Note the IP always increases by "one" - no matter which opcode has been executed. Even after BRANCH or CALL. Those instructions always seem to point to a location BEFORE the actual jump location - and now you know why. It just made things easier. The GCC version of exec_4th.c is a bit more complex, but it works on the same principle.
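
A minimal sketch of that "IP always increases by one" principle, assuming a made-up three-opcode machine. The opcodes, the Applet layout and the names run and demo_branch are hypothetical and are not 4tH's real ones. The only point is that BRANCH stores "target - 1", because the for-loop increment still runs after the jump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for illustration; these are not 4tH's real tokens. */
enum { OP_LIT, OP_ADD, OP_BRANCH };

typedef struct { int word; int value; } Cell;   /* opcode plus one operand */

typedef struct {
    const Cell *code;       /* code segment */
    size_t      size;       /* number of cells */
    size_t      ip;         /* IP is stored in the applet itself */
    int         stack[16];
    int         sp;
} Applet;

static void run(Applet *a) {
    for (; a->ip < a->size; a->ip++) {          /* IP advances after every opcode */
        const Cell *c = &a->code[a->ip];
        switch (c->word) {
        case OP_LIT:
            assert(a->sp < 16);
            a->stack[a->sp++] = c->value;
            break;
        case OP_ADD:
            assert(a->sp >= 2);
            a->sp--;
            a->stack[a->sp - 1] += a->stack[a->sp];
            break;
        case OP_BRANCH:
            a->ip = (size_t)c->value;           /* operand is target - 1 */
            break;
        }
    }
}

static int demo_branch(void) {
    static const Cell program[] = {
        { OP_LIT,    1   },   /* 0 */
        { OP_BRANCH, 2   },   /* 1: continue at cell 3, stored as 3 - 1 */
        { OP_LIT,    100 },   /* 2: skipped by the branch */
        { OP_LIT,    2   },   /* 3 */
        { OP_ADD,    0   },   /* 4 */
    };
    Applet a = { program, sizeof program / sizeof program[0], 0, {0}, 0 };
    run(&a);
    return a.stack[a.sp - 1];
}
```

Because the loop never special-cases jumps, a.ip always names the cell that was executed last, which is what makes it usable as an error location.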

Note a lot (but not all) of these limitations can be overcome by using the preprocessor - like IMMEDIATE words with POSTPONEs, special datatypes, renaming 4tH's internal words, adding or switching parameters (SYNONYM, BEGIN-STRUCTURE).
Now, while it will never match the full capabilities of Forth, you can come quite close. And in some cases even exceed it.

Hans Bezemer

The Beez

Apr 24, 2025, 7:49:28 AM
to 4tH-compiler
Fun thing - compare this one to the version in interprt.4th - which works with an external dictionary (which BTW, doesn't have any immediate words, since there's nothing to compile):

: interpret                            ( --)
  begin                                ( --)
    bl parse-word dup                  \ scan next token
  while
    dictionary 2 string-key row        \ lookup in dictionary
    if                                 ( a n x)
      nip nip cell+ @c execute         \ everything else
    else                               ( a n)
      drop 2dup number error?          \ convert to a number
      if drop NotFound else -rot 2drop then
    then                               \ failed: issue an error message
  repeat                               ( --)
  drop drop                            ( --)
;
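
For comparison, the same "look it up in a table, otherwise try a number, otherwise complain" loop can be sketched in C. This is only an illustration of the idea, not a translation of interprt.4th. The dictionary contents, the stack size and all names are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*Action)(void);
typedef struct { const char *name; Action action; } Entry;

static long stack[32];
static int  sp;

static void w_add(void) { assert(sp >= 2); sp--; stack[sp - 1] += stack[sp]; }
static void w_mul(void) { assert(sp >= 2); sp--; stack[sp - 1] *= stack[sp]; }
static void w_dup(void) { assert(sp >= 1 && sp < 32); stack[sp] = stack[sp - 1]; sp++; }

/* External dictionary: plain name/action pairs, no immediate words. */
static const Entry dictionary[] = {
    { "+",   w_add },
    { "*",   w_mul },
    { "dup", w_dup },
};

/* Interprets a space-separated line in place (strtok modifies the buffer).
   Returns 0 on success, -1 if a token is neither a word nor a number. */
static int interpret(char *line) {
    const size_t n = sizeof dictionary / sizeof dictionary[0];
    for (char *tok = strtok(line, " "); tok != NULL; tok = strtok(NULL, " ")) {
        size_t i;
        for (i = 0; i < n; i++)                 /* lookup in dictionary */
            if (strcmp(tok, dictionary[i].name) == 0)
                break;
        if (i < n) {                            /* found: execute it */
            dictionary[i].action();
            continue;
        }
        char *end;
        long v = strtol(tok, &end, 10);         /* not found: convert to a number */
        if (end == tok || *end != '\0')
            return -1;                          /* failed: report an error */
        assert(sp < 32);
        stack[sp++] = v;
    }
    return 0;
}
```

With a writable buffer such as char line[] = "2 3 + dup *"; a call to interpret(line) leaves the result on top of stack. There is no STATE and no compile branch, since nothing is ever compiled.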

Hans Bezemer