dot-quote implementation question

Rod Pemberton

Dec 10, 2011, 5:28:59 AM

fig-Forth's dot-quote ." when compiled compiles (.") which is a word for
typing compiled strings. I'm wondering why a CFA routine, e.g., DOCON
DOCOL etc, was *not* used to implement string variables or string constants?
By CFA routine, I mean DOCOL DOCON DOVAR DOUSER

Also, fig-Forth marks CFA routines with ;CODE. Apparently, the ;CODE
routines in : CONSTANT VARIABLE and USER were named as DOCOL
DOCON DOVAR and DOUSER at some point. With what Forth were those
names defined?

Rod Pemberton

Anton Ertl

Dec 10, 2011, 7:00:36 AM

"Rod Pemberton" <do_no...@noavailemail.cmm> writes:
>
>fig-Forth's dot-quote ." when compiled compiles (.") which is a word for
>typing compiled strings. I'm wondering why a CFA routine, e.g., DOCON
>DOCOL etc, was *not* used to implement string variables or string constants?
>By CFA routine, I mean DOCOL DOCON DOVAR DOUSER

Probably because fig-Forth did not define string variables as separate
things. The address of a counted string can be stored in a normal
cell-sized variable or a constant can be named for it.

>Also, fig-Forth marks CFA routines with ;CODE Apparently, the ;CODE
>routines in : CONSTANT VARIABLE and USER were named as DOCOL
>DOCON DOVAR and DOUSER at some point. With what Forth were those
>names defined?

I guess fig-Forth. Fig-Forth was distributed as assembler listings
for conventional assemblers, and these routines had these names in the
assembly listings.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2011: http://www.euroforth.org/ef11/

BruceMcF

Dec 10, 2011, 9:52:07 AM

On Dec 10, 7:00 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

> "Rod Pemberton" <do_not_h...@noavailemail.cmm> writes:
> >Also, fig-Forth marks CFA routines with ;CODE Apparently, the ;CODE
> >routines in : CONSTANT VARIABLE and USER were named as DOCOL
> >DOCON DOVAR and DOUSER at some point. With what Forth were those
> >names defined?

> I guess fig-Forth. Fig-Forth was distributed as assembler listings
> for conventional assemblers, and these routines had these names in the
> assembly listings.

Note that the names respected the sometimes tight constraints on
assembler labels in some assemblers of the day ~ all upper case, all
ASCII text, labels short ~ it looks like five chars or less in the
6502 printout (which gives DOUSER as DOUSE) (which allows an 8-byte
fixed-width table with chars, count or nul terminator, and two bytes
to hold the address of the label).

Rod Pemberton

Dec 10, 2011, 9:10:13 PM

"Anton Ertl" <an...@mips.complang.tuwien.ac.at> wrote in message
news:2011Dec1...@mips.complang.tuwien.ac.at...

> "Rod Pemberton" <do_no...@noavailemail.cmm> writes:
> >
> > fig-Forth's dot-quote ." when compiled compiles (.") which is a word for
> > typing compiled strings. I'm wondering why a CFA routine, e.g., DOCON
> > DOCOL etc, was *not* used to implement string variables or string
> > constants? By CFA routine, I mean DOCOL DOCON DOVAR DOUSER
>
> Probably because fig-Forth did not define string variables as separate
> things. The address of a counted string can be stored in a normal
> cell-sized variable or a constant can be named for it.
>

Do you (or anyone else) think string variables are useful functionality?
Missing functionality? If not, why not? Many HLLs have them. C doesn't
fully support them, which is more like Forth.

Rod Pemberton

A. K.

Dec 11, 2011, 2:59:04 AM

On 11.12.2011 03:10, Rod Pemberton wrote:
> Do you (or anyone else) think string variables are useful functionality?
> Missing functionality? If not, why not? Many HLLs have them. C doesn't
> fully support them, which is more like Forth.
>

Space, the final frontier. These are the voyages of the starship
Enterforth. Its infinite mission, to explore strange new worlds to seek
out new life and new variables, to boldly go where no C has gone before.

Beam me down, Scottie.

It's 2011 and people in c.l.f. are still trying to parse strings...

:o)

Andrew Haley

Dec 11, 2011, 4:40:13 AM

Rod Pemberton <do_no...@noavailemail.cmm> wrote:
>
> Do you (or anyone else) think string variables are useful functionality?

Sometimes.

> Missing functionality?

Missing from what? You want one, define one. How is a string
variable going to be any different from create ... allot ?

Andrew.
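
A minimal sketch of the create ... allot approach mentioned above, in
standard Forth. The names STRING-BUFFER, S! and S@ are hypothetical and
chosen only for illustration; the store word assumes the string fits in
the buffer (there is no bounds check), and interpreted S" assumes the
FILE word set is present, as it is in most hosted systems.

```forth
\ Hypothetical helper words; none of these names are standard.
: STRING-BUFFER ( maxlen "name" -- )
   CREATE  0 C,  CHARS ALLOT ;   \ count byte (initially 0) plus maxlen chars

: S! ( c-addr u buf -- )         \ store a string as a counted string
   2DUP C!                       \ write the length into the count byte
   CHAR+ SWAP CHARS MOVE ;       \ copy the characters after the count byte

: S@ ( buf -- c-addr u )         \ fetch address and length of the stored string
   COUNT ;

32 STRING-BUFFER GREETING        \ GREETING leaves its data-field address
S" hello" GREETING S!
GREETING S@ TYPE                 \ types hello
```

The word GREETING behaves like any other CREATEd word: executing it leaves
the address of its data field, and everything string-specific lives in the
words that operate on that address.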

Anton Ertl

Dec 11, 2011, 5:33:36 AM

"Rod Pemberton" <do_no...@noavailemail.cmm> writes:
>Do you (or anyone else) think string variables are useful functionality?

Lots of people have thought that better support for strings,
including string variables, is a useful functionality. However, none
of the many string packages that have been proposed and implemented
have gained enough support that they have made a breakthrough.

My own thinking is that what the standard provides is mostly a local
optimum for the kind of memory management we get with Forth. For
anything significantly better (where string variables would make
sense), we would need automatic storage management (garbage collection
or reference counting).

Still, I have added a few words I found useful to Gforth, e.g.,
SAVE-MEM, STR<, STR=, STR-PREFIX?, >STRING-EXECUTE.

ken...@cix.compulink.co.uk

Dec 12, 2011, 3:53:43 AM

In article <2011Dec1...@mips.complang.tuwien.ac.at>,

an...@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

> However, none
> of the many string packages that have been proposed and implemented
> have gained enough support that they have made a breakthrough.

There is also the point that there are languages designed with string
handling in mind which would make more sense to use. Still, Forth has
much the same support for strings as languages like Fortran and Pascal.

Ken Young

Rod Pemberton

Dec 12, 2011, 5:04:18 AM

"Andrew Haley" <andr...@littlepinkcloud.invalid> wrote in message
news:HJqdneRGLvDg53nT...@supernews.com...

It would be a true variable and it would need a CFA routine, like DOCON or
DOVAR etc, e.g., say DOSTR or somesuch ... That functionality is not
present in Forth, i.e., missing functionality. Current functionality is
close for string constants, but does not use a CFA routine. A true string
variable would only be a string variable, like words created by CONSTANT or
VARIABLE. Currently, string constants in Forth can be compiled into any
word. So, there would probably be a word, say STRING, to create a string
variable that would store DOSTR in the CFA field. A word created with
STRING could only hold a string, just like a word created by CONSTANT or
VARIABLE can only hold integer constants and variables.

Rod Pemberton

Andrew Haley

Dec 12, 2011, 5:56:15 AM

Rod Pemberton <do_no...@noavailemail.cmm> wrote:
> "Andrew Haley" <andr...@littlepinkcloud.invalid> wrote in message
> news:HJqdneRGLvDg53nT...@supernews.com...
>> Rod Pemberton <do_no...@noavailemail.cmm> wrote:
>> >
>> > Do you (or anyone else) think string variables are useful functionality?
>>
>> Sometimes.
>>
>> > Missing functionality?
>>
>> Missing from what? You want one, define one. How is a string
>> variable going to be any different from create ... allot ?
>
> It would be a true variable and it would need a CFA routine, like DOCON or
> DOVAR etc, e.g., say DOSTR or somesuch ... That functionality is not
> present in Forth, i.e., missing functionality.

But what would that routine do? It'd leave its address on the stack.
Which is what CREATE does.

> Current functionality is close for string constants, but does not
> use a CFA routine. A true string variable would only be a string
> variable, like words created by CONSTANT or VARIABLE. Currently,
> string constants in Forth can be compiled into any word. So, there
> would probably be a word, say STRING, to create a string variable
> that would store DOSTR in the CFA field. A word created with STRING
> could only hold a string, just like a word created by CONSTANT or
> VARIABLE can only hold integer constants and variables.

What does DOSTR do? A sample implementation would help?

Andrew.

BruceMcF

Dec 12, 2011, 10:42:25 AM

On Dec 12, 5:04 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "Andrew Haley" <andre...@littlepinkcloud.invalid> wrote in message

> > Missing from what? You want one, define one. How is a string
> > variable going to be any different from create ... allot ?

> It would be a true variable and it would need a CFA routine,
> like DOCON or DOVAR etc, e.g., say DOSTR or somesuch ...
> That functionality is not present in Forth, i.e., missing
> functionality.

What difference does it make about it having "its own CFA routine"?

Since whether it has its own CFA routine rather than a DOES> link is
an implementation detail as opposed to a functionality, not having it
is not missing a functionality.

Indeed, Forth94 is perfectly happy with:

: CONSTANT CREATE , DOES> @ ;

... so having an *integer* constant does not imply the existence of a
DOCON routine.
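
As a small usage sketch of that definition: the defining word is renamed
MY-CONSTANT here (a hypothetical name) so that it does not redefine the
system's own CONSTANT, and TEN is likewise just an illustrative name.

```forth
\ Same shape as the definition above, under a hypothetical name.
: MY-CONSTANT ( x "name" -- )
   CREATE ,            \ lay the value down in the new word's data field
   DOES> ( -- x ) @ ;  \ at run time, fetch it from the data field

10 MY-CONSTANT TEN
TEN TEN + .            \ prints 20
```

Nothing in the observable behaviour of TEN reveals whether the system
implements it with a dedicated code-field routine or with the DOES> part.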

Rod Pemberton

Dec 13, 2011, 9:47:23 AM

"BruceMcF" <agi...@netscape.net> wrote in message
news:b1b9cfa4-29dd-431d...@v6g2000yqv.googlegroups.com...

>
> What difference does it make about it having "its own CFA routine"?

Two of the historical CFA routines, DOCON and DOVAR, seem to
implement two types for Forth: integer constant and integer variable.
I was thinking along the lines that additional routines could add additional
types.

> Since whether it has its own CFA routine rather than a DOES> link is
> an implementation details as opposed to a functionality, not having it
> is not missing a functionality.

Are you saying the CFA routines for DOVAR and DOCON are
sometimes one or both eliminated?

Are you saying all CFA routines except DOCOL can be replaced by
DODOES ("a DOES> link") to the appropriate functionality?

> Indeed, Forth94 is perfectly happy with:
>
> : CONSTANT CREATE , DOES> @ ;
>
> ... so having an *integer* constant does not
> imply the existence of a DOCON routines.

True. Although, it seems likely that DOCON is still used internally for
pre-compiled constants. Otherwise, I'd think the dictionary, at least the
pre-compiled portion, would be somewhat convoluted to implement, if using
DODOES and @ instead of DOCON. Wouldn't it?

AIUI, DOVAR and @ can be equivalent to DOCON. DODOES
performs DOVAR as part of its functionality.

Rod Pemberton

Rod Pemberton

Dec 13, 2011, 10:08:42 AM

"Andrew Haley" <andr...@littlepinkcloud.invalid> wrote in message

news:7pmdnYvXSP5SQHjT...@supernews.com...

> Rod Pemberton <do_no...@noavailemail.cmm> wrote:
> > "Andrew Haley" <andr...@littlepinkcloud.invalid> wrote in message
> > news:HJqdneRGLvDg53nT...@supernews.com...
> >> Rod Pemberton <do_no...@noavailemail.cmm> wrote:
> >> >

> >> How is a string
> >> variable going to be any different from create ... allot ?
> >
> > It would be a true variable and it would need a CFA routine, like
> > DOCON or DOVAR etc, e.g., say DOSTR or somesuch ... That
> > functionality is not present in Forth, i.e., missing functionality.
>
> But what would that routine do? It'd leave its address on the stack.

The address of a word with the address of DOSTR in its CFA may be the
address of that word's NFA or LFA. So, DOSTR would leave the address
of the variable's data location on the stack, e.g., possibly PFA. DOSTR
might need to do other stuff too. The CFA routine DOVAR does this for
integer variables. DOVAR doesn't need to do other stuff. DOVAR and
DOCON essentially create routines for implementing two types: integer
variables and integer constants. I was thinking that a DOSTR routine
could create a type for variable strings. However, I guess DOVAR could
be re-used for any variable type that returns an address, which doesn't
need to modify the control-flow of the interpreter, and doesn't need to
do other stuff too. DODOES modifies the control flow. To re-use
DOVAR, strings and variables should be stored in the same memory area.

> Which is what CREATE does.

No, CREATE does not get executed when a variable is interpreted or
executed in an ITC Forth. The CFA routine is what gets executed. For
some implementations, the following is true. It depends on the
implementation. The CFA routine for variables is DOVAR. CREATE gets
executed when a variable or constant is created using VARIABLE or
CONSTANT. CREATE compiles DOVAR into the CFA for words
created by both VARIABLE and CONSTANT since they both use
CREATE. For words created by VARIABLE, the CFA remains set to
DOVAR. For words created by CONSTANT, DODOES eventually
overwrites the CFA which was set to DOVAR by CREATE because
CONSTANT uses DOES> .

Rod Pemberton

Rod Pemberton

Dec 17, 2011, 4:57:26 AM

"Rod Pemberton" <do_no...@noavailemail.cmm> wrote in message
news:jc7oeq$76d$1...@speranza.aioe.org...

> "BruceMcF" <agi...@netscape.net> wrote in message
> news:b1b9cfa4-29dd-431d...@v6g2000yqv.googlegroups.com...
> >

Did my posts come across? One to Andrew and one to Bruce? ...

> > Since whether it has its own CFA routine rather than a DOES> link is
> > an implementation details as opposed to a functionality, not having it
> > is not missing a functionality.
>
> Are you saying the CFA routines for DOVAR and DOCON are
> sometimes one or both eliminated?
>
> Are you saying all CFA routines except DOCOL can be replaced by
> DODOES ("a DOES> link") to the appropriate functionality?
>

I was hoping somebody would reply to those. I previously found the
following definition in a Forth for Unix:

: CREATE HEADER COMPILE (DOES>) 0 , DOES> ;

I really didn't want to discuss that definition until someone answered my
questions. I figured if I mentioned that definition first, it would flavor
the conversation towards that definition.

> > Indeed, Forth94 is perfectly happy with:
> >
> > : CONSTANT CREATE , DOES> @ ;
> >
> > ... so having an *integer* constant does not
> > imply the existence of a DOCON routines.
>
> True. Although, it seems likely that DOCON is still used internally for
> pre-compiled constants. Otherwise, I'd think the dictionary, at least the
> pre-compiled portion, would be somewhat convoluted to implement, if using
> DODOES and @ instead of DOCON. Wouldn't it?
>

I was hoping somebody would respond to that too. Does not having
DOCON make implementing pre-compiled constants more difficult?

Rod Pemberton

Andrew Haley

Dec 17, 2011, 5:20:57 AM

Rod Pemberton <do_no...@noavailemail.cmm> wrote:
> "Rod Pemberton" <do_no...@noavailemail.cmm> wrote in message
> news:jc7oeq$76d$1...@speranza.aioe.org...
>> "BruceMcF" <agi...@netscape.net> wrote in message
>> news:b1b9cfa4-29dd-431d...@v6g2000yqv.googlegroups.com...
>> >
>
> Did my posts come across? One to Andrew and one to Bruce? ...
>
>> > Since whether it has its own CFA routine rather than a DOES> link is
>> > an implementation details as opposed to a functionality, not having it
>> > is not missing a functionality.
>>
>> Are you saying the CFA routines for DOVAR and DOCON are
>> sometimes one or both eliminated?
>>
>> Are you saying all CFA routines except DOCOL can be replaced by
>> DODOES ("a DOES> link") to the appropriate functionality?

Sure. Why not?

> I was hoping somebody would reply to those. I previously found the
> following definition in a Forth for Unix:
>
> : CREATE HEADER COMPILE (DOES>) 0 , DOES> ;
>
> I really didn't want to discuss that definition until someone answered my
> questions. I figured if I mentioned that definition first, it would flavor
> the conversation towards that definition.

That looks odd. I wonder what (DOES>) is. It may be something to do
with the old fig-FORTH <BUILDS DOES> that required an extra cell after
the code field.

>> > Indeed, Forth94 is perfectly happy with:
>> >
>> > : CONSTANT CREATE , DOES> @ ;
>> >
>> > ... so having an *integer* constant does not
>> > imply the existence of a DOCON routines.
>>
>> True. Although, it seems likely that DOCON is still used
>> internally for pre-compiled constants. Otherwise, I'd think the
>> dictionary, at least the pre-compiled portion, would be somewhat
>> convoluted to implement, if using DODOES and @ instead of DOCON.
>> Wouldn't it?
>
> I was hoping somebody would respond to that to. Does not having
> DOCON make implementing pre-compiled constants more difficult?

Why on Earth should it? Constants are either going to be handled as

: constant create , ;code ...

or

: constant create , does> ...

or something that's equivalent. Surely the question of whether to use
code or high-level is only a matter of efficiency.

Andrew.

BruceMcF

Dec 17, 2011, 11:15:25 AM

On Dec 13, 9:47 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "BruceMcF" <agil...@netscape.net> wrote in message

> > What difference does it make about it having "its own CFA routine"?

> Two of the historical CFA routines, DOCON and DOVAR, seem to
> implement two types for Forth: integer constant and integer variable.

The key word there is "implement". And of course, historically those
are not types at all ~ they are equally integer constants and
variables, character constants and variables, and pointer constants
and variables.

> I was thinking along the lines that additional routines could
> add additional types.

Yes, in a non-type-checking language, first order "types" would seem
to mean a system of routines for operating on that data. And yes,
implementing a constant and a variable is a reasonable part of that.

But whether the additional routines are primitives or compiled Forth
code is not an essential distinction for that. Implementing string
variables and constants that rely on DOES> routines for their function
would not make any essential difference to that.

>> Since whether it has its own CFA routine rather than a DOES> link is
>> an implementation details as opposed to a functionality, not having it
>> is not missing a functionality.

> Are you saying the CFA routines for DOVAR and DOCON are
> sometimes one or both eliminated?

DOCON definitely ~ it's a question of efficiency (you don't want slow
constants) and sometimes bootstrapping whether it's needed. Indeed,
going back to the $10 computer project where the target is the
venerable 6502 instruction set, because the $10 computers are
*produced* to run almost equally venerable cracks of old NES games, if
RAM is precious and clock cycles are precious but the dictionary is
mostly in flash RAM that is paged into memory in big chunks, you might
have a subroutine threaded forth and the CONSTANT might be compiled by
copying the string:
DEX
LDA #0
STA DL,X
LDA #0
STA DH,X
RTS
... and then CONSTANT picking the constant values off the stack and
putting them in the right place.

Don't confuse the behavior of the words with the implementation of the
words ~ a danger if you are implementing a Forth without first
programming a couple of applications in Forth.

> Are you saying all CFA routines except DOCOL can be replaced by
> DODOES ("a DOES> link") to the appropriate functionality?

Well, that would rely heavily on CREATE, so in that implementation
approach, DOCREA and DODOES would be critical. DOCON is an efficiency
decision.

> > Indeed, Forth94 is perfectly happy with:
>
> > : CONSTANT CREATE , DOES> @ ;
>
> > ... so having an *integer* constant does not
> > imply the existence of a DOCON routines.
>
> True. Although, it seems likely that DOCON is still used
> internally for pre-compiled constants. Otherwise, I'd think
> the dictionary, at least the pre-compiled portion, would be
> somewhat convoluted to implement, if using DODOES and @ instead
> of DOCON. Wouldn't it?

You'd just build DODOES and @ and the HLL CONSTANT in the dictionary
before building the first pre-compiled constant. If you are meta-
compiling a free-standing system, the constant values required to
*build* the system are in the host, and the constants *in the target*
only need to be available when the target kernel starts. So you can
just as well embed the constant values in the definitions as literals
and build all the constants available on target start-up as the last
step.

> AIUI, DOVAR and @ can be equivalent to DOCON. DODOES
> performs DOVAR as part of it's functionality.

In that approach ~ which again, is an *implementation* approach not a
language behavior design question ~ DODOES is not in the CFA in any
event. It's compiled as a code stub in the word that contained DOES> to
start up the interpreting of the following section, and the CFA
contains a pointer to that code.

Given that there's more juggling required than for a CREATE without a
DOES>, when you originally create a word, a DOVAR or DOCREA is
normally put into it to avoid that overhead.

But if it's indirect or direct threaded, and you were trying to bring
up fig-Forth fashion, but with a minimum of code, you COULD have a
stub in the coded section that does:

DOVAR:
CALL DODOES
.WORD FETCH,EXIT

... and that's your DOVAR. It depends on what your objectives are. If
you are just trying to bring up a Forth system from scratch fig-Forth
style, which might be slow but which can then be used to meta-compile
a more efficient one, having the smallest possible processor dependent
section and then a larger section assembling compiled Forth
definitions which is processor independent is one way to go.

Rod Pemberton

Dec 18, 2011, 7:18:27 AM

"Andrew Haley" <andr...@littlepinkcloud.invalid> wrote in message

news:NP-dnSVkwaKU8HHT...@supernews.com...
> [...]

> Surely the question of whether to use
> code or high-level is only a matter of efficiency.
>

So far I have 37 pre-compiled primitives (or low-level words), 39
pre-compiled high-level words, and 30 words in ASCII text. It's in C, but
basically identical to the method used for fig-Forth assembly. In order to
implement the inner and outer interpreters, parsing words, dictionary words,
and primitives, I need constants as pre-compiled words. If I define them in
high-level code, they are either redefinitions of the same word or unavailable
when I need them.

Hugh Aguilar was criticizing my decision to implement Forth in C, again.
So, I gave him a status update on alt.lang.asm. It's here if anyone wants
to read it:

http://groups.google.com/group/alt.lang.asm/msg/8182959029b4ba23

Rod Pemberton

Andrew Haley

Dec 18, 2011, 7:30:36 AM

Rod Pemberton <do_no...@noavailemail.cmm> wrote:
> "Andrew Haley" <andr...@littlepinkcloud.invalid> wrote in message
> news:NP-dnSVkwaKU8HHT...@supernews.com...
>> [...]
>> Surely the question of whether to use
>> code or high-level is only a matter of efficiency.
>
> So far I have 37 pre-compiled primitives (or low-level words), 39
> pre-compiled high-level words, and 30 words in ASCII text. It's in
> C, but basically identical to the method used for fig-Forth
> assembly. In order to implement the inner and outer interpreters,
> parsing words, dictionary words, and primitives, I need constants as
> pre-compiled words. If I define them in high-level code, they
> either redefinitions of the same word or unavailable when I need
> them.

Ah, OK, so you've got yourself a nasty bootstrapping problem. There's
a number of good techniques for implementing Forth that are going to
be difficult or impractical to use if you're going via standard C.

Andrew.

Rod Pemberton

Dec 18, 2011, 7:53:15 AM

"BruceMcF" <agi...@netscape.net> wrote in message

news:36148df9-ee6c-43b3...@d17g2000yql.googlegroups.com...

>
> Don't confuse the behavior of the words with the implementation of the
> words ~ a danger if you are implementing a Forth without first
> programming a couple of applications in Forth.

If my implementation correctly implements the behavior of the words, does it
matter if I confuse them?

E.g., if my Forth eventually becomes ANS and passes the Hayes core tests,
does it matter if I "wrongly" implemented the functionality of some words?
I.e., a word may have an understanding, or specification meaning, or
historical meaning that is not captured by the tests or implemented by me,
but passes the test. If so, is the error on my side or the compliance suite
side? I think it's the latter ... I have been attempting to capture the
essence of various words as they were implemented in other older Forths.
Sometimes they all do things the same way and other times it's wildly split
as to what was done and what works.

> > AIUI, DOVAR and @ can be equivalent to DOCON. DODOES
> > performs DOVAR as part of it's functionality.
>
> In that approach ~ which again, is an *implementation* approach not a
> language behavior design question ~ DODOES is not in the CFA in any
> event. Its compiled as a code stub in the word that contained DOES> to
> start up the interpreting of the following section, and the CFA
> contains a pointer to that code.

Ok. Mine works a bit differently, currently.

I have a CFA routine which I call DODOES. It gets the address for the "code
stub in the word that contained DOES> to start up the interpreting of the
following section" which I'll refer to as after-DOES> address hereafter.
The after-DOES> address is compiled by DOES> into a word's PFA. DOES> also
compiles DODOES into the CFA. AIUI, I have to compile the after-DOES>
address into the PFA due to my use of C. I describe my implementation in
some depth, after which I have a question about using , COMMA with CREATE
and DOES> prior to the DOES> which could break my implementation.

It's in C not assembly, so I can't put "raw" jump-to addresses into the CFA.
If it were in assembly, the CFA could be set to any address where a valid
assembly routine exists. I.e., for assembly that works that way, the CFA
can be set to the exact starting point of the after-DOES> code. It just
jumps there using the CFA value. Since mine is in C, the CFA must point to
a valid C routine. I can't place a C routine inlined with the after-DOES>
code and call that routine. I don't have a built-in C compiler in my Forth
interpreter. So, I can't create address values at runtime for the CFA. All
CFA values must be known at compile time. This means I have a limited and
fixed set of legitimate CFA values, unlike an assembly implementation which
could have a huge number of them. In this case, that means one of my
primitive (low-level) routines is used for CFA's: DOVAR DOCON DOCOL
DODOES etc. The after-DOES> code is in compiled Forth as interpretable
addresses. That means I have to have a routine which interprets the
addresses there. DOCOL (or ENTER) does just that, but it needs to be
redirected to the after-DOES> code. So, DODOES is a routine that gets
the address for the after-DOES> code from within the word compiled using
DOES> and does the redirection.

Currently, my DOES> looks like this:

: DOES> R> , LIT DODOES LAST @ CFA ! ;

Recently, I was able to implement it as a definition in ASCII text. Its
definition was changed recently too. I broke it a while back and just fixed
it. fig-Forth has a SMUDGE in it. I don't yet. Obviously, the R>
indicates it's an interpreted Forth, in this case ITC. LAST is probably
non-standard. I'd have to recheck what LAST is supposed to do ... Due to
implementation issues, I have both LAST and LATEST. LATEST is only for the
NFA. Actually, LATEST is for the dictionary bits that are normally part of the
NFA, but I needed to relocate away from the name portion of the NFA. LAST
gets the address of the start of current dictionary entry. LAST is that
address no matter how I reorganize the dictionary structure. It could be
NFA, LFA, CFA, or PFA. I had LFA's first, but that caused an issue with a
fig-Forth like definition when I converted from word-by-word parsing to
line-by-line. So, I reorganized with NFA first, at least temporarily. The
sequence "LAST @ CFA !" stores DODOES into the CFA overwriting a
DOVAR put there via CREATE. The sequence "R> ," stores the address of
the after-DOES> code in the PFA.

DODOES is a primitive (low-level word) in C. It's placed into the CFA.
When executed instead of ENTER or DOCOL, it pushes the W register onto
the data/parameter stack. W register is the current instruction pointer for
the interpreter. DODOES then pushes the 1st PFA value onto the return
stack. That value is the address that DOES> comma'd into a definition.
It's the address of the after-DOES> code.

Since I'm storing the address of the after-DOES> code in the PFA, I suspect
that a , COMMA could insert values prior to the stored after-DOES> address.
In which case, DODOES would fail. It doesn't "know" that the after-DOES>
address is not in the location it expects. Is it valid to use , COMMA
between CREATE and DOES> ? If someone knows of valid, functional examples
of a , COMMA being used between CREATE and DOES> in a word, could you
post them? I'll need something to break and/or test with.

> But if its indirect or direct threaded, and you were trying to bring
> up fig-Forth fashion, but with a minimum of code, you COULD have a
> stub in the coded section that does:
>
> DOVAR:
> CALL DODOES
> .WORD FETCH,EXIT
>
> ... and that's your DOVAR.

Did you mean DOCON due to the FETCH?

Well, either way, that won't work in my current implementation. Both DOCON
and DODOES manipulate the return stack, while DOVAR doesn't. DOCON
continues execution after the constant. DOVAR doesn't. DODOES continues
execution at the after-DOES> address stored in the word by DOES>, i.e., the
DOES> "jump-to" address is in the 1st PFA. All three are primitives
(low-level words).

> It depends on what your objectives are.

My current main objective is to reduce my set of primitives and pre-compiled
Forth code. I'm migrating to Forth words in ASCII text. The issue I'm
having is the parsing words are needed to parse words in ASCII text, i.e.,
catch-22. I may actually have to add some primitives that I didn't
implement since there is no easy way to implement them in high level Forth.

Rod Pemberton
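
On the question of , COMMA between CREATE and DOES> : laying down data
there is ordinary Forth-94 usage, and the CONSTANT definition quoted
earlier in the thread is already one instance. A further sketch that may
serve as a test case follows; POINT and P1 are hypothetical names used
only for illustration.

```forth
\ Hypothetical defining word that commas two cells before DOES> runs.
: POINT ( x y "name" -- )
   CREATE SWAP , ,                        \ data field holds x, then y
   DOES> ( -- x y ) DUP @ SWAP CELL+ @ ;  \ fetch both cells back

3 4 POINT P1
P1 . .              \ prints 4 3  (y is on top of the stack)
' P1 >BODY @ .      \ prints 3    (the first comma'd cell is at the body address)
```

The last line is the part most likely to expose a layout problem: the
standard expects >BODY, and the address passed to the DOES> part, to point
at the first cell laid down after CREATE.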

BruceMcF

Dec 18, 2011, 12:14:25 PM

On Dec 18, 7:53 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> I have a CFA routine which I call DODOES. It gets the address for
> the "code stub in the word that contained DOES> to start up the
> interpreting of the following section" which I'll refer to as
> after-DOES> address hereafter.

Where is that address for the DOES> segment stored? Under
<BUILDS ... DOES>
... a DODOES was stored in the CFA, the address of the DOES> segment
was the first data entry, and DOES> executed the first data entry in
the built word and returned the address of the second data entry in
the built word.

The innovation in moving from <BUILDS ... DOES> to CREATE ... DOES>
was DOES> setting things up so that the address for the DOES> segment
*is* the CFA entry, so there is no difference in size from a regular
CREATEd word, and you can CREATE in one word or on the command line and
then execute DOES> in some following word.

> The after-DOES> address is compiled by DOES> into a word's PFA.
> DOES> also compiles DODOES into the CFA.

Then I assume you have to pad CREATE with a spare cell and return the
address two cells after the CFA as the data field address. Luckily
only CREATEd words can take >BODY so you don't have to worry about
detecting which ones have the data space right after the CFA and which
ones have the data space with a cell of padding/address.

I guess it kind of sucks if you have to put a spare cell in a CREATE
just in case a DOES> is executed later, but that's the cost for
portability to implementations that are able to embed something at the
head of the DOES> segment that can act as a target for a CFA.

That means you wouldn't want to ditch DOVAR in favor of DOCREA because
while that space inefficiency might not be that much relative overhead
for a larger CREATEd structure or stack pad, it would start to add up
if present in every VARIABLE.

However, you could ditch DOCREA, by just putting DODOES in place when
you CREATE the word and starting the target address out at a stub that
is, in effect:

:noname DOES> ;

Then DOES> only has to update that DODOES target address, and CREATE
only has to set a variable with the address of that most recent DODOES
target address.
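
A minimal C sketch of that last idea, under stated assumptions: every CREATEd word gets DODOES in its code field plus one spare cell, the spare cell starts out naming a do-nothing target, and DOES> only patches that cell. The struct layout, the function names, and the use of a C function as a stand-in for a DOES> segment are all hypothetical, and a word here returns one cell instead of using a data stack.

```c
#include <stdint.h>

typedef intptr_t cell;

struct word;
typedef cell (*code_field)(struct word *w);  /* a CFA routine, fixed at C compile time */
typedef cell (*does_segment)(cell *body);    /* stand-in for a DOES> segment */

/* Hypothetical layout of a CREATEd word: code field, spare cell, data. */
struct word {
    code_field code;       /* always DODOES in this scheme */
    does_segment target;   /* the spare cell that DOES> patches */
    cell body[4];          /* data field, after the spare cell */
};

struct word *last_created; /* set by CREATE, used by DOES> */

/* Default target, the effect of ":noname DOES> ;": leave the body address. */
cell default_target(cell *body) { return (cell)body; }

/* DODOES: hand the data field address to whatever the target cell names. */
cell dodoes(struct word *w) { return w->target(w->body); }

void create(struct word *w)
{
    w->code = dodoes;
    w->target = default_target;
    last_created = w;
}

/* DOES> only updates the target cell of the most recent CREATEd word. */
void does(does_segment segment) { last_created->target = segment; }

cell execute(struct word *w) { return w->code(w); }

/* Example segment: the run-time part of "CREATE , DOES> @". */
cell fetch_first(cell *body) { return body[0]; }
```

Calling create() on a word and executing it yields the body address; after does(fetch_first) the same word yields the value stored in its first body cell, and the code field is never rewritten.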

BruceMcF

Dec 18, 2011, 1:16:00 PM

On Dec 18, 7:53 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> My current main objective is to reduce my set of primitives and
> pre-compiled Forth code. I'm migrating to Forth words in ASCII text.
> The issue I'm having is the parsing words are needed to parse words in
> ASCII text, i.e., catch-22. I may actually have to add some primitives
> that I didn't implement since there is no easy way to implement them in
> high level Forth.

Don't forget that variables do not need to be the structure
implemented by VARIABLE in order to work ~ if the kernel has a data
block, it can refer to locations in that data block by a label that
the C compiler understands, and then if there is a CONSTANT in the
dictionary that is set to the address of the beginning of the data
block, a working Forth system can define constants that contain the
address of the cells in that data block.

And, indeed, the word CONSTANT is not needed for that to work, since:
: <constant-name> <constant-value> ;

*is* a constant, just likely not as space and execution speed
efficient as a conventional constant.

Indeed, if you have CREATE :NONAME and EXECUTE you can define ":",
something like:
CREATE : :noname DOES> CREATE LATEST @ SMUDGE :noname ; EXECUTE

One trick with this is to have the name of the file that defines the
balance of the Forth in that data block, along with a
vector for REFILL. Start up opening that file for reading, and then
the original primitive REFILL routine is hardwired to just REFILL
lines from that file until empty.

Then *in* that file, you build the actual command line interpreter in
Forth, including both file and command line input, define a pseudo-
VARIABLE that is a constant pointing to that vector location, and *in
the middle of interpreting the file* reset the REFILL vector, so that
at that point you are interpreting a file, you have an input source
stack that has been pre-populated with the command line interpreter
beneath the file currently being interpreted, and when the file ends,
back you go to the command line.
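
A rough C sketch of the vectored REFILL, with assumptions labeled: the start-up file is modelled as an in-memory, NULL-terminated array of lines so the example stays self-contained (a real kernel would keep the file name and an open FILE * in the data block), and refill_from_terminal stands in for the full REFILL that the start-up file would eventually define in Forth. All names are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef int (*refill_fn)(char *line, size_t size);

/* Hypothetical kernel data block holding the input source and the vector. */
struct data_block {
    const char *const *boot_lines;  /* NULL-terminated stand-in for the file */
    refill_fn refill;               /* the REFILL vector */
};

struct data_block kernel;

/* Primitive REFILL: hardwired to hand out start-up lines until empty. */
int refill_from_boot(char *line, size_t size)
{
    if (kernel.boot_lines == NULL || *kernel.boot_lines == NULL)
        return 0;
    strncpy(line, *kernel.boot_lines, size - 1);
    line[size - 1] = '\0';
    kernel.boot_lines++;
    return 1;
}

/* Stand-in for the full REFILL that the start-up file would install. */
int refill_from_terminal(char *line, size_t size)
{
    if (fgets(line, (int)size, stdin) == NULL)
        return 0;
    line[strcspn(line, "\n")] = '\0';
    return 1;
}

/* The interpreter only ever calls through the vector. */
int refill(char *line, size_t size) { return kernel.refill(line, size); }
```

Start-up code would set kernel.refill to refill_from_boot; assigning a different routine to kernel.refill part-way through the start-up text is the "reset the REFILL vector" step, and nothing that calls refill() has to change.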

>IN and STATE are obvious candidates for placement in that data block. So are KEY and EMIT vectors.

There's lots of functionality of Forth94 that can be postponed until
after a primitive Forth interpreter is up and running, as long as you
know that the primitive Forth interpreter will only work on a file
that defines the balance of the system. For example, if the file is
only going to have literals in either decimal or hexadecimal format,
which is under your control since you are writing the file, you can
hardwire a number conversion routine and, as above, replace the vector
with a full featured version resting on BASE, so any of the machinery
required by the full featured number conversion can be built in the
ASCII file.

Indeed, given that start-up, you could vector the whole interpreter
into chunks that are progressively replaced as the ASCII file is
interpreted.

If you start out with SOURCE hardwired, an >IN location, un-named, a
STATE location, un-named, a vector to a REFILL that works in the
context, and a PARSE that works, you've got most of the machinery for
a rudimentary Forth compiler/interpreter. Specify that the start-up
file uses only blanks as white space, load a guard \n at the end of
SOURCE in the vectored REFILL stub, and PARSE works just fine to
tokenize: as a loop, WHILE the current >IN space is a BL then >IN
increments, PARSE to the next BLANK, CMOVE to the HUNT buffer, and
HUNT for the token. IF STATE is clear, EXECUTE the xt, IF not,
COMPILE, the xt. If HUNT fails, convert to a number using the vectored
number conversion, calling the ABORT vector if the number conversion
fails, and if it succeeds, compile it if STATE is not clear.
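
The loop described above can be sketched in C roughly as follows. This is an illustration under assumptions, not anyone's actual kernel: the three-word dictionary, the decimal-only strtol conversion, the use of the string length instead of a guard \n, and the "compiled" array that merely records what would be compiled are all made up for the example, and there is no overflow checking.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef intptr_t cell;
typedef void (*xt_fn)(void);

cell stack[32];
int depth = 0;
void push(cell x) { stack[depth++] = x; }
cell pop(void) { return stack[--depth]; }

void w_dup(void)  { cell a = pop(); push(a); push(a); }
void w_plus(void) { cell b = pop(); push(pop() + b); }
void w_star(void) { cell b = pop(); push(pop() * b); }

struct entry { const char *name; xt_fn xt; };
struct entry dictionary[] = { { "DUP", w_dup }, { "+", w_plus }, { "*", w_star } };

/* HUNT: linear dictionary search, NULL when the token is unknown. */
xt_fn hunt(const char *token)
{
    for (size_t i = 0; i < sizeof dictionary / sizeof dictionary[0]; i++)
        if (strcmp(dictionary[i].name, token) == 0)
            return dictionary[i].xt;
    return NULL;
}

int state = 0;                            /* 0 = interpreting, else compiling */
struct item { xt_fn xt; cell literal; };  /* xt == NULL marks a literal */
struct item compiled[64];
int compiled_count = 0;

/* Returns 1 at end of line, 0 where a real system would call ABORT. */
int interpret_line(const char *source)
{
    size_t in = 0, len = strlen(source);  /* "in" plays the role of >IN */
    char token[32];

    for (;;) {
        while (in < len && source[in] == ' ') in++;   /* skip blanks */
        if (in >= len) return 1;
        size_t start = in;
        while (in < len && source[in] != ' ') in++;   /* PARSE to next blank */
        size_t n = in - start;
        if (n >= sizeof token) return 0;
        memcpy(token, source + start, n);
        token[n] = '\0';

        xt_fn xt = hunt(token);
        cell value = 0;
        if (xt == NULL) {                 /* not a word: try a number */
            char *end;
            value = (cell)strtol(token, &end, 10);
            if (*end != '\0') return 0;
        }
        if (state != 0) {                 /* compiling: record xt or literal */
            compiled[compiled_count].xt = xt;
            compiled[compiled_count].literal = value;
            compiled_count++;
        } else if (xt != NULL) {
            xt();                         /* interpreting: EXECUTE */
        } else {
            push(value);
        }
    }
}
```

With state clear, interpret_line("3 DUP * 4 +") leaves 13 on the stack; an unknown token that is not a number makes it return 0.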

PARSE is fine as a primitive, and if you restrict the format of the
ASCII kernel file you can define a dummy PARSE-NAME that is vectored
in your data block and replaced by the real PARSE-NAME when it's been
defined in the ASCII file. To save space, the dummy PARSE-NAME can be
anonymous. To save MORE space, the definition could be contained in
the data block and then recovered for use by something else once the
real PARSE-NAME is in the dictionary and vectored into the data block
for use by the interpreter. Indeed, you could build stub primitives
that are hand-compiled Forth code and only used for bootstrapping in
the space that will be used by the input source and search order
stacks. In the best case, they take up less space than those stacks,
so there is no wasted space from having the one-off stub primitives in
the image.

Elizabeth D. Rather

Dec 18, 2011, 1:18:30 PM

On 12/18/11 2:53 AM, Rod Pemberton wrote:
[discussion of the trials of implementing Forth in C deleted]

...

> Since I'm storing the address of the after-DOES> code in the PFA, I suspect
> that a , COMMA could insert values prior to the stored after-DOES> address.
> In which case, DODOES would fail. It doesn't "know" that the after-DOES>
> address is not in the location it expects. Is it valid to use , COMMA
> between CREATE and DOES> ? If someone knows of valid, functional examples
> of a , COMMA being used between CREATE and DOES> in a word, could you
> post them? I'll need something to break and/or test with.
>

Actually, there are more defining words that use , between the CREATE
and DOES> than those that don't.

: CONSTANT ( x -- ) CREATE , DOES> ( -- x ) @ ;
: 2CONSTANT ( x2 x1 -- ) CREATE , , DOES> ( -- x2 x1 ) 2@ ;

...and a more complicated word in a problem set in Forth Application
Techniques:

: ARRAY ( u -- ) CREATE DUP , ALLOT \ Indexed array of size u bytes
DOES> ( i -- caddr ) DUP @ \ i caddr size
ROT MIN + ; \ i clipped to size, added to caddr

10 array stuff ok

0 stuff . 245684 ok
1 stuff . 245685 ok
10 stuff . 245694 ok
11 stuff . 245694 ok
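
For anyone implementing the run-time in C, the arithmetic of the DOES> part of ARRAY as posted can be modelled like this. It is only a sketch of the address calculation; the function name is hypothetical, and the data stack is replaced by plain arguments.

```c
#include <stdint.h>

typedef intptr_t cell;

/* Run-time part of ARRAY as posted: ( i addr -- addr+min(size,i) ),
   where the cell at addr holds the size stored by "DUP ,". */
cell array_does(cell i, const cell *body)
{
    cell size = body[0];                   /* DUP @   */
    cell clipped = (size < i) ? size : i;  /* ROT MIN */
    return (cell)body + clipped;           /* +       */
}
```

This reproduces the pattern in the session above: indices 0 and 1 give consecutive addresses, and 10 and 11 both give the base address plus 10. Note that, as written, the offset is counted from the cell holding the size, and MIN clips only the upper end.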

Really, it shouldn't be that hard :-)

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

BruceMcF

Dec 18, 2011, 5:33:49 PM

On Dec 18, 7:53 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> I have a CFA routine which I call DODOES. It gets the address for
> the "code stub in the word that contained DOES> to start up the
> interpreting of the following section" which I'll refer to as
> after-DOES> address hereafter.

The trick is to have the address of the code stub *work* as a Code
Field Address. Then you just put *the address of the code stub* in the
CFA field.

Remember, you have
...
[word-in-containing-word]
[EXIT]
Address1:
*anything-here*
Address2:
[first-word-in-DOES-segment]
...

... because of the exit, and then DOES> will start executing at
Address2, there can be *anything* at Address1, for as much space as
you need, to launch the (DOES). Typically it's a single item, but it
can be the actual machine code for a CALL or JMP, or a special
sentinel that the inner interpreter recognizes (eg, a leading byte of
zero where that can't be a normal code routine), or whatever.

So you set that up so it works, then write Address 1 into the CFA of
the latest word.

> It's in C not assembly, so I can't put "raw" jump-to addresses
> into the CFA. If it were in assembly, the CFA could be set to any
> address where a valid assembly routine exists. I.e., for assembly
> that works that way, the CFA can be set to the exact starting point
> of the after-DOES> code. It just jumps there using the CFA value.

That's just an example. For a bit threaded implementation, the "CFA"
can be either a machine call or a high level definition based on the
content of the special bit, so in that case you just @EXECUTE the
contents of the CFA, put a high level token at the CFA instead of a
primitive token, and Address1=Address2.

> Since mine is in C, the CFA must point to a valid C routine.
> I can't place a C routine inlined with the after-DOES> code and
> call that routine. I don't have a built-in C compiler in my Forth
> interpreter. So, I can't create address values at runtime for the
> CFA. All CFA values must be known at compile time. This means I
> have a limited and fixed set of legitimate CFA values, unlike an
> assembly implementation which could have a huge number of them.

But since you have a bounded set of CFA routines, instead of storing
their addresses in the CFA, you can store their addresses in a short
table and store an index into that table in the CFA, as an int.

Then the inner interpreter executes out of the CFA routine table for
CFA values under, say, 256, and knows that it's a DOES> address if the
CFA value is 256 or bigger. So you've got an implicitly bit threaded
CFA on something like:
(cfa & ~0FFh)
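
A possible shape for that dispatch in C, as a sketch only: the table holds two illustrative routines, the 256 limit is the "say" figure from above, and run_does merely records the target where a real inner interpreter would start interpreting the DOES> segment. It also assumes that no genuine DOES> segment address is ever below 256, which has to be arranged if addresses are offsets into a C array. All names are hypothetical.

```c
#include <stdint.h>

typedef uintptr_t cell;
typedef void (*cfa_routine)(cell *pfa);

cell stack[16];
int depth = 0;
void push(cell x) { stack[depth++] = x; }

void docon(cell *pfa) { push(*pfa); }       /* constant: push stored value */
void dovar(cell *pfa) { push((cell)pfa); }  /* variable: push data address */

/* Small indices stored in code fields instead of raw C addresses. */
enum { DOCON, DOVAR, CFA_ROUTINE_COUNT };
cfa_routine cfa_table[CFA_ROUTINE_COUNT] = { [DOCON] = docon, [DOVAR] = dovar };

cell last_does_target = 0;  /* what a real system would begin interpreting */
void run_does(cell target, cell *pfa)
{
    last_does_target = target;
    push((cell)pfa);
}

/* Bitwise test: any bit set above the low 8 means "not a table index". */
int is_does_address(cell cfa) { return (cfa & ~(cell)0xFF) != 0; }

/* Returns 1 for a DOES> word, 0 for a table routine, -1 for a bad index. */
int dispatch(cell cfa, cell *pfa)
{
    if (is_does_address(cfa)) { run_does(cfa, pfa); return 1; }
    if (cfa < CFA_ROUTINE_COUNT) { cfa_table[cfa](pfa); return 0; }
    return -1;
}
```

The test has to use the bitwise operators; with the logical operators the expression would be zero for every CFA value.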

...

> I broke it a while back and just fixed
> it. fig-Forth has a SMUDGE in it. I don't yet. Obviously, the R>
> indicates it's an interpreted Forth, in this case ITC. LAST is
> probably non-standard.

AFAIR, it's not reserved, the standard would let it do anything you
want it to do.

> I'd have to recheck what LAST is supposed to do ... Due to
> implementation issues, I have both LAST and LATEST. LATEST is only
> for the NFA. Actually, LATEST for the dictionary bits that are
> normally part of the NFA, but I needed to relocate away from the name
> portion of the NFA. LAST gets the address of the start of current
> dictionary entry. LAST is that address no matter how I reorganize the
> dictionary structure. It could be NFA, LFA, CFA, or PFA. I had LFA's
> first, but that caused an issue with a fig-Forth like definition when
> I converted from word-by-word parsing to line-by-line. So, I
> reorganized with NFA first, at least temporarily.

As long as you have LAST then you don't need a smudge ~ just don't
link into the dictionary at ":", but instead use ";" to link into the
dictionary using LAST. You'll need a LAST>LFA word or sequence like
your LAST>CFA sequence, and that'll be set.

> The sequence "LAST @ CFA !" stores DODOES into the CFA overwriting a
> DOVAR put there via CREATE. The sequence "R> ," stores the address of
> the after-DOES> code in the PFA.

You can't do that. You need to make your CFA two ints long, leave one
alone when you create it, and make the PFA start the second cell after
the CFA starts.

In short, you are recreating the <BUILDS ... DOES> behavior, but if
you have that behavior, then DOES> *cannot pair with CREATE anymore*,
not unless CREATE has an empty slot waiting for the address, so that
approach is quite definitely non-compliant without the padding of the
CREATE structure.

As I said before, you *could* have DODOES and a compiled definition
address compiled by CREATE and just have the CREATE definition start
out pointing to the address of a:

Create-target-address:
[EXIT]

So that the word built by CREATE looks like:

Address-in-LAST-when-CREATE-completes:
[Dictionary-Entry]
[DODOES]
[Create-target-address]
DP-Address-after-CREATE-completes:
... ; PFA

... so then the PFA starts out the address after that, and that is
what is placed on the stack by DODOES so everything would work.

It's either creating a two-int structure at the point of the CFA and
the PFA for CREATED words at an offset of two ints, or else some form
of encoding in the entry in the CFA field. You can't do what you are
doing now and have a Forth94 compliant system.

Rod Pemberton

Dec 19, 2011, 5:24:46 AM

"BruceMcF" <agi...@netscape.net> wrote in message

news:6cc60bcb-a38a-4108...@t38g2000yqe.googlegroups.com...

> On Dec 18, 7:53 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:

...

> > I have a CFA routine which I call DODOES. It gets the address for
> > the "code stub in the word that contained DOES> to start up the
> > interpreting of the following section" which I'll refer to as
> > after-DOES> address hereafter.
>
> Where is that address for the DOES> segment stored? Under
> <BUILD ... DOES>
> ... a DODOES was stored in the CFA, the address of the DOES
> segment was the first data entry, and DOES> executed the first data
> entry in the built word and returned the address of the second data
> entry in the built word..

That sounds like what I'm describing. It's fig-Forth based. fig-Forth has
both <BUILDS and CREATE. I've not implemented the <BUILDS and
don't plan to. I did a fig-Forth like CREATE or so I thought ...

> The innovation in moving from <BUILDS ... DOES> to
> CREATE ... DOES> was DOES> setting things up so that the
> address for the DOES> segment in setting it up so that address
> *is* the CFA entry,

Ok.

Perhaps, I may have merged the two concepts or implemented more of a <BUILDS
instead of a CREATE. This may be because of following the fig-Forth way, or
because of the limitations I was seeing for runtime values for the CFA using
C.

In my post to Ms. Rather, it seems a few examples are working, which I
thought shouldn't be working due to intermediate COMMA's. So, I'm going to
have to more thoroughly review what is actually being done. I.e., from
reading my code, I think the address for the after-DOES> code should either
be shifted to a wrong location, or it should overwrite data in the PFA area.

> so there is no difference in size with a regular CREATEd word,
> so you can CREATE in one word or on the command line and
> then execute DOES> in some following word.

Ok.

fig-Forth indicates DOES> is compile only. I guess Forth moved away from
that.

> The innovation in moving from <BUILDS ... DOES> to CREATE ... DOES>
> was DOES> setting things up so that the address for the DOES> segment
> in setting it up so that address *is* the CFA entry, so there is no
> difference in size with a regular CREATEd word, so you can CREATE in
> one word or on the command line and then execute DOES> in some
> following word.

I'll have to think about whether or not I can find a solution to do those
two things (address in CFA no PFA and same size PFA as normal CREATEd
word) with the constraints I'm having on CFA values. I'm not sure that I
can.
However, I'm not entirely clear on why those cited situations won't work.
The functionality of DOES> is independent of the other word(s). So, why
does it matter if DOES> is used in a different word? or interactively? I
could probably use a couple of examples. I may have built something
different from either <BUILDS or CREATE ...

> > The after-DOES> address is compiled by DOES> into a word's
> > PFA. DOES> also compiles DODOES into the CFA.
>
> Then I assume you have to pad CELL with a spare cell and return
> the address two cells after the CFA as the data field address.

No, I'm not.

Rod Pemberton

Rod Pemberton

Dec 19, 2011, 5:25:17 AM

"Elizabeth D. Rather" <era...@forth.com> wrote in message
news:E5qdnak5ntDqs3PT...@supernews.com...

> On 12/18/11 2:53 AM, Rod Pemberton wrote:
> [discussion of the trials of implementing Forth in C deleted]
...

> > Since I'm storing the address of the after-DOES> code in the PFA, I
> > suspect that a , COMMA could insert values prior to the stored
> > after-DOES> address. In which case, DODOES would fail. It
> > doesn't "know" that the after-DOES> address is not in the location
> > it expects. Is it valid to use , COMMA between CREATE and
> > DOES> ? If someone knows of valid, functional examples
> > of a , COMMA being used between CREATE and DOES> in
> > a word, could you post them? I'll need something to break
> > and/or test with.
> >
>
> Actually, there are more defining words that use , between the CREATE
> and DOES> than those that don't.
>
> : CONSTANT ( x -- ) CREATE , DOES> ( -- x ) @ ;

Interesting... I'm using that definition and it works. I guess I need to
review why that is working, since I currently think that the COMMA should
shift the PFA field where the address to the after-DOES> code is stored.

(I think this may be the first time ever that I need to find out why
something is working, instead of not working.)

> : 2CONSTANT ( x2 x1 -- ) CREATE , , DOES> ( -- x2 x1 ) 2@ ;
>

I don't have 2@ yet. Let me try DUP 4 + @ SWAP @ ... That works! Yes,
I definitely have to check where the constants are being stored relative to
the link to the DOES> code. The address for the after-DOES> code
is apparently not overwriting them and the two COMMAs are apparently
not shifting the address for the after-DOES> code.

> ...and a more complicated word in a problem set in Forth Application
> Techniques:
>
> : ARRAY ( u -- ) CREATE DUP , ALLOT \ Indexed array of size u bytes
> DOES> ( i -- caddr ) DUP @ \ i caddr size
> ROT MIN + ; \ i clipped to size, added to caddr
>
> 10 array stuff ok
>
> 0 stuff . 245684 ok
> 1 stuff . 245685 ok
> 10 stuff . 245694 ok
> 11 stuff . 245694 ok
>

That fails. I don't have MIN. I'm not sure where the failure is yet.

Rod Pemberton

Rod Pemberton

Dec 19, 2011, 5:26:04 AM

"BruceMcF" <agi...@netscape.net> wrote in message

news:1970f26a-92f9-48b8...@n39g2000yqh.googlegroups.com...

> On Dec 18, 7:53 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:

...

> > It's in C not assembly, so I can't put "raw" jump-to addresses
> > into the CFA. If it were in assembly, the CFA could be set to any
> > address where a valid assembly routine exists. I.e., for assembly
> > that works that way, the CFA can be set to the exact starting point
> > of the after-DOES> code. It just jumps there using the CFA value.

> [...]

> > Since mine is in C, the CFA must point to a valid C routine.
> > I can't place a C routine inlined with the after-DOES> code and
> > call that routine. I don't have a built-in C compiler in my Forth
> > interpreter. So, I can't create address values at runtime for the
> > CFA. All CFA values must be known at compile time. This means I
> > have a limited and fixed set of legitimate CFA values, unlike an
> > assembly implementation which could have a huge number of them.
>
> But since you have a bound set of CFA routines, instead of storing
> their address in the CFA, you can store their addresses in a short
> table and point to an index of their addresses in the CFA, as int.
>
> Then the inner interpreter executes out of the CFA routine table for
> CFA values under, say, 256, and knows that it's a DOES> address if the
> CFA value is 256 or bigger. So you've got an implicitly bit threaded
> CFA on something like:
> (cfa & ~0FFh)

Yes, would work. But, I'd prefer not to do it that way. I did something
similar with just 0 as a trap value.

> As long as you have LAST then you don't need a smudge ~ just don't
> link into the dictionary at ":", but instead use ";" to link into the
> dictionary using LAST. You'll need a LAST>LFA word or sequence like
> your LAST>CFA sequence, and that'll be set.

Currently, CREATE calls (CREATE) prior to COMPILE DOVAR.
(CREATE) sets the link and other fields.

> > The sequence "LAST @ CFA !" stores DODOES into the CFA
> > overwriting a DOVAR put there via CREATE. The sequence "R> ,"
> > stores the address of the after-DOES> code in the PFA.
>
> You can't do that.

I'm currently doing that ... So, why can't I?

> You need to make your CFA two int's long, leave one
> alone when you create it, and make the PFA start the
> second cell after the CFA starts.

Why do I need the extra cell?

> In short, you are recreating the <BUILDS ... DOES> behavior, but if
> you have that behavior, then DOES> *cannot pair with CREATE anymore*,
> not unless CREATE has an empty slot waiting for the address, so that
> approach is quite definitely non-compliant without the padding of the
> CREATE structure.

What do you mean by DOES> cannot pair with CREATE?

Also, do you mean when compiling a definition or interpreting interactive
input? If interpreting interactive input, is DOES> allowed to be used
outside a definition? fig-Forth indicates DOES> is compile only:

"A word which defines the run-time action within a high-level
defining word. ..."

So far, I'm recreating the fig-Forth behavior for CREATE ... DOES> or so I
think but not exactly ... Is that the same as the <BUILDS ... DOES>
behavior? fig-Forth has both CREATE and <BUILDS . I haven't implemented
fig-Forth like <BUILDS and have no plans to do so. I did do CREATE and
DOES> based on fig-Forth.

> Its either creating a two-int structure at the point of the CFA and
> the PFA for CREATED words at an offset of two ints, or else some form
> of encoding in the entry in the CFA field. You can't do what you are
> doing now and have a Forth94 compliant system.

Ok. Well, it's nowhere near Forth94 yet.

Rod Pemberton

BruceMcF

Dec 19, 2011, 12:38:28 PM

On Dec 19, 5:26 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

>>> The sequence "LAST @ CFA !" stores DODOES into the CFA
>>> overwriting a DOVAR put there via CREATE. The sequence "R> ,"
>>> stores the address of the after-DOES> code in the PFA.

It's been over a quarter century since I last used these archaic terms.
Remind me what PFA is an abbreviation for? That's where the xt's go in
the compiled Forth definitions, right?

CFA:
Address-of-routine-to-perform-following
PFA:
Address-of-word1
Address-of-word2
...

So in your fig-Forth style CONSTANT, that would be:

CFA:
DOCON
PFA:
constant-value

And VARIABLE is:

CFA:
DOVAR
PFA:
variable-contents

This would be easier if the whole system was bit/token-threaded ~ say,
if xt's under 32 or 64 or 128 were reserved for primitives and larger
xt's were addresses of compiled definitions. You say you have few
enough primitives for that to work, and in this context it would allow
you to reduce your primitives even more.

So you can just have:

CFA:
Address of HL routine | token for primitive

Then if the HL routine call places the following address on the top of
the return stack, "DOVAR" and "DOCON" are:

DOCON:
.word RTO
.word FETCH
.word EXIT

DOVAR:
.word RTO
.word EXIT

BruceMcF

Dec 19, 2011, 12:11:14 PM

On Dec 19, 5:24 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> "BruceMcF" <agil...@netscape.net> wrote in message

>>> I have a CFA routine which I call DODOES. It gets the address for
>>> the "code stub in the word that contained DOES> to start up the
>>> interpreting of the following section" which I'll refer to as
>>> after-DOES> address hereafter.

>> Where is that address for the DOES> segment stored? Under
>> <BUILD ... DOES>
>> ... a DODOES was stored in the CFA, the address of the DOES
>> segment was the first data entry, and DOES> executed the first data
>> entry in the built word and returned the address of the second data
>> entry in the built word..

> That sounds like what I'm describing. It's fig-Forth based. fig-Forth has
> both <BUILDS and CREATE. I've not implemented the <BUILDS and
> don't plan to. I did a fig-Forth like CREATE or so I thought ...

But fig-Forth DOES> is designed to work after a <BUILDS ... CREATE is
only used for things like structures that return the start of their
data address.

You have to modify either the behavior of the fig-Forth DOES> or
modify the behavior of the fig-Forth CREATE if you want the two to
work together properly.

> > The innovation in moving from <BUILDS ... DOES> to
> > CREATE ... DOES> was DOES> setting things up so that the
> > address for the DOES> segment in setting it up so that address
> > *is* the CFA entry,

> Ok.

> Perhaps, I may have merged the two concepts or implemented more of a <BUILDS
> instead of a CREATE. This may be because of following the fig-Forth way, or
> because of the limitations I was seeing for runtime values for the CFA using
> C.

You implemented a CREATE, and then implemented a DOES> that assumed it
is being used with a <BUILDS

>> so there is no difference in size with a regular CREATEd word,
>> so you can CREATE in one word or on the command line and
>> then execute DOES> in some following word.

> Ok.

> fig-Forth indicates DOES> is compile only. I guess Forth moved away from
> that.

No, Forth94 DOES> is allowed to be, and normally is, compile-only. The
standard only specifies the compile-time behavior of DOES> and the
run-time behavior of what it compiles.

It's CREATE that is, and has always been even back to fig-Forth days,
usable from the command line.
: does-count DOES> COUNT ;
CREATE csv-sep$ does-count
3 C, CHAR " C, CHAR , C, CHAR " C,

The DOES> is executed when does-count is executed, it modifies
csv-sep$ so it returns a ca,u style reference instead of a counted
string style reference.

>> The innovation in moving from <BUILDS ... DOES> to CREATE ... DOES>
>> was DOES> setting things up so that the address for the DOES> segment
>> in setting it up so that address *is* the CFA entry, so there is no
>> difference in size with a regular CREATEd word, so you can CREATE in
>> one word or on the command line and then execute DOES> in some
>> following word.

> I'll have to think about whether or not I can find a solution to do those
> two things (address in CFA no PFA and same size PFA as normal CREATEd
> word) with the constraints I'm having on CFA values. I'm not sure that I
> can.
> However, I'm not entirely clear on why those cited situations won't work.
> The functionality of DOES> is independent of the other word(s). So, why
> does it matter if DOES> is used in a different word? or interactively? I
> could probably use a couple of examples. I may have built something
> different from either <BUILDS or CREATE ...

See above. If I have a $, word that compiles a string to the
dictionary, I can also do:

CREATE csv-sep$ PARSE | ","| $, does-count

And I can define a word that takes a ca,u string and builds either a
counted or a cell-counted string based on how big a string I hand to
it. I think that's something like:

: does-count DOES> COUNT ;
: does-cell-count DOES> DUP CELL+ SWAP @ ;

127 CONSTANT Max-Char-Counted \ This is 255 on many systems

: constant$ ( ca u "name" -- )
CREATE DUP Max-Char-Counted U>
IF DUP , does-cell-count ELSE DUP C, does-count THEN \ keep u for the copy
HERE OVER CHARS ALLOT SWAP CHARS MOVE ;

But in any event, you can only put the does part inside a conditional
if it's been factored out to its own word.

>>> The after-DOES> address is compiled by DOES> into a word's
>>> PFA. DOES> also compiles DODOES into the CFA.

>> Then I assume you have to pad CELL with a spare cell and return
>> the address two cells after the CFA as the data field address.

> No, I'm not.

I didn't speculate if you were doing that. I was describing what you
have to do in that case.

Elizabeth D. Rather

Dec 19, 2011, 1:38:05 PM

On 12/19/11 12:24 AM, Rod Pemberton wrote:
...

> In my post to Ms. Rather, it seems a few examples are working, which I
> thought shouldn't be working due to intermediate COMMA's. So, I'm going to
> have to more thoroughly review what is actually being done. I.e., from
> reading my code, I think the address for the after-DOES> code should either
> be shifted to a wrong location, or it should overwrite data in the PFA area.
>
>> so there is no difference in size with a regular CREATEd word,
>> so you can CREATE in one word or on the command line and
>> then execute DOES> in some following word.
>
> Ok.
>
> fig-Forth indicates DOES> is compile only. I guess Forth moved away from
> that.

Despite the recent discussion, DOES> is still specified as compile-only.

Elizabeth D. Rather

Dec 19, 2011, 1:49:18 PM

On 12/19/11 7:38 AM, BruceMcF wrote:
> On Dec 19, 5:26 am, "Rod Pemberton"<do_not_h...@noavailemail.cmm>
> wrote:
>
>>>> The sequence "LAST @ CFA !" stores DODOES into the CFA
>>>> overwriting a DOVAR put there via CREATE. The sequence "R> ,"
>>>> stores the address of the after-DOES> code in the PFA.
>
> Its been over a quarter century since I last used these archaic terms.
> Remind me what PFA is an abbreviation for? That's where the xt's go in
> the compiled Forth definitions, right?
>
> CFA:
> Address-of-routine-to-perform-following
> PFA:
> Address-of-word1
> Address-of-word2
> ...

Well, to be absolutely correct, PFA="Parameter Field *Address*" which
means the address *of* the parameter field of the word being referenced.
That is what you store in the parameter field of a word being compiled
in an ITC implementation, for example. The *parameter field* is where
the payload of a definition goes.

Similarly, CFA="Code Field Address" and is the address *of* the part of
a definition that contains a pointer to the code to be executed, that
is, the *code field*.
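
In a C-hosted Forth the distinction between a field and its address might be sketched like this. The struct is a hypothetical header with the name field omitted and a fixed two-cell parameter field, purely for illustration.

```c
#include <stdint.h>

typedef intptr_t cell;
typedef void (*code_routine)(cell *pfa);

/* Hypothetical dictionary entry; the fields are storage, not addresses. */
struct definition {
    struct definition *link;  /* link field */
    code_routine code;        /* code field: pointer to the code to run */
    cell parameters[2];       /* parameter field: the payload */
};

/* CFA: the address *of* the code field. */
code_routine *cfa_of(struct definition *d) { return &d->code; }

/* PFA: the address *of* the parameter field. */
cell *pfa_of(struct definition *d) { return d->parameters; }
```

So the CFA and PFA are what cfa_of() and pfa_of() return, while the code field and the parameter field are the storage those addresses point at.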

But lots of people get this wrong :-)

BruceMcF

Dec 19, 2011, 1:53:02 PM

On Dec 19, 12:11 pm, BruceMcF <agil...@netscape.net> wrote:
> CREATE csv-sep$ PARSE | ","| $, does-count

Sheesh, I type on a box without a Forth at hand and write drivel like
that.

CREATE csv-sep$ CHAR | PARSE ","| $, does-count

I must have a:
: PARSE| [CHAR] | PARSE ;

... lurking in some dark corner of my memory ~ handy for strings with
" in them, especially on a small system without a S\" at hand.

A. K.

Dec 19, 2011, 3:34:30 PM

On 19.12.2011 19:38, Elizabeth D. Rather wrote:
>
> Despite the recent discussion, DOES> is still specified as compile-only.
>
> Cheers,
> Elizabeth
>

IIRC there is no 'compile-only' specification at all in the standard.

And for DOES> it just states that "interpretation semantics for this
word are undefined."

Of course this makes life easier.. ;-)

Andreas

BruceMcF

Dec 19, 2011, 3:51:48 PM

On Dec 19, 3:34 pm, "A. K." <a...@nospam.org> wrote:
> On 19.12.2011 19:38, Elizabeth D. Rather wrote:
> > Despite the recent discussion, DOES> is still specified as compile-only.

> IIRC there is no 'compile-only' specification at all in the standard.

> And for DOES> it just states that "interpretation semantics for this
> word are undefined."

Which would then be what someone means when they say that something is
"specified" as compile-only ~ there is only a specification for what
it does when compiling.

BruceMcF

Dec 19, 2011, 4:04:15 PM

On Dec 19, 1:49 pm, "Elizabeth D. Rather" <erat...@forth.com> wrote:
> On 12/19/11 7:38 AM, BruceMcF wrote:
>
> > On Dec 19, 5:26 am, "Rod Pemberton"<do_not_h...@noavailemail.cmm>
> > wrote:
>
> >>>> The sequence "LAST @ CFA !" stores DODOES into the CFA
> >>>> overwriting a DOVAR put there via CREATE. The sequence "R> ,"
> >>>> stores the address of the after-DOES> code in the PFA.
>
> > Its been over a quarter century since I last used these archaic terms.
> > Remind me what PFA is an abbreviation for? That's where the xt's go in
> > the compiled Forth definitions, right?

> > CFA:
> > Address-of-routine-to-perform-following
> > PFA:
> > Address-of-word1
> > Address-of-word2
> > ...

> Well, to be absolutely correct, PFA="Parameter Field *Address*" which
> means the address *of* the parameter field of the word being referenced.
> That is what you store in the parameter field of a word being compiled
> in an ITC implementation, for example. The *parameter field* is where
> the payload of a definition goes.

In the above, in assembler style, CFA and PFA as labels are the
addresses themselves, not variables containing those addresses. Maybe
"ActualCFA:" and "ActualPFA:"

> Similarly, CFA="Code Field Address" and is the address *of* the part of
> a definition that contains a pointer to the code to be executed, that
> is, the *code field*.

I recalled CFA, and thought I'd recalled PFA, but I've seen enough
different implementation styles and header structure since I last
looked at an ITC to be unsure.

I think that for Rod's, DOCREATE should be the equivalent of:
R> CELL+ EXIT

and DODOES should be the equivalent of:
R> DUP @ >R CELL+ EXIT

... to avoid stepping on the address of target DOES> segment.

Coos Haak

Dec 19, 2011, 4:23:43 PM

Op Mon, 19 Dec 2011 05:25:17 -0500 schreef Rod Pemberton:

> "Elizabeth D. Rather" <era...@forth.com> wrote in message
> news:E5qdnak5ntDqs3PT...@supernews.com...

<snip>

>> Actually, there are more defining words that use , between the CREATE
>> and DOES> than those that don't.
>>
>>: CONSTANT ( x -- ) CREATE , DOES> ( -- x ) @ ;
>
> Interesting... I'm using that definition and it works. I guess I need to
> review why that is working, since I currently think that the COMMA should
> shift the PFA field where the address to the after-DOES> code is stored.
>

What do you mean by shift? COMMA is simply
: , HERE 1 CELLS ALLOT ! ;
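
For a Forth hosted in C, the same definition can be sketched as below. This is only an illustration: the array dict, the index dp and the helper names are hypothetical and are not taken from any implementation in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t cell;

#define DICT_CELLS 64
static cell dict[DICT_CELLS];   /* hypothetical dictionary space */
static size_t dp = 0;           /* dictionary pointer, counted in cells */

/* HERE: index of the next free cell */
static size_t here(void) { return dp; }

/* ALLOT, restricted to whole cells for this sketch */
static void allot_cells(size_t n)
{
    assert(dp + n <= DICT_CELLS);
    dp += n;
}

/* , ( x -- )  the equivalent of  HERE 1 CELLS ALLOT ! */
static void comma(cell x)
{
    size_t addr = here();
    allot_cells(1);
    dict[addr] = x;
}
```

Each call appends one cell and advances the dictionary pointer by one cell; nothing already in the dictionary moves.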

> (I think this may be the first time ever that I need to find out why
> something is working, instead of not working.)
>
>>: 2CONSTANT ( x2 x1 -- ) CREATE , , DOES> ( -- x2 x1 ) 2@ ;
>>
>
> I don't have 2@ yet. Let me try DUP 4 + @ SWAP @ ... That works! Yes,

Correct, but you might want to use CELL+ instead of '4 +' even if
: CELL+ 4 + ;
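
In a C-hosted system the point about CELL+ shows up as pointer arithmetic, which scales by the cell size without a hard-coded 4. A small sketch, assuming cells are intptr_t; the names cell_plus and two_fetch are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t cell;

/* CELL+ : pointer arithmetic already scales by sizeof(cell). */
static cell *cell_plus(cell *addr) { return addr + 1; }

/* 2@ ( a-addr -- x1 x2 ): x2 (top of stack) is at a-addr,
   x1 is in the next cell. Same result as DUP CELL+ @ SWAP @ */
static void two_fetch(cell *addr, cell *x1, cell *x2)
{
    *x1 = *cell_plus(addr);
    *x2 = *addr;
}
```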

<snip>
>
> Rod Pemberton

--
Coos

CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html

BruceMcF

Dec 19, 2011, 4:25:18 PM

On Dec 19, 1:49 pm, "Elizabeth D. Rather" <erat...@forth.com> wrote:

> Well, to be absolutely correct, PFA="Parameter Field *Address*" which
> means the address *of* the parameter field of the word being referenced.
> That is what you store in the parameter field of a word being compiled
> in an ITC implementation, for example. The *parameter field* is where
> the payload of a definition goes.

Aha, I have been thinking of a DTC. The ITC for a colon definition is:

CFA: .word DOCOL
PFA: .word PF
PF: .word name1,...,name_n,EXIT

So DODOES *does* have to do the equivalent of:

R> DUP @ >R CELL+ EXIT

... as I thought (except in C for Rod's implementation), but there's
no problem with overwriting the "Parameter Field" because the PFA
holds the address of the start of the DOES> segment. The layout of the
CREATEd word before DOES> executes is:

CFA: .word DOCREATE
PFA: .word PF
PF: ; target of DP when CREATE completes

... and DOCREATE does the C-equivalent of:
R> CELL+ EXIT

After DOES> executes, then the same structure becomes:

CFA: .word DODOES
PFA: .word does-segment
PF: ; whatever was written into the dictionary after CREATE
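
A minimal C sketch of that CFA/PFA/PF layout follows. It is a toy model under stated assumptions, not Rod's code: code fields hold small integer tags instead of routine addresses, "addresses" are indices into the array mem, and the does-segment is run by C recursion rather than through a Forth return stack. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t cell;

/* Code-field values; a real system would store routine addresses here. */
enum { DOCREATE, DODOES, P_FETCH, P_EXIT };

static cell mem[64];            /* dictionary, indexed in cells */
static cell dp = 0;             /* dictionary pointer */
static cell stack[16];          /* data stack */
static int sp = 0;

static void push(cell x) { assert(sp < 16); stack[sp++] = x; }
static cell pop(void) { assert(sp > 0); return stack[--sp]; }
static void comma(cell x) { assert(dp < 64); mem[dp++] = x; }

/* CREATE: lay down the CFA and the reserved PFA cell, return the CFA index. */
static cell create(void)
{
    cell cfa = dp;
    comma(DOCREATE);
    comma(0);                   /* PFA cell, filled in later by DOES> */
    return cfa;
}

/* Run-time action of DOES>: patch the CFA and the PFA cell of word cfa. */
static void does(cell cfa, cell segment)
{
    mem[cfa] = DODOES;
    mem[cfa + 1] = segment;
}

/* Execute the word whose code field is at index cfa. */
static void execute(cell cfa)
{
    switch (mem[cfa]) {
    case DOCREATE:
        push(cfa + 2);                      /* address of the parameter field */
        break;
    case DODOES: {
        cell ip = mem[cfa + 1];             /* start of the does-segment */
        push(cfa + 2);
        while (mem[mem[ip]] != P_EXIT)      /* run xts until EXIT */
            execute(mem[ip++]);
        break;
    }
    case P_FETCH: {
        cell addr = pop();
        push(mem[addr]);
        break;
    }
    default:                                /* P_EXIT: nothing to do */
        break;
    }
}
```

Building a word with create(), doing comma(100), and then calling does() with the index of a two-cell segment holding the xts of @ and EXIT gives a word that pushes 100 when executed. The 100 stays in the parameter field because DOES> only touches the CFA and the reserved cell.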

Andrew Haley

Dec 20, 2011, 4:00:10 AM

BruceMcF <agi...@netscape.net> wrote:

> On Dec 19, 1:49 pm, "Elizabeth D. Rather" <erat...@forth.com> wrote:
>> Well, to be absolutely correct, PFA="Parameter Field *Address*" which
>> means the address *of* the parameter field of the word being referenced.
>> That is what you store in the parameter field of a word being compiled
>> in an ITC implementation, for example. The *parameter field* is where
>> the payload of a definition goes.
>
> Aha, I have been thinking of a DTC. The ITC for a colon definition is:
>
> CFA: .word DOCOL
> PFA: .word PF

There is no need for this extra word in ITC. The only difference with
DTC is that in DTC the CFA points to JMP DOCOL not .word DOCOL .

> PF: .word name1,...,name_n,EXIT
>
> So DODOES *does* have to do the equivalent of:
> R> DUP @ >R CELL+ EXIT
> ... as I thought (except in C for Rod's implementation), but there's
> no problem with overwriting the "Parameter Field" because the PFA
> holds the address of the start of the DOES> segment.

A child of <BUILDS DOES> needs an extra cell.

Andrew.

Andrew Haley

Dec 20, 2011, 4:04:28 AM

I've been thinking about this, and it seems to me like the core
problem is trying to do this in pure Standard C. If you allowed
yourself the freedom to use a few machine-dependent features it'd be
much easier and much cleaner. Sure, there would be a portability
layer, but it needn't be a huge thing.

Andrew.

BruceMcF

Dec 20, 2011, 7:49:19 AM

Rod Pemberton wrote:
> "BruceMcF" <agi...@netscape.net> wrote in message
> news:1970f26a-92f9-48b8...@n39g2000yqh.googlegroups.com...

> > You can't do that.

> I'm currently doing that ... So, why can't I?

Oops, I was wrong. It's been too long since I thought about Indirect
Threading ... in the direct threading model I was thinking about,
there isn't really a PFA, just a Code Field and a Parameter Field,
with the Code Field containing machine code, a la Camel.

With ITC, you certainly can use the PFA to hold the address of the
does-segment, without overwriting the original Parameter Field. The
*address of* the PFA does double duty ~ you push the next address onto
the data stack, and the contents of that address onto the return
stack. Where the address of the PFA comes from depends I guess on your
implementation details: it could be inferred from the address of the
CFA in "W", it could be on the return stack.

Given your objectives, it seems to make sense to define VARIABLE and
CONSTANT in the ASCII file, as:

: VARIABLE CREATE 0 , ;
: CONSTANT CREATE , DOES> @ ;

BruceMcF

Dec 20, 2011, 7:40:28 AM

Andrew Haley wrote:
> A child of <BUILDS DOES> needs an extra cell.

Yes, but I have been thinking about how a Direct Threaded
implementation avoids that when doing CREATE ... DOES> not about how
an Indirect Threaded Implementation avoids that ... an ITC has a CFA,
a PFA, and the Parameter Field itself. So the DODOES finds the address
of its does-segment in the PFA without overwriting the original
Parameter Field, it just has to infer the address of the Parameter
Field from the address of the PFA and push it on the data stack before
launching the inner interpreter on the does-segment.

Rod Pemberton

Dec 20, 2011, 10:49:21 AM

"BruceMcF" <agi...@netscape.net> wrote in message

news:d90ad0d0-c722-44ea...@32g2000yqp.googlegroups.com...

> On Dec 19, 5:26 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
>
> >>> The sequence "LAST @ CFA !" stores DODOES into the CFA
> >>> overwriting a DOVAR put there via CREATE. The sequence "R> ,"
> >>> stores the address of the after-DOES> code in the PFA.
>
> It's been over a quarter century since I last used these archaic terms.
> Remind me what PFA is an abbreviation for? That's where the xt's go in
> the compiled Forth definitions, right?
>

I'm not quite clear on what an "xt" is. It seems to have a slightly
different meaning from the old Forths where they are just an address
like below:

> CFA:
> Address-of-routine-to-perform-following
> PFA:
> Address-of-word1
> Address-of-word2
> ...

Yes. As Ms. Rather noted, I also use PFA to refer to the PF area. I.e.,
one could think of PFA as "parameter field area" instead of "parameter field
address". Each "Address-of-word" points to a CFA. Each CFA has the
appropriate routine to handle that word.

> So in your fig-Forth style CONSTANT, that would be:
>
> CFA:
> DOCON
> PFA:
> constant-value

In that notation, it'd be

CFA:
DOCON
PFA:
constant-value
EXIT

Compiled high-level words use ENTER in CFA and EXIT compiled at the end,
like DOCOL and NEXT. I don't currently recall why my constant has an EXIT.
From the code, it doesn't seem to be needed, i.e., there are no DOCON's that
continue execution after the constant ... I seemed to have originally named
DOCON as DOLIT. In that case, I may have expected a DOLIT word to be
compiled into some other word.

> And VARIABLE is:
>
> CFA:
> DOVAR
> PFA:
> variable-contents

Yes. DOVAR doesn't have the EXIT.

> This would be easier if the whole system was bit/token-threaded ~ say,
> if xt's under 32 or 64 or 128 were reserved for primitives and larger
> xt's were addresses of compiled definitions. You say you have few
> enough primitives for that to work, and in this context it would allow
> you to reduce your primitives even more.
>
> So you can just have:
>
> CFA:
> Address of HL routine | token for primitive
>

I think the primitives are fine. They are function pointers to C routines
and are valid in the CFA field just like DOCOL/ENTER, DOCON, DOVAR, etc.
So, I don't see why I would switch them to tokens.

AIUI, the issue with what you described previously about CREATE ... DOES> is
that I cannot fill a CFA field with an address at runtime with a routine
that is unknown at compile time to the C compiler.

> Then if the HL routine call places the following address on the top of
> the return stack, "DOVAR" and "DOCON" are:
>
> DOCON:
> .word RTO
> .word FETCH
> .word EXIT
>
> DOVAR:
> .word RTO
> .word EXIT
>

RTO? RTO must place the IP or IP+1 or W on the return stack?

Rod Pemberton

Rod Pemberton

Dec 20, 2011, 10:53:00 AM

"Coos Haak" <chf...@hccnet.nl> wrote in message
news:en7jqy57a3cf$.1oo4tb5zpev7n$.dlg@40tude.net...

> Op Mon, 19 Dec 2011 05:25:17 -0500 schreef Rod Pemberton:
>
> > "Elizabeth D. Rather" <era...@forth.com> wrote in message
> > news:E5qdnak5ntDqs3PT...@supernews.com...
> <snip>
> >> Actually, there are more defining words that use , between the CREATE
> >> and DOES> than those that don't.
> >>
> >>: CONSTANT ( x -- ) CREATE , DOES> ( -- x ) @ ;
> >
> > Interesting... I'm using that definition and it works. I guess I need
> > to review why that is working, since I currently think that the COMMA
> > should shift the PFA field where the address to the after-DOES> code
> > is stored.
> >
>
> What do you mean with shift? COMMA is simply
> : , HERE 1 CELLS ALLOT ! ;

COMMA allocates and stores a value in the dictionary. My CREATE ... DOES>
is storing a pointer to the DOES> code in my PF area, not my CFA. E.g.,
like so:

CFA: address of DOCOL
PF field 1: address
PF field 2: address
PF field 3: address
...

The DOES> address I believe is being retrieved from PF field 1. The DOES>
address I believe is being stored at the current dictionary pointer location
which could be PF field 1, or PF field 2, or ... So, if a COMMA is between
CREATE and DOES>, I'd expect it to shift where the DOES> address is stored,
but not from where it is retrieved. I'd also expect the possibility that
something would overwrite PF field where the DOES> address is being stored,
or the exact opposite. Any of those would break my implementation.
However, CREATE ... DOES> is functioning correctly, at least for some
trivial test cases. So, I'll have to find out why it's working. I'm not
sure why what I expected from my code and what is going on are different
yet.

Rod Pemberton

Rod Pemberton

Dec 20, 2011, 10:53:54 AM

"BruceMcF" <agi...@netscape.net> wrote in message

news:a186cfca-6955-48f1...@i6g2000vbe.googlegroups.com...

> The ITC for a colon definition is:
>
> CFA: .word DOCOL
> PFA: .word PF
> PF: .word name1,...,name_n,EXIT

I have not explicitly stored the PFA address. PF just continues from CFA.

CFA: .word DOCOL

PF: .word name1,...,name_n,EXIT

> So DODOES *does* have to do the equivalent of:
> R> DUP @ >R CELL+ EXIT
> ... as I thought (except in C for Rod's implementation),

I think it's slightly different, but don't know for sure. I've replied to
this in another post.

> but there's
> no problem with overwriting the "Parameter Field" because the
> PFA holds the address of the start of the DOES> segment.

Well, I have no PFA field. I was using PFA to refer to PF area as Ms.
Rather noted many do. That's why I need to track down what is going on.

> The layout of
> the CREATEd word before DOES> executes is:
>
> CFA: .word DOCREATE
> PFA: .word PF
> PF: ; target of DP when CREATE completes
>
> ... and DOCREATE does the C-equivalent of:
> R> CELL+ EXIT
>
> After DOES> executes, then the same structure becomes:
>
> CFA: .word DODOES
> PFA: .word does-segment
> PF: ; whatever was written into the dictionary after CREATE

Adding the PFA seems to be the "padding" solution you mentioned earlier.

Rod Pemberton

Rod Pemberton

Dec 20, 2011, 10:57:05 AM

"BruceMcF" <agi...@netscape.net> wrote in message

news:5148d2e6-54e0-4168...@i6g2000vbe.googlegroups.com...

> I think that for Rod's, DOCREATE should be the equivalent of:
> R> CELL+ EXIT

Hmm, I don't have a DOCREATE. Merits of DOCREATE are?

My CREATE COMPILEs a DOVAR currently.

> and DODOES should be the equivalent of:
> R> DUP @ >R CELL+ EXIT

I'm not quite sure of the exact C to Forth for my DODOES primitive. But, I
think it would be equivalent to (untested):

R> DUP CELL+ @ >R EXIT

Rod Pemberton

Rod Pemberton

Dec 20, 2011, 10:58:02 AM

"BruceMcF" <agi...@netscape.net> wrote in message

news:9933c2d5-8a97-468d...@d17g2000yql.googlegroups.com...

> Rod Pemberton wrote:
> > "BruceMcF" <agi...@netscape.net> wrote in message
> > news:1970f26a-92f9-48b8...@n39g2000yqh.googlegroups.com...
>
> > > You can't do that.
>
> > I'm currently doing that ... So, why can't I?
>
> Oops, I was wrong.

Ok, I don't have a PFA, just PF. This was brought up in a different post by
you. I've replied to this there.

> Given your objectives, it seems to make sense to define VARIABLE and
> CONSTANT in the ASCII file, as:
>
> : VARIABLE CREATE 0 , ;
> : CONSTANT CREATE , DOES> @ ;

Yes, I already use those in the ASCII file.

Rod Pemberton

BruceMcF

Dec 20, 2011, 11:20:59 AM

On Dec 20, 10:49 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> I'm not quite clear on what an "xt" is. It seems to have a slightly
> different meaning from the old Forths where they are just an address
> like below:

In most implementations it's an address, in some implementations it's an
offset from a base address, in some implementations it's some kind of
token. Normally it's compiled with "," but sometimes it has a special
form in the compiled definition, as in bit threaded implementations
where either a high or low bit is set or reset to show an address of a
compiled definition versus an address of code.
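
Both flavours can be sketched in C. This is an illustration only: the limit of 128 is one of the figures floated earlier in the thread, the low-bit convention assumes cell-aligned addresses, and every name here is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t xt_t;

/* Token style: small values are primitive numbers, anything else
   is the address of a compiled definition. */
#define XT_PRIMITIVE_LIMIT 128u

static int xt_is_primitive(xt_t xt) { return xt < XT_PRIMITIVE_LIMIT; }

/* Bit-threaded style: the low bit marks a compiled definition.
   Assumes addresses are at least 2-byte aligned, so bit 0 is free. */
static xt_t xt_tag_colon(uintptr_t addr) { return addr | (uintptr_t)1; }
static int xt_is_colon(xt_t xt) { return (xt & 1u) != 0; }
static uintptr_t xt_address(xt_t xt) { return xt & ~(uintptr_t)1; }
```

Either way the inner interpreter can tell from the xt itself what kind of thing it is, without looking at a code field.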

> > CFA:
> > Address-of-routine-to-perform-following
> > PFA:
> > Address-of-word1
> > Address-of-word2
> > ...

> Yes. As Ms. Rather noted, I also use PFA to refer to the PF area. I.e.,
> one could think of PFA as "parameter field area" instead of "parameter field
> address".

OK, that should just be "PF" then. If you had an actual PFA, there'd
be no problem.

> Compiled high-level words use ENTER in CFA and EXIT compiled at the end,
> like DOCOL and NEXT. I don't currently recall why my constant has an EXIT.
> From the code, it doesn't seem to be needed, i.e., there are no DOCON's that
> continue execution after the constant ... I seemed to have originally named
> DOCON as DOLIT. In that case, I may have expected a DOLIT word to be
> compiled into some other word.

Except DOLIT does not work like that, since with DOLIT

>> This would be easier if the whole system was bit/token-threaded ~ say,
>> if xt's under 32 or 64 or 128 were reserved for primitives and larger
>> xt's were addresses of compiled definitions. You say you have few
>> enough primitives for that to work, and in this context it would allow
>> you to reduce your primitives even more.

>> So you can just have:

>> CFA:
>> Address of HL routine | token for primitive

> I think the primitives are fine. They are function pointers to C routines
> and are valid in the CFA field just like DOCOL/ENTER, DOCON, DOVAR, etc.
> So, I don't see why I would switch them to tokens.

If space is no object, then adopt an actual CFA/PFA/PF model. If space
is an issue, tokens are more compact. Either way eliminates the
problem of the code after CREATE filling up the first cell of the
Parameter Field and then DOES> over-writing that first cell.

> AIUI, the issue with what you described previously about CREATE ... DOES> is
> that I cannot fill a CFA field with an address at runtime with a routine
> that is unknown at compile time to the C compiler.

Yes. One option, if you tokenize the CFA, then you "know" that
anything in the CFA that is not a token is an address of a DOES>
segment.

Or, second option, if you have a CFA/PFA/PF structure, then you
already have the space set aside.

And, third option, you bump up the cell that DOCREATE returns and have
CREATE do a 1 CELLS ALLOT at the end, to reserve space for DODOES to
store its parameter.

So there are multiple ways to handle it, but you cannot have CREATE leave
the DP pointing to the cell that DOES> is going to later fill in with
the address of the does segment.
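
The failure mode, and the reserved-cell fix, can be shown with a toy model in C. This is a sketch under assumptions: the layout (code field followed directly by the parameter field), the tag values and all names are hypothetical, and it is not taken from Rod's source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t cell;

enum { DOVAR = 1, DODOES = 2 };     /* placeholder code-field values */

static cell mem[32];
static size_t dp = 0;

static void comma(cell x) { assert(dp < 32); mem[dp++] = x; }

/* CREATE: lay down a code field; if reserve is nonzero, also allot
   one cell for DOES> to use later. Returns the index of the CFA. */
static size_t create(int reserve)
{
    size_t cfa = dp;
    comma(DOVAR);
    if (reserve)
        comma(0);
    return cfa;
}

/* Run-time action of DOES>: store the does-segment address in the
   cell that follows the CFA. */
static void does(size_t cfa, cell segment)
{
    mem[cfa] = DODOES;
    mem[cfa + 1] = segment;
}
```

With create(0), a following comma(5) lands in the very cell that does() later overwrites, so the 5 is lost. With create(1), the 5 lands one cell further on and survives.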

Coos Haak

Dec 20, 2011, 1:58:58 PM

Op Tue, 20 Dec 2011 10:53:00 -0500 schreef Rod Pemberton:

You confuse me, what is the word that is started with 'CFA: address of
DOCOL'
Is this the defining word, containing CREATE and DOES> or the defined word
that is built by CREATE and DOES> ?

I would think that, e.g.
: CONST CREATE , DOES> @ ; was laid out roughly like this:

CFA: .word DOCOL
PFA: .word CREATE
.word ,
.word some part of DOES>
DOCONST: .word another part of DOES>
.word @
.word EXIT (or SEMIS as in Figforth)

100 CONST HUNDRED may be laid out as:

CFA: .word DOCONST
PFA: .word 100

BruceMcF

Dec 20, 2011, 2:44:20 PM

On Dec 20, 1:58 pm, Coos Haak <chfo...@hccnet.nl> wrote:
> Op Tue, 20 Dec 2011 10:53:00 -0500 schreef Rod Pemberton:

> > COMMA allocates and stores a value in the dictionary. My CREATE ... DOES>
> > is storing a pointer to the DOES> code in my PF area, not my CFA. E.g.,
> > like so:

> > CFA: address of DOCOL
> > PF field 1: address
> > PF field 2: address
> > PF field 3: address
> > ...

> You confuse me, what is the word that is started with 'CFA: address of
> DOCOL'
> Is this the defining word, containing CREATE and DOES> or the defined word
> that is built by CREATE and DOES> ?

It's an example of the layout of an ordinary compiled definition.

BruceMcF

Dec 20, 2011, 2:50:50 PM

On Dec 20, 10:53 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "Coos Haak" <chfo...@hccnet.nl> wrote in message

> news:en7jqy57a3cf$.1oo4tb5zpev7n$.dlg@40tude.net...
>
> > Op Mon, 19 Dec 2011 05:25:17 -0500 schreef Rod Pemberton:
>

> > > "Elizabeth D. Rather" <erat...@forth.com> wrote in message

> > >news:E5qdnak5ntDqs3PT...@supernews.com...
> > <snip>
> > >> Actually, there are more defining words that use , between the CREATE
> > >> and DOES> than those that don't.
>
> > >>: CONSTANT ( x -- ) CREATE , DOES> ( -- x ) @ ;
>
> > > Interesting... I'm using that definition and it works. I guess I need
> > > to review why that is working, since I currently think that the COMMA
> > > should shift the PFA field where the address to the after-DOES> code
> > > is stored.
>
> > What do you mean with shift? COMMA is simply
> > : , HERE 1 CELLS ALLOT ! ;
>
> COMMA allocates and stores a value in the dictionary. My CREATE ... DOES>
> is storing a pointer to the DOES> code in my PF area, not my CFA. E.g.,
> like so:
>
> CFA: address of DOCOL
> PF field 1: address
> PF field 2: address
> PF field 3: address
> ...

> The DOES> address I believe is being retrieved from PF field 1. The DOES>
> address I believe is being stored at the current dictionary pointer location
> which could be PF field 1, or PF field 2, or ... So, if a COMMA is between
> CREATE and DOES>, I'd expect it to shift where the DOES> address is stored,
> but not from where it is retrieved.

There's no reason why that should change where the address of the DOES>
segment is stored, since the address of the DOES> segment will be
stored at some offset to your (misnamed) CFA (since you don't have an
address of a code field there, but rather the pointer to a C function,
so it properly would be "C Pointer Address", or CPA ~ also might have
better chances of getting employed in this tough job market than a
CFA, since liquidators need CPA's too).

> I'd also expect the possibility that
> something would overwrite PF field where the DOES> address is being stored,
> or the exact opposite. Any of those would break my implementation.
> However, CREATE ... DOES> is functioning correctly, at least for some
> trivial test cases. So, I'll have to find out why it's working. I'm not
> sure why what I expected from my code and what is going on are different
> yet.

Try this one.

: TEST: CREATE 0 , DOES> DUP @ . ;
TEST: THIS
TRUE THIS !
FALSE THIS !

BruceMcF

Dec 20, 2011, 3:23:49 PM

On Dec 20, 10:57 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "BruceMcF" <agil...@netscape.net> wrote in message

> news:5148d2e6-54e0-4168...@i6g2000vbe.googlegroups.com...

> > I think that for Rod's, DOCREATE should be the equivalent of:
> > R> CELL+ EXIT

> Hmm, I don't have a DOCREATE. Merits of DOCREATE are?

You are defining variables and constants in your ASCII file, so you
don't need to have DOVAR or DOCON ...

: VARIABLE CREATE 0 , ;
: 2VARIABLE CREATE 0 , 0 , ;
: CONSTANT CREATE , DOES> @ ;
: 2CONSTANT CREATE , , DOES> 2@ ;

You can have CREATE build a new entry as:

CFA: DODOES
PF: DOES_CREATE

Where:
DOES_CREATE:
EXIT

So the minimum set of C-routines you need for function pointers to
store in CFA's are DOCOL and DODOES.

Adding DOCREATE makes everything defined with CREATE that does *not*
take a DOES> quicker (including variables), since it ignores the first
PF field, pushes a value two integers past the CFA that it was located
in on the stack, and performs a NEXT, so it avoids the ENTER that is
done as a consequence of DOCOL and DODOES.

> My CREATE COMPILEs a DOVAR currently.

Why would you have a routine called DOVAR if you define VARIABLE in
the ASCII file?

> > and DODOES should be the equivalent of:
> > R> DUP @ >R CELL+ EXIT
>
> I'm not quite sure of the exact C to Forth for my DODOES primitive. But, I
> think it would be equivalent to (untested):
>
> R> DUP CELL+ @ >R EXIT

Is the return stack value pointing to the CFA or to the first address
of the parameter field?

It's got to be the latter, since otherwise you'd trample the CFA when
you use a variable, and blow things up.

But that means that:

CFA: DODOES
PF: x1 ; leave address of PF on stack
x2 ; fetch this value and >R to execute DOES> segment

Then it would seem to mean that you'd be passing simple tests because
you are trampling *after* the simple test ~ possibly in space at HERE,
if you test the CREATEd word right away.

So create a 2-cell variable, execute DOES> on it, store to it, then
execute it again and see if it works.

: 2VARIABLE CREATE 0 , 0 , DOES> ;
2VARIABLE A
TRUE TRUE A 2!
A 2@ . .

If your C-code does what you say it does, the results could be
interesting.

Albert van der Horst

Dec 20, 2011, 4:43:33 PM

In article <jcqb56$rai$1...@speranza.aioe.org>,

Rod Pemberton <do_no...@noavailemail.cmm> wrote:
>"BruceMcF" <agi...@netscape.net> wrote in message
>news:5148d2e6-54e0-4168...@i6g2000vbe.googlegroups.com...
>> I think that for Rod's, DOCREATE should be the equivalent of:
>> R> CELL+ EXIT
>
>Hmm, I don't have a DOCREATE. Merits of DOCREATE are?
>
>My CREATE COMPILEs a DOVAR currently.

You would be better off if your thinking went like
"my DOVAR compiles a CREATE currently".
>
>

<SNIP>

>
>
>Rod Pemberton

Groetjes Albert

--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Albert van der Horst

Dec 20, 2011, 4:55:01 PM

In article <J56dnQ6j7rEH023T...@supernews.com>,

Andrew Haley <andr...@littlepinkcloud.invalid> wrote:
>BruceMcF <agi...@netscape.net> wrote:
>> On Dec 19, 1:49 pm, "Elizabeth D. Rather" <erat...@forth.com> wrote:
>>> Well, to be absolutely correct, PFA="Parameter Field *Address*" which
>>> means the address *of* the parameter field of the word being referenced.
>>> That is what you store in the parameter field of a word being compiled
>>> in an ITC implementation, for example. The *parameter field* is where
>>> the payload of a definition goes.
>>
>> Aha, I have been thinking of a DTC. The ITC for a colon definition is:
>>
>> CFA: .word DOCOL
>> PFA: .word PF
>
>There is no need for this extra word in ITC. The only difference with
>DTC is that in DTC the CFA points to JMP DOCOL not .word DOCOL .

A CFA can never point to .word DOCOL. It must point to executable code.
It need not point to a JMP DOCOL. It can point to DOCOL itself.

If you want an example of a clean implementation of ITC -- at the
expense of space and speed --, you may have a look at ciforth
(lina/wina).

>
>> PF: .word name1,...,name_n,EXIT
>>
>> So DODOES *does* have to do the equivalent of:
>> R> DUP @ >R CELL+ EXIT
>> ... as I thought (except in C for Rod's implementation), but there's
>> no problem with overwriting the "Parameter Field" because the PFA
>> holds the address of the start of the DOES> segment.
>
>A child of <BUILDS DOES> needs an extra cell.

To store a pointer to what it DOES.

>
>Andrew.

BruceMcF

Dec 20, 2011, 5:27:16 PM

On Dec 20, 3:23 pm, BruceMcF <agil...@netscape.net> wrote:
> On Dec 20, 10:57 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
>
> > "BruceMcF" <agil...@netscape.net> wrote in message
> >news:5148d2e6-54e0-4168...@i6g2000vbe.googlegroups.com...
> > > I think that for Rod's, DOCREATE should be the equivalent of:
> > > R> CELL+ EXIT
> > Hmm, I don't have a DOCREATE. Merits of DOCREATE are?
>
> You are defining variables and constants in your ASCII file, so you
> don't need to have DOVAR or DOCON ...
>
> : VARIABLE CREATE 0 , ;
> : 2VARIABLE CREATE 0 , 0 , ;
> : CONSTANT CREATE , DOES> @ ;
> : 2CONSTANT CREATE , , DOES> 2@ ;
>
> You can have CREATE build a new entry as:
>
> CFA: DODOES
> PF: DOES_CREATE
>
> Where:
> DOES_CREATE:
> EXIT
>
> So the minimum set of C-routines you need for function pointers to
> store in CFA's are DOCOL and DODOES.

... that is, over and above your actual primitives, which would be:

CFA: pointer to C function of primitive

Your Forth primitives and DODOES and DOCOL and you'd be set.

BruceMcF

Dec 20, 2011, 5:33:56 PM

On Dec 20, 4:55 pm, Albert van der Horst wrote:

> >A child of <BUILDS DOES> needs an extra cell.
> To store a pointer to what it DOES.

Unless the pointer to the does segment can be recognized as something
other than a normal content of a CFA.

Andrew Haley

Dec 21, 2011, 4:01:43 AM

Albert van der Horst <alb...@spenarnc.xs4all.nl> wrote:
> In article <J56dnQ6j7rEH023T...@supernews.com>,
> Andrew Haley <andr...@littlepinkcloud.invalid> wrote:
>>BruceMcF <agi...@netscape.net> wrote:
>>> On Dec 19, 1:49 pm, "Elizabeth D. Rather" <erat...@forth.com> wrote:
>>>> Well, to be absolutely correct, PFA="Parameter Field *Address*" which
>>>> means the address *of* the parameter field of the word being referenced.
>>>> That is what you store in the parameter field of a word being compiled
>>>> in an ITC implementation, for example. The *parameter field* is where
>>>> the payload of a definition goes.
>>>
>>> Aha, I have been thinking of a DTC. The ITC for a colon definition is:
>>>
>>> CFA: .word DOCOL
>>> PFA: .word PF
>>
>>There is no need for this extra word in ITC. The only difference with
>>DTC is that in DTC the CFA points to JMP DOCOL not .word DOCOL .
>
> A CFA can never point to .word DOCOL. It must point to executable code.

DOCOL is executable code. The CFA is the address of the code field,
not the code field itself.

> It need not point to a JMP DOCOL. It can point to DOCOL itself.

Of course, that's normal. What do you think I said?

Andrew.

BruceMcF

Dec 21, 2011, 11:41:35 AM

On Dec 20, 10:58 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

>> Given your objectives, it seems to make sense to define VARIABLE and
>> CONSTANT in the ASCII file, as:

>> : VARIABLE CREATE 0 , ;
>> : CONSTANT CREATE , DOES> @ ;

> Yes, I already use those in the ASCII file.

Then you don't need a DOVAR or a DOCON ~ because when you define a
constant, it will have a DODOES in it, and when you define a variable,
it will have a DOCREATE in it.

As far as the variable bootstrapping problem, for variables like >IN
for your parser, your "primitive variables" can be handbuilt with
DOCREATE or you can define a block of dataspace that is referenced in
the C-source by offset within the block and link the Forth into that
data block with a set of CONSTANT definitions for the addresses within
that data block.

A chunk of that data block can be addresses to bootstrap routines used
to get your parser up and running to interpret the ASCII file, which
are then built on or replaced while the ASCII file is being
interpreted. For instance, if the ASCII file assumes hexadecimal
literals at the outset, your bootstrap literal number interpreter is a
simple mask and shift process on a string of known maximum length so
BASE can be defined in the ASCII file and then the xt of the Forth94
literal conversion routine replaces the xt of the bootstrap literal
number converter in the bootstrap data block.
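
A bootstrap converter of that kind might look like the following C sketch. The assumptions are mine: upper-case digits only, at most 8 digits (a 32-bit cell), an ASCII-compatible character set, and the name hex_literal is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a string of 1 to 8 upper-case hex digits by shift and mask.
   Returns 1 and stores the value on success, 0 on any other input. */
static int hex_literal(const char *s, uint32_t *out)
{
    uint32_t value = 0;
    int digits = 0;

    if (*s == '\0')
        return 0;                       /* empty string is not a number */
    for (; *s != '\0'; ++s, ++digits) {
        unsigned d;
        if (digits == 8)
            return 0;                   /* longer than the known maximum */
        if (*s >= '0' && *s <= '9')
            d = (unsigned)(*s - '0');
        else if (*s >= 'A' && *s <= 'F')
            d = (unsigned)(*s - 'A') + 10u;
        else
            return 0;                   /* not a hex digit */
        value = (value << 4) | (d & 0xFu);
    }
    *out = value;
    return 1;
}
```

There is no BASE and no multiplication here, which is what keeps it small enough to write by hand before the ASCII file has defined anything.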

In this bootstrap approach, it's handy if the ASCII file defines a
SAVESYSTEM so that it can save the resulting kernel.

Rod Pemberton

Dec 21, 2011, 2:56:15 PM

"BruceMcF" <agi...@netscape.net> wrote in message

news:3119f74d-1bb0-4f28...@l29g2000yqf.googlegroups.com...
> [...]

> Try this one.
>
> : TEST: CREATE 0 , DOES> DUP @ . ;
> TEST: THIS
> TRUE THIS !

0

> FALSE THIS !
>

-1

Rod Pemberton

Rod Pemberton

Dec 21, 2011, 4:14:25 PM

"BruceMcF" <agi...@netscape.net> wrote in message

news:325142d5-d312-4e7f...@o9g2000yqa.googlegroups.com...

> On Dec 20, 10:57 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "BruceMcF" <agil...@netscape.net> wrote in message
> > news:5148d2e6-54e0-4168...@i6g2000vbe.googlegroups.com...

...

> > My CREATE COMPILEs a DOVAR currently.
> >
> Why would you have a routine called DOVAR if you define
> VARIABLE in the ASCII file?

CREATE compiles something into the CFA ...

IIRC, descriptions of Forth that use the definitions for CONSTANT and
VARIABLE that I'm using and you posted, compile DODOES and DOVAR. I don't
recall where those definitions came from. They seem to be very prevalent.
Are those eForth definitions? IIRC, fig-Forth compiled DOCON and DOVAR. I
have some notes which say:

CREATE compiles DOVAR (or ENTER DOVAR)
VARIABLE compiles DOVAR (or ENTER DOVAR)
CONSTANT compiles DODOES (or ENTER DODOES)
LITERAL compiles DOLIT (or ENTER DOLIT)

> Is the return stack value pointing to the CFA or to the
> first address of the parameter field?

Uh ...

> Its got to be the latter, since otherwise you'd trample the CFA
> when you use a variable, and blow things up.
>
> But that means that:
>
> CFA: DODOES
> PF: x1 ; leave address of PF on stack
> x2 ; fetch this value and >R to execute DOES> segment

Yeah, not sure ...

I have to get back to you on those.

> Then it would seem to mean that you'd be passing simple tests
> because you are trampling *after* the simple test ~ possibly in
> space at HERE, if you test the CREATEd word right away.
>
> So create a 2-cell variable, execute DOES> on it, store to it,
> then execute it again and see if it works.
>
> : 2VARIABLE CREATE 0 , 0 , DOES> ;
> 2VARIABLE A
> TRUE TRUE A 2!
> A 2@ . .
>
> If your C-code does what you say it does, the results could
> be interesting.
>

I don't have the 2X words. It crashes if I rewrite to double operations
using normal @ and !, e.g., TRUE A ! TRUE A CELL+ ! . It seems that this is a
problem. I can dump my definitions, but it's all in hex. Currently, I've
got an address that I can't match to any Forth word ... I guess I'm going
to have to add name resolution to my dump before I can track down what is
going on.

If I change the two zeros to 5 and 6, it compiles 5 and 6 into PF x1 and x2
respectively. PF x3 is an address in 2VAR. The value at that address does
not match a valid address for DOES> or DODOES or anything related like EXIT
... I think it's broken, or not fully functional. I.e., at a minimum it is
shifting the address that points into 2VAR and I think the primitive gets
it from PF x1 only.

Rod Pemberton

Rod Pemberton

unread,

Dec 21, 2011, 4:14:56 PM12/21/11

to

"BruceMcF" <agi...@netscape.net> wrote in message

news:159725a0-c652-4ce2...@u32g2000yqe.googlegroups.com...

> On Dec 20, 10:58 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:

...

> >> Given your objectives, it seems to make sense to define VARIABLE and
> >> CONSTANT in the ASCII file, as:
> >> : VARIABLE CREATE 0 , ;
> >> : CONSTANT CREATE , DOES> @ ;
>
> > Yes, I already use those in the ASCII file.
>
> Then you don't need a DOVAR or a DOCON ~ because when you define
> a constant, it will have a DODOES in it, and when you define a variable,
> it will have a DOCREATE in it.

That would change VARIABLE from using DOVAR to using DOCREATE, but
CONSTANT still uses DODOES. I asked about DOCREATE elsewhere
and you replied.

> For instance, if the ASCII file assumes hexadecimal
> literals at the outset [...]

Currently, I use decimal in the ASCII file, although I do prefer hex. IIRC,
it should accept any base that C accepts since I'm using C's sscanf()
function to convert text numbers in NUMBER. However, it would be in C's
format, e.g., 0x20 for 32. At some point, that may be converted to Forth
and I may eventually support BASE etc.
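
The C-style number formats mentioned above can be sketched without the C library. A minimal sketch in Python, assuming the conventions of C's `%i` conversion (`0x` prefix for hex, a leading `0` for octal, decimal otherwise). The name `parse_c_number` is made up for illustration, and this is not claimed to be what the interpreter's NUMBER actually does:

```python
def parse_c_number(text):
    """Convert text using C-style prefixes; return None if it is not a number."""
    s = text.lower()
    sign = 1
    if s[:1] in ("+", "-"):
        sign = -1 if s[0] == "-" else 1
        s = s[1:]
    if s.startswith("0x"):
        base, digits = 16, s[2:]
    elif len(s) > 1 and s.startswith("0"):
        base, digits = 8, s[1:]
    else:
        base, digits = 10, s
    allowed = "0123456789abcdef"[:base]
    # Reject empty digit strings and any character outside the base.
    if not digits or any(c not in allowed for c in digits):
        return None
    return sign * int(digits, base)

print(parse_c_number("0x20"))  # 32
print(parse_c_number("DUP"))   # None, so the outer interpreter treats it as a word
```

Returning None for non-numbers mirrors the usual outer-interpreter order: look the token up as a word first, and only then try it as a number.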

> [...] your bootstrap literal number interpreter is a
> simple mask and shift process on a string of known
> maximum length so BASE can be defined in the
> ASCII file and then the xt of the Forth94 literal
> conversion routine replaces the xt of the bootstrap literal
> number converter in the bootstrap data block.

Forth code for parsing numbers is not implemented, yet, if ever. My
perspective - perhaps warped - is if I can use some standard C function
calls as useful primitives, e.g., sscanf(), strcmp(), strcpy(), strlen(),
strchr(), etc., should I try to implement them in Forth? My interpreter is
coded in C and so has access to a full set of C libraries. Although I'm
attempting to minimize C use, some C functions are simple, available, and
very useful, just like DUP or SWAP or 0=, etc.

Rod Pemberton

Rod Pemberton

unread,

Dec 21, 2011, 4:16:46 PM12/21/11

to

"Coos Haak" <chf...@hccnet.nl> wrote in message

news:zrwtxdv3v6p8$.it3pvtej34hz.dlg@40tude.net...

>
> You confuse me, what is the word that is started with 'CFA: address of
> DOCOL'
> Is this the defining word, containing CREATE and DOES> or the defined
> word that is built by CREATE and DOES> ?

That would be the defined word.

I just tested an example by Bruce, I think it's broken.

Rod Pemberton

BruceMcF

unread,

Dec 21, 2011, 5:53:05 PM12/21/11

to

On Dec 21, 4:14 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> CREATE compiles something into the CFA ...

It would be called DOCREA in the fig-Forth approach, if you don't have
a DOVAR because you finish compiling the system out of your ASCII
file.

> IIRC, descriptions of Forth that use the definitions for CONSTANT and
> VARIABLE that I'm using and you posted, compile DODOES and DOVAR.

They'd be either/or ~ if you define CONSTANT and VARIABLE with CREATE
and DOES> then DOCON and DOVAR are redundant, if you don't then you
need DOCON and DOVAR ~ sometimes DOCREA is the same as DOVAR and then
one of the two is redundant, sometimes its not.

> Are those eForth definitions?

They are long standing definitions. Heck, if you know the system in
detail and especially if you have a system monitor, you can bootstrap
from a working interpreter and with CREATE , and C, in the dictionary
and nothing else.

> IIRC, fig-Forth compiled DOCON and DOVAR.

Yes.

> CREATE compiles DOVAR (or ENTER DOVAR)
> VARIABLE compiles DOVAR (or ENTER DOVAR)
> CONSTANT compiles DODOES (or ENTER DODOES)
> LITERAL compiles DOLIT (or ENTER DOLIT)

When the contents at the code field address are the same for CREATE
and VARIABLE, then whether you call it DOVAR or DOCREATE is arbitrary,
isn't it?

But given that you need to have two cells in your CREATEd structure so
it can work after DOES> has done its magic, I'd suggest that you need
a DOCREATE and then can decide whether to have the compiled kernel one
routine longer and include a DOVAR:
: VARIABLE HEADER 'DOVAR , 0 , ;

... and save space in all of your variables. But save that until after
CREATE DOES> works.

> If I change the two zeros to 5 and 6, it compiles 5 and 6 into PF x1
> and x2 respectively. PF x3 is an address in 2VAR. The value at that
> address does not match a valid address for DOES> or DODOES or anything
> related like EXIT ... I think it's broken, or not fully functional.

I'd go House and say an attempted treatment could be the quickest
diagnostic tool. Change CREATE so that it pads an empty cell before it
returns, and change DOES> so that it puts DODOES at the CFA and the
address of the target segment at the address following. If it fixes
the problem, your prior DOES> was trashing the dictionary, just doing
it where it did not show up in the single cell test.
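
The treatment described above can be sketched as a toy model. This is only an illustration in Python, assuming the dictionary is a flat list of cells where an address is a list index. The names `mem`, `create`, `does` and `body`, and the value 9000 standing in for the DOES> segment address, are made up for the example and are not the C-coded interpreter under discussion:

```python
# Toy dictionary: a flat list of cells; an address is a list index.
mem = []

def here():
    return len(mem)

def comma(x):
    mem.append(x)

def create():
    """CREATE: code field plus one reserved cell for a later DOES> link."""
    cfa = here()
    comma("DOCREATE")
    comma(0)              # padding cell, unused until DOES> runs
    return cfa

def does(cfa, does_addr):
    """DOES>: patch the two fixed cells, never touching the body."""
    mem[cfa] = "DODOES"
    mem[cfa + 1] = does_addr

def body(cfa):
    return cfa + 2        # the body starts after the padding cell

# Equivalent of: CREATE A 5 , 6 , DOES> ...
a = create()
comma(5)
comma(6)
does(a, 9000)             # 9000 stands in for the DOES> segment address

print(mem[a:a + 4])       # ['DODOES', 9000, 5, 6]
```

Because the link always lives at CFA+1, the number of cells comma'd between CREATE and DOES> no longer matters, and a second DOES> simply overwrites the same slot.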

BruceMcF

unread,

Dec 21, 2011, 6:14:14 PM12/21/11

to

On Dec 21, 4:14 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> My perspective - perhaps warped - is if I can use some standard C
> function calls as useful primitives, e.g., sscanf(), strcmp(), strcpy(),
> strlen(), strchr(), etc., should I try to implement them in Forth?

It depends on your goals. If your goal is to have a Forth94 system,
you need to sooner or later have a Forth94-compliant interpreter, and
the simplest way to get that is to implement the number parser in
Forth ~ it's a straightforward definition readily available in a
variety of existing implementations.

However, for the bootstrap, it depends on whether you care about kernel
size and dependencies, or not. It's quite possible to write a C-compiler
in Forth that is along the lines of the fig-Forth system in
that the I/O is from the ground up, and you don't *call* any standard
C functions. However, it's certainly more convenient to call a standard
C library, statically bound, and have a free-standing Forth kernel that
still inherits a substantial chunk of local system knowledge from the
system in which it is compiled. Several of the embedded-system-oriented
standard C libraries are fine-grained enough that you do not
drag in everything including the kitchen sink when you do that.

Coos Haak

unread,

Dec 21, 2011, 6:55:19 PM12/21/11

to

Op Wed, 21 Dec 2011 16:14:25 -0500 schreef Rod Pemberton:

<snip>

> CREATE compiles something into the CFA ...

CREATE makes a header and cfa (containing DOCREA or ENTER DOCREA). Neither
existed before CREATE was executed.

>
> IIRC, descriptions of Forth that use the definitions for CONSTANT and
> VARIABLE that I'm using and you posted, compile DODOES and DOVAR. I don't
> recall where those definitions came from. They seem to be very prevalent.
> Are those eForth definitions? IIRC, fig-Forth compiled DOCON and DOVAR. I
> have some notes which say:

Those definitions are in eForth but they are much older. E.g. in Figforth
from about 1978.

>
> CREATE compiles DOVAR (or ENTER DOVAR)
> VARIABLE compiles DOVAR (or ENTER DOVAR)
> CONSTANT compiles DODOES (or ENTER DODOES)
> LITERAL compiles DOLIT (or ENTER DOLIT)

LITERAL is something different. DOLIT does not start a definition. LITERAL
compiles DOLIT and the number into the colon definition. In Figforth it was
called LIT.

Rod Pemberton

unread,

Dec 21, 2011, 9:48:58 PM12/21/11

to

"BruceMcF" <agi...@netscape.net> wrote in message

news:b745bccd-d5c7-4f0d...@p41g2000yqm.googlegroups.com...

I've basically confirmed that the address pointing to the DOES> code is expected
in the 2nd PF by my DODOES CFA primitive, but it is only written in that
location if there is exactly one , COMMA between CREATE ... DOES> . One
comma is what is needed to make the version of CONSTANT I'm using work.
That address is placed in a different PF location depending on the number
of , COMMAs. Zero in 1st, one in 2nd, two in 3rd ... I.e., it gets placed at
DP or HERE, which is affected by the , COMMAs, but is read from a fixed
position.
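
The shifting described above can be reproduced in a toy model. A minimal sketch in Python, assuming the dictionary is a flat list of cells and that the link is read from CFA+2 (the "2nd PF"). All names (`mem`, `does_buggy`, `read_link`, `build`) and the value 9000 are hypothetical stand-ins, not the actual C code:

```python
# Toy dictionary: a flat list of cells; an address is a list index.
mem = []

def here():
    return len(mem)

def comma(x):
    mem.append(x)

def create():
    cfa = here()
    comma("DOVAR")
    return cfa

def does_buggy(cfa, does_addr):
    """The described DOES>: the link is comma'd at HERE, the CFA becomes DODOES."""
    comma(does_addr)
    mem[cfa] = "DODOES"

def read_link(cfa):
    """The described DODOES: always reads the 2nd parameter-field cell."""
    return mem[cfa + 2]

def build(n_commas, does_addr=9000):
    cfa = create()
    for i in range(n_commas):
        comma(i + 5)          # data cells 5, 6, ...
    does_buggy(cfa, does_addr)
    comma("pad")              # something compiled later, so cfa+2 always exists
    return cfa

one = build(1)                # like CONSTANT: exactly one comma
two = build(2)                # like 2VARIABLE: two commas

print(read_link(one))         # 9000, the link is where DODOES looks
print(read_link(two))         # 6, a data cell is mistaken for the link
```

The write position depends on HERE while the read position is fixed, so only the one-comma case lines up.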

So, it seems that having the extra PFA field or cell padding is needed to
provide a place where that address can be stored where it won't shift around.
That's something I'd rather not do. It seems wasteful. I'll think about
other solutions maybe next year ...

Rod Pemberton

Rod Pemberton

unread,

Dec 21, 2011, 9:49:24 PM12/21/11

to

"Rod Pemberton" <do_no...@noavailemail.cmm> wrote in message
news:jcn3b6$39r$1...@speranza.aioe.org...
> "Elizabeth D. Rather" <era...@forth.com> wrote in message
> news:E5qdnak5ntDqs3PT...@supernews.com...
> > On 12/18/11 2:53 AM, Rod Pemberton wrote:
...

> > : 2CONSTANT ( x2 x1 -- ) CREATE , , DOES> ( -- x2 x1 ) 2@ ;
> >
>
> I don't have 2@ yet. Let me try DUP 4 + @ SWAP @ ... That works! Yes,
> I definitely have to check where the constants are being stored relative
> to the link to the DOES> code. The address for the after-DOES> code
> is apparently not overwriting them and the two COMMAs are apparently
> not shifting the address for the after-DOES> code.
>

Bruce's 2VARIABLE fails:

: 2VARIABLE CREATE 0 , 0 , DOES> ;

So, I must've done something wrong when I tested 2CONSTANT . It should've
failed. Upon re-testing it fails too.

I'm probably going to have to add the PFA field or cell padding so I can
store the link in a fixed location. Currently, it is shifting depending on
the number of COMMAs between CREATE ... DOES> . I.e., stored at DP or HERE
which is affected by number of COMMAs, but is read from 2nd PF to work with
the definition of CONSTANT which has a single COMMA.

Rod Pemberton

BruceMcF

unread,

Dec 22, 2011, 1:02:24 AM12/22/11

to

On Dec 21, 9:48 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> "BruceMcF" <agil...@netscape.net> wrote in message

...

>> Change CREATE so that it pads an empty cell before
>> returns, and change DOES> so that it puts DODOES at the CFA

> I've basically confirmed that the address pointing to the DOES> code is expected
> in the 2nd PF by my DODOES CFA primitive, but it is only written in that
> location if there is exactly one , COMMA between CREATE ... DOES> . One
> comma is what is needed to make the version of CONSTANT I'm using work.
> That address is placed in a different PF location depending on the
> number of , COMMAs. ...

Ah, yes, to make one consumer of it work, you stopped DOES> trampling
at the head of the parameter field, at the cost of storing it relative
to HERE when DOES> executes ... for which there is no general way to let
DODOES know where the address is located, because there'd be no place for
the offset to be stored.

> So, it seems that having the extra PFA field or cell padding is needed to
> provide a place where that address can be stored where won't shift
> around.

> That's something I'd rather not do. It seems wasteful. I'll think about
> other solutions maybe next year ...

The waste is when CREATE is used without DOES> ... and the solution to
that is to tokenize the C-function pointer in the cfa so if it's an
address, the does action can be implied.

If that waste bothers you, then DO use an actual DOVAR and DOCON, and
use a lower level common factor than CREATE. Say,
HEADER ( "name" -- )

: VARIABLE ( "name" -- ) HEADER 'DOVAR , 0 , ;
: CONSTANT ( x "name" -- ) HEADER 'DOCON , , ;
: CREATE ( "name" -- ) HEADER 'DOCREATE , 0 , ;

But allowing a word that has been made with CREATE to inherit a new
behavior with DOES> is no waste, and as you note, you cannot embed a
piece of a c-routine at the head of the does segment to do it there.
Space required to provide desired functionality is not wasted space.
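
The space trade-off can be counted in a toy model. A rough sketch in Python, assuming one list slot per cell and a single slot standing in for the whole name/link header. The helper names (`header`, `cells_used`, `created_variable`) are invented for the illustration:

```python
mem = []

def comma(x):
    mem.append(x)

def header(name):
    comma(("header", name))   # stands in for the name and link fields

def variable(name):
    header(name); comma("DOVAR"); comma(0)

def constant(name, x):
    header(name); comma("DOCON"); comma(x)

def create(name):
    header(name); comma("DOCREATE"); comma(0)   # 0 = slot for a DOES> link

def created_variable(name):
    create(name); comma(0)    # equivalent of  CREATE name 0 ,

def cells_used(define, *args):
    start = len(mem)
    define(*args)
    return len(mem) - start

print(cells_used(variable, "V"))          # 3
print(cells_used(constant, "K", 42))      # 3
print(cells_used(created_variable, "W"))  # 4, the one extra cell is the link slot
```

Only words built through CREATE pay for the link slot; variables and constants with their own DOVAR and DOCON stay at their minimum size.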

Albert van der Horst

unread,

Dec 22, 2011, 3:33:17 PM12/22/11

to

In article <5eb33a21-d427-477a...@v24g2000yqk.googlegroups.com>,
BruceMcF <agi...@netscape.net> wrote:
>On Dec 20, 4:55 pm, Albert van der Horst

You can do all kinds of things, but if you can't jump -- directly or
indirectly -- to the CFA you have HWTC, humongous waste threaded
code, because you need to interpret the interpreter.

Albert van der Horst

unread,

Dec 22, 2011, 3:38:52 PM12/22/11

to

In article <pPOdnfeBKuzqPWzT...@supernews.com>,

OK. Confusion between CFA and its content.
I'm in the habit of considering CFA as a variable that contains
something.

>
>Andrew.

BruceMcF

unread,

Dec 22, 2011, 4:32:39 PM12/22/11

to

On Dec 22, 3:33 pm, Albert van der Horst <alb...@spenarnc.xs4all.nl>
wrote:
> In article <5eb33a21-d427-477a-b68c-92f21fc34...@v24g2000yqk.googlegroups.com>,

>
> BruceMcF <agil...@netscape.net> wrote:
> >On Dec 20, 4:55 pm, Albert van der Horst
> >> >A child of <BUILDS DOES> needs an extra cell.
> >> To store a pointer to what it DOES.
> >Unless the pointer to the does segment can be recognized as something
> >other than a normal content of a CFA.

> You can do all kinds of things, but if you can't jump -- directly or
> indirectly -- to the CFA you have HWTC, humongous waste threaded
> code, because you need to interpret the interpreter.

Waste of what? Time or space? If it was an aligned FORTH and the table
of function pointers were based on an odd bit address, you could
compile the index to the function pointer itself into compiled
definitions and have the same indirection *except* for CREATEd words.

Rod Pemberton

unread,

Dec 29, 2011, 3:39:35 AM12/29/11

to

"BruceMcF" <agi...@netscape.net> wrote in message

news:f4e32f99-7fc0-4839...@f33g2000yqh.googlegroups.com...

On Dec 21, 9:48 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "BruceMcF" <agil...@netscape.net> wrote in message
...
> >> Change CREATE so that it pads an empty cell before
> >> returns, and change DOES> so that it puts DODOES at the CFA

> > So, it seems that having the extra PFA field or cell padding is needed
> > to provide a place where that address can be stored where won't
> > shift around.

> > That's something I'd rather not do. It seems wasteful. I'll think about
> > other solutions maybe next year ...

> The waste is when CREATE is used without DOES> ... and the solution to
> that is [...]

... perhaps to use an eForth compatible DOES> ?

eForth uses definitions similar to those I'm using for CONSTANT and
VARIABLE. I'm not quite sure, but it seems that eForth doesn't overwrite
the CFA with DODOES. I think doing that may be the source of my problems.
Instead, I think eForth leaves the CFA as DOVAR, compiles DODOES inline into
a definition, and follows that by the DOES> address. I think DODOES in
eForth is implemented like a LIT value, except that the value gets jumped to
instead of stacked.

Currently, my DOES> is this:

: DOES> R> , LIT DODOES LAST @ CFA ! ;

It compiles the DOES> address into the PF region - in no specific location -
and compiles DODOES into the CFA. So, the DODOES in the CFA doesn't know
where to find the DOES> address. DODOES gets an address immediately
following the DODOES location, which in this case is one location past the
CFA, or always the 1st PF.

For my DOES> to be compatible with the CONSTANT and VARIABLE definitions, I
think my DOES> should be something like this:

: DOES> COMPILE DODOES R> , ;

DODOES here is like LIT, but instead of placing the next value compiled into
the word onto the stack, the next value gets jumped to. This compiles
DODOES inline into the definition followed by the DOES> address. The CFA
remains set to DOVAR. So, when DODOES is executed, it fetches the following
inlined data, which is the DOES> address, and jumps to it. I think that
should work. Since DODOES is inline, it always knows the address is
immediately afterwards. That implies my DODOES definition/implementation is
correct for this DOES> definition. Implementing DOES> this way should also
allow multiple DOES> 's per definition, although what use that'd be or its
legality is unknown to me. I may have to change DOVAR to allow execution to
continue with the remaining compiled words in a definition. That change
should allow VARIABLE and/or CONSTANT to work. Implementing DOES> this way,
perhaps the eForth way, should eliminate the need for a mandatory PFA field
or padding. I've not tested this.

FYI, my primitive or low-level DODOES is effectively:

: DODOES R> DUP CELL+ @ >R ;

My high-level DODOES is coded as that. I've never used the high-level
version, so it's commented out.

Rod Pemberton

BruceMcF

unread,

Dec 29, 2011, 3:37:35 PM12/29/11

to

On Dec 29, 3:39 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> eForth uses definitions similar to those I'm using for CONSTANT and
> VARIABLE. I'm not quite sure, but it seems that eForth doesn't
> overwrite the CFA with DODOES. I think doing that may be the source
> of my problems.

> Instead, I think eForth leaves the CFA as DOVAR, compiles DODOES
> inline into a definition, and follows that by the DOES> address.

Which eForth, v1 or v2?

No, it does not leave the CFA as DOVAR; instead it changes the CFA to
the location of the DODOES at the head of the does segment. Yes, it does
compile DODOES inline at the head of the does segment.

As you've noted, your CFA is not a real code field, it's a C-function-
pointer-field, "CFPFA" ... and that's why you can't use the eForth
implementation.

*That's* what limits your choices to two basic approaches:
(1) Pad CREATE words so that PFA =(CFA++)++, rather than CFA++
(2) Encode the difference between actual C-function-pointers and DOES
target segment addresses, and have your C-coded inner interpreter
decode and act appropriately.

You can chase up other options "debug code" style, but they will be
will-o-wisps ~ the other valid options for an indirect threaded
interpreter assume that the content of the CFA is actually a code
address. For a particular CPU and particular C-compiler, you could
embed code that would work correctly, but it would not be portable
across multiple CPUs and C-compilers.

If you are not using the top half of the memory range, your C-function-
pointers will all be positive, and one encoding is to set the sign bit
of DOES segment addresses. Then it's an if-then on the value being
positive: jump to the C-function if it's positive, otherwise
clear the top bit and perform DODOES inline within the C-coded
inner interpreter.
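
The sign-bit encoding can be sketched as follows. This is only an illustration in Python, assuming 32-bit cells; `CELL_BITS`, `tag_does` and `decode` are hypothetical names, and a real inner interpreter would do the same test in C on a signed cell:

```python
CELL_BITS = 32                    # assumed cell width
SIGN = 1 << (CELL_BITS - 1)

def tag_does(addr):
    """Mark a DOES> segment address by setting the top bit."""
    assert addr < SIGN, "address must lie in the lower half of memory"
    return addr | SIGN

def decode(cell):
    """Classify a code-field cell as the inner interpreter would."""
    if cell & SIGN:
        return ("does", cell & (SIGN - 1))   # clear the top bit
    return ("code", cell)

print(decode(0x08048000))             # ('code', 134512640): an ordinary function pointer
print(decode(tag_does(0x1234)))       # ('does', 4660): run DODOES with this segment
```

The cost is one extra test per executed word; the gain is that CREATEd words need no padding cell.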

However, option (1) is simpler, so I'd suggest doing that first,
profiling the result, and if it consumes more space than you want
while having ample speed for your purposes, exploring option (2).

Rod Pemberton

unread,

Dec 30, 2011, 1:31:19 PM12/30/11

to

"BruceMcF" <agi...@netscape.net> wrote in message

news:dd825268-f575-4665...@z12g2000yqm.googlegroups.com...

> On Dec 29, 3:39 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > eForth uses definitions similar to those I'm using for CONSTANT and
> > VARIABLE. I'm not quite sure, but it seems that eForth doesn't
> > overwrite the CFA with DODOES. I think doing that may be the source
> > of my problems.
>
> > Instead, I think eForth leaves the CFA as DOVAR, compiles DODOES
> > inline into a definition, and follows that by the DOES> address.
>
> Which eForth, v1 or v2?

Not sure. No point.

> No, it does not leave the CFA as DOVAR, instead it changes the CFA as
> the location of DODOES at the head of the does segment. Yes, it does
> compile DODOES inline at the head of the does segment.

Hmm... Ok.

> As you've noted, your CFA is not a real code field, its a C-function-
> pointer-field, "CFPFA" ... and that's why you can't use the eForth
> implementation.
>
> *That's* what limits your choices to two basic approaches:
> (1) Pad CREATE words so that PFA =(CFA++)++, rather than CFA++
> (2) Encode the difference between actual C-function-pointers and DOES
> target segment addresses, and have your C-coded inner interpreter
> decode and act appropriately.

So, I decided I'd try what I thought eForth was doing, that you said it
wasn't, to see if I could get that to work or not. It seems a bit cleaner
to me. If not, then I would attempt one of the methods you've strongly
suggested. Am I stubborn? ;-)

Well, I got sidetracked slightly and ran into some other issues. I didn't
know why there was an EXIT on my constants created with DOCON, but not on
my variables created with DOVAR. Removing the EXITs and the part of
the DOCON primitive code that used the EXITs revealed a bug in X, the
termination primitive for the dictionary. So, I fixed that.

I decided to eliminate DOCON since I also have LIT. Why use a primitive for
a constant that can only be used in the CFA when I can use LIT for constants
anywhere as long as I use ENTER ... EXIT? I had used DOCON for six
constants: 3 bootstrap and 3 high-level Forth. I replaced them with ENTER
... EXIT sequences using LIT. So, I now have one less CFA routine and
primitive!

Unfortunately, that created a circular reference for one constant. I needed
to know CELL for LIT, indirectly via CELL+, which meant I couldn't define
CELL in terms of LIT. So, I implemented a variable with CELL's value and
rewrote CELL to fetch that variable. That breaks the circular reference.
So, I upped my variables to ten.

In the process, I realized that DOVAR could probably be eliminated too. If
LIT can fetch the next value from the instruction stream, place it on the
stack, and continue execution with the next compiled address, then one
should be able to code a similar word that places the address of the next
value on the stack and continues execution. If so, then DOVAR can be
eliminated too (except maybe for CELL ...). Is there a common name for such
a word? E.g., ADR or LOC? To me, it just seems to be the interpreter
definition for LIT but without the fetch:

: LIT R> DUP CELL+ >R @ ;
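
The difference between LIT and the proposed no-fetch variant can be modelled in a few lines. A minimal sketch in Python, assuming one list slot per cell so that CELL+ is just `+ 1`; the name `adr` is only a placeholder for the unnamed word, and the lists `mem`, `stack` and `rstack` are illustrative:

```python
def lit(mem, rstack, stack):
    ip = rstack.pop()          # R>
    rstack.append(ip + 1)      # DUP CELL+ >R   (one cell = one list slot here)
    stack.append(mem[ip])      # @

def adr(mem, rstack, stack):
    ip = rstack.pop()          # R>
    rstack.append(ip + 1)      # DUP CELL+ >R
    stack.append(ip)           # no @, so the address itself is left

mem = ["LIT", 42, "EXIT"]
stack, rstack = [], [1]        # return stack points at the inline cell
lit(mem, rstack, stack)
print(stack, rstack)           # [42] [2]
```

Both words step the return address over the inline cell; they differ only in whether the cell's contents or its address ends up on the data stack.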

As for the DOES>, the changed definition seems to be compiling properly and
in a way I think should work. But, I would need to change DOVAR to allow
execution to continue with the next compiled address after the variable
value. That would require adjusting the primitive to push a value on the
return stack and have each variable use an EXIT leading me back into the
same issue I just had with DOCON. Now, if I eliminate DOVAR replacing it
with a word like LIT that doesn't fetch, then I can see this all working ...
And, I'll have eliminated another primitive and CFA routine! Alternatively, I
may duplicate DOVAR's functionality, with one version for pre-compiled
variables and another, slightly modified, for runtime, at least until I can
rework things.

Rod Pemberton

BruceMcF

unread,

Dec 30, 2011, 2:12:22 PM12/30/11

to

On Dec 30, 1:31 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> So, I decided I try what I thought eForth was doing, that you said it
> wasn't, to see if I could get that to work or not. It seems a bit
> cleaner to me.

So, you are going to leave the CFA as DOVAR. How precisely does that
result in a call to the does segment? You don't have a working DOES>
unless the does segment gets executed. That requires the address of
the does segment to be stored in the CREATEd word.

There is a third solution, actually, which is to make it a doubly
indirect threaded compiler ~ have the contents at the code field
address be the address of a C function pointer rather than being a C
function pointer itself. Then you can embed the C function pointer for
DODOES at the head of the DOES segment and you are set.
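
The doubly indirect scheme can be sketched in a toy model. This is an illustration in Python, assuming a flat list of cells where Python callables stand in for C function pointers. Every name here (`execute`, `run`, `docreate`, `dodoes`, `fetch`, `seg`, `k`) is made up for the example:

```python
mem, stack = [], []

def comma(x):
    mem.append(x)
    return len(mem) - 1        # address of the cell just written

def execute(cfa):
    target = mem[cfa]          # 1st indirection: address of a pointer cell
    fn = mem[target]           # 2nd indirection: the function itself
    fn(cfa, target)

def run(ip):
    while mem[ip] != "EXIT":
        execute(mem[ip])
        ip += 1

def docreate(cfa, target):
    stack.append(cfa + 1)      # body address; no padding cell needed

def dodoes(cfa, target):
    stack.append(cfa + 1)      # body address
    run(target + 1)            # thread that follows the embedded pointer

def fetch(cfa, target):
    stack.append(mem[stack.pop()])

p_docreate = comma(docreate)   # pointer cells for the primitives
p_fetch = comma(fetch)
fetch_cfa = comma(p_fetch)     # the word @

seg = comma(dodoes)            # DOES> segment: DODOES pointer, then  @ EXIT
comma(fetch_cfa)
comma("EXIT")

k = comma(seg)                 # a CONSTANT-like child: CFA holds seg's address
comma(42)

execute(k)
print(stack)                   # [42]
```

Since the code field holds an address rather than a function pointer, it can point straight at the DOES> segment, whose first cell holds the DODOES pointer. The price is one extra memory fetch on every word executed.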

> I decide to eliminate DOCON since I also have LIT. Why use a
> primitive for a constant that can only be used in the CFA when I can
> use LIT for constants anywhere as long as I use ENTER ... EXIT?

The normal rationale is efficiency ~ saving the execution of the ENTER
and EXIT. If the focus is on a small kernel, it's fine to have
pre-compiled constants using LIT.

> If LIT can fetch the next value from the instruction stream, place
> it on the stack, and continue execution with the next compiled
> address, then one should be able to code a similar word that
> places the address of the next value on the stack and continues
> execution.

Where would you use it? Other than in defining a variable to be a four
cell sequence of:
ENTER <foo> [_variable_location_] EXIT
... in place of:
DOVAR [_variable_location_]

... because if it's a straight up swap of that "<foo>" word for DOVAR
when that "<foo>" word is not used anywhere else, it's hard to see what
the benefit is.

> As for the DOES>, the changed definition seems to be compiling
> properly and in a way I think should work. But, I would need to
> change DOVAR to allow execution to continue with the next compiled
> address after the variable value.

But you can't use that alternate DOVAR for CREATE, because CREATE can
have a body much longer than just one cell. Just consider a normal
long string constant,

: $CONSTANT ( ca u "name" -- ) \ "name" ( -- ca u )
CREATE DUP , HERE SWAP CHARS DUP ALLOT MOVE
DOES> DUP CELL+ SWAP @ ;

Where it's:
<name-structure>
CFA: DOCREATE ; or maybe DODOES
; ?? Maybe one cell padding for DOES> segment address
.word u ; stored cell long length
.char ... ; "u" chars in string.

Rod Pemberton

unread,

Dec 31, 2011, 5:52:45 AM12/31/11

to

"BruceMcF" <agi...@netscape.net> wrote in message

news:724b7ee6-9199-4566...@d8g2000yqk.googlegroups.com...

> On Dec 30, 1:31 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:

> So, you are going to leave the CFA as DOVAR.

I'm going to attempt to ...

> How precisely
> does that result in a call to the does segment?

It doesn't.

> You don't have a working DOES>
> unless the does segment gets executed.

It should.

> That requires the address of the does segment
> to be stored in the CREATEd word.

It is.

CONSTANT, which uses DOES>, does this:

DOVAR ( CFA )
<comma'd constant> ( 1st PF )
... ( optional compiled addresses )
DODOES
DOES> address
... ( optional compiled addresses )
EXIT

DOVAR executes placing the address of the constant on the stack. It will
continue execution at DODOES. (not currently implemented) DODOES pushes the
IP on the stack, gets the DOES> address and pushes that on the return stack.
EXIT returns to pushed address. It seems to me that DODOES no longer needs
to push the IP since DOVAR is doing that.

> There is a third solution, actually, which is to make it a doubly
> indirect threaded compiler ~ have the contents at the code field
> address be the address of a C function pointer rather than being a C
> function pointer itself. Then you can embed the C function pointer for
> DODOES at the head of the DOES segment and you are set.

I'd have to review how exactly I did that. There was quite a bit of
juggling until I found an ANSI C compatible solution for addresses,
constants, function pointers, etc. I.e., I'm not sure at the moment for
which one the C type would be correct.

> > I decide to eliminate DOCON since I also have LIT. Why use a
> > primitive for a constant that can only be used in the CFA when I can
> > use LIT for constants anywhere as long as I use ENTER ... EXIT?
>
> The normal rationale is efficiency ~ saving the execution of the ENTER
> and EXIT. If the focus is on a small kernel, its fine to have pre-
> compiled constants using LIT.

Yes, but in my case that adds a primitive, i.e., C code, instead of Forth
code as text or precompiled in C.

> > If LIT can fetch the next value from the instruction stream, place
> > it on the stack, and continue execution with the next compiled
> > address, then one should be able to code a similar word that
> > places the address of the next value on the stack and continues
> > execution.
>
> Where would you use it?

DOVAR, the DODOES described above, ...

> [...] is not used anywhere else, its hard to see what
> the benefit is.

Well, it should eliminate a primitive which is C code. One primitive is not
a big issue, but the fewer the better. The Forth code can become more
generalized or standardized and less C environment specific.

> > As for the DOES>, the changed definition seems to be compiling
> > properly and in a way I think should work. But, I would need to
> > change DOVAR to allow execution to continue with the next compiled
> > address after the variable value.
>
> But you can't use that alternate DOVAR for CREATE, because CREATE
> can have a body much longer than just one cell. Just consider a normal
> long string constant,
>
> : $CONSTANT ( ca u "name" -- ) \ "name" ( -- ca u )
> CREATE DUP , HERE SWAP CHARS DUP ALLOT MOVE
> DOES> DUP CELL+ SWAP @ ;
>
> Where its:
> <name-structure>
> CFA: DOCREATE ; or maybe DODOES
> ; ?? Maybe one cell padding for DOES> segment address
> .word u ; stored cell long length
> .char ... ; "u" chars in string.

I think $CONSTANT would do this:

<dictionary header>
CFA: DOVAR
comma'd constant
.string ... ; ( non-standard format )
DODOES
DOES> address
EXIT

Yes, if DOVAR continues execution after the constant, that would require a
branch or 'DOSTR' to skip over ALLOT'd space ... So, there are some issues
with user defined strings and arrays. I think a different definition could
place the ALLOT'd space after the code ... If so, is this an implementation
issue? I.e., are there or could there be some Forths for which the
$CONSTANT definition will fail validly? , COMMA and C, seem to be the
primary uses of ALLOT, at least in what I've coded so far. It's possible
that ALLOT could be modified or duplicated so that large allocations place a
branch around the allotted space. String concatenation or linked lists via
multiple ALLOTs might be an issue. Are those done that way in Forth?
Anyway, just a thought ...

Rod Pemberton

BruceMcF

unread,

Dec 31, 2011, 2:33:18 PM12/31/11

to

On Dec 31, 5:52 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> <dictionary header>
> CFA: DOVAR
> comma'd constant
> .string ... ; ( non-standard format )
> DODOES
> DOES> address
> EXIT

> Yes, if DOVAR continues execution after the constant, that would
> require a branch or 'DOSTR' to skip over ALLOT'd space ...

Over and above the massive wasted space you are adding to CREATEd
segments that have a DOES> applied just to avoid padding all CREATEd
words by one cell ...

... while building the body in standard Forth code, the data space is
*contiguous* ~ unbroken.

Remember, the PURPOSE of CREATE HERE and ALLOT is as building blocks
to allow the construction of general purpose data structures. That is
*why* they can be used to define VARIABLE and CONSTANT ~ but a version
that only *works* for defining VARIABLE and CONSTANT does not actually
work when interpreting actual Forth code.

(1) And the construction of the general purpose data structure has to
be able to start AS SOON as you are done with CREATE.

(2) And has to be able to be patched with DOES> *at any time while
building that word*.

(1) and (2) are an envelope around your design criteria. Violate (1)
and CREATE is useless as a general purpose data structure building
word, violate (2) and you place constraints on DOES> that do not exist
in normal implementations and which existing or future Forth94 source
has no reason to respect.

Think about a structure that is an item in a linked list of counted
strings:

CFA: DOVAR
.word ... ; comma'd constant
; **1 ?
.word nextitem
; **2 ?
.byte $4,"item"
; **3 ?

: MAKE-LINKED: ( addr1 "name" -- addr2 )
CREATE HERE SWAP , ;

There are a set of things, "tokens", that are linked in the background
for whatever reason.

: token: ( "name" -- ) \ "name" ( -- addr )
token-head @ MAKE-LINKED: token-head ! DOES> CELL+ ;

The $token is one of them (I suppose it might be strings of 31 or
less, so high three bits not clear can be detected to be some other
type of token), and the string token word itself should act as a
string constant:

: $token: ( ca u "name" -- ) \ "name" ( -- ca u )
token: DUP 31 INVERT AND ABORT" token string too long."
$, DOES> CELL+ COUNT ;

Note that you don't "know" that the string is going to be counted when
$, executes, since it's just a word that places a counted string at
HERE.
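
As a rough illustration of the 31-character limit mentioned above (a sketch in Python rather than Forth; MAX_LEN, count_is_string and too_long are names made up for this example): if string tokens are limited to 31 characters, a count byte that belongs to a string always has its top three bits clear, so a mask test is enough to tell the cases apart.

```python
MAX_LEN = 31                     # assumed limit for string tokens
HIGH_BITS = 0xFF & ~MAX_LEN      # 0xE0: the top three bits of a count byte

def count_is_string(count_byte):
    """True when the byte could be the count of a string of 31 chars or fewer."""
    return (count_byte & HIGH_BITS) == 0

def too_long(u):
    """True when a non-negative length u does not fit in the low five bits."""
    return (u & ~MAX_LEN) != 0
```

The same mask idea is what a length check in the defining word would rely on: any bit set above the low five means the string cannot be a string token under this assumed limit.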

No matter how much you squirm, the DODOES cannot be placed at point
"**2" above without violating contiguity of the body. So it's got to
either be at point **1 or at point **3.

But it can't be at **3, because DOES> can be executed at any time
while building the word. Including, as above, executed once for one data
structure and executed again for an extension of that data structure
to over-ride the original default behavior.

So it has to be at **1.

But if it's at **1, what do you gain by leaving the DOVAR in place?
The first thing your DODOES would have to do would be to DROP
the variable address placed on the stack to put in its place the
address of the actual body of the created word.

And if it's at **1, you have added TWO cells to every created word.

CFA: DOVAR
.word ... ; variable location, not used
DOCREATED
.word 0 ; still need a place for the DODOES address to go
EXIT
; body

... which on DOES> becomes

CFA: DOVAR
.word xxxx ; variable location, not used
DODOES
.word does_segment
EXIT
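
For comparison, here is a toy Python model of the alternative, where every CREATEd word gets one padding cell next to its code field and DOES> only ever patches those two fixed cells. This is a sketch under assumptions, not anyone's actual implementation: the class Dictionary, its methods, and the string markers standing in for DOCREATE and DODOES are all invented for the example, and the string is stored one character per cell for simplicity.

```python
DOCREATE = "DOCREATE"   # stand-in for the routine that pushes the body address
DODOES = "DODOES"       # stand-in for the routine that also runs a does-segment

class Dictionary:
    def __init__(self):
        self.cells = []      # data space, one list element per cell
        self.latest = None   # code-field index of the most recent CREATEd word

    def here(self):
        return len(self.cells)

    def comma(self, value):
        self.cells.append(value)

    def create(self):
        self.latest = self.here()
        self.comma(DOCREATE)  # code field
        self.comma(0)         # padding cell, reserved for a does-segment address
        return self.latest

    def does(self, does_segment):
        # Patch only the two fixed header cells; the body is never touched.
        self.cells[self.latest] = DODOES
        self.cells[self.latest + 1] = does_segment

    def body(self, cfa):
        return cfa + 2


def build_string_token():
    d = Dictionary()
    cfa = d.create()
    d.comma("nextitem")       # link cell laid down right after CREATE
    d.does("token-does")      # first DOES>, applied mid-build
    for cell in (4, "i", "t", "e", "m"):
        d.comma(cell)         # counted string appended afterwards
    d.does("$token-does")     # second DOES> overrides the first
    return d, cfa
```

In this model the body stays contiguous no matter when, or how often, DOES> is applied, which is the property (1) and (2) above ask for.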

> So, there are some issues with user defined strings and arrays.

Yes, there are issues: you are breaking the actual use of CREATE for
actual code, while flailing around with trying to do the basics the
hard way around.

Rod Pemberton

Dec 31, 2011, 8:10:46 PM

"BruceMcF" <agi...@netscape.net> wrote in message

news:a9e89231-b066-43ef...@n39g2000yqh.googlegroups.com...

> On Dec 31, 5:52 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > <dictionary header>
> > CFA: DOVAR
> > comma'd constant
> > .string ... ; ( non-standard format )
> > DODOES
> > DOES> address
> > EXIT
>
> > Yes, if DOVAR continues execution after the constant, that would
> > require a branch or 'DOSTR' to skip over ALLOT'd space ...
>
> Over and above the massive wasted space you are adding to CREATEd
> segments that have a DOES> applied just to avoid padding all CREATEd
> words by one cell ...

Yeah, that too ...

> ... while building the body in standard Forth code, the data space is
> *contiguous* ~ unbroken.

Yes.

> CFA: DOVAR
> .word ... ; comma'd constant
> ; **1 ?
> .word nextitem
> ; **2 ?
> .byte $4,"item"
> ; **3 ?

> [...]

> No matter how much you squirm, the DODOES cannot be placed
> at point "**2" above without violating contiguity of the body. So it's
> got to either be at point **1 or at point **3.

> [...]

> So it has to be at **1.

It seems that it is quite complicated to store a DOES> address in the PF
region. I.e., a variable location leads to an inability to locate it; a fixed
location will overwrite data or a code address, or vice versa; allowing
execution to continue until it finds it will work for code definitions but
not data definitions, which will have to be jumped over; etc. The apparent
difficulties in implementing DOES> would imply to me that DOES> is really
not a good fit with the way a Forth interpreter works ... I don't believe
that other Forth words have such issues, do they? Well, I can't store the
DOES> address in the CFA. I'm 100% positive that'll fail. So, I may have
no choice but to add an extra field for it. Once I do that, that field will
be named PFA instead of the code or data part of a word. So, what is the
code or data part usually named?

> Yes, there are issues: you are breaking the actual use of CREATE
> for actual code, while flailing around with trying to do the basics the
> hard way around.

I'm just exploring the options ...

If I hadn't realized that my DOES> only worked for CONSTANT and VARIABLE, I
wouldn't be going through this. It was hard enough to find info on how they
worked the first time, and to get them to work for CONSTANT and VARIABLE. I
couldn't even dump definitions as hex values back then ...

The one solution is elegant for a code definition, IMO. I hadn't considered
a data definition. That could be lack of Forth experience, or it could be I
didn't come across the implementation problem via execution of some other
word since I'm not done implementing ... ;-) E.g., I haven't implemented
." dot-quote, etc., where I would be storing a block of data in a word.
E.g., I know another use of CREATE fails due to an overwrite of my input
buffer. I just haven't found and fixed it yet.

Thanks.

Rod Pemberton

BruceMcF

Dec 31, 2011, 10:12:17 PM

On Dec 31, 8:10 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> It seems that it is quite complicated to store a DOES> address in the PF region.

That shouldn't be surprising, since the whole point of the data body
is to give the user maximum flexibility, and so trying to implement
DOES> to trample on the body is not going to occur to even a sometime
off&on forth user like myself, let alone an experienced forth user.

> Well, I can't store the
> DOES> address in the CFA. I'm 100% positive that'll fail.

That depends on your inner interpreter: but it should be easier to add
the padding and wait until you have a working system.

All of this is because your "CFA" isn't really a code field address.
If it were a CFA, you could just embed code to launch the does segment
at the start of the does segment, and then the address of that code
could go into the code field.

(1) Make it a pointer to a c function pointer, then you can start the
does segment with a DODOES that is a c-function pointer.

(2) Make your primitives tokens, and execute the primitives in a case
statement. Do the does action implicitly when the contents of the CFA
are out of range of the tokens.

(3) If the c-function pointers will all be positive, flip the sign
bit of does-segment addresses.

So it's not particularly hard to come up with a one cell substitute
that will work with the right c-coded inner interpreter; it's just that
padding the CFA only requires adjusting DOCREATE DODOES and CREATE so
it's probably the short path to a working implementation.
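
To make option (2) a little more concrete, here is a minimal sketch of a token-dispatched inner interpreter, written in Python as a stand-in for the c case statement. Everything in it is an assumption made for illustration: the token names, the FIRST_ADDR threshold, the one-cell-per-list-element memory, and the VM class itself. The only point is that a code field holding a value outside the token range is treated as the address of a does segment.

```python
# Primitive tokens; any code-field value >= FIRST_ADDR is taken to be
# the address of a does-segment instead (an assumption of this sketch).
DOCOL, EXIT, DOCREATE, FETCH, HALT = range(5)
FIRST_ADDR = 16   # dictionary starts here, so real addresses never look like tokens

class VM:
    def __init__(self):
        self.mem = [0] * FIRST_ADDR
        self.stack, self.rstack = [], []
        self.ip = 0
        self.running = False

    def comma(self, value):
        self.mem.append(value)

    def word(self, code, *params):
        """Lay down a code field followed by its parameter cells; return the CFA."""
        cfa = len(self.mem)
        self.comma(code)
        for p in params:
            self.comma(p)
        return cfa

    def execute(self, cfa):
        code, pfa = self.mem[cfa], cfa + 1
        if code >= FIRST_ADDR:            # out of token range: implicit DODOES
            self.stack.append(pfa)
            self.rstack.append(self.ip)
            self.ip = code
        elif code == DOCOL:
            self.rstack.append(self.ip)
            self.ip = pfa
        elif code == EXIT:
            self.ip = self.rstack.pop()
        elif code == DOCREATE:
            self.stack.append(pfa)
        elif code == FETCH:
            self.stack.append(self.mem[self.stack.pop()])
        elif code == HALT:
            self.running = False

    def run(self, cfa, halt_cfa):
        start = len(self.mem)             # temporary two-cell thread: word, HALT
        self.comma(cfa)
        self.comma(halt_cfa)
        self.ip, self.running = start, True
        while self.running:
            w = self.mem[self.ip]
            self.ip += 1
            self.execute(w)


def demo():
    vm = VM()
    exit_w, fetch_w, halt_w = vm.word(EXIT), vm.word(FETCH), vm.word(HALT)
    does_segment = len(vm.mem)            # threaded code for "DOES> @ ;"
    vm.comma(fetch_w)
    vm.comma(exit_w)
    ten = vm.word(does_segment, 10)       # code field holds an address, body holds 10
    plain = vm.word(DOCREATE, 99)         # ordinary CREATEd word, no DOES> applied
    vm.run(ten, halt_w)                   # leaves the fetched body cell on the stack
    vm.run(plain, halt_w)                 # leaves the body address on the stack
    return vm.stack, plain + 1
```

Here the word built with a does-segment address in its code field behaves like a constant, while the plain CREATEd word just pushes its body address, and neither needs a padding cell.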

Regarding terminology, go ahead and call the thing that holds the
c-function pointer the code field, so the address of that field is the
CFA. The data body is a traditional parameter field, so its address may
as well be the PFA.

You don't have to call the padding cell anything, just treat the code
field as two cells wide. CFA>PFA would just be:
: CFA>PFA ( cfa -- pfa) CELL+ CELL+ ;

Rod Pemberton

Jan 4, 2012, 5:13:26 AM

"BruceMcF" <agi...@netscape.net> wrote in message

news:724b7ee6-9199-4566...@d8g2000yqk.googlegroups.com...

> On Dec 30, 1:31 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:

...

> > If LIT can fetch the next value from the instruction stream, place
> > it on the stack, and continue execution with the next compiled
> > address, then one should be able to code a similar word that
> > places the address of the next value on the stack and continues
> > execution.
>
> Where would you use it?

: LOC R> DUP CELL+ >R ;

I responded to that before but not as completely, so I'll be repeating
myself slightly. I think LOC should allow me to eliminate DOVAR. The
exception is CELL which also has a circular reference with LOC. CELL
will have to remain DOVAR based for now. LOC is more generic and does the
same thing as DOVAR, but allows execution to continue with the following
compiled word, just like LIT. Also, LOC, a high-level word, can be used to
implement both variables and constants - stand-alone or inlined - without
need of DOVAR or DOCON as primitives. E.g., it should be useful if someone
needs a word-local, temporary variable such as a counter. I think it would be
convenient with DUP and +! for a counter within a loop. It'd be unnamed and
completely internal to the current definition. For bootstrap code, LIT
<val> can be replaced with: LOC <val> @

I really like the extra flexibility that LOC provides, or should. I'll
probably attempt using LOC in the next few days. I probably won't remove
or replace LIT with LOC, at least for a while, since I've got 22 of them
already ...
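
A small Python model of an indirect-threaded interpreter can be used to check the idea that LOC followed by @ behaves like LIT. This is only a sketch: the VM class, the primitive functions, and the one-cell-per-list-element memory (so CELL+ is just +1) are assumptions for the example and do not reflect how my interpreter is actually written.

```python
class VM:
    def __init__(self):
        self.mem = []                 # cells hold callables (code fields) or ints
        self.stack, self.rstack = [], []
        self.ip = 0                   # next cell of the current thread
        self.w = 0                    # CFA of the word being executed
        self.running = False

    def comma(self, value):
        self.mem.append(value)
        return len(self.mem) - 1

    def prim(self, fn):
        return self.comma(fn)         # a primitive is just a code field

    def colon(self, *thread):
        cfa = self.comma(docol)
        for cell in thread:
            self.comma(cell)
        return cfa

    def run(self, cfa, halt_cfa):
        start = self.comma(cfa)       # temporary two-cell thread: word, HALT
        self.comma(halt_cfa)
        self.ip, self.running = start, True
        while self.running:
            self.w = self.mem[self.ip]
            self.ip += 1
            self.mem[self.w](self)


def docol(vm):     vm.rstack.append(vm.ip); vm.ip = vm.w + 1
def exit_(vm):     vm.ip = vm.rstack.pop()
def lit(vm):       vm.stack.append(vm.mem[vm.ip]); vm.ip += 1
def fetch(vm):     vm.stack.append(vm.mem[vm.stack.pop()])
def dup(vm):       vm.stack.append(vm.stack[-1])
def cell_plus(vm): vm.stack.append(vm.stack.pop() + 1)   # one cell = one list slot
def r_from(vm):    vm.stack.append(vm.rstack.pop())
def to_r(vm):      vm.rstack.append(vm.stack.pop())
def halt(vm):      vm.running = False


def demo():
    vm = VM()
    EXIT, LIT, FETCH = vm.prim(exit_), vm.prim(lit), vm.prim(fetch)
    DUP, CELLPLUS = vm.prim(dup), vm.prim(cell_plus)
    RFROM, TOR, HALT = vm.prim(r_from), vm.prim(to_r), vm.prim(halt)
    LOC = vm.colon(RFROM, DUP, CELLPLUS, TOR, EXIT)   # : LOC R> DUP CELL+ >R ;
    with_lit = vm.colon(LIT, 42, EXIT)                # ... LIT 42 ...
    with_loc = vm.colon(LOC, 42, FETCH, EXIT)         # ... LOC 42 @ ...
    vm.run(with_lit, HALT)
    vm.run(with_loc, HALT)
    return vm.stack
```

In this model both definitions leave the same value on the stack, and the LOC version takes one more cell in the thread than the LIT version.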

Rod Pemberton

BruceMcF

Jan 4, 2012, 3:49:01 PM

On Jan 4, 5:13 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> I really like the extra flexibility that LOC provides, or should. I'll
> probably attempt using LOC in the next few days.

> I probably won't remove or replace LIT with LOC, at least for a while,
> since I've got 22 of them already ...

I don't get the benefit yet ... if you're swapping DOVAR with LOC then
you've not gained anything yet in terms of primitive count, and have
made each VARIABLE one cell larger. If you replace each:
... [DOLIT] [<value>] ....

... in the definition with:
... [LOC] [<value>] [@] ...

... you've fattened up all of your definitions by one cell per
literal, to eliminate the [LIT] and allow [LOC] to work as both a
space-inefficient DOVAR factor and a space-inefficient DOLIT factor.

But if you are willing to make all of your variables one cell bigger,
you can define VARIABLE as (in effect):
: VARIABLE HERE 0 , CONSTANT ;

... like a variable definition in a ROMable Forth compiler, making the
variable a constant pointing to the variable cell.

That saves the primitive right away and allows literals to be two
cells per literal in a definition rather than three cells per literal.

Indeed, if you are willing to trade off dictionary size for fewer
primitives, you can make VARIABLE two cells bigger than the fig-Forth
style and CONSTANT one cell bigger, with, in effect:
: CONSTANT HEADER 'DOLIT , , 'EXIT , ;
: VARIABLE HERE 0 , CONSTANT ;
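
The "variable is a constant pointing to the variable cell" idea can be modelled in a few lines of Python. This is a sketch only: the Forth class, its dictionary of constants, and the list used as data space are invented for illustration, and headers are left out entirely.

```python
class Forth:
    def __init__(self):
        self.cells = []       # data space, one list element per cell
        self.constants = {}   # name -> value pushed when the word executes

    def here(self):
        return len(self.cells)

    def comma(self, value):
        self.cells.append(value)

    def constant(self, name, value):
        self.constants[name] = value

    def variable(self, name):
        addr = self.here()          # HERE
        self.comma(0)               # 0 ,
        self.constant(name, addr)   # CONSTANT

    def execute(self, name):
        return self.constants[name]
```

Executing the variable's name returns the address of its cell, exactly as a constant would return its value, so @ and ! work on it without any DOVAR primitive.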