RFD: Legacy Wordset

Peter Knaggs

Aug 23, 2009, 12:40:52 PM

At the Vienna meeting it was decided that the "Obsolete" word marked in
section 1.4.2 namely #TIB, CONVERT, EXPECT, FORGET, QUERY, SPAN and TIB
should be removed from the CORE-EXT and TOOLS-EXT word sets and placed
into a new word set of there own, the Legacy word set.

The new chapter (word set) complete with introduction can be found at:

http://www.rigwit.co.uk/forth/legacy-09-2.pdf

Elizabeth D Rather

Aug 23, 2009, 10:18:59 PM

The intent of the Forth94 TC was that in the next standard these words
should go away altogether, and indeed, in what was intended to be the
start of the next round, a meeting at NASA-GSFC in 1999, the TC voted to
discard them.

IMO retaining these words in a "Legacy" wordset will have the
undesirable effect of perpetuating them. The world has been warned for
15 years that they were going away; just say "goodbye"!

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Stephen Pelc

Aug 24, 2009, 6:15:48 AM

On Sun, 23 Aug 2009 16:18:59 -1000, Elizabeth D Rather
<era...@forth.com> wrote:

>The intent of the Forth94 TC was that in the next standard these words
>should go away altogether, and indeed, in what was intended to be the
>start of the next round, a meeting at NASA-GSFC in 1999, the TC voted to
>discard them.
>
>IMO retaining these words in a "Legacy" wordset will have the
>undesirable effect of perpetuating them. The world has been warned for
>15 years that they were going away; just say "goodbye"!

The law of unintended consequences still applies! You know that some
client with a code base evolving over 25 years still hasn't removed
even older FIG-isms. Without this warning, some hotshot will write a
library that reuses these names and breaks later code. The Forth200x
approach in this instance is to minimise surprises.

That hotshots neither read nor write documentation is another
matter altogether.

Stephen

--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads

Howerd

Aug 24, 2009, 8:29:10 AM

Hi Elizabeth,

Oops - I must have missed the warning - I've just added a target
version of QUIT to MSP430 SwiftX which uses some of these legacy
words.
Are there new words that have the same functionality, and if so, is
there a reference implementation?

Regards

Howerd

On 24 Aug, 03:18, Elizabeth D Rather <erat...@forth.com> wrote:
> Peter Knaggs wrote:
> > At the Vienna meeting it was decided that the "Obsolete" word marked in
> > section 1.4.2 namely #TIB, CONVERT, EXPECT, FORGET, QUERY, SPAN and TIB
> > should be removed from the CORE-EXT and TOOLS-EXT word sets and placed
> > into a new word set of there own, the Legacy word set.
>

...

Bernd Paysan

Aug 24, 2009, 9:42:55 AM

Elizabeth D Rather wrote:
> The intent of the Forth94 TC was that in the next standard these words
> should go away altogether, and indeed, in what was intended to be the
> start of the next round, a meeting at NASA-GSFC in 1999, the TC voted to
> discard them.
>
> IMO retaining these words in a "Legacy" wordset will have the
> undesirable effect of perpetuating them. The world has been warned for
> 15 years that they were going away; just say "goodbye"!

The purpose of the "legacy" wordset is to tell people that words with that
name were in use, and maybe still are in use. E.g. if I write a library,
and choose that "EXPECT" is a good word for defining patterns in a form
editor (like s" [0-9]*" EXPECT will read in only digit strings), I'm warned
that other people might already be using these words. And when I'm
implementing a Forth, and am about to implement EXPECT because I remember
having seen it in Starting Forth, I'm discouraged.

I think the text here

"They have been grouped together into the legacy word set as although
obsolete these words are still to be found in widespread use in legacy
code."

is a bit too positive and encouraging. IMHO it's more that we want to
document these legacy words because, although we discourage their use in
implementations and applications, people may still want to keep them for
backward compatibility reasons, and so the names are not free to use.

I'd rather put it into the informal appendix. The legacy words should not
be part of the formal standard document.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

Peter Knaggs

Aug 24, 2009, 1:08:40 PM

Bernd Paysan wrote:
>
> The purpose of the "legacy" wordset is to tell people that words with that
> name were in use, and maybe still are in use. E.g. if I write a library,
> and choose that "EXPECT" is a good word for defining patterns in a form
> editor (like s" [0-9]*" EXPECT will read in only digit strings), I'm warned
> that other people might already be using these words. And when I'm
> implementing a Forth, and am about to implement EXPECT because I remember
> having seen it in Starting Forth, I'm discouraged.
>
> I think the text here
>
> "They have been grouped together into the legacy word set as although
> obsolete these words are still to be found in widespread use in legacy
> code."
>
> is a bit too positive and encouraging. IMHO it's more that we want to
> document these legacy words because, although we discourage their use in
> implementations and applications, people may still want to keep them for
> backward compatibility reasons, and so the names are not free to use.

I also want to cover the case, when somebody who is new to Forth is
reading through old code, which may use these words. Such a person
would need to dig out the old fig-FORTH standard. By including the
definitions here we ease their pain. As this proposal has not been
through the RfD/CfV process, I am open to suggestions.

> I'd rather put it into the informal appendix. The legacy words should not
> be part of the formal standard document.

Personally, I would agree, but the vote at the last meeting wanted it
normative. Again this is something we could ask the group to vote on.

--
Peter Knaggs

Elizabeth D Rather

Aug 24, 2009, 1:26:11 PM

Stephen Pelc wrote:
> On Sun, 23 Aug 2009 16:18:59 -1000, Elizabeth D Rather
> <era...@forth.com> wrote:
>
>> The intent of the Forth94 TC was that in the next standard these words
>> should go away altogether, and indeed, in what was intended to be the
>> start of the next round, a meeting at NASA-GSFC in 1999, the TC voted to
>> discard them.
>>
>> IMO retaining these words in a "Legacy" wordset will have the
>> undesirable effect of perpetuating them. The world has been warned for
>> 15 years that they were going away; just say "goodbye"!
>
> The law of unintended consequences still applies! You know that some
> client with a code base evolving over 25 years still hasn't removed
> even older FIG-isms. Without this warning, some hotshot will write a
> library that reuses these names and breaks later code. The Forth200x
> approach in this instance is to minimise surprises.
>
> That hotshots neither read nor write documentation is another
> matter altogether.

The policy of other language TCs is to remove words after one cycle of
warnings. Since ANSI requires reaffirmation or revision every five
years that could be a fairly short period. We modeled the whole
"obsolescent" approach on common standards practice.

People maintaining a long-term code base (and God knows we love 'em)
should track standards changes as well as other changes that affect
them. Yes, standards bodies need to try to minimize their pain, but
when the serious decision is made to declare something "obsolescent"
that is intended as an aggressive notice that this feature has a problem
and needs to go.

So long as a legacy application continues to use its legacy underlying
system, there shouldn't be a problem. If the app wishes to retain
portability, it needs to follow the standard, including abandoning
deprecated words.

Declaring a word "obsolescent" is not done trivially, because a word is
"out of fashion", but because it has some serious unresolved
side-effects or issues that the standard is trying to resolve. In the
case of most of these, the problem is that people were trying to
manipulate the input stream in ways that, although they worked on some
implementations, were disastrous on others. Rendering the input stream
more "opaque" preserved most of the basic capability while protecting
against the worst side-effects.

FORGET has already been discussed here exhaustively.

Elizabeth D Rather

Aug 24, 2009, 1:51:07 PM

Howerd wrote:
> Hi Elizabeth,
>
> Oops - I must have missed the warning - I've just added a target
> version of QUIT to MSP430 SwiftX which uses some of these legacy
> words.
> Are there new words that have the same functionality, and if so, is
> there a reference implementation?
>
> Regards
>
> Howerd

Direct manipulation of the input stream pointers is discouraged. SOURCE
returns the address and length of the input stream (replacing TIB, #TIB,
and SPAN). SAVE-INPUT and RESTORE-INPUT let you temporarily redirect
it. CONVERT is superseded by >NUMBER. EXPECT is superseded by ACCEPT.
FORGET is superseded by MARKER. The function of QUERY may be performed
with ACCEPT and EVALUATE.
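
As a rough sketch of those replacements in Forth-94 terms: the names LINE-MAX, LINE-BUF, GET-LINE, QUERY-LIKE, PARSE-UNSIGNED, -SCRATCH and TEMP-WORD below are made up for illustration, and only ACCEPT, EVALUATE, >NUMBER and MARKER are standard. ACCEPT hands back the character count on the stack, so no SPAN-like variable is needed.

```forth
\ Invented names throughout; this is a sketch, not a reference implementation.
80 CONSTANT LINE-MAX
CREATE LINE-BUF LINE-MAX CHARS ALLOT

\ EXPECT + SPAN  ->  ACCEPT, which returns the count directly
: GET-LINE ( -- c-addr u )  LINE-BUF DUP LINE-MAX ACCEPT ;

\ QUERY  ->  read a line with ACCEPT, then interpret it with EVALUATE
: QUERY-LIKE ( i*x -- j*x )  GET-LINE EVALUATE ;

\ CONVERT  ->  >NUMBER, which takes and returns an explicit length
: PARSE-UNSIGNED ( c-addr u -- u2 flag )  \ flag is true if every char was a digit
   0 0 2SWAP >NUMBER     \ ( ud-lo ud-hi c-addr2 u-rest )
   NIP 0=                \ ( ud-lo ud-hi flag )  true when nothing was left over
   NIP ;                 \ ( ud-lo flag )        keep only the low cell

\ FORGET  ->  MARKER: define the marker first, execute it later
MARKER -SCRATCH
: TEMP-WORD ( -- )  ." temporary" ;
-SCRATCH                 \ removes TEMP-WORD and -SCRATCH itself
```

PARSE-UNSIGNED silently discards the high cell of the double result, which is an assumption that the input fits in one cell.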

SwiftX Pro includes a target-resident interpreter.

Cheers,
Elizabeth

> On 24 Aug, 03:18, Elizabeth D Rather <erat...@forth.com> wrote:
>> Peter Knaggs wrote:
>>> At the Vienna meeting it was decided that the "Obsolete" word marked in
>>> section 1.4.2 namely #TIB, CONVERT, EXPECT, FORGET, QUERY, SPAN and TIB
>>> should be removed from the CORE-EXT and TOOLS-EXT word sets and placed
>>> into a new word set of there own, the Legacy word set.
> ...
>> The intent of the Forth94 TC was that in the next standard these words
>> should go away altogether, and indeed, in what was intended to be the
>> start of the next round, a meeting at NASA-GSFC in 1999, the TC voted to
>> discard them.
>>
>> IMO retaining these words in a "Legacy" wordset will have the
>> undesirable effect of perpetuating them. The world has been warned for
>> 15 years that they were going away; just say "goodbye"!
>

Elizabeth D Rather

Aug 24, 2009, 1:54:51 PM

Describing them in an appendix, such as Bernd suggests, would satisfy
that purpose without encouraging people to perpetuate them.

alextangent

Aug 25, 2009, 12:39:54 PM

I suppose it's too much to ask that [COMPILE] be added to the list?

--
Regards
Alex McDonald

Peter Knaggs

Aug 25, 2009, 1:09:04 PM

This was the list of words which were marked as obsolete in the '94
document. If you would like to suggest we obsolete [COMPILE] in the new
standard, please make your argument.

--
Peter Knaggs

Anton Ertl

Aug 25, 2009, 1:03:35 PM

Peter Knaggs <p...@bcs.org.uk> writes:

>alextangent wrote:
>> I suppose it's too much to ask that [COMPILE] be added to the list?
>
>This was the list of words which were marked as obsolete in the '94
>document. If you would like to suggest we obsolete [COMPILE] in the new
>standard, please make your argument.

There is no way to determine if a word has "other than default
compilation semantics". OTOH, we know it for standard words, and for
words defined in standard ways, so if [COMPILE] does not work as
advertised for words defined in non-standard ways, that's not really a
problem for the standard (standard programs will still work on
standard systems).
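
To illustrate the case that is well defined, here is a small sketch using THEN, a standard word whose compilation semantics are known to be non-default. ENDIF-A, ENDIF-B, CLAMP-A and CLAMP-B are invented names.

```forth
\ Invented names; both spellings should behave the same for a word like THEN.
: ENDIF-A  [COMPILE] THEN ; IMMEDIATE   \ Forth-94 CORE EXT spelling
: ENDIF-B  POSTPONE THEN ; IMMEDIATE    \ POSTPONE does the same job here

: CLAMP-A ( n -- n|0 )  DUP 0< IF DROP 0 ENDIF-A ;
: CLAMP-B ( n -- n|0 )  DUP 0< IF DROP 0 ENDIF-B ;
```

The two spellings differ only for words with default compilation semantics: `[COMPILE] DUP` simply compiles DUP, whereas `POSTPONE DUP` compiles code that will compile DUP later.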

FIND is a bigger problem IMO: no well-defined behaviour even for
standard programs and a stone-age interface.

But we can discuss if there are any problematic words that we might
want to obsolete some time in the future during the Forth200x meeting;
we should provide well-defined replacements, though.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2009: http://www.euroforth.org/ef09/

Albert van der Horst

Aug 26, 2009, 6:12:35 AM

In article <2009Aug2...@mips.complang.tuwien.ac.at>,

Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>Peter Knaggs <p...@bcs.org.uk> writes:
>>alextangent wrote:
>>> I suppose it's too much to ask that [COMPILE] be added to the list?
>>
>>This was the list of words which were marked as obsolete in the '94
>>document. If you would like to suggest we obsolete [COMPILE] in the new
>>standard, please make your argument.
>
>There is no way to determine if a word has "other than default
>compilation semantics". OTOH, we know it for standard words, and for
>words defined in standard ways, so if [COMPILE] does not work as
>advertised for words defined in non-standard ways, that's not really a
>problem for the standard (standard programs will still work on
>standard systems).
>
>FIND is a bigger problem IMO: no well-defined behaviour even for
>standard programs and a stone-age interface.

I'm sure a lot of people want to get rid of FIND and its companion
WORD.

This is the best I could come up with

worddoc( {DICTIONARY},{FOUND},{found},{sc --- dea},{},
{ Look up the string forthvar({sc}) in the dictionary observing
the current search order. If found, leave the dictionary
entry address forthvar({dea}) of the first entry found, else
leave a forthdefin({nil pointer}). },

(sc is an (addr,len) pair )

Its companion is PARSE-NAME, which is already there.

Without the notion of a ``dictionary entry address'' there
is no general way to arrive at other properties of a word
than the execution token (such as its name, or whether it is
immediate). In other words, currently Forth in general has no
canonical way of presenting the result of a dictionary search. It
is hard to solve this without committing to a certain
implementation model.

I'm not sure whether returning zero for ``no result'' has
a precedent, but it is mucho baaje convenient.

<SNIP>

>
>- anton

Groetjes Albert

--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Anton Ertl

Aug 26, 2009, 9:21:15 AM

Albert van der Horst <alb...@spenarnc.xs4all.nl> writes:
>In article <2009Aug2...@mips.complang.tuwien.ac.at>,
>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>FIND is a bigger problem IMO: no well-defined behaviour even for
>>standard programs and a stone-age interface.
>
>I'm sure a lot of people want to get rid of FIND and its companion
>WORD.

WORD does not have problems like FIND, it's just not very useful. As
such it should be moved to Core Ext. Or maybe the idea of a legacy
wordset would be good for words like WORD (whereas words with problems
are obsolete, not legacy).

>This is the best I could come up with
>
>worddoc( {DICTIONARY},{FOUND},{found},{sc --- dea},{},
>{ Look up the string forthvar({sc}) in the dictionary observing
>the current search order. If found, leave the dictionary
>entry address forthvar({dea}) of the first entry found, else
>leave a forthdefin({nil pointer}). },

Maybe you should present that in formatted form rather than source
form. I guess the word name is FOUND, though.

Yes, something like that plus additional words for dealing with deas
(or name tokens, as they are called in Gforth). The words in Gforth
for that are:

http://www.complang.tuwien.ac.at/forth/gforth/Docs-html/Name-token.html

The only problem with that is that it does not fit implementations
that use several name entries to represent interpretation and
compilation semantics. But since AFAIK no widely used system uses
these implementation techniques, maybe we should just go ahead with
this stuff.
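
A minimal Gforth-specific sketch of that name-token style follows. It assumes the Gforth words find-name ( c-addr u -- nt | 0 ), name>string ( nt -- c-addr u ) and name>int ( nt -- xt ) behave as I recall from the page linked above, so check them against your Gforth version. The word .name-info is an invented name.

```forth
\ Gforth only; .name-info is a made-up helper, not part of Gforth.
: .name-info ( c-addr u -- )
   find-name ?dup if
      dup name>string type      \ print the name as stored in the header
      ."  found, xt=" name>int .
   else
      ." not found"
   then cr ;

s" dup" .name-info
```

The point of the design is that the search returns one token, and immediacy, name and xt are all derived from it afterwards.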

Anton Ertl

Aug 26, 2009, 9:30:45 AM

Peter Knaggs <p...@bcs.org.uk> writes:
>I also want to cover the case, when somebody who is new to Forth is
>reading through old code, which may use these words. Such a person
>would need to dig out the old fig-FORTH standard.

No, for these particular words they would just have to dig out the
Forth-94 standard.

If the old code is written for a Fig-Forth-based system, they should
definitely read the Fig-Forth manual rather than Forth 200x or
Forth-94, because the Forth-83 fault line is between that code and
these standards.

I am also not sure if someone who is new to Forth would try to
understand programs (old or new) by reading the standard document.

>By including the
>definitions here we ease their pain.

I think this is such an uncommon case that I am not sure if it's
worthwhile to make the standard bigger for it.

>As this proposal has not been
>through the RfD/CfV process, I am open to suggestions.

IMO just listing the names and the last standard in which each name
was included is enough.

The purposes I envision for this are:

1) A new system implementor shouldn't accidentally use one of these
names for a system-specific word.

2) Future proponents (of standard extensions) shouldn't accidentally
use one of these names for a new proposed standard word.

>> I'd rather put it into the informal appendix. The legacy words should not
>> be part of the formal standard document.
>
>Personally, I would agree, but the vote at the last meeting wanted it
>normative. Again this is something we could ask the group to vote on.

If we want to force purpose 1 above (as far as a standard can force
such things), we should make it normative and require Forth-94
semantics for these words if they are implemented (either by
referencing Forth-94 or by directly including the text).

However, I am not convinced that it is necessary to try to force this.
Moreover, I am convinced that checking for these words in a test suite
is more effective than the difference between normative and
informative sections in the document.
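
One possible shape for such a check, as a sketch in Forth-94 terms: LEGACY-PRESENT? and .LEGACY are invented names, and what a real test suite should do on a hit (warn, fail, or compare against Forth-94 semantics) is left open here.

```forth
\ Invented helper names; the check only reports whether a name is findable.
: LEGACY-PRESENT? ( "name" -- flag )   \ true if "name" is in the search order
   BL WORD FIND NIP 0<> ;

: .LEGACY ( "name" -- )                \ print a note if "name" is defined
   >IN @  LEGACY-PRESENT?  IF
      >IN !  BL WORD COUNT TYPE  ."  is still defined" CR   \ re-parse to print
   ELSE  DROP  THEN ;

.LEGACY #TIB    .LEGACY CONVERT  .LEGACY EXPECT  .LEGACY FORGET
.LEGACY QUERY   .LEGACY SPAN     .LEGACY TIB
```

The irony that the check itself leans on WORD and FIND is deliberate: they are the only lookup words a bare Forth-94 system guarantees.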

Elizabeth D Rather

Aug 26, 2009, 2:44:27 PM

I completely agree with all this, except that I feel strongly that
including them in any normative section will encourage folks to
perpetuate them.

Cheers,
Elizabeth

--

Jonah Thomas

Aug 26, 2009, 4:08:02 PM

Elizabeth D Rather <era...@forth.com> wrote:

> I completely agree with all this, except that I feel strongly that
> including them in any normative section will encourage folks to
> perpetuate them.

So if it's not normative, you're giving people a list of words that some
Forth systems used to use, and you're warning them that if they re-use
those names they might run into name clashes.

But if it is normative, you're giving people a list of words that are
not actually standard words, and you're requiring that they.... What?

Non-normative sounds quite appropriate to me at this point.

Sp...@controlq.com

Aug 26, 2009, 8:21:24 PM

On Wed, 26 Aug 2009, Elizabeth D Rather wrote:

> Date: Wed, 26 Aug 2009 08:44:27 -1000
> From: Elizabeth D Rather <era...@forth.com>
> Newsgroups: comp.lang.forth
> Subject: Re: RFD: Legacy Wordset

>
> I completely agree with all this, except that I feel strongly that including
> them in any normative section will encourage folks to perpetuate them.
>
> Cheers,
> Elizabeth

Why not simply change the name from "Legacy" to "Deprecated" wordset?

The connotation is different, and when I read that an interface is
deprecated, I think twice before I use it. I might still use it, but at
least I rationalize exactly why I would, and more often than not, I'd find
the alternative.

Rob Sciuk

Duke Normandin

Aug 27, 2009, 8:47:55 AM

I agree! I second your proposal!
--
duke

Elizabeth D Rather

Aug 27, 2009, 9:18:53 AM

Better still, we could add a requirement that executing any of these
words would generate a message saying, "This system is using hopelessly
obsolete and deprecated words, and should be updated immediately."

Nah, better to take them out of the normative section altogether. A
note regarding this is appropriate for the Appendix describing changes
from Forth94 (there *will* be one, yes?).

Cheers,
Elizabeth

Jerry Avins

Mar 11, 2010, 8:22:10 PM

_Their_ own?

Jerry
--
Why am I in a handbasket? Where are we going?
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯

Bruce McFarling

Mar 11, 2010, 9:25:31 PM

On Mar 11, 8:22 pm, Jerry Avins <j...@ieee.org> wrote:
> Peter Knaggs wrote:
> > At the Vienna meeting it was decided that the "Obsolete" word marked in
> > section 1.4.2 namely #TIB, CONVERT, EXPECT, FORGET, QUERY, SPAN and TIB
> > should be removed from the CORE-EXT and TOOLS-EXT word sets and placed
> > into a new word set of there own, the Legacy word set.

> > The new chapter (word set) complete with introduction can be found at:
> > http://www.rigwit.co.uk/forth/legacy-09-2.pdf

> _Their_ own?

I missed this discussion entirely.

I agree with Ms. Rather, there should not be any definitions in the
glossary of the wordset proper.

There is a procedural rationale for a "Legacy" or "Deprecated"
wordset. In its normative section, it should simply state that the
following words marked as obsolescent in the Forth-94 standard have
now been removed from the standard. Relegate the definitions to the
parallel Appendix for informational purposes.

Also, one wonders what the point is of an Environmental Query for
existence of these words. Rather, what would be informative is an
Environmental Query that the implementation respects the relegation of
these words from obsolescent to deprecated status.

Sieur de Bienville

Mar 12, 2010, 8:51:35 PM

For what it's worth, I agree with Mr. McFarling and Ms. Rather on this
matter.

The Beez'

Mar 13, 2010, 6:45:21 AM

Simply rip 'em out and add WORD and FIND to the package. PARSE is a
great replacement for WORD and FIND badly needs replacement too.

Hans Bezemer

Coos Haak

Mar 13, 2010, 8:53:32 AM

Op Sat, 13 Mar 2010 03:45:21 -0800 (PST) schreef The Beez':

> Simply rip 'em out and add WORD and FIND to the package. PARSE is a
> great replacement for WORD and FIND badly needs replacement too.
>
> Hans Bezemer

Don't forget PARSE-NAME as a great substitute for WORD.
PARSE doesn't skip leading delimiters.
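
A small sketch of that difference, assuming a system that provides PARSE-NAME (it comes from Forth200x, so older systems may lack it). SHOW-PARSE and SHOW-PARSE-NAME are invented names.

```forth
\ Invented names; each word prints what it parsed between brackets.
: SHOW-PARSE      ( "ccc<space>" -- )    BL PARSE    ." [" TYPE ." ]" ;
: SHOW-PARSE-NAME ( "<spaces>name" -- )  PARSE-NAME  ." [" TYPE ." ]" ;

SHOW-PARSE        CR    \ PARSE stops at the first extra space: empty string
SHOW-PARSE-NAME   CR    \ PARSE-NAME skips the spaces and returns the name CR
```

In the first line the text interpreter goes on to execute CR, because PARSE consumed only one of the extra spaces. In the second line CR is consumed as the parsed name and is not executed.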

--
Coos

CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html

Bruce McFarling

Mar 13, 2010, 5:05:18 PM

On Mar 13, 6:45 am, "The Beez'" <hans...@bigfoot.com> wrote:
> Simply rip 'em out and add WORD and FIND to the package. PARSE is a
> great replacement for WORD and FIND badly needs replacement too.

SEARCH-WORDLIST works if you have wordlists, but it's better for
searching a particular wordlist.

NAME-SEARCH ( ca u -- 0 | xt 1 | xt -1 )
... would be a good match to PARSE-NAME.

Ed

Mar 13, 2010, 8:36:21 PM

It might be a good match but it resolves nothing because
PARSE-NAME resolved nothing.

PARSE indeed provides the primitive which could replace WORD
(because it's a factor of WORD ).

But WORD itself cannot be replaced. It's been at the core
of forth since earliest days and lurks in so many programs that it
cannot be got rid of without potentially breaking them. And if one
can't get rid of WORD then there's little point replacing FIND .

Bruce McFarling

Mar 14, 2010, 1:08:16 AM

On Mar 13, 8:36 pm, "Ed" <nos...@invalid.com> wrote:
> But WORD itself cannot be replaced. It's been at the core
> of forth since earliest days and lurks in so many programs that it
> cannot be got rid of without potentially breaking them.

I believe it was "The Beez" who suggested ripping WORD and FIND out,
and I neither dissented nor concurred ...

... just as a library might use WORD as a primitive once if better
alternatives are not available, without using it again, a library that
finds SEARCH-WORDLIST and GET-ORDER are not available might use FIND
to define NAME-SEARCH, without ever using it again.
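
As a sketch of that fallback, NAME-SEARCH with the stack effect suggested earlier in the thread could be layered on FIND roughly as follows. NS-BUF and the 31-character limit are assumptions made for this illustration, and the result inherits whatever limitations FIND has on the host system.

```forth
\ Hypothetical fallback: NAME-SEARCH built on FIND via a counted-string buffer.
CREATE NS-BUF 32 CHARS ALLOT          \ 1 count char + up to 31 name chars

: NAME-SEARCH ( c-addr u -- 0 | xt 1 | xt -1 )
   31 MIN                             \ clamp to the buffer size (assumption)
   DUP NS-BUF C!                      \ store the count
   NS-BUF CHAR+ SWAP CHARS MOVE       \ copy the name after the count
   NS-BUF FIND                        \ ( c-addr 0 | xt 1 | xt -1 )
   DUP 0= IF NIP THEN ;               \ on failure drop the buffer address
```

After this one definition a library can use NAME-SEARCH everywhere and never mention FIND again.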

> And if one can't get rid of WORD then there's little point replacing FIND.

A source code library is perfectly free to state a dependency on one or
the other of WordA and WordB being present.

The standardization of [DEFINED] and [UNDEFINED] makes it easier to
automate many things, but bootstrapping [DEFINED] and [UNDEFINED] in a
small Forth-94 CORE system requires WORD and FIND. A library standard
would cope with that with distinct bootstraps for distinct starting
points. One or more of those starting points would be a bootstrap base
for low resource systems. A common scenario for low resource systems
is for dataspace to be especially tight, so a bootstrap base that
relies on the more dataspace frugal PARSE-NAME and NAME-SEARCH rather
than WORD and FIND cannot be dismissed out of hand.
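
For reference, the WORD and FIND bootstrap mentioned above can be as short as the following sketch. I believe it matches the usual portable definition, but treat that as an assumption rather than a quotation from the standard.

```forth
\ Portable sketch for a Forth-94 system that lacks these two words.
: [DEFINED]   ( "name" -- flag )  BL WORD FIND NIP 0<> ; IMMEDIATE
: [UNDEFINED] ( "name" -- flag )  BL WORD FIND NIP 0=  ; IMMEDIATE

\ Typical use in a bootstrap layer; the message text is illustrative only.
[UNDEFINED] PARSE-NAME [IF]
   .( PARSE-NAME is missing; a bootstrap layer would define it here ) CR
[THEN]
```

A PARSE-NAME and NAME-SEARCH based bootstrap would replace `BL WORD FIND NIP` with a lookup that needs no counted-string buffer, which is where the dataspace saving would come from.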

Ed

Mar 14, 2010, 3:03:17 AM

How much baggage and convolution do you think a programmer will bear
before they finally give up forth?

Marcel Hendrix

Mar 14, 2010, 3:35:04 AM

"Ed" <nos...@invalid.com> writes Re: RFD: Legacy Wordset
[..]

> How much baggage and convolution do you think a programmer will bear
> before they finally give up forth?

As the words DUP + DROP OVER SWAP ROT and NIP
can be trivially and completely portably defined
using >R R> and R@ , the former 7 words can
be removed from the CORE wordset, without any
negative consequences whatever.

That should provide some breathing room.

-marcel

-- -------------------------------------------
ANEW -perverse

: dup >r r@ r> ;
: + >r dup dup - r> - - ;
: drop dup - + ;
: over >r dup dup r@ - dup >r - r> r> + ;
: swap over >r >r drop r> r> ;
: rot >r swap r> swap ;
: nip swap drop ;

: test ( -- -80 )
0
1 2 3 4
ROT ( -- 1 3 4 2 )
DUP + DROP ( -- 1 3 4 )
NIP ( -- 1 4 )
SWAP ( -- 4 1 )
OVER + + ( -- 9 )
89 - + ; ( -- -80 )

SEE test
$01244E00 : test
$01244E0A push #176 b#
$01244E0C ;

Peter Knaggs

Mar 14, 2010, 6:20:38 AM

On Sat, 13 Mar 2010 11:45:21 -0000, The Beez' <han...@bigfoot.com> wrote:
>
> Simply rip 'em out and add WORD and FIND to the package. PARSE is a
> great replacement for WORD and FIND badly needs replacement too.

Nobody is suggesting we remove WORD or FIND. The words in question
are: #TIB FORGET SPAN CONVERT QUERY TIB EXPECT

Although there is a question over FORGET.

The procedure for removing a word from the document is to move the
word into an EXT wordlist and mark it as obsolete. Thus in a future
revision of the standard the word can be removed.

--
Peter Knaggs

Albert van der Horst

Mar 14, 2010, 10:14:54 AM

In article <hnheeq$2ll$1...@news-01.bur.connect.com.au>,

WORD in ciforth could be in a loadable extension, because
it can be defined portably based on PARSE and PARSE-NAME.
I don't use WORD anywhere.

FIND is even worse, I hate its stack diagram.
I have never made use or sense of the indication of immediacy it
returns.

I would have it replaced by a word that returns some dictionary
entry address, (name field address, dictionary header address,
whatever) and defined ways to arrive from there at an xt
(or separate compilation and interpretation xt's)
or immediacy information.

Anton Ertl

Mar 14, 2010, 11:56:53 AM

Bruce McFarling <agi...@netscape.net> writes:

That would have the same problem as FIND (and SEARCH-WORDLIST): It
does not give a well-defined result for words that cannot be
represented just with an xt and a flag. I.e., it would not solve the
biggest problem that FIND has. Better solutions have been proposed
here several times.

Bruce McFarling

Mar 14, 2010, 12:12:21 PM

On Mar 14, 10:14 am, Albert van der Horst <alb...@spenarnc.xs4all.nl>
wrote:

> I would have it replaced by a word that returns some dictionary
> entry address, (name field address, dictionary header address,
> whatever) and defined ways to arrive from there at an xt
> (or separate compilation and interpretation xt's)
> or immediacy information.

Something like the following? (with "flags" defined to be non-zero).

NAME-SEARCH ( ca u -- FALSE | nt flags )
Immediate? ( flags -- fl )
NT>XT ( nt -- xt )

Bruce McFarling

Mar 14, 2010, 12:19:12 PM

On Mar 14, 6:20 am, "Peter Knaggs" <p...@bcs.org.uk> wrote:
> The procedure for removing a word from the document is to move the
> word into an EXT wordlist and mark it as obsolete. Thus in a future
> revision of the standard the word can be removed.

That would be useful information for the explanatory appendix, as that
is unlikely to be common knowledge. For example, Ms. Elizabeth Rather
said above:

> The intent of the Forth94 TC was that in the next standard these
> words should go away altogether, and indeed, in what was intended
> to be the start of the next round, a meeting at NASA-GSFC in 1999,
> the TC voted to discard them.

There is nothing in the "Legacy" wordset that indicates that words
marked obsolete are on their last step out the door.

Bruce McFarling

Mar 14, 2010, 12:27:19 PM

On Mar 14, 6:20 am, "Peter Knaggs" <p...@bcs.org.uk> wrote:

> On Sat, 13 Mar 2010 11:45:21 -0000, The Beez' <hans...@bigfoot.com> wrote:

> > Simply rip 'em out and add WORD and FIND to the package. PARSE is a
> > great replacement for WORD and FIND badly needs replacement too.

> Nobody is suggesting we remove WORD or FIND.

BTW, I believe you are replying to a message where The Beez suggested
precisely that. Rather, removing WORD and FIND is beyond the scope of
the RfD, and the subject line should have been edited. Sorry.

Peter Knaggs

Mar 14, 2010, 3:54:19 PM

On Sun, 14 Mar 2010 16:19:12 -0000, Bruce McFarling <agi...@netscape.net>
wrote:

> On Mar 14, 6:20 am, "Peter Knaggs" <p...@bcs.org.uk> wrote:
>> The procedure for removing a word from the document is to move the
>> word into an EXT wordlist and mark it as obsolete. Thus in a future
>> revision of the standard the word can be removed.
>
> That would be useful information for the explanatory appendix, as that
> is unlikely to be common knowledge. For example, Ms. Elizabeth Rather
> said above:
>
>> The intent of the Forth94 TC was that in the next standard these
>> words should go away altogether, and indeed, in what was intended
>> to be the start of the next round, a meeting at NASA-GSFC in 1999,
>> the TC voted to discard them.

To quote from the ANS document:
>
> 1.4.2 Obsolescent features
>
> This Standard adopts certain words and practices that cause some
> previously used words and practices to become obsolescent. Although
> retained here because of their widespread use, their use in new
> implementations or new programs is discouraged, because they may be
> withdrawn from future revisions of the Standard.

> There is nothing in the "Legacy" wordset that indicates that words
> marked obsolete are on their last step out the door.

That is because there was a significant resistance to removing the
words from the document altogether. The proposed wordset was a
compromise, and does not represent the whole proposal.

The current state of the full proposal can be seen at:

http://groups.google.com/group/comp.lang.forth/msg/1abe91f8c9c4a2a1

However, given the strength of feeling over FORGET, I put this
proposal on the back burner until I could find the time to rework
it.

--
Peter Knaggs

Albert van der Horst

Mar 14, 2010, 4:52:17 PM

In article <0445a66d-dfcd-453a...@x12g2000yqx.googlegroups.com>,
Bruce McFarling <agi...@netscape.net> wrote:
>On Mar 13, 8:36 pm, "Ed" <nos...@invalid.com> wrote:
>> But WORD itself cannot be replaced. It's been at the core
>> of forth since earliest days and lurks in so many programs that it
>> cannot be got rid of without potentially breaking them.
>
>I believe it was "The Beez" who suggested ripping WORD and FIND out,
>and I neither dissented nor concurred ...
>
>... just as a library might use WORD as a primitive once if better
>alternatives are not available, without using it again, a library that
>finds SEARCH-WORDLIST and GET-ORDER are not available might use FIND
>to define NAME-SEARCH, without ever using it again.
>

>> And if one can't get rid of WORD then there's little point replacing FIND.
>
>A source code library is perfectly free to state dependencies on one or
>the other of WordA and WordB being present.
>
>The standardization of [DEFINED] and [UNDEFINED] makes it easier to
>automate many things, but bootstrapping [DEFINED] and [UNDEFINED] in a
>small Forth-94 CORE system requires WORD and FIND. A library standard
>would cope with that with distinct bootstrap for distinct starting
>points. One or more of those starting points would be a bootstrap base
>for low resource systems. A common scenario for low resource systems
>is for dataspace to be especially tight, so a bootstrap base that
>relies on the more dataspace frugal PARSE-NAME and NAME-SEARCH rather
>than WORD and FIND cannot be dismissed out of hand.

Indeed. In a small ciforth core system [DEFINED] is defined
using carnal knowledge, in a way a user of the system wouldn't dream of:

: [UNDEFINED] NAME 2DUP WANTED PRESENT 0= ; IMMEDIATE
: [DEFINED] POSTPONE [UNDEFINED] 0= ; IMMEDIATE

This, by the way, tries to load a word from the library (WANTED)
before it is declared not present in the search order.
(This seems to be allowed.)
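
For comparison, the Forth-94 CORE bootstrap mentioned in the quoted text can be sketched with WORD and FIND alone. This is a minimal sketch and not the ciforth definition above; it assumes the system does not already provide the two words, and it uses SWAP DROP instead of NIP so that only CORE words are needed.

```forth
\ Sketch: conditional-compilation tests built only from CORE words.
\ FIND returns ( c-addr 0 | xt 1 | xt -1 ); only the flag is kept.
: [UNDEFINED] ( "name" -- flag )  BL WORD FIND SWAP DROP 0= ; IMMEDIATE
: [DEFINED]   ( "name" -- flag )  BL WORD FIND SWAP DROP 0= 0= ; IMMEDIATE
```

The flags are normally consumed by [IF] ... [THEN], which come from the TOOLS EXT word set and would need their own bootstrap on a CORE-only system.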

Bruce McFarling

Mar 15, 2010, 2:21:47 AM

On Mar 14, 4:52 pm, Albert van der Horst <alb...@spenarnc.xs4all.nl>
wrote:

> Indeed. In a small ciforth core system [DEFINED] is defined
> using carnal knowledge, in a way a user of the system wouldn't dream of:

> : [UNDEFINED] NAME 2DUP WANTED PRESENT 0= ; IMMEDIATE
> : [DEFINED] POSTPONE [UNDEFINED] 0= ; IMMEDIATE

I should hope it is allowed, normally when I use "[UNDEFINED]" I
*prefer* that it be false and there is a system-provided word to do
the job.

Bruce McFarling

Mar 15, 2010, 3:21:52 AM

On Mar 14, 3:54 pm, "Peter Knaggs" <p...@bcs.org.uk> wrote:
> To quote from the ANS document:

> > 1.4.2 Obsolescent features

> > This Standard adopts certain words and practices that cause some
> > previously used words and practices to become obsolescent. Although
> > retained here because of their widespread use, their use in new
> > implementations or new programs is discouraged, because they may be
> > withdrawn from future revisions of the Standard.

That is precisely what I had in mind when I said it might not be
common knowledge that there is this two step procedure when Forth-94
gave the strong impression that it would likely be a direct drop kick
for goal.

> > > There is nothing in the "Legacy" wordset that indicates that words
> > > marked obsolete are on their last step out the door.

> That is because there was a significant resistance to removing the
> words from the document altogether. The proposed wordset was a
> compromise, and does not represent the whole proposal.

Another piece of information that is neither stated nor alluded to in
the "informative" or "explanatory" appendix, nor in the prelude to the
proposal in the current RfD. If this is a compromise required to get
this cruft out of CORE EXT, then I'm all for it.

EG, for the Appendix18:

"Some existing Standard words have had the status of obsolescent, and
are therefore subject to removal in a further revision of the
Standard. If removed from the Standard, compliant implementations
would be free to use these names to refer to incompatible semantics.
Due to widespread use in legacy code, this is not presently considered
good practice.

These words are collected in this wordset so that: an implementation
that does not support legacy code can implement the full CORE EXT
wordset without including words that should not be used in new
development; implementations that support legacy code using these
words can signal this support in response to the LEGACY environmental
query; and these semantics are reserved for these words in support of
legacy code".

WRT FORGET, wouldn't the simplest course be to just not worry about it
right now? It's in TOOLS EXT, a grab bag that includes EDITOR and
ASSEMBLER. Knowing that an implementation has all of CORE EXT is a lot
more informative as to what you can do than knowing that it has all of
TOOLS EXT.

Also, CORE EXT has 46 words, so having implementations and applications
list the 40 words respectively provided from or required from CORE EXT,
just to document that the 6 obsolescent words are not provided or
required, is a bigger nuisance. If the obsolescent CORE EXT words are
placed in LEGACY,
there is always the presently empty LEGACY EXT to accept the other
word still marked obsolescent and any words that might later be
rendered obsolescent if it should be decided at some time in the
future.

Ed

Mar 16, 2010, 3:12:00 AM

Albert van der Horst wrote:
> ...

> WORD in ciforth could be in a loadable extensions, because
> it can be defined portably based on PARSE and PARSE-NAME.
> I don't use WORD anywhere.

Individual implementations may be able to define WORD in loadable
form but it's doubtful a Standard Program could given the variability
possible under "3.4.1.1 Delimiters".

> FIND is even worse, I hate its stack diagram.
> I have never made use or sense of the indication of immediacy it
> returns.

> ...

The fact is WORD and FIND work, have done so for 30+ years,
and still do today. If and when they become unusable in mainstream
forths, I'll worry about replacing them.

The impetus behind all this is that people read "A.6.2.2008 PARSE"
and bought the story that WORD could be eliminated - which led
to PARSE-NAME. If WORD/FIND can't be eliminated from the
Standard because they're too entrenched in forth, where does that
now leave PARSE-NAME which isn't even a factor of WORD.
On its own by the look of it.

There have already been attempts to find PARSE-NAME a partner
but the justifications for doing so are looking more like excuses to
cover up the initial mistake.

Coos Haak

Mar 16, 2010, 12:46:03 PM

Op 14 Mar 2010 14:14:54 GMT schreef Albert van der Horst:

As an example, CHForth in its current form _has_ WORD and FIND as a
loadable extension. I don't miss them. I use PARSE-NAME and FIND-XT (
c-addr u -- c-addr u false | xt x true ) where x are the IMMEDIATE and
COMPILE-ONLY flags.

> Groetjes Albert
>
> --

Van't zelfde

Bruce McFarling

Mar 16, 2010, 1:04:55 PM

On Mar 16, 3:12 am, "Ed" <nos...@invalid.com> wrote:
> The fact is WORD and FIND work, have done so for 30+ years,
> and still do today.

But in many cases, WORD is an operation provided only for portability,
relying on factors that can be more efficiently exported as PARSE
and PARSE-NAME, and similarly for FIND.

Ed

Mar 16, 2010, 6:27:57 PM

One only has to look at mainstream forths today to find that their own
use of WORD outstrips anything else by a mile. That doesn't sound
obsolete to me.

PARSE-NAME was a thought bubble floated in an ANS rationale.

Ed

Mar 16, 2010, 6:30:51 PM

Marcel Hendrix wrote:
> "Ed" <nos...@invalid.com> writes Re: RFD: Legacy Wordset
> [..]
> > How much baggage and convolution do you think a programmer will bear
> > before they finally give up forth?
>
> As the words DUP + DROP OVER SWAP ROT and NIP
> can be trivially and completely portably defined
> using >R R> and R@ , the former 7 words can
> be removed from the CORE wordset, without any
> negative consequences whatever.
>
> That should provide some breathing room.
>
> -marcel
>
> -- -------------------------------------------
> ANEW -perverse

> ...

Yes it is perverse.

I feel sorry for those who once had trust in forth. Who joined forth in the
hopes that the principles and application of logic which the language
espoused would be applied by all users, and especially by those who
would later find themselves with the responsibility of maintaining the
language.

What a let down it must be to find that all that's left after 30 years, are
users whose only responsibility is to themselves. Who sit around a
table engaged in political horse-trading to see who will come out on top.
Where principles can be compromised and bartered away. What does
it matter if every mistake and poor choice one made in the past is now
entrenched in the language and foisted upon all users. What does it
matter if in order to get what one wants, one has to expound rationales
to a public that one barely believes oneself.

This is forth's future, and yes it is perverse.

Albert van der Horst

Mar 17, 2010, 8:02:24 AM

In article <hnnas4$735$1...@news-01.bur.connect.com.au>,

Well, well. Let me tell you this.

The problem with WORD is that it is two functionalities in one,
depending on control data that is disguised as data.
The modern insight is that this is bad design and bad factoring.
The best proof of this is that WORD can be defined in terms
of PARSE and PARSE-NAME. (In ciforth it actually is.)
Each time you use WORD to do PARSE-NAME, WORD is obliged to
waste its time detecting it must do PARSE-NAME not PARSE.
Each time you use WORD to do PARSE-NAME, the user is obliged to
waste his time remembering whether leading delimiters must be
skipped or not.

All this was cute in a time that assembly was compact compared to
high level code, and one had to cram functionality in one definition.
Not any more.
You remember ENCLOSE ?

Elizabeth D Rather

Mar 17, 2010, 3:09:08 PM

Albert van der Horst wrote:
...
>

> Well, well. Let me tell you this.
>
> The problem with WORD is that it is two functionalities in one,
> depending on control data that is disguised as data.
> The modern insight is that this is bad design and bad factoring.
> The best proof of this is that WORD can be defined in terms
> of PARSE and PARSE-NAME. (In ciforth it actually is.)
> Each time you use WORD to do PARSE-NAME, WORD is obliged to
> waste its time detecting it must do PARSE-NAME not PARSE.
> Each time you use WORD to do PARSE-NAME, the user is obliged to
> waste his time remembering whether leading delimiters must be
> skipped or not.

That sounds like a problem with your implementation. In my experience,
WORD is clean and efficient and does what it's designed to do well.
PARSE is available for those situations in which you wish *not* to skip
leading delimiters. And knowing which you want is surely a conscious
decision based on your knowledge of what you're trying to do.

> All this was cute in a time that assembly was compact compared to
> high level code, and one had to cram functionality in one definition.
> Not any more.
> You remember ENCLOSE ?

No, actually, I've never seen ENCLOSE. And traditionally, ITC Forth was
more compact than assembler, not less.

Cheers,
Elizabeth

J Thomas

Mar 17, 2010, 4:11:29 PM

Elizabeth D Rather wrote:
> Albert van der Horst wrote:

>> The problem with WORD is that it is two functionalities in one,
>> depending on control data that is disguised as data. The modern insight
>> is that this is bad design and bad factoring.

> That sounds like a problem with your implementation. In my experience,

> WORD is clean and efficient and does what it's designed to do well.

It's fine for what it's designed to do.

> PARSE is available for those situations in which you wish *not* to skip
> leading delimiters. And knowing which you want is surely a conscious
> decision based on your knowledge of what you're trying to do.

WORD does do two things.

: WORD
HERE PARSE-WORD STRING, ;

It does PARSE-WORD and it also creates a counted string at HERE .

When that's what you want to do, no problem at all. But when all you want
is PARSE-WORD then WORD does a little extra. Not a big efficiency hit,
just copying a counted string. Usually you can use the string at HERE
just as well as you can use the string in the input buffer. Usually
putting the string at HERE won't cause any trouble, you won't have
anything at HERE for it to overwrite and won't call anything that
overwrites it. And if you're looking for the address that the word
starts, you can always get it by taking the count and subtracting that
from the current source.

SOURCE DROP >IN @ + HERE C@ -

Maybe a little inefficient but no big deal.

Still, when I want PARSE-WORD then PARSE-WORD does its job better than
WORD COUNT . And it does no harm to have PARSE-WORD and STRING, both of
which are already factors of WORD and both of which are useful
themselves. It costs two extra headers. If you don't mind having WORD in
high level code then it costs hardly any code space.

Ed

Mar 17, 2010, 7:59:00 PM

J Thomas wrote:
> ...

> WORD does do two things.
>
> : WORD
> HERE PARSE-WORD STRING, ;
>
> It does PARSE-WORD and it also creates a counted string at HERE .

> ...

No.

A.6.2.2008 's version of PARSE-WORD (equivalent to PARSE-NAME )
has no delimiter character on the stack and therefore is *not* a factor of
WORD.

Bruce McFarling

Mar 17, 2010, 11:37:03 PM

It is not the factor as shown above. I believe that Albert van der Horst
is referring to using PARSE-NAME as the factor for the function of
WORD selected by passing it "BL" ...
... DUP BL = IF PARSE-NAME ELSE ...

Coos Haak

Mar 18, 2010, 6:49:41 AM

Op Thu, 18 Mar 2010 10:59:00 +1100 schreef Ed:

: WORD ( char "<chars>ccc<char>" -- c-addr )
DUP BL = IF DROP PARSE-NAME
ELSE >R SOURCE OVER SWAP >IN @ /STRING
R@ SKIP
DROP SWAP - >IN ! R> PARSE
THEN
HERE PLACE HERE BL OVER COUNT + C!
;

So, PARSE-NAME is a factor of WORD

Anton Ertl

Mar 18, 2010, 9:20:50 AM

There is another widely implemented PARSE-WORD that takes a delimiter.
That was the reason why PARSE-NAME was standardized with that name and
not with the name PARSE-WORD.

Anton Ertl

Mar 18, 2010, 9:37:18 AM

Elizabeth D Rather <era...@forth.com> writes:
>In my experience,
>WORD is clean and efficient and does what it's designed to do well.

In my experience it is clean neither in its interface nor in its
implementation, for the same reason:

One has to special-case BL WORD, because that may (must with the FILE
wordset) also treat other white space as delimiters.

As for what it's designed to do: What is that? Do you ever use it
with a different argument than BL? If so, for what?

To me the only use where it has at least a little bit of justification
is BL WORD, and that's better subsumed by PARSE-NAME. Passing a
delimiter to WORD is an overgeneralization, and it's a useless
overgeneralization.

Concerning efficiency, it is certainly less efficient than PARSE-NAME,
because it has to deal with its parameter and special-case it, and it
has to copy the resulting string to its buffer.

>PARSE is available for those situations in which you wish *not* to skip
>leading delimiters.

I.e., every time the delimiter is not BL.

So yes, once we have a replacement for FIND (which is totally broken,
unlike WORD, which just has a bad interface), we can phase out WORD.

There may be programs using it, that's why we take a long time to
phase out existing features. I think in this case it's quite easy:
Replace BL WORD COUNT with PARSE-NAME and BL WORD FIND with PARSE-NAME
<find-replacement>; that should take care of most uses of WORD.

As for system usage: Whether WORD is in the standard or not, systems
can use it internally, so that's no reason to keep it in the standard.

Concerning mainstream Forths using it internally, I have just checked
Gforth. Apart from tests, it uses WORD only in [DEFINED], and that's
now replaced with a shorter implementation using PARSE-NAME. But
maybe Gforth is an exotic Forth system.

It may be interesting how I found the use of WORD. That's not as easy
as using grep, because "word" occurs in many comments, and there are
also words like WORDLIST. So what did I do? I just deleted the
definition of WORD and looked what broke; only [DEFINED] broke, and
after fixing that, the tests.
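
The replacement of BL WORD COUNT with PARSE-NAME suggested above can be sketched as follows. The word names are hypothetical and chosen only for illustration, and the sketch assumes a system that provides PARSE-NAME. Both words return an address/length pair for the next blank-delimited name; the difference is that WORD copies the name into its transient buffer first, while PARSE-NAME returns a string inside the input buffer.

```forth
\ Hypothetical names, for illustration only.
: NAME-VIA-WORD       ( "name" -- c-addr u )  BL WORD COUNT ;  \ copy in WORD's transient buffer
: NAME-VIA-PARSE-NAME ( "name" -- c-addr u )  PARSE-NAME ;     \ string within the input buffer

\ Each string is typed straight away, before any further parsing
\ can overwrite WORD's transient buffer.
: SHOW-BOTH ( "name1" "name2" -- )
  NAME-VIA-WORD TYPE SPACE  NAME-VIA-PARSE-NAME TYPE ;

SHOW-BOTH hello hello   \ prints hello hello
```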

Bruce McFarling

Mar 18, 2010, 11:04:57 AM

On Mar 18, 9:20 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

> There is another widely implemented PARSE-WORD that takes a delimiter.
> That was the reason why PARSE-NAME was standardized with that name and
> not with the name PARSE-WORD.

If PARSE-WORD was exact match - that is, no special white-space
treatment - and with STR, taking wouldn't Coos Haak's WORD (which
assumes the WORD buffer is at HERE) be something like:

: WORD ( char "<chars>ccc<char>" -- c-addr )
DUP BL = IF
DROP PARSE-NAME

ELSE PARSE-WORD

Albert van der Horst

Mar 18, 2010, 5:22:04 PM

In article <bd177f76-9c44-475d...@z3g2000yqz.googlegroups.com>,
Bruce McFarling <agi...@netscape.net> wrote:

>On Mar 17, 7:59 pm, "Ed" <nos...@invalid.com> wrote:
>> J Thomas wrote:
>> > ...
>> > WORD does do two things.
>>
>> > : WORD
>> >    HERE PARSE-WORD STRING, ;
>>
>> > It does PARSE-WORD and it also creates a counted string at HERE .
>> > ...
>>
>> No.
>>
>> A.6.2.2008 's version of PARSE-WORD (equivalent to PARSE-NAME )
>> has no delimiter character on the stack and therefore is *not* a factor of
>> WORD.
>
>It is not the factor as shown above. I believe that Albert van der Horst
>is referring to using PARSE-NAME as the factor for the function of
>WORD selected by passing it "BL" ...
> ... DUP BL = IF PARSE-NAME ELSE ...

I think I must apologize (especially to Elizabeth) because of
my misconception that the behaviour with respect to leading
delimiters is different for blanks and non-blanks.

The following still stands:
Note that NAME skips till anything considered blank (virgin 1], space,
tab, cr, lf, end of input), where in the other case it
compares to an exact delimiter.

Sure thing is that NAME and PARSE can be used as factors in
WORD, e.g. in ciforth I have the following definitions

\ _ leaves a don't care.
: NAME ( aka PARSE-NAME)
\ Skip till not blank, or endofinput, leave pos
_ BEGIN DROP IN[] ?BLANK OVER SRC CELL+ @ - AND 0= UNTIL
\ Skip till blank ( endofinput is also blank), leave pos
_ BEGIN DROP IN[] ?BLANK UNTIL
OVER -
;

: WORD
DUP BL = IF
DROP NAME
ELSE
>R
\ Skip leading delimiters.
BEGIN IN[] R@ = WHILE DROP REPEAT DROP
-1 >IN +!
R> PARSE
THEN
HERE 22 BLANK HERE $!-BD HERE
;

It seems that the two points of skipping and use of a transient region
separately and independently make it impossible to use WORD as a
factor for S" .

If that is true, in a system with both S" and WORD a good designer
will come up with PARSE as a factor.

Groetjes Albert

1] virgin, i.e. 0FF or blank flash.

J Thomas

Mar 18, 2010, 5:12:49 PM

OK, too bad. They sort of share a factor. As I understand it, the point
of PARSE-NAME is that it skips all white-space characters, not just BL.
It does it the way the text interpreter does it, and you can't reliably
do just BL and expect to catch other white-space. What a mess.

I don't see any easy way to factor those so the obvious similarity can be
used.

In a vaguely similar situation, CASE OF , OF normally takes a single
character and branches on comparison. OF is something like OVER = IF
DROP . But you can do whatever comparisons you want and generate a flag
and then compare that flag to a true or a false flag, whichever you
prefer. Something that's defined as a single compare can be extended
easily. But it would be hard to do that with WORD or the factor of
PARSE-NAME . You'd probably have to pass the command an xt to do the
comparison. Complicated and hard to read.

The Beez'

Mar 18, 2010, 6:46:56 PM

On 17 mrt, 20:09, Elizabeth D Rather <erat...@forth.com> wrote:
> That sounds like a problem with your implementation. In my experience,
> WORD is clean and efficient and does what it's designed to do well.
> PARSE is available for those situations in which you wish *not* to skip
> leading delimiters. And knowing which you want is surely a conscious
> decision based on your knowledge of what you're trying to do.

There are three major problems with WORD:
(a) It's two functionalities in one. Chuck never liked IF's, neither
do I. It's confusing behavior.
(b) It takes up memory and valuable time to copy the string to a
buffer.
(c) It doesn't follow the ANS paradigm addr/count as one of the two
not-obsolete exceptions.

If I want a WS parse, I use PARSE-NAME;
If I want a clean parse, I use PARSE;
If I want to skip leading delimiters I use PARSE-WORD (much like WORD,
but leaves an addr/count string).

My implementation also has OMIT which skips leading delimiters so,
yes, I could even do without PARSE-WORD
: PARSE-WORD DUP OMIT PARSE ;
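
OMIT is specific to that implementation. As a rough sketch of the same factoring using only standard words, the delimiter-skipping step could be written against SOURCE and >IN. SKIP-DELIMITERS and PARSE-DELIMITED are hypothetical names, and the sketch assumes /STRING from the optional STRING word set is available.

```forth
\ Sketch only: SKIP-DELIMITERS and PARSE-DELIMITED are hypothetical names.
\ Needs /STRING from the optional STRING word set.

: SKIP-DELIMITERS ( char -- )
  >R
  BEGIN
    SOURCE >IN @ /STRING                      \ unparsed rest of the input: c-addr u
    DUP IF DROP C@ R@ = ELSE 2DROP 0 THEN     \ true if the next char is the delimiter
  WHILE
    1 >IN +!                                  \ step over one leading delimiter
  REPEAT
  R> DROP ;

\ Like WORD, but leaves an address/length pair in the input buffer.
: PARSE-DELIMITED ( char "<chars>ccc<char>" -- c-addr u )
  DUP SKIP-DELIMITERS PARSE ;

: DEMO-PARSE-DELIMITED ( "<commas>ccc<comma>" -- )
  [CHAR] , PARSE-DELIMITED TYPE ;

DEMO-PARSE-DELIMITED ,,,abc,     \ prints abc
```

As with the one-line definition above, this skips only exact matches of the delimiter, so it gives no special treatment to other white space.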

Ed

Mar 18, 2010, 8:02:25 PM

Coos Haak wrote:
> Op Thu, 18 Mar 2010 10:59:00 +1100 schreef Ed:
>
> > J Thomas wrote:
> >> ...
> >> WORD does do two things.
> >>
> >>: WORD
> >> HERE PARSE-WORD STRING, ;
> >>
> >> It does PARSE-WORD and it also creates a counted string at HERE .
> >> ...
> >
> > No.
> >
> > A.6.2.2008 's version of PARSE-WORD (equivalent to PARSE-NAME )
> > has no delimiter character on the stack and therefore is *not* a factor of
> > WORD.
>
> : WORD ( char "<chars>ccc<char>" -- c-addr )
> DUP BL = IF DROP PARSE-NAME
> ELSE >R SOURCE OVER SWAP >IN @ /STRING
> R@ SKIP
> DROP SWAP - >IN ! R> PARSE
> THEN
> HERE PLACE HERE BL OVER COUNT + C!
> ;
>
> So, PARSE-NAME is a factor of WORD

That's not a factor - it's a workaround to overcome the fact PARSE-NAME
*isn't* a factor. It's the sort of hack one resorts to when one has made a
wrong design choice earlier on.

Let's be clear. The premise behind the ANS' PARSE-WORD suggestion
was that WORD *could* be eliminated:

"... If both PARSE and PARSE-WORD are present, the need for
WORD is largely eliminated."

It turns out that one can't for a host of reasons. Add to that WORD is
still very effective and handles 99% or more of forth token parsing needs.

I can understand what some are saying i.e. it might be handy if the
parsing component of WORD was factored out. Yes it might. But
let's not overstate the need for it, or implement it as a non-factor.

Coos Haak

Mar 18, 2010, 8:30:22 PM

Op Fri, 19 Mar 2010 11:02:25 +1100 schreef Ed:

> Coos Haak wrote:
<snip>

>>: WORD ( char "<chars>ccc<char>" -- c-addr )
>> DUP BL = IF DROP PARSE-NAME
>> ELSE >R SOURCE OVER SWAP >IN @ /STRING
>> R@ SKIP
>> DROP SWAP - >IN ! R> PARSE
>> THEN
>> HERE PLACE HERE BL OVER COUNT + C!
>> ;
>>
>> So, PARSE-NAME is a factor of WORD
>
> That's not a factor - it's a workaround to overcome the fact PARSE-NAME
> *isn't* a factor. It's the sort of hack one resorts to when one has made a
> wrong design choice earlier on.
>

I still maintain that PARSE-NAME is a factor. WORD is in _my_
implementation a loadable extension, IOW a workaround. PARSE-NAME is
in the kernel.

Ed

Mar 18, 2010, 9:08:02 PM

The Beez' wrote:
> ...

> There are three major problems with WORD:
> (a) It's two functionalities in one. Chuck never liked IF's, neither
> do I. It's confusing behavior.
> (b) It takes up memory and valuable time to copy the string to a
> buffer.

It's hard to believe Chuck would design a word with so many problems!

Yes, WORD has two functionalities in one. Presumably WORD
wasn't factored because there was no pressing need to do so.
There still isn't.

WORD was efficient. It placed the result at HERE (or thereabouts)
because that's where the header would be built. It consumed no extra
space, sharing it with other transient areas such as numeric pictured
output. And one could write to WORD's buffer if need be.

If WORD was deemed efficient enough to use in the 60's and 70's
when minis and micros were slow, how could it possibly be an issue
today.

> (c) It doesn't follow the ANS paradigm as one of the two
> not-obsolete exceptions.

Precisely. The ANS paradigm doesn't state that existing non addr/count
words must be replaced. That would be revolution for its own sake
without any regard for the consequences.

WORD and FIND have a strong historical context and it would have
been unwise to overturn them based on an ideological preference.
Besides which, they still worked and continue to.

Elizabeth D Rather

Mar 19, 2010, 3:35:23 AM

Thank you! Excellent response. WORD has been the center of Forth's
text interpretation since the early 70's. It was perfectly tuned to its
use. Not only did it move its string to the space where it might very
likely end up residing as a header (a space that is reusable otherwise),
that space is also (when the dictionary and data space are integrated)
where S" strings, ," strings, and most other parsed strings will go.
The *only* significant issue with it has been that the "skip leading
delimiters" behavior requires that words like ( and S" be followed by
*two* spaces, which has been at most a minor irritant.

It is good that we now have PARSE for the situations in which you don't
want to skip leading delimiters. But we still use WORD heavily, and
don't plan to change.

J Thomas

Mar 19, 2010, 9:27:01 AM

Ed wrote:

> Coos Haak wrote:
>> schreef Ed:
>> > J Thomas wrote:

>> >> WORD does do two things.
>> >>
>> >>: WORD
>> >> HERE PARSE-WORD STRING, ;

>> > No.

>> >
>> > A.6.2.2008 's version of PARSE-WORD (equivalent to PARSE-NAME )
>> > has no delimiter character on the stack and therefore is *not* a
>> > factor of WORD.

>> : WORD ( char "<chars>ccc<char>" -- c-addr )
>> DUP BL = IF DROP PARSE-NAME
>> ELSE >R SOURCE OVER SWAP >IN @ /STRING
>> R@ SKIP
>> DROP SWAP - >IN ! R> PARSE
>> THEN
>> HERE PLACE HERE BL OVER COUNT + C!
>> ;
>>
>> So, PARSE-NAME is a factor of WORD

> That's not a factor - it's a workaround to overcome the fact PARSE-NAME
> *isn't* a factor. It's the sort of hack one resorts to when one has
> made a wrong design choice earlier on.

Yes!

I agree with you that WORD was usually adequate, and we can get by with
no additional changes. I want to explore where the wrong design choice
came from, noting that it might not be worth fixing.

*Factor 1:*

The first thing about WORD is that it moves things to HERE whether you
want it to or not. The factor STRING, is useful though it doesn't belong
in the core wordset because the core is too crowded already. If you can
do the parsing separately then you can do WORD by first parsing and then
doing STRING, .

This looks potentially useful to me, but not tremendously important. It's
an obvious small improvement to be able to parse a string and then put it
anywhere, rather than put it at HERE and then have to MOVE it if you want
it elsewhere. Obviously better but not a whole lot better.

If we have a word like WORD that doesn't move the string but supplies the
address and count, we want it to do two different things.

1. We want it to accept a character, parse the input buffer and discard
that character and return the first string that does not contain that
character.

2. We want it to parse the input buffer and discard whitespace, and
return the first string that does not contain whitespace -- where
whitespace is "implementation dependent" but will usually include
anything except 32-126 although on UTF? it may be more complicated.

The second desire came with ANS Forth. It is a complication. Standard
programs can be written using only graphic characters 32-126 (and EOL
which needn't be exposed to the parser), and then there is no issue about
whitespace.

WORD PARSE etc accept one character as a delimiter. We could set up a way
to accept multiple delimiters and then we'd be on the way toward regular
expressions. No thank you. We just need the whitespace special case so we
can have tabs and other whitespace in source code.

Mixing this complication in with WORD etc is the design flaw. It does not
fit. You can make BL be a special case for WORD etc with the rare problem
that you then can't use WORD etc to parse only for BL .

WORD and PARSE-NAME are simple and easy when BL is the only whitespace.
They just don't generalise well to multiple whitespaces.

The obvious solution is to make words like ACCEPT-CODE and READ-CODE-LINE
which convert whitespace to blanks when the buffer is first filled. Then
for source code you don't have to worry about whitespace and you only
have to check it once. Would you need CODE-S" too? [sigh]

There's more factoring but I'll stop here for now.

J Thomas

Mar 19, 2010, 10:21:17 AM

One obvious factor for WORD is to leave the string as-is, ready to be
moved anywhere with STRING,

Whitespace is an ugly complication. Putting aside the whitespace mess,
there are other factors.

*Factor #2*

WORD first skips leading delimiters, and then searches for the next
delimiter. Two functions. They could be separated for another small
improvement. I believe traditionally the name of the first function was
SKIP and it was never standardised.

Apart from whitespace, you could implement WORD with something like

: WORD
dup skip parse here -rot string, ;

following The Beez's idea.

> Let's be clear. The premise behind the ANS' PARSE-WORD suggestion was
> that WORD *could* be eliminated:
>
> "... If both PARSE and PARSE-WORD are present, the need for WORD is
> largely eliminated."
>
> It turns out that one can't for a host of reasons. Add to that WORD

> is still very effective and handles 99% or more of forth token parsing
> needs.

The central problem is whitespace. Apart from that, we get several small
improvements if WORD's factors are available. And in that case WORD can
be eliminated easily provided we have a substitute for FIND .

Counted strings are a useful way to store string data of length < 256. I
believe it makes sense to factor the count out of words like FIND that do
not specifically need it. If the word itself immediately does COUNT then
when I want to use it and don't have a counted string, it's a bother to
create a counted string to give to it. If you usually start with counted
strings then putting COUNT into FIND is more compact. But I usually start
with the input buffer....

*Factor #3*

WORD works only on the input buffer. A version that worked on any string
would be more powerful.

This is possible but baroque from the String wordset, using SEARCH .
Again ignoring whitespace it isn't hard without that. Something like:

: STRING-SKIP ( addr len delimiter -- addr' len' )
>R BEGIN
DUP WHILE
OVER C@ R@ = WHILE
1 /STRING
REPEAT
THEN RDROP ;

: STRING-PARSE ( addr len delimiter -- addr' len' )
>R 2DUP BEGIN
DUP WHILE
OVER C@ R@ - WHILE
1 /STRING
REPEAT
THEN RDROP NIP - ;

Would those be better DO LOOP ? Part of it is almost identical but it
isn't easy to factor.
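
A usage sketch for the two words may make the intended stack effects concrete. The definitions are restated here with R> DROP in place of RDROP, which is common but not standard, so that the block loads on a system with only the CORE, CORE EXT and STRING words used. DEMO-STRING-WORDS is a hypothetical name, and the sketch assumes S" is used inside a definition.

```forth
\ Sketch: the two words above, with the non-standard RDROP spelled as R> DROP.

: STRING-SKIP ( addr len delimiter -- addr' len' )
  >R BEGIN
    DUP WHILE                 \ stop when the string is exhausted
    OVER C@ R@ = WHILE        \ stop at the first non-delimiter
    1 /STRING
  REPEAT
  THEN R> DROP ;

: STRING-PARSE ( addr len delimiter -- addr len' )
  >R 2DUP BEGIN
    DUP WHILE                 \ stop when the string is exhausted
    OVER C@ R@ - WHILE        \ stop at the first delimiter
    1 /STRING
  REPEAT
  THEN R> DROP NIP - ;        \ original length minus unscanned length

: DEMO-STRING-WORDS ( -- )
  S" ,,abc,def" [CHAR] , STRING-SKIP    \ leaves abc,def
  [CHAR] , STRING-PARSE TYPE ;          \ leaves the token before the next comma

DEMO-STRING-WORDS   \ prints abc
```

Note that STRING-PARSE returns the starting address unchanged and only shortens the length, so the caller has to advance past the token and its delimiter separately.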

> I can understand what some are saying i.e. it might be handy if the
> parsing component of WORD was factored out. Yes it might. But let's
> not overstate the need for it, or implement it as a non-factor.

It got implemented to use whitespace because that's the most common use.
Given a bad early design choice, people adapt as best they can.

*My conclusions: *

If there was no cost for extra names, these factors would all be worth
having. Since there is a cost, they're only worth having if they're
useful enough to justify the names. In the past they were not.

It's probably better to convert whitespace to blanks in the buffers
before doing WORD etc. But the standard mandates that the interpreter can
handle whitespace so it's probably too late for that solution.

I've learned something by looking at it even if there's no immediate
result.

Andrew Haley

Mar 19, 2010, 10:24:45 AM

Elizabeth D Rather <era...@forth.com> wrote:
> Albert van der Horst wrote:
> ...

> That sounds like a problem with your implementation. In my experience,
> WORD is clean and efficient and does what it's designed to do well.
> PARSE is available for those situations in which you wish *not* to skip
> leading delimiters. And knowing which you want is surely a conscious
> decision based on your knowledge of what you're trying to do.
>
>> All this was cute in a time that assembly was compact compared to
>> high level code, and one had to cram functionality in one definition.
>> Not any more.
>> You remember ENCLOSE ?
>
> No, actually, I've never seen ENCLOSE.

It was pretty evil, a fig-FORTH special. Also, the 6502 reference
implementation of ENCLOSE has a horrible bug that took me a long while
to figure out: if a block is more than 256 bytes long there can be a
missing carry and ENCLOSE crashes. <shudder>

Andrew.

Albert van der Horst

Mar 19, 2010, 12:03:24 PM

In article <hnuil8$4tp$1...@news-01.bur.connect.com.au>,

Ed <nos...@invalid.com> wrote:
>The Beez' wrote:
>> ...
>> There are three major problems with WORD:
>> (a) It's two functionalities in one. Chuck never liked IF's, neither
>> do I. It's confusing behavior.
>> (b) It takes up memory and valuable time to copy the string to a
>> buffer.
>
>It's hard to believe Chuck would design a word with so many problems!

I checked out "Footsteps in an empty valley".
Apparently:
Chuck couldn't care less if leading delimiters are trimmed for
non-blanks.
Chuck worked with blocks. There is just one character to be considered
blank, 32. No tabs, no newlines.

So the ANSI standard is where we lost our innocence ;-)

Groetjes Albert

Elizabeth D Rather

Mar 19, 2010, 2:40:58 PM

Albert van der Horst wrote:

> In article <hnuil8$4tp$1...@news-01.bur.connect.com.au>,
> Ed <nos...@invalid.com> wrote:
>> The Beez' wrote:
>>> ...
>>> There are three major problems with WORD:
>>> (a) It's two functionalities in one. Chuck never liked IF's, neither
>>> do I. It's confusing behavior.
>>> (b) It takes up memory and valuable time to copy the string to a
>>> buffer.
>> It's hard to believe Chuck would design a word with so many problems!
>
> I checked out "Footsteps in an empty valley".
> Apparently:
> Chuck couldn't care less if leading delimiters are trimmed for
> non-blanks.
> Chuck worked with blocks. There is just one character to be considered
> blank, 32. No tabs, no newlines.
>
> So the ANSI standard is where we lost our innocence ;-)
>
> Groetjes Albert

The requirement to accept whitespace characters was necessary in order
to use text files managed by external editors for program source. They
still aren't required for blocks. The use of text files and popular
programmer's editors is a practical step, and it was a good trade-off.

Bruce McFarling

Mar 19, 2010, 2:43:31 PM

On Mar 19, 10:21 am, J Thomas <jethom...@gmail.com> wrote:
> If there was no cost for extra names, these factors would all be worth
> having. Since there is a cost, they're only worth having if they're
> useful enough to justify the names. In the past they were not.

SKIP and SCAN are useful enough, they can't be standardized because of
the various handling of BL SKIP and BL SCAN. Some SKIPs are exact
match and some SKIPs take BL for all white space.

> It's probably better to convert whitespace to blanks in the buffers
> before doing WORD etc.

WS-SKIP ( ca1 u1 -- ca2 u2 ) and exact match SKIP ( ca1 u1 c -- ca2 u2 )
WS-SCAN ( ca1 u1 -- ca2 u2 ) and exact match SCAN ( ca1 u1 c -- ca2 u2 )

SSPLIT ( ca1 u1 ca2 u2 -- ca1 u3 ca2 u2 ) where ca2 u2 is a suffix of
ca1 u1
>INPUT ( ca u -- ) ... must be inside input buffer
INPUT> ( -- ca u )

... as primitives together with REFILL give a lot of flexibility.

: PARSE ( c -- ca u ) >R INPUT> 2DUP R> SCAN SSPLIT >INPUT ;

With >WORD$ defined for the ( ca1 u -- ca2 ) word buffer filling
operation, then something like

: WORD ( c -- ca ) >R INPUT> R@ BL = IF
    R> DROP WS-SKIP 2DUP WS-SCAN
  ELSE
    R@ SKIP 2DUP R> SCAN
  THEN SSPLIT >INPUT >WORD$ ;

etc.
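
To make the stack juggling concrete, here is a small model of these primitives, offered as a sketch only. A 2VARIABLE stands in for the unparsed rest of the input buffer, so INPUT> and >INPUT below are toy versions rather than anything a real system provides. SRC-REST, CSCAN (an exact-match scan) and SPARSE are names invented for the example. It assumes /STRING, 2VARIABLE and the CORE EXT words 2>R 2R> <> are available.

```forth
2VARIABLE SRC-REST                  \ hypothetical: unparsed part of the input
: INPUT> ( -- ca u )  SRC-REST 2@ ;
: >INPUT ( ca u -- )  SRC-REST 2! ;

\ Exact-match scan: stop at the first character equal to c.
: CSCAN ( ca1 u1 c -- ca2 u2 )
  >R
  BEGIN DUP WHILE
    OVER C@ R@ <> WHILE
    1 /STRING
  REPEAT THEN
  R> DROP ;

\ ca2 u2 must be a suffix of ca1 u1; u3 is the length of the part before it.
: SSPLIT ( ca1 u1 ca2 u2 -- ca1 u3 ca2 u2 )
  2>R  R@ -  2R> ;

\ PARSE-like word on the model input; it also steps over the delimiter it found.
: SPARSE ( c -- ca u )
  >R  INPUT> 2DUP R> CSCAN  SSPLIT
  DUP IF 1 /STRING THEN  >INPUT ;
```

With "foo,bar" as the model input, a comma passed to SPARSE should return "foo" and leave "bar" unparsed. Stepping over the delimiter is a choice made in this sketch so that repeated calls make progress.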

J Thomas

Mar 19, 2010, 4:53:06 PM

Bruce McFarling wrote:
> J Thomas <jethom...@gmail.com> wrote:

>> If there was no cost for extra names, these factors would all be worth
>> having. Since there is a cost, they're only worth having if they're
>> useful enough to justify the names. In the past they were not.
>
> SKIP and SCAN are useful enough, they can't be standardized because of
> the various handling of BL SKIP and BL SCAN. Some SKIPs are exact match
> and some SKIPs take BL for all white space.

So they could be standard words with new names!

>> It's probably better to convert whitespace to blanks in the buffers
>> before doing WORD etc.
>
> WS-SKIP ( ca1 u1 -- ca2 u2 ) and exact match SKIP ( ca1 u1 c -- ca2 u2 )
> WS-SCAN ( ca1 u1 -- ca2 u2 ) and exact match SCAN ( ca1 u1 c -- ca2 u2 )
>
> SSPLIT ( ca1 u1 ca2 u2 -- ca1 u3 ca2 u2 ) where ca2 u2 is a suffix of
> ca1 u1
> >INPUT ( ca u -- ) ... must be inside input buffer
> INPUT> ( -- ca u )

I think you can get by with less if you convert whitespace to blanks in
the buffers first. Then the whole issue goes away. It looks to me like
fewer words. Of course, if you find yourself wanting to parse read-only
buffers there's a problem. You'd have to copy to a buffer that isn't
read-only and continue, if you found nonblank whitespace. But converting
whitespace to spaces is pretty straightforward. And then the rest is
straightforward too. Don't we already have Forths that for example
convert all source code to uppercase before FINDing it? Not so much
different.
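
A minimal sketch of that conversion, assuming "whitespace" just means the control characters below BL and that the buffer is writable. BLANK-CONTROLS is a hypothetical name, not a word from any standard or system.

```forth
\ Overwrite every control character (code < BL) in the buffer with a blank.
: BLANK-CONTROLS ( c-addr u -- )
  0 ?DO
    DUP I CHARS +                    \ address of the I-th character
    DUP C@ BL < IF BL SWAP C! ELSE DROP THEN
  LOOP DROP ;
```

A system might run something like this once over each line right after the buffer is filled, so that the parsing words only ever see BL. Characters such as DEL (127) are left alone here; whether they should count is one of the open questions.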

> ... as primitives together with REFILL give a lot of flexibility.
>
> : PARSE ( c -- ca u ) >R INPUT> 2DUP R> SCAN SSPLIT >INPUT ;

Agreed, that looks workable.

Bruce McFarling

Mar 19, 2010, 6:24:38 PM

On Mar 19, 4:53 pm, J Thomas <jethom...@gmail.com> wrote:
> >> It's probably better to convert whitespace to blanks in the buffers
> >> before doing WORD etc.

For my purposes, they are general purpose tools. Using them for
supporting Forth-94 standard text input words is another application.

I'd not want to destroy information that I might want to get back, so
I'd not be in favor of the convert-to-blanks approach, even if it
suffices for just the application of the standard text interpreter
words. For one thing, I still sometimes use the VDE convention of
<space><return> as a soft-return marker for word-wrappable paragraphs.

The Beez'

Mar 20, 2010, 12:47:52 PM

On 19 mrt, 19:40, Elizabeth D Rather <erat...@forth.com> wrote:
> The requirement to accept whitespace characters was necessary in order
> to use text files managed by external editors for program source. They
> still aren't required for blocks. The use of text files and popular
> programmer's editors is a practical step, and it was a good trade-off.

So, if I retell the story in my own words:
- The WORD that Chuck made was much like PARSE-WORD nowadays, that is:
nice and clean;
- It was logical for two reasons, first it worked with blocks - which
contained only blanks - and second, counted strings were the norm back
then;
- Then the sequential files came in (remember that extension .SEQ?)
which could have several different white space characters;
- In order to "salvage" old code and be compatible with the new world,
they crammed an ugly patch in WORD which said "DUP BL = IF DROP
PARSE-NAME ELSE PARSE-WORD THEN", problem solved;
- Then ANS-Forth came in, introducing "PARSE" and addr/count strings.
In order to save WORD's head three things were done. First FIND was
maintained, second "PARSE-WORD" and "PARSE-NAME" were conveniently
kept out of the standard (but hinted at in the Rationale) and third
WORD was not declared obsolete, although it was just as much an ugly
and tainted relic as "EXPECT" and "QUERY";
- And now, finally, under the everlasting argument "I don't want to
break 100,000 ancient and obsolete programs", WORD becomes the
paragon of well-designed code.

Sorry, I don't buy it. For two reasons:
(a) Probably 90% can be fixed by replacing "BL WORD COUNT" with
"PARSE-NAME" and a subsequent "WORD COUNT" with "PARSE-WORD";
(b) We FINALLY have a chance to free ourselves from 255-length counted
strings - as the ANS-Forth standard so elegantly designed.

Add the adoption of PLACE and +PLACE and we won't be the laughing
stock of programming languages anymore, because we've hidden all the
ugly carnal details behind an elegant abstraction. There are few
situations where carnal knowledge of strings is required.
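
For readers who haven't met them: PLACE and +PLACE are common-practice words rather than ANS ones, and their definitions vary between systems. The versions below are one plausible sketch, with no check that the result still fits in a 255-character counted string and no support for overlapping source and destination.

```forth
\ Store the string c-addr1 u as a counted string at c-addr2.
: PLACE ( c-addr1 u c-addr2 -- )
  2DUP C!  CHAR+ SWAP MOVE ;

\ Append the string c-addr1 u to the counted string at c-addr2.
: +PLACE ( c-addr1 u c-addr2 -- )
  DUP >R  COUNT CHARS +            \ first free character after the old text
  SWAP DUP >R  MOVE                \ copy the new text there
  R> R>  DUP C@ ROT +  SWAP C! ;   \ add u to the count byte
```

With a buffer BUF of sufficient size, placing "foo" into BUF and then appending "bar" with +PLACE should make BUF COUNT TYPE print foobar. The caller never touches the count byte directly, which is the abstraction being argued for.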

Sorry Elizabeth, it may seem it is directed to you, but that is not
the case at all. I have too much respect for you for that. But darn,
let's do this! Let's make this final step and make string support
consistent again!

Marcel Hendrix

Mar 20, 2010, 12:21:20 PM

"The Beez'" <han...@bigfoot.com> writes Re: RFD: Legacy Wordset
[..]

> (a) Probably 90% can be fixed by replacing "BL WORD COUNT" with
> "PARSE-NAME" and a subsequent "WORD COUNT" with "PARSE-WORD";
> (b) We FINALLY have a chance to free ourselves from 255-length counted
> strings - as the ANS-Forth standard so elegantly designed.

[..]

Apparently, OSX has (some) function names with more than 256 characters.
As a consequence, iForth supports REALLY LONG word names since vsn 4, and
WORD is becoming less and less useful.

The 256 character world is crumbling down, and I don't mind weeding
out inappropriate WORDs and FINDs from my own source code. The first
step would be an agreed upon alternative for FIND .

-marcel

Bruce McFarling

Mar 20, 2010, 1:57:02 PM

On Mar 20, 12:21 pm, m...@iae.nl (Marcel Hendrix) wrote:
> The 256 character world is crumbling down, and I don't mind weeding
> out inappropriate WORDs and FINDs from my own source code. The first
> step would be an agreed upon alternative for FIND.

Yes, the 256 character limit is an "artificial" limit, since for small
systems the *real* limit is 31 (or possibly 15) characters for a name,
while even small systems can now add quite a large filespace in the
form of an SD card if they can get an SPI running at somewhere between
100kHz and 400kHz.

There is already an agreed upon alternative for FIND in a single
wordlist ... though of course perhaps those who care most about FIND
care less about SEARCH-WORDLIST.

The replacement ought to either be or support the same stack picture
without the wordlist specification.

For implementations that prefer the FIND style retained original
string on failure and wish to expose all status info from a dictionary
entry, a factor of SEARCH-NAME that works that way would be:

HUNT-NAME ( ca1 u1 -- ca2 status TRUE | ca1 u1 FALSE )

... with an immediate status reading word providing the
SEARCH-WORDLIST style result for a two-word SEARCH-NAME.

Given that ``S'' stands for a cell-counted string in S" and a
cell-counted string allows us to store an arbitrary number of tokens rather
than just one token, one might suggest char counted and cell counted
strings each have their own respective places:

PLACE ( ca1 uc ca2 -- )
+PLACE ( ca1 uc ca2 -- )
COUNT ( ca1 -- ca2 uc )
PLACES ( ca1 u ca2 -- )
+PLACES ( ca1 u ca2 -- )
COUNTS ( ca1 -- ca2 u )

Ed

Mar 20, 2010, 6:19:15 PM

J Thomas wrote:
> ...

> ...

There's a catch. ANS doesn't definitively state what is white space
- it's implementation-defined. Even if one implements white space
handling in INCLUDE-FILE , or WORD or PARSE-NAME there's
no guarantee it will work in every situation - just as READ-LINE
isn't guaranteed to work on text files from another OS.

The way I read "3.4.1.1 Delimiters" provision for white space
delimiters other than BL while available to implementers, is more
entitlement than requirement. And it's "user beware".

For these reasons (and to keep things simple) the only white space
my interpreter and parsing words support is BL. INCLUDE-FILE
provides rudimentary conversion of white space characters to blanks
for text files.

Ed

Mar 20, 2010, 8:11:05 PM

Marcel Hendrix wrote:
> "The Beez'" <han...@bigfoot.com> writes Re: RFD: Legacy Wordset
> [..]
> > (a) Probably 90% can be fixed by replacing "BL WORD COUNT" with
> > "PARSE-NAME" and a subsequent "WORD COUNT" with "PARSE-WORD";
> > (b) We FINALLY have a chance to free ourselves from 255-length counted
> > strings - as the ANS-Forth standard so elegantly designed.
> [..]
>
> Apparently, OSX has (some) function names with more than 256 characters.
> As a consequence, iForth supports REALLY LONG word names since vsn 4, and
> WORD is becoming less and less useful.

> ...

WORD is still very useful but you've encountered the odd situation
which you elected to support. Not everyone may think it was
warranted.

The question for Standards is how to support change responsibly
without disenfranchising other users by throwing out long-standing
practice.

If it was determined that WORD and FIND needed updating, I would
look to see whether there existed suitable factors that provide the
new functionality while still making WORD and FIND trivial to define.

Anton Ertl

Mar 21, 2010, 6:04:27 AM

"Ed" <nos...@invalid.com> writes:
>There's a catch. ANS doesn't definitively state what is white space
>- it's implementation-defined.

It's explicitly specified for files.

>The way I read "3.4.1.1 Delimiters" provision for white space
>delimiters other than BL while available to implementers, is more
>entitlement than requirement. And it's "user beware".

Read "11.3.6 Parsing":

|When parsing from a text file using a space delimiter, control
|characters shall be treated the same as the space character.

>For these reasons (and to keep things simple) the only white space
>my interpreter and parsing words support is BL. INCLUDE-FILE
>provides rudimentary conversion of white space characters to blanks
>for text files.

All that complexity to keep using an outdated word.

J Thomas

Mar 21, 2010, 9:02:04 AM

Ed wrote:
> J Thomas wrote:

>> WORD and PARSE-NAME are simple and easy when BL is the only whitespace.
>> They just don't generalise well to multiple whitespaces.
>>
>> The obvious solution is to make words like ACCEPT-CODE and
>> READ-CODE-LINE which convert whitespace to blanks when the buffer is
>> first filled. Then for source code you don't have to worry about
>> whitespace and you only have to check it once. Would you need CODE-S"
>> too? [sigh] ...
>
> There's a catch. ANS doesn't definitively state what is white space -
> it's implementation-defined. Even if one implements white space
> handling in INCLUDE-FILE , or WORD or PARSE-NAME there's no
> guarantee it will work in every situation - just as READ-LINE isn't
> guaranteed to work on text files from another OS.
>
> The way I read "3.4.1.1 Delimiters" provision for white space
> delimiters other than BL while available to implementers, is more
> entitlement than requirement. And it's "user beware".

That's a good point. So, does it make sense to have a hook into the word
that converts whitespace to blanks? A system which has that could let
users decide what will be whitespace for them today. Then if they run
into a file where this is a problem, they can fix it.

However, I personally have not run into a problem here. Every Forth
source file I've ever loaded has been plain ASCII text with maybe a few
tabs. Some years ago we had people who were enthusiastic for literate
programming in Forth, where the default is comments and you use a special
symbol to start interpreting, and maybe have various formatting symbols
mixed into the code. It was an interesting idea but it didn't seem to
catch on.

My guess is that Forth programmers will also not use text editors that
put in lots of special codes that a Forth compiler must interpret as
whitespace. But I could be wrong.

> For these reasons (and to keep things simple) the only white space my
> interpreter and parsing words support is BL. INCLUDE-FILE provides
> rudimentary conversion of white space characters to blanks for text
> files.

That keeps things simple.

J Thomas

Mar 21, 2010, 9:03:39 AM

Ed wrote:

> The question for Standards is how to support change responsibly without
> disenfranchising other users by throwing out long-standing practice.
>
> If it was determined that WORD and FIND needed updating, I would look to
> see whether there existed suitable factors that provide the new
> functionality while still making WORD and FIND trivial to define.

Yes. And at worst WORD and FIND would go into the Legacy wordset and
still be supported.

Peter Knaggs

Mar 21, 2010, 1:07:07 PM

Given that there is not going to be a Legacy wordset, at the instance of
this group, that could be difficult. The approach would be to move WORD
and FIND into CORE EXT. I would be opposed to that until a suitable
alternative has made it from EXT to a core wordset. The only candidate
being PARSE as it is currently CORE EXT.

The other candidate PARSE-NAME was introduced as part of the 200x review
and thus should not be considered as a core word on this revision. Any
other factor you might like to propose would have the same problem. It
should be introduced as an EXT word on this review and promoted to a core
word in the next revision.

--
Peter Knaggs

Anton Ertl

Mar 21, 2010, 1:11:48 PM

"Peter Knaggs" <p...@bcs.org.uk> writes:
>On Sun, 21 Mar 2010 13:03:39 -0000, J Thomas <jeth...@gmail.com> wrote:
>
>> Ed wrote:
>>
>>> The question for Standards is how to support change responsibly without
>>> disenfranchising other users by throwing out long-standing practice.
>>>
>>> If it was determined that WORD and FIND needed updating, I would look to
>>> see whether there existed suitable factors that provide the new
>>> functionality while still making WORD and FIND trivial to define.
>>
>> Yes. And at worst WORD and FIND would go into the Legacy wordset and
>> still be supported.
>
>Given that there is not going to be a Legacy wordset, at the instance of
>this group, that could be difficult. The approach would be to move WORD
>and FIND into CORE EXT.

And declare them as obsolescent.

>I would be opposed to that until a suitable
>alternative has made it from EXT to a core wordset.

Why?

BTW, WORD (and PARSE-NAME) can be implemented with core words, so one
less reason to keep WORD in the CORE.
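
As an illustration of that point, here is a sketch loosely modelled on the reference implementation that accompanies the PARSE-NAME proposal, adapted to use only CORE words. ADVANCE (a stand-in for /STRING), SPACE?, NOT-SPACE?, XT-SKIP and MY-PARSE-NAME are names chosen for this example. It treats every character code up to BL as a space and assumes one address unit per character when updating >IN.

```forth
\ CORE-only replacement for /STRING.
: ADVANCE ( c-addr u n -- c-addr' u' )  ROT OVER CHARS + ROT ROT - ;

: SPACE?     ( c -- flag )  BL 1+ U< ;   \ blanks and control characters
: NOT-SPACE? ( c -- flag )  SPACE? 0= ;

\ Skip leading characters for which xt ( c -- flag ) returns true.
: XT-SKIP ( c-addr u xt -- c-addr' u' )
  >R
  BEGIN DUP WHILE
    OVER C@ R@ EXECUTE WHILE
    1 ADVANCE
  REPEAT THEN
  R> DROP ;

: MY-PARSE-NAME ( "name" -- c-addr u )
  SOURCE >IN @ ADVANCE
  ['] SPACE? XT-SKIP  OVER >R          \ remember where the name starts
  ['] NOT-SPACE? XT-SKIP               ( end-addr rest-len ) ( R: start-addr )
  2DUP 1 MIN + SOURCE DROP - >IN !     \ step past one trailing delimiter
  DROP R> SWAP OVER - ;
```

Typing MY-PARSE-NAME hello TYPE at the interpreter should print hello. A WORD built the same way would additionally copy the result into a counted-string buffer.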

FIND is so badly specified that it cannot be used for much. No reason
to keep it in CORE.

> The only candidate
>being PARSE as it is currently CORE EXT.
>
>The other candidate PARSE-NAME was introduced as part of the 200x review
>and thus should not be considered as a core word on this revision.

Why not?

We don't have any rules about wordsets, and I don't think we need any.

The rule we have is that if we remove features (including words), they
first become obsolete for a cycle, before finally being removed (or
not).

Stephen Pelc

Mar 21, 2010, 2:27:41 PM

On Sat, 20 Mar 2010 18:21:20 +0200, m...@iae.nl (Marcel Hendrix) wrote:

>Apparently, OSX has (some) function names with more than 256 characters.
>As a consequence, iForth supports REALLY LONG word names since vsn 4, and
>WORD is becoming less and less useful.

Names over 2000 characters long have been observed in the wild. Ulrich
Drepper's paper on DSOs is illuminating.

>The 256 character world is crumbling down, and I don't mind weeding
>out inappropriate WORDs and FINDs from my own source code. The first
>step would be an agreed upon alternative for FIND .

In the spirit of SEARCH-WORDLIST, we have SEARCH-CONTEXT

: Search-Context \ c-addr len -- 0 | xt 1 | xt -1
\ *G Perform the *\fo{SEARCH-WORDLIST} operation on all wordlists
\ ** within the search order.
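
For systems without such a word, something similar can be approximated with the Search-Order wordset. The following is only a sketch of the idea, not MPE's source: it walks the wordlists returned by GET-ORDER and stops at the first hit. It is named ORDER-SEARCH here, a made-up name, to avoid clashing with a system's own SEARCH-CONTEXT.

```forth
: ORDER-SEARCH ( c-addr u -- 0 | xt 1 | xt -1 )
  2>R GET-ORDER                  ( widn ... wid1 n ) ( R: c-addr u )
  BEGIN DUP WHILE
    1- SWAP                      ( widn ... wid2 n-1 wid1 )
    2R@ ROT SEARCH-WORDLIST      ( ... n-1 0 | ... n-1 xt flag )
    ?DUP IF                      ( widn ... wid2 n-1 xt flag )
      2>R  0 ?DO DROP LOOP  2R>  \ discard the wordlists not yet searched
      2R> 2DROP EXIT             \ drop the saved name, return xt flag
    THEN
  REPEAT
  2R> 2DROP ;                    \ the exhausted count 0 doubles as "not found"
```

Unlike FIND, this takes an address and length, so the name is not limited to what fits in a counted string.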

We expect to remove most uses of WORD and FIND from VFX Forth over
the next couple of years. However, it will be a long time before
they can be removed from a standard.

Stephen

--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads

Peter Knaggs

Mar 21, 2010, 4:44:33 PM

On Sun, 21 Mar 2010 17:11:48 -0000, Anton Ertl
<an...@mips.complang.tuwien.ac.at> wrote:

> "Peter Knaggs" <p...@bcs.org.uk> writes:
>>
>> Given that there is not going to be a Legacy wordset, at the instance of
>> this group, that could be difficult. The approach would be to move WORD
>> and FIND into CORE EXT.
>
> And declare them as obsolescent.

Indeed.

>> I would be opposed to that until a suitable
>> alternative has made it from EXT to a core wordset.
>
> Why?

This would lead to the 200x standard not including any core words for
parsing. I can not believe the Forth community would allow this.

>> The only candidate being PARSE as it is currently CORE EXT.
>>
>> The other candidate PARSE-NAME was introduced as part of the 200x review
>> and thus should not be considered as a core word on this revision.
>
> Why not?

So far we have not introduced any new "core" words; they have all
been added to the extended wordsets. Adding new experimental words
to the core has been shown to be a rather bad idea, remember Forth-83.
Introducing them into the extended wordset before promoting them to
the core wordsets seems like a practical approach.

--
Peter Knaggs

J Thomas

Mar 21, 2010, 6:45:57 PM

I believe you have fully answered Ed's concern about supporting change
responsibly without disenfranchising users of long-standing practice.

Bruce McFarling

Mar 21, 2010, 6:48:59 PM

On Mar 21, 1:07 pm, "Peter Knaggs" <p...@bcs.org.uk> wrote:
> Given that there is not going to be a Legacy wordset, at the instance of this group, that could be difficult.

When did that happen? My first preference is that the obsolescent
words in CORE EXT be removed entirely. My second preference is that
they be quarantined in LEGACY. My last preference is that the
obsolescent words be retained in CORE EXT.

I had been entirely unaware that the decision made a little while
after the Forth-94 process completed to remove the words had been
suspended on an insistence that the definitions continue to be
reserved in the standard.

If LEGACY is the only way to get the cruft out of CORE EXT, that's
better than nothing.

As far as FORGET is concerned, it's not in CORE EXT so it's less of an
issue.

Bruce McFarling

Mar 21, 2010, 6:57:30 PM

On Mar 21, 1:07 pm, "Peter Knaggs" <p...@bcs.org.uk> wrote:

> The other candidate PARSE-NAME was introduced as part of the 200x review and thus should not be considered as a core word on this revision.

I'm not clear on what is suggested to be on offer, here. Fine to move
PARSE into CORE and PARSE-NAME into CORE EXT, then?

Peter Knaggs

Mar 21, 2010, 8:14:12 PM

On Sun, 21 Mar 2010 22:48:59 -0000, Bruce McFarling <agi...@netscape.net>
wrote:

>
> On Mar 21, 1:07 pm, "Peter Knaggs" <p...@bcs.org.uk> wrote:
>> Given that there is not going to be a Legacy wordset, at the instance
>> of this group, that could be difficult.
>
> When did that happen?

I refer you to the revised proposal <4AAAA10...@bcs.org.uk>.

Note that it is my intention to bring this proposal forward again, but only
after dealing with FORGET. I am waiting until after the Rostock meeting
before bringing that back to the table.

> My first preference is that the obsolescent words in CORE EXT be removed
> entirely. My second preference is that they be quarantined in LEGACY. My
> last preference is that the obsolescent words be retained in CORE EXT.

...

> If LEGACY is the only way to get the cruft out of CORE EXT, that's
> better than nothing.

There is another alternative, for which I refer you to the proposal.

--
Peter Knaggs

Peter Knaggs

Mar 21, 2010, 8:26:26 PM

On Sun, 21 Mar 2010 22:57:30 -0000, Bruce McFarling <agi...@netscape.net>
wrote:
>

PARSE-NAME was introduced into CORE EXT by the x:parse-name proposal
which was accepted at the Santander meeting. As this was introduced
in this revision moving it from CORE EXT to CORE would be a significant
step. As PARSE is already in the CORE EXT wordset from the original
'94 document, moving it into the CORE wordset would not be as
significant.

In either case such a proposal would obviously have to go through the
normal RfD/CfV process before being considered by the committee.

--
Peter Knaggs

Bruce McFarling

Mar 21, 2010, 9:37:33 PM

On Mar 21, 8:14 pm, "Peter Knaggs" <p...@bcs.org.uk> wrote:
> I refer you to the revised proposal <4AAAA103.80...@bcs.org.uk>.

That doesn't get through to Google Groups as a reference, but if I
understand correctly, the link to the pdf that was posted at some
point was v1, replaced by v2 with the obsolescent words removed as had
been previously planned?

That's good too. IMHO, the most contentious obsolescent word, FORGET,
is not worth delaying the clean up of CORE EXT.

Ed

Mar 21, 2010, 10:15:00 PM

Anton Ertl wrote:
> "Ed" <nos...@invalid.com> writes:
> >There's a catch. ANS doesn't definitively state what is white space
> >- it's implementation-defined.
>
> It's explicitly specified for files.
>
> >The way I read "3.4.1.1 Delimiters" provision for white space
> >delimiters other than BL while available to implementers, is more
> >entitlement than requirement. And it's "user beware".
>
> Read "11.3.6 Parsing":
>
> |When parsing from a text file using a space delimiter, control
> |characters shall be treated the same as the space character.

And when one looks into it:

3.1.2.2 Control characters
All non-graphic characters included in the implementation-defined
character set are defined in this Standard as control characters. In
particular, the characters {0..31}, which could be included in the
implementation-defined character set, are control characters.
Programs that require the ability to send or receive control characters
have an environmental dependency.

3.4.1.1 Delimiters
If the delimiter is the space character, hex 20 (BL), control characters
may be treated as delimiters. The set of conditions, if any, under
which a space delimiter matches control characters is implementation
defined.

4.1.1 Implementation-defined options
...
conditions under which control characters match a space delimiter
(3.4.1.1 Delimiters);

From that, a control character could be almost anything and it's not
guaranteed it will be equivalent to a blank delimiter.

C may have been built around the concept of white-space but for forth
it's BL. There are no functions in forth to test for white space, nor does
the literature even refer to it. Definitions are couched in terms of <space>
as delimiter:

Table 2.1 - Parsed text abbreviations
...
Abbreviation  Description
------------  -----------
<char>        the delimiting character marking the end of the
              string being parsed
<chars>       zero or more consecutive occurrences of the
              character char
<space>       a delimiting space character
<spaces>      zero or more consecutive occurrences of the
              character space
<quote>       a delimiting double quote
<paren>       a delimiting right parenthesis
<eol>         an implied delimiter marking the end of a line
ccc           a parsed sequence of arbitrary characters,
              excluding the delimiter character
name          a token delimited by space, equivalent to
              ccc<space> or ccc<eol>

> >For these reasons (and to keep things simple) the only white space
> >my interpreter and parsing words support is BL. INCLUDE-FILE
> >provides rudimentary conversion of white space characters to blanks
> >for text files.
>
> All that complexity to keep using an outdated word.

The other approach would be to replace parsing words with ones that
handle white-space which the Standard doesn't define and couldn't be
guaranteed to work across text file formats.

I'll stick with the outdated word and BL as delimiter. The mess that is
white-space, I'll keep in INCLUDE-FILE.

Ed

Mar 21, 2010, 11:21:14 PM

Stephen Pelc wrote:
> On Sat, 20 Mar 2010 18:21:20 +0200, m...@iae.nl (Marcel Hendrix) wrote:
>
> >Apparently, OSX has (some) function names with more than 256 characters.
> >As a consequence, iForth supports REALLY LONG word names since vsn 4, and
> >WORD is becoming less and less useful.
>
> Names over 2000 characters long have been observed in the wild. Ulrich
> Drepper's paper on DSOs is illuminating.

> ...

Many things are announced or appear in the computing world but
whether they come to pass or have traction is another thing.

C and Fortran seem happy enough with 31 or 63 character identifiers.
It's hard to believe Forth suddenly needs 2000 character identifiers,
when the maximum length of an S" string as implemented by
SwiftForth, VFX and Win32Forth is still 255 chars.

Nor apparently do their users need more, because if one attempts
to input more than 255 chars, some of these systems will crash.
It seems no-one noticed.

Imagining a need, is not the same as having it :)

The Beez'

Mar 22, 2010, 3:34:34 AM

On Mar 22, 4:21 am, "Ed" <nos...@invalid.com> wrote:
> C and Fortran seem happy enough with 31 or 63 character identifiers.

> Nor apparently do their users need more, because if one attempts
> to input more than 255 chars, some of these systems will crash.
> It seems no-one noticed.

Well, think of a normal line or even a line in a block: you can (with
the current limit) cram only TWO identifiers in one line, without
performing ANY operation. Imagine, needing TWO lines for:

myconstant myvariable !

Somehow, that doesn't seem very Forth-like to me. Even 31 chars is
rather long for my taste.

Hans Bezemer

Peter Knaggs

Mar 22, 2010, 5:51:23 AM

On Mon, 22 Mar 2010 01:37:33 -0000, Bruce McFarling <agi...@netscape.net>
wrote:

> On Mar 21, 8:14 pm, "Peter Knaggs" <p...@bcs.org.uk> wrote:

>> I refer you to the revised proposal <4AAAA103.80...@bcs.org.uk>.
>
> That doesn't get through to Google Groups as a reference,

http://groups.google.com/group/comp.lang.forth/browse_thread/thread/c791294690c65d74

The above link from www.forth200x.org will take you to the current
proposal.

> but if I
> understand correctly, the link to the pdf that was posted at some
> point was v1, replaced by v2 with the obsolescent words removed as had
> been previously planned?

Rather than creating a separate Legacy wordset the proposal is to
place a simple list of the removed words in Appendix D, a comparison
between 200x and ANS.

--
Peter Knaggs

Stephen Pelc

Mar 22, 2010, 6:24:12 AM

On Mon, 22 Mar 2010 14:21:14 +1100, "Ed" <nos...@invalid.com> wrote:

>Stephen Pelc wrote:
>> On Sat, 20 Mar 2010 18:21:20 +0200, m...@iae.nl (Marcel Hendrix) wrote:
>>
>> >Apparently, OSX has (some) function names with more than 256 characters.
>> >As a consequence, iForth supports REALLY LONG word names since vsn 4, and
>> >WORD is becoming less and less useful.
>>
>> Names over 2000 characters long have been observed in the wild. Ulrich
>> Drepper's paper on DSOs is illuminating.
>> ...

>C and Fortran seem happy enough with 31 or 63 character identifiers.

>It's hard to believe Forth suddenly needs 2000 character identifiers,
>when the maximum length of an S" string as implemented by
>SwiftForth, VFX and Win32Forth is still 255 chars.

The names are in OpenOffice. If one of our clients needed to access
one of these, I would treat it as a valid request.

Thomas Pornin

Mar 22, 2010, 8:06:36 AM

According to Ed <nos...@invalid.com>:

> C and Fortran seem happy enough with 31 or 63 character identifiers.

The C standard specifies that the implementation may have a limit on the
number of characters which are considered "significant" for comparisons
of identifiers; those limits are implementation-defined with some
minimal values which depend on the context. Fortunately, these are only
_minimal_ values and existing C compilers accept larger identifiers with
all characters considered as significant. And long identifiers _are_
used.

For instance, when writing C code which should interface with Java code
(implementation of native methods), the C function name shall match the
fully qualified Java method name, with name mangling for the types of
the arguments and returned value (this is needed for disambiguation, due
to overloading of method names in Java). Names longer than 100
characters are not uncommon.

--Thomas Pornin

Albert van der Horst

Mar 22, 2010, 8:53:08 AM

In article <ho3hgt$o3e$1...@news-01.bur.connect.com.au>,
Ed <nos...@invalid.com> wrote:

>J Thomas wrote:
>
>There's a catch. ANS doesn't definitively state what is white space
>- it's implementation-defined. Even if one implements white space
>handling in INCLUDE-FILE , or WORD or PARSE-NAME there's
>no guarantee it will work in every situation - just as READ-LINE
>isn't guaranteed to work on text files from another OS.
>
>The way I read "3.4.1.1 Delimiters" provision for white space
>delimiters other than BL while available to implementers, is more
>entitlement than requirement. And it's "user beware".
>
>For these reasons (and to keep things simple) the only white space
>my interpreter and parsing words support is BL. INCLUDE-FILE
>provides rudimentary conversion of white space characters to blanks
>for text files.

Suppose you want to use a system with blocks in flash.
Would you not use a facility to write to blocks one line
at a time, where virgin flash ( 0FF ) is considered blank
space?

And... is it a big deal?

: ?BLANK BL 80 WITHIN 0= ;

Groetjes Albert

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Bruce McFarling

Mar 22, 2010, 10:54:37 AM

On Mar 22, 8:53 am, Albert van der Horst <alb...@spenarnc.xs4all.nl>
wrote:

> And... is it a big deal?

> : ?BLANK BL 80 WITHIN 0= ;

The only issue is that sometimes you want whatever is the system's
whitespace, and sometimes you actually want to match BL. If you want
to match "everything that is not an ASCII7 character", that's
straightforward ... if you want to match, "everything that is not a
Latin-1 character", only slightly less so ... but if you want to
"system, match *your* whitespace", that requires a defined name.
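
The two cases that can be written portably might be sketched as follows. The names BL? and ASCII7-BLANK? are hypothetical, not standard words, and the sketch assumes 8-bit characters. Since WITHIN treats its lower limit as inclusive, the printable range here starts at "!" rather than at BL, so that BL itself counts as blank. The third case, the system's own whitespace, cannot be written this way because the set is implementation-defined.

```forth
DECIMAL

\ Hypothetical names, for illustration only.

\ True only for the space character itself.
: BL?  ( c -- flag )  BL = ;

\ True for anything outside the printable 7-bit range ! .. ~
\ i.e. control characters, BL, DEL and all codes 128..255
\ (which includes virgin flash, 0FF hex).
: ASCII7-BLANK?  ( c -- flag )  [CHAR] ! 127 WITHIN 0= ;
```

For example, `BL ASCII7-BLANK? .` and `255 ASCII7-BLANK? .` both print -1, while `CHAR A ASCII7-BLANK? .` prints 0.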

Bruce McFarling

Mar 22, 2010, 10:58:43 AM

On Mar 22, 5:51 am, "Peter Knaggs" <p...@bcs.org.uk> wrote:
> Rather than creating a separate Legacy wordset the proposal is to
> place a simple list of the removed words in Appendix D, a comparison
> between 200x and ANS.

Aha, I saw the text to be added to Annex D, but the proposal did not
note what Annex D entailed.

That works too.

Anton Ertl

Mar 22, 2010, 11:43:16 AM

Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Elizabeth D Rather <era...@forth.com> wrote:
>> Albert van der Horst wrote:
>>> You remember ENCLOSE ?
>>
>> No, actually, I've never seen ENCLOSE.
>
>It was pretty evil, a fig-FORTH special.

From the fig-Forth Glossary:

|ENCLOSE addr1 c -- addr1 n1 n2 n3
| The text scanning primitive used by WORD. From the text address
| addr1 and an ascii delimiting character c, is determined the byte
| offset to the first non-delimiter character n1, the offset to the
| first delimiter after the text n2, and the offset to the first
| character not included.
| This procedure will not process past an ascii 'null', treating it
| as an unconditional delimiter.
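
As an illustration of the stack effect described in that glossary entry, here is a simplified sketch in standard Forth. It is not the fig-FORTH source: the names ENCLOSE-SKETCH, delim, delim? and stop? are made up for this example, the buffer is assumed to be null-terminated, and when the text is ended by a null the sketch simply returns n3 equal to n2.

```forth
DECIMAL

VARIABLE delim                       \ hypothetical helper: current delimiter

: delim? ( c -- flag )  delim @ = ;
: stop?  ( c -- flag )  DUP 0= SWAP delim? OR ;   \ null or delimiter ends the text

: ENCLOSE-SKETCH ( addr1 c -- addr1 n1 n2 n3 )
  delim !  0                                          \ addr1 n1
  BEGIN  2DUP + C@ delim?  WHILE  1+  REPEAT          \ skip leading delimiters
  DUP                                                 \ addr1 n1 n2
  BEGIN  2 PICK OVER + C@ stop? 0=  WHILE  1+  REPEAT \ scan to the end of the text
  2 PICK OVER + C@ 0= IF  DUP  ELSE  DUP 1+  THEN ;   \ addr1 n1 n2 n3

\ A null-terminated test buffer holding "  ab cd"
CREATE demo  BL C, BL C, CHAR a C, CHAR b C, BL C, CHAR c C, CHAR d C, 0 C,
```

With this buffer, `demo BL ENCLOSE-SKETCH . . . DROP` prints 5 4 2: the text "ab" starts at offset 2, its trailing delimiter is at offset 4, and scanning would resume at offset 5.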

Elizabeth D Rather

Mar 22, 2010, 2:30:31 PM

Surely if you're using the concept of a "function name" in a context
that requires very long strings you're doing so to fulfill an
*application* need, not a general programming need, since such long
strings are far from practical as real function names. Therefore, it's
reasonable to handle long strings in application code, rather than
expect a system "out of the box" to provide very long function-names.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather (US & Canada) 800-55-FORTH
FORTH Inc. +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================

Ed

Mar 22, 2010, 7:58:48 PM

Are you saying that long language identifiers (dictionary names in Forth)
are not necessary to effect an OS function call that has a long name?

That's exactly my point. So for some to argue that WORD/FIND are
obsolete because forth needs dictionary names longer than 255 char,
strikes me as unfounded.

I have no quibble with forths wishing to support long *strings*. My S"
example was simply to demonstrate that to date, few have. PARSE
allows one to input potentially long strings should one need it.

Elizabeth D Rather

Mar 22, 2010, 9:09:12 PM

Exactly.