Recognizer RfD, summary of topics discussed so far

Matthias Trute

May 16, 2015, 3:00:08 PM

Hi,

below is a summary of the topics discussed so far, together with
my comments on them. The management summary is simple:

The proposal in the 1st RFD is not changed.

In the following text I try to discuss every topic
and the reasons why I think that the proposal is at
least good enough.

The text is copy'n'paste-ed from my master document
at http://amforth.sourceforge.net/pr/Recognizer-rfc-B.pdf
and http://amforth.sourceforge.net/pr/Recognizer-rfc-B.text
(plain ascii). The links remain stable for the foreseeable
future, but until the status is changed to "final" the content
may change.

You may also note that I mention some things not really discussed
yet (e.g. tick-ing a word). This second version is still
work-in-progress, so I hope for and welcome feedback.

Extended Rationale from the discussion of Version 1

There is near-general agreement that recognizers shall
replace the default command interpreter behaviour if provided
by the system implementer. Andrew Haley suggests that
recognizers should be used only as a last-resort tool, when the
standard text interpreter cannot deal with the input data. That
means that the interpreter always handles the dictionary
searches and the number checks itself and activates the
recognizer stack only if they fail. This leaves the interpreter
untouched but removes the full flexibility. The final wording
may find a solution for that. The majority questions the
usefulness of such a two-class interpreter design.

Name Tokens

Name Tokens (nt) are part of the Forth 2012 Programming Tools
word set.

The words found in the dictionary with FIND return the
execution token and the immediate flag. Using the Programming
Tools word set, the dictionary lookup can instead be based on
TRAVERSE-WORDLIST, in a word called e.g. REC:NAME ( addr len -- nt
R:NAME|R:FAIL). The major difference to FIND is that all header
information is available to handle the token:
:NONAME NAME>INTERPRET EXECUTE ; ( nt -- ) \ interpret
:NONAME NAME>COMPILE EXECUTE ; ( nt -- ) \ compile
:NONAME NAME>COMPILE SWAP POSTPONE LITERAL COMPILE, ; \ postpone
RECOGNIZER: R:NAME

To handle a set of word lists like the order stack, additional
steps have to be taken, e.g. a separate recognizing word for each
word list, which in turn get combined in the recognizer stack.

Search Order Word Set

A large part of the Search Order word set is close to what
recognizers do for dictionary searches. The Order stack can be
seen as a subset of the recognizer stack. The words handling
the order stack (ALSO, PREVIOUS, FORTH, ONLY etc) may be
extended/changed to handle the recognizer stack too/instead.

On the other hand, ALSO is essentially DUP on a different
stack. ONLY and FORTH set a predefined stack content. With the
GET/SET-RECOGNIZERS words all changes can be prepared on the
data stack with the usual data stack words.

A further difference between word lists and recognizers is that
their identification tokens are not interchangeable. There is no
relation between a wordlist identifier and a recognizer
identifier (the execution token of a REC: word).

Completely unrelated is SET/GET-CURRENT. Recognizers don't deal
with where new words are put. Possible changes here
are not considered part of the recognizer word set proposal.

A complete redesign of the Search Order word set would affect many
programs and is worth an RFD of its own. The common tools to actually
implement both recognizer and search order word sets may be
useful in their own right.

GET/SET-RECOGNIZERS

An alternative solution would be words inspired by those that link
the data stack and return stack: >R and R>. Likewise a
>RECOGNIZER would put the new item on the top of the recognizer
stack. Since this element is processed first in DO-RECOGNIZER,
the action prepends to the recognizer stack, which is less
convenient. Having the recognizer loop act the other way
(bottom up) is no less confusing and therefore not an option
either. Furthermore I expect that most changes to the recognizer
stack take place at the end (bottom) of it, appending a new
recognizer. Since there is no commonly agreed way to access a
stack at its bottom, words like N>R and NR>, which are in fact
the proposed GET/SET-RECOGNIZERS words, are needed, with all
changes taking place on the data stack. Even more difficult is
the task of inserting or removing a recognizer in the middle. Again
the standard data stack words are the simplest way to do it.
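
As a rough illustration of the point about data stack words, here is a
minimal sketch of appending and prepending. It assumes a toy model of
the recognizer stack (a count cell followed by a fixed number of slots)
and assumes GET-RECOGNIZERS returns the top item just below the count,
as GET-ORDER does. REC-STACK, MAX-REC, REC-APPEND and REC-PREPEND are
names invented for this sketch, and there is no overflow check.

```forth
\ Toy model (an assumption, not the RFD's reference code): a count cell
\ followed by MAX-REC slots; slot 0 is the top, i.e. tried first.
8 CONSTANT MAX-REC
CREATE REC-STACK  0 ,  MAX-REC CELLS ALLOT

: SET-RECOGNIZERS ( rec-n .. rec-1 n -- )
  DUP REC-STACK !                        \ store the new depth
  0 ?DO  REC-STACK I 1+ CELLS + !  LOOP ; \ rec-1 goes to slot 0, and so on

: GET-RECOGNIZERS ( -- rec-n .. rec-1 n )
  REC-STACK @ 0 ?DO
    REC-STACK DUP @ I - CELLS + @        \ push the bottom slot first
  LOOP
  REC-STACK @ ;

\ Append at the bottom: the new item already lies below everything
\ GET-RECOGNIZERS pushes, so only the count has to be adjusted.
: REC-APPEND ( rec -- )  GET-RECOGNIZERS 1+ SET-RECOGNIZERS ;

\ Prepend at the top: slip the new item in just under the count.
: REC-PREPEND ( rec -- )  >R GET-RECOGNIZERS R> SWAP 1+ SET-RECOGNIZERS ;
```

With plain numbers standing in for recognizer tokens, 111 222 2
SET-RECOGNIZERS followed by 333 REC-APPEND leaves the stack as
333 111 222 3 after GET-RECOGNIZERS: the new item ends up at the bottom
and is tried last.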

Postpone and '

Adding a POSTPONE method has been seen as overly complex. A big
issue is that POSTPONE is defined for wordlist entries only.
Unless common agreement is found on what POSTPONE means for other
data or other actions, POSTPONE shall be applied to named
entries from word lists only. All other data types should default
to "-48 THROW" (an ambiguous situation).

To drop the POSTPONE method is not an option either. Consider a
recognizer that searches a hidden word list when certain
criteria are met (e.g. a prefix is found). These words could be
interpreted and compiled correctly, but could not be postponed
since POSTPONE would not find them. A solution like
: POSTPONE
PARSE-NAME REC:WORD R:FAIL =
IF
\ system specific error action
ELSE
\ system specific postpone action
THEN
; IMMEDIATE

makes little sense unless system specific knowledge is used.
Name tokens and REC:NAME instead of REC:WORD can greatly
simplify this task.

Implementing ' (tick) is a related topic, but much simpler. It
shall use the REC:WORD internally to achieve a consistent
behaviour.
: ' PARSE-NAME REC:WORD R:FAIL =
IF
\ system specific error action if not found
ELSE
DROP \ ignore immediate flag
THEN
;

The version based on the name token recognizer would be similar:
: ' PARSE-NAME REC:NAME R:FAIL =
IF
\ system specific error action if not found
ELSE
NAME>INTERPRET
THEN
;

2-Method API

Anton Ertl suggested an alternative implementation of the
recognizer. Basically all text data is converted into a literal
at parse time. Later the interpreter decides whether to perform
an execute or a compile action with the literal data,
depending on STATE. POSTPONE is a combination of storing the
literal data together with their compile time action.

interpretation: conv final-action
compilation:    conv literal-like postpone final-action
postpone:       conv literal-like postpone literal-like postpone final-action

The conv-action is what is done inside the DO-RECOGNIZERS
action (REC:* words) and the literal-like and final-action set
replaces the proposed 3 method set in R:*. It is not yet clear
whether this approach covers the same range of possibilities as
the proposed one. Another side effect is that postponing
literals like numbers becomes possible without further notice.

A complete reference implementation does not yet exist; some
aspects were published at comp.lang.forth by Jenny Brien.

Stateless interpreter

An alternative implementation of the interpreter without STATE.
For legacy applications a STATE variable is maintained but not
used.

The code depends on DEFER and IS to be present. Similar code
can be found in gforth and win32forth.
\ legacy state support
VARIABLE STATE
: on ( addr -- ) -1 SWAP ! ;
: off ( addr -- ) 0 SWAP ! ;

\ the two states of the interpreter
: (interpret-i) _R>INT EXECUTE ;
: (interpret-c) _R>COMP EXECUTE ;
DEFER (interpret) ' (interpret-i) IS (interpret)

\ switch interpreter modes
: ] STATE on ['] (interpret-c) IS (interpret) ;
: [ STATE off ['] (interpret-i) IS (interpret) ; IMMEDIATE

: interpret
BEGIN
PARSE-NAME DUP \ get something
WHILE
DO-RECOGNIZER \ analyze it
(interpret) \ act on data, maybe leave the loop
?stack \ simple housekeeping
REPEAT 2DROP
;

R:FAIL and Exceptions

The R:FAIL word has two purposes. One is to deliver a boolean
indication of whether a parsing word could deal with a word. The
other is to serve as the method table the interpreter uses to
actually handle the parsed data, in this case by generating a
proper error message and leaving the interpreter. While the
method table simplifies the interpreter loop, the flag
information seems odd. On the other hand, a comparison of
the returned R:* token with the constant R:FAIL can easily be
optimized.

A completely different approach is using exceptions to deliver
the flag information. Using them requires the exception word
set, which may not be present on all systems. In addition, an
exception is a somewhat elaborate error handling tool and
usually means that something unexpected has happened. Matching
a string to a sequence of patterns means that exceptions are
used in a normal flow of compare operations. Exceptions are an
unusual way to organize a control flow.
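
The sentinel style of control flow can be sketched as follows. This is
only an illustration under assumptions: R:FAIL and R:NUM are modelled
as bare unique addresses (their method tables are left out), the
recognizer stack is a fixed two-entry table, and REC:NONE and REC:DIGIT
are toy recognizers invented for the example. The loop needs nothing
more than one compare against R:FAIL per recognizer tried.

```forth
\ Assumption: method tables reduced to unique addresses for this sketch.
CREATE R:FAIL 0 ,
CREATE R:NUM  0 ,

\ Toy recognizers ( c-addr u -- i*x r:table | R:FAIL )
: REC:NONE ( c-addr u -- R:FAIL )  2DROP R:FAIL ;
: REC:DIGIT ( c-addr u -- n R:NUM | R:FAIL )
  1 = IF                              \ exactly one character?
    C@ [CHAR] 0 - DUP 10 U< IF R:NUM EXIT THEN
  THEN
  DROP R:FAIL ;                       \ drop c-addr or the non-digit value

\ Fixed recognizer stack: count, then entries, first entry tried first.
CREATE REC-STACK  2 ,  ' REC:NONE ,  ' REC:DIGIT ,

: DO-RECOGNIZER ( c-addr u -- i*x r:table | R:FAIL )
  REC-STACK @ 0 ?DO
    REC-STACK I 1+ CELLS + @          ( c-addr u xt )
    ROT ROT 2DUP 2>R ROT              \ keep a copy of the string
    EXECUTE                           ( i*x r:table | R:FAIL )
    2R> ROT                           ( i*x c-addr u r )
    DUP R:FAIL <> IF  NIP NIP UNLOOP EXIT  THEN
    DROP                              ( c-addr u ) \ try the next one
  LOOP
  2DROP R:FAIL ;
```

Given the string "7" this leaves 7 R:NUM on the stack; given "xy" it
leaves R:FAIL. An exception-based variant would instead wrap each
EXECUTE in CATCH and treat a THROW as "not recognized".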

Matthias Trute

May 17, 2015, 3:55:53 AM

The tick command is slightly more complex than I initially
thought. I reworded the chapter accordingly.

Implementing ' (tick) is a related topic. It is desirable to
use the recognizer stack to achieve a consistent behaviour. The
difficulty here is to decide whether a recognized item is an
executable "tick-able" word and to handle the returned data
correctly.
: ' PARSE-NAME DO-RECOGNIZER R:WORD =
IF
DROP \ ignore immediate flag
ELSE
\ system specific error action "not found"

THEN
;

The version based on the name token recognizer would be similar:

: ' PARSE-NAME DO-RECOGNIZER R:NAME =
IF
NAME>INTERPRET
ELSE
\ system specific error action "not found"
THEN
;

This is obviously not a general solution, however. TBD.

Bernd Paysan

May 17, 2015, 2:53:35 PM

To avoid long postings, I have split my reply by topic.

Matthias Trute wrote:
> Postpone and '
>
> Adding a POSTPONE method has been seen as overly complex. A big
> issue is that POSTPONE is defined for wordlist entries only.
> Unless a common agreement is found what POSTPONE means to other
> data or other actions, the POSTPONE shall be applied to named
> entries from wordlist only. All other data types should default
> to "-48 THROW". (An ambiguous situation).

No, the other way round. One of the reasons why a recognizer is superior to
a state-smart (or dual-xt) parsing word like S" or TO is that you can just
postpone the thing.

With a string, interpretation and compilation look the same, but postpone
is

s" some string" postpone sliteral

And postponing TO <something> is simply impossible without some carnal
knowledge.
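
For readers who want to see what the sequence s" some string" postpone
sliteral does in plain standard Forth (no recognizers involved), here
is a small sketch. GREETING and SHOW are names made up for the example.

```forth
\ GREETING and SHOW are hypothetical names used only for this sketch.
: GREETING ( compilation: -- ) ( run-time: -- c-addr u )
  S" some string"      \ when GREETING runs, this yields the string
  POSTPONE SLITERAL    \ and SLITERAL compiles it into the current definition
; IMMEDIATE

\ Behaves like  : SHOW S" some string" TYPE ;
: SHOW ( -- )  GREETING TYPE ;
```

GREETING is immediate, so it runs while SHOW is being compiled and lays
down the string as a literal there. This is the effect a postpone
action for a string recognizer would need to reproduce.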

My proposal would be to not drop this feature, but try to educate people.
The postpone part is where recognizers really can do better than ordinary
words.

The reluctance probably comes from people who implement their recognizers as
almost-normal words, where the postpone part simply doesn't work.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
net2o ID: kQusJzA;7*?t=uy@X}1GWr!+0qqp_Cn176t4(dQ*
http://bernd-paysan.de/

JennyB

May 28, 2015, 8:20:49 AM

On Saturday, 16 May 2015 20:00:08 UTC+1, Matthias Trute wrote:

> Name Tokens
>
> Name Tokens (nt) are part of the Forth 2012 Programming Tools
> word set.
>

where name tokens are used internally by TRAVERSE-WORDLIST. I see no case for a name token to be part of the core, as an item that could be returned on the stack and perhaps stored in a data structure.

Since Forth has been moving away from dependency on the immediate flag for decades, I would suggest that FIND-NAME (or whatever we call it) should return what NAME>COMPILE returns. There is no requirement to be able to move easily from an xt to its dictionary entry (if it had one), so this pair of xts together provide all the information needed to recover the original definition's intentions.

In what follows SET-COMPILER sets the compiling action to be returned by the latest definition, and COMPILE, compiles a call to the xt regardless of what compiling action may have been set (it cannot tell from the xt alone).

There's a useful general pattern for recognizer base words.
c-addr u -- i*x xt | 0

where i*x is the 'something' recognized and xt is the action the compiler should take.

Given that, there are only three generic recognizers needed:

R:DEF xt1 xt2 -- handle a definition
R:LIT i*x xt -- handle a literal
R:LIT-DO i*x xt xt1 xt2 -- handle a literal action pair

For most Forths the actions of R:DEF would be:

:NONAME DROP EXECUTE ; ( xt1 xt2 -- ) \ interpret
EXECUTE ( xt1 xt2 -- ) \ compile
:NONAME SWAP POSTPONE LITERAL COMPILE, ; \ postpone

There are then two portable ways to add checking for compile-only words.
Say you want Bar to execute the compile-only action Foo:

: NO-INTERP TRUE ABORT" Compile-only word" ;

: BAR no-interp ;
:NONAME drop foo ; set-compiler

or

: (comp-only) EXECUTE ;
: COMP-ONLY ['] (comp-only) set-compiler ;

: BAR foo ; comp-only

and set interpret action of R:DEF to

:NONAME ['] (comp-only) = IF no-interp THEN EXECUTE ;

> Search Order Word Set
>
> A large part of the Search Order word set is close to what
> recognizers do for dictionary searches. The Order stack can be
> seen as a subset of the recognizer stack.

I'd think of this the other way around. Recognizers are useful without
Search Order, but we don't want to mess with the Search Order syntax and
we don't want to end up with two ways of specifying search order.

The most basic thing you want to do is to add a recognizer to the
word-not-found part of the interpreter:

REC+ \ xt -- append to recognizer chain
\ the recognizer is effective immediately

Next stage: switch recognizer sets

SCOPE \ 'name' -- begin a named recognizer chain
\ /name/ puts its token on the stack

SET-CURRENT \ scope -- set scope as the current recognizer chain
\ rec+ adds a recognizer to the scope named by SCOPE or SET-CURRENT

Next Stage: the building block for new interpreters

DO-CHAIN \ c-addr u scope -- i*x r:foo | r:fail

Do you see a certain resemblance to SEARCH-WORDLIST?

Final stage: Add the rest of the Search Order Wordset.
A wordlist is now an anonymous Scope.

: SCOPE WORDLIST DUP CONSTANT SET-CURRENT ;

Now each scope/wordlist in the Search Order acts like a simple interpreter.
DO-RECOGNIZERS calls DO-CHAIN to search for definitions first, and if that
fails, tries each of the recognizers set REC+ when that scope was current.
Only when that fails does it try the next scope in the search order.
FIND-NAME etc, search only for definitions.

' is simply : PARSE-NAME FIND-NAME
0= ABORT" not found" ;
If you think it helps, use one of the methods discussed above to avoid ticking compile-only words.

Prefixes
Prefixes are the most common (and fastest) test for recognizers, so there should be a simple way to define them. Albert's prefixes are unfortunately incompatible with this scheme (and some dictionary structures) but it should be possible, without carnal knowledge, to define

PREFIX 'name' xt --
add a recognizer that checks for the prefix 'name.'
If found, place >IN after the prefix and perform xt,
otherwise restore >IN to beginning of word and return R:FAIL

POSTPONE

Postpone is very easy if the compiling xt is on the stack. There are just
three variants - postpone an xt, postpone a literal or postpone a literal
and its associated action:

: POST-DEF ( xt1 xt2 -- )
SWAP POSTPONE LITERAL COMPILE, ;

: POST-LIT ( i*x lit -- )
DUP >R EXECUTE R> COMPILE, ;

: POST-LIT-DO ( i*x lit xt1 xt2 -- )
2>R POST-LIT 2R> POST-XT ;

In conjunction with SET-COMPILER, these cover all cases I think, even Bernd's dot-parser:

: >oo> ( object xt -- ) swap >o execute o> ;
:noname drop ( xt -- ) postpone >o compile, postpone o> ; set-compiler

: with \ 'name' -- xt1 xt2
parse-name rec:name r:fail = abort" not found" ;
: [with] with 2literal ; comp-only

:NONAME ' ['] LITERAL [WITH] >oo> R:LIT-DO ;
PREFIX ::

JennyB

May 28, 2015, 8:26:13 AM

It's still not right, even though it respects your desired Search Order. If it matches any other recognizer, your error action had better be sure to clear whatever stack it might have put something on.

Alex McDonald

May 28, 2015, 9:56:22 AM

on 28/05/2015 13:20:47, JennyB wrote:
> On Saturday, 16 May 2015 20:00:08 UTC+1, Matthias Trute wrote:
>
>> Name Tokens
>>
>> Name Tokens (nt) are part of the Forth 2012 Programming Tools
>> word set.
>>
> where name tokens are used internally by TRAVERSE-WORDLIST. I see no
> case for a name token to be part of the core, as an item that could
> be returned on the stack and perhaps stored in a data structure.
>
> Since Forth has been moving away from dependency on the immediate flag
> for decades, I would suggest that FIND-NAME (or whatever we call it)
> should return what NAME>COMPILE returns. There is no requirement to
> be able to move easily from an xt to its dictionary entry (if it had
> one), so this pair of xts together provide all the information needed
> to recover the original definition's intentions.

Just to be clear here, what NAME>COMPILE returns is not a pair of XTs;
the signature is ( nt -- x xt ) and x does not need to be an XT. The only
requirement is that executing the XT consumes X and performs the compilation
semantics of the word represented by NT. In my system, NAME>COMPILE
returns ( nt -- nt xt ). The XT, when EXECUTEd, uses the supplied input NT
to do the compilation, since moving from the XT to the NT to get the
needed information is a longer code path, is unnecessary and may not be
possible (as you note). The NT has all the information required to do the
compilation.

I've not yet reviewed the rest.

[snipped]

Matthias Trute

May 29, 2015, 3:12:01 PM

> > Name Tokens
> >
> > Name Tokens (nt) are part of the Forth 2012 Programming Tools
> > word set.
> >
> where name tokens are used internally by TRAVERSE-WORDLIST. I see no case for
> a name token to be part of the core, as an item that could be returned on the
> stack and perhaps stored in a data structure.

Uhm. I agree that name tokens are not part of CORE, and neither are
recognizers. What I wanted to express is that recognizers can be used
to explore the possibilities (and perhaps limits) of name tokens. By
implementing a recognizer that uses NTs instead of what FIND returns I
had some success, until I learned that ' (tick) may cause some serious
trouble. It may be that ' is the signal to really distinguish between the
search order and the recognizer stacks. Maybe.

>
> Since Forth has been moving away from dependency on the immediate flag for
> decades, I would suggest that FIND-NAME (or whatever we call it) should
> return what NAME>COMPILE returns. There is no requirement to be able to move
> easily from an xt to its dictionary entry (if it had one), so this pair of
> xts together provide all the information needed to recover the original
> definition's intentions.

For me an XT is an identifier for a code sequence, no more no less. The
forth VM uses them. The forth text interpreter (not the VM) needs more
information (a name, some flags like immediate). Here I start to like the
idea of the name token. It's a well defined id that gives access to all
relevant information the text interpreter needs: the flags and the
code to be executed (or compiled) aka an/some execution tokens.

>
> There's a useful general pattern for recognizer base words.
> c-addr u -- i*x xt | 0
>
> where i*x is the 'something' recognized and xt is the action the compiler
> should take.

Compare that with

c-addr u -- i*x r:table | r:fail

Not much different, is it? The major difference I see is that
you mandate the interpret action to be a no-op (well, dropping the
XT is needed), the compile action to be a user-supplied code
snippet and the flag, well, a flag. Right? That would be at least
simpler than Anton's suggestion with the two methods. OTOH a
->foo that behaves like "TO foo" would be impossible with it, which
in turn makes the ]] [[ syntax less useful.

I still think that the (small) complexity of the RFD is worth
its price. You can do a lot more with very little effort. OK, some standard
cases look strange or overly complex. The same argument holds for Anton's
2-method idea.

> > Search Order Word Set
> >
> > A large part of the Search Order word set is close to what
> > recognizers do for dictionary searches. The Order stack can be
> > seen as a subset of the recognizer stack.
>
> I'd think of this the other way around. Recognizers are useful without
> Search Order, but we don't want to mess with the Search Order syntax and
> we don't want to end up with two ways of specifying search order.

Yes.

>
> The most basic thing you want to do is to add a recognizer to the
> word-not-found part of the interpreter:

Well, there is no standard "not found" action defined in Forth 2012.
It would be a starting point for Andrew Haley's suggestion to
keep the Forth interpreter as it is and have the recognizers
active only for words that are neither words in the search order nor
numbers. See above.

>
> REC+ \ xt -- append to recognizer chain
> \ the recognizer is effective immediately
>
> Next stage: switch recognizer sets
>
> SCOPE \ 'name' -- begin a named recognizer chain
> \ /name/ puts its token on the stack
>
> SET-CURRENT \ scope -- set scope as the current recognizer chain
> \ rec+ adds a recognizer to the scope named by SCOPE or SET-CURRENT

I'll need some time to think about that.

>
> Next Stage: the building block for new interpreters
>
> DO-CHAIN \ c-addr u scope -- i*x r:foo | r:fail

Chains are already part of win32forth and I'm starting to
like them, but not for what I use recognizers for. It's more
like an executable named wordlist (and maybe implemented as
one).

> Prefixes
> Prefixes are the most common (and fastest) test for recognizers, so there
> should be a simple way to define them.

Creating such a defining word for it should be an easy
task. No need to be part of the standard. IMHO.

Matthias

JennyB

May 31, 2015, 5:43:04 PM

On Friday, 29 May 2015 20:12:01 UTC+1, Matthias Trute wrote:
> Until I learned that ' (tick) may cause some serious
> trouble. It may be that ' is the signal to really distinguish between the
> search order and the recognizer stacks. Maybe.
>

Yes. You don't want ' (or anything that is specifically looking for a
definition) to even attempt to search anything in the search order that
isn't a wordlist.

You're trying to make the equivalence:

Wordlist <=> Recognizer
Search Order <=> Recognizer Stack

I'm arguing for:

Definition <=> Recognizer
Wordlist <=> Recognizer Chain (Scope)
Search Order <=> Search Order

> > There's a useful general pattern for recognizer base words.
> > c-addr u -- i*x xt | 0
> >
> > where i*x is the 'something' recognized and xt is the action the compiler
> > should take.
>
> compare that with
>
> c-addr u -- i*x r:table | r:fail
>
> not much different, isn't it?

No. Your way does save putting a lot on the stack for R:LIT-DOs, but it
does mean that for each distinct literal/action pair you need to define an
R:Table, even if you're only going to use it once. But you can mix the two
methods as you see fit; I was only pointing out the pattern.

> The major difference I see is that
> you mandate the interpret action to be a No-Op (well, dropping the
> XT is needed), the compile action to be an user supplied code
> snippet and the flag, well a flag. Right?

I mandate the first xt to be the execution token of the definition - what it
does when executed (whenever that is). The second is the compiling action
(EXECUTE for Immediate Words, COMPILE, for defaults, or something else
defined by the system or the user) For dual words the compiling action may
drop the xt, or use it as it sees fit. For compile-only words the compiling
action may either cause the xt to execute, or drop it. In all cases that
are at present portable the first item is the definition's xt. I don't see
how you can write portable compiling actions if it is anything else.

I'm also using the second xt as a flag if no definition is found (or the
literal conversion did not succeed). Perhaps that's not a Good Idea, but
it's all internal working - the text interpreter sees an R:Table in the end
anyway.

> That would be at least
> simpler than Antons suggestion with the two methods. OTOH a
> ->foo to behave like "TO foo" would be impossible with it. Which
> in turn makes the ]] [[ syntax less useful.

Well, I imagine you define VALUEs of the various kinds with some sort of
method table to find their own particular set method. Call that TO!

TO! xt -- addr xt | 0

See the pattern? You can resolve that at parse time, so:

:NONAME PARSE-NAME FIND-DEF?
0= IF R:FAIL EXIT THEN
TO! ?DUP
0= IF R:FAIL EXIT THEN
( addr xt) ['] LITERAL SWAP
( addr lit xt ) ['] COMPILE, R:LIT-DO ;
PREFIX ->

What's hard about that?

In fact R:LIT-DO is too generic. Almost always you'll want something that
takes one cell and an xt:

' EXECUTE
: 1+XT_COMP SWAP POSTPONE LITERAL COMPILE, ;
' 1+XT_COMP
:NONAME POSTPONE 2LITERAL
POSTPONE 1+XT_COMP ;
RECOGNIZER: R:1+XT

:NONAME PARSE-NAME FIND-DEF?
0= IF R:FAIL EXIT THEN
TO! ?DUP
0= IF R:FAIL EXIT THEN
( addr xt) R:1+XT ;
PREFIX ->

> I still think, that the (small) complexity of the RFD is still worth
> its price. You can do a lot more with very little effort.

I agree. The Search Order needs work, though, before we think of replacing
the classic text interpreter, though recognizers do make it easy to roll
your own interpreter, using as much or as little of the host as you need.

>
> > > Search Order Word Set

> Well, there is no standard "no-found" action defined in Forth2012.
> It would be an starting point for Andrew Haleys suggestion to
> keep the forth interpreter as it is and have the recognizers only
> active for words that are neither word in the search order nor numbers.
> see above.

That was my intention. Should the recognizers come above or below the numbers?

>
> Chains are already part of the win32forth and I'm going to
> like them. But not for what I use recognizers. It's more
> like an executable named wordlist (and maybe implemented as
> one).
>

If it's executable and named, it's more like a word than a wordlist.

> > SET-CURRENT \ scope -- set scope as the current recognizer chain
> > \ rec+ adds a recognizer to the scope named by SCOPE or SET-CURRENT
>
> I'll need some time to think about that.

I'm not entirely sure either. Without wordlists, this effectively sets a
"search order" of:

REC:WORD /SCOPE/

Maybe it would be better to call it SET-CHAIN because it's a sort of
embryonic SET-ORDER.

This is how I see Scopes working without wordlists:

REC+ appends a recognizer to the most recently declared SCOPE, but the scope
is only made active by SET-CHAIN.

: COMPILED _R>COMP EXECUTE ;

Scope Numbers-First
' rec:num rec+
' rec:word rec+ \ searched after numbers

: compile-table
BEGIN parse-name /condition/ WHILE
Numbers-first search-chain compiled REPEAT ;

You'll need, of course, immediate words to trigger the exit or any other
actions you want to take.

With Wordlists, SCOPE also creates a wordlist and sets it current, so REC+
appends the recognizer to that wordlist, and so it becomes active when it is
put in the search order.

Albert van der Horst

May 31, 2015, 10:14:51 PM

In article <3c9c3559-949a-419b...@googlegroups.com>,
JennyB <jenny...@googlemail.com> wrote:
<SNIP>

>That was my intention. Should the recognizers come above or below
>the numbers?

If you don't use recognizers to implement numbers, you're doing
something wrong.

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

JennyB

Jun 1, 2015, 1:46:11 AM

Yes, but should they be explicitly included in the chain, or are the system-supplied recognizers the equivalent of FORTH-WORDLIST? I would think that SET-CHAIN should cause the named scope to be searched first, and then the system-supplied recognizers. SEARCH-CHAIN would search only the recognizers explicitly declared in the scope.

Albert van der Horst

Jun 1, 2015, 7:09:41 AM

In article <1e83cdde-16a1-4897...@googlegroups.com>,

JennyB <jenny...@googlemail.com> wrote:
>Yes, but should they be explicitly included in the chain, or are the

Yes what? This is Usenet.

JennyB

Jun 1, 2015, 11:55:11 AM

On Monday, 1 June 2015 12:09:41 UTC+1, Albert van der Horst wrote:
> In article <1e83cdde-16a1-4897...@googlegroups.com>,
> JennyB <jenny...@googlemail.com> wrote:
> >Yes, but should they be explicitly included in the chain, or are the
>
> Yes what? This is Usenet.
>

Sorry, yes, to your last comment: "If you don't use recognizers to
implement numbers, you're doing something wrong." I was posting via the
Google mobile interface, which doesn't allow quoting, but I thought it
preserved the threading.

Your Prefixes allow recognizers to be (almost) normal, but STATE-smart,
definitions, so your interpreter doesn't need a second stage. That's not
possible here, so the question is where in the interpreter user-defined
recognizers should go.

My solution is that they should be tried before the system-supplied number
recognizers so that:

1. When you specify what recognizers you need, you don't have to explicitly mention the system ones; they are tried after your own fail, like FORTH-WORDLIST
2. If for some reason you don't want to include the system recognizers you
can either:

a: Put REC:FAIL at the end of the chain of recognizers you set with SET-CHAIN

or
b: search a scope directly with SEARCH-CHAIN

When you have the SEARCH-ORDER wordset, a SCOPE acts as a named wordlist that does SET-CURRENT when it is first declared.

SCOPE FOO

is the same as

WORDLIST DUP CONSTANT FOO SET-CURRENT

except that it also starts a new recognizer chain, and so long as it is the CURRENT wordlist, REC+ will append a recognizer to that chain.

A Scope now has potentially a wordlist or a chain of other recognizers, or both.

SEARCH-WORDLIST on a scope searches its wordlist, which may contain zero
definitions.
SEARCH-CHAIN first searches the wordlist and returns an r:table if
successful, otherwise it searches any other recognizers in order.

A Scope therefore has the main components of a custom single-vocabulary
interpreter. The Search Order replaces Matthias' Recognizer Stack.
DO-RECOGNIZER calls SEARCH-CHAIN on each scope in the Search Order in turn.

I hope that makes sense. I haven't implemented any of this; I'm just thinking it out as I go along.