
Recognizers


Alex

Jan 22, 2017, 1:42:04 AM
Some recognizer observations.

1. Using -2 STATE ! for POSTPONE actions is very useful as it provides a
simple index on (0 -1 and -2) NEGATE CELLS into the appropriate
RECOGNIZER: table. It remains ANS compliant:

STATE is true when in compilation state, false otherwise. The true value
in STATE is non-zero, but is otherwise implementation-defined.
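
The indexing in point 1 can be illustrated with a minimal sketch. It assumes the system's compile-state value is exactly -1, and it uses a hypothetical three-cell table; `demo-table`, `demo-action` and the three action words are made up for this example, and a real RECOGNIZER: table may be laid out differently.

```forth
\ Hypothetical action words standing in for the three table entries.
: int-action   ( -- n ) 0 ;   \ would interpret
: comp-action  ( -- n ) 1 ;   \ would compile
: post-action  ( -- n ) 2 ;   \ would postpone

\ Three cells: interpret, compile, postpone (this layout is an assumption).
create demo-table  ' int-action ,  ' comp-action ,  ' post-action ,

: demo-action ( state-value -- xt )   \ state-value is 0, -1 or -2
  negate cells demo-table + @ ;

-2 demo-action execute .   \ selects post-action
```

The lookup only works if STATE never holds any other non-zero value, which is a property of the implementation rather than of the standard.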

2. There's an unintended consequence with numbers (in fact, any
non-FINDable words can be treated this way).

: plit postpone literal ;

' d>s
:noname d>s plit ;
:noname d>s plit postpone plit ;
recognizer: r:number ( n 0 -- )

: postpone ( -<name>- )
-2 state dup @ >r !
parse-name recognize execute
r> state ! ; immediate

This allows POSTPONE 10 even though 10 isn't a "proper" word and can't
be ticked or found in the usual wordlist searches. I've not yet
considered a use for it.

3. Adding the Windows constant server was very simple. (It's a DLL that
allows for symbols like MB_HELP VK_F1 VK_F2 etc to be used without
declaring thousands of CONSTANTs.)

library wincon.dll
3 import: wcFindWin32Constant

: wincon-call ( a1 -- n f ) \ call to find constant
>r 0 sp@ r> count swap wcFindWin32Constant ;

: rec:wincon ( addr u -- d r:number | r:fail ) \ find constant
buf-allot dup>r place r@ ?uppercase \ uppercase a copy
wincon-call dup 0= if \ found constant? if not
2drop \ drop returned values
s" A" r@ +place \ append an 'A' look for that
r@ wincon-call
then rdrop
if 0 r:number else drop r:fail then ;

' rec:wincon get-recognizers 1+ set-recognizers

It's very clean indeed.

--
Alex

Matthias Trute

Jan 22, 2017, 6:06:50 AM
On Sunday, 22 January 2017 07:42:04 UTC+1, Alex wrote:
> Some recognizer observations.
>

>
> 2. There's an unintended consequence with numbers (in fact, any
> non-FINDable words can be treated this way).
>
> : plit postpone literal ;
>
> ' d>s
> :noname d>s plit ;
> :noname d>s plit postpone plit ;

If you change this line to something like
:noname -1 abort" invalid postpone" ;
(untested) postponing of un-findable items won't
create a surprise. Some people find it attractive
however that postpone does not only handle words
from the dictionary but other things too.


>
> 3. Adding the Windows constant server was very simple. (It's a DLL that
> allows for symbols like MB_HELP VK_F1 VK_F2 etc to be used without
> declaring thousands of CONSTANTs.)
>
> library wincon.dll
> 3 import: wcFindWin32Constant
>
> : wincon-call ( a1 -- n f ) \ call to find constant
> >r 0 sp@ r> count swap wcFindWin32Constant ;
>
> : rec:wincon ( addr u -- d r:number | r:fail ) \ find constant
> buf-allot dup>r place r@ ?uppercase \ uppercase a copy
> wincon-call dup 0= if \ found constant? if not
> 2drop \ drop returned values
> s" A" r@ +place \ append an 'A' look for that
> r@ wincon-call
> then rdrop
> if 0 r:number else drop r:fail then ;
>
> ' rec:wincon get-recognizers 1+ set-recognizers
>
> It's very clean indeed.

I'm not using windows so I cannot really comment about
it but your solution looks pretty smart indeed.

Matthias

Anton Ertl

Jan 22, 2017, 10:22:45 AM
Alex <al...@rivadpm.com> writes:
>Some recognizer observations.
>
>1. Using -2 STATE ! for POSTPONE actions is very useful as it provides a
>simple index on (0 -1 and -2) NEGATE CELLS into the appropriate
>RECOGNIZER: table. It remains ANS compliant:
>
>STATE is true when in compilation state, false otherwise. The true value
>in STATE is non-zero, but is otherwise implementation-defined.

To me that means that both -2 and any other non-zero value just mean
compilation state. Postpone state is outside of the realm of existing
standards.

The important thing here is that a program can contain

state @ if comp-state-stuff else int-state-stuff then

and that program should work as expected in a standard system.
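
A minimal sketch of why such a program treats every non-zero value as compilation state. `state-class` is a hypothetical helper that takes the STATE value as an argument so that it can be tried interactively.

```forth
\ IF treats every non-zero flag as true, so -1 and -2 take the same branch.
: state-class ( state-value -- n )   \ n=1: compile branch, n=0: interpret branch
  if 1 else 0 then ;

 0 state-class .   \ interpret branch
-1 state-class .   \ compile branch
-2 state-class .   \ also the compile branch
```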

OTOH, such programs probably are not fit for postpone state anyway,
and even if the standard would say that only -1 means compile state,
and the program was written as

state @ case
0 of int-state-stuff endof
-1 of comp-state-stuff endof
true abort" unknown STATE"
endcase

(which is the best one could hope for) that would buy very little.

Bottom line: The problem is not the state value itself (it would be if
the following were not also the case), but that Forth-94/2012 excludes
any states except for interpretation and compilation.

What can we do?

1) Forget about postpone state.

a) Forget about it completely. Not very satisfactory. See below
why we want something in this direction.

b) Implement ]] ... [[ to have its own loop instead of the text
interpreter loop, like ] had in PolyForth. That's somewhat against
the grain of Forth-94/2012, which explicitly standardized the
common interpreter loop. But in any case, that's what is done in
<http://www.complang.tuwien.ac.at/forth/compat/macros.fs>.

2) Actually have STATE=-2 (or some other special value) mean "postpone
state". Existing words that use STATE would probably not work as
intended in postpone state, so users should avoid using such words in
postpone state, but they probably also won't work as intended in a
1b-style implementation.
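
Option 1b can be sketched in standard Forth roughly as follows. This is not the code from macros.fs, only an illustration under stated assumptions: it handles dictionary words only (no numbers), everything up to [[ must be on one line, and FIND is assumed to report immediacy as 1 / -1. `postpone-found`, `square-macro` and `square` are hypothetical names.

```forth
\ Sketch of option 1b: ]] runs its own parsing loop instead of relying
\ on the text interpreter.

: postpone-found ( xt 1|-1 -- )
  0> if
    compile,                            \ immediate word: compile it into the macro
  else
    postpone literal postpone compile,  \ normal word: the macro compiles it later
  then ;

: ]] ( "name ... [[" -- )
  begin
    parse-name dup 0= abort" no closing [[ on this line"
    2dup s" [[" compare                 \ non-zero while the token is not [[
  while
    dup pad c!  pad 1+ swap move        \ counted copy of the token for FIND
    pad find dup 0= abort" cannot postpone this token"
    postpone-found
  repeat 2drop ; immediate

: square-macro ( -- ) ]] dup * [[ ; immediate
: square ( n -- n*n ) square-macro ;
7 square .
```

Because the loop never consults STATE, no third state value is needed, at the price of duplicating part of the text interpreter.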

Overall, whether you go with 1b or 2, there are some existing pieces
of code that won't work as intended in postpone state, so programmers
will have to be careful. But a lot of code will work in the
cut-and-paste way described below, which is much better than if we did
not have ]]...[[ at all or if recognizers did not support it.

>2. There's an unintended consequence with numbers (in fact, any
>non-FINDable words can be treated this way).
>
>: plit postpone literal ;
>
> ' d>s
> :noname d>s plit ;
> :noname d>s plit postpone plit ;
>recognizer: r:number ( n 0 -- )
>
>: postpone ( -<name>- )
> -2 state dup @ >r !
> parse-name recognize execute
> r> state ! ; immediate
>
>This allows POSTPONE 10 even though 10 isn't a "proper" word and can't
>be ticked or found in the usual wordlist searches. I've not yet
>considered a use for it.

That's an intended consequence, and the main reason for having support
for a postpone action in recognizers. The first consequence is that,
with an appropriately written POSTPONE, you can write

POSTPONE 10

where a Forth-94/2012 programmer would write

10 POSTPONE literal

Furthermore, if you implement ]] ... [[ for defining macros, you can
now really cut and paste code between inside ]] ... [[, ordinary
compiled code, and interpreted code: E.g., you can write stuff like

]] dup 42 = [[

instead of

]] dup [[ 42 ]] literal = [[
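
For comparison, the Forth-94/2012 spelling with plain POSTPONE can be tried on any standard system; a minimal sketch, where `compile-is42` and `is42?` are hypothetical names:

```forth
\ An immediate macro written the Forth-94/2012 way: the number has to be
\ turned into a literal explicitly.
: compile-is42 ( -- )
  postpone dup  42 postpone literal  postpone = ; immediate

: is42? ( n -- n flag ) compile-is42 ;

42 is42? . drop   \ prints a true flag
41 is42? . drop   \ prints a false flag
```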

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2016: http://www.euroforth.org/ef16/

Alex

Jan 22, 2017, 5:26:35 PM
On 1/22/2017 09:42, Anton Ertl wrote:

>
> b) Implement ]] ... [[ to have its own loop instead of the text
> interpreter loop, like ] had in PolyForth. That's somewhat against
> the grain of Forth-94/2012, which explicitly standardized the
> common interpreter loop. But in any case, that's what is done in
> <http://www.complang.tuwien.ac.at/forth/compat/macros.fs>.

With recognizers and a POSTPONE state of -2, an implementation of ]] [[
for macros is trivial; no interpreter loop is required; no special
casing is required for literals; & anything that has a recogniser on the
recogniser stack with a POSTPONE section is valid.

: foo ]] dup 42 = [[ ; immediate

Obviously, it also supports the much talked about

: end ]] exit then [[ ; immediate



\ Avoid having to write so many POSTPONEs; Instead of
\ POSTPONE a 10 POSTPONE LITERAL POSTPONE c
\ write
\ ]] a 10 c [[

:noname -600 throw ; \ internal error
:noname -600 throw ;
:noname
get-recognizers 1- set-recognizers drop
-1 state ! ;
recognizer: r:macro ( -- )

: rec:macro ( addr u -- r:macro | r:fail )
s" [[" compare if r:fail else r:macro then ;

: ]] ( -- ) \ switch into postpone state
['] rec:macro get-recognizers 1+ set-recognizers
-2 state ! ; immediate



--
Alex

Alex

Jan 23, 2017, 8:30:08 AM
On 1/22/2017 01:42, Alex wrote:
> Some recognizer observations.

Refer to http://amforth.sourceforge.net/pr/Recognizer-rfc-C.html

3. Having R:FAIL not on the recognizer stack is inflexible. When
supporting more than one stack, the same R:FAIL appears as the last
implied entry in all of them, and an action like -13 THROW may not be
right for every situation. E.g. I may want to support forward
definitions on "not found" rather than throw an error.

The solutions would appear to be:

a. Modify the R:FAIL table when switching stacks. Messy and requires
setting 3 entries in the table at R:FAIL and the value of
FORTH-RECOGNIZER. Very difficult to manage, especially when recovery in
CATCH might require restoring the default recognition stack.

b. Require each stack to have its own R:FAIL equivalent action as the
last entry in the stack. There are some objections raised in the RfD at
"FLAGS, R:FAIL or EXCEPTIONS".

c. Maintain the current R:FAIL but avoid executing it by adding an
R:FAIL<this stack> to the bottom of those stacks that require it with
SET-RECOGNIZERS to catch unresolved requests.

Given that I believe the stack will only infrequently change dynamically
and that I'll be using mainly static FORTH-RECOGNIZERs, I'm going to use
approach C and have a "catch all" before the built-in R:FAIL.

My application for this is meta-compilation. While generating code
during meta-compilation I'm using a META-RECOGNIZER stack, and the
actions of the standard REC:NAME, REC:NUM etc are metacompilation
actions META:NAME and META:NUM, and I wish to use R:FAIL (now META:FAIL
prior to the default R:FAIL) to provide forward declarations without
having to declare them in advance.

Standard stack:

' REC:NAME ' REC:NUM 2 FORTH-RECOGNIZER SET-RECOGNIZERS

Meta stack:

4 RECOGNIZER META-RECOGNIZER
' META:FAIL ' META:NAME ' META:NUM 3 META-RECOGNIZER SET-RECOGNIZERS


4. Varying stack item counts from REC:<name> words.

This one has caused me a lot of grief while debugging.

: rec:num ( addr u -- d r:num | r:fail )



5. Names.

I haven't found a use for R>COMP R>INT or R>POST. Also, the names are a
little suspect; why not match NAME>X and have R>INTERPRET R>COMPILE and
R>POSTPONE (if they are required).

R:<name> (a table) and REC:<name> (a word) are too similar, and an
attempt has been made to semi-standardize them.

RECOGNIZER would be clearer as RECOGNIZER-STACK

DO-RECOGNIZER I have implemented as the verb RECOGNIZE; it leads to
pretty understandable code like this (?STACK is an underflow test on the
data stack).

: interpreter ( -- ) \ outer interpreter
begin ?stack parse-name dup
while recognize execute
repeat 2drop ;

--
Alex

Alex

Jan 23, 2017, 8:57:16 AM
On 1/23/2017 08:30, Alex wrote:
> On 1/22/2017 01:42, Alex wrote:

>
> c. Maintain the current R:FAIL but avoid executing it by adding an
> R:FAIL<this stack> to the bottom of those stacks that require it with
> SET-RECOGNIZERS to catch unresolved requests.

The problems of R:<name> and REC:<name>. This should read

c. Maintain the current R:FAIL table but avoid executing its entries by
adding a REC:FAIL<this stack> to the bottom of those stacks that require
it to catch unresolved requests.

--
Alex

Matthias Trute

Jan 23, 2017, 1:58:46 PM
On Monday, 23 January 2017 14:30:08 UTC+1, Alex wrote:
> On 1/22/2017 01:42, Alex wrote:
> > Some recognizer observations.
>
> Refer to http://amforth.sourceforge.net/pr/Recognizer-rfc-C.html
>
> 3. Having R:FAIL not on the recognizer stack is inflexible. When
> supporting more than one stack, the same R:FAIL appears as the last
> implied entry in all of them, and an action like -13 THROW may not be
> right for every situation. E.g. I may want to support forward
> definitions on "not found" rather than throw an error.

R:FAIL is never on the stack, R:FAIL is a result of the actions.
That R:FAIL is used to execute some actions is truly the
"last resort" like an outermost exception catcher.

Adding a "not found" hook or a forward definition resolver as a
recognizer is easily possible.

> 4. Varying stack item counts from REC:<name> words.
>
> This one has caused me a lot of grief while debugging.
>
> : rec:num ( addr u -- d r:num | r:fail )

I'm sure you mean this stack effect(s)

: rec:num ( addr u -- n r:num | d r:dnum | r:fail)

This word is part of the informal appendix, there are
three different recognizers for each number format
(single, double cell numbers and the 'x' notation)
available at (among others)

http://theforth.net/package/recognizers/current-view/rec-num.4th

that have a clean stack effect.
> 5. Names.
>
> I haven't found a use for R>COMP R>INT or R>POST. Also, the names are a
> little suspect; why not match NAME>X and have R>INTERPRET R>COMPILE and
> R>POSTPONE (if they are required).

NAME>x requires a name token, which may not be available (in fact,
IMHO only very few Forths currently have one). There is
currently a discussion with the gforth people about the DT>x words
(as they are called now); maybe it's time to disclose version
4 of the RFD. It addresses the name confusion like

>
> R:<name> (a table) and REC:<name> (a word) are too similar, and an
> attempt has been made to semi-standardize them.
>
> RECOGNIZER would be clearer as RECOGNIZER-STACK

too. Have a look at http://amforth.sourceforge.net/Recognizers.html
for the details.

Matthias

Alex

Jan 23, 2017, 2:45:10 PM
On 1/23/2017 13:58, Matthias Trute wrote:
> On Monday, 23 January 2017 14:30:08 UTC+1, Alex wrote:
>> On 1/22/2017 01:42, Alex wrote:
>>> Some recognizer observations.
>>
>> Refer to http://amforth.sourceforge.net/pr/Recognizer-rfc-C.html
>>
>> 3. Having R:FAIL not on the recognizer stack is inflexible. When
>> supporting more than one stack, the same R:FAIL appears as the last
>> implied entry in all of them, and an action like -13 THROW may not be
>> right for every situation. E.g. I may want to support forward
>> definitions on "not found" rather than throw an error.
>
> R:FAIL is never on the stack, R:FAIL is a result of the actions.

My bad; I meant as a result of the recognizer stack. Let me play some
more with recognizers & I may come back to this one.

> That R:FAIL is used to execute some actions is truly the
> "last resort" like an outermost exception catcher.
>
> adding a "not found" hook or a forward definition resolver as a
> recognizer is easily possible.

Yes, I see that.

>
>> 4. Varying stack item counts from REC:<name> words.
>>
>> This one has caused me a lot of grief while debugging.
>>
>> : rec:num ( addr u -- d r:num | r:fail )
>
> I'm sure you mean this stack effect(s)
>
> : rec:num ( addr u -- n r:num | d r:dnum | r:fail)
>
> This word is part of the informal appendix, there are
> three different recognizers for each number format
> (single, double cell numbers and the 'x' notation)
> available at (among others)
>
> http://theforth.net/package/recognizers/current-view/rec-num.4th
>
> that have a clean stack effect.

No, I meant ( addr u -- d r:num | r:fail ), where success has 3 stack
items and failure only the failure flag; all the REC:<name> words have
that feature. Whether it's a problem longer term I can't say, but it did
cost me a number of hours while debugging a stack runaway & memory
access issues some distance from the code. I was erroneously returning (
0 r:fail ) in some -- not all -- cases.
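
One way to catch this class of bug early is a wrapper that checks the stack depth on the failure path. The following is only a sketch: `demo:fail` is a hypothetical stand-in for R:FAIL, and `checked-rec`, `good-rec` and `bad-rec` are made-up names.

```forth
\ Debugging aid (sketch): verify that a recognizer's failure path leaves
\ exactly one item, the failure token.
create demo:fail 0 ,           \ hypothetical stand-in for R:FAIL

: checked-rec ( addr u xt -- i*x token )
  depth 3 - >r                 \ stack depth below addr u xt
  execute
  dup demo:fail = if
    depth r> - 1 <> abort" failure path left extra stack items"
  else
    r> drop                    \ success path: not checked in this sketch
  then ;

: good-rec ( addr u -- demo:fail )    2drop demo:fail ;
: bad-rec  ( addr u -- 0 demo:fail )  2drop 0 demo:fail ;  \ the bug described above

s" xyz" ' good-rec checked-rec drop    \ passes silently
\ s" xyz" ' bad-rec checked-rec        \ would abort with the message
```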

>> 5. Names.
>>
>> I haven't found a use for R>COMP R>INT or R>POST. Also, the names are a
>> little suspect; why not match NAME>X and have R>INTERPRET R>COMPILE and
>> R>POSTPONE (if they are required).
>
> NAME>x require a name token which may be not available (infact
> IMHO only very few forth's currently can have one. IMHO). There is
> currently a discussion with the gforth people about the DT>x words
> (as they are called now), maybe it's time to disclose the version
> 4 of the RFD. It addresses the name confusion like
>
>>
>> R:<name> (a table) and REC:<name> (a word) are too similar, and an
>> attempt has been made to semi-standardize them.
>>
>> RECOGNIZER would be clearer as RECOGNIZER-STACK
>
> too. Have a look at http://amforth.sourceforge.net/Recognizers.html
> for the details.
>

Ah, thanks. I've been working off V3.

> Matthias
>


--
Alex

Bernd Paysan

Jan 23, 2017, 7:59:11 PM
On Sun, 22 Jan 2017 14:42:53 +0000, Anton Ertl wrote:

> OTOH, such programs probably are not fit for postpone state anyway, and
> even if the standard would say that only -1 means compile state,
> and the program was written as
>
> state @ case
> 0 of int-state-stuff endof
> -1 of comp-state-stuff endof
> true abort" unknown STATE"
> endcase

No user-defined word sees postpone state, as the interpreter in postpone
state postpones everything, until it finds [[.

So only the interpreter loop itself sees postpone state.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
net2o ID: kQusJzA;7*?t=uy@X}1GWr!+0qqp_Cn176t4(dQ*
http://bernd-paysan.de/

Anton Ertl

Jan 24, 2017, 11:37:05 AM
Bernd Paysan <be...@net2o.de> writes:
>On Sun, 22 Jan 2017 14:42:53 +0000, Anton Ertl wrote:
>
>> OTOH, such programs probably are not fit for postpone state anyway, and
>> even if the standard would say that only -1 means compile state,
>> and the program was written as
>>
>> state @ case
>> 0 of int-state-stuff endof
>> -1 of comp-state-stuff endof
>> true abort" unknown STATE"
>> endcase
>
>No user-defined word sees postpone state, as the interpreter in postpone
>state postpones everything, until it finds [[.

Good point. Certainly the code occurring between ]] and [[ would not
see the unusual state.

However, the (user-defined) code for the postpone part of a recognizer
would see it. But that is new code that should know about the
existence of postpone state (if we implement it as a state) and know
how to deal with that. I also have trouble imagining an accidental
way in which a recognizer could call a state-smart word that would
produce something useful in interpretation and compilation state, and
fail in postpone state. So, as it stands, -2 as postpone state would
be acceptable to me, despite the exhaustive specification in
Forth-94/2012.

Bernd Paysan

Jan 24, 2017, 4:48:47 PM
On Tue, 24 Jan 2017 16:23:24 +0000, Anton Ertl wrote:
> However, the (user-defined) code for the postpone part of a recognizer
> would see it.

Yes, but that code shouldn't be state-aware by design...

JennyB

Jan 27, 2017, 6:05:25 PM
From version 4:

For simple use cases (literals) it's possible to automatically
convert this approach into the 3-method API (Anton Ertl and
Bernd Paysan):
: rec-methods {: literal-xt final-xt -- interpret-xt compile-xt postpone-xt :}
final-xt
:noname literal-xt compile, final-xt ]] literal compile, ; [[ dup >r
:noname literal-xt compile, r> compile, postpone ;
;

With that command, the standard number recognizer can be
rewritten as
\ numbers
:NONAME ; \ final-action do nothing
' LITERAL \ literal-action
rec-methods RECOGNIZER: DT:NUM

Anton Ertl writes in comp.lang.forth:

If you define recognizers through these components, you
don't need to specify the three components, in particular
not a POSTPONE action; and yet POSTPONEing literals works as
does any other POSTPONEing of recognizers. With that, one
might leave it up to systems whether they support
POSTPONEing recognizers or not.

Disadvantage: Does not combine with doing the dictionary
look-up as a recognizer for immediate words:

If you make the immediate word a parse-time action with a
noop for literal-like and noop for run-time, it works
correctly for interpretation and compilation, but not for
POSTPONE. And anything else is even further from the desired
behaviour. One could enhance this scheme to support
immediate words correctly, but I don't see a clean way to do
that.

So there seems to be a choice:
1. Compose the behaviour of recognizers of these components,
but do not treat the dictionary as a recognizer.
2. Treat the dictionary as a recognizer, but build recognizers
from interpretation, compilation, and postponing
behaviour.

A complete reference implementation does not exist, many
aspects were published at comp.lang.forth by Jenny Brien.

----

For immediate words everything moves one stage further on: what normally happens at interpretation happens when compiling, and what normally happens at compilation happens while postponing. You get to decide what happens at interpretation, whether the same as compiling, a warning, or something else.

Thus:

: rec-immediate-methods {: interpret-xt | literal-xt final-xt -- interpret-xt compile-xt postpone-xt :}
final-xt
:noname literal-xt compile, final-xt ]] literal compile, ; [[
;

For a recognizer using FIND :

:NONAME DROP EXECUTE ;
' 2LITERAL
:NONAME 1= IF EXECUTE ELSE COMPILE, THEN ;

rec-immediate-methods RECOGNIZER: DT:WORD


For a recognizer using NAME :

:NONAME NAME>INTERPRET EXECUTE ;
' LITERAL
:NONAME NAME>COMPILE EXECUTE ;

rec-immediate-methods RECOGNIZER: DT:WORD

Alex

Jan 29, 2017, 5:54:57 PM
On 1/23/2017 13:30, Alex wrote:
> On 1/22/2017 01:42, Alex wrote:
>> Some recognizer observations.
>
> Refer to http://amforth.sourceforge.net/pr/Recognizer-rfc-C.html
>

6. Search order, wordlists and FIND-NAME

The maintenance of two search orders is problematic, and a lack of
convenience words like ONLY ALSO PREVIOUS etc make it a little harder to
understand what will be found when.

http://www.complang.tuwien.ac.at/anton/euroforth/ef16/papers/ertl-recognizers.pdf
recommends

"While hooking into find has some advantages, the advantages of hooking
into the text interpreter, in particular with respect to backwards
compatibility, outweigh them."

I'm not so sure; phrases like

get-recognizers ' single-recognizer -rot 1+ set-recognizers

are not particularly clear without some form of explanation. Support
words would be useful.

7. FORTH-RECOGNIZER

Having this as a value to allow multiple recognizer stacks is debatable.
We don't have a way of specifying multiple search order stacks.

The whole area of search orders and recognizer stacks is much messier
than I think it need be. A uniform stack & some way of managing it is
imho required, but it might make the proposal a larger and more
fundamental change than the community is willing to support.



--
Alex

JennyB

Jan 30, 2017, 12:11:33 PM
Posting from Google Groups which doesn't do quoting now :(

Alex wrote:

6. Search order, wordlists and FIND-NAME

The maintenance of two search orders is problematic, and a lack of
convenience words like ONLY ALSO PREVIOUS etc make it a little harder to
understand what will be found when.

Me:
To the first point: wordlists and recognizers have points in common,
but enough differences to make sharing an order difficult. It's natural
for recognizers to be executable, less so for wordlists, and it's not
a great idea to use SET-CURRENT or SEARCH-WORDLIST on a recognizer.
Also, in my view, ticking a recognizer doesn't make much sense.


Alex: Support words would be useful.

Me: They are not too hard to write:

: >REC \ rec recstack -- ; add recognizer to front
DUP >R SWAP >R get-recognizers
R> SWAP 1+ R> set-recognizers ;

: REC< \ rec recstack -- add recognizer to back
DUP >R get-recognizers 1+ R> set-recognizers ;

: <REC \ recstack -- rec ; remove recognizer from front
DUP >R get-recognizers 1- R> ROT >R
set-recognizers R> ;

: REC> \ recstack -- rec ; remove recognizer from back
DUP >R get-recognizers DUP ROLL SWAP 1-
R> ROT >R set-recognizers R> ;

REC:NAME is for an example only, REC:FIND is more immediately useful.

Alex:
7. FORTH-RECOGNIZER

Having this as a value to allow multiple recognizer stacks is debatable.
We don't have a way of specifying multiple search order stacks.

Me:
We do now

REC-STACK ( size -- stack-id ) Create a new recognizer stack with size elements.
Analogous to WORDLIST
See the section Nesting Recognizer Stacks in
http://amforth.sourceforge.net/pr/Recognizer-rfc-D.html

Allowing recognizers of the form

: rec-name rec-stack recognise ;

probably makes management easier, and certainly reduces
the number of items you need on the main stack.
It also allows you to bind a wordlist to a recognizer for the
literals it uses by putting them in the same stack so they
can be added or removed from the recognizer order together.

You could simplify this by having get- and set- only apply
to a fixed main stack - keeping the contents of a rec-stack
constant. If there was only one main stack you could over-ride
its current behaviour by pushing a recognizer that always
succeeds on the front, but it seems cleaner to say:

FORTH-RECOGNIZER myrec-stack TO FORTH-RECOGNIZER

Alex:
The whole area of search orders and recognizer stacks is much messier
than I think it need be. A uniform stack & some way of managing it is
imho required, but it might make the proposal a larger and more
fundamental change than the community is willing to support.

Me:
It's simple to have recognizers replace the number-handling part of the
interpreter (many Forths have this vectored anyway) and that may be enough
for some people. But there are also applications where it is useful to have
recognizers active before the dictionary search, and in that case having
REC:FIND as just another recognizer in a unified recognizer order is the
simplest solution. Legacy custom interpreters that use FIND will still work
without modification (but of course will not do recognitions), which might
not be true if you changed the inner workings of FIND or the search-order words.


Anton Ertl

Jan 30, 2017, 1:09:26 PM
Alex <al...@rivadpm.com> writes:
>On 1/23/2017 13:30, Alex wrote:
>> On 1/22/2017 01:42, Alex wrote:
>>> Some recognizer observations.
>>
>> Refer to http://amforth.sourceforge.net/pr/Recognizer-rfc-C.html
>>
>
>6. Search order, wordlists and FIND-NAME
>
>The maintenance of two search orders is problematic

Yes, combining the recognizer and wordlist search order looks
attractive, but the problem is existing uses of SET-ORDER ONLY etc.
If the programmer has written, say

forth-wordlist 1 set-order

should SET-ORDER add the integer recognizer, or integer and FP
recognizers or more recognizers? Likewise, GET-ORDER should not
return the default recognizers, but would return any additional
recognizers. With the recognizer order as separate entity, SET-ORDER
does not affect the active recognizers, and GET-ORDER does not see
them.

>http://www.complang.tuwien.ac.at/anton/euroforth/ef16/papers/ertl-recognizers.pdf
>recommends
>
>"While hooking into find has some advantages, the advantages of hooking
>into the text interpreter, in particular with respect to backwards
>compatibility, outweigh them."

Hooking into FIND seemed like the logical thing to do for my
"temporary word" approach: FIND would then return the xt of the
temporary word, and FIND-NAME the nt. For the current, more practical
proposal, hooking into FIND cannot work, because the additional data
is passed on the stack, so the stack effect would be wrong (and
stashing the data in some fixed buffer does not cut it, either).

But even for the "temporary word" approach, hooking it into FIND is
not a good idea, as pointed out by Bernd Paysan: FIND is used in
existing user-defined interpreters, and these may not be prepared to
have additional recognizers active.

>I'm not so sure; phrases like
>
>get-recognizers ' single-recognizer -rot 1+ set-recognizers
>
>are not particularly clear without some form of explanation. Support
>words would be useful.

The idea here is probably that GET-ORDER and SET-ORDER are standard
already, and to provide an interface that parallels this interface.
As for convenience words, not even >ORDER has been proposed, and I
expect changes to the search order to be more frequent than changes to
the recognizer order.

That being said, maybe GET-ORDER/SET-ORDER is not so good that we
should make it the model for further interfaces.

Lately I lean towards an interface where we can construct a recognizer
as a sequence of other recognizers in the dictionary, e.g., as follows:

\ we have a recognizer STANDARD-RECOGNIZERS

rec-seq my-recognizers
' standard-recognizers ,
' angle-recognizer ,
end-seq

' my-recognizers is forth-recognizer
... do something with angles ...
' standard-recognizers is forth-recognizer
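
A rough sketch of how such a sequence could be built in standard Forth. Everything here is an assumption made for illustration: the recognizer convention ( addr u -- i*x token | demo-fail ), the tokens `demo-fail` and `demo-angle`, the helper `try-seq`, and the two demo recognizers. It uses Forth-2012 locals and does not hook into FORTH-RECOGNIZER.

```forth
\ Sketch: a recognizer built as a sequence of other recognizers.
create demo-fail 0 ,             \ hypothetical failure token
create demo-angle 0 ,            \ hypothetical success token

: try-seq ( addr u body -- i*x token | demo-fail )
  {: addr u body :}
  body @ 0 ?do                              \ body holds a count, then the xts
    addr u  body cell+ i cells + @ execute
    dup demo-fail <> if unloop exit then    \ first success wins
    drop
  loop demo-fail ;

: rec-seq ( "name" -- count-addr )  create here 0 ,  does> try-seq ;
: end-seq ( count-addr -- )  here over cell+ - 1 cells / swap ! ;

: never-recognizer ( addr u -- demo-fail )  2drop demo-fail ;
: angle-recognizer ( addr u -- n demo-angle | demo-fail )
  s" right-angle" compare 0= if 90 demo-angle else demo-fail then ;

rec-seq my-recognizers
  ' never-recognizer ,
  ' angle-recognizer ,
end-seq

s" right-angle" my-recognizers   \ leaves 90 demo-angle
```

Since a sequence has the same stack effect as a single recognizer, sequences can be nested inside other sequences without further machinery.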

>7. FORTH-RECOGNIZER
>
>Having this as a value to allow multiple recognizer stacks is debatable.
>We don't have a way of specifying multiple search order stacks.

With the GET/SET interface, we don't need that, true. With an
interface like the one above, we would need it.

>The whole area of search orders and recognizer stacks is much messier
>than I think it need be. A uniform stack & some way of managing it is
>imho required, but it might make the proposal a larger and more
>fundamental change than the community is willing to support.

I don't think it's particularly messy. It's just that there are a
bunch of options, each with something to be said for them. If we want
to think them all through and find the best one, you end up in a mess.
But given that there has been no strong movement to replace
GET-ORDER/SET-ORDER with something more convenient, maybe a suboptimal
solution is not a big problem. So let's just go with the one that the
proponent decides on.

Gerry Jackson

Jan 31, 2017, 4:03:33 PM

I've been thinking about implementing recognisers in my system and ISTM
that it should be quite straightforward. My system has this arrangement:

an xt is the address of a pointer to a small block of memory which, for
most words consists of three cells:
- a pointer to an interpreter method
- a pointer to a compile method
- the primitive to be compiled
plus something else for a few special words

This block is shared between different classes of words e.g. all colon
definitions share the same block. I plan to add a postpone method to
these. So this block will then look very much like the 3 xt's that a
DT-TOKEN references (see the Recogniser RfD). When the text interpreter
finds a word in the search order the xt is returned and the interpreter
accesses the appropriate interpret, compile or postpone (in future)
method as appropriate.

Each word list in the search order has an associated find-name method
(currently all the same but designed in so that I could have, for
example, a faster wordlist search if wanted). So the text interpreter
executes each wordlist's version of find-name.

To implement a recogniser I could make a recogniser a special type of
wordlist with its own version of find-name that returned a DT (or FAIL)
for the recogniser instead of an XT (of course I would need to rename
find-name to something more generalised). The text interpreter then
accesses one of the 3 methods as before or tries the next 'wordlist' -
no change is needed.

With this arrangement where a recogniser is a pseudo wordlist SET-ORDER
and GET-ORDER do not need to change. ONLY could set the 'search order'
to the minimal wordset and the basic forth recogniser. A recogniser can
be easily added to either end of the search order but inserting it into
the middle would be more difficult.

So there would be no need for a separate recogniser stack,
GET-RECOGNISERS, SET-RECOGNISERS, FORTH-RECOGNISER, RECOGNISE or
REC-STACK (although a similar word to set a search order stack could be
useful). i.e. much simplified compared to the RfD.

Does all this sound reasonable? Any showstoppers? Dictionary search
words such as FIND etc need to be considered of course.

(I've just realised I've spelled RECOGNISE with an S instead of a Z -
it's the way I was taught!)

--
Gerry

Albert van der Horst

Feb 1, 2017, 6:38:21 AM
In article <o6qtvs$pjk$1...@dont-email.me>,
This sounds very reasonable to me. It looks like ciforth, but in
ciforth it is even simpler. ONLY is a wordlist which defines the
minimum search order and "recognizers" are added to ONLY.

You get almost everything with my STATE-smart IMMEDIATE parsing
prefixes, that the official recognizer proposal has.
Except
1234ABH \ No. The recognizable part must be up front, not at the end.
OK
POSTPONE " \ " is a word, you can postpone it.
\ You're an ace if you understand what happens, though.
POSTPONE "BLA" ? ciforth ERROR # 15 : CANNOT FIND WORD TO BE POSTPONED

I don't find that offensive in a language that forbids this:

IF
IF ? ciforth ERROR # 17 : COMPILATION ONLY, USE IN DEFINITION

You may find you have similar restrictions, so check for that.
>
>(I've just realised I've spelled RECOGNISE with an S instead of a Z -
>it's the way I was taught!)

English spelling versus American.

>
>--
>Gerry

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
