On 28-May-20 17:25, Anton Ertl wrote:
> Alex McDonald <al...@rivadpm.com> writes:
>> On 28-May-20 08:25, Anton Ertl wrote:
>>> Alex McDonald <al...@rivadpm.com> writes:
>>>> It will also attempt, in no specific order: to reduce the number of
>>>> words required to implement the proposal; do some significant bike
>>>> shedding around names (for example, removing the ambiguity of RECTYPE
>>>> and avoiding words like or containing NULL);
>>>
>>> Note that the current names are the result of a bikeshedding session
>>> at a Forth 200x meeting. The fact that you are not happy with the
>>> result and that even a committee member who was present has second
>>> thoughts about it shows the pointlessness of discussing names (that's
>>> why it's called bikeshedding).
>>>
>>> And the big problem is that it detracts from more substantial issues.
>>
>> I'm not alone in thinking that describing it as bikeshedding is not
>> helpful. I would find a word like NON_AGNITA obfuscatory when
>> discussing unrecognized tokens, much as I do for this description:
>>
>> RECTYPE-NULL ( -- RECTYPE-NULL ) RECOGNIZER
>> The null data type id. It is to be used if no other data type id is
>> applicable but one is needed.
>
> Nobody proposed NON_AGNITA. However, I am sure that Matthias Trute
I didn't say they did.
> originally did not intend to obfuscate when he called this word
> R:FAIL. Neither did the Forth-200x committee when we agreed on
> RECTYPE-NULL.
So although Matthias did not mean to obfuscate with R:FAIL, the
committee decided to change it to RECTYPE-NULL. Was it not clear enough?
Did it cause confusion?
I seem to remember yours was the dissenting voice but others saw fit to
change it. Ergo, if I am taking this forward (see the foot of this post)
then I propose to do the same, and you will continue to point out that
they are bikeshed plans. I'll have to live with it.
> What makes you think that you will find a name that
> nobody finds obfuscatory? I am sure you will produce another set of
> names that lots of people disagree with. And the end result will be
> that the whole discussion becomes more and more confusing, because old
> contributions to the discussion become incomprehensible.
I proposed some time back UNRECOGNIZED for RECTYPE-NULL. Let's try it:
===
A system provided data type information is called RECTYPE-NULL. It is
used if no other one [sic] is applicable.
There is a system provided data type named UNRECOGNIZED. It is returned
by the system if the parsed word is not recognized by any recognizer.
===
RECOGNIZE ( addr len rec-seq-id -- i*x RECTYPE-DATATYPE | RECTYPE-NULL )
RECOGNIZER
Apply the string at "addr/len" to the elements of the recognizer set
identified by rec-seq-id. Terminate the iteration if either a parsing
word returns a data type id that is different from RECTYPE-NULL or the
set is exhausted. In this case return RECTYPE-NULL.
RECOGNIZE ( addr len <rec-seq-id> -- i*x <rectype> | UNRECOGNIZED )
RECOGNIZER
Apply the string at "addr len" to the elements of the recognizer set
identified by <rec-seq-id>. Terminate the iteration if a parsing word
returns a <rectype> that is different from UNRECOGNIZED. If the set
identified by <rec-seq-id> is exhausted, return UNRECOGNIZED.
===
RECTYPE-NULL ( -- RECTYPE-NULL ) RECOGNIZER
The null data type id. It is to be used if no other data type id is
applicable but one is needed. Its associated methods perform system
specific error actions. The actual numeric value is system dependent.
UNRECOGNIZED ( -- <rectype> ) RECOGNIZER
The <rectype> returned to the system by a recognizer when it fails to
recognize the string in the parse area. The UNRECOGNIZED <rectype> is a
system-specific opaque value.
===
REC-NT ( addr len -- NT RECTYPE-NT | RECTYPE-NULL )
REC-NT ( addr len -- NT RECTYPE-NT | UNRECOGNIZED )
===
As you might expect, I find the word UNRECOGNIZED much clearer, as it
says exactly what the outcome has been: the string is unrecognized.
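
To make the reworded RECOGNIZE entry concrete, here is a minimal sketch
in standard Forth. It is an illustration under stated assumptions, not
the reference implementation: the recognizer set is assumed to be a
counted cell array (a count followed by one xt per cell), a failing
recognizer is assumed to leave nothing but UNRECOGNIZED on the stack,
and REC-X, REC-NEVER, RECTYPE-DEMO and DEMO-SEQ are hypothetical names
invented for the example.

```forth
\ Sketch only. Everything except RECOGNIZE and UNRECOGNIZED is hypothetical.
CREATE UNRECOGNIZED 0 ,   \ opaque <rectype>; only its address matters
CREATE RECTYPE-DEMO 0 ,   \ hypothetical <rectype> used by REC-X

\ Hypothetical recognizer: accepts only the one-character string "x".
: REC-X ( addr len -- 120 RECTYPE-DEMO | UNRECOGNIZED )
  1 = IF
    C@ [CHAR] x = IF 120 RECTYPE-DEMO EXIT THEN
  ELSE
    DROP
  THEN
  UNRECOGNIZED ;

\ Hypothetical recognizer that never matches.
: REC-NEVER ( addr len -- UNRECOGNIZED ) 2DROP UNRECOGNIZED ;

\ Assumed layout of a recognizer set: count, then one xt per cell.
CREATE DEMO-SEQ 2 , ' REC-NEVER , ' REC-X ,

: RECOGNIZE ( addr len rec-seq-id -- i*x <rectype> | UNRECOGNIZED )
  ROT ROT 2>R                          ( seq ) ( R: addr len )
  DUP @ CELLS OVER + CELL+             ( seq end )   \ end = one past last xt
  SWAP CELL+                           ( end ptr )   \ ptr = first xt
  BEGIN 2DUP U> WHILE                  \ while ptr < end
    2R@ 2SWAP                          ( addr len end ptr )
    DUP @ ROT ROT                      ( addr len xt end ptr )
    2>R EXECUTE                        ( i*x rectype ) ( R: addr len end ptr )
    DUP UNRECOGNIZED <> IF             \ first success ends the iteration
      2R> 2DROP 2R> 2DROP EXIT
    THEN
    DROP 2R> CELL+                     ( end ptr' ) ( R: addr len )
  REPEAT
  2DROP 2R> 2DROP UNRECOGNIZED ;       \ set exhausted
```

With these definitions, S" x" DEMO-SEQ RECOGNIZE should leave
120 RECTYPE-DEMO on the stack, and S" y" DEMO-SEQ RECOGNIZE should leave
only UNRECOGNIZED, which a caller can test with a plain comparison. The
string and the loop state are kept on the return stack because the
number of items a successful recognizer leaves (the i*x) is not known
in advance.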
>
>> A "null data type id" is close to word salad
>
> "data type id" is defined earlier, so no, it is not. I find the
"null data type id" IMHO (to be contrasted with your HO) is word salad.
The words are, of course, individually understandable and the phrase
"data type id" gets an explanation. But putting them together does not
ensure that it makes sense as a whole.
> definition of "data type id" suboptimal, however. In any case,
> compare with the version before the renaming
> <http://amforth.sourceforge.net/pr/Recognizer-rfc-C.html>. IIRC the
> committee only produced word names, but Bernd Paysan (who was the
> committee's contact with Matthias Trute) may be able to tell you what
> he wrote to him after that meeting.
I hope Bernd pops up here; it would be useful to know his intentions.
>
>>>> remove the requirement for
>>>> a fixed name REC-NUM REC-FLOAT
>>>
>>> I think that one of the failures of Forth-94 (and 2012) is that there
>>> is no default search order (and that's despite having standard
>>> wordlist names). The problem is that there are cases where a program
>>> wants to change the existing search order, but the result is
>>> system-specific.
>>
>> Given a scenario where I have a recognizer for assembler opcodes that
>> start with V for vector ops, I might wish to have ALSO ASMVEX use this
>> recognizer but only when searching ASMVEX. Is this what you mean?
>>
>> Or is it a more general observation about the search order wordset, a
>> lack of a default order, and repeating its consequent design shortcomings?
>
> I don't remember a scenario, but I remember that I found this a
> hindrance several times over the years.
>
> Your example mixes up recognizers and vocabularies. I meant that this
> was already a problem without recognizers.
I see.
(As an aside, recognizers can do all of what the search order wordset
does, and I remember vaguely a discussion to this effect. I have
associated a recognizer with a wordlist for some experiments, but at
this early stage I can't say whether it works or not. It's certainly not
of interest until we discuss REC-FIND.)
>
> For recognizers, I expect even more such problems. E.g., if I want to
> replace the system's integer recognizer or float recognizer with
> something else (e.g., something that does not accept system-specific
> extensions), I need to know what the recognizer I want to replace is
> called. Or if I want to recreate a recognizer sequence from
> recognizers, I need the names of the recognizers.
>
>> If it's the latter, then yes, it would be possible to specify a default
>> recognizer order; but that assumes a fixed recognizer nameset if there
>> is to be only one list. Perhaps two lists are required; a default list
>> of unnamed recognizers that provide base functionality, and a list of
>> user defined recognizers which is run through first and can replace the
>> default set's token recognition.
>
> That sounds overly complicated.
>
> Currently the standard describes exactly what is recognized: words,
> integers (singles and doubles), floats. What's wrong if we specify
> named recognizers for that?
Will that push us to fixed names for complex numbers (REC-COMPLEX) and
so on?
>
>>> I hope that the recognizer proposal will eventually be tight enough to
>>> avoid this problem. I think we need fixed-name standard recognizers
>>> to achieve this.
>>
>> OK, but it's possible to write a perfectly adequate specification
>> without mentioning them by name.
>
> Maybe, but I am doubtful. But anyway, why would you want to avoid
> standardizing the names of the recognizers that every Forth system
> with recognizers has?
Perhaps the proposal needs to be in two parts; the recognizer itself and
the standard names for system recognizers. It would allow us to focus on
what is important first.
>
>>> I am pretty unhappy about REC-NUM, and would rather prefer that
>>> single-cell and double-cell recognizers be separated, but REC-NUM is
>>> the easiest way to transition from NUMBER?; separate recognizers would
>>> require more substantial refactorings in many Forth systems, possibly
>>> affecting more lines than implementing recognizers themselves. Given
>>> how many years it has taken until, e.g., Stephen Pelc has looked at
>>> recognizers, I am not sure we want to add another five years to get
>>> rid of REC-NUM.
>>
>> There's nothing preventing REC-SNUM and REC-DNUM being factors of
>> REC-NUM if that's what you wish.
>
> The question is what is standardized. If we standardize REC-NUM, we
> probably will not standardize REC-SNUM and REC-DNUM (because there is
> no common practice). If we standardize REC-SNUM and REC-DNUM, there
> is hardly any need for REC-NUM. Ok, you might argue that REC-NUM is
> agnostic to whether the system supports the double-number wordset or
> not, but is there any system that implements Forth-2012 that does not
> support the double-number wordset?
A good argument for splitting the current proposal and making the
question of REC-NUM versus REC-SNUM and REC-DNUM part of a second
proposal.
>
> - anton
>
As to proposing version 5 (or E) of the recognizer proposal, I'm going
to wait a little while to see if Bernd or Ulrich turns up, but I don't
want to let that stop the discussion.
--
Alex