Hello everybody,
I would like to suggest a temporary fix to the tag order sensitivity
problem while we wait for a permanent solution. Please forgive me if
anybody has suggested this method before, but I think the idea is new:
1. We introduce a magic tag LINE, maintained by the compiler,
consisting of the *whole* reading line (plus the word form at the
start) as *one* tag, i.e. *without breaking on space*.
2. If a LIST or an on-the-fly definition uses a parenthesised tag group
containing a space, e.g. (Tag1 Tag2), in a rule with the flag TAGORDER,
as in
REMOVE TAGORDER (Tag3) IF (*1 (Tag1 Tag2)) ;
then the group will be converted internally to
/^(.* )?Tag1 Tag2( .*)?$/r.
3. This complex, order-sensitive tag is then checked against the LINE
tag as a normal regex match, as is already done for //r tags.
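To make steps 1-3 concrete, here is a minimal sketch in Python. It is
only an illustration of the idea, not CG-3 code: the function names
(line_tag, tagorder_pattern, matches_in_order) are hypothetical, and I
am assuming that the LINE tag is simply the word form, a space, and the
reading line as written in the cohort.

```python
import re

def line_tag(wordform, reading):
    # Assumption: LINE = word form + space + the whole reading line,
    # kept as one string and never split on spaces.
    return wordform + " " + reading.strip()

def tagorder_pattern(tags):
    # (Tag1 Tag2) becomes ^(.* )?Tag1 Tag2( .*)?$ as in step 2.
    # Tags are escaped so that they are matched literally.
    body = " ".join(re.escape(t) for t in tags)
    return re.compile(r"^(.* )?" + body + r"( .*)?$")

def matches_in_order(tags, line):
    # Step 3: an ordinary regex match against the LINE tag.
    return tagorder_pattern(tags).match(line) is not None

line = line_tag('"<word>"', '"word" Tag1 Tag2 Tag3')
print(matches_in_order(["Tag1", "Tag2"], line))
print(matches_in_order(["Tag2", "Tag1"], line))
```

Note that the pattern from step 2 requires the tags to be adjacent as
well as ordered, so (Tag1 Tag3) would not match the line above.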
This solution should need only very little coding (I hope) for now.
This is, admittedly, an unoptimized hack, but if it works, people can
start writing the order-sensitive rules they need, while we wait for an
integrated and optimized solution later.
In addition to, or instead of, TAGORDER at the rule level, we could also
introduce the concept of a "nonbreaking space character", e.g. ·
(mini-bullet) or double underscore, to allow flexible use of tag order
down at the level of individual contexts: (Tag1·Tag2) or (Tag1__Tag2).
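As a rough sketch of the nonbreaking-space variant, again in Python and
again purely illustrative: a tag containing · or a double underscore is
split into its parts and turned into the same kind of order-sensitive
regex, while ordinary tags are left alone. The name
ordered_context_pattern is hypothetical, and the sketch assumes that
neither separator occurs inside a real tag name.

```python
import re

# Assumed separators: the mini-bullet or a double underscore.
SEPARATORS = re.compile(r"·|__")

def ordered_context_pattern(tag):
    # Split Tag1·Tag2 or Tag1__Tag2 into its ordered parts.
    parts = SEPARATORS.split(tag)
    if len(parts) < 2:
        # No separator: an ordinary tag, no order-sensitive regex needed.
        return None
    # Join the escaped parts with a real space and wrap them so that the
    # sequence can occur anywhere in the whole reading line.
    body = " ".join(re.escape(p) for p in parts)
    return re.compile(r"^(.* )?" + body + r"( .*)?$")

pattern = ordered_context_pattern("Tag1·Tag2")
print(bool(pattern.match('"<word>" "word" Tag1 Tag2 Tag3')))
print(bool(pattern.match('"<word>" "word" Tag2 Tag1')))
```

With this variant the order check is switched on per context rather
than per rule, so no TAGORDER flag would be needed.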
Those of you who need tag order, what do you think?
Tino, is my intuition correct that it would not be too hard to turn this
algorithmic idea into code? And what would it cost, speed-wise? Given
that it would be relevant only for some rules, I guess it can't be too bad.
Best
Eckhard
--
Eckhard Bick,
cand.med., dr.phil.
University of Southern Denmark
e-mail: eckhar...@gmail.com
web: http://beta.visl.sdu.dk