Kevin Brubeck Unhammer
Jun 9, 2017, 6:45:38 AM
to constrain...@googlegroups.com, Linda Wiechetek, Sjur Nørstebø Moshagen
In the Divvun grammar checker project, we have input
"<jierpmálaš>"
	"jierpmálaš" A Sg Nom &syn-super-part2
and want to write a rule like
COPY (Superl &SUGGEST) TARGET (A &syn-super-part2) IF (NOT 0 (&SUGGEST));
where the expected output is
"<jierpmálaš>"
	"jierpmálaš" A Superl &SUGGEST Sg Nom &syn-super-part2
What are the heuristics for placement of the new tags here? I can't seem
to make them go anywhere except at the end; e.g. the actual output of
the above rule is
"<jierpmálaš>"
	"jierpmálaš" A Sg Nom &syn-super-part2
	"jierpmálaš" A Sg Nom &syn-super-part2 Superl &SUGGEST COPY:4
Is there a simple way to control tag placement? I know it's possible to
do
COPY (Superl Sg Nom &SUGGEST) EXCEPT (Sg Nom) TARGET (A &syn-super-part2) IF (NOT 0 (&SUGGEST));
and get
"<jierpmálaš>"
	"jierpmálaš" A Sg Nom &syn-super-part2
	"jierpmálaš" A &syn-super-part2 Superl Sg Nom &SUGGEST COPY:4
which is pretty much what I want. I don't care about the &-tags, but the
other tags go into an FST where order matters. But since each such rule
has to spell out the number and case tags, two numbers and seven cases
mean 14 COPY rules. Add possessive tags and so on, and it quickly turns
unmaintainable.
I notice that the docs for SUBSTITUTE do specify insertion point ("at the
last removed tag"), so a better workaround might be
COPY (&SUGGEST) TARGET (A &syn-super-part2) IF (NOT 0 (&SUGGEST));
SUBSTITUTE (A) (A Superl) TARGET (A &syn-super-part2 &SUGGEST) IF (NOT 0 (Superl));
but then the rule writer has to juggle a lot of "marker tags" in order
to avoid adding the Superl to irrelevant readings or getting loops.
I know it's not trivial to make heuristics for tag placement here, but
it seems like it could be possible to have a heuristic similar to the
one for SUBSTITUTE, like
place <extra tags> after the last removed tag from <extra tags>,
otherwise at the end of the reading
so that you could write
COPY (A Superl &SUGGEST) EXCEPT (A) TARGET (A &syn-super-part2) IF (NOT 0 (&SUGGEST));
(On the other hand, if people depend on the current behaviour, perhaps new
rule options BEFORE/AFTER <tag> might be a better solution …)
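
To make the proposed heuristic concrete, here is a rough Python sketch of how I imagine it working on a single reading. This is not how CG-3 is implemented; the function name copy_with_placement and the list-of-strings representation of a reading are made up for illustration, and trace tags like COPY:4 are left out.

```python
def copy_with_placement(reading, extra, except_tags=()):
    """Hypothetical sketch of the proposed COPY ... EXCEPT placement.

    reading:     tags of the target reading, in order
    extra:       tags listed in COPY (...), in order
    except_tags: tags listed in EXCEPT (...)
    """
    removed = set(except_tags)

    # Find the position of the last removed tag that is also an extra tag.
    anchor = None
    for i, tag in enumerate(reading):
        if tag in removed and tag in extra:
            anchor = i

    if anchor is None:
        # Fallback (what happens today): drop EXCEPT tags, append extras.
        return [t for t in reading if t not in removed] + list(extra)

    # Otherwise the extra tags take the place of that last removed tag.
    before = [t for t in reading[:anchor] if t not in removed]
    after = [t for t in reading[anchor + 1:] if t not in removed]
    return before + list(extra) + after


reading = ["A", "Sg", "Nom", "&syn-super-part2"]
with_except = copy_with_placement(reading, ["A", "Superl", "&SUGGEST"], ["A"])
without_except = copy_with_placement(reading, ["Superl", "&SUGGEST"])
```

With EXCEPT (A), with_except comes out as A Superl &SUGGEST Sg Nom &syn-super-part2, i.e. the expected output from the top of this mail. Without EXCEPT, without_except has the new tags at the end, same as the current behaviour.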
--
Kevin Brubeck Unhammer