Hi Aarne,
yes, we agree on 1.-3.,
> 1. data/fun distinction should be kept, since it makes a difference in how
> pattern matching in def definitions is interpreted
> 2. RGL should switch into data everywhere, since in this way one could define
> transfer by pattern matching on the RGL functions
> 3. data should be allowed to be split to different modules (as it is now for
> the judgement form 'data f : T')
and on 3. not only because of Extend, as I thought, but also because
of Lexicon, as you said. (But on 2. see the Remark below).
Besides that, I only now realised the following distinction of functions in the
abstract syntax:
a) transfer functions f : tree -> tree, which do not appear in parse trees:
fun f : A -> C with def f pat1 = t1 | ... | f patn = tn
but *without* linearization in concrete grammars
b) computable f : tree -> tree that may appear in parse trees:
fun f : A -> C with def f pat1 = t1 | ... | f patn = tn
but *with* linearization (lin f) in concrete grammars
Those under a) (like active2passive, digits2numeral, aggregate) remind me of
Chomsky's transformational grammar, in so far as he claims that relations between
constructions are an important part of natural language grammars, similar to
equivalences between formulas in predicate logic. Maybe Chomsky made a stronger
claim and wanted to use active2passive to save writing passive constructions.
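A minimal sketch of a function of kind a), just to fix ideas. The module name,
the categories, and the constructors Act and Pass are hypothetical and not taken
from the RGL; only the name active2passive comes from the list above.

```gf
abstract TransferSketch = {
  cat
    S ; NP ; V2 ;
  data
    Act  : NP -> V2 -> NP -> S ;   -- Act subj v obj : active clause
    Pass : NP -> V2 -> NP -> S ;   -- Pass subj v agent : passive clause
  fun
    active2passive : S -> S ;      -- kind a): meant to get no lin rule in any concrete syntax
  def
    -- the active object becomes the passive subject, the active subject the agent
    active2passive (Act subj v obj) = Pass obj v subj ;
  -- Lexicon (hypothetical)
  data
    John, Mary : NP ;
    Love : V2 ;
}
```

As long as no concrete syntax linearizes active2passive, it cannot appear in
parse trees. With this grammar loaded, "pt -compute (active2passive (Act John
Love Mary))" should normalize to "Pass Mary Love John".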
Those under b) give us a (restricted?) way to linearize by case
distinction over the abstract syntax of type A in the sense of
lin (f a) = case a of { pat1 => lin t1 | ... |
patn => lin tn | x => (lin f)(lin x) }
*if* we use
pt -compute (f a) | linearize
rather than
linearize (f a) = (lin f)(lin a),
the default for linearizing (f a). Maybe your idea with UseComp can be treated this way.
What I had in mind was related to pronouns: these can *often* be treated as if
they were full noun phrases, but *not always*.
1. If we consider personal and possessive pronouns as forms of a single Pron,
then UsePron : Pron -> NP only works for the substantive usage of NPs. The
(det/adjectival) possessive usage of NPs is expressed by the Saxon genitive:
PossNP : NP -> CN -> NP -- Bill's car | a young girl's sweet dreams |
-- the greengrocers' apples |
-- the greengrocer's apostrophe ("trouser's reduced")
As there is no genitive of pronouns in Eng, the (det/adjectival) possessive
usage of Prons has to be expressed by special forms:
PossPron : Pron -> CN -> NP -- his car | her sweet dreams | their apples
So I had in mind that the (linearization of) UsePron : Pron -> NP could omit
the possessive forms of (lincat Pron), while the rare(?) possessive usage would
need to have access to them. So, the general rule should be
fun Poss : NP -> CN -> NP ;
def Poss (UsePron p) cn = PossPron p cn | Poss np cn = PossNP np cn ;
Note 1: this is a variation of "def F (G1 a1) = f1 a1 | F (G2 a2) = f2 a2" of
my previous message, with a final else-case: F a = f a. But in contrast to
what I said there, f1, f2 and f are not auxiliary constructors, but (probably)
data constructors, while F is auxiliary. Done this way, F is just a transfer
function, a case of a) above, and will not appear in parse trees (bad).
Note 2: a slightly different implementation is made in PronNPGer below.
Identifying Poss with PossNP, we can give PossNP a linearization to handle the
default (using np.s!Gen) and give it a computation rule (without else-case for
the default) to handle exceptions:
fun PossNP : NP -> CN -> NP ;
def PossNP (UsePron p) cn = PossPron p cn ; -- exception
This makes PossNP an instance of b) above: it appears in parse trees of a
non-pronoun np (good), but without pt -compute it still incorrectly uses the
genitive form of a personal pronoun (bad), and it can no longer be a data
constructor (bad).
Remark: If all constructions (fun F) in the RGL are turned into (data F), we
cannot add computation rules (def F pat) to add corrections for special cases.
This may be an argument against turning all fun to data. A way out would be to
improve an existing (data F) of the RGL by adding a new (fun F') with
(def F' pat) for exception-cases, and use F in the else-case of F'.
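The way out mentioned in the Remark could look roughly as follows. This is only
a sketch under assumptions: the module name and the wrapper PossNP' are
hypothetical, and PossNP is assumed to have already become a data constructor.

```gf
abstract PossWrapSketch = {
  cat
    Det ; CN ; Pron ; NP ;
  data
    DetCN    : Det -> CN -> NP ;
    UsePron  : Pron -> NP ;
    PossPron : Pron -> CN -> NP ;
    PossNP   : NP -> CN -> NP ;     -- assumed to be data here, so no def rules for it
  fun
    PossNP' : NP -> CN -> NP ;      -- hypothetical wrapper (the F' of the Remark)
  def
    PossNP' (UsePron p) cn = PossPron p cn ;  -- exception case
    PossNP' np cn          = PossNP np cn ;   -- else-case: fall back to the data constructor
}
```

The wrapper then has the same shape as Poss above: it is a function of kind a)
unless it is also given a linearization.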
2. As in the RGL, one can use a flag isPron : Bool in (lincat NP) to *modify* the
NP-linearization for a Pron, but one cannot *exclude* a Pron from being
used in a rule intended for non-pron NPs. [The correct basic distinction is
between Pron and NonPron, and NP is just a simplifying category for (Pron |
NonPron) that saves duplicating common rules, with a price paid by admitting
any np in non-common rules.] Using abstract functions of type b) above reduces
this price, I think.
We sometimes have to implement special cases, which do not necessarily
have to do with access to rarely used information in linearizations.
For example, even if Pron is just the personal pronoun, ditransitive
verbs want to do a "pronoun switch":
besides the acc < (to-)dat ordering
I gave (the book:acc) (to the girl:dat) [ComplAccDatV3]
I gave (it:acc) ((to) her:dat)
one can say
I gave (the girl:dat) (the book:acc) [ComplDatAccV3]
I gave (her:dat) (the book:acc)
but it is wrong to say (with unstressed "it")
* I gave (the girl:dat) (it:acc)
* I gave (her:dat) (it:acc)
I'm not absolutely sure about the details, but at least the (it:acc) must be put
before a dative object without "to". If we had a ternary VP-construction
fun ComplDatAccV3 : V3 -> NP -> NP -> VP,
for putting the indirect object (without "to") before the direct one, one
could implement the switching of the acc-pronoun in front by a computation
rule
def ComplDatAccV3 v np (UsePron q) = ComplAccDatV3 v (UsePron q) np
and use the default-behaviour of ComplDatAccV3 only for non-pronoun direct objects.
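Written out as a small abstract syntax, the rule could be sketched like this.
The module name and the lexicon entries are hypothetical, and ComplAccDatV3 is
assumed to be a ternary data constructor taking the accusative object first:

```gf
abstract PronSwitchSketch = {
  cat
    V3 ; NP ; Pron ; VP ;
  data
    UsePron       : Pron -> NP ;
    ComplAccDatV3 : V3 -> NP -> NP -> VP ;  -- v acc dat : give (it:acc) ((to) her:dat)
  fun
    ComplDatAccV3 : V3 -> NP -> NP -> VP ;  -- v dat acc : give (her:dat) (the book:acc)
  def
    -- a pronoun as direct object is moved in front of the indirect object
    ComplDatAccV3 v np (UsePron q) = ComplAccDatV3 v (UsePron q) np ;
  -- Lexicon (hypothetical)
  data
    give_V3 : V3 ;
    book_NP, girl_NP : NP ;
    it_Pron, she_Pron : Pron ;
}
```

With this grammar, "pt -compute (ComplDatAccV3 give_V3 girl_NP (UsePron
it_Pron))" should yield "ComplAccDatV3 give_V3 (UsePron it_Pron) girl_NP",
whereas a tree with book_NP as direct object matches no equation and should stay
as it is.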
To implement this pronoun switch in GF, using the isPron flags in the
NP-linearizations, is a bit cumbersome. GF embeds first one of the objects, by
Slash2V3 : V3 -> NP -> VPSlash ; -- give it (to her)
Slash3V3 : V3 -> NP -> VPSlash ; -- give (it) to her
and later the other one, by
ComplSlash : VPSlash -> NP -> VP ; -- love it
Even though the lincat of the vps:VPSlash constructed in the first step
remembers whether the direct or the indirect object is still missing, it has
glued the string np1.s of the np1 it got into the string in vps.s (as far as I
remember), and hence, np1.s cannot be replaced by np2.s in (ComplSlash vps np2).
[Sorry, this may be wrong since you use an nn-field in vps, but something like this
happened with objects of embedded infinitives I needed to move around,
or it had to do with remembering in vps whether np1 was a pronoun.]
Two years ago, I had implemented a similar pronoun-switch in VerbGer, using a field
nn : Agr => Str * ... * Str
that kept inserted objects/complements in separate fields
<refl or pron, other np, pp, compls, obj. of infinitives, embedded infinitives>
It can be done, but adding one or two boolean flags in (lincat NP) (isPron,
isHeavy, isDefinite, ...) considerably increased the size of the compiled grammar
(too many Slash(2|3)V3 rules, I think), and finally made compilation
impossible on my notebook.
I wonder whether constructions with computation rules would lead to an easier
implementation, as sketched under 1. above. In the RGL case, one would need
patterns that look deeply into the argument vp of
PredVP : NP -> VP -> Cl
to see whether this vp was constructed using (UsePron p) as np-argument of
embedded (Compl np vp)s and (Slash*V3 v np)s.
Other examples of such exceptions might be to exclude misuses of (PredVP np
vp) in which the vp contains a reflexive that does not agree with the np --
but no, this cannot be blocked on the abstract level.
Summary: writing grammar rules by case distinction on the abstract syntax is
to some extent possible in GF, and maybe useful for exceptional cases. (I
always missed this.) [All this needs "pt -compute", as explained in b) and
shown in the PronNPGer example below.]
The price, you may say, is non-compositionality: the linearization of a tree t
is then not computed from the linearization of t's *direct* subtrees, but from
the linearizations of *indirect* subtrees (or trees built from those) instead.
Sorry, this got a bit long, to be as precise as is needed.
Hans
P.S. The command "pt -transfer=f tree" internally used -compute to go down to
subtrees that match one of the patterns of "def f pat1 = t1 | ... | f patn =
tn". The tree t resulting from a parse typically does not match these patterns:
t may be an Utt, but active2passive : Cl -> Cl applies to clauses. Even if we
are given a tree t, we don't want to insert f in front of suitable subtrees by
hand and then use -compute.
------------------------- Example ------------------------------- 10.3.2021 HL
--------------------------------PronNP.gf ------------------------------------
abstract PronNP = {
flags
startcat = NP ;
cat
Det ; CN ; Pron ; NP ;
data
DetCN : Det -> CN -> NP ;
UsePron : Pron -> NP ;
PossPron : Pron -> CN -> NP ;
fun
PossNP : NP -> CN -> NP ; -- default: des Sohnes Hund |
def -- *meiner Hund
PossNP (UsePron p) cn = PossPron p cn ; -- exception: mein Hund
-- Lexicon
data
Der : Det ;
Hund, Sohn : CN ;
Ich : Pron ;
}
--------------------------------PronNPGer.gf ---------------------------------
concrete PronNPGer of PronNP = {
lincat
NP = { s : Case => Str } ;
Pron = { s : Case => Str ; poss : AForm => Str } ;
CN, Det = { s : Case => Str } ; -- simplified
lin
DetCN det cn = { s = \\c => det.s!c ++ cn.s!c } ;
UsePron p = { s = p.s } ; -- p as personal pronoun
PossPron p cn = { s = \\c => p.poss ! (AF c) ++ cn.s ! c } ;
PossNP np cn -- default : possessive genitive, except for np=pron
= { s = \\c => np.s ! Gen ++ cn.s ! c } ;
Der = { s = table { Nom => "der" ; Gen => "des" ;
Dat => "dem" ; Acc => "den" } };
Ich = { s = table { Nom => "ich" ; Gen => "meiner" ;
Dat => "mir" ; Acc => "mich" };
poss = table AForm { AF Nom => "mein" ; AF Gen => "meines" ;
AF Dat => "meinem" ; AF Acc => "meinen" }
} ;
Hund = { s = table { Nom|Acc => "Hund" ; Gen => "Hundes" ;
Dat => "Hunde" } } ;
Sohn = { s = table { Nom|Acc => "Sohn" ; Gen => "Sohnes" ;
Dat => "Sohne" } } ;
param
Case = Nom | Gen | Dat | Acc ;
AForm = AF Case ; -- short for: AF Gender Number Case
}
--------------- Parsing, linearization, and tree normalization ----------------------
PronNP> p -tr "der Sohn" | l -table
DetCN Der Sohn
s Nom : der Sohn
s Gen : des Sohnes
s Dat : dem Sohne
s Acc : den Sohn
PronNP> p -tr "ich" | l -table
UsePron Ich
s Nom : ich
s Gen : meiner
s Dat : mir
s Acc : mich
PronNP> p -tr "des Sohnes Hund" | l -table
PossNP (DetCN Der Sohn) Hund
s Nom : des Sohnes Hund
s Gen : des Sohnes Hundes
s Dat : des Sohnes Hunde
s Acc : des Sohnes Hund
-- incorrect input, accepted by default construction:
PronNP> p -tr "meiner Hund" | l -table
PossNP (UsePron Ich) Hund
s Nom : meiner Hund
s Gen : meiner Hundes
s Dat : meiner Hunde
s Acc : meiner Hund
-- correct input, accepted by special construction:
PronNP> p -tr "mein Hund" | l -table
PossPron Ich Hund
s Nom : mein Hund
s Gen : meines Hundes
s Dat : meinem Hunde
s Acc : meinen Hund
-- incorrect input, translated to correct output by tree normalization (pt -compute):
PronNP> p -tr "meiner Hund" | pt -compute -tr | l -table
PossNP (UsePron Ich) Hund
PossPron Ich Hund
s Nom : mein Hund
s Gen : meines Hundes
s Dat : meinem Hunde
s Acc : meinen Hund
-- normalization changes indirect subtrees (only):
PronNP> p -tr "meiner Sohnes Hundes Sohn" | pt -compute -tr | l -table
PossNP (PossNP (PossNP (UsePron Ich) Sohn) Hund) Sohn
PossNP (PossNP (PossPron Ich Sohn) Hund) Sohn
s Nom : meines Sohnes Hundes Sohn
s Gen : meines Sohnes Hundes Sohnes
s Dat : meines Sohnes Hundes Sohne
s Acc : meines Sohnes Hundes Sohn
-- normalization leaves non-possessive use of pronouns intact:
PronNP> p -tr "ich" | pt -compute -tr | l -table
UsePron Ich
UsePron Ich
s Nom : ich
s Gen : meiner
s Dat : mir
s Acc : mich
----------------------------- Example. ----------------------------------------