Dear Organizers,
Our team is working on systems for the Shared Task, and we have encountered two issues with the data.
- The first issue concerns the Russian data. A number of samples appear to be ungrammatical or meaningless (the examples below are from the dev set). For example:
(ungrammatical)
блокировать; IND;PRS;NOM(3,SG,FEM);ACC(3,SG,MASC);DAT(3,SG,NEUT);INS(3,SG,MASC);
она блокирует его ему ему.
or
(meaningless)
блокировать; IND;PRS;NOM(3,SG,MASC);NEG;Q;ACC(2,PL);AT+ABL(2,SG);INS(3,SG,MASC);
не блокирует ли он вас от тебя ему?
- The second issue concerns the evaluation of model outputs for languages with relatively free word order, such as Turkish or Russian.
For example, in Russian:
блокировать; IND;PST;NOM(1,SG,NEUT);Q;ACC(3,SG,MASC);AT+ABL(3,PL);INS(RFLX)
could be realized as either:
блокировало ли я его от них собой?
or
блокировало ли я его собой от них?
Or, in Turkish:
Türkçeleştirmek; INFR;PST;PRSP;NOM(2,PL);NEG;Q;ACC(1,PL)
could be realized as either:
bizi Türkçeleştirmiş olmayacak mıydınız?
or
Türkçeleştirmiş olmayacak mıydınız bizi?
How would such outputs be evaluated? Is only the reference word order counted as correct, or are other valid orders accepted as well? We would appreciate your clarification.
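To make the question concrete, here is a minimal sketch of two possible scoring schemes applied to the Russian example above. The function names (`exact_match`, `bag_of_words_match`) and the whitespace tokenization are assumptions made purely for illustration; they are not taken from the task's actual evaluation script.

```python
from collections import Counter


def tokenize(sentence: str) -> list[str]:
    # Hypothetical tokenization: drop final punctuation, split on whitespace.
    return sentence.rstrip("?.!").split()


def exact_match(pred: str, gold: str) -> bool:
    # Strict scoring: tokens must match in the same order.
    return tokenize(pred) == tokenize(gold)


def bag_of_words_match(pred: str, gold: str) -> bool:
    # Order-insensitive scoring: same tokens with the same counts, any order.
    return Counter(tokenize(pred)) == Counter(tokenize(gold))


gold = "блокировало ли я его от них собой?"
pred = "блокировало ли я его собой от них?"

print(exact_match(pred, gold))         # False
print(bag_of_words_match(pred, gold))  # True
```

Under strict matching the second word order would be scored as an error, while an order-insensitive comparison would accept it.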
Thank you,