It would be unfortunate if someone's system performed poorly merely
because they didn't lemmatize their translation output correctly, but
they otherwise got the correct meaning. Can you provide a lemmatizer
for the source and target language? If not, how can we assume that our
systems have the appropriate lemmas?
In tasks like this, it is evident that certain preprocessing tools
might give a system an advantage, but such tools are never provided.
Instead, you are free to use any resource or tool of your choice. We
have not compared the different lemmatizers available, but if you make
sure that you strip the words down to the basic lemma [no inflection],
all lemmatizers should lead to the same output [the same basic lemmas
with no inflections].
Ravi
My concern is that the system will get a good translation of the word,
but choose the wrong lemma. In this case, the system would be unfairly
penalized.
What semi-automatic technique did the annotators use for
lemmatization?
We don't have any native Spanish-language speakers, so we are not sure
how to make sure we get the appropriate lemmas.
Other participants, do you mind sharing your lemmatization approach?
Thanks,
Joseph
--
You received this message because you are subscribed to the Google Groups "SemEval2010.CrossLingualLexicalSubstitution" group.
To post to this group, send email to clls...@googlegroups.com.
To unsubscribe from this group, send email to clls2010+u...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/clls2010?hl=en.
Good luck,
Pierpaolo
As for the semi-automatic method we used to lemmatize the
translations: we ran TreeTagger and then manually inspected its output
for lemmatization and/or spelling errors.
Best,
Ravi
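
For anyone who wants to try a similar "tag, then check by hand" workflow,
here is a minimal sketch in Python. It is not the organizers' script. It
assumes you have already run TreeTagger yourself and saved its usual
tab-separated output (token, POS tag, lemma, one token per line), in which
TreeTagger writes <unknown> when it has no lemma for a token. The function
names and the sample text are made up for illustration.

```python
# Sketch only: post-process saved TreeTagger output and collect the
# tokens that need a manual look. Function names are hypothetical.

UNKNOWN = "<unknown>"  # marker TreeTagger emits when it has no lemma


def parse_treetagger(output):
    """Parse 'token<TAB>POS<TAB>lemma' lines into (token, pos, lemma) tuples."""
    rows = []
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        rows.append(tuple(parts))
    return rows


def split_for_review(rows):
    """Return (lemmas, needs_review).

    Known lemmas are lowercased and kept. Tokens with an unknown lemma
    fall back to the lowercased surface form and are queued for manual
    inspection.
    """
    lemmas, needs_review = [], []
    for token, _pos, lemma in rows:
        if lemma == UNKNOWN:
            needs_review.append(token)
            lemmas.append(token.lower())
        else:
            lemmas.append(lemma.lower())
    return lemmas, needs_review


# Made-up sample in the assumed format; the POS tags are illustrative.
sample = "Casas\tNC\tcasa\nblorptas\tNC\t<unknown>\n"
lemmas, needs_review = split_for_review(parse_treetagger(sample))
print(lemmas)
print(needs_review)
```

Anything that ends up in needs_review is what you would hand to a Spanish
speaker (or check against a dictionary) before submitting.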