SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context (MCL-WiC) Disambiguation
Final Call for Participation
--------------------------------------
Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC) is the first SemEval task for Word-in-Context disambiguation which tackles the challenge of capturing the polysemous nature of words without relying on a fixed sense inventory in a multilingual and cross-lingual setting.
MCL-WiC is framed as a binary classification task in which participants indicate whether the target word, occurring in two sentences (sentence1 and sentence2), is used with the same meaning (tagged as T for true) or with a different meaning (F for false), either in the same language (multilingual dataset) or across different languages (cross-lingual dataset).
We carried out a large-scale annotation effort covering 5 languages (Arabic, Chinese, English, French and Russian), 9,250 unique lemmas and 22,000 sentence pairs. Training data (multilingual setting, English only), dev data (multilingual setting for the 5 languages) and test data (multilingual setting for the 5 languages; cross-lingual setting from English to the other 4 languages) are available.
Multilingual subtask: sentence1 and sentence2 are in the same language, either Arabic, Chinese, English, French or Russian. Given the sentence pair:
1) la souris mange le fromage
2) le chat court après la souris
participating systems are asked to determine whether the target word souris is used with the same meaning (T) in the two sentences or not (F).
Cross-lingual subtask: sentence1 is in English and sentence2 is either in Arabic, Chinese, French, or Russian. Given the sentence pair:
1) click the right mouse button
2) le chat court après la souris
participants are asked to determine whether the target word mouse and its translation souris are used with the same meaning (T) in the two sentences or not (F).
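To make the T/F framing of the two subtasks concrete, here is a minimal Python sketch. The class and field names are hypothetical and do not reflect the official data format (see the GitHub page for that). The gold labels are our own reading of the two example pairs above, and plain accuracy is assumed as the scoring measure purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WiCInstance:
    """One sentence pair. Field names are illustrative, not the official schema."""
    lemma: str      # target word as it appears in sentence1
    lang1: str      # language code of sentence1, e.g. "fr" or "en"
    lang2: str      # language code of sentence2
    sentence1: str
    sentence2: str

    @property
    def is_cross_lingual(self) -> bool:
        # Multilingual subtask: both sentences share a language.
        # Cross-lingual subtask: the two languages differ.
        return self.lang1 != self.lang2


def accuracy(gold: List[str], predicted: List[str]) -> float:
    """Fraction of pairs whose predicted tag ("T" or "F") matches the gold tag."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


instances = [
    WiCInstance("souris", "fr", "fr",
                "la souris mange le fromage",
                "le chat court après la souris"),
    WiCInstance("mouse", "en", "fr",
                "click the right mouse button",
                "le chat court après la souris"),
]

# Assumed labels: the animal in both French sentences (T);
# a computer mouse versus the animal (F).
gold = ["T", "F"]

# Trivial baseline that tags every pair as T.
predicted = ["T" for _ in instances]

print(accuracy(gold, predicted))
```

A real system would replace the always-T baseline with a model that compares the contextual meaning of the target word in the two sentences.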
We believe that this task will provide the NLP community with a novel multilingual and cross-lingual evaluation setup that will challenge systems and help drive new results in Natural Language Understanding without the need for a predefined sense inventory.
All participants are encouraged to submit a paper to the SemEval-2021 workshop!
Important dates
Evaluation starts: Now!
Evaluation ends: January 31, 2021
Paper submission due: February 23, 2021
Important links
CodaLab competition: https://competitions.codalab.org/competitions/27054
GitHub page: https://github.com/SapienzaNLP/mcl-wic
Task Organizers
Federico Martelli, Najla Kalach, Gabriele Tola, Roberto Navigli
Sapienza NLP Group, Sapienza University of Rome