SemEval-2021 Task 2: Multilingual and Cross-lingual Word-in-Context (MCL-WiC) Disambiguation


Roberto Navigli

Jan 11, 2021, 6:02:29 AM
to semeval3


Final Call for Participation

--------------------------------------

 

Multilingual and Cross-lingual Word-in-Context Disambiguation (MCL-WiC) is the first SemEval task on Word-in-Context disambiguation to tackle the challenge of capturing the polysemous nature of words, without relying on a fixed sense inventory, in a multilingual and cross-lingual setting.

 

MCL-WiC is framed as a binary classification task in which participants indicate whether the target word, occurring in two sentences (sentence1 and sentence2), is used with the same meaning (tagged as T for true) or with a different meaning (F for false), either within the same language (multilingual dataset) or across different languages (cross-lingual dataset).
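As a rough illustration of this binary framing, the sketch below scores a list of predicted T/F tags against gold tags. The function name accuracy, and the choice of plain accuracy as the score, are assumptions made for illustration only; the official data format and evaluation procedure are the ones documented on the CodaLab and GitHub pages.

```python
from typing import Sequence

# T = same meaning in both sentences, F = different meaning.
VALID_TAGS = {"T", "F"}


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of sentence pairs whose predicted tag matches the gold tag.

    Hypothetical helper for illustration; not the official scorer.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("no sentence pairs to score")
    for tag in list(gold) + list(predicted):
        if tag not in VALID_TAGS:
            raise ValueError(f"unexpected tag: {tag!r}")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)
```

Rejecting any tag other than T or F up front keeps a malformed prediction file from being silently scored as a string of wrong answers.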


We carried out a large annotation effort covering 5 languages (Arabic, Chinese, English, French and Russian), 9,250 unique lemmas and 22,000 sentence pairs. Training data (multilingual setting, English only), dev data (multilingual setting, all 5 languages) and test data (multilingual setting for all 5 languages; cross-lingual setting from English to the other 4 languages) are available.


Multilingual subtask: sentence1 and sentence2 are in the same language, either Arabic, Chinese, English, French or Russian. Given the sentence pair:

 

1)     la souris mange le fromage

2)     le chat court après la souris

 

participating systems are asked to determine whether the target word souris is used with the same meaning in the two sentences (T) or not (F).

 

Cross-lingual subtask: sentence1 is in English and sentence2 is either in Arabic, Chinese, French, or Russian. Given the sentence pair:

 

1)     click the right mouse button

2)     le chat court après la souris

 

participating systems are asked to determine whether the target word mouse and its translation souris are used with the same meaning in the two sentences (T) or not (F).
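For participants setting up their own pipeline, the two example pairs above could be held in a small record like the one sketched here. The class name WicPair, its field names and the spans helper are all hypothetical and do not reflect the released file format (see the GitHub page for that); the helper simply locates the first occurrence of each target word by string search.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class WicPair:
    """One sentence pair. All field names are hypothetical."""
    target1: str
    sentence1: str
    target2: str
    sentence2: str

    def spans(self) -> Tuple[Tuple[int, int], Tuple[int, int]]:
        """Character offsets (start, end) of each target word.

        Uses the first occurrence of the target in its sentence and
        raises ValueError if the target does not occur at all.
        """
        s1 = self.sentence1.index(self.target1)
        s2 = self.sentence2.index(self.target2)
        return (s1, s1 + len(self.target1)), (s2, s2 + len(self.target2))


# Multilingual example: both sentences are in French.
multilingual = WicPair(
    "souris", "la souris mange le fromage",
    "souris", "le chat court après la souris",
)

# Cross-lingual example: sentence1 in English, sentence2 in French.
cross_lingual = WicPair(
    "mouse", "click the right mouse button",
    "souris", "le chat court après la souris",
)
```

Keeping the target word separate for each sentence is what lets the same record cover the cross-lingual case, where the target in sentence2 is a translation rather than the same string.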

 

We believe that this task will provide the NLP community with a novel and useful multilingual and cross-lingual evaluation setup, one that will challenge systems and encourage new results in Natural Language Understanding without the need for a predefined sense inventory.

 

All participants are encouraged to submit a paper to the SemEval-2021 workshop!

 

Important dates

 

Evaluation starts: Now!

Evaluation ends: January 31, 2021

Paper submission due: February 23, 2021

 

Important links

 

CodaLab competition: https://competitions.codalab.org/competitions/27054

GitHub page: https://github.com/SapienzaNLP/mcl-wic

 

Task Organizers

 

Federico Martelli, Najla Kalach, Gabriele Tola, Roberto Navigli

Sapienza NLP Group, Sapienza University of Rome

http://nlp.uniroma1.it

 

--
=====================================
Roberto Navigli
Full Professor
Department of Computer Science
Sapienza University of Rome
Viale Regina Elena, 295b
00161 Roma Italy
Phone: +39 0649255161 - Fax: +39 06 49918301
Home Page: http://wwwusers.di.uniroma1.it/~navigli
Sapienza NLP Group: http://nlp.uniroma1.it
Co-founder of Babelscape srl
=====================================