3rd call, training data ready: UniDive shared tasks on idiomaticity and multiword expressions

Carlos Ramisch

Sep 30, 2025, 4:19:31 PM
to naacl-lat...@googlegroups.com


**3rd CALL FOR PARTICIPATION**

Two peas in a pod: PARSEME 2.0 and AdMIRe 2.0 multilingual UniDive shared tasks

on idiomaticity and multiword expressions

https://unidive.lisn.upsaclay.fr/doku.php?id=other-events:parseme-admire-st-call#call_for_participation

Expression of interest: https://forms.gle/rwSfUmNR1sTsHDfx6 

====================================================================

The UniDive COST Action is happy to announce the AdMIRe 2.0 and PARSEME 2.0 shared tasks, dedicated to detecting and interpreting idiomaticity and multiword expressions (MWEs). MWEs are groups of words that have non-compositional semantics, i.e. their meanings cannot be straightforwardly deduced from the meanings of their components. For instance, "a bad apple" is a person who has a bad influence on others.

Both shared tasks will take place together. We hope to co-organise the workshop with the SIGLEX-MWE section and co-locate it with EACL 2026 in Morocco (24-29 March 2026), but this is still to be confirmed.

The participating teams are to submit the results of their systems on CodaBench. The submission links will be published at the same time as the test data.

We are delighted to confirm that UniDive will provide funding for selected system presenters. 


Important dates

-----------------


  • [1 OCTOBER] Training data and baseline systems released

  • [3 DECEMBER] Publication of blind test data

  • [8 DECEMBER] Submission of system predictions

  • [19 DECEMBER] Systems evaluated

  • [5 JANUARY] Submission deadline for system description papers

  • [9-23 JANUARY] Reviewing period (system teams will participate as reviewers)

  • [3 FEBRUARY] Submission deadline for camera-ready papers

  • [24-29 MARCH 2026] EACL, including the MWE workshop (to be confirmed)

PARSEME 2.0 is a shared task whose main objective is to identify and paraphrase multiword expressions (MWEs) in written text. We propose two subtasks. Subtask 1 is the classical task of identifying MWEs in running text. Subtask 2 consists in paraphrasing a sentence containing an MWE so as to remove idiomaticity. Data annotation is finished and 17 languages are covered: Dutch, Egyptian (ca. 2700-2000 BC), French, Georgian, Greek (Ancient), Greek (Modern), Hebrew, Japanese, Latvian, Persian, Polish, Portuguese (Brazilian), Romanian, Serbian, Slovene, Swedish, and Ukrainian.

AdMIRe 2.0 (Advancing Multimodal Idiomaticity Representation) addresses the challenge of multilingual and multimodal idiomatic language understanding by evaluating how well models interpret potentially idiomatic expressions (PIEs) across languages and across modalities, using both text and images. This new edition extends the AdMIRe 1 task by adding more languages from the UniDive network and beyond. Given a context sentence containing a PIE and a set of five images, the task is to rank the images based on how accurately they depict the meaning of the PIE as used in that sentence. The task will be zero-shot for newly introduced languages. While the task is designed to encourage participation from teams working on multilingual and multimodal technologies, it also accommodates approaches that focus on only a subset of the languages and on a single modality (text): automatically generated descriptive captions accompany each image, allowing models to rely exclusively on text input if desired.


Data

-----------------


Organizing team

-----------------


PARSEME 2.0 :

  • Manon Scholivet, Université Paris Saclay, LISN, FR

  • Takuya Nakamura, Université Paris Saclay, LISN, FR

  • Agata Savary, Université Paris Saclay, LISN, FR

  • Éric Bilinski, Université Paris Saclay, LISN, FR

  • Carlos Ramisch, Aix-Marseille Université, LIS, FR


AdMIRe 2.0 :



