2nd call and trial data release: UniDive shared tasks on idiomaticity and multiword expressions


Aline Villavicencio

Sep 17, 2025, 3:40:54 PM
to Women in Machine Learning

**2nd CALL FOR PARTICIPATION**

Two peas in a pod: PARSEME 2.0 and AdMIRe 2.0 multilingual UniDive shared tasks on idiomaticity and multiword expressions


https://unidive.lisn.upsaclay.fr/doku.php?id=other-events:parseme-admire-st-call#call_for_participation

Expression of interest: https://forms.gle/rwSfUmNR1sTsHDfx6 

====================================================================

The UniDive COST Action is happy to announce the AdMIRe 2.0 and PARSEME 2.0 shared tasks, dedicated to detecting and interpreting idiomaticity and multiword expressions (MWEs). MWEs are groups of words that have non-compositional semantics, i.e. their meanings cannot be straightforwardly deduced from the meanings of their components. For instance, a "bad apple" is a person who has a bad influence on others.

Both shared tasks will take place together. We hope to co-organise the workshop with the SIGLEX-MWE section and co-locate it with EACL 2026 in Morocco (24-29 March 2026), but this is still to be confirmed.

We are delighted to confirm that UniDive will provide funding for selected system presenters. 


Important dates

-----------------

  • [1 OCTOBER] Training data and baseline systems released

  • [3 DECEMBER] Publication of blind test data

  • [8 DECEMBER] Submission of system predictions

  • [19 DECEMBER] Systems evaluated

  • [5 JANUARY] Submission deadline for system description papers

  • [9-23 JANUARY] Reviewing period (system teams will participate as reviewers)

  • [3 FEBRUARY] Submission deadline for camera-ready papers

  • [24-29 MARCH 2026] EACL, including the MWE workshop (to be confirmed)

PARSEME 2.0 is a shared task whose main objective is to identify and paraphrase multiword expressions (MWEs) in written text. We propose two subtasks: Subtask 1 is the classical task of identifying MWEs in running text; Subtask 2 consists of paraphrasing a sentence containing an MWE so as to remove idiomaticity. Data annotation is ongoing and at least 17 languages are expected to be covered: Dutch, Egyptian (ca. 2700-2000 BC), French, Georgian, Greek (Ancient), Greek (Modern), Hebrew, Japanese, Latvian, Persian, Polish, Portuguese (Brazilian), Romanian, Serbian, Slovene, Swedish, and Ukrainian. Other languages that might also participate are Albanian, Italian, and Lithuanian.

AdMIRe 2.0 (Advancing Multimodal Idiomaticity Representation) addresses the challenge of multilingual and multimodal idiomatic language understanding by evaluating how well models interpret potentially idiomatic expressions (PIEs) across languages and across modalities, using both text and images. This new edition extends the AdMIRe 1 task by adding more languages from the UniDive network and beyond. Given a context sentence containing a PIE and a set of five images, the task is to rank the images based on how accurately they depict the meaning of the PIE as used in that sentence. The task will be zero-shot for newly introduced languages. While the task is designed to encourage participation from teams working on multilingual and multimodal technologies, it also accommodates approaches that focus on only a subset of the languages or on a single modality (text): automatically generated descriptive captions for each image allow models to rely exclusively on text input if desired.


Data

-----------------

Organizing team

---------------


PARSEME 2.0:

  • Manon Scholivet, Université Paris-Saclay, LISN, FR

  • Takuya Nakamura, Université Paris-Saclay, LISN, FR

  • Agata Savary, Université Paris-Saclay, LISN, FR

  • Éric Bilinski, Université Paris-Saclay, LISN, FR

  • Carlos Ramisch, Aix-Marseille Université, LIS, FR


AdMIRe 2.0:

