Description and Objectives
Morphology has been widely studied as a word-level task, although in many languages it interacts in complex, hierarchical ways with other layers of language, such as the phonetic, syntactic or semantic representations of phrase- or sentence-level utterances. The extent and complexity of this relationship, however, remain unknown. The new shared task on multilingual clause-level morphology aims to investigate methods for morphological analysis or generation of different forms in languages with varying typology, where morphosyntactic structure is modeled and aligned at the level of clauses.
The shared task aims to provide a new benchmark that can yield new insights into:
The relationship between morphology and syntax in different languages
How morphosyntactic structure aligns across languages with varying typology
The performance of conventional statistical methods for language modeling or representation learning in learning abstract linguistic features that can generalize across forms and languages
The limitations of conventional methods for morphological or syntactic modeling as well as the specifications required for developing more comprehensive and theoretically complete models of language
Languages
The shared task will initially include six languages from different language families and with varying morphological characteristics: English, French, German, Hebrew, Russian and Turkish. We anticipate the extension of the benchmark to include more languages as time and resources become available.
Tasks
The shared task consists of three sub-tasks.
Task 1: Inflection
In this task the input is a verbal lemma (the form given as a lexicon entry) and a specific set of inflectional features. The task is to generate the output clause that manifests those features.
Examples
Task 2: Reinflection
In this task the input is an inflected clause, accompanied by its features, and a new set of features representing the desired form. The task is to generate the output clause that realizes the new features.
Examples
Task 3: Analysis
This task is the inverse of Task 1: a system is required to analyze a given clause and generate the lemma and features underlying it.
Examples
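Taken together, the three tasks can be read as input/output mappings over the same kind of record. The Python sketch below is only an illustration of that reading: the class names, the field names, the feature tags ("FUT", "1", "SG") and the English clause are all invented for this sketch and are not taken from the released data or its annotation scheme.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class InflectionExample:
    """Task 1: (lemma, features) -> clause. Hypothetical record layout."""
    lemma: str
    features: FrozenSet[str]
    clause: str


@dataclass(frozen=True)
class ReinflectionExample:
    """Task 2: (clause, its features, new features) -> new clause."""
    source_clause: str
    source_features: FrozenSet[str]
    target_features: FrozenSet[str]
    target_clause: str


def to_analysis(example: InflectionExample) -> Tuple[str, Tuple[str, FrozenSet[str]]]:
    """Task 3: swap the roles of Task 1, so the clause is the input and
    the (lemma, features) pair is the expected output."""
    return example.clause, (example.lemma, example.features)


# Invented English example; the tags are placeholders, not the official schema.
inflection = InflectionExample(
    lemma="walk",
    features=frozenset({"FUT", "1", "SG"}),
    clause="I will walk",
)

reinflection = ReinflectionExample(
    source_clause="I will walk",
    source_features=frozenset({"FUT", "1", "SG"}),
    target_features=frozenset({"PST", "1", "SG"}),
    target_clause="I walked",
)

analysis_input, analysis_output = to_analysis(inflection)
```

Under this reading, a single annotated clause can serve as a Task 1 item and, with input and output swapped, as a Task 3 item; Task 2 pairs two such clauses that share a lemma.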
Participation
Interested parties are invited to join the mailing list at participants-mc...@googlegroups.com to be involved in the competition.
All participating systems will be evaluated together with our baselines against the same held-out test set, to be released shortly before evaluation. Submitted systems can compete in some or all sub-tasks.
Participating teams will be invited to submit a short paper describing their work to the MRL workshop and to present it in a special session at the workshop.
Important Dates
May 16, 2022: Release of training and development data
July 20, 2022: Release of testing data
July 30, 2022: Deadline for submission of systems
August 15, 2022: Release of rankings and results
September 7, 2022: Deadline for submitting system description papers
Evaluation
System outputs will be evaluated using standard metrics from morphological analysis and inflection, including exact-match accuracy and precision, recall and F1, as well as metrics for generated text, such as edit distance.
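For orientation, the metrics named above can be sketched in a few lines of Python. This is a minimal sketch and not the official scoring script: the function names are made up here, edit distance is assumed to be character-level Levenshtein distance, and precision, recall and F1 are assumed to be computed over sets of predicted versus gold feature labels.

```python
from typing import Sequence, Set, Tuple


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that are string-identical to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance with unit costs."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at cost 0
            ))
        previous = current
    return previous[-1]


def feature_prf(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Precision, recall and F1 over two sets of feature labels."""
    overlap = len(predicted & gold)
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1
```

Exact match and edit distance apply most naturally to the generated clauses of Tasks 1 and 2, while the set-based scores fit the feature bundles predicted in Task 3; how the organizers actually combine them is not specified here.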
Organizers
Omer Goldman, Bar Ilan University
Reut Tsarfaty, Bar Ilan University
Djame Seddah, University Paris-Sorbonne
Benjamin Muller, University Paris-Sorbonne
Hila Gonen, University of Washington and Meta AI
Jamshidbek Mirzakhalov, Salesforce
Kelechi Ogueji, University of Waterloo
Francesco Tinner, University of Zurich
Duygu Ataman, New York University
Contact
---------------
On behalf of the MRL 2022 Organizers,
Asst. Prof. Gözde Gül Şahin,
Koç University, KUIS AI Fellow
Istanbul/Turkey