First Call for Participation: LLMs with Limited Resources WMT2025 Shared Task
LLMs with Limited Resources for Slavic Languages @ WMT2025 @ EMNLP2025
Website: https://www2.statmt.org/wmt25/limited-resources-slavic-llm.html
Join our Google Group! https://groups.google.com/g/slavic-llms-mt2025
HuggingFace Collection: https://huggingface.co/collections/tum-nlp/llms-for-slavic-languages-67f3993bf057be6a8d6665ab
This shared task explores how LLMs perform on machine translation (MT) and question answering (QA) jointly, aiming to investigate the synergy between the two tasks under limited data and compute resources. Ukrainian (uk) is a mid-resource language (~40M L1 speakers), while Upper Sorbian (hsb) and Lower Sorbian (dsb) are minority West Slavic languages spoken in Germany (30k and 7k L1 speakers, respectively).
Data Overview
Ukrainian
MT directions: en→uk, cs→uk
QA: Derived from high-school graduation exams (ZNO)
Example training sets:
QA: UNLP2024, ZNO-EVAL, Cohere INCLUDE
Upper Sorbian & Lower Sorbian (two separate tracks)
MT directions: de→hsb, de→dsb
QA: Multiple-choice questions based on actual CEFR-based language certification exams (A1–C1 levels)
We will prepare the following resources:
Parallel & monolingual corpora via Witaj-Sprachzentrum and Leipzig Corpora Collection;
Data from previous WMT low-resource tracks (2020–2022);
QA tasks adapted from language certification exams at different levels.
Submission Guidelines
Models must produce both MT & QA outputs for the chosen language(s);
Submissions are language-specific; submit to one or multiple language tracks;
Participants may use only one of the following base models, each limited to a maximum of 3B parameters:
Quantized or Unsloth variants from HuggingFace collections
Key Dates (AoE)
Registration: open now! Join our Google Group at https://groups.google.com/g/slavic-llms-mt2025
Training/dev data release: Late April
Test data release: Late June
Submission deadline: Early July
System description deadline: Late July
Final workshop: 5th–9th November @ EMNLP 2025 in Suzhou, China!
Organisers
TUM Heilbronn:
Daryna Dementieva
Marion di Marco
Lukas Edman
Alexander Fraser
Kathy Hämmerl
Shu Okabe
Witaj-Sprachzentrum:
Beate Brězan
Anita Hendrichowa
Marko Měškank
Tomaš Šołta
Acknowledgements
We thank the UNLP 2024 Shared Task team (Roman Kyslyi, Mariana Romanyshyn, Oleksiy Syvokon) for kindly sharing Ukrainian QA resources.