Test Sets are Out


LLMs with Limited Resources for Slavic Languages 2025

Jul 2, 2025, 2:01:57 PM
to LLMs with Limited Resources for Slavic Languages 2025
Dear participants,

We are happy to announce the release of the test data for our shared task! Please check out the GitHub repo:

The updates:
Test sets
Upper and Lower Sorbian

The test sets for both Sorbian languages are released in the same fashion:

  • for MT, the test file is named with a test prefix (test.de-{hsb|dsb}.de), contains the German source sentences, and sits in the respective MT folder
  • for QA, the five test files (available in CSV and JSON formats) are in the test folder of the respective QA folder and carry a test_ prefix (e.g., test_{HSB|DSB}_A1.{csv|json}). They cover the same A1-B2 levels, plus an additional C1-level file.
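To check that a local checkout contains everything, the naming patterns above can be expanded into concrete file names. This is only a sketch: the function name sorbian_test_files is made up for illustration, and the intermediate levels A2 and B1 are an assumption read off from "A1-B2" plus the stated count of five files. Folder paths are left out on purpose, since they depend on the repo layout.

```python
# Hypothetical helper: expands the announced naming patterns into file names.
# Assumption: the five QA levels are A1, A2, B1, B2 and C1.
LEVELS = ["A1", "A2", "B1", "B2", "C1"]
FORMATS = ["csv", "json"]


def sorbian_test_files(lang):
    """Return the expected MT and QA test file names for 'hsb' or 'dsb'."""
    lang = lang.lower()
    if lang not in ("hsb", "dsb"):
        raise ValueError("expected 'hsb' or 'dsb', got %r" % lang)
    # MT: a single file with the German source sentences.
    mt_file = "test.de-%s.de" % lang
    # QA: one file per level and format, with the language code in upper case.
    qa_files = [
        "test_%s_%s.%s" % (lang.upper(), level, fmt)
        for level in LEVELS
        for fmt in FORMATS
    ]
    return {"mt": mt_file, "qa": qa_files}
```

For example, sorbian_test_files("dsb")["qa"] lists ten names (five levels in two formats), which can be compared against the contents of the QA test folder.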
Ukrainian

The test sets for Ukrainian are in their respective folders:

  • for MT, test.{cs|en}_uk.jsonl
  • for QA, test.json
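Since the Ukrainian MT test files are JSON Lines, a generic reader along these lines should be enough to get started. It is a minimal sketch that assumes one JSON object per line and makes no assumption about the field names inside each record; please inspect the files for the actual keys. The helper names read_jsonl and ukrainian_mt_test_name are hypothetical.

```python
import json


def ukrainian_mt_test_name(source_lang):
    """Build the announced file name for a source language ('cs' or 'en')."""
    if source_lang not in ("cs", "en"):
        raise ValueError("expected 'cs' or 'en', got %r" % source_lang)
    return "test.%s_uk.jsonl" % source_lang


def read_jsonl(path):
    """Read a JSON Lines file into a list of parsed records."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. at the end of the file
            records.append(json.loads(line))
    return records
```

A typical call would be read_jsonl(ukrainian_mt_test_name("en")), run from the folder that holds the Ukrainian MT test files.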

Note that the MT datasets come from the General MT Shared Task (WMT), so their format differs from that of the development sets.


We will release the details about submissions shortly. Meanwhile, we wish you all good luck with the development of your systems!


Best wishes,

Shared Task Organizers
