We are happy to announce the release of the test data for our shared task! Please check out the GitHub repo:
The test sets for both Sorbian languages are released in the same fashion:
- for MT, the test file in each language's MT folder has a test prefix (test.de-{hsb|dsb}.de) and contains the German source sentences
- for QA, the five test files (available in CSV and JSON formats) are in the test folder within each language's QA folder, with a test_ prefix (e.g., test_{HSB|DSB}_A1.{csv|json}). They contain the same A1-B2 level questions, plus an additional C1 level file.
The test sets for Ukrainian are in their respective folders:
- for MT, test.{cs|en}_uk.jsonl
- for QA, test.json
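As a quick sanity check after downloading, the naming patterns above can be expanded into the list of file names one would expect to find. The following is a minimal sketch, not an official script: the function names are hypothetical, the level list assumes that "A1-B2" expands to A1, A2, B1, B2 (which, with C1, matches the five files mentioned), and folder paths are left out because they depend on the repo layout.

```python
from itertools import product

SORBIAN = ("hsb", "dsb")
# Assumption: "A1-B2" expands to the four levels A1, A2, B1, B2; C1 is the extra file.
QA_LEVELS = ("A1", "A2", "B1", "B2", "C1")
QA_FORMATS = ("csv", "json")


def sorbian_mt_files():
    """German source files for the Sorbian MT test sets."""
    return [f"test.de-{lang}.de" for lang in SORBIAN]


def sorbian_qa_files():
    """QA test files for both Sorbian languages, in both formats."""
    return [
        f"test_{lang.upper()}_{level}.{ext}"
        for lang, level, ext in product(SORBIAN, QA_LEVELS, QA_FORMATS)
    ]


def ukrainian_files():
    """MT and QA test files for Ukrainian."""
    mt = [f"test.{src}_uk.jsonl" for src in ("cs", "en")]
    return mt + ["test.json"]


if __name__ == "__main__":
    for name in sorbian_mt_files() + sorbian_qa_files() + ukrainian_files():
        print(name)
```

Comparing this list against the contents of your local copy is a simple way to spot a missing or misnamed file before running your system.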
Note that the MT datasets come from the General MT Shared Task (WMT), so their format differs from that of the development sets.
We will release the details of the submission process shortly. Meanwhile, we wish you all good luck with the development of your systems!
Best wishes,
Shared Task Organizers