WMT GenMT - alignment script fix


Kocmi T.

4:00 PM
to wmt-...@googlegroups.com
Hi All,

We previously released an outdated version of the alignment script that relied on an incorrect alignment. Here is the updated script for checking the alignment of your translations:
https://github.com/wmt-conference/wmt-collect-translations/blob/main/genmt_check_alignment.py

The alignment this year is trivial: HTML documents are segmented by paragraphs (<p>), while JSON documents are segmented into individual key-value pairs (last year we used double newlines, which are not used this year).
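
For illustration only, here is a minimal sketch of what this segmentation implies for a quick sanity check. It is not the official script linked above; the function names are hypothetical, and it assumes that nested JSON objects are counted by their leaf key-value pairs and that comparing segment counts is enough for a first check.

```python
import json
from html.parser import HTMLParser


class ParagraphCounter(HTMLParser):
    """Counts opening <p> tags (a simplification of paragraph segmentation)."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this handler.
        if tag == "p":
            self.count += 1


def count_html_segments(text):
    """Return the number of <p> elements in an HTML document."""
    parser = ParagraphCounter()
    parser.feed(text)
    parser.close()
    return parser.count


def count_json_segments(text):
    """Return the number of leaf key-value pairs in a JSON document.

    Assumption: nested objects are descended into; any non-object value
    (string, number, list, ...) counts as one segment.
    """
    def leaves(node):
        if isinstance(node, dict):
            return sum(leaves(value) for value in node.values())
        return 1

    return leaves(json.loads(text))


def same_segment_count(source, translation, fmt):
    """Check that source and translation have the same number of segments."""
    counter = count_html_segments if fmt == "html" else count_json_segments
    return counter(source) == counter(translation)
```

A call such as same_segment_count(source_text, translated_text, "html") returns True when both documents contain the same number of paragraphs. For the actual check, please use the script from the repository.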

While the alignment check is optional, we recommend verifying that your translations pass it, because the alignment is needed for human evaluation.
However, we will run automatic re-alignment on translations that fail the check, to minimize penalization during human evaluation.

We apologize for the inconvenience caused by the previous broken script.

--
Tom Kocmi (he/him)
Staff Researcher, Europe
ko...@cohere.com
