Updated Hungarian test set, some additional info


Philipp Koehn

Dec 11, 2008, 5:52:36 AM
to Fourth Workshop on Statistical Machine Translation (WMT09)
Hi,

we just updated the Hungarian test set with a slightly corrected version,
so please get the new version if you are working on Hungarian-English.
All the other test sets are unchanged.

The following questions came up frequently:

* Is it possible to submit multiple systems?

Yes and no. We can score multiple systems with automatic metrics,
but we cannot include more than one system per site in the manual
evaluation. If you submit multiple systems, please identify one as
the primary submission for the manual evaluation.

* In what format should translations be provided?

Please submit a file that can be used with standard scoring tools
such as the mteval-11b.perl BLEU tool from NIST. If you are not
able to produce this, please submit a plain text file with one
segment per line.
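
As a minimal sketch of the fallback format, assuming the translations are already held in a Python list in test-set order (the function name write_segments is hypothetical, not part of any official tooling):

```python
def write_segments(segments, path):
    """Write one translated segment per line, keeping test-set order."""
    with open(path, "w", encoding="utf-8") as out:
        for seg in segments:
            # Collapse all internal whitespace (including stray newlines)
            # so that every segment occupies exactly one line.
            out.write(" ".join(seg.split()) + "\n")
```

Line i of the resulting file then corresponds to segment i of the test set, so empty translations should still be written as empty lines.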

* In which format should the n-best lists for system combination be?

We do not prescribe a fixed format, but it should be clear from the
files how they are structured. A good format would be, for instance,
an XML file with the following structure:
<tstset ...>
  <doc docid...>
    <seg id="1">
      <hyp id="1" score="...">
        <words> This is a test translation </words>
      </hyp>
      <hyp id="2" score="...">
        <words> This is the test translation </words>
      </hyp>
      [...]
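
A file with this structure could be produced roughly as follows. This is only a sketch under assumptions: the function name build_nbest, the "setid" attribute, the document id and the score values are made up for illustration, and the element and attribute names simply follow the example above.

```python
import xml.etree.ElementTree as ET

def build_nbest(nbest, tstset_attrs, docid):
    """Build the n-best XML tree for one document.

    nbest: one list per segment, each holding (score, text) pairs,
           best hypothesis first.
    """
    root = ET.Element("tstset", tstset_attrs)
    doc = ET.SubElement(root, "doc", {"docid": docid})
    for seg_id, hyps in enumerate(nbest, start=1):
        seg = ET.SubElement(doc, "seg", {"id": str(seg_id)})
        for hyp_id, (score, text) in enumerate(hyps, start=1):
            hyp = ET.SubElement(seg, "hyp",
                                {"id": str(hyp_id), "score": str(score)})
            # ElementTree escapes &, < and > in the text on serialization.
            ET.SubElement(hyp, "words").text = f" {text} "
    return root

# Illustrative input only: the scores and ids below are invented.
root = build_nbest(
    [[(-4.2, "This is a test translation"),
      (-4.7, "This is the test translation")]],
    {"setid": "example-set"},  # hypothetical attribute
    "example-doc",             # hypothetical document id
)
xml_text = ET.tostring(root, encoding="unicode")
```

Whatever layout you choose, keeping segment and hypothesis ids explicit like this makes the file self-describing.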

Happy translating,

Chris Callison-Burch
Danilo Giampiccolo
Philipp Koehn
Christof Monz
Josh Schroeder
WMT 08 Workshop and Shared Task organizers
