System Combination Tuning/Test Data Release

Josh Schroeder

22.12.2008, 15:48:39
to WM...@googlegroups.com
Hi all,

The data necessary to participate in the System Combination Task for
WMT09 is available at:

http://www.statmt.org/wmt09/system-combination-task.html

We received 30 n-best outputs from translation task participants --
thanks to those of you who helped out!

The newstest2009 data supplied to the translation task has been split
in two: newssyscomb2009 (502 segments, 25 documents), and the remaining
2525 segments in 111 documents, which now officially make up
newstest2009. We have released source data for newstest2009 and
source/reference data for newssyscomb2009 for tuning system weights.
1-best data is provided for all entrants, and n-best outputs are
provided as supplemental data where available. More information is
available in a README in the data download.

System combination output should be emailed to jsch...@inf.ed.ac.uk
by JANUARY 5, 2009.

Output should match the format used in the translation task: recased,
detokenized XML format - just as in most other translation campaigns
(NIST, TC-Star). Output should contain only the current newstest2009
portion of the data: 2525 segments in 111 documents.
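
A quick sanity check before submitting can catch output that still
contains the newssyscomb2009 segments. The sketch below is only an
illustration, not an official validation tool: it assumes the output
uses NIST-style <doc ...> and <seg ...> tags, and the function names
are made up for this example. It simply counts opening tags and
compares them with the 111 documents and 2525 segments stated above.

```python
import re

# Counts taken from the announcement: the current newstest2009 portion.
EXPECTED_DOCS = 111
EXPECTED_SEGS = 2525


def count_docs_and_segs(text):
    """Count opening <doc ...> and <seg ...> tags, case-insensitively.

    Assumption: documents and segments are marked with NIST-style
    <doc> and <seg> tags. This is a plain tag count, not an XML parse.
    """
    docs = len(re.findall(r"<doc[\s>]", text, flags=re.IGNORECASE))
    segs = len(re.findall(r"<seg[\s>]", text, flags=re.IGNORECASE))
    return docs, segs


def check_submission(text, expected_docs=EXPECTED_DOCS,
                     expected_segs=EXPECTED_SEGS):
    """Return True if the tag counts match the expected totals."""
    docs, segs = count_docs_and_segs(text)
    return docs == expected_docs and segs == expected_segs
```

To use it, read your output file into a string and pass it to
check_submission; a False result means the document or segment count
differs from the expected totals.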

You are allowed to use any external data you like for language
modeling or other features, but please indicate if you are using only
data provided for the translation task (constrained) or are using
other resources (unconstrained).

We have already had one participant express interest in doing multi-
source-to-English translation using the supplied data. Please feel
free to provide "xx-en" entries, and indicate this when submitting
your output. We will evaluate these entries as part of the All-English
human evaluation task. (Shameless plug: if you are interested in multi-
source, you might want to check out my paper "Word Lattices for Multi-
Source Translation" at the EACL main conference!)

Please let me know if you have any questions about submitting results.

Happy Holidays,
Josh

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
