Evaluation task data

Chris Callison-Burch

Jan 12, 2009, 12:20:30 AM
to WM...@googlegroups.com
The data for the evaluation shared task is here:
http://statmt.org/wmt09/evaluation-task.tgz
It includes all primary systems to be scored, along with the source and
reference translations.

--Chris

Chris Callison-Burch

Jan 12, 2009, 10:24:14 AM
to WM...@googlegroups.com
I have also released the contrastive systems. These will not be part
of the manual evaluation, but if your metric is quick and easy to run,
you can report scores for them too:
http://statmt.org/wmt09/evaluation-task-contrastive.tgz

--Chris

Chris Callison-Burch

Jan 12, 2009, 1:46:50 PM
to WM...@googlegroups.com
By popular request, the evaluation task data is now available in plain
text format:
http://www.statmt.org/wmt09/evaluation-task-txt.tgz
http://www.statmt.org/wmt09/evaluation-task-contrastive-txt.tgz

--Chris

On Jan 12, 2009, at 1:10 PM, Sebastian Pado wrote:

> Hi Chris,
>
> is the data available in plain text format, too? According to the
> web page, that was
> supposed to be the input format.
>
> Sebastian

Chris Dyer

Jan 12, 2009, 2:19:35 PM
to WM...@googlegroups.com
Hey CCB-
Would you mind having one more metric for the workshop metric
evaluation? Since IBM has been suggesting that (TER-BLEU)/2 is useful
(at least in terms of correlating with HTER), I'm curious to see how
well it will perform with respect to the workshop's evaluation task.
Are you in a position of being able to accept another metric easily?
This doesn't even need to be in the report- I'm just curious about how
well the metric does (not least because I used it as the optimization
criterion), but if you guys are overwhelmed with metrics submissions,
I certainly don't want to make more work...

-Chris
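
For readers who want to try the combined metric mentioned above, here is a minimal sketch of (TER-BLEU)/2. It assumes that TER and BLEU have already been computed by separate tools and are on the same scale (both 0-1 or both 0-100). Since TER is an error rate and BLEU is a quality score, a lower combined value is better. The function names and the example scores are hypothetical and are not taken from the workshop data.

```python
from typing import Dict, List, Tuple


def ter_minus_bleu(ter: float, bleu: float) -> float:
    """Return (TER - BLEU) / 2. Both scores must use the same scale."""
    return (ter - bleu) / 2.0


def rank_systems(scores: Dict[str, Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Sort systems from best (lowest combined score) to worst.

    `scores` maps a system name to its (TER, BLEU) pair.
    """
    combined = {name: ter_minus_bleu(ter, bleu) for name, (ter, bleu) in scores.items()}
    return sorted(combined.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical scores on a 0-1 scale, for illustration only.
    example = {"system_a": (0.55, 0.25), "system_b": (0.60, 0.20)}
    for name, value in rank_systems(example):
        print(f"{name}\t{value:.3f}")
```

Keeping the combination separate from the TER and BLEU computations means any existing scorer can supply the two inputs.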