Evaluation task data

Chris Callison-Burch

Jan 12, 2009, 12:20:30 AM

to WM...@googlegroups.com

The data for the evaluation shared task is here:
http://statmt.org/wmt09/evaluation-task.tgz
It includes all primary systems to be scored, along with source and
reference translations.

--Chris

Chris Callison-Burch

Jan 12, 2009, 10:24:14 AM

to WM...@googlegroups.com

I have also released the contrastive systems. These will not be part
of the manual evaluation, but if your metric is quick and easy to run,
you can report scores for them too:
http://statmt.org/wmt09/evaluation-task-contrastive.tgz

--Chris

Chris Callison-Burch

Jan 12, 2009, 1:46:50 PM

to WM...@googlegroups.com

By popular request, the evaluation task data is now available in plain
text format:
http://www.statmt.org/wmt09/evaluation-task-txt.tgz
http://www.statmt.org/wmt09/evaluation-task-contrastive-txt.tgz

--Chris

On Jan 12, 2009, at 1:10 PM, Sebastian Pado wrote:

> Hi Chris,
>
> is the data available in plain text format, too? According to the
> web page, that was
> supposed to be the input format.
>
> Sebastian

Chris Dyer

Jan 12, 2009, 2:19:35 PM

to WM...@googlegroups.com

Hey CCB-
Would you mind having one more metric for the workshop metric
evaluation? Since IBM has been suggesting that (TER-BLEU)/2 is useful
(at least in terms of correlating with HTER), I'm curious to see how
well it will perform with respect to the workshop's evaluation task.
Are you in a position to accept another metric easily?
This doesn't even need to be in the report; I'm just curious about how
well the metric does (not least because I used it as the optimization
criterion), but if you guys are overwhelmed with metric submissions,
I certainly don't want to make more work...

-Chris
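
For readers unfamiliar with the combined metric mentioned above, here is a minimal sketch of (TER-BLEU)/2. It assumes TER and BLEU have already been computed by separate tools and are on the same scale (for example, both as fractions between 0 and 1). The function names and the example scores are hypothetical and are not part of the shared task data. Since TER is an error rate (lower is better) and BLEU is a similarity score (higher is better), a lower combined value indicates a better system.

```python
def ter_minus_bleu(ter: float, bleu: float) -> float:
    """Combine precomputed scores as (TER - BLEU) / 2.

    Assumes both scores are on the same scale (e.g. fractions in [0, 1]).
    Lower is better: TER is an error rate, BLEU rewards n-gram overlap.
    """
    return (ter - bleu) / 2.0


def rank_systems(scores: dict) -> list:
    """Order system names from best to worst under (TER - BLEU) / 2.

    `scores` maps a system name to a (ter, bleu) pair; this layout is
    an assumption made for the sketch.
    """
    return sorted(scores, key=lambda name: ter_minus_bleu(*scores[name]))


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example = {"system-a": (0.50, 0.30), "system-b": (0.60, 0.20)}
    for name in rank_systems(example):
        ter, bleu = example[name]
        print(name, round(ter_minus_bleu(ter, bleu), 4))
```

If the scores are reported as percentages instead, the same code works as long as both inputs use percentages.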
