BLEU/NIST Scores for Submissions

Philipp Koehn

Dec 25, 2008, 1:55:01 AM
to Fourth Workshop on Statistical Machine Translation (WMT09)

Dear Participants,

Below please find a preliminary release of the automatic scores
for your submissions (both primary and contrastive).

Please note that:

* the official metric of the WMT'09 shared task is human sentence ranking

* runs by Google, the RBMT systems, and possibly others used additional
  resources and are included for comparison only

* the test set on which these scores are computed excludes about 500
  sentences that are used for system combination tuning.

If you notice any problems, please let us know.

Regards,
Philipp Koehn
Chris Callison-Burch
Josh Schroeder

cz-en ===
    bleu  cased  nist  cased
    21.18 20.22  6.823 6.608 [ratio=0.94] google (primary)
    19.51 18.51  6.197 5.998 [ratio=1.03] uedin (primary)
    16.37 15.02  5.842 5.536 [ratio=1.02] cu-bojar (primary)
de-en ===
    bleu  cased  nist  cased
    21.30 20.23  6.854 6.650 [ratio=0.98] google (primary)
    21.03 19.80  6.664 6.431 [ratio=0.98] uka (primary)
    20.83 19.21  6.740 6.419 [ratio=0.93] umd (primary)
    20.20 19.01  6.473 6.243 [ratio=1.00] uedin (primary)
    19.82 18.43  6.393 6.111 [ratio=0.99] stuttgart (primary)
    19.31 17.59  6.350 6.022 [ratio=0.99] liu (primary)
    19.02 18.01  6.443 6.237 [ratio=0.93] rwth (primary)
    18.86 17.17  6.402 6.078 [ratio=0.97] systran (primary)
    18.20 17.27  6.256 6.062 [ratio=0.90] rwth (contrastive2)
    18.14 17.19  6.289 6.087 [ratio=0.90] rwth (contrastive3)
    18.02 16.87  6.162 5.913 [ratio=1.00] rwth (contrastive1)
    17.79 16.79  6.269 6.061 [ratio=0.89] rwth (contrastive4)
    17.47 16.51  6.216 6.016 [ratio=0.90] rwth (contrastive5)
    17.18 15.87  5.981 5.714 [ratio=1.04] rbmt3 (primary)
    17.04 15.63  6.059 5.749 [ratio=1.03] rbmt2 (primary)
    16.64 15.44  5.894 5.636 [ratio=1.05] usaar (primary)
    15.53 14.20  5.646 5.363 [ratio=1.05] rbmt4 (primary)
    14.22 13.09  5.301 5.068 [ratio=1.07] rbmt1 (primary)
    10.00 09.07  4.879 4.646 [ratio=0.99] geneva (primary)
    06.81 05.64  4.903 4.253 [ratio=0.95] jhu-tromble (primary)
en-cz ===
    bleu  cased  nist  cased
    14.24 13.20  5.175 4.961 [ratio=1.01] cu-bojar (primary)
    13.86 12.85  5.110 4.900 [ratio=1.01] cu-bojar (contrastive-b)
    13.59 13.13  4.964 4.840 [ratio=1.08] google (primary)
    13.55 13.02  5.039 4.895 [ratio=1.00] uedin (primary)
    10.01 09.30  4.360 4.201 [ratio=1.01] cu-bojar (contrastive-c)
    09.51 09.10  4.381 4.258 [ratio=1.00] eurotranxp (primary)
    09.42 08.89  4.335 4.191 [ratio=1.04] pctrans (primary)
    07.29 06.95  4.173 4.032 [ratio=0.96] cu-tectomt (primary)
en-de ===
    bleu  cased  nist  cased
    15.23 14.76  5.533 5.420 [ratio=1.00] uedin (primary)
    15.09 14.56  5.595 5.480 [ratio=1.00] uka (primary)
    14.68 14.20  5.357 5.246 [ratio=1.03] google (primary)
    13.71 13.04  5.350 5.180 [ratio=1.01] liu (primary)
    13.63 13.24  5.511 5.408 [ratio=0.95] rwth (primary)
    13.49 13.12  5.448 5.348 [ratio=0.94] rwth (contrastive1)
    13.43 13.06  5.475 5.372 [ratio=0.93] rwth (contrastive2)
    13.43 13.05  5.502 5.400 [ratio=0.94] rwth (contrastive3)
    13.35 12.97  5.081 4.986 [ratio=1.05] rbmt2 (primary)
    13.16 12.80  5.477 5.378 [ratio=0.94] rwth (contrastive4)
    12.49 11.58  5.062 4.821 [ratio=1.04] stuttgart (primary)
    11.92 11.55  4.796 4.710 [ratio=1.03] rbmt3 (primary)
    11.64 11.24  4.832 4.711 [ratio=1.08] usaar (primary)
    11.08 10.73  4.691 4.590 [ratio=1.05] rbmt1 (primary)
    10.56 10.22  4.658 4.567 [ratio=1.04] rbmt4 (primary)
en-es ===
    bleu  cased  nist  cased
    27.94 26.80  7.270 7.067 [ratio=1.05] google (primary)
    24.99 23.76  6.944 6.731 [ratio=1.01] uedin (primary)
    24.91 23.21  6.963 6.673 [ratio=1.01] nus (primary)
    24.85 23.37  6.963 6.689 [ratio=1.01] talp-upc (primary)
    22.25 21.24  6.830 6.634 [ratio=0.96] rwth (primary)
    20.77 19.79  6.465 6.279 [ratio=1.01] rbmt4 (primary)
    20.20 19.25  6.359 6.155 [ratio=1.02] usaar (primary)
    18.28 17.37  5.793 5.633 [ratio=1.02] rbmt3 (primary)
    14.91 14.16  5.315 5.170 [ratio=1.03] rbmt1 (primary)
en-fr ===
    bleu  cased  nist  cased
    25.55 24.40  7.016 6.826 [ratio=0.98] lium-systran (primary)
    25.32 24.15  6.903 6.709 [ratio=1.02] google (primary)
    24.90 23.90  6.955 6.790 [ratio=0.99] lium-systran (contrastive1)
    24.81 23.82  6.942 6.773 [ratio=0.98] limsi (primary)
    24.78 23.81  6.880 6.714 [ratio=0.99] limsi (contrastive1)
    24.21 23.06  6.753 6.565 [ratio=1.00] uedin (primary)
    24.01 22.99  6.819 6.652 [ratio=0.99] uka (primary)
    23.78 22.14  6.692 6.393 [ratio=1.01] dcu (primary)
    23.37 22.36  6.685 6.515 [ratio=1.00] limsi (contrastive2)
    22.97 21.77  6.700 6.471 [ratio=0.98] systran (primary)
    22.17 21.33  6.668 6.509 [ratio=0.96] rwth (primary)
    21.67 20.65  6.565 6.370 [ratio=0.98] systran (contrastive)
    18.68 17.76  6.156 5.979 [ratio=0.99] usaar (primary)
    18.29 17.46  6.119 5.959 [ratio=0.99] rbmt1 (primary)
    18.15 17.25  6.022 5.856 [ratio=0.99] rbmt4 (primary)
    14.81 13.97  5.587 5.391 [ratio=0.94] geneva (primary)
en-hu ===
    bleu  cased  nist  cased
    09.91 09.31  4.484 4.316 [ratio=0.98] uedin (primary)
    08.16 07.79  4.042 3.921 [ratio=1.06] morpho (primary)
es-en ===
    bleu  cased  nist  cased
    28.69 27.76  7.683 7.501 [ratio=0.98] google (primary)
    26.30 25.14  7.245 7.042 [ratio=1.00] uedin (primary)
    25.93 24.54  7.275 7.017 [ratio=0.99] talp-upc (primary)
    24.14 23.32  7.115 6.946 [ratio=0.97] rwth (primary)
    22.60 21.80  6.920 6.747 [ratio=0.98] nict (contrastive18)
    22.49 21.67  6.911 6.739 [ratio=0.98] nict (primary)
    22.49 21.67  6.910 6.739 [ratio=0.98] nict (contrastive16)
    22.25 21.45  6.902 6.734 [ratio=0.98] nict (contrastive6)
    22.13 21.32  6.896 6.724 [ratio=0.98] nict (contrastive14)
    22.09 21.27  6.898 6.728 [ratio=0.98] nict (contrastive8)
    21.86 21.07  6.913 6.746 [ratio=0.98] nict (contrastive4)
    21.79 20.95  6.898 6.727 [ratio=0.98] nict (contrastive2)
    21.59 20.81  6.867 6.703 [ratio=0.98] nict (contrastive12)
    21.38 20.59  6.846 6.682 [ratio=0.97] nict (contrastive10)
    20.79 20.04  6.709 6.545 [ratio=0.95] nict (contrastive19)
    20.68 19.91  6.697 6.535 [ratio=0.95] nict (contrastive17)
    20.68 19.91  6.697 6.535 [ratio=0.95] nict (contrastive1)
    20.40 19.66  6.677 6.520 [ratio=0.94] nict (contrastive7)
    20.38 19.60  6.236 6.078 [ratio=1.09] rbmt3 (primary)
    20.24 19.48  6.669 6.507 [ratio=0.94] nict (contrastive15)
    20.20 19.43  6.669 6.508 [ratio=0.94] nict (contrastive9)
    20.16 19.38  6.678 6.517 [ratio=0.94] nict (contrastive3)
    20.11 19.38  6.681 6.524 [ratio=0.94] nict (contrastive5)
    19.88 19.16  6.636 6.481 [ratio=0.94] nict (contrastive13)
    19.78 18.93  6.203 6.033 [ratio=1.08] rbmt4 (primary)
    19.66 18.92  6.610 6.456 [ratio=0.94] nict (contrastive11)
    19.55 18.76  6.305 6.139 [ratio=1.07] usaar (primary)
    19.03 18.33  6.073 5.927 [ratio=1.09] rbmt1 (primary)
fr-en ===
    bleu  cased  nist  cased
    31.14 30.27  7.999 7.843 [ratio=0.96] google (primary)
    26.89 26.01  7.257 7.104 [ratio=1.00] lium-systran (primary)
    26.86 24.93  7.286 6.939 [ratio=1.01] dcu (primary)
    26.52 23.28  7.231 6.680 [ratio=1.00] jhu (primary)
    26.51 25.69  7.297 7.152 [ratio=0.99] lium-systran (contrastive1)
    25.98 22.68  7.225 6.668 [ratio=0.99] jhu (contrastive)
    25.96 25.01  7.165 7.002 [ratio=1.00] uka (primary)
    25.51 24.63  7.024 6.866 [ratio=1.04] limsi (primary)
    25.44 24.37  7.039 6.852 [ratio=1.01] uedin (primary)
    24.89 24.05  7.093 6.938 [ratio=0.99] rwth (primary)
    24.84 23.79  6.827 6.643 [ratio=1.09] limsi (contrastive1)
    24.64 23.61  6.985 6.795 [ratio=1.02] limsi (contrastive2)
    24.61 23.73  7.080 6.918 [ratio=0.98] rwth (contrastive2)
    23.98 23.20  6.913 6.764 [ratio=1.00] cmu-statxfer (contrastive)
    23.84 22.89  6.875 6.687 [ratio=1.01] rwth (contrastive1)
    23.76 22.94  7.053 6.898 [ratio=0.96] rwth (contrastive3)
    23.65 22.91  6.894 6.749 [ratio=0.99] cmu-statxfer (primary)
    23.36 22.43  6.848 6.679 [ratio=1.01] hkust (primary)
    19.63 18.71  6.119 5.959 [ratio=1.07] rbmt3 (primary)
    18.82 17.98  6.083 5.920 [ratio=1.11] usaar (primary)
    18.76 17.99  5.972 5.826 [ratio=1.10] rbmt4 (primary)
    18.47 17.55  5.893 5.733 [ratio=1.13] rbmt1 (primary)
    14.49 13.70  5.323 5.149 [ratio=1.09] geneva (primary)
hu-en ===
    bleu  cased  nist  cased
    12.75 11.58  5.409 5.119 [ratio=0.86] umd (primary)
    12.05 11.16  4.954 4.742 [ratio=1.02] uedin (primary)
    09.89 09.23  4.746 4.549 [ratio=1.07] morpho (primary)
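
For anyone who wants to sort or filter these tables rather than read them by eye, here is a minimal parsing sketch. It is not the script that produced the scores. The names `ScoreRow` and `parse_scores` are made up for illustration. It assumes each section starts with a `xx-yy ===` line and that each row holds four numbers, a `[ratio=...]` field, a system name and a run label in parentheses. It also assumes the first "bleu"/"nist" column is the uncased score and the "cased" column after it is the case-sensitive one; the post does not state this, nor what "ratio" measures.

```python
import re
from typing import List, NamedTuple


class ScoreRow(NamedTuple):
    # Field names are illustrative; the post only labels the columns
    # "bleu  cased  nist  cased".
    pair: str
    bleu: float
    bleu_cased: float
    nist: float
    nist_cased: float
    ratio: float
    system: str
    run: str


# Section header such as "hu-en ===".
PAIR_RE = re.compile(r"^\s*([a-z]{2}-[a-z]{2}) ===\s*$")

# Data row such as "12.75 11.58  5.409 5.119 [ratio=0.86] umd (primary)".
ROW_RE = re.compile(
    r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+"
    r"\[ratio=(\d+\.\d+)\]\s+(\S+)\s+\(([^)]+)\)\s*$"
)


def parse_scores(text: str) -> List[ScoreRow]:
    """Turn the plain-text tables into a flat list of ScoreRow records."""
    rows: List[ScoreRow] = []
    pair = None
    for line in text.splitlines():
        header = PAIR_RE.match(line)
        if header:
            pair = header.group(1)
            continue
        row = ROW_RE.match(line)
        if row and pair is not None:
            bleu, bleu_c, nist, nist_c, ratio, system, run = row.groups()
            rows.append(ScoreRow(pair, float(bleu), float(bleu_c),
                                 float(nist), float(nist_c),
                                 float(ratio), system, run))
        # Anything else (column headers, blank lines) is skipped.
    return rows


sample = """\
hu-en ===
    bleu  cased  nist  cased
    12.75 11.58  5.409 5.119 [ratio=0.86] umd (primary)
    12.05 11.16  4.954 4.742 [ratio=1.02] uedin (primary)
    09.89 09.23  4.746 4.549 [ratio=1.07] morpho (primary)
"""

rows = parse_scores(sample)
for r in rows:
    print(r.pair, r.system, r.run, r.bleu, r.bleu_cased)
```

Pasting the whole message body into `parse_scores` should give one record per submission, which can then be grouped by `pair` or restricted to `run == "primary"`.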