Hi
Thanks for spotting this! I have added a fix to the extraction code, and I am re-running it. If all looks well, I will update the release,
best
Barry
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. Is e buidheann carthannais a th’ ann an Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.
--
You received this message because you are subscribed to the Google Groups "Workshop on Statistical Machine Translation" group.
To unsubscribe from this group and stop receiving emails from it, send an email to wmt-tasks+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/wmt-tasks/85491f7f-07de-481e-ab53-b21c61079cean%40googlegroups.com.
Hi
I have regenerated the news-commentary corpus to fix the extraction. This fixes the error noted below, and also improves document alignment a bit, resulting in more aligned documents.
I created a new release here (https://data.statmt.org/news-commentary/v18.1/), and will point the WMT task page at that release,
best
Barry