[apologies for cross-posting]
Hi everyone,
Thank you for your patience with the evaluation process. Submitting models is a new form of evaluation, and a few aspects have needed our attention to make the process smoother. We've been doing our best to answer your questions as fast as we can.
As we prepare to close the FLORES evaluation tomorrow (August 13th, anywhere on Earth), we wanted to clarify a few points:
1) How do you signal which model is your submission?
Using the (hidden) test set to choose the best-performing model is an anti-pattern, so we won't be doing this. You need to "publish" your final model to signal it as your submission to the task. At the moment, we're showing unpublished submissions in the dashboard as "Anonymous" models, but starting Monday (Aug 16th) we will no longer do so. Please be sure to publish your model beforehand.
2) You submitted your model before the deadline, but the result is not visible?
We're seeing a lot of last-minute submissions to the full task. This has increased the load on the evaluation server significantly. As a result, computation of final scores is taking longer than expected.
We have implemented several changes to make the evaluation faster. However, we encourage you to submit only **ONCE**.
If you submit by Friday (Aug 13th, anywhere on Earth), you may not be able to publish your model while it's being evaluated; evaluation will still be running over the weekend. If this applies to you, please fill in this form to let us know the model ID of your final submission (and a backup model ID, in case the first one fails). We'll take care of the rest.
3) What if your model fails evaluation between Aug 13 and Aug 16?
If your model fails evaluation, we'll allow you to choose another (already evaluated) model as your final submission. Unfortunately, if you don't have an alternative model, we won't be able to help you.
4) Only one submission per team.
There is a reason for having a maximum number of submissions per day: to avoid fine-tuning on the test set. We have seen teams with several accounts open. Remember that only ONE submission per team is allowed.
We understand that some parameters needed tuning to make the Dynabench evaluation work, but please be mindful of the number of evaluations you submit (too many evaluations clog the queues and produce delays for everyone).
NOTE: Any models that are still undergoing evaluation and that haven't been claimed as primary or backup submissions will be deprioritized to make room for the evaluation of other primary/backup models.
5) What if there are models still waiting for evaluation on Monday?
If primary submissions are still waiting for evaluation on Monday (Aug 16th), we'll hold off publishing the results for the task. We don't anticipate this being an issue for the small tasks.
Please feel free to reach out to flo...@fb.com in case of any questions.