Can a low score be caused by bad formatting?

Anton Voronov

Apr 1, 2020, 9:23:06 AM
to duolingo-sharedtask-2020
Hello.
I recently (03/31/2020 10:22:22) submitted my first predictions on the dev set and got an incredibly low score. I am wondering what the reason could be, because at first glance the predictions were not that bad. Could it be related to the submission format? Could you check whether my submission has a formatting problem?

Stephen Mayhew

Apr 1, 2020, 10:01:18 AM
to Anton Voronov, duolingo-sharedtask-2020
Hi Anton, 

It looks like the text is still tokenized, while the evaluation files are detokenized. For an example of how to detokenize, look at our run_pretrained.sh script (line 26). Alternatively, check out the sacremoses library.
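
To illustrate why tokenized output can score badly, here is a minimal standard-library sketch. It assumes the scorer compares prediction strings to reference strings exactly, which is an assumption and not a description of the actual evaluation script. The function name `simple_detokenize` and the example sentence are hypothetical, and the regex rules are only a rough stand-in for real Moses detokenization, which handles many more language-specific cases.

```python
import re


def simple_detokenize(tokenized: str) -> str:
    """Rough, illustrative detokenizer (hypothetical helper, not the Moses rules)."""
    text = tokenized
    # Drop the space before closing punctuation: "know , but" -> "know, but"
    text = re.sub(r"\s+([.,!?;:%)\]])", r"\1", text)
    # Drop the space after opening brackets: "( note )" -> "(note)"
    text = re.sub(r"([(\[])\s+", r"\1", text)
    # Rejoin English contractions split as "do n't" or "it 's"
    text = re.sub(r"\s+(n't|'s|'re|'ve|'ll|'d|'m)\b", r"\1", text)
    return text.strip()


# Hypothetical prediction and reference, for illustration only.
tokenized_prediction = "I do n't know , but it 's fine ."
reference = "I don't know, but it's fine."

# Under exact string matching, the tokenized prediction does not match.
matches_before = tokenized_prediction == reference
# After detokenization, the strings are identical.
matches_after = simple_detokenize(tokenized_prediction) == reference
```

For a real submission, a library such as sacremoses is likely the better choice: with it installed, `MosesDetokenizer(lang="en").detokenize(tokens)` takes a list of tokens and returns a single detokenized string.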

Hope this helps!

Stephen
