Hi there,
Certainly, the dev and test sets are to be used for exactly what their names say: you are more than welcome to look at, train with, and otherwise use the dev set however it helps you build a better model. That said, it is easy to overfit to the dev set and subsequently perform poorly on the test set. Ideally, you should never look at the contents of the test set, nor should you try to "game the system" by using your test-set score to infer its contents.
Avoiding overfitting on the dev set is a problem many of us face over time, and techniques such as cross-validation and careful interpretation of the dataset as a whole help guard against it. You are also welcome to use outside datasets and resources to broaden the domain and applicability of your model, though, as we have said before, hand-labeling new data points is not acceptable.
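To make the cross-validation suggestion concrete, here is a minimal sketch of scoring a model over k folds of the dev set instead of trusting a single split. The function names (`kfold_indices`, `cross_val_score`, `score_fn`) are hypothetical, not part of the task's tooling:

```python
def kfold_indices(n_examples, k=5):
    """Yield (train_indices, val_indices) pairs covering k folds."""
    indices = list(range(n_examples))
    fold_size, remainder = divmod(n_examples, k)
    start = 0
    for fold in range(k):
        # Early folds absorb the remainder so every example is used once.
        size = fold_size + (1 if fold < remainder else 0)
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

def cross_val_score(score_fn, n_examples, k=5):
    """Average score_fn(train_idx, val_idx) over k folds of the dev set."""
    scores = [score_fn(tr, va) for tr, va in kfold_indices(n_examples, k)]
    return sum(scores) / len(scores)
```

Averaging over folds gives a less optimistic estimate of test-set performance than repeatedly tuning against one fixed dev split.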
That being said, if there is something in the dev set that would cause your model to perform significantly better on the test set, I encourage you to write a paper submission about it! We know that this type of competition isn’t perfect, and we want to see papers that help us push this system and the research forward.
Best,
Sasha
--
You received this message because you are subscribed to the Google Groups "DeftEval 2020" group.