Using test dataset

tete...@mail.ru

Mar 18, 2020, 5:19:46 PM
to agriculture-vision
One technique that came from Kaggle is pseudo-labeling the test dataset and then using it in the training process. While it can give a significant boost to your algorithm's results, it doesn't make sense in a real-world situation.
Since the challenge now has an active leaderboard, I think the organizers should explicitly mention in the rules that the test set is off-limits in training, to avoid any confusion.
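
For readers unfamiliar with the technique, here is a minimal sketch of what pseudo-labeling the test set means. It uses a toy 1-D nearest-centroid classifier; the data, the function names (fit_centroids, predict, pseudo_label) and the classifier itself are assumptions made up for illustration, not anything taken from the challenge or from a particular Kaggle solution.

```python
# Toy pseudo-labeling sketch. Everything here is hypothetical and
# only illustrates how test inputs end up inside the training data.

def fit_centroids(xs, ys):
    """Return {label: mean of the xs carrying that label}."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def pseudo_label(train_x, train_y, test_x):
    """Train on labeled data, guess labels for test inputs, retrain on both."""
    model = fit_centroids(train_x, train_y)
    guessed = [predict(model, x) for x in test_x]  # pseudo-labels, not ground truth
    return fit_centroids(train_x + test_x, train_y + guessed)

train_x, train_y = [0.0, 1.0, 9.0, 10.0], [0, 0, 1, 1]
test_x = [2.0, 8.0]  # unlabeled test inputs

baseline = fit_centroids(train_x, train_y)        # standard pipeline
leaky = pseudo_label(train_x, train_y, test_x)    # test inputs shaped the model
```

In this toy case the centroids move toward the test points, which shows why the resulting leaderboard score no longer measures performance on unseen data.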

agriculture-vision

Mar 19, 2020, 2:40:22 PM
to agriculture-vision
Hi,

Thank you for your suggestion.

We expect a standard model training pipeline, where the test set is off-limits in training. We will add a statement in our rules.

The Agriculture-Vision Team

tete...@mail.ru

Mar 28, 2020, 5:17:22 PM
to agriculture-vision
Another point that is not made clear is the use of the validation set in training. Cityscapes is a prime example: most researchers first use the train and validation sets separately to search for optimal hyperparameters, then combine them for final training to prop up their results on the test leaderboard. Will this be allowed?
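
To make the workflow in question concrete, here is a rough sketch of the two steps: pick a hyperparameter using the validation set only for evaluation, then retrain on train plus validation with that setting. The toy 1-D k-NN classifier, the data, and the names knn_predict and accuracy are assumptions for illustration only; the test set is never touched.

```python
from collections import Counter

def knn_predict(xs, ys, k, x):
    """Majority label among the k labeled points nearest to x."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(xs, ys, k, eval_x, eval_y):
    """Fraction of evaluation points that k-NN on (xs, ys) labels correctly."""
    hits = sum(knn_predict(xs, ys, k, x) == y for x, y in zip(eval_x, eval_y))
    return hits / len(eval_x)

# Hypothetical data; the point at 3.5 is deliberately mislabeled noise.
train_x = [0.0, 1.0, 2.0, 3.5, 8.0, 9.0, 10.0]
train_y = [0, 0, 0, 1, 1, 1, 1]
val_x, val_y = [3.0, 7.0], [0, 1]

# Step 1: hyperparameter search, fitting on train and scoring on val.
best_k = max([1, 3, 5],
             key=lambda k: accuracy(train_x, train_y, k, val_x, val_y))

# Step 2: final training data is train + val, using the chosen k.
final_x, final_y = train_x + val_x, train_y + val_y
prediction = knn_predict(final_x, final_y, best_k, 6.0)
```

The design choice is that validation data only influences the final model after the hyperparameters are frozen, so no further tuning happens against it.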

agriculture-vision

Mar 30, 2020, 4:15:05 PM
to agriculture-vision
We do allow using the val set as part of the training data.

The Agriculture-Vision Team