Using test dataset

tete...@mail.ru

Mar 18, 2020, 5:19:46 PM

to agriculture-vision

One of the techniques that came from Kaggle is pseudo-labeling the test dataset and then using it in the training process. While it gives a significant boost to your algorithm's results, it doesn't make sense in a real-world situation, where the test data is not available at training time.

Since the challenge now has an active leaderboard, I think the organizers should explicitly mention in the rules that the test set is off-limits in training, to avoid any confusion.
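
For anyone unfamiliar with the technique, here is a rough sketch of what pseudo-labeling looks like. It is only an illustration: a toy 1-D nearest-centroid classifier stands in for a real segmentation model, and the names `fit_centroids`, `predict`, `pseudo_label` and the `min_margin` threshold are all hypothetical, not something from the challenge code.

```python
# Toy illustration of pseudo-labeling (assumed setup, not challenge code).
# A 1-D nearest-centroid classifier stands in for a real model.

def fit_centroids(xs, ys):
    """Return {label: mean of the samples carrying that label}."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(centroids, x):
    """Return (label, margin): nearest centroid and its gap to the runner-up."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin

def pseudo_label(train_x, train_y, test_x, min_margin):
    """Label confident test samples with the model, then retrain on both."""
    model = fit_centroids(train_x, train_y)
    extra_x, extra_y = [], []
    for x in test_x:
        label, margin = predict(model, x)
        if margin >= min_margin:  # keep only confident predictions
            extra_x.append(x)
            extra_y.append(label)
    # This retraining step is where the test set leaks into training.
    retrained = fit_centroids(train_x + extra_x, train_y + extra_y)
    return retrained, extra_x, extra_y

train_x = [0.0, 1.0, 9.0, 10.0]
train_y = [0, 0, 1, 1]
test_x = [0.5, 5.2, 9.5]  # unlabeled test samples
model, extra_x, extra_y = pseudo_label(train_x, train_y, test_x, min_margin=2.0)
```

In this toy run the ambiguous sample 5.2 is skipped, while 0.5 and 9.5 are added to the training data with predicted labels.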

agriculture-vision

Mar 19, 2020, 2:40:22 PM

to agriculture-vision

Hi,

Thank you for your suggestion.

We expect a standard model training pipeline, where the test set is off-limits in training. We will add a statement to our rules.

The Agriculture-Vision Team

tete...@mail.ru

Mar 28, 2020, 5:17:22 PM

to agriculture-vision

Another point that is not made clear is the use of the validation set in training. Cityscapes is a prime example: most researchers first use the train and validation sets separately to search for optimal hyperparameters, then combine them for final training to prop up their results on the test leaderboard. Will this be allowed?
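
To make the workflow concrete, a minimal sketch of "tune on val, then retrain on train + val" could look like the following. It assumes a toy 1-D k-nearest-neighbor classifier in place of a real model, and `knn_predict`, `accuracy` and `select_then_merge` are hypothetical names used only for illustration.

```python
# Toy illustration of tuning on val, then training on train + val
# (assumed setup; a 1-D k-NN classifier stands in for a real model).
from collections import Counter

def knn_predict(xs, ys, query, k):
    """Majority label among the k samples closest to query."""
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - query))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

def accuracy(fit_x, fit_y, eval_x, eval_y, k):
    """Fraction of eval samples classified correctly using the fit data."""
    hits = sum(knn_predict(fit_x, fit_y, x, k) == y
               for x, y in zip(eval_x, eval_y))
    return hits / len(eval_x)

def select_then_merge(train, val, candidate_ks):
    """Pick k on the held-out val set, then merge train and val."""
    best_k = max(candidate_ks, key=lambda k: accuracy(*train, *val, k))
    merged_x = train[0] + val[0]
    merged_y = train[1] + val[1]
    return best_k, merged_x, merged_y

# The training sample at 1.4 carries a deliberately noisy label.
train = ([0.0, 1.0, 1.4, 2.0, 8.0, 9.0, 10.0], [0, 0, 1, 0, 1, 1, 1])
val = ([1.5, 8.5], [0, 1])

best_k, merged_x, merged_y = select_then_merge(train, val, candidate_ks=(1, 3))
# The final model uses the merged data with the hyperparameter chosen on val.
final_prediction = knn_predict(merged_x, merged_y, 9.2, best_k)
```

Note that the test set is never touched here: only the hyperparameter choice and the final fit change.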

agriculture-vision

Mar 30, 2020, 4:15:05 PM

to agriculture-vision

We do allow using the validation set as part of the training data.

The Agriculture-Vision Team
