Forcing classifier training from Coral Net Team

Gabriel Calistro

Jul 5, 2023, 12:06:31 PM
to CoralNet Users
Dear Coral Net Team,

I am the admin for the Fragments of Hope Restoration Sites source and have a request aimed at improving the classifier. We triggered the classifier to train itself yesterday, and unfortunately it did not produce a better classifier. The images that have already been trained make up the bulk of the diversity in the images we are trying to classify. My understanding is that if we train more of the less diverse images, i.e. images with frequent labels like Sand and Gravel, the randomly chosen training set will increasingly have more and more Sand and Gravel and less of the rarer labels represented. I understand that 1/8th of the trained images are examined randomly, but is there any way for your team to trigger the classifier to look at another 1/8th? As you can see from our source, we have trained quite a lot more images since the last saved classifier, and I believe it just needs the right 1/8th examined to produce a better classifier. If I keep doing more images with the same common labels to trigger a training, I will just get further from an update that will save.


I greatly appreciate any possible assistance with this issue. 

Thank you.

All the best,

Gabe

Stephen Chan

Jul 15, 2023, 4:35:48 AM
to CoralNet Users
Hi Gabe, apologies for the late response!


> My understanding is that if we train more of the less diverse images, i.e. images with frequent labels like Sand and Gravel, the randomly chosen training set will increasingly have more and more Sand and Gravel and less of the rarer labels represented.

Correct, more and less as a proportion of the whole training set. However, in theory at least, adding more (accurately labeled) Sand and Gravel training points should not diminish the machine's ability to distinguish those labels from the rarer labels. But yes, it can skew the accuracy measurement in a way you might not want.
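
To make the skew concrete, here is a minimal sketch with made-up numbers. The label counts and per-label accuracies below are purely hypothetical and are not taken from any CoralNet source. It assumes the classifier's ability on each label stays exactly the same, and only the number of Sand and Gravel points grows.

```python
def overall_accuracy(counts, per_label_accuracy):
    """Overall accuracy = correctly classified points / all points."""
    total = sum(counts.values())
    correct = sum(counts[label] * per_label_accuracy[label] for label in counts)
    return correct / total


# Hypothetical per-label accuracies: common labels are easy, the rare one is hard.
per_label_accuracy = {"Sand": 0.95, "Gravel": 0.90, "RareCoral": 0.50}

# Hypothetical point counts before and after adding only common-label points.
before = {"Sand": 400, "Gravel": 400, "RareCoral": 200}
after = {"Sand": 2400, "Gravel": 2400, "RareCoral": 200}

acc_before = overall_accuracy(before, per_label_accuracy)
acc_after = overall_accuracy(after, per_label_accuracy)
rare_share_before = before["RareCoral"] / sum(before.values())
rare_share_after = after["RareCoral"] / sum(after.values())

print(f"overall accuracy: {acc_before:.3f} -> {acc_after:.3f}")
print(f"rare label share: {rare_share_before:.3f} -> {rare_share_after:.3f}")
```

In this toy case the overall accuracy goes up even though the rare label is classified no better than before, because the rare label now carries much less weight in the average.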


> I understand that 1/8th of the trained images are examined randomly, but is there any way for your team to trigger the classifier to look at another 1/8th?

Unfortunately, we don't have a straightforward way to change how the 1/8th is picked, even from the admin side.

We're generally working towards enabling more flexibility with using classifiers, so in the near future we'll likely provide some method to relax the criteria for saving a new classifier - such as ignoring the accuracy check. That method doesn't exist yet. However, there is a workaround that some folks have been using to save a new classifier whenever they want:

When you add a label to / remove a label from your source's labelset, that will start a process where the source's existing classifier history gets deleted, and a new classifier gets trained. This new classifier will use all of the currently available training data in the source. So what you'd do is add any label to your labelset, then remove that label right after. CoralNet should say it's resetting the classifiers twice, but if you do both actions within a few minutes, you should only have to wait for a single re-train.