Adjusting CoralNet Training for Species with Uneven Annotations


Gabe Machado

Sep 25, 2025, 4:41:52 PM
to CoralNet Users

Hi all,

I’m currently working on a project using CoralNet to train a machine learning model that annotates species within Monterey’s rocky intertidal community. I have a question about how the training process handles differences in annotation counts across species.

As I understand it, CoralNet randomly subsets the data (about 7/8 of annotations are used for training and 1/8 for testing). The issue I’m running into is that some species (e.g., seagrass) have far more annotations than others (e.g., barnacles). I’m wondering:

  • Is there a way to adjust the training to account for this imbalance in annotations?

  • Does CoralNet provide source code or customization options that would let me modify how the training subset is selected?
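
To make the concern concrete, here is a minimal sketch of a random 7/8 to 1/8 split over labels with very different counts. This is not CoralNet's code: the function name `random_split`, the seed, and the label counts are all made up for illustration, and the split is applied per annotation only because that is my understanding of the process.

```python
import random
from collections import Counter


def random_split(annotations, train_fraction=7 / 8, seed=0):
    """Shuffle the annotations and split them into train and test lists.

    Hypothetical helper: it mimics a plain random subset and makes no
    attempt to keep label proportions equal between the two parts.
    """
    rng = random.Random(seed)
    shuffled = list(annotations)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up label counts, chosen only to show the imbalance.
annotations = ["seagrass"] * 800 + ["barnacle"] * 16

train, test = random_split(annotations)
print("train:", Counter(train))
print("test: ", Counter(test))
```

With counts like these, only a handful of barnacle annotations can land in the test portion, so the measured accuracy for that label rests on very few points.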

Thanks in advance for any guidance!

Gabe

Stephen Chan

Sep 30, 2025, 5:47:38 PM
to CoralNet Users
Hi Gabe,

You're correct about how CoralNet subsets the data. However, there currently isn't a way to customize how CoralNet selects the subset. I'd be interested to hear what customization options you have in mind, as we might want to add such options in the future.

Regarding imbalance of labels/species in the training data, CoralNet doesn't currently provide adjustments to address this either. If you suspect that imbalance is introducing bias into your classifier, and you're considering techniques such as oversampling or data augmentation to try to address that bias, you would have to set up such techniques yourself.
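
As a rough illustration of what setting up oversampling yourself could look like, here is a minimal sketch in plain Python that runs outside of CoralNet. Everything in it is an assumption for the example: the function name `oversample`, the `(patch_id, label)` pair format, and the label counts are hypothetical, and none of it reflects CoralNet's internals.

```python
import random
from collections import Counter, defaultdict


def oversample(samples, seed=0):
    """Randomly duplicate minority-label samples until every label has
    as many samples as the most common label.

    `samples` is a list of (patch_id, label) pairs (hypothetical format).
    """
    rng = random.Random(seed)

    # Group the samples by their label.
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)

    target = max(len(group) for group in by_label.values())

    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Draw extra copies with replacement to reach the target count.
        balanced.extend(rng.choices(group, k=target - len(group)))

    rng.shuffle(balanced)
    return balanced


# Made-up training set: 700 seagrass patches and 14 barnacle patches.
train = [(i, "seagrass") for i in range(700)]
train += [(i, "barnacle") for i in range(700, 714)]

balanced = oversample(train)
print(Counter(label for _, label in balanced))
```

Oversampling like this would normally be applied to the training portion only, so that the test portion still reflects the original label distribution. Because the minority samples are exact duplicates, a classifier can overfit to them, which is one reason data augmentation is often considered alongside it.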