YAMNet has not been calibrated, so you can't interpret its raw scores as probabilities. Comparing scores across classes is also not guaranteed to make sense: each class has an independent logistic classifier and we train in a multi-class, multi-label setting, so different classes could easily use different score ranges. Furthermore, YAMNet was trained entirely on YouTube data, so there might be a domain mismatch if you run it on non-YouTube audio.
If you want to use YAMNet for a particular application and need the outputs to be interpretable, it would be best to do some kind of calibration, fine-tuning, or even transfer learning:
- calibration: run the model on a few representative clips with known ground-truth labels, and use the resulting scores to determine the thresholds and score ranges to use when making predictions
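
As a rough illustration of the calibration step, here is a minimal sketch that picks a decision threshold for one class from a handful of labeled clips. It assumes you have already reduced YAMNet's per-frame scores to a single score per clip (for example by averaging over frames); the function name `pick_threshold`, the choice of F1 as the selection criterion, and the example numbers are all made up for illustration and are not part of YAMNet.

```python
def pick_threshold(scores, labels):
    """Return (threshold, f1) that maximizes F1 for a single class.

    scores: one score per clip for the class of interest
            (assumption: per-frame scores already aggregated per clip).
    labels: ground-truth booleans, True if the class is present in the clip.
    """
    best_threshold, best_f1 = None, -1.0
    # Every observed score is a candidate threshold (predict positive if >= t).
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_threshold, best_f1 = t, f1
    return best_threshold, best_f1


# Hypothetical clip-level scores and ground truth for one class.
scores = [0.05, 0.10, 0.32, 0.41, 0.08, 0.55]
labels = [False, False, True, True, False, True]

threshold, f1 = pick_threshold(scores, labels)
print(f"threshold={threshold}, F1={f1:.2f}")
```

You would repeat this per class, since each class may use a different score range, and ideally check the chosen thresholds on held-out clips from your own domain.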
Manoj