Training MobileNetV1 on AudioSet

Hannes Stoll

Jul 14, 2021, 4:32:42 AM

to audioset-users

Hi there,

I'm currently using AudioSet to train MobileNet, but the AUC does not exceed 0.8. Compared to the results from YAMNet (which is based on the MobileNet V1 architecture), a higher score should be possible. I used the alpha parameter (the width multiplier) to lower the model's complexity, since at first there seemed to be some overfitting. I additionally included dropout with a rate of 0.5. The inputs are mel spectrograms (96x64) computed with the YAMNet parameters.
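To make the 96x64 input shape concrete, here is a minimal sketch of the framing arithmetic. It assumes the published YAMNet feature settings as I understand them (16 kHz audio, 25 ms STFT window, 10 ms hop, 64 mel bands, 0.96 s patches with a 0.48 s hop) and no padding of the clip; the function name `count_patches` is made up for illustration.

```python
# Assumed YAMNet-style feature parameters (not taken from this thread).
SAMPLE_RATE = 16000          # Hz
STFT_WINDOW_SECONDS = 0.025  # 25 ms analysis window
STFT_HOP_SECONDS = 0.010     # 10 ms hop between frames
MEL_BANDS = 64               # second dimension of the 96x64 patch
PATCH_WINDOW_SECONDS = 0.96  # 96 frames per patch
PATCH_HOP_SECONDS = 0.48     # 50% overlap between consecutive patches


def count_patches(clip_seconds):
    """Return (num_frames, num_patches, patch_shape) for a clip, without padding."""
    num_samples = int(round(clip_seconds * SAMPLE_RATE))
    window = int(round(SAMPLE_RATE * STFT_WINDOW_SECONDS))
    hop = int(round(SAMPLE_RATE * STFT_HOP_SECONDS))
    patch_frames = int(round(PATCH_WINDOW_SECONDS / STFT_HOP_SECONDS))
    patch_hop = int(round(PATCH_HOP_SECONDS / STFT_HOP_SECONDS))

    # Number of full STFT frames that fit into the clip.
    num_frames = 0 if num_samples < window else 1 + (num_samples - window) // hop
    # Number of full patches that fit into the frame sequence.
    num_patches = 0 if num_frames < patch_frames else 1 + (num_frames - patch_frames) // patch_hop
    return num_frames, num_patches, (patch_frames, MEL_BANDS)


frames, patches, shape = count_patches(10.0)  # a typical 10 s AudioSet clip
print(frames, patches, shape)
```

Under these assumptions, every 10 s clip yields many overlapping 0.96 s patches, each of which inherits the same clip-level labels.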

The training uses random slices of the training set. Since the labels are global (for the whole clip), there is no guarantee that the sliced clip actually includes all the sounds for the given labels. Another post mentions that some tricks were used. Can anyone tell me what these techniques are? I would assume label smoothing is one of them.
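For illustration, random slicing combined with label smoothing could be sketched as follows. This is only a sketch under assumptions: the helper names `random_patch` and `smooth_multilabel` are hypothetical, a plain list of frames stands in for a spectrogram array, and the smoothing formula is the common binary form `y * (1 - eps) + eps / 2`. Whether this matches the tricks mentioned in the other post is exactly what I don't know.

```python
import random


def random_patch(frames, patch_frames=96, rng=random):
    """Pick a random contiguous slice of `patch_frames` frames from a clip.

    `frames` is a sequence of per-frame feature vectors (e.g. 64 mel values each).
    """
    if len(frames) < patch_frames:
        raise ValueError("clip is shorter than one patch")
    start = rng.randrange(len(frames) - patch_frames + 1)
    return frames[start:start + patch_frames]


def smooth_multilabel(targets, epsilon=0.1):
    """Soften 0/1 multi-label targets: 1 -> 1 - eps/2, 0 -> eps/2.

    This reflects that a clip-level label may not be audible in a given slice.
    """
    return [t * (1.0 - epsilon) + 0.5 * epsilon for t in targets]


# Hypothetical clip: 998 frames of 64 mel bands, with 3 classes of which 2 are active.
clip = [[0.0] * 64 for _ in range(998)]
clip_labels = [1.0, 0.0, 1.0]

patch = random_patch(clip, rng=random.Random(0))
soft_labels = smooth_multilabel(clip_labels, epsilon=0.1)
print(len(patch), len(patch[0]), soft_labels)
```

The smoothed targets would then be used with a per-class sigmoid and binary cross-entropy, since AudioSet is a multi-label problem.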

Also, it would be interesting to know the differences between MobileNet V1 and YAMNet, if there are any.

Thank you in advance