Use VGGish to get class names like in YAMNet

Maros Galik

Jul 28, 2023, 11:23:56 AM

to audiose...@googlegroups.com

Hello,

I am interested in obtaining AudioSet scores and class names from YT8M VGGish embeddings.

In your demonstration, you used a WAV file as input to a pre-trained YAMNet model, which returned classes from the AudioSet ontology. Is there a way to use VGGish embeddings as input to the YAMNet model?

Thank you kindly in advance!

Maros

Dan Ellis

Jul 28, 2023, 2:21:43 PM

to audioset-users

The VGGish classifier has no knowledge of AudioSet. To "convert" VGGish embeddings into AudioSet categories, you would need another classifier layer.

To the extent that VGGish has successfully extracted the "semantic essence" of the audio, this layer might be relatively shallow (one fully-connected layer?), so might not need a lot of training data. However, it's an unknown mapping so will need to be learned.

The AudioSet embeddings and labels are the data you would use to train this classifier layer.
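
As a rough illustration of the "one fully-connected layer" idea, here is a minimal sketch in NumPy. It assumes the embeddings have already been loaded into a float array X of shape (num_examples, 128) and the labels into a multi-hot array Y of shape (num_examples, num_classes); reading the released AudioSet files is not shown. The function names (train_head, predict_scores, top_classes) are made up for this example, and plain gradient descent stands in for whatever training framework you would actually use. Whether a single layer is enough is exactly the open question above.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_head(X, Y, lr=0.5, epochs=200, seed=0):
    """Fit one fully-connected layer with independent sigmoid outputs.

    X: (n, d) float embeddings (d = 128 for VGGish).
    Y: (n, k) multi-hot labels; a clip may carry several classes at once.
    Hypothetical helper: full-batch gradient descent on binary cross-entropy.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = Y.shape[1]
    W = rng.normal(scale=0.01, size=(d, k))
    b = np.zeros(k)
    for _ in range(epochs):
        P = sigmoid(X @ W + b)   # predicted per-class probabilities
        G = (P - Y) / n          # cross-entropy gradient w.r.t. the logits
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b


def predict_scores(X, W, b):
    """Return per-class scores in [0, 1] for each embedding row."""
    return sigmoid(X @ W + b)


def top_classes(scores, class_names, k=3):
    """Pair the k highest scores of one example with their class names."""
    order = np.argsort(scores)[::-1][:k]
    return [(class_names[i], float(scores[i])) for i in order]
```

After W, b = train_head(X, Y), calling top_classes(predict_scores(X_new, W, b)[0], class_names) gives the best-scoring names for the first new embedding, where class_names is assumed to be a list of display names in the same order as the label columns. Sigmoid outputs are used rather than softmax because AudioSet labels are not mutually exclusive.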