Hello AudioSet users!
Given the interest on this list in audio embeddings trained on AudioSet (e.g. VGGish), we thought you might be interested in OpenL3, an open-source deep audio embedding model trained on AudioSet. OpenL3 is an improved version of the self-supervised L3-Net, and outperforms VGGish and SoundNet (and the original L3-Net) on several sound recognition tasks.
We're excited to announce the release of version 0.3.1 of OpenL3. In this latest version, we have added functionality for extracting image embeddings, processing video files, and batch processing. OpenL3 is open source and readily available for everyone to use: if you have TensorFlow installed, just run pip install openl3 and you're good to go.
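For anyone who wants a quick start, here is a minimal sketch of extracting audio embeddings once the package is installed. It assumes openl3 and TensorFlow are available in your environment. The helper names (make_test_tone, embed_audio, main) are ours and purely illustrative, the synthetic tone stands in for a real recording, and the content_type, input_repr and embedding_size values are just one possible choice.

```python
import numpy as np


def make_test_tone(sr=48000, seconds=2.0, freq=440.0):
    """Return a mono sine wave as float32, standing in for real audio."""
    t = np.arange(int(sr * seconds)) / sr
    return (0.5 * np.sin(2 * np.pi * freq * t)).astype(np.float32)


def embed_audio(audio, sr):
    """Compute OpenL3 audio embeddings (requires openl3 and TensorFlow)."""
    # Imported lazily so make_test_tone can be used without openl3 installed.
    import openl3

    # Illustrative settings, not a recommendation: environmental-sound model,
    # 256-band mel input, 512-dimensional embeddings.
    emb, ts = openl3.get_audio_embedding(
        audio, sr, content_type="env", input_repr="mel256", embedding_size=512
    )
    return emb, ts


def main():
    sr = 48000
    audio = make_test_tone(sr=sr)
    emb, ts = embed_audio(audio, sr)
    # emb holds one embedding per analysis frame; ts holds the frame timestamps.
    print(emb.shape, ts.shape)
```

Calling main() runs the whole thing end to end. To embed a real recording, load it into a NumPy array with whatever audio reader you prefer and pass the array and its sample rate to embed_audio instead of the test tone.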
Full details are provided in the following paper:
We're excited to see what the community does with OpenL3, and of course, if you have any feedback, please don't hesitate to reach out.
Cheers!
Justin, on behalf of the OpenL3 team: Jason Cramer, Ho-Hsiang Wu, Justin Salamon and Juan Pablo Bello.