Fwd: New tools & data for soundscape synthesis and online audio annotation


Kevin Austin

Oct 11, 2017, 12:54:09 AM
to cec-con...@googlegroups.com, ACMA, Justin Salamon, electroa...@lists.concordia.ca


From: Justin Salamon <justin....@NYU.EDU>
Subject: New tools & data for soundscape synthesis and online audio annotation


We're glad to announce the release of two open-source tools and a new dataset, developed as part of the SONYC project, that we hope will be of use to the community:


Scaper: a library for soundscape synthesis and augmentation
- Automatically synthesize soundscapes with corresponding ground truth annotations (a short usage sketch follows this list)
- Useful for running controlled ML experiments (ASR, sound event detection, bioacoustic species recognition, etc.)
- Useful for running controlled experiments to assess human annotation performance
- Potentially useful for generating data for source separation experiments (might require some extra code)
- Potentially useful for generating ambisonic soundscapes (definitely requires some extra code)
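
As a rough illustration of the basic workflow (a minimal sketch based on the scaper documentation; the paths and distribution parameters below are placeholders), the following Python snippet synthesizes one 10-second soundscape together with its ground-truth JAMS annotation. The foreground and background folders are assumed to contain one sub-folder per sound label.

import scaper

# Placeholder paths; each folder is assumed to hold one sub-folder per label.
sc = scaper.Scaper(duration=10.0,
                   fg_path='audio/foreground',
                   bg_path='audio/background')
sc.ref_db = -20  # reference loudness for the background

# Sample the background and one foreground event from distributions, so that
# repeated calls to generate() yield different soundscapes.
sc.add_background(label=('choose', []),        # choose from all available labels
                  source_file=('choose', []),
                  source_time=('const', 0))
sc.add_event(label=('choose', []),
             source_file=('choose', []),
             source_time=('const', 0),
             event_time=('uniform', 0, 8),
             event_duration=('const', 2),
             snr=('uniform', 6, 18),
             pitch_shift=('uniform', -1, 1),
             time_stretch=('uniform', 0.9, 1.1))

# Render the audio and the matching ground-truth annotation.
sc.generate('soundscape.wav', 'soundscape.jams')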


AudioAnnotator: a JavaScript web interface for annotating audio data
- Developed in collaboration with Edith Law and her students at the University of Waterloo's HCI Lab
- A web interface that allows users to annotate audio recordings
- Supports 3 types of visualization (waveform, spectrogram, invisible)
- Useful for crowdsourcing audio labels (see the sketch after this list)
- Useful for running controlled experiments on crowdsourcing audio labels
- Supports mechanisms for giving users real-time feedback based on their annotations
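
For downstream analysis one typically works from a flat export of the collected annotations. Purely as a hypothetical sketch (the CSV layout and column names below are assumptions, not the annotator's actual export format), this is how one might tally crowdsourced labels in Python:

import csv
from collections import Counter

def label_counts(csv_path):
    # Count how often each label was applied across all annotated regions.
    # Assumed columns: annotator_id, start, end, label.
    counts = Counter()
    with open(csv_path, newline='') as f:
        for row in csv.DictReader(f):
            counts[row['label']] += 1
    return counts

print(label_counts('annotations.csv'))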


URBAN-SED: a new dataset for sound event detection
- Includes 10,000 soundscapes with strongly labeled sound events generated using scaper (see the annotation-reading sketch after this list)
- Totals almost 30 hours and includes close to 50,000 annotated sound events
- Baseline convnet results on URBAN-SED are reported in the scaper paper referenced below
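
Because the soundscapes were generated with scaper, each one ships with a JAMS annotation file holding the strong (onset/offset/label) ground truth. A minimal sketch of reading it with the jams Python package, assuming the annotations use the 'scaper' namespace that scaper writes:

import jams

jam = jams.load('soundscape.jams')                   # any scaper-generated JAMS file
ann = jam.annotations.search(namespace='scaper')[0]  # the soundscape annotation

for obs in ann.data:
    # Each observation stores the event's onset, duration, and a value dict
    # with the sampled parameters, including the sound event label.
    print(obs.time, obs.time + obs.duration, obs.value['label'])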


Further information about scaper, the AudioAnnotator, and the URBAN-SED dataset, including controlled experiments on the quality of crowdsourced human annotations as a function of visualization and soundscape complexity, is provided in the following papers:

M. Cartwright, A. Seals, J. Salamon, A. Williams, S. Mikloska, D. MacConnell, E. Law, J. P. Bello, and O. Nov. "Seeing Sound: Investigating the Effects of Visualizations and Complexity on Crowdsourced Audio Annotations." Proceedings of the ACM on Human-Computer Interaction, 1(2), 2017.

J. Salamon, D. MacConnell, M. Cartwright, P. Li, and J. P. Bello. "Scaper: A Library for Soundscape Synthesis and Augmentation." In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct. 2017.

We hope you find these tools and data useful and look forward to receiving your feedback (and pull requests!).
Cheers, on behalf of the entire team,
Justin Salamon & Mark Cartwright.


--
Justin Salamon, PhD
Senior Research Scientist
Music and Audio Research Laboratory (MARL)
& Center for Urban Science and Progress (CUSP)
New York University, New York, NY
