Is MP3 support an eventual goal?

Davyd Betchkal

Dec 2, 2020, 6:52:56 PM

to birdvox

Hello there, I've been watching some of the developments of BirdVox Detect and Classify from afar. I work with a lot of different scholars and managers in Alaska.

Many (including my own NPS colleagues) collect data as long-duration MP3 recordings. I scanned the repository and this Groups discussion, but I couldn't tell whether your team intends to add support for detection in MP3 audio. I appreciate that this is a non-trivial change.

Thank you,

Davyd

Justin Salamon

Dec 2, 2020, 8:19:17 PM

to Davyd Betchkal, birdvox

Hi Davyd,

Thanks for getting in touch. Currently BVD and BVC rely on the soundfile Python library for audio I/O, which in turn relies on the libsndfile library, which doesn't support MP3 yet (see https://github.com/libsndfile/libsndfile/issues/258). While other Python audio libraries do support MP3 loading, we depend on one soundfile feature in particular: seeking, which lets us load a segment of an audio file without loading all the data into memory. This is important for robust processing of long recordings. My colleagues Vincent and Jason can probably comment on this in greater detail.

That said, we realize that this is an important feature and we would love to support it if we can. We'll try to do some digging around this to see what we can do - perhaps there are alternative builds of libsndfile that do support mp3 that we could leverage, potentially at the cost of a custom installation process for users who require this feature.

Sincerely,

Justin

--
You received this message because you are subscribed to the Google Groups "birdvox" group.
To unsubscribe from this group and stop receiving emails from it, send an email to birdvox+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/birdvox/2db71250-426c-43bc-ad08-7920457bc18bn%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Josh Adams

Dec 2, 2020, 11:03:48 PM

to Justin Salamon, Davyd Betchkal, birdvox

Davyd,

Not the answer you're looking for, but it would be fairly straightforward to write a script that uses a program like SoX (http://sox.sourceforge.net/) to convert an MP3 to a soundfile-compatible format like WAV or FLAC. Users could then run BVD on that temporary file and delete it afterwards. The downsides are that you'd need a decent chunk of free disk space, depending on how long "long-duration" is, and that it takes some work to make something that runs across platforms and is easy to document. Python would seem like the obvious choice for the script, given that you already need it for BVD.
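
A minimal sketch of that convert-then-delete workflow, assuming the `sox` executable is on the PATH and was built with MP3 support. The names `sox_command` and `run_on_converted` are hypothetical, and the `process` callback is a placeholder for whatever analysis is run on the temporary WAV file.

```python
import subprocess
import tempfile
from pathlib import Path


def sox_command(mp3_path, wav_path):
    """Build the SoX call; SoX infers both formats from the file extensions."""
    return ["sox", str(mp3_path), str(wav_path)]


def _convert_with_sox(mp3_path, wav_path):
    # check=True raises CalledProcessError if SoX exits with an error.
    subprocess.run(sox_command(mp3_path, wav_path), check=True)


def run_on_converted(mp3_path, process, convert=_convert_with_sox):
    """Convert mp3_path to a temporary WAV, call process(wav_path), clean up.

    `process` is a placeholder for the analysis that runs on the WAV file.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        wav_path = Path(tmp_dir) / (Path(mp3_path).stem + ".wav")
        convert(mp3_path, wav_path)
        # The temporary directory, and the WAV inside it, is deleted on exit,
        # even if process() raises.
        return process(wav_path)
```

The temporary directory lives wherever `tempfile` puts it by default; for very long recordings it may be worth passing `dir=` to `TemporaryDirectory` so the uncompressed WAV lands on a disk with enough free space.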

Josh

Vincent Lostanlen

Dec 3, 2020, 3:03:48 AM

to bir...@googlegroups.com

Hello Davyd, Josh, Justin, and all,

Thank you for the feature request. I have opened a ticket on the GitHub issue tracker: https://github.com/BirdVox/birdvoxdetect/issues/75

Josh is correct that using an off-the-shelf converter like pysox should get you going, as long as you're OK with converting the entire MP3 file first.

For context, we've designed BirdVoxDetect in such a way that it can process audio files of arbitrary length, even if they don't fit into RAM. This is important for full-night recordings: they are ~1GB in size and their spectrograms are even heavier, which tends to exceed the available memory. Thus, internally, we process audio files in "chunks" of 450 seconds (i.e. 8 chunks per hour).
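
As a quick check of the arithmetic, here is a small sketch that splits a recording of known duration into 450-second chunks. The function name `chunk_bounds` is hypothetical, and the handling of the final partial chunk is an assumption; BirdVoxDetect's own chunking may treat boundaries differently.

```python
import math

CHUNK_DURATION = 450.0  # seconds, the chunk length mentioned above


def chunk_bounds(total_duration, chunk_duration=CHUNK_DURATION):
    """Return (start, stop) times in seconds covering the whole recording."""
    n_chunks = math.ceil(total_duration / chunk_duration)
    return [
        # The last chunk is clipped to the end of the recording.
        (i * chunk_duration, min((i + 1) * chunk_duration, total_duration))
        for i in range(n_chunks)
    ]


bounds = chunk_bounds(3600.0)  # one hour of audio
print(len(bounds))  # 8
print(bounds[-1])   # (3150.0, 3600.0)
```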

I think that a realistic path to address this would be to replace calls to pysoundfile by calls to librosa.load with a movable offset parameter and a fixed duration of 450 seconds.

Documentation: https://librosa.org/doc/latest/generated/librosa.load.html

Source code: https://librosa.org/doc/latest/_modules/librosa/core/audio.html#load

For WAV and FLAC files, this change should be backwards compatible, both in terms of usage and package dependencies (we already require librosa for computing spectrograms).
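
A rough sketch of what chunked reading with `librosa.load` could look like, using its `offset` and `duration` parameters. This is not the BirdVoxDetect implementation: the generator name `iter_chunks`, the `load` hook, and the assumption that the total duration is known in advance are all illustrative.

```python
def iter_chunks(path, total_duration, chunk_duration=450.0, load=None):
    """Yield (offset, samples, sample_rate) for consecutive chunks of a file.

    total_duration is in seconds and is assumed to be known in advance.
    """
    if load is None:
        import librosa  # imported lazily so the loader can be swapped out

        load = librosa.load
    offset = 0.0
    while offset < total_duration:
        # sr=None keeps the file's native sample rate instead of resampling.
        samples, sample_rate = load(
            path, sr=None, offset=offset, duration=chunk_duration
        )
        yield offset, samples, sample_rate
        offset += chunk_duration
```

Only one chunk of samples is held at a time, so peak memory is bounded by the chunk length rather than by the length of the recording.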

As a bonus, it will become possible to process MP3 files directly in BirdVoxDetect by installing ffmpeg or gstreamer in the conda environment. See this section of the librosa README:

https://github.com/librosa/librosa#audioread-and-mp3-support

How does this sound? If you agree with the plan, I can experiment with librosa.load and prepare a pull request for the next version of BirdVoxDetect.

Sincerely,

Vincent.

Davyd Betchkal

Dec 7, 2020, 12:24:12 PM

to birdvox

Hello Vincent - `librosa` seems like an elegant solution. I haven't encountered that library before. In the past I worked on a project where we used `subprocess` to send commands directly to ffmpeg, which allows some of the same custom-seeking functionality, too. We're also frequently working with long audio files (1 or 2 GB split points) so I can appreciate the need for not having to read and hold the whole file in memory!
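
For readers unfamiliar with the `subprocess` approach, a sketch of how a single segment might be decoded by calling ffmpeg directly. It assumes the `ffmpeg` executable is on the PATH; the function names, the mono 16-bit output format, and the 22050 Hz default are arbitrary choices for illustration, not the commands used in the project mentioned above.

```python
import subprocess


def ffmpeg_segment_command(path, offset, duration, sample_rate=22050):
    """Build an ffmpeg call that decodes one segment to raw PCM on stdout."""
    return [
        "ffmpeg",
        "-nostdin",
        "-loglevel", "error",
        "-ss", str(offset),    # seek position in seconds (input option)
        "-t", str(duration),   # read at most this many seconds of input
        "-i", str(path),
        "-f", "s16le",         # raw signed 16-bit little-endian samples
        "-ac", "1",            # downmix to mono
        "-ar", str(sample_rate),
        "pipe:1",              # write to standard output
    ]


def decode_segment(path, offset, duration, sample_rate=22050):
    """Return the decoded segment as bytes; raises if ffmpeg fails."""
    result = subprocess.run(
        ffmpeg_segment_command(path, offset, duration, sample_rate),
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout
```

Placing `-ss` and `-t` before `-i` makes them apply to the input, so ffmpeg seeks in the source file rather than decoding from the start and discarding samples.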

Thank you for considering MP3 support. I appreciate that.

Davyd

Dan Stowell

Dec 8, 2020, 11:47:47 AM

to birdvox

Hello all,

I have a geeky method for MP3 support for long files. It works in
Linux/Mac only - and it's geeky. But here it is, in case any of you
benefit from it:

In order to read a long-duration MP3 file into BirdVoxDetect (or any
other WAV-streaming analysis), you can create a "unix pipe" and use it
together with an MP3 decoder command such as lame or sox. This will feed
the MP3 into your software without occupying any disk space.

In a new terminal window, run these 3 lines (replacing MYMP3PATH):

fifo=/tmp/mp3decoding01.fifo
mkfifo $fifo
lame --decode --silent MYMP3PATH $fifo

Then in Python, treat /tmp/mp3decoding01.fifo as if it were a WAV file
to be read into BirdVoxDetect. No extra disk space is needed, because
the decoding happens on the fly.
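
The same idea can be driven from Python instead of a second terminal. This is only a sketch under a few assumptions: it runs on Linux or macOS, `lame` is on the PATH, and the reader consumes the file front to back (a pipe cannot be rewound). The names `lame_command` and `process_via_fifo`, and the `process` callback, are hypothetical.

```python
import os
import subprocess
import tempfile


def lame_command(mp3_path, fifo_path):
    """The decoder call from the shell lines above, as an argument list."""
    return ["lame", "--decode", "--silent", str(mp3_path), str(fifo_path)]


def process_via_fifo(mp3_path, process, make_command=lame_command):
    """Decode mp3_path into a named pipe and pass the pipe's path to process()."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        fifo_path = os.path.join(tmp_dir, "decoded.fifo")
        os.mkfifo(fifo_path)  # Unix only
        # The decoder blocks until a reader opens the pipe, so it is started
        # in the background before process() opens the pipe for reading.
        decoder = subprocess.Popen(make_command(mp3_path, fifo_path))
        try:
            return process(fifo_path)
        finally:
            try:
                decoder.wait(timeout=5)
            except subprocess.TimeoutExpired:
                # The reader stopped early or never opened the pipe.
                decoder.kill()
                decoder.wait()
```

Using a fresh temporary directory for the pipe avoids clashes when several files are decoded at the same time.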

I don't think this can be done on Windows.

Best
Dan

--
Dan Stowell
Lecturer
Machine Listening Lab
Centre for Digital Music
Queen Mary, University of London
Mile End Road, London E1 4NS
http://www.mcld.co.uk/research/
http://machine-listening.eecs.qmul.ac.uk/
