How can I easily download the audio files instead of the provided features?

Jun Deng

Mar 8, 2017, 2:05:11 AM

to audioset-users

First of all, thanks a lot for sharing such a large audio event database with the community. Currently, only the feature dataset and CSV files are provided. I know it is possible to download the videos based on the provided CSV files, but downloading video data at such a large scale from YouTube would be problematic. Hence, I would like to know if there is a way to download the audio files.

Jort Gemmeke

Mar 8, 2017, 3:01:50 AM

to Jun Deng, audioset-users

Hi Jun,

Thanks for your interest in the dataset! I am afraid we don't support downloading the audio as it's against the YouTube terms of service.

We hope to release the model used to extract the features, which will allow you to use the dataset to build classifiers for your own data.

Jort

--
You received this message because you are subscribed to the Google Groups "audioset-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to audioset-users+unsubscribe@googlegroups.com.
To post to this group, send email to audiose...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/audioset-users/ca8094ca-1475-4593-8a90-975e7bd053b8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jun Deng

Mar 8, 2017, 6:26:11 AM

to audioset-users

Hi Jort,

Thanks for your quick response. It would also be very cool if you released the trained CNN models.

Best,

Jun

paulo.ch...@gmail.com

Mar 10, 2017, 11:20:35 AM

to audioset-users

Hi guys.

That's great.

Actually, I am trying to figure out the VGG-based acoustic model that produces the 128 features from patches of 96 x 64 bins.

Also, which sampling frequency did you use in the experiments?

Regards,

Paulo

Göran Sandström

Mar 16, 2017, 4:43:47 AM

to audioset-users

This would be awesome. I guess it would be easy to retrain it on other labels by replacing the sigmoid layer?

Jort Gemmeke

Mar 21, 2017, 1:14:33 PM

to audioset-users

The model will generate the audio features we released. You can train your own (shallow) layer on top of those features, and combine the two models for inference on your own data.
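
To make the "shallow layer on top of the features" idea concrete, here is a minimal sketch, not the team's code. It assumes each example is a single 128-dimensional embedding and trains one sigmoid unit with plain SGD; the toy data, the function names, and the hyperparameters are all hypothetical stand-ins, and the feature-extraction model itself is not shown.

```python
import math
import random

EMBEDDING_DIM = 128  # assumed size of one released feature vector


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_shallow_layer(features, labels, epochs=20, lr=0.1):
    """Fit one sigmoid unit (logistic regression) with plain SGD."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss w.r.t. the pre-activation
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Return the probability that embedding x belongs to the class."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def make_toy_embeddings(n_per_class, seed=0):
    """Hypothetical stand-in for real embeddings: two Gaussian clusters."""
    rng = random.Random(seed)
    features, labels = [], []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            features.append([rng.gauss(center, 0.3) for _ in range(EMBEDDING_DIM)])
            labels.append(label)
    return features, labels


features, labels = make_toy_embeddings(n_per_class=20)
weights, bias = train_shallow_layer(features, labels)
accuracy = sum(
    (predict(weights, bias, x) >= 0.5) == bool(y) for x, y in zip(features, labels)
) / len(labels)
print(f"training accuracy on toy data: {accuracy:.2f}")
```

For a multi-label task you would presumably train one such unit per class. At inference time, the embedding model would turn your own audio into feature vectors and `predict` would be applied to each of them.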

Alex Nichol

Apr 3, 2017, 7:45:17 PM

to audioset-users

While I suppose nobody should encourage it, it is theoretically possible to do the download straight from YouTube.

Downloading the balanced training/evaluation sets only takes a day or two, since YouTube has a private API for fetching the raw audio of a video. See this repo: https://github.com/unixpickle/audioset/tree/bddd5c7e5d6e8b6fb565943ec5c608c3a8c7f8e7.

Blunt3k

Apr 12, 2017, 3:16:10 AM

to audioset-users

Firstly, thanks for releasing such great resources to the research world.

Do you still plan to release the models that generate the features, so we can play around with adding a shallow layer?

Manoj Plakal

Apr 12, 2017, 12:46:26 PM

to Blunt3k, audioset-users

Yes, we are currently working on disentangling the model and supporting code from our internal infrastructure so that it is usable outside Google, while remaining comparable to the original model we used to make the released features.

We'll update the mailing list when the model is available.

Edouard Oyallon

Jun 20, 2017, 2:40:52 PM

to audioset-users

Hi, first of all, thanks a lot for this valuable information! I was wondering whether, by chance, there is a list somewhere of the 485 categories used for the classification task, or a script to get them. That would be wonderful.

Best regards,

EO

Dan Ellis

Jun 21, 2017, 2:02:16 PM

to Edouard Oyallon, audioset-users

The subset of 485 AudioSet classes reflected those for which we had sufficient examples to evaluate at the time the paper was finalized.

By the time we made the actual AudioSet public data release, our additional annotation effort meant that we had 527 classes with sufficient examples. These are the 527 unique MIDs that appear in our release lists, e.g. http://storage.googleapis.com/us_audioset/youtube_corpus/v1/csv/eval_segments.csv. You can extract the list, for instance, with:

grep -v "#" eval_segments.csv | awk '{print $4}' | tr -d '"' | tr ',' '\n' | sort | uniq

They're also the first field of http://storage.googleapis.com/us_audioset/youtube_corpus/v1/qa/qa_true_counts.csv.

You can convert the MIDs to display names by looking them up in the main ontology file, https://github.com/audioset/ontology/blob/master/ontology.json
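
For anyone who prefers Python over the shell pipeline, the same extraction plus the name lookup could look roughly like the sketch below. It assumes the segments file has "#" header lines followed by rows whose fourth field is a quoted, comma-separated list of MIDs, and that the ontology file is a JSON list of objects with "id" and "name" keys. The function names and the sample rows are made up for illustration.

```python
import csv
import json


def unique_mids(segment_lines):
    """Collect the distinct label MIDs from lines of a segments CSV."""
    # Skip lines that start with "#", similar to `grep -v "#"`.
    data_lines = (line for line in segment_lines if not line.startswith("#"))
    mids = set()
    # skipinitialspace lets the csv module treat the quoted label list
    # that follows ", " as a single field.
    for row in csv.reader(data_lines, skipinitialspace=True):
        if len(row) >= 4:
            mids.update(row[3].split(","))
    return sorted(mids)


def display_names(mids, ontology_entries):
    """Map each MID to its display name, or None if it is not listed."""
    name_by_id = {entry["id"]: entry["name"] for entry in ontology_entries}
    return {mid: name_by_id.get(mid) for mid in mids}


# Tiny made-up sample in the assumed layout (not real AudioSet rows).
sample_segments = [
    "# YTID, start_seconds, end_seconds, positive_labels",
    'aaaaaaaaaaa, 0.000, 10.000, "/m/x1,/m/x2"',
    'bbbbbbbbbbb, 30.000, 40.000, "/m/x2"',
]
sample_ontology = json.loads(
    '[{"id": "/m/x1", "name": "Example sound A"},'
    ' {"id": "/m/x2", "name": "Example sound B"}]'
)

mids = unique_mids(sample_segments)
names = display_names(mids, sample_ontology)
for mid in mids:
    print(mid, names[mid])
```

With the real files, you would pass an open file handle for eval_segments.csv to `unique_mids` and the result of `json.load` on ontology.json to `display_names`.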

We haven't reported results on the (527 class) released eval set, and of course there are multiple approaches. We're approaching release of the model used to generate the embedding features we released; we'll try to include a baseline AudioSet classifier (with results) as well.

DAn.

Edouard Oyallon

Jun 27, 2017, 1:38:03 PM

to audioset-users

Hi,

Thanks a lot for your answer. We're having some trouble obtaining the data; for example, the video on line 53 of "eval_segments.csv" is not available (it corresponds to https://www.youtube.com/watch?v=5CGQGSFGyg ).

Apparently, 5 to 10% of the videos are not available. Any suggestions?

Thanks a lot again

Edouard

Ryan Monroe

Jun 27, 2017, 6:35:05 PM

to audioset-users

Guessing that many videos were either deleted or made private. We will not be able to access them. It's not a big chunk of the data though!

Frank Aadam

Jun 28, 2017, 3:57:44 AM

to audioset-users

Thank you for your great work. Has the model used to extract the features been released? If not, is there any difficulty in releasing it?

Aadam

Dan Ellis

Jun 28, 2017, 9:27:46 AM

to Ryan Monroe, audioset-users

On Tue, Jun 27, 2017 at 6:35 PM, Ryan Monroe <ryan.m...@gmail.com> wrote:

Guessing that many videos were either deleted or made private. We will not be able to access them. It's not a big chunk of the data though!

It's true. Using public YouTube links makes us entirely dependent on the original uploaders: they can delete or hide the video at any time.

We tried to minimize this by using only videos that had at least 1000 views and that had been up for at least a few months before we made the list, but we've seen anything up to 1% per month "erosion" of these kinds of YouTube sets.
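
To get a feel for what that erosion figure implies over time, here is a back-of-the-envelope sketch. It assumes a constant loss rate that compounds monthly, which is a simplification; the 1% figure is the upper bound quoted above, and the function name is made up.

```python
def remaining_fraction(monthly_loss, months):
    """Fraction of videos still available if a constant share is lost each month."""
    return (1.0 - monthly_loss) ** months


# Upper-bound rate of 1% per month over a few time horizons.
for months in (3, 6, 12, 24):
    left = remaining_fraction(0.01, months)
    print(f"after {months:2d} months: {left:.1%} still available")
```

At the upper-bound rate, roughly 11% of the videos would be gone after a year under this assumption.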

However, the example you cite is actually a mistransliteration of the YTID. Note that the YTIDs in eval_segments.csv are sorted. Because the base64 alphabet used in YTIDs includes "-" and "_", and because "-" sorts lower than the letters, the first few hundred eval YTIDs begin with "-". Edouard dropped the leading "-" when constructing the URL. The tip-off is that the page says that the video doesn't exist, rather than "This video has been deleted" or something like that.

So the actual video referred to on line 53 of eval_segments.csv is https://www.youtube.com//watch?v=-5CGQGSFGyg (note the minus), which is still there.
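
A small sketch of how one might build the URL so that a leading "-" is never lost. It assumes YTIDs are always 11 characters drawn from letters, digits, "-" and "_"; the function name `watch_url` and the length check are illustrative, not part of any official tooling.

```python
import string

# Characters of the URL-safe base64 alphabet that appear in YTIDs.
YTID_ALPHABET = set(string.ascii_letters + string.digits + "-_")
YTID_LENGTH = 11  # assumption: YTIDs are 11 characters long


def watch_url(ytid):
    """Build a watch URL from a YTID, keeping any leading "-" or "_"."""
    # Strip whitespace only; never strip punctuation, it is part of the ID.
    ytid = ytid.strip()
    if len(ytid) != YTID_LENGTH or not set(ytid) <= YTID_ALPHABET:
        raise ValueError(f"not a valid-looking YTID: {ytid!r}")
    return "https://www.youtube.com/watch?v=" + ytid


print(watch_url("-5CGQGSFGyg"))

# "-" sorts before digits and letters, so such IDs come first in a sorted list.
print(sorted(["a5CGQGSFGyg", "-5CGQGSFGyg", "05CGQGSFGyg"]))
```

With a check like this, the 10-character ID from the truncated URL would raise an error instead of silently pointing at a page that does not exist.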

DAn.

Manoj Plakal

Aug 8, 2017, 5:29:19 PM

to Blunt3k, audioset-users

An update: the model is now available. Please see https://groups.google.com/forum/#!topic/audioset-users/u69WCaBMeQg
