Temporally Strong Labels - ICASSP 2021 paper

Arjun Pankajakshan

May 13, 2022, 9:34:11 AM

to audioset-users

Hi everyone,

Many thanks to the AudioSet team for releasing temporally strong labels for a large amount of data.

I have a question about how to make full use of these strong labels for the audio classification task: the strong labels are released with a temporal resolution of approximately 0.1 s, whereas the AudioSet embeddings are computed on a 960 ms (roughly 1 s) grid.

As I understand it, audio classification with the current AudioSet embeddings, of shape (N, 10, 128), offers the following training setups:

Weakly labelled training with a label resolution of 10 s
Strongly labelled training with a label resolution of about 1 s

This means that although the temporally strong labels have a resolution of 0.1 s, we cannot exploit that resolution in training unless the embeddings are recomputed at 100 ms resolution, i.e. with shape (N, 100, 128).
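To make the second setup concrete, here is a minimal sketch of projecting strong-label events onto the coarse embedding grid. It assumes the events are already parsed into (start_sec, end_sec, class_index) tuples and that the 10 embedding frames are contiguous and 0.96 s long; the function name rasterize_events and the example events are hypothetical, not part of any AudioSet release. It returns the fraction of each frame covered by each class, which can then be thresholded into per-frame targets.

```python
import numpy as np

def rasterize_events(events, num_classes, num_frames=10, frame_len=0.96):
    """Project (start_sec, end_sec, class_index) events onto a coarse grid.

    Assumption: frame i covers [i * frame_len, (i + 1) * frame_len).
    Returns a (num_frames, num_classes) array holding the fraction of each
    frame covered by events of each class. Contributions from several events
    of the same class are summed and then clipped to 1.0.
    """
    frame_starts = np.arange(num_frames) * frame_len
    frame_ends = frame_starts + frame_len
    coverage = np.zeros((num_frames, num_classes), dtype=np.float32)
    for start, end, cls in events:
        # Length of the intersection between the event and every frame.
        overlap = np.minimum(frame_ends, end) - np.maximum(frame_starts, start)
        coverage[:, cls] += np.clip(overlap, 0.0, None) / frame_len
    return np.clip(coverage, 0.0, 1.0)

# Hypothetical events for one 10 s clip with two classes.
events = [(0.0, 1.5, 0), (5.0, 5.5, 1)]
soft = rasterize_events(events, num_classes=2)  # fractional coverage per frame
hard = (soft > 0).astype(np.float32)            # per-frame (about 1 s) targets
weak = hard.max(axis=0)                         # clip-level (10 s) targets
```

The fractional values in soft retain a little of the 0.1 s information even on the 1 s grid, but whether training on them helps is not something this sketch establishes.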

Could someone please comment on this? Please correct me if I have this wrong or have missed anything in the paper or in any follow-up updates.

Arjun

PhD scholar

C4DM, QMUL

Robert Mcanany

May 16, 2022, 10:27:00 AM

to Arjun Pankajakshan, audioset-users

I'm no expert, but that is my understanding as well.
