Temporally Strong Labels - ICASSP 2021 paper

Arjun Pankajakshan

May 13, 2022, 9:34:11 AM
to audioset-users
Hi everyone,

Many thanks to the AudioSet team for releasing temporally strong labels for a large amount of data.

I have a question about how to make full use of these strong labels for the audio classification task. The strong labels are released with a temporal resolution of approximately 0.1 s, whereas the AudioSet embeddings are computed on a 960 ms (roughly 1 s) grid.

As I understand it, audio classification with the current AudioSet embeddings, of shape (N, 10, 128), offers the following training setups:
  • Weakly labelled training with a label resolution of 10 s
  • Strongly labelled training with a label resolution of 1 s
This means that although the strong labels have a temporal resolution of 0.1 s, we cannot exploit that resolution in training unless the embeddings are recomputed at 100 ms resolution, giving (N, 100, 128).
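
To make the second setup concrete, here is a minimal sketch of how strong-label events could be mapped onto the 10 embedding frames of a clip. It rests on several assumptions that are not confirmed by the paper: events are given as (start, end, label) tuples in seconds relative to the clip start, the frames are non-overlapping and 0.96 s long, and the function name, class names, and event times are hypothetical. Keeping the overlap fraction as a soft target is just one possible way to retain some of the finer timing information.

```python
import numpy as np

def events_to_frame_targets(events, class_index, n_frames=10, frame_len=0.96):
    """Return an (n_frames, n_classes) array of per-frame overlap fractions.

    Assumes non-overlapping frames of `frame_len` seconds starting at t = 0.
    """
    targets = np.zeros((n_frames, len(class_index)), dtype=np.float32)
    for start, end, label in events:
        col = class_index[label]
        for i in range(n_frames):
            f_start = i * frame_len
            f_end = f_start + frame_len
            # Duration of the event that falls inside this frame.
            overlap = min(end, f_end) - max(start, f_start)
            if overlap > 0:
                # Accumulate the covered fraction, capped at 1.0 in case
                # several events of the same class touch the same frame.
                targets[i, col] = min(1.0, targets[i, col] + overlap / frame_len)
    return targets

# Hypothetical example clip with two classes and two annotated events.
class_index = {"Speech": 0, "Dog": 1}
events = [(0.0, 2.4, "Speech"), (1.92, 2.16, "Dog")]

soft = events_to_frame_targets(events, class_index)  # fractional frame targets
hard = (soft > 0).astype(np.float32)                 # binary ~1 s frame targets
weak = hard.max(axis=0)                              # clip-level (10 s) targets
```

Here `hard` corresponds to strongly labelled training at the frame resolution, and `weak` to the usual clip-level training. An event shorter than a frame (the 0.24 s "Dog" event) still marks its whole frame in `hard`, which is exactly the loss of resolution described above.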

Can someone please comment on this? Please correct me if I have got this wrong or have missed anything in the paper or in any follow-up updates.


Arjun
PhD scholar 
C4DM, QMUL


Robert Mcanany

May 16, 2022, 10:27:00 AM
to Arjun Pankajakshan, audioset-users
I'm no expert, but that is my understanding as well.