Clotho v2 is published.

Konstantinos Drossos

May 10, 2021, 8:41:40 AM
to DCASE Discussions
Dear all, 

We are happy to announce that Clotho v2 is now published on Zenodo. You can find Clotho v2 under the same Zenodo entry as Clotho v1 (https://zenodo.org/record/4743815). 

Clotho v2 contains around 2000 more audio files than Clotho v1 (an increase of roughly 40%), which provide more training data and a new validation split. The evaluation and testing splits are unchanged, i.e. they are the same in v2 as in v1. Each audio file is 15 to 30 seconds long and has five captions of 8 to 20 words each. 

Changes in version 2: 

In version 2 of Clotho, audio files are added to the development split and a new validation split is introduced. There are no changes in the evaluation split. 

Specifically: 
  • The development split now has 3840 audio files: the 2893 audio files of Clotho version 1 plus 947 new ones. 
  • The new validation split consists of 1046 new audio files. 
  • There is no overlap of audio files between splits.
All new captions are treated as in version 1 of Clotho, i.e. they have word consistency, no named entities, no speech transcription, and no hapax legomena between splits (i.e. words appearing in only one of the splits). 
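
For anyone who wants to sanity-check these properties after downloading, a minimal sketch in Python could look like the following. It only assumes that you have already loaded, for each split, a list of file names and a list of caption strings. The function names, the simple regex tokenizer, and the toy data are illustrative assumptions, not part of the Clotho tooling.

```python
import re
from collections import Counter

# File counts quoted in the announcement above.
V1_DEVELOPMENT = 2893
NEW_DEVELOPMENT = 947
NEW_VALIDATION = 1046
assert V1_DEVELOPMENT + NEW_DEVELOPMENT == 3840   # development split in v2
assert NEW_DEVELOPMENT + NEW_VALIDATION == 1993   # "around 2000" new files


def files_in_more_than_one_split(files_by_split):
    """Return the file names that occur in more than one split."""
    seen = Counter()
    for files in files_by_split.values():
        seen.update(set(files))  # count each file once per split
    return {name for name, n in seen.items() if n > 1}


def words_in_only_one_split(captions_by_split):
    """Return the words whose captions all belong to a single split."""
    # Assumption: lowercase words split on non-letters; the tokenization
    # used to build Clotho may differ.
    vocab_by_split = {
        split: {w for c in captions for w in re.findall(r"[a-z']+", c.lower())}
        for split, captions in captions_by_split.items()
    }
    counts = Counter(w for vocab in vocab_by_split.values() for w in vocab)
    return {w for w, n in counts.items() if n == 1}


# Toy data (hypothetical file names and captions, not taken from Clotho).
toy_files = {
    "development": ["a.wav", "b.wav"],
    "validation": ["c.wav"],
    "evaluation": ["d.wav"],
}
toy_captions = {
    "development": ["A dog barks loudly."],
    "validation": ["A dog barks."],
}
print(files_in_more_than_one_split(toy_files))
print(words_in_only_one_split(toy_captions))
```

On the real data, both functions should return empty sets if the splits behave as described above, provided the tokenization matches the one used when the dataset was built.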

Best, 

Kostas