Extracting audio metadata into a csv file for making wav.scp and other files

Sage Khan

Jun 25, 2022, 9:32:04 AM

to kaldi-help

I am trying to process my audio data for training an ASR system with Kaldi. I want a Python or bash script that collects metadata from all audio files in a folder and saves it to a CSV file. The fields I need are:

- Audio file name
- Audio file extension
- Sample rate or bit rate
- Audio file path (e.g. ~/ProgramFiles/kaldi/data)
- Duration
- Transcript
- Speaker-ID
- Speaker-Name
- Gender

I have organized the folders as {DATA-ROOT}/{LANGUAGE-NAME}/{GENDER}/{SPEAKER-NAME}/xxx.wav, where xxx.wav is named like Story123, Sentence1, Sentence2, Sentence3 or word1, word2, word3, etc.

The script (Python or bash) should take a folder as input, search it for audio files, and write the details listed above into CSV columns. Each directory containing WAV files also holds the associated transcripts as .txt files, so each transcript should appear in the same CSV row as its audio details. For example, sentence1.wav has a sentence1.txt containing its transcript.
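To make the request concrete, here is a minimal Python sketch of such a scanner. It is only an illustration under several assumptions: the files are uncompressed PCM WAV (so the standard-library wave module can read them), the layout is exactly {DATA-ROOT}/{LANGUAGE-NAME}/{GENDER}/{SPEAKER-NAME}/xxx.wav, and speaker IDs are simply numbered in the order speakers are first seen. The function names, column names, and the spk001-style ID scheme are all hypothetical.

```python
import csv
import wave
from pathlib import Path

# Hypothetical column names for the CSV; rename to taste.
FIELDS = ["file_name", "extension", "sample_rate", "path", "duration_sec",
          "transcript", "speaker_id", "speaker_name", "gender", "language"]


def scan_audio(data_root):
    """Yield one metadata dict per .wav file found under data_root."""
    root = Path(data_root)
    speaker_ids = {}  # speaker name -> generated ID (assumed numbering scheme)
    for wav_path in sorted(root.rglob("*.wav")):
        parts = wav_path.relative_to(root).parts
        if len(parts) != 4:
            continue  # not LANGUAGE/GENDER/SPEAKER/file.wav, so skip it
        language, gender, speaker, _ = parts

        # The wave module only handles uncompressed PCM WAV files.
        with wave.open(str(wav_path), "rb") as w:
            rate = w.getframerate()
            duration = w.getnframes() / rate

        # The transcript is assumed to sit next to the audio: xxx.wav -> xxx.txt
        txt_path = wav_path.with_suffix(".txt")
        transcript = ""
        if txt_path.exists():
            transcript = " ".join(txt_path.read_text(encoding="utf-8").split())

        spk_id = speaker_ids.setdefault(speaker, f"spk{len(speaker_ids) + 1:03d}")
        yield {
            "file_name": wav_path.stem,
            "extension": wav_path.suffix.lstrip("."),
            "sample_rate": rate,
            "path": str(wav_path),
            "duration_sec": round(duration, 3),
            "transcript": transcript,
            "speaker_id": spk_id,
            "speaker_name": speaker,
            "gender": gender,
            "language": language,
        }


def write_csv(rows, out_path):
    """Write the metadata rows to a CSV file with a header line."""
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Calling write_csv(scan_audio("/path/to/DATA-ROOT"), "metadata.csv") would then produce one row per WAV file. Files that are not plain PCM WAV would need a different reader (for example a tool such as soxi or ffprobe), which this sketch does not cover.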

Quick help in this regard will be appreciated.

Regards

Sage Khan

Jun 28, 2022, 12:33:53 PM

to kaldi-help

This video is a good resource for answering this question:

https://www.youtube.com/watch?v=IEMVk7r8_-M

Sage Khan

Jul 5, 2022, 4:24:22 AM

to kaldi-help

These scripts should also help:

txt-transcript-combiner.sh

wavscp-maker.sh

audioscanner.py

metadat.py
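For anyone who only needs the general idea of the wav.scp step, here is a rough Python sketch. It is an illustration, not the contents of the files listed above. It assumes each metadata row is a dict with the hypothetical keys speaker_id, file_name, path, and transcript, and it builds utterance IDs as speaker ID plus file name so that utterances sort together by speaker, as Kaldi's data preparation expects.

```python
from pathlib import Path


def write_kaldi_files(rows, out_dir):
    """Write wav.scp, text and utt2spk from metadata rows (dicts)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Prefix the utterance ID with the speaker ID (assumed naming scheme).
    utts = {f"{r['speaker_id']}-{r['file_name']}": r for r in rows}

    with open(out / "wav.scp", "w", encoding="utf-8") as wav_scp, \
         open(out / "text", "w", encoding="utf-8") as text, \
         open(out / "utt2spk", "w", encoding="utf-8") as utt2spk:
        # Kaldi expects these files sorted by utterance ID (C-locale order).
        for utt_id in sorted(utts):
            row = utts[utt_id]
            # Collapse whitespace so each transcript stays on one line.
            transcript = " ".join(row["transcript"].split())
            wav_scp.write(f"{utt_id} {row['path']}\n")
            text.write(f"{utt_id} {transcript}\n")
            utt2spk.write(f"{utt_id} {row['speaker_id']}\n")
```

The rows can come from csv.DictReader over the metadata CSV. The spk2utt file is not written here; Kaldi's utils/utt2spk_to_spk2utt.pl can generate it from utt2spk.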
