Extracting audio metadata into a csv file for making wav.scp and other files


Sage Khan

Jun 25, 2022, 9:32:04 AM
to kaldi-help
I am trying to prepare my audio data for training an ASR system with Kaldi. I would like a Python or Bash script that collects metadata from all the audio files in a folder and saves it to a CSV file. The fields I need are:
- Audio file name
- Audio file extension
- Sample rate or bit rate
- Audio file path (e.g. ~/ProgramFiles/kaldi/data)
- Duration
- Transcript
- Speaker-ID
- Speaker-Name
- Gender
The folders are organised as {DATA-ROOT}/{LANGUAGE-NAME}/{GENDER}/{SPEAKER-NAME}/xxx.wav, where xxx.wav is named Story123, Sentence1,2,3 or word1,2,3, etc.
The script should take a folder as input, search it, and write these details into the columns listed above. Each directory containing wav files also holds the matching transcripts as txt files, so each transcript should appear on the same CSV row as its audio file's details. For example, sentence1.wav has a sentence1.txt containing its transcript.
Quick help in this regard would be appreciated.
Regards

Sage Khan

Jun 28, 2022, 12:33:53 PM
to kaldi-help
This video is a good resource for answering this question:

https://www.youtube.com/watch?v=IEMVk7r8_-M
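
For the folder layout described in the question, a minimal Python sketch of the metadata scan might look like the following. It is not taken from the video or from any attached script. The names scan_audio, write_csv, and FIELDS, the column names, and the sequential spk0001-style speaker IDs are all assumptions made for illustration. It uses only the standard-library wave module, so it assumes uncompressed PCM .wav files.

```python
import csv
import wave
from pathlib import Path

# Hypothetical column names, chosen to mirror the fields listed in the question.
FIELDS = ["file_name", "extension", "sample_rate", "path", "duration_sec",
          "transcript", "speaker_id", "speaker_name", "gender"]


def scan_audio(data_root):
    """Return one metadata dict per .wav file found under data_root.

    Assumes the layout {DATA-ROOT}/{LANGUAGE-NAME}/{GENDER}/{SPEAKER-NAME}/xxx.wav
    with the transcript stored next to the audio as xxx.txt.
    """
    root = Path(data_root)
    rows = []
    speaker_ids = {}  # (language, gender, speaker) -> assumed sequential ID
    for wav_path in sorted(root.rglob("*.wav")):
        parts = wav_path.relative_to(root).parts
        if len(parts) != 4:
            continue  # skip files that do not match the expected layout
        language, gender, speaker_name, _ = parts
        key = (language, gender, speaker_name)
        speaker_id = speaker_ids.setdefault(key, f"spk{len(speaker_ids) + 1:04d}")

        # Read the sample rate and frame count from the WAV header.
        with wave.open(str(wav_path), "rb") as wav:
            sample_rate = wav.getframerate()
            duration = wav.getnframes() / sample_rate

        # The transcript is expected in a .txt file with the same base name.
        txt_path = wav_path.with_suffix(".txt")
        transcript = ""
        if txt_path.exists():
            transcript = " ".join(txt_path.read_text(encoding="utf-8").split())

        rows.append({
            "file_name": wav_path.stem,
            "extension": wav_path.suffix.lstrip("."),
            "sample_rate": sample_rate,
            "path": str(wav_path),
            "duration_sec": round(duration, 3),
            "transcript": transcript,
            "speaker_id": speaker_id,
            "speaker_name": speaker_name,
            "gender": gender,
        })
    return rows


def write_csv(rows, csv_path):
    """Write the collected rows to csv_path with a header line."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

A call such as write_csv(scan_audio("path/to/data-root"), "metadata.csv") would then produce the CSV. Files that are missing a transcript get an empty transcript column rather than being skipped, so they are easy to spot afterwards.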

Sage Khan

Jul 5, 2022, 4:24:22 AM
to kaldi-help

These scripts should also help:
txt-transcript-combiner.sh
wavscp-maker.sh
audioscanner.py
metadat.py
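
Since the goal in the thread title is to make wav.scp and the other Kaldi data files, here is a rough sketch of how metadata rows could be turned into wav.scp, text, and utt2spk. It is not the content of the scripts listed above. The function name write_kaldi_files and the row keys (speaker_id, file_name, path, transcript) are assumptions for illustration. It relies on Kaldi's documented conventions: each line starts with the utterance ID, the files are sorted, and prefixing the utterance ID with the speaker ID keeps the sort order consistent across files.

```python
from pathlib import Path


def write_kaldi_files(rows, out_dir):
    """Write wav.scp, text and utt2spk into out_dir from metadata rows.

    Each row is a dict with the (assumed) keys
    'speaker_id', 'file_name', 'path' and 'transcript'.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Prefix the utterance ID with the speaker ID, then sort by utterance ID.
    # Python sorts strings by code point, which matches C-locale order
    # for plain ASCII IDs.
    entries = sorted(
        ((f"{row['speaker_id']}-{row['file_name']}", row) for row in rows),
        key=lambda entry: entry[0],
    )

    with open(out / "wav.scp", "w", encoding="utf-8", newline="\n") as wav_scp, \
         open(out / "text", "w", encoding="utf-8", newline="\n") as text, \
         open(out / "utt2spk", "w", encoding="utf-8", newline="\n") as utt2spk:
        for utt_id, row in entries:
            # Collapse whitespace so each transcript stays on a single line.
            transcript = " ".join(row["transcript"].split())
            wav_scp.write(f"{utt_id} {row['path']}\n")
            text.write(f"{utt_id} {transcript}\n")
            utt2spk.write(f"{utt_id} {row['speaker_id']}\n")
```

The remaining spk2utt file can be generated from utt2spk with Kaldi's utils/utt2spk_to_spk2utt.pl, and utils/validate_data_dir.sh can be used to check the resulting directory.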