Make_mfcc.sh failed - Augmented voxceleb data

Bar Madar

Sep 3, 2019, 6:16:13 AM

to kaldi...@googlegroups.com

Hey,

I am trying to extract MFCCs from augmented VoxCeleb data.

I am using these lines for the augmentation process:

local/make_musan.sh $data_root/musan data

for name in speech noise music; do
  utils/data/get_utt2dur.sh data/musan_${name}
  mv data/musan_${name}/utt2dur data/musan_${name}/reco2dur
done

python3 steps/data/augment_data_dir.py --utt-suffix "babble" --bg-snrs "20:17:15:13" --num-bg-noises "3:4:5:6:7" --bg-noise-dir "data/musan_speech" data/voxceleb data/voxceleb_babble

After the augmentation, I take a random subset:

utils/subset_data_dir.sh data/voxceleb_babble 1000 data/voxceleb_babble_1k
utils/fix_data_dir.sh data/voxceleb_babble_1k
and then run the make_mfcc script:
steps/make_mfcc.sh --mfcc-config conf/mfcc.conf --nj 40 --cmd "$train_cmd" \
data/voxceleb_babble_1k exp/make_mfcc $mfccdir
The script fails, and this is the error I get in the log file:
# compute-mfcc-feats --write-utt2dur=ark,t:exp/make_mfcc/utt2dur.1 --verbose=2 --config=conf/mfcc.conf scp,p:exp/make_mfcc/wav_voxceleb_babble_1k.1.scp ark:- | copy-feats --write-num-frames=ark,t:exp/make_mfcc/utt2num_frames.1 --compress=true ark:- ark,scp:/common_space_docker/kaldi/egs/sre16/v2_SRE18data_full_train/mfcc/raw_mfcc_voxceleb_babble_1k.1.ark,/common_space_docker/kaldi/egs/sre16/v2_SRE18data_full_train/mfcc/raw_mfcc_voxceleb_babble_1k.1.scp 
# Started at Tue Sep  3 09:36:57 UTC 2019
#
copy-feats --write-num-frames=ark,t:exp/make_mfcc/utt2num_frames.1 --compress=true ark:- ark,scp:/common_space_docker/kaldi/egs/sre16/v2_SRE18data_full_train/mfcc/raw_mfcc_voxceleb_babble_1k.1.ark,/common_space_docker/kaldi/egs/sre16/v2_SRE18data_full_train/mfcc/raw_mfcc_voxceleb_babble_1k.1.scp 
compute-mfcc-feats --write-utt2dur=ark,t:exp/make_mfcc/utt2dur.1 --verbose=2 --config=conf/mfcc.conf scp,p:exp/make_mfcc/wav_voxceleb_babble_1k.1.scp ark:- 
wav-reverberate --shift-output=true '--additive-signals=wav-reverberate --duration=5.504 "sox -t wav /common_space_docker/TDNN_train/musan/speech/us-gov/speech-us-gov-0026.wav -r 8k -t wav - |" - |,wav-reverberate --duration=5.504 "sox -t wav /common_space_docker/TDNN_train/musan/speech/librivox/speech-librivox-0113.wav -r 8k -t wav - |" - |,wav-reverberate --duration=5.504 "sox -t wav /common_space_docker/TDNN_train/musan/speech/us-gov/speech-us-gov-0178.wav -r 8k -t wav - |" - |,wav-reverberate --duration=5.504 "sox -t wav /common_space_docker/TDNN_train/musan/speech/librivox/speech-librivox-0078.wav -r 8k -t wav - |" - |,wav-reverberate --duration=5.504 "sox -t wav /common_space_docker/TDNN_train/musan/speech/us-gov/speech-us-gov-0134.wav -r 8k -t wav - |" - |' --start-times=0,0,0,0,0 --snrs=20,17,17,17,13 - - 
wav-reverberate --duration=5.504 'sox -t wav /common_space_docker/TDNN_train/musan/speech/us-gov/speech-us-gov-0026.wav -r 8k -t wav - |' - 
sox WARN rate: rate clipped 270 samples; decrease volume?
sox WARN dither: dither clipped 230 samples; decrease volume?
ASSERTION_FAILED (wav-reverberate[5.5.0~1-c449]:main():wav-reverberate.cc:286) Assertion failed: (samp_freq == samp_freq_input)

[ Stack-Trace: ]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x82c) [0x7f85943dd2aa]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)+0x6c) [0x7f85943ddd18]
wav-reverberate(main+0xfab) [0x405623]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f859378c830]
wav-reverberate(_start+0x29) [0x403c79]

Aborted (core dumped)
ERROR (compute-mfcc-feats[5.5.0~1-c449]:Read4ByteTag():wave-reader.cc:56) WaveData: expected 4-byte chunk-name, got read error
I have the line "--allow-downsample=true" in the mfcc.conf file.
I succeeded in extracting MFCCs for reverb augmentation using "reverberate_data_dir.py" with "--source-sampling-rate 16000".
Do you have any idea how to solve the problem?

Thanks!
Bar

Daniel Povey

Sep 3, 2019, 7:27:46 AM

to kaldi-help, David Snyder, Vimal Manohar

Hm.

The data you are trying to augment, is that 8kHz data or 16kHz data? If it's 16kHz data and you want your system to be an 8kHz system, you might have to explicitly downsample it using a sox command by modifying the wav.scp (each entry would become a command ending in a pipe symbol, |), because those reverberation commands probably don't support automatic downsampling.
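
A minimal sketch of the explicit downsampling described here, assuming plain awk and a wav.scp whose entries are either a file path or a command ending in a pipe. The function name downsample_wav_scp is hypothetical, and the sox arguments mirror the ones visible in the log above.

```shell
# Hypothetical helper: print a copy of a Kaldi wav.scp in which every
# entry is resampled to 8 kHz by sox. The input file is not modified.
downsample_wav_scp() {
  awk '{
    id = $1
    rest = $0
    # Strip the utterance/recording ID, keeping the rest of the line verbatim.
    sub(/^[^ \t]+[ \t]+/, "", rest)
    if (rest ~ /[|][ \t]*$/) {
      # Entry is already a command ending in a pipe: chain sox onto its output.
      print id " " rest " sox -t wav - -r 8k -t wav - |"
    } else {
      # Entry is a plain file path: let sox read the file directly.
      print id " sox -t wav " rest " -r 8k -t wav - |"
    }
  }' "$1"
}
```

Something like "downsample_wav_scp data/voxceleb/wav.scp > wav_8k.scp" would write the rewritten list to a new file, which can be inspected before it replaces the original wav.scp.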

Dan

--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/CAFHC87EyrrvNgA4yAbnNDnO8ZtnaXB-De35Ox1awC%2Bp80i59NQ%40mail.gmail.com.

David Snyder

Sep 3, 2019, 9:56:56 AM

to kaldi-help

Take a look at your wav.scp in your augmentation directories, e.g., data/musan_speech. My guess is that you're downsampling it to 8k prematurely. As a result, the wav-reverberate binary takes as input the original voxceleb data, sampled at 16k and the augmentation data, which you downsampled to 8k. This results in the error you're seeing here.
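
One rough way to check for this mismatch is to list the rates that the sox commands in a wav.scp ask for. The sketch below assumes the entries resample with "sox ... -r <rate>", as in the log above; scp_resample_rates is a hypothetical helper name.

```shell
# Hypothetical helper: print the distinct "-r <rate>" values requested
# by the commands in a wav.scp (prints nothing if no entry resamples).
scp_resample_rates() {
  grep -oE -e '-r +[0-9]+k?' "$1" | awk '{print $2}' | sort -u
}
```

If "scp_resample_rates data/musan_speech/wav.scp" prints 8k while the VoxCeleb recordings are sampled at 16k, the two inputs to wav-reverberate disagree, which matches the failed assertion.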

Take a look at what your local/make_musan.sh is doing. There's probably an option in there to specify the sampling rate. It should be sampling at 16k here, not 8k. Don't worry, the flag --allow-downsample in the MFCC config will still give you 8k MFCCs.

Note that your repo is out of date. You should be calling make_musan.sh as follows if it were up to date:

steps/data/make_musan.sh --sampling-rate 16000 $musan_root data


Bar Madar

Sep 4, 2019, 5:02:14 AM

to kaldi...@googlegroups.com

Ok, thanks!

I updated the script and, as you said, there is an option to set the sampling rate.

My question now is that I want to mix data from SRE (8kHz) with data from VoxCeleb (16kHz). Do I need to run make_musan.sh twice, once at 8kHz for the SRE data and once at 16kHz for the VoxCeleb data (before I augment the data)?

At the end, I plan to merge all the augmented data and extract MFCCs to train my nnet3 model.

Thanks,

Bar



Daniel Povey

Sep 4, 2019, 5:31:01 AM

to kaldi-help

I would say probably yes.

Either that or downsample the 16kHz data explicitly in the wav.scp via a sox command, instead of relying on allow-downsample=true.


David Snyder

Sep 4, 2019, 11:19:48 AM

to kaldi-help

Yes, I would create two copies of MUSAN, with 16k and 8k sampling rates. Then use the 8k version of MUSAN to augment the SRE data, and the 16k copy of MUSAN to augment the VoxCeleb data.
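
The two-copy setup could be planned roughly as follows. This is a sketch that only prints the commands rather than running them, so it works without a Kaldi checkout. The directory names data/musan_8000, data/musan_16000 and data/sre, the /path/to/musan location, and the function name plan_augmentation are hypothetical, and it assumes make_musan.sh writes musan_speech, musan_noise and musan_music under the output directory it is given. The augmentation options are copied from the original post.

```shell
# Sketch only: prints the commands instead of executing them.
plan_augmentation() {
  corpus=$1      # data directory name under data/, e.g. sre or voxceleb
  rate=$2        # sampling rate of that corpus in Hz
  musan_root=$3  # location of the unpacked MUSAN corpus
  musan_out="data/musan_${rate}"
  # One MUSAN copy per sampling rate.
  echo "steps/data/make_musan.sh --sampling-rate ${rate} ${musan_root} ${musan_out}"
  # Augment the corpus only with the MUSAN copy whose rate matches it.
  echo "steps/data/augment_data_dir.py --utt-suffix babble --bg-snrs 20:17:15:13 --num-bg-noises 3:4:5:6:7 --bg-noise-dir ${musan_out}/musan_speech data/${corpus} data/${corpus}_babble"
}

plan_augmentation sre 8000 /path/to/musan
plan_augmentation voxceleb 16000 /path/to/musan
```

The reco2dur step from the original post would still be needed for each MUSAN copy before augment_data_dir.py is run.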


Bar Madar

Sep 5, 2019, 2:40:39 AM

to kaldi...@googlegroups.com

Thanks,

That is exactly what I did, and it works.

Bar
