I am trying to extract MFCCs from augmented VoxCeleb data.
python3 steps/data/augment_data_dir.py --utt-suffix "babble" --bg-snrs "20:17:15:13" --num-bg-noises "3:4:5:6:7" --bg-noise-dir "data/musan_speech" data/voxceleb data/voxceleb_babble
After the augmentation, I take a random subset:
utils/subset_data_dir.sh data/voxceleb_babble 1000 data/voxceleb_babble_1k
utils/fix_data_dir.sh data/voxceleb_babble_1k
and then run the make_mfcc script:
steps/make_mfcc.sh --mfcc-config conf/mfcc.conf --nj 40 --cmd "$train_cmd" \
data/voxceleb_babble_1k exp/make_mfcc $mfccdir
The script fails, and this is the error I get in the log file:
# compute-mfcc-feats --write-utt2dur=ark,t:exp/make_mfcc/utt2dur.1 --verbose=2 --config=conf/mfcc.conf scp,p:exp/make_mfcc/wav_voxceleb_babble_1k.1.scp ark:- | copy-feats --write-num-frames=ark,t:exp/make_mfcc/utt2num_frames.1 --compress=true ark:- ark,scp:/common_space_docker/kaldi/egs/sre16/v2_SRE18data_full_train/mfcc/raw_mfcc_voxceleb_babble_1k.1.ark,/common_space_docker/kaldi/egs/sre16/v2_SRE18data_full_train/mfcc/raw_mfcc_voxceleb_babble_1k.1.scp
# Started at Tue Sep 3 09:36:57 UTC 2019
#
copy-feats --write-num-frames=ark,t:exp/make_mfcc/utt2num_frames.1 --compress=true ark:- ark,scp:/common_space_docker/kaldi/egs/sre16/v2_SRE18data_full_train/mfcc/raw_mfcc_voxceleb_babble_1k.1.ark,/common_space_docker/kaldi/egs/sre16/v2_SRE18data_full_train/mfcc/raw_mfcc_voxceleb_babble_1k.1.scp
compute-mfcc-feats --write-utt2dur=ark,t:exp/make_mfcc/utt2dur.1 --verbose=2 --config=conf/mfcc.conf scp,p:exp/make_mfcc/wav_voxceleb_babble_1k.1.scp ark:-
wav-reverberate --shift-output=true '--additive-signals=wav-reverberate --duration=5.504 "sox -t wav /common_space_docker/TDNN_train/musan/speech/us-gov/speech-us-gov-0026.wav -r 8k -t wav - |" - |,wav-reverberate --duration=5.504 "sox -t wav /common_space_docker/TDNN_train/musan/speech/librivox/speech-librivox-0113.wav -r 8k -t wav - |" - |,wav-reverberate --duration=5.504 "sox -t wav /common_space_docker/TDNN_train/musan/speech/us-gov/speech-us-gov-0178.wav -r 8k -t wav - |" - |,wav-reverberate --duration=5.504 "sox -t wav /common_space_docker/TDNN_train/musan/speech/librivox/speech-librivox-0078.wav -r 8k -t wav - |" - |,wav-reverberate --duration=5.504 "sox -t wav /common_space_docker/TDNN_train/musan/speech/us-gov/speech-us-gov-0134.wav -r 8k -t wav - |" - |' --start-times=0,0,0,0,0 --snrs=20,17,17,17,13 - -
wav-reverberate --duration=5.504 'sox -t wav /common_space_docker/TDNN_train/musan/speech/us-gov/speech-us-gov-0026.wav -r 8k -t wav - |' -
sox WARN rate: rate clipped 270 samples; decrease volume?
sox WARN dither: dither clipped 230 samples; decrease volume?
ASSERTION_FAILED (wav-reverberate[5.5.0~1-c449]:main():wav-reverberate.cc:286) Assertion failed: (samp_freq == samp_freq_input)
[ Stack-Trace: ]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::MessageLogger::LogMessage() const+0x82c) [0x7f85943dd2aa]
/opt/kaldi/src/lib/libkaldi-base.so(kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)+0x6c) [0x7f85943ddd18]
wav-reverberate(main+0xfab) [0x405623]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0) [0x7f859378c830]
wav-reverberate(_start+0x29) [0x403c79]
Aborted (core dumped)
ERROR (compute-mfcc-feats[5.5.0~1-c449]:Read4ByteTag():wave-reader.cc:56) WaveData: expected 4-byte chunk-name, got read error
I have the line "--allow-downsample=true" in the mfcc.conf file.
I was able to extract MFCCs for the reverb augmentation using "reverberate_data_dir.py" with "--source-sampling-rate 16000".
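
The failed assertion (samp_freq == samp_freq_input) suggests that the input recording and the additive noise signals reach wav-reverberate with different sampling rates; in the log above, the MUSAN files are resampled with "sox ... -r 8k". A minimal sketch for checking this, assuming the files are plain PCM WAV files on disk (the function names and the optional resample argument are hypothetical helpers, not part of Kaldi):

```python
import wave


def wav_sample_rate(path):
    """Return the sampling rate (Hz) stored in a PCM WAV file header."""
    with wave.open(path, "rb") as w:
        return w.getframerate()


def rates_match(speech_path, noise_path, noise_resample_rate=None):
    """Compare the speech rate with the rate the noise will have when mixed.

    noise_resample_rate: rate forced on the noise by the wav.scp pipe
    (e.g. 8000 for "sox ... -r 8k"); if None, the rate in the noise
    file header is used instead.
    Returns (match, speech_rate, noise_rate).
    """
    speech_rate = wav_sample_rate(speech_path)
    if noise_resample_rate is not None:
        noise_rate = noise_resample_rate
    else:
        noise_rate = wav_sample_rate(noise_path)
    return speech_rate == noise_rate, speech_rate, noise_rate
```

Calling rates_match on one VoxCeleb utterance and one MUSAN file, with noise_resample_rate=8000, would show whether the two rates differ. If they do, the "-r" value in the wav.scp of data/musan_speech is probably the place to look.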
Do you have any idea how to solve the problem?
Thanks!
Bar