Problems with MFCC extraction using online2-wav-dump-features


desf...@gmail.com

Apr 1, 2016, 7:23:18 AM
to kaldi-help
Hi all,

I was following the recipe in egs/rm/s5/local/online/run_nnet2_wsj_joint.sh and hit a problem when extracting egs. I have isolated the error to this call:

extract-segments scp:wav.scp segments ark:- | \
         online2-wav-dump-features --verbose=1 --config=feature.conf ark:spk2utt ark,s,cs:- ark:-

wav.scp, segments and spk2utt contain only one utterance, and feature.conf is the usual one for nnet2. Here are the contents of the mfcc.conf referenced by feature.conf, since I will come back to it later:
--use-energy=false
--num-mel-bins=40
--num-ceps=40
--low-freq=20
--high-freq=-400
--sample-frequency=16000
--snip-edges=false

So when this command is executed I get the following error:
LOG (online2-wav-dump-features:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (online2-wav-dump-features:ComputeDerivedVars():ivector-extractor.cc:204) Done.
KALDI_ASSERT: at online2-wav-dump-features:Init:kaldi-vector.cc:167, failed: dim >= 0
Stack trace is:
kaldi::KaldiGetStackTrace()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
kaldi::Vector<float>::Init(int)
kaldi::Vector<float>::Resize(int, kaldi::MatrixResizeType)
kaldi::ExtractWaveformRemainder(kaldi::VectorBase<float> const&, kaldi::FrameExtractionOptions const&, kaldi::Vector<float>*)
kaldi::Mfcc::ComputeInternal(kaldi::VectorBase<float> const&, kaldi::MelBanks const&, kaldi::Matrix<float>*, kaldi::Vector<float>*) const
kaldi::Mfcc::Compute(kaldi::VectorBase<float> const&, float, kaldi::Matrix<float>*, kaldi::Vector<float>*)
kaldi::OnlineGenericBaseFeature<kaldi::Mfcc>::AcceptWaveform(float, kaldi::VectorBase<float> const&)
kaldi::OnlineNnet2FeaturePipeline::AcceptWaveform(float, kaldi::VectorBase<float> const&)
online2-wav-dump-features(main+0x5ff) [0x600abc]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f2d59d11ec5]
online2-wav-dump-features() [0x6003f9]

If I remove the "--snip-edges=false" option, the command works normally, but I cannot use this workaround because my nnet2 setup relies on this option. Since the problem seems related to the MFCC extraction in online2-wav-dump-features, I tried extracting the MFCCs manually with the same MFCC configuration, wav.scp, etc., and it worked correctly:

extract-segments scp,p:wav.scp segments ark:- | compute-mfcc-feats --config=mfcc.conf ark:- ark:-

Are these two MFCC extractions the same? If not, how do they differ? Is snip-edges=false causing online2-wav-dump-features to expect more audio than it should?

Best,

Nicolás


Daniel Povey

Apr 1, 2016, 2:55:29 PM
to kaldi-help
I looked into this, and it's not a question of a simple bug; it's more a conceptual mismatch between how the features are extracted with --snip-edges=false and the 'waveform remainder' mechanism that's used in online recognition. Because the first frame starts slightly before the real signal starts when --snip-edges=false, we would need to do some extra bookkeeping, which would change the structure of the waveform-related parts of the online recognition code.
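
To make the mismatch concrete, here is a minimal Python sketch of the arithmetic. It is not Kaldi's actual code. It assumes the frame-count rules usually described for the two modes: with snip-edges=true only frames that fit entirely inside the signal are counted, and with snip-edges=false the count is the number of samples divided by the frame shift, rounded to nearest. It also assumes the 'waveform remainder' is whatever is left after skipping num_frames * frame_shift samples. The function names and the 16100-sample input are made up for illustration; the 400-sample window and 160-sample shift correspond to a 25 ms window and 10 ms shift at 16 kHz, which the mfcc.conf above does not override.

```python
def num_frames(num_samples, frame_length, frame_shift, snip_edges):
    """Number of frames for a signal (assumed rule, see the text above)."""
    if snip_edges:
        # Only frames that fit completely inside the signal are counted.
        if num_samples < frame_length:
            return 0
        return 1 + (num_samples - frame_length) // frame_shift
    # snip_edges=false: roughly one frame per shift, rounded to nearest,
    # so the first and last frames may extend past the signal edges.
    return (num_samples + frame_shift // 2) // frame_shift


def remainder_length(num_samples, frame_length, frame_shift, snip_edges):
    """Samples left over after skipping num_frames * frame_shift samples."""
    consumed = num_frames(num_samples, frame_length, frame_shift, snip_edges) * frame_shift
    return num_samples - consumed


if __name__ == "__main__":
    # Hypothetical input: 16100 samples at 16 kHz, 25 ms window, 10 ms shift.
    n, length, shift = 16100, 400, 160
    for snip in (True, False):
        print("snip_edges=%s frames=%d remainder=%d" % (
            snip,
            num_frames(n, length, shift, snip),
            remainder_length(n, length, shift, snip)))
```

Under these assumptions the remainder is 260 samples with snip-edges=true but -60 with snip-edges=false. A negative length is the kind of value a `dim >= 0` check on a vector resize would reject, which would be consistent with the assertion in the stack trace above.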

I think this might be a case of 'the cure is worse than the disease'.  I am thinking of reversing my previous recommendation that new recipes be created with 'snip-edges=false', and recommending to just use the default for now (snip-edges=true).

Dan


desf...@gmail.com

Apr 4, 2016, 6:19:46 AM
to kaldi-help, dpo...@gmail.com
Thanks for the reply, Dan. I will use the default option from now on.

Daniel Povey

Apr 10, 2016, 2:29:09 PM
to desf...@gmail.com, kaldi-help

Hi,
If you have a chance to re-test what you were doing (i.e. running the online setup with --snip-edges=false) using the candidate new code, that would be great.
I have created a pull request in
with code changes that should enable what you were doing to work.  Since there were a lot of changes, we should test this thoroughly before merging.
Dan
