Problems with MFCC extraction using online2-wav-dump-features

desf...@gmail.com

Apr 1, 2016, 7:23:18 AM

to kaldi-help

Hi all,

I was following the recipe in egs/rm/s5/local/online/run_nnet2_wsj_joint.sh and I encountered a problem when extracting egs. I have isolated the error and it is produced by the call:

extract-segments scp:wav.scp segments ark:- | \
  online2-wav-dump-features --verbose=1 --config=feature.conf ark:spk2utt ark,s,cs:- ark:-

wav.scp, segments and spk2utt cover only one utterance, and feature.conf is the normal one for nnet2. Here are the contents of the mfcc.conf referenced in feature.conf, as I will get back to it later:

--use-energy=false
--num-mel-bins=40
--num-ceps=40
--low-freq=20
--high-freq=-400
--sample-frequency=16000
--snip-edges=false

So when this command is executed I get the following error:

LOG (online2-wav-dump-features:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor

LOG (online2-wav-dump-features:ComputeDerivedVars():ivector-extractor.cc:204) Done.

KALDI_ASSERT: at online2-wav-dump-features:Init:kaldi-vector.cc:167, failed: dim >= 0

Stack trace is:

kaldi::KaldiGetStackTrace()

kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)

kaldi::Vector<float>::Init(int)

kaldi::Vector<float>::Resize(int, kaldi::MatrixResizeType)

kaldi::ExtractWaveformRemainder(kaldi::VectorBase<float> const&, kaldi::FrameExtractionOptions const&, kaldi::Vector<float>*)

kaldi::Mfcc::ComputeInternal(kaldi::VectorBase<float> const&, kaldi::MelBanks const&, kaldi::Matrix<float>*, kaldi::Vector<float>*) const

kaldi::Mfcc::Compute(kaldi::VectorBase<float> const&, float, kaldi::Matrix<float>*, kaldi::Vector<float>*)

kaldi::OnlineGenericBaseFeature<kaldi::Mfcc>::AcceptWaveform(float, kaldi::VectorBase<float> const&)

kaldi::OnlineNnet2FeaturePipeline::AcceptWaveform(float, kaldi::VectorBase<float> const&)

online2-wav-dump-features(main+0x5ff) [0x600abc]

/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f2d59d11ec5]

online2-wav-dump-features() [0x6003f9]

If I remove the "--snip-edges=false" option, the command works normally, but I cannot use this solution because my nnet2 setup uses this option. Since the problem seems related to the MFCC extraction in online2-wav-dump-features, I tried extracting the features manually using the same MFCC configuration, wav.scp, etc., and it worked correctly:

extract-segments scp,p:wav.scp segments ark:- | compute-mfcc-feats --config=mfcc.conf ark:- ark:-

Are these two MFCC extractions the same? If not, how do they differ? Is snip-edges=false causing online2-wav-dump-features to expect more audio than it should?

Bests,

Nicolás

Daniel Povey

Apr 1, 2016, 2:55:29 PM

to kaldi-help

I looked into this, and it's not a question of a simple bug; it's more a conceptual mismatch between how the features are extracted with --snip-edges=false and the mechanism of 'waveform remainder' that's used in online recognition. Because with --snip-edges=false the first frame starts slightly before the real signal starts, we would need to do some extra bookkeeping, which would change the structure of the waveform-related bits of online recognition code.
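To make the mismatch concrete, here is a minimal Python sketch of the frame arithmetic. It is a simplified model, not Kaldi's actual code: the frame-count and frame-start formulas, the 25 ms window and 10 ms shift (400 and 160 samples at 16 kHz), the function names, and the 16100-sample utterance are all assumptions chosen for illustration.

```python
# Simplified model of frame bookkeeping. This is NOT Kaldi's code;
# the formulas and constants below are illustrative assumptions.

FRAME_LENGTH = 400  # assumed: 25 ms window at 16 kHz
FRAME_SHIFT = 160   # assumed: 10 ms shift at 16 kHz


def num_frames(num_samples, snip_edges):
    """Frame count under each edge-handling mode (assumed formulas)."""
    if snip_edges:
        # Only frames that fit entirely inside the signal are produced.
        if num_samples < FRAME_LENGTH:
            return 0
        return 1 + (num_samples - FRAME_LENGTH) // FRAME_SHIFT
    # Without snipping: about one frame per shift, rounded to nearest.
    return (num_samples + FRAME_SHIFT // 2) // FRAME_SHIFT


def first_sample_of_frame(frame, snip_edges):
    """Index of the first sample covered by a frame (may be negative)."""
    if snip_edges:
        return frame * FRAME_SHIFT
    # Without snipping, frames are centered on multiples of the shift.
    midpoint = frame * FRAME_SHIFT + FRAME_SHIFT // 2
    return midpoint - FRAME_LENGTH // 2


def naive_remainder_length(num_samples, snip_edges):
    """Leftover samples if each frame is assumed to consume one shift."""
    return num_samples - num_frames(num_samples, snip_edges) * FRAME_SHIFT


if __name__ == "__main__":
    n = 16100  # hypothetical utterance length in samples
    for snip in (True, False):
        print(
            f"snip_edges={snip}: frames={num_frames(n, snip)}, "
            f"first frame starts at {first_sample_of_frame(0, snip)}, "
            f"naive remainder={naive_remainder_length(n, snip)}"
        )
```

Under these assumptions, snip_edges=true gives 99 frames with 260 samples left over, while snip_edges=false gives 101 frames, a first frame starting at sample -120, and a naive remainder of -60. A negative leftover length of this kind would be consistent with the failed dim >= 0 assertion in the stack trace, though the sketch does not reproduce the real code path.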

I think this might be a case of 'the cure is worse than the disease'. I am thinking of reversing my previous recommendation that new recipes be created with 'snip-edges=false', and recommending to just use the default for now (snip-edges=true).

Dan

desf...@gmail.com

Apr 4, 2016, 6:19:46 AM

to kaldi-help, dpo...@gmail.com

Thanks for the reply Dan, I will use the default option from now on.

Daniel Povey

Apr 10, 2016, 2:29:09 PM

to desf...@gmail.com, kaldi-help

Hi,

If you have a chance to test what you were doing again (i.e. running online stuff using --snip-edges=false) using the candidate new code, it would be great.

I have created a pull request in

https://github.com/kaldi-asr/kaldi/pull/679

with code changes that should enable what you were doing to work. Since there were a lot of changes, we should test this thoroughly before merging.

Dan
