steps/online/nnet2/train_ivector_extractor.sh: error and warnings on gaussian-min-count

razbor...@gmail.com

Jan 10, 2016, 7:08:33 AM
to kaldi-help

I am trying to train ivectors. 

steps/online/nnet2/train_ivector_extractor.sh outputs the following error log.


# ivector-extractor-est --num-threads=1 exp/nnet2_online/extractor/0.ie exp/nnet2_online/extractor/acc.0 exp/nnet2_online/extractor/1.ie
# Started at Sun Jan 10 16:47:29 JST 2016
#
ivector-extractor-est --num-threads=1 exp/nnet2_online/extractor/0.ie exp/nnet2_online/extractor/acc.0 exp/nnet2_online/extractor/1.ie
LOG (ivector-extractor-est:main():ivector-extractor-est.cc:55) Reading model
LOG (ivector-extractor-est:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (ivector-extractor-est:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (ivector-extractor-est:main():ivector-extractor-est.cc:59) Reading statistics
LOG (ivector-extractor-est:Update():ivector-extractor.cc:1176) Overall auxf/frame on training data was -270.422 per frame over 14093.2 frames.

WARNING (ivector-extractor-est:UpdateProjection():ivector-extractor.cc:1229) Skipping Gaussian index 0 because count 33.4303 is below min-count.()

LOG (ivector-extractor-est:UpdateProjections():ivector-extractor.cc:1330) Overall objective function improvement for M (mean projections) was 0 per frame over 14093.2 frames.
KALDI_ASSERT: at ivector-extractor-est:UpdateVariances:ivector-extractor.cc:1385, failed: var_floor_count > 0.0
Stack trace is:
kaldi::KaldiGetStackTrace()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
kaldi::IvectorExtractorStats::UpdateVariances(kaldi::IvectorExtractorEstimationOptions const&, kaldi::IvectorExtractor*) const
kaldi::IvectorExtractorStats::Update(kaldi::IvectorExtractorEstimationOptions const&, kaldi::IvectorExtractor*) const
ivector-extractor-est(main+0x397) [0x4742a4]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fa69105eec5]
ivector-extractor-est() [0x473e49]
KALDI_ASSERT: at ivector-extractor-est:UpdateVariances:ivector-extractor.cc:1385, failed: var_floor_count > 0.0
Stack trace is:
kaldi::KaldiGetStackTrace()
kaldi::KaldiAssertFailure_(char const*, char const*, int, char const*)
kaldi::IvectorExtractorStats::UpdateVariances(kaldi::IvectorExtractorEstimationOptions const&, kaldi::IvectorExtractor*) const
kaldi::IvectorExtractorStats::Update(kaldi::IvectorExtractorEstimationOptions const&, kaldi::IvectorExtractor*) const
ivector-extractor-est(main+0x397) [0x4742a4]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7fa69105eec5]
ivector-extractor-est() [0x473e49]

# Accounting: time=0 threads=1
# Ended (code 255) at Sun Jan 10 16:47:29 JST 2016, elapsed time 0 seconds


I have already checked ivector-extractor.h: gaussian_min_count defaults to 100.0, and that is the value triggering the warnings.


struct IvectorExtractorEstimationOptions {
  double variance_floor_factor;
  double gaussian_min_count;
  int32 num_threads;
  bool diagonalize;
  IvectorExtractorEstimationOptions(): variance_floor_factor(0.1),
                                       gaussian_min_count(100.0),
                                       diagonalize(true) { }
  void Register(OptionsItf *opts) {
    opts->Register("variance-floor-factor", &variance_floor_factor,
                   "Factor that determines variance flooring (we floor each covar "
                   "to this times global average covariance");
    opts->Register("gaussian-min-count", &gaussian_min_count,
                   "Minimum total count per Gaussian, below which we refuse to "
                   "update any associated parameters.");
    opts->Register("diagonalize", &diagonalize,
                   "If true, diagonalize the quadratic term in the "
                   "objective function. This reorders the ivector dimensions"
                   "from most to least important.");
  }
};


Can I avoid this error?



Daniel Povey

Jan 10, 2016, 3:30:06 PM
to kaldi-help
It looks like this can happen if *all* of your Gaussians had counts below 'gaussian-min-count' (default: 100).
This means you have configured the system very badly.  Normally you'd train the iVector extractor on a thousand hours of data or more, and you seem to be training on just over two minutes of data [14093 frames].  This isn't going to give good results.
Dan
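[Editor's note: Dan's arithmetic can be sanity-checked with a short sketch. The frame count comes from the log above; the 10 ms frame shift and the 512-Gaussian UBM size are assumptions for illustration, not values stated in the thread.]

```python
# Sanity check on the numbers in the log above.
# Assumptions: 10 ms frame shift (100 frames/sec) and a 512-Gaussian UBM;
# neither value appears in the thread itself.
frames = 14093.2                     # total frames reported by ivector-extractor-est
seconds = frames / 100.0             # duration of the training data
avg_count = frames / 512             # average soft-count per Gaussian

print(f"{seconds:.0f} s of data")            # ~141 s, i.e. "just over two minutes"
print(f"~{avg_count:.0f} frames/Gaussian")   # far below gaussian-min-count=100
```

With every Gaussian under the min-count, no variance statistics get accumulated, `var_floor_count` stays at zero, and the `KALDI_ASSERT` in `UpdateVariances` fires exactly as shown in the log.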




Daniel Povey

Jan 10, 2016, 11:52:08 PM
to atsush...@netsmile.jp, kaldi-help
No, .ie stands for ivector-extractor, it contains the information needed to extract iVectors given the data.  In the conventional notation it contains the UBM together with the T matrix (== total-variability matrix), although we formulated the math a little differently (but equivalently) so there is not a single matrix.
Dan
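[Editor's note: in the conventional notation Dan mentions, the total-variability model represents the utterance-dependent GMM mean supervector as follows; this is the standard textbook sketch, not Kaldi's exact parameterization.]

```latex
M(u) = m + T\,w(u)
```

Here \(m\) is the UBM mean supervector, \(T\) the total-variability matrix, and \(w(u)\) the iVector for utterance \(u\). Kaldi instead keeps one mean-projection matrix per Gaussian (the "M (mean projections)" updated in the log above), which is equivalent in effect but avoids stacking a single \(T\).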

On Sun, Jan 10, 2016 at 8:47 PM, <atsush...@netsmile.jp> wrote:
Hi Dan,

I ran the same shell script on a much larger dataset, as you suggested.

A binary file "0.ie" has now been created. The training hasn't finished yet, but my understanding is that 0.ie is an iVector for the first epoch.

Is this the correct behavior?



On Monday, January 11, 2016 at 5:30:06 AM UTC+9, Dan Povey wrote:

razbor...@gmail.com

Jan 12, 2016, 3:33:51 AM
to kaldi-help, razbor...@gmail.com
Thank you, Dan

By the way, the nnet training shell script prints the following messages:

Training neural net (pass 233)
Training neural net (pass 234)
Warning: the mix up opertion is disabled!
    Ignore mix up leaves number specified
Training neural net (pass 235)
Training neural net (pass 236)

These messages come from "steps/nnet2/train_pnorm_simple2.sh":

  if [ "$mix_up" -gt 0 ] && [ $x -eq $mix_up_iter ]; then
    echo "Warning: the mix up opertion is disabled!"
    echo " Ignore mix up leaves number specified"
  fi

Does this message mean that the script failed to improve the neural-network model? I am wondering whether this warning has any harmful effects.


On Sunday, January 10, 2016 at 9:08:33 PM UTC+9, razbor...@gmail.com wrote:

Daniel Povey

Jan 12, 2016, 2:34:13 PM
to kaldi-help, razbor...@gmail.com
It's a harmless warning, due to an option that is now ignored but still present in some calling scripts.
