You received this message because you are subscribed to the Google Groups "kaldi-help" group.
Thank you, Dan.

Actually I also tried using LDA+MLLT as input features on the 3hr data and got similar results. Now I am waiting for results of the same training on the 40hr data.
I notice that ivector-extractor-est does not accept the spk2utt option, and it appears to accumulate statistics per utterance.
In run_ivector_common.sh, i-vectors for the training data are extracted per sub-speaker (at most 2 utterances each) for better generalization, while those for the test data are extracted per speaker. I wonder if this mismatch would affect performance.
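For illustration, the sub-speaker split described above can be sketched as follows. This is a rough stand-in for what Kaldi's data-prep scripts do when capping utterances per speaker (the function name and ID format here are my own, not Kaldi's):

```python
def split_into_subspeakers(spk2utt, utts_per_spk_max=2):
    """Split each speaker's utterance list into 'sub-speakers' holding at
    most utts_per_spk_max utterances, so each extracted i-vector is
    estimated from limited data (hypothetical sketch, not Kaldi code)."""
    new_spk2utt = {}
    for spk, utts in spk2utt.items():
        for i in range(0, len(utts), utts_per_spk_max):
            sub_spk = "%s-%03d" % (spk, i // utts_per_spk_max)
            new_spk2utt[sub_spk] = utts[i:i + utts_per_spk_max]
    return new_spk2utt

spk2utt = {"spkA": ["a1", "a2", "a3", "a4", "a5"], "spkB": ["b1"]}
print(split_into_subspeakers(spk2utt))
```

At test time, by contrast, all of a speaker's utterances stay under one key, which is the mismatch asked about.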
By the way, are there any methods to check the correctness of extracted i-vectors? There might be bugs in my code, but I can't find them by reviewing it.
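As one possible sanity check, the extracted i-vectors can be dumped to text form (e.g. with `copy-vector ark:ivector.ark ark,t:-`) and checked for consistent dimension, finite values, and some variation across utterances. This is only a minimal sketch under the assumption that the archive uses the usual `utt  [ v1 v2 ... ]` text layout:

```python
import math

def parse_text_ark(text):
    """Parse vectors from a Kaldi-style text archive, assuming one
    'utt  [ v1 v2 ... ]' entry per line."""
    vecs = {}
    for line in text.strip().splitlines():
        utt, rest = line.split("[", 1)
        vecs[utt.strip()] = [float(x) for x in rest.replace("]", "").split()]
    return vecs

def sanity_check(vecs):
    """Basic checks: consistent dimension, all values finite, and the
    vectors are not all identical (which would suggest a bug)."""
    dims = {len(v) for v in vecs.values()}
    assert len(dims) == 1, "inconsistent i-vector dimension: %s" % dims
    for utt, v in vecs.items():
        assert all(math.isfinite(x) for x in v), "non-finite value in %s" % utt
    distinct = {tuple(v) for v in vecs.values()}
    assert len(vecs) == 1 or len(distinct) > 1, "all i-vectors are identical"
    return dims.pop()

ark = """utt1  [ 0.5 -0.2 0.1 ]
utt2  [ -0.3 0.4 0.0 ]"""
print(sanity_check(parse_text_ark(ark)))  # → 3
```

These checks won't prove the extractor is well trained, but they catch the most common plumbing bugs cheaply.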
Thank you very much for your explanation. Now I believe it is really difficult to train an i-vector extractor with only 3 hr of data. I tried "extracting i-vectors" by randomly sampling from a standard Gaussian distribution and got an almost identical log-probability curve in DNN training.
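The random-sampling control described above can be sketched like this: draw each "i-vector" from N(0, 1) and write it out in text-archive form in place of the real ones. The function names here are illustrative, not from Kaldi:

```python
import random

def fake_ivectors(utts, dim, seed=0):
    """Stand-in 'i-vectors' drawn from a standard Gaussian, mirroring the
    sanity check of feeding random vectors to the DNN instead of real ones."""
    rng = random.Random(seed)
    return {utt: [rng.gauss(0.0, 1.0) for _ in range(dim)] for utt in utts}

def to_text_ark(vecs):
    """Serialize to a Kaldi-style text archive ('utt  [ v1 ... vN ]')."""
    return "\n".join(
        "%s  [ %s ]" % (utt, " ".join("%.4f" % x for x in v))
        for utt, v in vecs.items())

print(to_text_ark(fake_ivectors(["utt1", "utt2"], dim=4)))
```

If training with these random vectors tracks training with the real i-vectors, the i-vectors are contributing essentially no usable speaker information.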
In my experiment on the 40 hr data using LDA+MLLT as input features, i-vectors also gave a slightly worse WER. The i-vector extractor setup should be almost the same as in run_ivector_common.sh, except that I use PLP+pitch features rather than high-resolution MFCCs for LDA training, and extract one i-vector per utterance rather than per sub-speaker with 2 utterances. But I don't think these should be the problem, right?
I just checked the log files of ivector-extractor-est (update.*.log), but couldn't find an objective-function improvement between 5 and 10. Did you mean the value logged from Update() (ivector-extractor.cc:1191)?
My experiments on 3 hours of data always set the i-vector dimension to 40 (100 for 40 hours of data and swbd), but it is still difficult to train. I am planning to do some research on DNN+i-vector and am trying to build a strong baseline. I was expecting an improvement over fMLLR, but now I hope it at least works on 40 hours of data with LDA features.

Actually I am mostly using ivector-extract, not ivector-extract-online2; sorry for not mentioning that. I am not sure whether the normal range (5, 10) also applies to ivector-extract. The objective-function improvements for most jobs range from 3 to 10, although some are higher than 10. Just now I extracted i-vectors for the 40 hours of data with ivector-extract-online2; all objective-function improvements range from 2.3 to 10.
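Collecting these per-job numbers by hand is tedious; a small helper can scrape them from the update logs instead. This is a sketch only, and it assumes the improvement appears in lines containing something like "objective-function improvement was X" — the exact wording may differ across Kaldi versions, so the pattern below would need adjusting:

```python
import re

# Assumed log-line wording; adjust the pattern to your Kaldi version.
IMPROVEMENT_RE = re.compile(
    r"objective.*improvement.*?(-?\d+(?:\.\d+)?)", re.IGNORECASE)

def collect_improvements(log_texts):
    """Pull per-job objective-function improvement values out of extractor
    log texts (e.g. the contents of the update.*.log files)."""
    values = []
    for text in log_texts:
        for line in text.splitlines():
            m = IMPROVEMENT_RE.search(line)
            if m:
                values.append(float(m.group(1)))
    return values

logs = ["LOG: Overall objective-function improvement was 6.2 over 100 frames"]
print(collect_improvements(logs))  # → [6.2]
```

Sorting the collected values makes it easy to spot the outlier jobs falling outside the expected range.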
Thank you for your suggestion. I will try building a system without pitch.
egs/swbd/s5c/local/nnet3/run_tdnn.sh.