Recently I have been using Kaldi to implement the LSTM+DNN system proposed in Tara Sainath's paper "Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks", but the results I got were not what I expected.
Here are some details of my experiments:
1. An LSTM baseline system (2 layers) was trained on 300 hours of speech data using Kaldi's nnet1 run_lstm.sh script from the rm recipe; it achieved around 15% relative WER improvement over the DNN system.
2. Two DNN layers (fully connected layers) were added after the output of the LSTM to train an LSTM+DNN system; the results are listed below:
LSTM: WER=19.4%
LSTM+DNN: WER=29.4%
All training parameters of the LSTM+DNN system are the same as those of the LSTM, except for the two extra DNN layers added in the nnet proto file: for example, learning rate = 0.0001, splice = 0, momentum = 0.9, BPTT = 20, etc. The nnet initialization method is also the same. A rough sketch of the proto I mean is shown below.
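For clarity, this is roughly what the LSTM+DNN proto looks like (the dimensions are just placeholders, and I am writing the component names and options from memory, so they may need checking against the nnet1 proto format):

<NnetProto>
# two LSTM layers, as in the baseline
<LstmProjectedStreams> <InputDim> 40 <OutputDim> 512 <CellDim> 800
<LstmProjectedStreams> <InputDim> 512 <OutputDim> 512 <CellDim> 800
# two extra fully connected (DNN) layers added after the LSTM output
<AffineTransform> <InputDim> 512 <OutputDim> 1024
<Sigmoid> <InputDim> 1024 <OutputDim> 1024
<AffineTransform> <InputDim> 1024 <OutputDim> 1024
<Sigmoid> <InputDim> 1024 <OutputDim> 1024
# output layer (number of pdfs is just an example)
<AffineTransform> <InputDim> 1024 <OutputDim> 3500
<Softmax> <InputDim> 3500 <OutputDim> 3500
</NnetProto>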
I am not sure whether this can be implemented directly with nnet1. In the recently updated Kaldi I also found an LSTM recipe, wsj/s5/steps/nnet3/lstm/train.sh; in that script "--hidden-dim" already appears as an input to the LSTM. Does that mean I can use this recipe to reach my LSTM+DNN goal?
I would really appreciate it if someone could give me some pointers.
-Yanhua