the meaning of LSTM parameters in nnet3


shunfeich...@gmail.com

Feb 13, 2017, 10:42:13 PM
to kaldi-help
Hi all,
    I have trained an LSTM model in nnet3, and the results are good. However, I still don't understand the meaning of the LSTM parameters or how they affect training. Could someone point me to some papers to study?
Here are the parameters that I don't understand:
    clipping_threshold 
    chunk_width
    chunk_left_context/right_context
    label_delay
    lstm_delay
    and the relationship between splice_indexes and num_lstm_layers


Thank you very much!

chenshunfei

Daniel Povey

Feb 13, 2017, 10:50:48 PM
to kaldi-help

Please don't post duplicate questions; I answered some of this in a separate thread but will repeat it here.

clipping_threshold relates to derivative truncation.

chunk_width exists because we train on fixed-size chunks of data; it's the number of frames per chunk.

chunk_left_context/right_context is the number of frames of input we give per chunk, to the left/right of the parts with labels (it's additional context to the LSTM).
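As a rough illustration of how chunk_width and the chunk contexts combine, the sketch below computes which input frames one chunk would cover. The helper name and the example values (150, 40, 0) are assumptions made up for illustration, not taken from any Kaldi script, and it ignores any additional context the network itself requires.

```python
def chunk_input_range(first_labeled_frame, chunk_width, left_context, right_context):
    """Return (first_input_frame, last_input_frame, num_input_frames) for one chunk.

    Hypothetical helper, not part of Kaldi. The chunk has labels on
    chunk_width consecutive frames; the contexts add unlabeled input
    frames on either side.
    """
    first_input = first_labeled_frame - left_context
    last_input = first_labeled_frame + chunk_width - 1 + right_context
    num_input = last_input - first_input + 1
    return first_input, last_input, num_input


# Assumed example values: 150 labeled frames, 40 frames of left context, none on the right.
print(chunk_input_range(300, 150, 40, 0))
```

Near the start or end of an utterance the computed range can fall outside the available frames, so a real setup has to supply those frames in some way.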

label-delay (with a value of 5, say) has the same effect as putting all the labels 5 frames later than where they appear in the original system's alignments, so the LSTM gets to see further into the future than it normally would (this makes it less asymmetric, since it would normally see infinite left context and zero right context).
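A minimal sketch of that shift, assuming a plain list of per-frame labels (the function name and the padding value are hypothetical): the label for frame t becomes the target at frame t + label_delay, so by the time the network has to predict it, it has already seen label_delay extra frames of input.

```python
def delay_labels(labels, label_delay, pad=None):
    """Shift every label label_delay frames later in time.

    Hypothetical helper for illustration; assumes label_delay >= 0.
    Frames that end up without a label are filled with `pad`.
    """
    n = len(labels)
    shifted = []
    for t in range(n + label_delay):
        source = t - label_delay  # frame whose label is now the target at time t
        shifted.append(labels[source] if 0 <= source < n else pad)
    return shifted


print(delay_labels(["a", "b", "c"], 2))
```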

lstm-delay (or the delay=x parameter in LSTM layers in xconfigs) controls the recurrence period of the LSTM.  Negative, IIRC, means a forwards-in-time LSTM, and positive means a backwards-in-time LSTM (it's badly named; it's really the opposite of a delay).  Values of +1 and -1 mean standard LSTMs; with (e.g.) -3, it's like having 3 separate chains of LSTMs, each with a recurrence period of 3 frames.
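To make the "separate chains" picture concrete, here is a small sketch that groups frame indices by which recurrence chain they belong to. It follows the sign convention as recalled above (negative is forwards in time); the function is hypothetical and only models which frames feed which, not the LSTM computation itself.

```python
def recurrence_chains(num_frames, delay):
    """Group frame indices into chains linked by a recurrence of |delay| frames.

    Each chain lists its frames in the order they would be computed:
    ascending for a negative delay (forwards in time), descending for a
    positive delay (backwards in time).
    """
    period = abs(delay)
    if period == 0:
        raise ValueError("delay must be non-zero")
    chains = [
        list(range(start, num_frames, period))
        for start in range(min(period, num_frames))
    ]
    if delay > 0:
        chains = [chain[::-1] for chain in chains]
    return chains


print(recurrence_chains(10, -3))
print(recurrence_chains(5, -1))
```

With delay=-3, frame 9 gets its recurrent input from frame 6, which gets it from frame 3, and so on; frames in different chains never exchange recurrent state within that layer.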

Regarding splice_indexes and num_lstm_layers: splice_indexes relates to splicing frames together in TDNNs, and num_lstm_layers is the number of LSTM layers, probably in a setup where some layers are LSTM and some are TDNN.  But both of these are only used in setups with outdated scripts.  The latest setup will use the 'xconfig' mechanism which is much clearer.


I don't recommend tuning any of these things for now.  The only thing you should possibly tune is the cell-dim (and for projected LSTMs, there are two other dims which should generally be one quarter of the cell-dim).
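As a small worked example of the one-quarter rule, the sketch below derives the two projection dims from a chosen cell-dim. The option names recurrent-projection-dim and non-recurrent-projection-dim are my assumption about which two dims are meant (they are the names used by projected LSTM layers in xconfig as far as I recall), so check them against your Kaldi version; the helper itself is hypothetical.

```python
def projected_lstm_dims(cell_dim):
    """Return a dict of dims for a projected LSTM, using one quarter of cell_dim.

    Hypothetical helper; the key names are assumed xconfig option names.
    """
    projection_dim = cell_dim // 4
    return {
        "cell-dim": cell_dim,
        "recurrent-projection-dim": projection_dim,
        "non-recurrent-projection-dim": projection_dim,
    }


dims = projected_lstm_dims(1024)
# Format the dims as space-separated key=value options.
print(" ".join(f"{key}={value}" for key, value in dims.items()))
```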


Dan


--
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

shunfeich...@gmail.com

Feb 14, 2017, 12:32:55 AM
to kaldi-help, dpo...@gmail.com
Thank you very much! This is my first time asking a question.


Mehadi Hasan

Sep 8, 2018, 1:36:55 AM
to kaldi-help
Hello, shunfei chen. I need some help from you with training an LSTM model in Kaldi nnet3. Could you give me your Gmail address, please?