DNN+BLSTM Training


Shubham Khandelwal

Mar 8, 2016, 12:02:06 PM
to kaldi-help
Hello all,

I wish to build a DNN+BLSTM architecture. So first I will train a DNN (using nnet2), and then the output of the DNN should be consumed by a BLSTM (nnet3).

In the file kaldi-trunk/egs/wsj/s5/local/nnet3/run_lstm.sh, we run the following command for the BLSTM:

  steps/nnet3/lstm/train.sh --stage $train_stage \
    --label-delay $label_delay \
    --lstm-delay "$lstm_delay" \
    --num-epochs $num_epochs --num-jobs-initial $num_jobs_initial --num-jobs-final $num_jobs_final \
    --num-chunk-per-minibatch $num_chunk_per_minibatch \
    --samples-per-iter $samples_per_iter \
    --splice-indexes "$splice_indexes" \
    --feat-type raw \
    --online-ivector-dir exp/nnet3/ivectors_train_si284 \
    --cmvn-opts "--norm-means=false --norm-vars=false" \
    --initial-effective-lrate $initial_effective_lrate --final-effective-lrate $final_effective_lrate \
    --momentum $momentum \
    --cmd "$decode_cmd" \
    --num-lstm-layers $num_lstm_layers \
    --cell-dim $cell_dim \
    --hidden-dim $hidden_dim \
    --recurrent-projection-dim $recurrent_projection_dim \
    --non-recurrent-projection-dim $non_recurrent_projection_dim \
    --chunk-width $chunk_width \
    --chunk-left-context $chunk_left_context \
    --chunk-right-context $chunk_right_context \
    --egs-dir "$common_egs_dir" \
    --remove-egs $remove_egs \
    data/train_si284_hires data/lang exp/tri4b_ali_si284 $dir

Here, instead of providing the training data ("data/train_si284_hires") directly, I would like to use the output of the DNN to build the DNN-BLSTM.

I have looked carefully at these two files: steps/nnet2/train_tanh.sh and steps/nnet2/decode.sh. Still, I cannot figure out in which file the DNN output is extracted, or which program can give me the output of the DNN so that I can use it as the input for the BLSTM (replacing "data/train_si284_hires"). Is there a program that can extract the DNN's features in the format the BLSTM expects as input?
 
Thank you very much for your time, and especially for Kaldi.

Yours Sincerely,
Shubham Khandelwal

Vimal Manohar

Mar 8, 2016, 12:18:12 PM
to kaldi-help

--
Vimal Manohar
PhD Student
Electrical & Computer Engineering
Johns Hopkins University

Vijayaditya Peddinti

Mar 8, 2016, 12:19:16 PM
to kaldi-help
You can use nnet-compute to get the posteriors from the nnet2 network. IIRC, these can be used directly as input to nnet3. But you should be very careful about the pre-processing done in nnet3; one issue that immediately comes to mind is the compression used for storing nnet3 egs. You could remove the LDA estimation stage in local/nnet3/run_ivector_common.sh, assuming the outputs of the DNN are approximately orthogonal.
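A minimal sketch of this first approach (all paths here are hypothetical, and any splicing or i-vector preprocessing the DNN expects is omitted; check the binaries' usage messages):

  # Convert the nnet2 model to a 'raw' network, then run the forward
  # pass over the training features, writing the network's outputs
  # out as ordinary Kaldi features.
  nnet-to-raw-nnet exp/nnet2/final.mdl exp/nnet2/final.raw
  nnet-compute exp/nnet2/final.raw \
    scp:data/train_si284_hires/feats.scp \
    ark,scp:exp/dnn_feats/feats.ark,exp/dnn_feats/feats.scp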

But a better way to accomplish this, with reduced I/O load, is to use the DNN as a preprocessor inside the nnet3 network. You could make use of FixedAffineComponents to do this. This would make your life very simple, as you would just have to change the nnet3 make_configs.py file and nothing else.
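The pattern would be analogous to the config lines make_configs.py already emits for lda.mat, roughly like the following (component names, splice indexes, and file paths are made up for illustration):

  # One FixedAffineComponent per pretrained DNN layer; each .mat file
  # holds that layer's (fixed) parameters in Kaldi matrix format.
  component name=dnn1 type=FixedAffineComponent matrix=exp/nnet2_mats/layer1.mat
  component-node name=dnn1 component=dnn1 input=Append(-2,-1,0,1,2)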

--Vijay


Daniel Povey

Mar 8, 2016, 2:13:51 PM
to kaldi-help
Vijay, is it the case that if the recurrence indexes in the BLSTM config generator are set to 0 for some layers, those layers just become DNN layers?

Vijayaditya Peddinti

Mar 8, 2016, 5:40:26 PM
to kaldi-help
No, that is not the case. When the lstm/make_configs.py script is called, it assumes there is a specified number of LSTM layers (num_lstm_layers) at the beginning of the network, and the rest of the layers (determined from --splice-indexes) are DNN layers.

--Vijay

Shubham Khandelwal

Mar 10, 2016, 10:09:41 AM
to kaldi-help
Hello,

Thank you all for your reply.

Vijay, you suggested changing the nnet3 make_configs.py file to use the DNN as a pre-processor inside the nnet3 network.

Can you please tell me in more detail how I should modify make_configs.py for this?

Thank you.

Yours Sincerely,
Shubham Khandelwal

Vijayaditya Peddinti

Mar 10, 2016, 12:46:56 PM
to kaldi-help
You can look at the current tdnn/make_configs.py file and see how an lda.mat is added. You would use the same method to add your DNN layers. You would have to extract the DNN parameters and dump them into separate files in Kaldi format. You can look at lda.mat to see what the Kaldi format is.
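For reference, lda.mat is a single matrix in Kaldi's matrix format, and copy-matrix can convert between the binary and text forms (paths hypothetical):

  # Dump an existing lda.mat as text to see the expected layout:
  copy-matrix --binary=false exp/nnet3/tdnn/configs/lda.mat - | head
  # Write one of your own text-format matrices back out in Kaldi format:
  copy-matrix layer1.txt exp/nnet2_mats/layer1.mat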

--Vijay

Shubham Khandelwal

Mar 16, 2016, 6:03:24 AM
to kaldi-help
Hello,

After extracting the DNN parameters into separate files, I will also have to change the lstm/train.sh file for the DNN-BLSTM, right?
Also, can you please check the attached make_configs.py for the DNN-BLSTM, just to be sure whether it is correct? If not, please let me know your suggestions.
In it, I have added part of tdnn/make_configs.py into lstm/make_configs.py (between ### Begin DNN and ##### End DNN).

Looking forward to your response.

Thanking you.

Yours Sincerely,
Shubham
make_configs.py

Vijayaditya Peddinti

Mar 16, 2016, 11:56:13 AM
to kaldi-help
On Wed, Mar 16, 2016 at 6:03 AM, Shubham Khandelwal <skhl...@gmail.com> wrote:
Hello,

After extracting the DNN parameters into separate files, I will also have to change the lstm/train.sh file for the DNN-BLSTM, right?

You can avoid doing this by calling make_configs.py from the top-level script. See here. You can use steps/nnet3/train_rnn.py even for your DNN+BLSTM.
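In outline, the split looks something like this (option names abbreviated and illustrative; see the linked script for the real invocation):

  # Generate the network config once, in the top-level script...
  steps/nnet3/lstm/make_configs.py \
    --splice-indexes "$splice_indexes" \
    --num-lstm-layers $num_lstm_layers \
    ... $dir/configs
  # ...then hand everything to the generic RNN training script.
  steps/nnet3/train_rnn.py --stage=$train_stage --cmd="$decode_cmd" \
    --feat-dir=data/train_si284_hires --lang=data/lang \
    --ali-dir=exp/tri4b_ali_si284 --dir=$dir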

 
Also, can you please check the attached make_configs.py for the DNN-BLSTM, just to be sure whether it is correct? If not, please let me know your suggestions.
In it, I have added part of tdnn/make_configs.py into lstm/make_configs.py (between ### Begin DNN and ##### End DNN).

You are adding num_hidden_layers fully-connected feed-forward layers and num_lstm_layers LSTM layers, and then again num_hidden_layers - num_lstm_layers DNN layers. You probably forgot to remove the last part.
I would recommend carefully understanding what the script does. Most of the options in the script are not necessary in your case (e.g. the --pooling* parameters); these can be eliminated to simplify the script.

Shubham Khandelwal

Mar 18, 2016, 2:49:52 PM
to kaldi-help

Hello,


Thank you for your reply.
I used nnet-to-raw-nnet to extract a raw network file from the DNN's final.mdl. In the raw file, almost every layer has the following three parts:

<AffineComponent> [...]
<BiasParams> [...]
<TanhComponent> [...]

Now I wish to use these matrices, instead of the lda.mat file, as the input .mat file in lstm/make_configs.py to make the DNN-BLSTM.

lda.mat has only one matrix, but I have three matrices per layer in the raw file.

So can you please tell me how to use these three matrices to make the .mat file in this case?

Looking forward to your response.

Thank you very much.

Vijayaditya Peddinti

Mar 19, 2016, 9:09:50 PM
to kaldi-help
You would have to convert all the AffineComponents in your previous DNN to FixedAffineComponents (see AddFixedAffineLayer method).
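Note that only <LinearParams> and <BiasParams> carry parameters: <LearningRate> becomes irrelevant once the layer is fixed, and the <ValueSum>/<DerivSum> entries under <TanhComponent> are accumulated diagnostics, not parameters. A FixedAffineComponent is initialized from a single matrix whose last column is the bias, so each layer would reduce to something like this (file and component names hypothetical):

  # layer1.mat = LinearParams with BiasParams appended as the last
  # column, i.e. an output-dim x (input-dim + 1) Kaldi matrix.
  component name=dnn1 type=FixedAffineComponent matrix=exp/nnet2_mats/layer1.mat
  # The tanh nonlinearity stays a separate (parameter-free) component.
  component name=dnn1-tanh type=TanhComponent dim=1024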

--Vijay

Shubham Khandelwal

Mar 22, 2016, 12:34:47 PM
to kaldi-help
Hello,

Thank you for your reply.
I know about converting AffineComponents to FixedAffineComponents.
But I also have the following entries in the raw files:

<AffineComponentPreconditioned> <LearningRate> ... <LinearParams> [...] <BiasParams> [...]
<TanhComponent> <Dim> ... <ValueSum> [...] <DerivSum> [...]

So how should I use these matrices, instead of the lda.mat file, as the input .mat file for every layer of the DNN in lstm/make_configs.py (to make the DNN-BLSTM)?


Looking forward to your response.

Thank you very much.


Daniel Povey

Mar 22, 2016, 2:10:23 PM
to kaldi-help
I don't really understand what you are trying to do, but I think it is based on a misunderstanding.  The easiest way to do DNN+BLSTM experiments would be to modify one of the python scripts that generate config files, to have regular DNN-type layers before the BLSTM layers, and train them jointly.
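Schematically, the generated config would then contain a few ordinary feed-forward layers ahead of the (B)LSTM layers, all trained together (component types are standard nnet3 ones; names and dimensions purely illustrative):

  component name=affine1 type=NaturalGradientAffineComponent input-dim=440 output-dim=1024
  component name=relu1 type=RectifiedLinearComponent dim=1024
  component name=affine2 type=NaturalGradientAffineComponent input-dim=1024 output-dim=1024
  component name=relu2 type=RectifiedLinearComponent dim=1024
  # ...followed by the forward/backward LSTM layers that
  # lstm/make_configs.py already knows how to emit.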
Dan

Shubham Khandelwal

Mar 23, 2016, 6:02:54 AM
to kaldi-help, dpo...@gmail.com
Dear Dan,

Sorry for the misunderstanding.

I am trying to do what you and Vijay suggested before.

I was trying to modify the following make_configs.py file:
https://github.com/kaldi-asr/kaldi/blob/master/egs/wsj/s5/steps/nnet3/lstm/make_configs.py#L227
At this line, I would use the output of the DNN layers instead of lda.mat.

Now, my question was: in lda.mat we have only one matrix, but in the DNN (I have attached one layer's output of the DNN as a .txt file for your reference), every layer's output has the following entries:

<FixedAffineComponent> [...]
<AffineComponentPreconditioned> <LearningRate> ... <LinearParams> [...] <BiasParams> [...]
<TanhComponent> <Dim> ... <ValueSum> [...] <DerivSum> [...]

I know how to use FixedAffineComponent, but I don't know how I should use the remaining matrices for every layer in make_configs.py#L227.

Sorry for the inconvenience.

Thank you very much.
3rdlayer.txt

Daniel Povey

Mar 23, 2016, 6:36:46 PM
to Shubham Khandelwal, kaldi-help
You are going about it the wrong way.  Forget about that lda.mat; that is not the way you would implement a DNN+BLSTM.  Better to just train everything jointly.  I don't have time to show you step by step, though.
Dan
