how to edit final.mdl programmatically


mirfan.ms...@seecs.edu.pk

Aug 8, 2017, 2:03:43 AM
to kaldi-help
I've tried to edit the file in MATLAB but failed due to an invalid format. Is there any way to change the values, say the linear params, in final.mdl using Python or C++?

Daniel Povey

Aug 8, 2017, 3:37:55 PM
to kaldi-help
That file isn't really intended to be edited in that way, but if you
write the model in text form (nnet3-am-copy --binary=false or
nnet3-copy --binary=false) and you are OK at regex type things, it
should be possible to write a python script that would change the
parameters you want to change.
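A minimal sketch of that regex approach, assuming the model has first been written in text form (e.g. `nnet3-am-copy --binary=false final.mdl final.txt`); the helper name and the scaling operation are illustrative only, not part of Kaldi:

```python
# Sketch: edit the text form of an nnet3 model with a regex.
# Assumes linear parameters appear as "<LinearParams> [ ... ]",
# with matrix rows separated by newlines inside the brackets.
import re

def scale_linear_params(text, factor):
    """Scale every value inside the "[ ... ]" block following a
    <LinearParams> token, preserving the row structure (rows are
    newline-separated in Kaldi's text matrix format)."""
    def repl(match):
        rows = []
        for line in match.group(1).strip().splitlines():
            if line.strip():
                rows.append(" ".join(str(float(tok) * factor)
                                     for tok in line.split()))
        return "<LinearParams> [\n  " + "\n  ".join(rows) + " ]"
    return re.sub(r"<LinearParams>\s*\[(.*?)\]", repl, text, flags=re.S)
```

The edited text can then be converted back with nnet3-am-copy.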

mirfan.ms...@seecs.edu.pk

Aug 10, 2017, 3:04:01 AM
to kaldi-help, dpo...@gmail.com
OK @dan, I have edited the final.mdl file, but there is an issue. I'm saving the linear params for the input, output and forget gates as matrices using scipy's savemat method. I want to use these matrices for the next LSTM model and will add them in the config.

The issue is that savemat requires a dictionary object to be stored, and I don't know what the variables or dictionary keys in lda.mat are. The .mat files I create need to be compatible with Kaldi's .mat files so that they can be used.

Daniel Povey

Aug 10, 2017, 12:53:36 PM
to mirfan.ms...@seecs.edu.pk, kaldi-help
That's your problem. The text form of kaldi matrices is very easy and
transparent, like
[ 1 2
3 4 ]
and the exact use of whitespace at the beginning and end is flexible.
Like matlab. It's not that hard to duplicate.
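A sketch of writing and reading that text form directly with NumPy, rather than going through scipy.io.savemat (which produces MATLAB-style .mat files Kaldi cannot read); the function names are illustrative:

```python
# Sketch: round-trip a matrix through Kaldi's text matrix format,
# "[ row \n row ]", where whitespace at the edges is flexible.
import numpy as np

def write_kaldi_text_matrix(mat, path):
    with open(path, "w") as f:
        f.write("[\n")
        for row in np.atleast_2d(mat):
            f.write("  " + " ".join(repr(float(x)) for x in row) + "\n")
        f.write("]\n")

def read_kaldi_text_matrix(path):
    with open(path) as f:
        body = f.read().strip().lstrip("[").rstrip("]")
    return np.array([[float(tok) for tok in line.split()]
                     for line in body.splitlines() if line.strip()])
```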

mirfan.ms...@seecs.edu.pk

Aug 15, 2017, 8:32:27 AM
to kaldi-help, mirfan.ms...@seecs.edu.pk, dpo...@gmail.com
Thanks @dan.

I've created the matrix files for the linear params of each layer from the final.mdl file. It seems there is a matrix dimension mismatch issue.
Here is the error I'm getting:

# nnet3-init --srand=-3 exp/nnet3/lstm_ld7/init.raw exp/nnet3/lstm_ld7/configs/layer1.config exp/nnet3/lstm_ld7/0.raw 
# Started at Tue Aug 15 16:59:41 PKT 2017
#
nnet3-init --srand=-3 exp/nnet3/lstm_ld7/init.raw exp/nnet3/lstm_ld7/configs/layer1.config exp/nnet3/lstm_ld7/0.raw 
LOG (nnet3-init[5.2.9~2-cdb25d]:main():nnet3-init.cc:68) Read raw neural net from exp/nnet3/lstm_ld7/init.raw
ERROR (nnet3-init[5.2.9~2-cdb25d]:Check():nnet-nnet.cc:722) Dimension mismatch for network-node Lstm1_i1: input-dim 193 versus component-input-dim 292

[ Stack-Trace: ]

kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::nnet3::Nnet::Check(bool) const
kaldi::nnet3::Nnet::ReadConfig(std::istream&)
main
__libc_start_main
_start


# Accounting: time=0 threads=1
# Ended (code 255) at Tue Aug 15 16:59:41 PKT 2017, elapsed time 0 seconds

It seems init.raw expects 193 columns while the linear params from final.mdl have 292. Are these linear params multiplied with the features and then saved in the final.mdl file, or am I missing something here that is creating the issue?

Daniel Povey

Aug 15, 2017, 3:02:36 PM
to mirfan.ms...@seecs.edu.pk, kaldi-help
Firstly it looks like you are using older scripts--the currently
recommended scripts from the master (train_dnn.py, train_rnn.py, etc.)
don't have commands like that.

Anyway, the networks have different layers with different dimensions.
The inputs to the LSTM layers may not all have the same dimension, so
the matrices may not all be the same size. You might be mixing those
things up.
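One way to avoid mixing them up is to group the components by their dimensions before swapping any matrices. A sketch, assuming the `component name=... input-dim=..., output-dim=...` lines that nnet3-info prints (the function name is hypothetical):

```python
# Sketch: parse nnet3-info output and bucket components by
# (input-dim, output-dim), so matrices are only exchanged between
# components whose shapes actually match.
import re
from collections import defaultdict

def components_by_shape(info_text):
    shapes = defaultdict(list)
    for line in info_text.splitlines():
        # Matches "component name=..." lines only, not "component-node".
        m = re.search(r"component name=(\S+).*?input-dim=(\d+).*?output-dim=(\d+)",
                      line)
        if m:
            shapes[(int(m.group(2)), int(m.group(3)))].append(m.group(1))
    return dict(shapes)
```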

mirfan.ms...@seecs.edu.pk

Aug 16, 2017, 9:19:51 AM
to kaldi-help, mirfan.ms...@seecs.edu.pk, dpo...@gmail.com
I have the latest scripts, but I'm sticking with the shell scripts for the time being. It seems to be an issue with the config files; the config files are attached.

Here are the nnet3-info results for init.raw and 0.raw:
./nnet3-info /home/irfan/kaldi/egs/voxforge/s5/exp/nnet3/lstm_ld7/init.raw
left-context: 2
right-context: 2
num-parameters: 0
modulus: 1
input-node name=input dim=13
output-node name=output input=Append(Offset(input, -2), Offset(input, -1), input, Offset(input, 1), Offset(input, 2)) dim=65 objective=linear


and
./nnet3-info /home/irfan/kaldi/egs/voxforge/s5/exp/nnet3/lstm_ld7/0.raw
LOG (nnet3-info[5.2.9~2-cdb25d]:Read():nnet-simple-component.cc:2536) reading linear params values from file
LOG (nnet3-info[5.2.9~2-cdb25d]:Read():nnet-simple-component.cc:2536) reading linear params values from file
LOG (nnet3-info[5.2.9~2-cdb25d]:Read():nnet-simple-component.cc:2536) reading linear params values from file
LOG (nnet3-info[5.2.9~2-cdb25d]:Read():nnet-simple-component.cc:2536) reading linear params values from file
LOG (nnet3-info[5.2.9~2-cdb25d]:Read():nnet-simple-component.cc:2536) reading linear params values from file
LOG (nnet3-info[5.2.9~2-cdb25d]:Read():nnet-simple-component.cc:2536) reading linear params values from file
left-context: 0
right-context: 9
num-parameters: 954226
modulus: 1
input-node name=input dim=13
component-node name=L0_fixaffine component=L0_fixaffine input=Append(Offset(input, -2), Offset(input, -1), input, Offset(input, 1), Offset(input, 2)) input-dim=65 output-dim=65
component-node name=Lstm1_c_t component=Lstm1_c input=Sum(Lstm1_c1_t, Lstm1_c2_t) input-dim=512 output-dim=512
component-node name=Lstm1_i1 component=Lstm1_W_i-xr input=Append(L0_fixaffine, IfDefined(Offset(Lstm1_r_t, -1))) input-dim=193 output-dim=512
component-node name=Lstm1_i2 component=Lstm1_w_ic input=IfDefined(Offset(Lstm1_c_t, -1)) input-dim=512 output-dim=512
component-node name=Lstm1_i_t component=Lstm1_i input=Sum(Lstm1_i1, Lstm1_i2) input-dim=512 output-dim=512
component-node name=Lstm1_f1 component=Lstm1_W_f-xr input=Append(L0_fixaffine, IfDefined(Offset(Lstm1_r_t, -1))) input-dim=193 output-dim=512
component-node name=Lstm1_f2 component=Lstm1_w_fc input=IfDefined(Offset(Lstm1_c_t, -1)) input-dim=512 output-dim=512
component-node name=Lstm1_f_t component=Lstm1_f input=Sum(Lstm1_f1, Lstm1_f2) input-dim=512 output-dim=512
component-node name=Lstm1_o1 component=Lstm1_W_o-xr input=Append(L0_fixaffine, IfDefined(Offset(Lstm1_r_t, -1))) input-dim=193 output-dim=512
component-node name=Lstm1_o2 component=Lstm1_w_oc input=Lstm1_c_t input-dim=512 output-dim=512
component-node name=Lstm1_o_t component=Lstm1_o input=Sum(Lstm1_o1, Lstm1_o2) input-dim=512 output-dim=512
component-node name=Lstm1_h_t component=Lstm1_h input=Lstm1_c_t input-dim=512 output-dim=512
component-node name=Lstm1_g1 component=Lstm1_W_c-xr input=Append(L0_fixaffine, IfDefined(Offset(Lstm1_r_t, -1))) input-dim=193 output-dim=512
component-node name=Lstm1_g_t component=Lstm1_g input=Lstm1_g1 input-dim=512 output-dim=512
component-node name=Lstm1_c1_t component=Lstm1_c1 input=Append(Lstm1_f_t, IfDefined(Offset(Lstm1_c_t, -1))) input-dim=1024 output-dim=512
component-node name=Lstm1_c2_t component=Lstm1_c2 input=Append(Lstm1_i_t, Lstm1_g_t) input-dim=1024 output-dim=512
component-node name=Lstm1_m_t component=Lstm1_m input=Append(Lstm1_o_t, Lstm1_h_t) input-dim=1024 output-dim=512
component-node name=Lstm1_rp_t component=Lstm1_W-m input=Lstm1_m_t input-dim=512 output-dim=256
dim-range-node name=Lstm1_r_t_preclip input-node=Lstm1_rp_t dim-offset=0 dim=128
component-node name=Lstm1_r_t component=Lstm1_r input=Lstm1_r_t_preclip input-dim=128 output-dim=128
component-node name=Final_affine component=Final_affine input=Lstm1_rp_t input-dim=256 output-dim=1650
component-node name=Final_log_softmax component=Final_log_softmax input=Final_affine input-dim=1650 output-dim=1650
output-node name=output input=Offset(Final_log_softmax, 7) dim=1650 objective=linear
component name=L0_fixaffine type=FixedAffineComponent, input-dim=65, output-dim=65, linear-params-rms=0.006388, bias-{mean,stddev}=0.03215,0.5729
component name=Lstm1_W_i-xr type=NaturalGradientAffineComponent, input-dim=193, output-dim=512, learning-rate=0.001, max-change=0.75, linear-params-rms=0.07167, bias-{mean,stddev}=0.01383,1.04, rank-in=20, rank-out=80, num_samples_history=2000, update_period=4, alpha=4
component name=Lstm1_w_ic type=NaturalGradientPerElementScaleComponent, input-dim=512, output-dim=512, learning-rate=0.001, max-change=0.75, scales-min=-3.40722, scales-max=2.77667, scales-{mean,stddev}=0.003034,1.034, rank=8, update-period=10, num-samples-history=2000, alpha=4
component name=Lstm1_W_f-xr type=NaturalGradientAffineComponent, input-dim=193, output-dim=512, learning-rate=0.001, max-change=0.75, linear-params-rms=0.07219, bias-{mean,stddev}=-0.02353,1.019, rank-in=20, rank-out=80, num_samples_history=2000, update_period=4, alpha=4
component name=Lstm1_w_fc type=NaturalGradientPerElementScaleComponent, input-dim=512, output-dim=512, learning-rate=0.001, max-change=0.75, scales-min=-3.65117, scales-max=3.1187, scales-{mean,stddev}=0.03599,0.9966, rank=8, update-period=10, num-samples-history=2000, alpha=4
component name=Lstm1_W_o-xr type=NaturalGradientAffineComponent, input-dim=193, output-dim=512, learning-rate=0.001, max-change=0.75, linear-params-rms=0.07217, bias-{mean,stddev}=-0.02933,0.9943, rank-in=20, rank-out=80, num_samples_history=2000, update_period=4, alpha=4
component name=Lstm1_w_oc type=NaturalGradientPerElementScaleComponent, input-dim=512, output-dim=512, learning-rate=0.001, max-change=0.75, scales-min=-3.6183, scales-max=2.5519, scales-{mean,stddev}=0.02835,1.036, rank=8, update-period=10, num-samples-history=2000, alpha=4
component name=Lstm1_W_c-xr type=NaturalGradientAffineComponent, input-dim=193, output-dim=512, learning-rate=0.001, max-change=0.75, linear-params-rms=0.07189, bias-{mean,stddev}=-0.002364,1.028, rank-in=20, rank-out=80, num_samples_history=2000, update_period=4, alpha=4
component name=Lstm1_i type=SigmoidComponent, dim=512, self-repair-scale=1e-05
component name=Lstm1_f type=SigmoidComponent, dim=512, self-repair-scale=1e-05
component name=Lstm1_o type=SigmoidComponent, dim=512, self-repair-scale=1e-05
component name=Lstm1_g type=TanhComponent, dim=512, self-repair-scale=1e-05
component name=Lstm1_h type=TanhComponent, dim=512, self-repair-scale=1e-05
component name=Lstm1_c1 type=ElementwiseProductComponent, input-dim=1024, output-dim=512
component name=Lstm1_c2 type=ElementwiseProductComponent, input-dim=1024, output-dim=512
component name=Lstm1_m type=ElementwiseProductComponent, input-dim=1024, output-dim=512
component name=Lstm1_c type=BackpropTruncationComponent, dim=512, scale=1, count=0, recurrence-interval=1, clipping-threshold=30, clipped-proportion=0, zeroing-threshold=15, zeroing-interval=20, zeroed-proportion=0, count-zeroing-boundaries=0
component name=Lstm1_W-m type=NaturalGradientAffineComponent, input-dim=512, output-dim=256, learning-rate=0.001, max-change=0.75, linear-params-rms=0.0443, bias-{mean,stddev}=0.0008823,0.957, rank-in=20, rank-out=80, num_samples_history=2000, update_period=4, alpha=4
component name=Lstm1_r type=BackpropTruncationComponent, dim=128, scale=1, count=0, recurrence-interval=1, clipping-threshold=30, clipped-proportion=0, zeroing-threshold=15, zeroing-interval=20, zeroed-proportion=0, count-zeroing-boundaries=0
component name=Final_affine type=NaturalGradientAffineComponent, input-dim=256, output-dim=1650, learning-rate=0.001, max-change=1.5, linear-params-rms=0.06257, bias-{mean,stddev}=0.01987,0.9908, rank-in=20, rank-out=80, num_samples_history=2000, update_period=4, alpha=4
component name=Final_log_softmax type=LogSoftmaxComponent, dim=1650


It seems to be a dimension issue between the config and the matrices taken out of final.mdl.
configs.tar.gz

mirfan.ms...@seecs.edu.pk

Aug 24, 2017, 1:58:39 PM
to kaldi-help, mirfan.ms...@seecs.edu.pk, dpo...@gmail.com
Actually, I'm trying to duplicate the matrices for the input, output and forget gates of each layer. You are right that the dimensions are different, but I have not found any way to change the default dimensions of these components.

Kindly guide me about this issue.

Daniel Povey

Aug 24, 2017, 1:59:49 PM
to mirfan.ms...@seecs.edu.pk, kaldi-help
If the dimensions are different then they are different for structural reasons that cannot easily be changed, e.g. the input dimensions of those layers are different.
You could just try your method on the subset of those layers that have the same input dimension.

mirfan.ms...@seecs.edu.pk

Aug 28, 2017, 5:14:52 PM
to kaldi-help, mirfan.ms...@seecs.edu.pk, dpo...@gmail.com
By subset, do you mean component nodes?

Daniel Povey

Aug 28, 2017, 5:34:17 PM
to mirfan.ms...@seecs.edu.pk, kaldi-help
Yes, I mean only some layers.

mirfan.ms...@seecs.edu.pk

Sep 5, 2017, 4:12:14 AM
to kaldi-help, mirfan.ms...@seecs.edu.pk, dpo...@gmail.com
I've tried this method but it does not work. I converted the 0.raw model to 0.mdl in text format, then read it and changed the structure of the model. I changed the parameter dimensions of each gate, and it shows no error except on the component Lstm1_rp_t, which gives a dimension mismatch. I tried to find where that component is defined in the mdl file, but did not find any params or info about it except here:
component-node name=Lstm1_c2_t component=Lstm1_c2 input=Append(Lstm1_i_t, Lstm1_g_t)
component-node name=Lstm1_m_t component=Lstm1_m input=Append(Lstm1_o_t, Lstm1_h_t)
component-node name=Lstm1_rp_t component=Lstm1_W-m input=Lstm1_m_t
dim-range-node name=Lstm1_r_t_preclip input-node=Lstm1_rp_t dim-offset=0 dim=128
component-node name=Lstm1_r_t component=Lstm1_r input=Lstm1_r_t_preclip
component-node name=Final_affine component=Final_affine input=Lstm1_rp_t
component-node name=Final_log_softmax component=Final_log_softmax input=Final_affine
output-node name=output input=Offset(Final_log_softmax, 7) objective=linear

and the error is:
 Dimension mismatch for network-node Lstm1_rp_t: input-dim 511 versus component-input-dim 512

[ Stack-Trace: ]

kaldi::MessageLogger::HandleMessage(kaldi::LogMessageEnvelope const&, char const*)
kaldi::MessageLogger::~MessageLogger()
kaldi::nnet3::Nnet::Check(bool) const
kaldi::nnet3::Nnet::ReadConfig(std::istream&)
kaldi::nnet3::Nnet::Read(std::istream&, bool)
void kaldi::ReadKaldiObject<kaldi::nnet3::Nnet>(std::string const&, kaldi::nnet3::Nnet*)
main
__libc_start_main
_start


I think this is the last sticking point; if this error is resolved, I will get what I want. Kindly look into this issue and point out where I'm going wrong in changing the dimension of this component.

Daniel Povey

Sep 5, 2017, 2:43:59 PM
to mirfan.ms...@seecs.edu.pk, kaldi-help
By 'only some layers' I meant you should try your idea on just the
LSTM layers that had the same dimension. I don't know what you mean
by 'does not work'.

I don't have time to figure out what your problem was when you changed
the dimensions of some layers.

mirfan.ms...@seecs.edu.pk

Sep 6, 2017, 1:56:04 AM
to kaldi-help, mirfan.ms...@seecs.edu.pk, dpo...@gmail.com
I tried to change the dimension of the 2nd layer of the nnet3 LSTM model:
# Input gate control : W_i* matrices
component name=Lstm2_W_i-xr type=NaturalGradientAffineComponent input-dim=350 output-dim=512  max-change=0.75
# note : the cell outputs pass through a diagonal matrix
component name=Lstm2_w_ic type=NaturalGradientPerElementScaleComponent  dim=512  param-mean=0.0 param-stddev=1.0  max-change=0.75
# Forget gate control : W_f* matrices
component name=Lstm2_W_f-xr type=NaturalGradientAffineComponent input-dim=350 output-dim=512  max-change=0.75
# note : the cell outputs pass through a diagonal matrix
component name=Lstm2_w_fc type=NaturalGradientPerElementScaleComponent  dim=512  param-mean=0.0 param-stddev=1.0  max-change=0.75
#  Output gate control : W_o* matrices
component name=Lstm2_W_o-xr type=NaturalGradientAffineComponent input-dim=350 output-dim=512  max-change=0.75

I changed input-dim=384 to input-dim=350, but it shows the same dimension-mismatch error as in the comments above.
Then I tried to use the matrices extracted from final.mdl and got the same issue after reducing the rows to 350.


What I'm doing now is reading the initial .raw model, converting it to .mdl, modifying the model by changing the layer dimensions of different components, converting it back to a .raw model, and using it in the nnet3 recipe.

Let's say the component is Lstm2_W_i-xr with input-dim=193 output-dim=512. I have extracted the matrix of this component (and of Lstm1_W_o-xr, Lstm1_W_i-xr, Lstm1_W_f-xr, Lstm1_W_c-xr) and reduced the matrix size to 193 x 511. When I convert the .mdl file back to .raw, it throws an error on:

 Dimension mismatch for network-node Lstm1_rp_t: input-dim 511 versus component-input-dim 512

The same error was shown on the Lstm1_i2 component, and it disappeared when I changed the matrix size of all the components that have 193 x 512 dimensions.

Lstm1_rp_t does not have any matrix in the .raw file; it appears in the nnet3 configuration only.

input-node name=input dim=13
component-node name=L0_fixaffine component=L0_fixaffine input=Append(Offset(input, -2), Offset(input, -1), input, Offset(input, 1), Offset(input, 2))
component-node name=Lstm1_c_t component=Lstm1_c input=Sum(Lstm1_c1_t, Lstm1_c2_t)
component-node name=Lstm1_i1 component=Lstm1_W_i-xr input=Append(L0_fixaffine, IfDefined(Offset(Lstm1_r_t, -1)))
component-node name=Lstm1_i2 component=Lstm1_w_ic input=IfDefined(Offset(Lstm1_c_t, -1))
component-node name=Lstm1_i_t component=Lstm1_i input=Sum(Lstm1_i1, Lstm1_i2)
component-node name=Lstm1_f1 component=Lstm1_W_f-xr input=Append(L0_fixaffine, IfDefined(Offset(Lstm1_r_t, -1)))
component-node name=Lstm1_f2 component=Lstm1_w_fc input=IfDefined(Offset(Lstm1_c_t, -1))
component-node name=Lstm1_f_t component=Lstm1_f input=Sum(Lstm1_f1, Lstm1_f2)
component-node name=Lstm1_o1 component=Lstm1_W_o-xr input=Append(L0_fixaffine, IfDefined(Offset(Lstm1_r_t, -1)))
component-node name=Lstm1_o2 component=Lstm1_w_oc input=Lstm1_c_t

component-node name=Lstm1_o_t component=Lstm1_o input=Sum(Lstm1_o1, Lstm1_o2)
component-node name=Lstm1_h_t component=Lstm1_h input=Lstm1_c_t

component-node name=Lstm1_g1 component=Lstm1_W_c-xr input=Append(L0_fixaffine, IfDefined(Offset(Lstm1_r_t, -1)))
component-node name=Lstm1_g_t component=Lstm1_g input=Lstm1_g1

component-node name=Lstm1_c1_t component=Lstm1_c1 input=Append(Lstm1_f_t, IfDefined(Offset(Lstm1_c_t, -1)))
component-node name=Lstm1_c2_t component=Lstm1_c2 input=Append(Lstm1_i_t, Lstm1_g_t)
component-node name=Lstm1_m_t component=Lstm1_m input=Append(Lstm1_o_t, Lstm1_h_t)
component-node name=Lstm1_rp_t component=Lstm1_W-m input=Lstm1_m_t
dim-range-node name=Lstm1_r_t_preclip input-node=Lstm1_rp_t dim-offset=0 dim=128
component-node name=Lstm1_r_t component=Lstm1_r input=Lstm1_r_t_preclip
component-node name=Final_affine component=Final_affine input=Lstm1_rp_t
component-node name=Final_log_softmax component=Final_log_softmax input=Final_affine
output-node name=output input=Offset(Final_log_softmax, 7) objective=linear

Daniel Povey

Sep 6, 2017, 2:08:09 AM
to mirfan.ms...@seecs.edu.pk, kaldi-help
If the matrix dimensions differ it's because the input dimensions
differ, because previous layers had a different size. There will be
no way to make them all the same.
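As a quick arithmetic check against the nnet3-info output earlier in the thread (assuming Lstm2's gate input is Append(Lstm1_rp_t, Lstm2_r_t), as in the standard nnet3 LSTM config):

```python
# Why the gate matrices have different input dimensions:
# each W_*-xr matrix sees the previous layer's output appended
# to that layer's own recurrent projection.
layer0_out = 65         # L0_fixaffine output-dim
lstm1_recurrent = 128   # Lstm1_r_t dim (clipped half of the projection)
lstm1_projection = 256  # Lstm1_rp_t output-dim
lstm2_recurrent = 128   # assumed same recurrent dim for layer 2

# Lstm1_W_*-xr sees Append(L0_fixaffine, Lstm1_r_t):
assert layer0_out + lstm1_recurrent == 193
# Lstm2_W_*-xr sees Append(Lstm1_rp_t, Lstm2_r_t):
assert lstm1_projection + lstm2_recurrent == 384
```

This matches the input-dim=193 and input-dim=384 figures quoted above: changing one matrix's size in isolation breaks these sums, which is why the Check() call reports a mismatch.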