I have noticed that most nn training recipes use 4 for num-epochs; a few questions:

- Shouldn't num-epochs depend on the training set size and the neural net size?
- How do you decide on the optimal number of epochs/learning rate?
- In general, how do you track accuracy/loss on the train and validation sets during training? Are there any options to export these metrics to viz tools like visdom or tensorboard?

Thanks!
Go to http://kaldi-asr.org/forums.html find out how to join
---
You received this message because you are subscribed to the Google Groups "kaldi-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kaldi-help+unsubscribe@googlegroups.com.
To post to this group, send email to kaldi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kaldi-help/718e5e4d-7a5f-431a-ae2c-a12755302478%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
> Shouldn't num-epochs be dependent on train set size and neural net size?

Yes; sometimes when there is less data it makes sense to use more epochs, up to 10 or so.

> How do you decide on the optimal number of epochs/learning rate?

You'd normally have to tune them.

> In general, how do you track accuracy/loss on train and validation sets during training? Are there any options to export these metrics to viz tools like visdom or tensorboard?

Personally I rely on grepping the logs, e.g.:

`grep Overall exp/chain/tdnn1b_sp/log/compute_prob_*.100.log`

but you can also use steps/nnet3/report/generate_plots.py (see its usage message; it generates a pdf plot).

Dan
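For exporting these diagnostics to an external viz tool, one hypothetical approach (not from this thread) is to parse the per-frame log-probability out of the compute_prob log lines with standard shell tools; the log format assumed here matches the nnet3-chain-compute-prob lines quoted later in the thread:

```shell
# Hypothetical sketch: extract the per-frame log-probability from a
# nnet3-chain-compute-prob log line, so the value can be fed to a
# visualization tool. The sample line mirrors the format quoted in this thread.
line="LOG (nnet3-chain-compute-prob[5.4.16~1-8b500]:PrintTotalStats():nnet-chain-diagnostics.cc:193) Overall log-probability for 'output' is -0.349972 per frame, over 17602 frames."
logprob=$(printf '%s\n' "$line" | sed -E "s/.*'output' is (-?[0-9.]+) per frame.*/\1/")
echo "$logprob"  # -> -0.349972
```

Logging one such value per iteration (e.g. via a small script that writes scalars to tensorboard) would give the curves the question asks about; the exact tooling is outside what the thread covers.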
Thanks!
if [ $stage -le 12 ]; then
  echo "$0: creating neural net configs using the xconfig parser";

  num_targets=$(tree-info $treedir/tree | grep num-pdfs | awk '{print $2}')
  learning_rate_factor=$(echo "print 0.5/$xent_regularize" | python)
  opts="l2-regularize=0.001"
  linear_opts="orthonormal-constraint=1.0"
  output_opts="l2-regularize=0.0005 bottleneck-dim=256"

  mkdir -p $dir/configs

  cat <<EOF > $dir/configs/network.xconfig
  input dim=100 name=ivector
  input dim=40 name=input

  # please note that it is important to have input layer with the name=input
  # as the layer immediately preceding the fixed-affine-layer to enable
  # the use of short notation for the descriptor
  ....
  output-layer name=output-xent dim=$num_targets learning-rate-factor=$learning_rate_factor $output_opts
EOF
  steps/nnet3/xconfig_to_configs.py --xconfig-file $dir/configs/network.xconfig --config-dir $dir/configs/
fi
$train_cmd $dir/log/generate_input_mdl_transfer.log \
  nnet3-am-copy --raw=true --nnet-config=$dir/configs/final.config \
  $dir/6380.mdl $dir/6380_lower_l2.raw || exit 1;
echo "Copied pretrained neural net weights"
if [ $stage -le 13 ]; then
  steps/nnet3/chain/train.py --stage $train_stage \
    --cmd "$train_cmd" \
    --trainer.input-model $dir/6380_lower_l2.raw \
    --feat.online-ivector-dir ${exp}/nnet3/ivectors_${train_set} \
    --feat.cmvn-opts "--norm-means=false --norm-vars=false" \
    --chain.xent-regularize $xent_regularize \
    ...
exp/full_cgn_flnl_swbd/chain/tdnn7n_sp/log/compute_prob_train.6328.log:LOG (nnet3-chain-compute-prob[5.4.16~1-8b500]:PrintTotalStats():nnet-chain-diagnostics.cc:193) Overall log-probability for 'output-xent' is -3.53733 per frame, over 17602 frames.
exp/full_cgn_flnl_swbd/chain/tdnn7n_sp/log/compute_prob_train.6328.log:LOG (nnet3-chain-compute-prob[5.4.16~1-8b500]:PrintTotalStats():nnet-chain-diagnostics.cc:193) Overall log-probability for 'output' is -0.349972 per frame, over 17602 frames.
exp/full_cgn_flnl_swbd/chain/tdnn7n_sp/log/compute_prob_valid.6328.log:LOG (nnet3-chain-compute-prob[5.4.16~1-8b500]:PrintTotalStats():nnet-chain-diagnostics.cc:193) Overall log-probability for 'output-xent' is -3.4315 per frame, over 17294 frames.
exp/full_cgn_flnl_swbd/chain/tdnn7n_sp/log/compute_prob_valid.6328.log:LOG (nnet3-chain-compute-prob[5.4.16~1-8b500]:PrintTotalStats():nnet-chain-diagnostics.cc:193) Overall log-probability for 'output' is -0.323175 per frame, over 17294 frames.
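As a quick sanity check on diagnostics like those above, one can compare the train and valid objectives for the 'output' head; a small sketch using the numbers quoted above (a strongly negative gap would suggest overfitting; here valid is actually slightly better than train):

```shell
# Sketch: compare train vs. valid 'output' log-probs, using the values
# from the compute_prob logs quoted above.
train_lp=-0.349972
valid_lp=-0.323175
gap=$(awk "BEGIN{printf \"%.6f\", $valid_lp - ($train_lp)}")
echo "$gap"  # valid minus train
```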
opts="l2-regularize=0.001 dropout-proportion=0.0 dropout-per-dim=true dropout-per-dim-continuous=true"
linear_opts="orthonormal-constraint=-1.0 l2-regularize=0.001"
output_opts="l2-regularize=0.001"
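For context, option strings like these are typically spliced into the xconfig layer definitions in the chain recipes; a rough illustration below (the layer names and dims are hypothetical, not from this thread):

```
# Illustrative only: how such option strings are usually expanded in an xconfig.
relu-batchnorm-dropout-layer name=tdnn1 $opts dim=1536
tdnnf-layer name=tdnnf2 $linear_opts dim=1536 bottleneck-dim=160 time-stride=1
output-layer name=output include-log-softmax=false dim=$num_targets $output_opts
```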
Thanks for the tips and for sharing your results; I think I'll let my remaining lower-lr epoch finish anyway and see if the results match yours.
A few questions on the new setting:

- Is it right to say that the new setting with the orthonormal constraint and your follow-up suggestions will be mostly effective on a single GPU?
- Why would you train more conservatively (smaller lr, l2) with larger datasets? I thought larger data has a regularization effect by itself, which could counter e.g. the possible side effects of too many params in the model?
- By ~1000 hours you mean hours before data augmentation (speed/volume), right?
For 300 hours of data before augmentation (a swbd-type setup) I'd say about 6-8 epochs if using >1 GPU and maybe 4-6 epochs if using 1 GPU.
It could be that for the newer setup that trains more slowly, a smaller model is needed.
How much data were you using for the experiment where you trained an already-trained model for one more epoch and got a WER degradation?
> For 300 hours of data before augmentation (a swbd-type setup) I'd say about 6-8 epochs if using >1 GPU and maybe 4-6 epochs if using 1 GPU.

Will learning rates stay in the same range of 1e-3 > 1e-4?
Would it make sense to run with the upper bounds (8 on multi-GPU, 6 on single GPU) and evaluate WER with the models corresponding to the end of the lower epoch counts?
How would you change num-epochs when doubling the hours (600h)?

> It could be that for the newer setup that trains more slowly, a smaller model is needed.

Which of the other chain TDNN versions do you recommend for this?