The idea is that if you are training with, say, 4 jobs, then when you average over the 4 jobs the contribution of any given minibatch is diluted by a factor of 4, so for training to make the same rate of progress (per minibatch seen, not per unit time) you need to multiply the learning rate by 4. Setting it that way makes the training less sensitive to the num-jobs (i.e. the intention is that you won't have to re-tune the learning rate when you change the num-jobs). If the learning rate is high enough that instability might be an issue, you will tend to hit the "max-change" constraints. That, of course, will make training slower than it otherwise would be, which is why, when you use more jobs, you tend to have to increase the number of epochs a bit.
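To make the arithmetic concrete, here is a minimal sketch (not Kaldi's actual implementation; the function names, the use of a single global max-change norm, and the simple gradient-step update are all simplifications for illustration). It shows that scaling the learning rate by the number of jobs exactly cancels the 1/num-jobs dilution from model averaging, and how a "max-change" style constraint caps how far any one update can move the parameters:

```python
import numpy as np

def clip_update(delta, max_change):
    # "max-change"-style constraint (simplified): if the update's norm
    # exceeds max_change, scale the whole update down to that norm.
    norm = np.linalg.norm(delta)
    if norm > max_change:
        delta = delta * (max_change / norm)
    return delta

def parallel_step(params, grads_per_job, base_lr, max_change=2.0):
    # One synchronization step with model averaging (illustrative).
    # Each job applies its own update; the resulting models are averaged.
    # The learning rate is multiplied by the number of jobs so that each
    # minibatch keeps the same effective weight after the averaging
    # dilutes it by 1/num_jobs.
    num_jobs = len(grads_per_job)
    lr = base_lr * num_jobs  # compensate for the dilution
    new_models = []
    for g in grads_per_job:
        delta = clip_update(-lr * g, max_change)
        new_models.append(params + delta)
    return np.mean(new_models, axis=0)

params = np.zeros(3)
g = np.array([0.1, 0.0, 0.0])
zeros = np.zeros(3)

# A minibatch seen by only 1 of 4 jobs ends up with the same effective
# step as serial training with base_lr, thanks to the lr scaling.
averaged = parallel_step(params, [g, zeros, zeros, zeros], base_lr=0.5)
serial = params - 0.5 * g
print(np.allclose(averaged, serial))  # → True

# A huge gradient is capped by the max-change constraint (norm <= 2.0
# here), which keeps training stable but slows progress.
big = parallel_step(params, [np.array([100.0, 0.0, 0.0])], base_lr=0.5)
print(np.linalg.norm(big))  # → 2.0
```

The second case is the trade-off described above: when the (scaled) learning rate is large, updates keep getting clipped at the max-change limit, so each update moves the model less than the learning rate alone would suggest, and more epochs are needed to compensate.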