I have had a look at the respective PRs, but could not find anything related to these questions.
On a side note: the docs (and also caffe.proto) could reflect the independence between the learning rate policy (and its parameters) and the solver type (and its parameters) a bit better. These parameters are rather mixed up in caffe.proto, and looking at the code only helps marginally. On the solver page the explanation of the solver types is quite nice, but the possible lr policies are treated rather poorly. Don't misunderstand me: Caffe is a great tool, probably the greatest there is today for deep learning. Sadly I don't have much time to help improve it myself.
Jan
1. Why is this multiplication implemented in the Caffe code, although there is no such multiplication in the original paper's pseudocode?
2. If I want to achieve the behavior described by the pseudocode in the paper, should I set the lr policy to "fixed" and base_lr to 1.0 (see the sketch below)?
3. Could it actually make sense to use other lr policies, or would that interfere with AdaDelta's adaptation mechanism?
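To make question 2 concrete, here is a minimal solver.prototxt sketch of the setup I have in mind (this assumes a recent Caffe where the solver type is given as a string; the net path and the iteration/snapshot numbers are just placeholders). As far as I can tell from the code, the applied step is the current learning rate times the AdaDelta update from the paper, so with lr_policy "fixed" and base_lr 1.0 that extra factor should simply stay at 1:

  net: "train_val.prototxt"   # placeholder net definition
  type: "AdaDelta"            # older versions use solver_type: ADADELTA instead
  lr_policy: "fixed"          # no schedule on top of AdaDelta's own adaptation
  base_lr: 1.0                # keeps the extra learning-rate factor at 1
  momentum: 0.95              # acts as the decay rho of the running averages
  delta: 1e-6                 # the epsilon conditioning constant from the paper
  max_iter: 10000
  display: 100
  snapshot: 5000
  snapshot_prefix: "adadelta"

(If any per-layer lr_mult is set to something other than 1, that would of course scale the update as well.)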