I'm finding that in standard backprop, the overall magnitude of the
weights drops as a power law in the number of training cycles, i.e.
    w ~ NCYCLE^(-alpha)
where alpha is some real number (depending on the data, etc.). Is there
anything in the weight-update rule that would cause the weights to
shrink with NCYCLE?
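
For reference, an exponent like alpha can be estimated with a
straight-line fit in log-log space, since w ~ NCYCLE^(-alpha) implies
log(w) = log(c) - alpha * log(NCYCLE). Below is a minimal Python sketch
of that fit. The function name fit_power_law, the constant c, and the
data are assumptions made up for illustration: the data is synthetic,
not output from any actual backprop run.

```python
import numpy as np

def fit_power_law(ncycle, w):
    """Fit w = c * ncycle**(-alpha) by least squares in log-log space.

    Hypothetical helper for illustration; returns (alpha, c).
    Both inputs must be strictly positive.
    """
    log_n = np.log(np.asarray(ncycle, dtype=float))
    log_w = np.log(np.asarray(w, dtype=float))
    # A degree-1 polyfit returns (slope, intercept); the slope is -alpha.
    slope, intercept = np.polyfit(log_n, log_w, 1)
    return -slope, np.exp(intercept)

# Synthetic weight magnitudes (an assumption, not real training data):
# an exact power law with alpha = 0.5 and c = 2.0.
ncycle = np.arange(1, 1001, dtype=float)
w = 2.0 * ncycle ** -0.5

alpha, c = fit_power_law(ncycle, w)
print(f"alpha = {alpha:.3f}, c = {c:.3f}")
```

With real data, w would be something like the mean absolute weight
recorded at each cycle. A straight line on the log-log plot supports a
power law, while a straight line on a semi-log plot would point to
exponential decay instead.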
Caren