I train a net via pycaffe using SGDSolver. At some point I stop the training
completely and dump the net parameters (not the solver state)
to a .caffemodel file:
solver.net.save("some.caffemodel")
Later I re-run the training script to continue training, with exactly the same solver parameters and data, but loading
the pre-trained net weights from the previous stage:
solver = caffe.get_solver("solver.tmp")
solver.net.copy_from("some.caffemodel")
If I test the net at this point,
I can verify that it is exactly the same net that I had before saving.
However, when I try to continue training using solver.step,
the loss value is many times higher than it was at the end of the first training stage,
and after some iterations the net appears to be damaged or even completely destroyed.
It seems that after loading the net into the solver, some solver data is left uninitialized or filled
with garbage. How should I initialize the solver so that it can work with a pre-trained net?
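
To make the distinction between net parameters and solver state concrete, here is a toy NumPy sketch. It is not Caffe code: the names lr_at and sgd_step, the quadratic loss, and all hyperparameter values are assumptions chosen only for illustration. It shows that resuming momentum SGD from weights alone drops the momentum history and restarts the iteration counter, so a step learning-rate policy returns to its base rate.

```python
import numpy as np

def lr_at(iteration, base_lr=0.1, gamma=0.1, stepsize=50):
    """Step policy: base_lr * gamma ** floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)

def sgd_step(w, velocity, iteration, momentum=0.9):
    """One momentum-SGD update on the toy loss 0.5 * ||w||^2 (gradient is w)."""
    grad = w
    velocity = momentum * velocity - lr_at(iteration) * grad
    return w + velocity, velocity, iteration + 1

# First training stage: 100 iterations from a fresh solver.
w = np.array([1.0, -2.0])
velocity = np.zeros_like(w)
iteration = 0
for _ in range(100):
    w, velocity, iteration = sgd_step(w, velocity, iteration)

# The weights are what a parameter dump keeps; the rest is solver state.
saved_weights = w.copy()
saved_state = (velocity.copy(), iteration)

# Resume A: weights only. Momentum history and iteration counter start over.
w_a, v_a, it_a = sgd_step(saved_weights.copy(), np.zeros_like(w), 0)

# Resume B: weights plus the saved solver state.
w_b, v_b, it_b = sgd_step(saved_weights.copy(), saved_state[0].copy(), saved_state[1])

print("learning rate after weights-only resume:", lr_at(0))
print("learning rate after full resume:", lr_at(saved_state[1]))
```

In this toy setup the weights-only resume takes its first step with the base learning rate, while the full resume continues with the decayed rate. Whether this is what happens in the setup described above depends on the learning-rate policy and momentum configured in the solver file.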