Hello,
At the moment I have the following problem: I train a CNN for a number of epochs and, using the ModelCheckpoint callback, save the weights after each epoch. When training finishes, the loss and accuracy on the training set are about 0.15 and 0.98, respectively. Later I want to continue training this model, so I build exactly the same model and restore the weights from the most recent checkpoint file. But after the first epoch of the second run, the loss and accuracy are suddenly about 1.3 and 0.1.
How is this possible? I restored everything to the state at the end of the first run: model, weights, learning rate, etc.
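For reference, this is roughly the setup in both runs (a minimal sketch; `build_cnn`, `x_train`/`y_train`, and the file names are placeholders for my actual code):

```python
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.optimizers import RMSprop

# --- First run: train and checkpoint the weights after each epoch ---
model = build_cnn()  # placeholder helper that builds my CNN
model.compile(optimizer=RMSprop(learning_rate=1e-3),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

checkpoint = ModelCheckpoint("weights.{epoch:02d}.h5",
                             save_weights_only=True)
model.fit(x_train, y_train, epochs=50, callbacks=[checkpoint])
# at this point: training loss ~0.15, accuracy ~0.98

# --- Second run (new session): rebuild the same model, restore weights ---
model = build_cnn()
model.compile(optimizer=RMSprop(learning_rate=1e-3),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.load_weights("weights.50.h5")  # most recent checkpoint
model.fit(x_train, y_train, epochs=50)
# after the first epoch: loss ~1.3, accuracy ~0.1
```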
I suspect the optimizer is the problem. In particular, I'm using RMSprop. As far as I know, RMSprop keeps a running average of previously computed (squared) gradients, and of course that accumulated state isn't available when I continue training from a weights-only checkpoint.
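To illustrate what I mean: as I understand it, RMSprop divides each update by the square root of a running average of squared gradients, roughly like this toy NumPy sketch (not the actual Keras implementation):

```python
import numpy as np

# Toy illustration of the RMSprop update rule.
# `avg_sq_grad` is the per-parameter optimizer state that a weights-only
# checkpoint does not save; on resume it starts back at zero.
rho, lr, eps = 0.9, 1e-3, 1e-7

weights = np.random.randn(10)         # dummy parameters
avg_sq_grad = np.zeros_like(weights)  # accumulated optimizer state

for step in range(100):
    grad = np.random.randn(10)        # dummy gradient for one batch
    avg_sq_grad = rho * avg_sq_grad + (1 - rho) * grad**2
    weights -= lr * grad / (np.sqrt(avg_sq_grad) + eps)
```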
Do you think this is the reason for the poor training performance in the second run? And do you have any suggestions on how to deal with this kind of problem?
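One workaround I've been considering (but haven't tried yet) is to checkpoint the full model rather than only the weights, since as far as I know `model.save()` / `load_model()` also store the optimizer state. Something along these lines, continuing the placeholder names from the sketch above:

```python
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.models import load_model

# First run: save the full model (architecture + weights + optimizer state)
checkpoint = ModelCheckpoint("model.{epoch:02d}.h5", save_weights_only=False)
model.fit(x_train, y_train, epochs=50, callbacks=[checkpoint])

# Second run: restore everything, including the RMSprop accumulators,
# and continue training without re-compiling
model = load_model("model.50.h5")
model.fit(x_train, y_train, epochs=50)
```

Would that be the right way to resume training, or is there a better approach?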