Continue Training from Checkpoint - Performance is surprisingly bad

mhubri...@gmail.com

unread,
Jul 3, 2016, 11:42:33 AM7/3/16
to Keras-users

Hello,

At the moment, I have the following problem: I train a CNN for a number of epochs. Using the ModelCheckpoint callback, I save the weights after each epoch. Training finishes with a loss of about 0.15 and an accuracy of about 0.98 on the training set. After that, I want to continue training this model, so I load exactly the same model and restore the weights from the most recent checkpoint file. But after the first epoch of the second run, the loss and accuracy are suddenly about 1.3 and 0.1.

How is this possible? I restored everything to the state at the end of the first training run: model, weights, learning rate, etc.

I suspect the optimizer is the problem. In particular, I'm using RMSprop. As far as I know, RMSprop uses the previously calculated gradients. Of course, when I continue training, those old gradients aren't available.

Do you think this is the reason for the bad training performance in the second run? And do you have any suggestions on how to deal with this kind of problem?

François Chollet

unread,
Jul 3, 2016, 12:42:02 PM7/3/16
to mhubri...@gmail.com, Keras-users
Just use a lower learning rate. The optimizer state is not saved, so you're starting with a reinitialized optimizer state.

Alternatively, you can save and load the optimizer.
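A minimal NumPy sketch (not the Keras code itself) of why the lower rate helps: RMSprop divides each step by the square root of a running average of squared gradients, and a reinitialized optimizer restarts that average at zero, so its first steps come out much larger than before.

```python
import numpy as np

def rmsprop_step(grad, acc, lr, rho=0.9, eps=1e-8):
    # acc is the running average of squared gradients -- the optimizer
    # state that is lost when training resumes from a checkpoint.
    acc = rho * acc + (1 - rho) * grad ** 2
    return lr * grad / (np.sqrt(acc) + eps), acc

g = np.array([0.01])
lr = 0.001

# Warm up the state, as if training had been running for a while.
acc = np.zeros(1)
for _ in range(200):
    warm_step, acc = rmsprop_step(g, acc, lr)

# First step of a freshly reinitialized optimizer at the same rate.
cold_step, _ = rmsprop_step(g, np.zeros(1), lr)

# Scaling lr down by sqrt(1 - rho) roughly restores the old step size.
fixed_step, _ = rmsprop_step(g, np.zeros(1), lr * np.sqrt(1 - 0.9))

print(cold_step[0] > 2 * warm_step[0])           # True: the step overshoots
print(abs(fixed_step[0] - warm_step[0]) < 1e-4)  # True: back in range
```

The sqrt(1 - rho) factor is only a rough heuristic for a cold start with steady gradients; in practice, any sufficiently reduced rate avoids the initial overshoot.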

--
You received this message because you are subscribed to the Google Groups "Keras-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to keras-users...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/keras-users/ec4797b0-6b2d-4752-b45e-0fb9f730ad3f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

mhubri...@gmail.com

unread,
Jul 3, 2016, 12:53:37 PM7/3/16
to Keras-users, mhubri...@gmail.com
Thanks for your advice, François!

Are there any built-in functions to save and load the state of an optimizer? I couldn't find any.

François Chollet

unread,
Jul 3, 2016, 4:25:22 PM7/3/16
to Markus Hubrich, Keras-users
Yes: optimizer.get_weights() / set_weights().


mhubri...@gmail.com

unread,
Jul 4, 2016, 4:11:08 AM7/4/16
to Keras-users, mhubri...@gmail.com
But won't these lines cause problems?

params = self.weights
if len(params) != len(weights):
    raise Exception('Provided weight array does not match weights.')

When you create a fresh optimizer, it has no state yet (to be precise, self.weights == [] for a new optimizer), so set_weights() raises this exception. Am I missing something here?

mhubri...@gmail.com

unread,
Jul 8, 2016, 12:57:19 PM7/8/16
to Keras-users, mhubri...@gmail.com
In case it helps anyone:
I fixed the problem by extending the RMSprop class (you could do the same with other optimizers, or, perhaps better, with the base class Optimizer itself):

import numpy as np
from keras.optimizers import RMSprop

class MyRMSprop(RMSprop):
    def __init__(self, path_weights=None, path_updates=None,
                 lr=0.001, rho=0.9, epsilon=1e-8, **kwargs):
        self.path_weights = path_weights
        self.path_updates = path_updates
        super(MyRMSprop, self).__init__(lr, rho, epsilon, **kwargs)

    def get_updates(self, params, constraints, loss):
        # Let the parent build its state variables first,
        # then restore them from disk.
        tmp = super(MyRMSprop, self).get_updates(params, constraints, loss)
        if self.path_weights is not None:
            weights = np.load(self.path_weights)
            self.set_weights(weights)
        if self.path_updates is not None:
            updates = np.load(self.path_updates)
            self.set_state(updates)
        return tmp

    def save_weights(self, path):
        weights = self.get_weights()
        np.save(path, weights)

    def save_updates(self, path):
        updates = self.get_state()
        np.save(path, updates)

So, all in all: after training I save the weights and the updates (i.e. the state of the optimizer) as NumPy arrays to files. When I continue training, I load these files and restore the state of the optimizer.
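One caveat for anyone copying this: a list of differently shaped arrays (which is what the optimizer state is) gets stored by np.save as an object array, and on recent NumPy versions np.load then needs allow_pickle=True. A self-contained sketch of the round trip, with a made-up state and no Keras needed:

```python
import numpy as np
import os
import tempfile

# Made-up optimizer state: one accumulator array per parameter,
# the kind of list that get_weights() / get_state() return.
state = [np.full((3,), 0.5), np.full((2, 2), 0.25)]

path = os.path.join(tempfile.mkdtemp(), 'opt_state.npy')
# Differently shaped arrays are stored as an object array, so loading
# them back requires allow_pickle=True on recent NumPy versions.
np.save(path, np.asarray(state, dtype=object), allow_pickle=True)
restored = np.load(path, allow_pickle=True)

print(all(np.array_equal(a, b) for a, b in zip(state, restored)))  # True
```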