Keras - SGD learning rate decay is not clear to me


André L

Feb 9, 2017, 8:39:00 AM
to Keras-users

According to the code at https://github.com/fchollet/keras/blob/master/keras/optimizers.py#L113:


lr = self.lr
if self.initial_decay > 0:
    lr *= (1. / (1. + self.decay * self.iterations))
    self.updates.append(K.update_add(self.iterations, 1))





I have a framework that offers both Keras and Lasagne/Theano options to the user, for the sake of reproducibility.
So I wanted to add an option that uses the same learning rate schedule that Keras does.


I tried the code below, but I think it's incorrect. Would someone clarify?

if decay > 0:
    lr = shared_learning_rate.get_value() * (1. / (1. + decay * epoch))
    shared_learning_rate.set_value(np.float32(lr))

Klemen Grm

Feb 10, 2017, 3:59:37 AM
to Keras-users
The optimizer's "self.iterations" variable is incremented once per training mini-batch, whereas the "epoch" variable in your formula refers to the index of the training epoch.
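
For illustration, a minimal runnable sketch of how far the two schedules drift apart (the values base_lr = 0.01, decay = 1e-6 and 500 batches per epoch are made-up examples, not anything prescribed by Keras):

base_lr, decay, batches_per_epoch = 0.01, 1e-6, 500

for epoch in range(0, 31, 10):
    iterations = epoch * batches_per_epoch                      # global mini-batch counter
    lr_per_batch = base_lr * (1. / (1. + decay * iterations))   # decay driven by the mini-batch count, as in Keras
    lr_per_epoch = base_lr * (1. / (1. + decay * epoch))        # decay driven by the epoch index, as in the snippet above
    print(epoch, lr_per_batch, lr_per_epoch)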

André L

Feb 10, 2017, 6:09:58 AM
to Keras-users
Thanks for the reply. I made some changes... Would you please check it out? :)

By the way, I'm using the same decay parameter as in Keras, so I'm passing 1e-6 as the decay.

Here's a bigger snippet of the code:

print("Starting training...")

training_start_time = time.clock()

total_train_batches = 0

for epoch in range(0, max_epochs):

    # In each epoch, do a full pass over the training data:
    train_err = 0
    train_batches = 0
    start_time = time.time()
    for batch in self.minibatch_iterator(X, Y, self.batch_size):
        inputs, targets = batch
        train_err += self.train_fn(inputs, targets)
        train_batches += 1
    total_train_batches = total_train_batches + train_batches

    # And a full pass over the validation data:
    val_err = 0
    val_acc = 0
    val_batches = 0
    for batch in self.minibatch_iterator(X_val, Y_val, self.batch_size):
        inputs, targets = batch
        err, acc = self.val_fn(inputs, targets)
        val_err += err
        val_acc += acc
        val_batches += 1

    # Calculate results
    trainingLoss = (train_err / train_batches)
    validationLoss = (val_err / val_batches)
    validationAccuracy = (val_acc / val_batches * 100)

    # Then print the results for this epoch:
    print("Epoch {} of {} took {:.3f}s".format(epoch + 1, max_epochs, time.time() - start_time))
    print("  Training Loss:\t\t{:.6f}".format(trainingLoss))
    print("  Validation Loss:\t\t{:.6f}".format(validationLoss))
    print("  Validation Accuracy:\t\t{:.4f} %".format(validationAccuracy))

    # Apply the decay once per epoch, based on the total number of mini-batches seen so far:
    if decay > 0:
        lr = shared_learning_rate.get_value() * (1. / (1. + decay * total_train_batches))
        shared_learning_rate.set_value(np.float32(lr))

    print("...Current Learning Rate : {}".format(np.float32(shared_learning_rate.get_value())))

Klemen Grm

Feb 10, 2017, 6:14:58 AM
to Keras-users
You're calculating the lr decay correctly now, but you only update it once per epoch, whereas in Keras the decay is applied on every mini-batch.

André L

Feb 10, 2017, 6:19:59 AM
to Keras-users
Like this?

# Finally, launch the training loop.
print("Starting training...")

training_start_time = time.clock()

total_train_batches = 0

for epoch in range(0, max_epochs):

    # In each epoch, do a full pass over the training data:
    train_err = 0
    train_batches = 0
    start_time = time.time()
    for batch in self.minibatch_iterator(X, Y, self.batch_size):
        inputs, targets = batch
        train_err += self.train_fn(inputs, targets)
        train_batches += 1
        total_train_batches = total_train_batches + 1   # now counted per mini-batch

Klemen Grm

Feb 10, 2017, 6:26:33 AM
to Keras-users
That looks right.

André L

Feb 10, 2017, 6:28:56 AM
to Keras-users
Thanks for the help :)

Do you know what this learning rate decay is based on?

André L

Feb 10, 2017, 9:00:30 AM
to Keras-users
Are you sure the decay is applied at each mini-batch?
I'm seeing the learning rate decay too fast, even with a decay of 1e-6...


muni...@gmail.com

Jul 24, 2017, 9:48:22 PM
to Keras-users
Hi, @André L

Did you find the answer? I have the same question. In my program, one epoch has 800 mini-batches, so if the learning rate decays per mini-batch, it decays very fast even with decay = 1e-6.
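
As a rough sanity check of the scale (assuming 800 mini-batches per epoch and decay = 1e-6; the base learning rate of 0.01 is just a placeholder):

base_lr, decay, batches_per_epoch = 0.01, 1e-6, 800

for epoch in (1, 10, 50, 100):
    iterations = epoch * batches_per_epoch
    # learning rate after `iterations` parameter updates under the 1/(1 + decay*t) schedule
    print(epoch, base_lr / (1. + decay * iterations))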


On Friday, February 10, 2017 at 9:00:30 AM UTC-5, André L wrote:

François Chollet

Jul 25, 2017, 12:20:53 AM
to muni...@gmail.com, Keras-users
You can use a lower decay rate, or if you want a specific per-epoch schedule, use the LearningRateScheduler callback.
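
For example, a minimal sketch of the callback route (the schedule shape, the decay_per_epoch value, and the model / X_train / Y_train names are placeholders for this illustration, not anything prescribed by Keras):

from keras.callbacks import LearningRateScheduler

initial_lr = 0.01
decay_per_epoch = 0.1  # hypothetical per-epoch decay factor

def schedule(epoch):
    # 1/t-style decay, evaluated once per epoch instead of once per mini-batch
    return initial_lr / (1. + decay_per_epoch * epoch)

model.fit(X_train, Y_train, epochs=20,
          callbacks=[LearningRateScheduler(schedule)])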


Chong Wang

Jul 26, 2017, 6:28:05 PM
to François Chollet, Keras-users
I am using train_on_batch() instead of fit(), so I assume I cannot use a callback. How can I do step decay and change the learning rate between epochs? Should I do "model.optimizer.lr.assign(0.01)"?
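
One pattern that is often used with train_on_batch() is to set the optimizer's learning-rate variable directly through the backend with K.set_value. The sketch below is only an illustration: it assumes the optimizer was built with decay=0, and model, initial_lr, num_epochs, batch_size and get_minibatches are placeholders:

from keras import backend as K

for epoch in range(num_epochs):
    # hypothetical step decay: halve the learning rate every 10 epochs
    K.set_value(model.optimizer.lr, initial_lr * (0.5 ** (epoch // 10)))
    for X_batch, Y_batch in get_minibatches(X_train, Y_train, batch_size):
        model.train_on_batch(X_batch, Y_batch)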