Is Keras's evaluation of training loss only an approximation? If so, how do we compute the true training loss?

Mahesh Chandra

Feb 10, 2017, 8:02:07 AM2/10/17
to Keras-users
Hi,

The following is the implementation where the losses are averaged at the end of each epoch.
Typically, the loss after every epoch should be computed on the entire training dataset. In
Keras, however, it seems that the per-mini-batch losses are recorded during the epoch and then
averaged at the end. Is there any way to compute the full training loss after each epoch, for
instance by writing a callback to do that?

Note: Logging this kind of information after each epoch can also be useful for computing the
validation accuracy, validation loss, training accuracy, or any other metric, depending on the
requirement.

 
class BaseLogger(Callback):
    """Callback that accumulates epoch averages of metrics.

    This callback is automatically applied to every Keras model.
    """

    def on_epoch_begin(self, epoch, logs=None):
        self.seen = 0
        self.totals = {}

    def on_batch_end(self, batch, logs=None):
        logs = logs or {}
        batch_size = logs.get('size', 0)
        self.seen += batch_size
        for k, v in logs.items():
            if k in self.totals:
                self.totals[k] += v * batch_size
            else:
                self.totals[k] = v * batch_size

    def on_epoch_end(self, epoch, logs=None):
        if logs is not None:
            for k in self.params['metrics']:
                if k in self.totals:
                    # Make value available to next callbacks.
                    logs[k] = self.totals[k] / self.seen
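
As far as I can tell, these per-epoch averages are what end up in the History object that
model.fit() returns, so they can be inspected after training. A quick illustration, where
x_train and y_train stand in for your data:

history = model.fit(x_train, y_train, batch_size=32, epochs=10)
print(history.history['loss'])  # one averaged loss value per epoch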

Thanks
Mahesh
 
 

jpeg729

Feb 10, 2017, 8:54:45 AM2/10/17
to Keras-users
The loss reported for a mini-batch is the average loss per sample within that batch. You seem to be trying to calculate the average loss per sample over all the samples.
Keras, on the other hand, averages the per-mini-batch losses, each of which is itself the average per-sample loss over its mini-batch.

Mathematically, assuming that the batch size is constant, the two calculations are equivalent. The batch size is constant because it is a parameter of model.fit().
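
As a quick sanity check of that equivalence (pure NumPy, with made-up per-sample losses): when every batch has the same size, the mean of the per-batch means equals the mean over all samples.

import numpy as np

# Hypothetical per-sample losses: 100 samples split into 5 batches of 20.
losses = np.random.rand(100)
batches = losses.reshape(5, 20)

mean_of_batch_means = batches.mean(axis=1).mean()
overall_mean = losses.mean()
print(np.isclose(mean_of_batch_means, overall_mean))  # True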

It is, however, possible that neither calculation gives you what you want. If you have 105 samples and a batch size of 20, then you can either make 5 batches of 20 samples, discarding the other 5 samples, or you can make 6 batches of 20 samples, thus repeating 15 of them. I assume that Keras does the first of these. Either way, collecting the losses reported at the end of each mini-batch, as you do, will not give you exactly one loss per sample in your dataset.

In practice, the loss given by Keras is usually pretty close to the true value, and I assume the division of the dataset into batches is done the same way at each epoch, which means you can meaningfully compare the loss calculated by Keras from one epoch to the next.

You could run a prediction across the whole dataset and then calculate the loss by hand, but in practice, unless your batch size is too large for your training set, the epoch loss reported by Keras will be pretty close to the true figure.
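
If you do want the exact figure, something along these lines should work: a minimal sketch, assuming the training arrays (called x_train and y_train here) can be handed to the callback. model.evaluate() runs a forward pass over the whole dataset with the weights as they stand at the end of the epoch, so it gives the true loss rather than a running average over mini-batches.

from keras.callbacks import Callback

class FullTrainingLoss(Callback):
    """Sketch: report the loss over the full training set after each epoch."""

    def __init__(self, x_train, y_train):
        super(FullTrainingLoss, self).__init__()
        self.x_train = x_train
        self.y_train = y_train

    def on_epoch_end(self, epoch, logs=None):
        # evaluate() computes the loss with the current (end-of-epoch)
        # weights. If the model was compiled with extra metrics, it
        # returns a list whose first element is the loss.
        loss = self.model.evaluate(self.x_train, self.y_train, verbose=0)
        print('epoch %d: full training loss = %s' % (epoch, loss))

# Hypothetical usage:
# model.fit(x_train, y_train, batch_size=20, epochs=10,
#           callbacks=[FullTrainingLoss(x_train, y_train)])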

There are probably more important problems to worry about.