Difference between tf.keras.losses.categorical_crossentropy and tf.keras.metrics.CategoricalCrossentropy() when sample weights are present

Isaac Gerg

Apr 23, 2021, 12:20:50 PM
to Discuss
Hi,

I am building an image segmentation model.  However, some of the pixels are not labeled, and I want these to be ignored during the training/validation process.  To do this, I add a matrix w, the same size as the image, to my generator (i.e., the generator returns x, y, w).
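
Roughly, the setup looks like this (a minimal sketch: the toy model, shapes, and random data below are placeholders for illustration, not my actual code):

import numpy as np
import tensorflow as tf

# Toy per-pixel two-class segmentation model, purely illustrative.
inputs = tf.keras.Input(shape=(512, 512, 3))
outputs = tf.keras.layers.Conv2D(2, 1, activation="softmax")(inputs)
model = tf.keras.Model(inputs, outputs)

def generator():
    while True:
        x = np.random.rand(1, 512, 512, 3).astype("float32")                    # image batch
        y = np.eye(2, dtype="float32")[np.random.randint(0, 2, (1, 512, 512))]  # one-hot labels
        w = (np.random.rand(1, 512, 512) > 0.15).astype("float32")              # 0 = unlabeled pixel
        yield x, y, w  # Keras treats the third element as sample_weight

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.categorical_crossentropy,
    weighted_metrics=[tf.keras.metrics.CategoricalCrossentropy()],
)
model.fit(generator(), steps_per_epoch=10, epochs=1)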

When I compare the output of the backprop loss (or of running model.evaluate()), I get different numbers from tf.keras.losses.categorical_crossentropy than when I use weighted_metrics=[tf.keras.metrics.CategoricalCrossentropy()].

Based on my by-hand computations, the metrics version appears to be correct.  What I noticed about tf.keras.losses.categorical_crossentropy is that the loss scales the correct mean by (w.sum() / number_of_pixels).  Why would it do this, and how do I remove it?

For example, suppose my image is 262144 pixels (i.e., 512x512) and the mask covers 221185 of them.  When I compute the cross entropy via tf.keras.metrics.CategoricalCrossentropy(), I get 1.6678107976913452, which is exactly what I get by hand if I compute the loss using only the relevant pixels (i.e., where w = 1).

However, when I examine the loss from model.evaluate(), I get 1.4072216, which is (221185 / 262144) * 1.6678107976913452.  I don't understand why this normalization factor exists.
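
Here is a minimal, self-contained repro of what I'm seeing (random data, so the exact numbers will differ from the ones above; the shapes and mask fraction are placeholder assumptions):

import numpy as np
import tensorflow as tf

y_true = np.eye(2, dtype="float32")[np.random.randint(0, 2, (512, 512))]       # one-hot labels
y_pred = tf.nn.softmax(np.random.rand(512, 512, 2).astype("float32"), axis=-1) # per-pixel probabilities
w = (np.random.rand(512, 512) > 0.15).astype("float32")                        # 0 = ignored pixel

ce = tf.keras.losses.categorical_crossentropy(y_true, y_pred)  # per-pixel loss, shape (512, 512)

loss_value = tf.reduce_mean(ce * w)                      # what .fit()/.evaluate() report
metric_value = tf.reduce_sum(ce * w) / tf.reduce_sum(w)  # what the weighted metric reports

# loss_value comes out equal to (w.sum() / w.size) * metric_value
print(float(loss_value), float(metric_value), w.sum() / w.size)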

Thank you,
Isaac

Isaac Gerg

Apr 23, 2021, 12:34:26 PM
to Discuss, Isaac Gerg
Quick update in case someone else has this problem.  

The loss in the training loop of .fit() computes mean(tf.keras.losses.categorical_crossentropy(y, y_pred) * w), so the denominator of the mean is the total number of pixels in the image.  The metrics version, on the other hand, uses the "correct" denominator, which is w.sum().
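
One way to cancel the extra factor should be to rescale the mask so it averages to 1 over the image; a sketch, assuming the hypothetical generator() from my first post:

def generator_normalized():
    for x, y, w in generator():
        # w_norm.mean() == 1, so mean(ce * w_norm) == sum(ce * w) / w.sum(),
        # i.e., the same value the weighted metric reports.
        w_norm = (w * (w.size / w.sum())).astype("float32")
        yield x, y, w_norm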
