How is TensorFlow SparseCategoricalCrossentropy Implemented?


Depo Depo

Mar 6, 2021, 7:39:33 AM3/6/21
to Keras-users

I am working on a weighted version of SparseCategoricalCrossentropy. Right now my implementation converts y_true to one-hot form, computes the cross entropy, and then multiplies it by a weight matrix. I get the same output as SparseCategoricalCrossentropy when all weights are 1, but the one-hot encoding is the problem: I have a lot of classes (32 plus background), and with one-hot encoding I run out of memory for large images/batch sizes, which does not happen with SparseCategoricalCrossentropy. I am trying to figure out how the built-in loss is implemented (is there a way to avoid one-hot encoding?). How is it implemented, or where is it implemented? Looking at [1], it is probably implemented on the native side, but I cannot find it.

[1] https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/losses.py#L692
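For what it's worth, the one-hot step can usually be avoided entirely by indexing with the integer labels: gather the per-class weight and the predicted probability for each true class directly. A minimal NumPy sketch of the idea (the function name `weighted_sparse_ce` is made up for illustration, not a TensorFlow API):

```python
import numpy as np

def weighted_sparse_ce(y_true, probs, class_weights):
    """Weighted sparse categorical crossentropy without one-hot labels.

    y_true:        (N,) integer class ids
    probs:         (N, C) predicted probabilities (rows sum to 1)
    class_weights: (C,) per-class weights
    """
    rows = np.arange(len(y_true))
    # Index the true-class probability directly instead of multiplying
    # by an (N, C) one-hot matrix -- this is where the memory saving is.
    picked = probs[rows, y_true]
    return -class_weights[y_true] * np.log(picked)

# Sanity check against the explicit one-hot formulation.
rng = np.random.default_rng(0)
probs = rng.random((4, 33))
probs /= probs.sum(axis=1, keepdims=True)
y_true = np.array([0, 5, 32, 7])
w = np.ones(33)

one_hot = np.eye(33)[y_true]
dense = -(one_hot * np.log(probs)).sum(axis=1)
sparse = weighted_sparse_ce(y_true, probs, w)
print(np.allclose(dense, sparse))  # True
```

In TensorFlow the same pattern would be `tf.gather(class_weights, y_true)` multiplied by the unreduced per-sample loss; the point is only that nothing here needs an (N, C) one-hot tensor.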

Lance Norskog

Mar 6, 2021, 10:56:40 PM3/6/21
to Depo Depo, Keras-users
The code at the URL you give delegates to the sparse_categorical_crossentropy function later in that same file, which in turn calls K.sparse_categorical_crossentropy in the Keras backend. In the TensorFlow backend that function does not one-hot encode the labels: it converts the output back to logits and calls tf.nn.sparse_softmax_cross_entropy_with_logits, which takes integer labels directly. So the sparse variant is essentially a convenience wrapper that skips the one-hot step rather than performing it.
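To make the label handling concrete, here is a NumPy sketch of what tf.nn.sparse_softmax_cross_entropy_with_logits computes (the function name below is illustrative, not the real op): a numerically stable log-softmax, then the true-class entry picked by integer index, with no one-hot tensor ever materialized.

```python
import numpy as np

def sparse_softmax_xent(labels, logits):
    """Sketch of sparse softmax cross entropy from logits.

    labels: (N,) integer class ids
    logits: (N, C) unnormalized scores
    """
    # Subtract the row max for numerical stability, then log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the true-class log-probability by index -- no one-hot matrix.
    return -log_probs[np.arange(len(labels)), labels]

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.2]])
labels = np.array([0, 1])
loss = sparse_softmax_xent(labels, logits)

# Cross-check against the explicit one-hot formulation.
one_hot = np.eye(3)[labels]
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
dense = -(one_hot * log_probs).sum(axis=1)
print(np.allclose(loss, dense))  # True
```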

On Sat, Mar 6, 2021 at 4:39 AM Depo Depo <de...@robotics.neu.edu.tr> wrote:



--
Lance Norskog
lance....@gmail.com
Redwood City, CA