How is TensorFlow SparseCategoricalCrossentropy Implemented?


Depo Depo

Mar 6, 2021, 7:39:33 AM3/6/21
to Keras-users

I am working on a weighted version of SparseCategoricalCrossentropy. Right now my implementation converts y_true to one-hot form, computes the cross entropy, and then multiplies it by a weight matrix. I get the same output as SparseCategoricalCrossentropy when all weights are 1, but the one-hot encoding is the problem: I have a lot of classes (32 plus background), and with one-hot encoding I run out of memory for large images/batch sizes, which does not happen with SparseCategoricalCrossentropy. I am trying to figure out how the built-in loss is implemented (is there a way to avoid one-hot encoding?). How is it implemented, or where is it implemented? Looking at [1], it is probably implemented on the native side, but I cannot find it.

[1] https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/keras/losses.py#L692
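For what it's worth, the one-hot step can usually be avoided entirely by indexing with the integer labels: gather the per-class weight and the predicted probability for each true class directly. A minimal NumPy sketch of the idea (the function name `weighted_sparse_ce` is made up for illustration, not a TensorFlow API):

```python
import numpy as np

def weighted_sparse_ce(y_true, probs, class_weights):
    """Weighted sparse categorical crossentropy without one-hot labels.

    y_true:        (N,) integer class ids
    probs:         (N, C) predicted probabilities (rows sum to 1)
    class_weights: (C,) per-class weights
    """
    rows = np.arange(len(y_true))
    # Index the true-class probability directly instead of multiplying
    # by an (N, C) one-hot matrix -- this is where the memory saving is.
    picked = probs[rows, y_true]
    return -class_weights[y_true] * np.log(picked)

# Sanity check against the explicit one-hot formulation.
rng = np.random.default_rng(0)
probs = rng.random((4, 33))
probs /= probs.sum(axis=1, keepdims=True)
y_true = np.array([0, 5, 32, 7])
w = np.ones(33)

one_hot = np.eye(33)[y_true]
dense = -(one_hot * np.log(probs)).sum(axis=1)
sparse = weighted_sparse_ce(y_true, probs, w)
print(np.allclose(dense, sparse))  # True
```

In TensorFlow the same pattern would be `tf.gather(class_weights, y_true)` multiplied by the unreduced per-sample loss; the point is only that nothing here needs an (N, C) one-hot tensor.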

Lance Norskog

Mar 6, 2021, 10:56:40 PM3/6/21
to Depo Depo, Keras-users
The code at the URL you give delegates to the sparse_categorical_crossentropy function later in that same file, which in turn calls K.sparse_categorical_crossentropy in the Keras backend. In the TensorFlow backend that function does not one-hot encode the labels: it converts the output back to logits and calls tf.nn.sparse_softmax_cross_entropy_with_logits, which takes integer labels directly. So the sparse variant is essentially a convenience wrapper that skips the one-hot step rather than performing it.
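To make the label handling concrete, here is a NumPy sketch of what tf.nn.sparse_softmax_cross_entropy_with_logits computes (the function name below is illustrative, not the real op): a numerically stable log-softmax, then the true-class entry picked by integer index, with no one-hot tensor ever materialized.

```python
import numpy as np

def sparse_softmax_xent(labels, logits):
    """Sketch of sparse softmax cross entropy from logits.

    labels: (N,) integer class ids
    logits: (N, C) unnormalized scores
    """
    # Subtract the row max for numerical stability, then log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the true-class log-probability by index -- no one-hot matrix.
    return -log_probs[np.arange(len(labels)), labels]

logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.2]])
labels = np.array([0, 1])
loss = sparse_softmax_xent(labels, logits)

# Cross-check against the explicit one-hot formulation.
one_hot = np.eye(3)[labels]
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
dense = -(one_hot * log_probs).sum(axis=1)
print(np.allclose(loss, dense))  # True
```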

On Sat, Mar 6, 2021 at 4:39 AM Depo Depo <de...@robotics.neu.edu.tr> wrote:



--
Lance Norskog
lance....@gmail.com
Redwood City, CA