Cross entropy loss for a full probability distribution?


Dylan Rhodes

Jul 28, 2015, 2:49:03 PM7/28/15
to Caffe Users
Hi, I'd like to train a network with a softmax/cross entropy loss function. My labels are full probability distributions, not just one-hot vectors. Does Caffe have the capability to calculate the full cross entropy loss function? Neither the multinomial logistic loss layer nor the softmax loss layer accepts a probability distribution as a label; both take only hard labels specified as the index of the true class. I can implement it myself, but first I'd like to confirm that Caffe doesn't already include it.

For clarity, I don't want to use the sigmoid cross entropy loss layer, because the sigmoid function doesn't produce a probability distribution.
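[For readers landing on this thread: the loss being asked for is the full cross entropy H(p, q) = -sum_k p_k log(q_k), where p is an arbitrary target distribution and q is the softmax of the network's logits. A minimal numpy sketch (illustration only, not Caffe code):]

```python
import numpy as np

def softmax(z):
    # shift by the max logit for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(p, q, eps=1e-12):
    # full cross entropy H(p, q) = -sum_k p_k * log(q_k);
    # p may be any distribution, not just a one-hot vector
    return -np.sum(p * np.log(q + eps))

logits = np.array([2.0, 1.0, 0.1])
target = np.array([0.7, 0.2, 0.1])  # a full distribution, not one-hot
q = softmax(logits)
loss = cross_entropy(target, q)
```

[When the target happens to be one-hot, this reduces to -log(q_i) for the true class i, which is exactly what Caffe's standard softmax loss computes.]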

Dylan Rhodes

Jul 28, 2015, 5:50:12 PM7/28/15
to Caffe Users, dylanr%st...@gtempaccount.com
I went ahead and just did it myself.

Noa Arbel

Feb 29, 2016, 11:17:06 AM2/29/16
to Caffe Users, dylanr%st...@gtempaccount.com
Hi,

Did you implement the cross entropy loss function (without sigmoid) in C++ and add it to your copy of Caffe? Did it work?

Noa

Dylan Rhodes

Mar 2, 2016, 7:48:26 PM3/2/16
to Noa Arbel, Caffe Users, dylanr%st...@gtempaccount.com
Hey Noa,

I did wind up implementing the full softmax/cross entropy loss function. It did work - you'll just need to recompile Caffe after adding the file in order to use it in network architectures. I don't have the code with me now, but I could put it on GitHub if you want. If you're interested in implementing it yourself, there are just a few modifications to make to the existing softmax loss layer, and this covers the differentiation: http://stats.stackexchange.com/questions/79454/softmax-layer-in-a-neural-network

-Dylan
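[The differentiation Dylan links to gives the key result that makes the layer modification small: for the fused softmax + cross entropy loss, the gradient with respect to the logits is simply dL/dz_k = q_k - p_k, where q = softmax(z) and p is the target distribution. A numpy sketch verifying this against finite differences (illustration only):]

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def loss(z, p):
    # fused softmax + full cross entropy
    return -np.sum(p * np.log(softmax(z)))

z = np.array([0.5, -1.2, 0.3, 2.0])
p = np.array([0.1, 0.2, 0.3, 0.4])  # target distribution, sums to 1

# analytic gradient: dL/dz = softmax(z) - p
analytic = softmax(z) - p

# central finite-difference check of each component
eps = 1e-6
numeric = np.array([
    (loss(z + eps * np.eye(len(z))[i], p) -
     loss(z - eps * np.eye(len(z))[i], p)) / (2 * eps)
    for i in range(len(z))
])
```

[This is why only a few lines of the existing softmax loss layer's backward pass need to change: the one-hot case's `q - one_hot` gradient generalizes directly to `q - p`.]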

Jan C Peters

Mar 3, 2016, 3:44:25 AM3/3/16
to Caffe Users, noa....@gmail.com, dylanr%st...@gtempaccount.com
Hi Dylan,

I think that would be a great extension to caffe itself. So maybe you want to go the extra mile and add a PR for that in the official caffe repo? I am sure lots of users will thank you for it.

Jan

Noa Arbel

Mar 3, 2016, 10:49:40 AM3/3/16
to Caffe Users, noa....@gmail.com, dylanr%st...@gtempaccount.com
Thanks Dylan.

When you write "softmax/cross entropy loss function", do you mean a layer that contains both softmax and cross entropy? Because a softmax layer already exists in Caffe, and I thought to implement just the cross-entropy loss.

I agree with Jan - this layer may be useful for other Caffe users, so I think it would be best if you can put it on GitHub.
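[On Noa's question: loss layers in Caffe (and most frameworks) typically fuse softmax and cross entropy into one layer rather than stacking a separate cross-entropy loss on top of SoftmaxLayer. Besides the simple `q - p` gradient, the fused form can be evaluated stably via log-sum-exp: H(p, softmax(z)) = logsumexp(z) - sum_k p_k z_k, which never exponentiates large logits. A numpy sketch of the idea (illustration only):]

```python
import numpy as np

def stable_cross_entropy(z, p):
    # fused softmax + cross entropy via log-sum-exp:
    # log q_k = z_k - logsumexp(z), so H(p, q) = logsumexp(z) - sum_k p_k z_k
    m = np.max(z)
    lse = m + np.log(np.sum(np.exp(z - m)))
    return lse - np.dot(p, z)

# extreme logits would overflow a naive exp() in a separate softmax layer,
# but the fused formulation stays finite
z = np.array([1000.0, 0.0, -1000.0])
p = np.array([0.9, 0.1, 0.0])
loss = stable_cross_entropy(z, p)
```

[A separate cross-entropy layer fed by SoftmaxLayer's output also works, but it has to guard against log(0) when the softmax saturates; the fused layer avoids that case entirely.]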