Difference between SoftmaxWithLoss and SigmoidCrossEntropyLoss


anand

Aug 8, 2017, 5:38:34 AM
to Caffe Users
Hello all,

I am a bit confused by the classification loss layers provided in Caffe. Can anyone help me understand how exactly SigmoidCrossEntropyLoss differs from SoftmaxWithLoss in terms of the Caffe implementation, and how they compute the loss between logits and labels?

Any help would be really appreciated.

Thank you.

Regards,
Anand 

Jonathan R. Williford

Aug 8, 2017, 6:36:55 AM
to anand, Caffe Users
Hi Anand,

SoftmaxWithLoss is a more numerically stable version of a SoftmaxLayer followed by a MultinomialLogisticLossLayer. It is suitable for one-of-many classification tasks, i.e. where there is exactly one true class (as in ImageNet, where one object is considered "the object" of an image). Softmax makes the probabilities of the classes sum to 1. When you use softmax, you are asking for the probability of a given class, given that the data contains exactly one class.
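
Roughly, in NumPy terms, it computes the following (just a sketch of the math, not Caffe's actual C++ code; the max-subtraction is the usual log-sum-exp trick behind the numerical stability):

import numpy as np

def softmax_with_loss(logits, label):
    # Cross-entropy of softmax(logits) against a single integer class label.
    shifted = logits - np.max(logits)                      # log-sum-exp shift, avoids overflow
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))  # log of the softmax
    return -log_probs[label]                               # -log p(true class)

print(softmax_with_loss(np.array([2.0, 0.5, -1.0]), label=0))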

SigmoidCrossEntropyLoss is a more numerically stable version of a SigmoidLayer followed by a cross-entropy layer. It is suitable for multi-label classification, where an image may contain several object classes at once. Sigmoid treats each class independently, so the sum of the class probabilities can be more (or less) than 1.
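
Again as a NumPy sketch (my own illustration, not the layer's actual code; the rewritten expression below is the standard stable form of the binary cross-entropy, and avoiding the naive sigmoid-then-log computation is exactly why the fused layer exists):

import numpy as np

def sigmoid_cross_entropy(logits, targets):
    # Sum over classes of -[z*log(s(x)) + (1-z)*log(1-s(x))], s(x) = 1/(1+exp(-x)),
    # rewritten as max(x,0) - x*z + log(1 + exp(-|x|)) so large |x| cannot overflow.
    x, z = logits, targets
    return np.sum(np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x))))

# Multi-label targets: the example contains classes 0 and 2 but not class 1.
print(sigmoid_cross_entropy(np.array([3.0, -2.0, 1.5]),
                            np.array([1.0, 0.0, 1.0])))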

See the Doxygen pages for the equations.


For SigmoidCrossEntropyLoss see:
http://caffe.berkeleyvision.org/doxygen/classcaffe_1_1SigmoidCrossEntropyLossLayer.html#details

Best,
Jonathan


anand dubey

Aug 8, 2017, 10:23:08 AM
to Jonathan R. Williford, Caffe Users
Hi Jonathan,
Thank you for your time and detailed reply.
What I understand from the link and from Google is:
SoftmaxWithLoss:     E = -sum_k L_k * log(softmax(p)_k)
SigmoidCrossEntropy: E = -sum_k ( L_k * log(sigmoid(p_k)) + (1 - L_k) * log(1 - sigmoid(p_k)) )
where L is the label vector and p are the logits from the network.

So basically they compute a cross-entropy loss by the same means, except that SoftmaxWithLoss applies the softmax function across all logits jointly, and SigmoidCrossEntropy applies the sigmoid (logistic) function to each logit independently, before the loss calculation.
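
A quick toy check of that difference (made-up logits, plain NumPy):

import numpy as np

logits = np.array([2.0, 0.5, -1.0])

softmax = np.exp(logits - logits.max())
softmax /= softmax.sum()
sigmoid = 1.0 / (1.0 + np.exp(-logits))

print(softmax, softmax.sum())   # coupled across classes, always sums to 1.0
print(sigmoid, sigmoid.sum())   # each class squashed independently, sum need not be 1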

Please correct me if my understanding is wrong.

Thank you.

Regards,
Anand
--
Anand Dubey
TU Chemnitz
+49 1766 00 17028
