MultinomialLogisticLoss with softmax activations


Raúl Gombru

May 16, 2018, 7:35:13 AM
to Caffe Users
Hi,

I want to use MultinomialLogisticLoss (cross-entropy) with softmax activations (which is usually called softmax loss) for multi-label classification.
I know that intuitively it doesn't make sense to use softmax activations for a multi-label problem, but Facebook stated in their paper "Exploring the Limits of Weakly Supervised Pretraining" that they work better than sigmoid activations.

But I have found that both the MultinomialLogisticLoss and the SoftMaxWithLoss layers require integers (class label indices) as targets, while I need to use real-valued targets. The only cross-entropy loss layer that accepts real-valued targets is SigmoidCrossEntropyLoss, but that uses sigmoid activations, and I want softmax.
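
Concretely, what I want is cross-entropy computed over softmax probabilities, but with a real-valued target distribution per sample instead of a single class index. A rough numpy sketch of that loss (outside Caffe, just to be explicit; the function name is mine):

import numpy as np

def softmax_cross_entropy(scores, targets):
    # scores, targets: (batch, num_classes); targets are real-valued
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    # cross-entropy with soft targets, averaged over the batch
    return -(targets * np.log(probs + 1e-12)).sum(axis=1).mean()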

Is there any solution to do this with default Caffe?

In this Stack Overflow question someone proposes modifying the MultinomialLogisticLoss layer, but I'd like to avoid that: https://stackoverflow.com/questions/38070104/how-do-i-go-about-having-a-cross-entropy-layer-in-caffe

Thanks

Przemek D

Aug 20, 2018, 10:15:54 AM
to Caffe Users
Since this seems to be experiment-level work, you might try creating a custom Python layer for it. Make sure to use numpy whenever possible and it shouldn't be terribly slow. This gives you the most flexibility, which you may need.
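
A minimal sketch of what such a layer could look like, assuming bottom[0] holds the raw scores (N x C), bottom[1] holds real-valued targets that sum to 1 per sample, and the class name is a placeholder you choose yourself (this is not an existing Caffe layer):

import caffe
import numpy as np

class SoftmaxCrossEntropyLossLayer(caffe.Layer):
    def setup(self, bottom, top):
        if len(bottom) != 2:
            raise Exception("Need two bottoms: scores and targets.")

    def reshape(self, bottom, top):
        if bottom[0].data.shape != bottom[1].data.shape:
            raise Exception("Scores and targets must have the same shape.")
        top[0].reshape(1)  # the loss is a scalar

    def forward(self, bottom, top):
        scores = bottom[0].data
        scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
        self.probs = np.exp(scores)
        self.probs /= self.probs.sum(axis=1, keepdims=True)
        loss = -(bottom[1].data * np.log(self.probs + 1e-12)).sum(axis=1)
        top[0].data[...] = loss.mean()

    def backward(self, top, propagate_down, bottom):
        # With targets summing to 1, d(loss)/d(scores) = probs - targets
        if propagate_down[0]:
            num = bottom[0].data.shape[0]
            bottom[0].diff[...] = (self.probs - bottom[1].data) / num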

Raúl Gombru

Aug 20, 2018, 10:29:17 AM
to Caffe Users
I implemented it, and I link it here in case anyone is interested: https://gist.github.com/gombru/53f02ae717cb1dd2525be090f2d41055
In this blog post I explain how it works and why I wanted to test it, as Facebook did: https://gombru.github.io/2018/05/23/cross_entropy_loss/
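
For anyone wiring up a Python layer like this: it goes into the prototxt as a layer of type "Python", something along these lines (the blob names and the module/layer strings below are placeholders; module must match your .py filename and layer your class name, and Caffe has to be built with WITH_PYTHON_LAYER := 1):

layer {
  name: "loss"
  type: "Python"
  bottom: "fc8"
  bottom: "labels"
  top: "loss"
  loss_weight: 1
  python_param {
    module: "softmax_cross_entropy_loss"
    layer: "SoftmaxCrossEntropyLossLayer"
  }
}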

Thanks,
Closing

Tung Hoang

Aug 20, 2018, 3:07:15 PM
to Caffe Users
Should we use the SoftmaxWithLoss layer?

/T

Raúl Gombru

Aug 21, 2018, 4:08:32 AM
to Caffe Users
The SoftMaxWithLoss layer requires integers (class label indices) as targets, while I need to use real-valued targets. SoftMaxWithLoss is only valid for multi-class classification, not for multi-label (see https://gombru.github.io/2018/05/23/cross_entropy_loss/).
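
To make the difference concrete (illustrative values only):

# SoftmaxWithLoss expects one class index per sample (multi-class):
label = 2                      # "this sample belongs to class 2"
# My multi-label case needs a real-valued vector per sample:
target = [0.0, 0.5, 0.0, 0.5]  # e.g. two active labels, equally weighted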