gradient descent of softmax loss

Feb 6, 2015, 5:40:25 AM

to caffe...@googlegroups.com

Dear colleagues,

I have a question about why the gradient of the entropy loss that I derive by hand differs from the implementation in softmax_loss_layer.cpp.

From my calculation:

Entropy = sum_i(-log Pi(x=li));

deltaEntropy = -1 + Pi(x=li);

where i is the pixel index and li is the ground-truth label of pixel i.

From the implementation (in softmax_loss_layer.cpp):

Dtype* bottom_diff = (*bottom)[0]->mutable_cpu_diff();

for (int i = 0; i < num; ++i) {
  for (int j = 0; j < spatial_dim; ++j) {
    bottom_diff[i * dim + static_cast<int>(label[i * spatial_dim + j]) * spatial_dim + j] -= 1;
  }
}

But I cannot see where the Pi(x=li) term appears in the implementation.

Any hint would be appreciated!
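
One possible hint, offered as something to check against your own copy of the source rather than as a definitive answer: in the versions of softmax_loss_layer.cpp I am aware of, the backward pass copies the softmax output (prob_) into bottom_diff just before the loop quoted above. If that holds for your version, every entry of bottom_diff already contains its probability, and the "-= 1" at the label channel completes the -1 + Pi(x=li) term. The sketch below reproduces that two-step construction for a single pixel and compares it with a finite-difference gradient. The names softmax, softmax_loss_grad and numeric_grad are hypothetical helpers written for this illustration, not Caffe functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax over the class scores of one pixel (max-subtracted for stability).
std::vector<double> softmax(const std::vector<double>& z) {
  double m = z[0];
  for (double v : z) m = std::fmax(m, v);
  std::vector<double> p(z.size());
  double sum = 0.0;
  for (std::size_t k = 0; k < z.size(); ++k) {
    p[k] = std::exp(z[k] - m);
    sum += p[k];
  }
  for (double& v : p) v /= sum;
  return p;
}

// Loss for one pixel: -log of the probability assigned to the true label.
double loss(const std::vector<double>& z, int label) {
  return -std::log(softmax(z)[label]);
}

// Gradient w.r.t. the scores z, built in two steps:
// step 1 copies the probabilities, step 2 subtracts 1 at the label channel.
std::vector<double> softmax_loss_grad(const std::vector<double>& z, int label) {
  assert(label >= 0 && label < static_cast<int>(z.size()));
  std::vector<double> diff = softmax(z);  // step 1: diff[k] = P(x = k)
  diff[label] -= 1.0;                     // step 2: the "-= 1" from the question
  return diff;
}

// Central finite difference of the loss w.r.t. score k, used as a reference.
double numeric_grad(std::vector<double> z, int label, std::size_t k) {
  const double eps = 1e-6;
  z[k] += eps;
  const double up = loss(z, label);
  z[k] -= 2.0 * eps;
  const double down = loss(z, label);
  return (up - down) / (2.0 * eps);
}
```

Calling softmax_loss_grad on a few scores and comparing each entry with numeric_grad should show agreement to several decimal places. Note that the non-label channels get the gradient P(x = k) with no -1, which is why the copy step cannot be skipped.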
