Infogain Loss Layer for semantic segmentation


Alex Ter-Sarkisov

Aug 12, 2017, 8:37:31 AM

to Caffe Users

I have a semantic segmentation problem with 2 classes, so for a 250x250 image the prediction dims are 2x250x250. Ideally I'd like to have the following H matrix:

[1, 0,
 10, 1]

since there are 2 classes per pixel, but the documentation tells me it has to be much larger (KxK, i.e. 2*250*250 x 2*250*250). So should I have such a matrix, twice the size of an image (i.e. a separate matrix for every image, which is quite hard)? I couldn't find a solution anywhere.

Shai Bagon

Aug 14, 2017, 5:37:45 AM

to Caffe Users

(1) Your H matrix makes no sense. You are going to pay maximal loss (10) when correctly predicting the foreground. See this stackoverflow thread for more information.

(2) If you have two labels, your H matrix should be 2x2.

(3) Make sure you have a recent version of the `"InfogainLoss"` layer (including PR #3855): this update adjusted the layer to handle "per-pixel" prediction (rather than a single prediction for the entire input).

See this stackoverflow answer for more details.
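To illustrate points (2) and (3), here is a minimal NumPy sketch of a per-pixel infogain loss. It is not Caffe's actual implementation, only the same idea under some assumptions: the function name `infogain_loss` is hypothetical, the diagonal weights 1 and 10 are example values, rows of H index the true label, and the loss is averaged over all pixels (the real layer's normalization depends on its settings). Note that H stays 2x2 regardless of the image size, because the row H[l] is simply looked up for each pixel's label l.

```python
import numpy as np

def infogain_loss(prob, labels, H):
    """Mean per-pixel infogain loss (illustrative sketch, not Caffe code).

    prob:   (K, height, width) class probabilities, e.g. a softmax output
    labels: (height, width) integer ground-truth labels in [0, K)
    H:      (K, K) infogain matrix; row = true label, column = predicted class
    """
    log_p = np.log(np.clip(prob, 1e-12, 1.0))   # guard against log(0)
    log_p = np.moveaxis(log_p, 0, -1)           # -> (height, width, K)
    weights = H[labels]                         # row H[l] per pixel -> (height, width, K)
    return -(weights * log_p).sum() / labels.size

# Toy data: 2 classes, one 250x250 image (random values, for illustration only).
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 250, 250))
prob = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
labels = rng.integers(0, 2, size=(250, 250))

# Example 2x2 H: errors on class-1 (foreground) pixels cost 10x more.
H = np.array([[1.0, 0.0],
              [0.0, 10.0]])

loss = infogain_loss(prob, labels, H)
print(loss)
```

With H set to the identity matrix this reduces to the ordinary per-pixel cross-entropy, which is a quick sanity check for any H you design.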

Alex Ter-Sarkisov

Aug 17, 2017, 4:06:35 AM

to Caffe Users

Thanks Shai, I of course meant

[1, 1,
 1, 10]

to penalize classifying class 1 as 0. If I do it this way, will it be applied pixelwise with the new PR? Also, in order to upload the matrix, is your answer from 25/12/2014 still correct?