Training data of Fully Convolutional Neural Network image segmentation


Filip K

Jan 29, 2018, 9:22:52 AM
to Caffe Users
I am currently working on implementing a neural network for semantic image segmentation, and am trying to reproduce one of the existing solutions, the Fully Convolutional Network (FCN) [1].

The data I am using is based on the PASCAL-Context dataset [2], which adds labels beyond the original 20-class PASCAL VOC dataset, resulting in a dataset with over 450 classes.


***Problem***

The initial 20 classes do not match the classes I need for indoor scenes. Therefore, I have created a short list of 12 classes that I would like to capture, all of which are present in the 450-class PASCAL-Context dataset.

I managed to convert the data and am now trying to start training. I am following this MATLAB tutorial [3], which provides an example image with a class overlay, where all labeled pixels are colored.
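In case it helps, the kind of conversion I did can be sketched like this. Note that the label IDs in the mapping below are placeholders for illustration, not the real PASCAL-Context IDs, which you would have to look up in the 450-class label list:

```python
import numpy as np

# Placeholder PASCAL-Context label IDs -> my reduced indoor class IDs.
# The real source IDs must be taken from the 450-class label list.
CONTEXT_TO_INDOOR = {
    46: 1,  # tvmonitor (placeholder ID)
    58: 2,  # sofa (placeholder ID)
    24: 3,  # wall (placeholder ID)
    # ... the remaining 9 classes ...
}
BACKGROUND = 0  # everything not on my class list

def remap_labels(labels):
    """Map a 450-class label array to the reduced indoor class set.

    Any pixel whose label is not in the mapping becomes BACKGROUND.
    """
    out = np.full(labels.shape, BACKGROUND, dtype=np.uint8)
    for ctx_id, indoor_id in CONTEXT_TO_INDOOR.items():
        out[labels == ctx_id] = indoor_id
    return out
```

I then load each label image (e.g. with PIL), run it through `remap_labels`, and save the result as an 8-bit PNG so Caffe can read it as a single-channel label map.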

However, in my scenario I only want to distinguish elements such as tvmonitor, sofa and wall, ignoring all other elements that might be present. The MATLAB tutorial states: "Areas with no color overlay do not have pixel labels and are not used during training."



In my example picture, two of my classes are present, but I am not sure whether I should also include a Background class, which would put an overlay on everything not on my class list, and train with it as an additional class.

In summary, I am wondering whether Background needs to be provided as an additional class alongside the classes I want to recognize, given that background usually covers the majority of each image. Would that result in everything being classified as background?

Przemek D

Jan 31, 2018, 4:28:36 AM
to Caffe Users
Providing a background class can help, as long as it doesn't introduce a huge disproportion into your data (e.g. 97% background vs. 3% objects). Networks can usually deal with some imbalance, but if there is far more background than anything else and the objects are difficult to recognize, then it shouldn't be surprising that the easiest answer the network can give (and the one closest to the truth!) is that everything is background.
An alternative approach is to provide an ignored class: you can make the loss function skip some areas entirely (by assigning those areas a special ignore_label), preventing it from propagating anything to such pixels. This is how to do it.
In some of my segmentation problems I did both: marked part of the image as background and part as ignore (to counter a rather large imbalance), and it worked pretty well.
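For completeness, here is a sketch of what the ignore_label mechanism looks like in a Caffe train prototxt. This assumes you wrote the ignore value 255 into your label images and that your blob names (score, label) follow the FCN reference nets; adjust both to your setup:

```protobuf
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "score"   # network predictions
  bottom: "label"   # ground-truth label map
  top: "loss"
  loss_param {
    ignore_label: 255   # pixels with this label contribute no loss or gradient
    normalize: false    # FCN-style: sum the loss instead of averaging per pixel
  }
}
```

Any pixel labeled 255 in your ground truth is then simply invisible to training, which is exactly what the MATLAB tutorial describes for uncolored areas.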

Filip K

Jan 31, 2018, 6:04:49 AM
to Caffe Users
Oh OK, that is great. I have decided to use Background for now, since in my data it doesn't seem to be as dominant as in your example above, so I will give it a try.