Hello there!
I'm trying to reproduce a network for text localization and recognition based on the fully-convolutional regression network from this paper by Ankush Gupta et al. (http://www.robots.ox.ac.uk/~ankush/textloc.pdf), but I have run into a problem.
According to the paper, this approach requires that "If a cell does not contain a ground-truth word, the loss ignores all parameters but c (text/no-text)."
I do not understand how this can be implemented in Caffe. Do the standard multi-label loss layers (such as Hinge or Euclidean, or maybe others) have any setting that implements this technique, or can it only be done by modifying the source code of the layer? I assume the latter would be rather slow if the modified layer only runs on the CPU, is that right?
Or can this problem be solved some other way?
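
To make clear what behaviour I am after, here is a minimal NumPy sketch of the loss as I understand it from the quoted sentence. It is not Caffe code, and several things in it are my own assumptions rather than anything taken from the paper: the function name `masked_regression_loss`, the `(N, P, H, W)` blob layout, the choice of channel 0 for c, and the use of a squared-error loss scaled by 1/(2N).

```python
import numpy as np


def masked_regression_loss(pred, target, has_word):
    """Squared-error loss where cells without a ground-truth word
    contribute only through the confidence channel c.

    pred, target: arrays of shape (N, P, H, W). Channel 0 is assumed
                  to hold c (text/no-text); this layout is hypothetical.
    has_word:     array of shape (N, H, W), 1 where the cell contains
                  a ground-truth word and 0 elsewhere.
    Returns the scalar loss and its gradient with respect to pred.
    """
    n, p = pred.shape[0], pred.shape[1]

    # Broadcast the per-cell flag over all P parameter channels.
    mask = np.repeat(has_word[:, None, :, :].astype(pred.dtype), p, axis=1)
    # The confidence channel is always penalised, word or no word.
    mask[:, 0, :, :] = 1.0

    diff = mask * (pred - target)
    loss = 0.5 * np.sum(diff ** 2) / n
    # The mask is binary, so the gradient is just the masked difference.
    grad = diff / n
    return loss, grad
```

If this reading is right, multiplying both the prediction and the label by the same mask before an unmodified Euclidean loss should give the same value and gradient, so perhaps an Eltwise layer with the PROD operation in front of the loss would be enough, but I am not sure this is the intended way.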
Many thanks for your attention!