2D argmax after softmax in FCN


Etienne Perot

Sep 12, 2015, 12:30:35 PM
to Caffe Users
Hello,

I'm testing the FCN from C++ code with a small model I fine-tuned from CaffeNet.

The model takes 50 ms to run on my Quadro K2000M. I had to write a 2D argmax like this:

cv::Mat Argmax(const boost::shared_ptr<caffe::Blob<float> >& probLayer){
    // Assumes a single image (num == 1) and at most 256 classes (CV_8U output).
    const int w_max = probLayer->width();
    const int h_max = probLayer->height();
    const int c_max = probLayer->channels();
    const float* data = probLayer->cpu_data();
    cv::Mat classes(cv::Size(w_max, h_max), CV_8U);
    const int plane_size = w_max*h_max;
    for (int y = 0; y < h_max; y++) {
        uchar* optr = classes.ptr<uchar>(y);
        const int yoffset = y*w_max;
        for (int x = 0; x < w_max; x++) {
            const int xoffset = yoffset+x;
            // Start from channel 0, then compare against the remaining channels.
            int max_label = 0;
            float max_proba = data[xoffset];
            for (int c = 1; c < c_max; c++) {
                // Channel planes are plane_size floats apart.
                const float proba = data[c*plane_size + xoffset];
                if (max_proba < proba) {
                    max_label = c;
                    max_proba = proba;
                }
            }
            *optr++ = static_cast<uchar>(max_label);
        }
    }
    return classes;
}

This ugly hack takes around 10 ms for 7 classes on a 640 x 480 image (which is about one fifth of the forward pass!).

Does anybody know if this feature is already implemented in Caffe? The ArgMax layer seems to take the max over all trailing dimensions, and I want a 2D output.



Evan Shelhamer

Sep 12, 2015, 4:41:42 PM
to Etienne Perot, Caffe Users
Max pooling layers output both the max + argmax if defined with 2 tops: the 1st is the max pooling output while the 2nd is the argmax mask: https://github.com/BVLC/caffe/blob/master/src/caffe/layers/pooling_layer.cu#L164-L180

The layer documentation is due for an update to make features like this more clear...

Evan Shelhamer


Etienne Perot

Sep 13, 2015, 6:31:48 AM
to Caffe Users, et.p...@gmail.com
Ah, OK! I had no idea max pooling could be used across channels. Would that work with a 1x1 kernel size? The goal is to deduce the classes from the prob layer...

(https://github.com/BVLC/caffe/issues/1299 suggests that I could use Eltwise with max, but I'm not sure it can output the indices, although they are surely computed at least for backprop.)

Evan Shelhamer

Sep 13, 2015, 1:47:35 PM
to Etienne Perot, Caffe Users
Oh, sorry -- I misunderstood the question. The 2nd top of max pooling holds the spatial argmax, not the channel argmax as you'd need for the class output. I'll read more closely next time! The ArgMax layer could be adapted to take the argmax over different spans of axes.
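
For reference, here is a minimal sketch of the channel argmax discussed in this thread, written against a raw buffer rather than a Caffe blob. `ChannelArgmax` is a hypothetical helper, not a Caffe API. It assumes a single image stored densely in channel-major (CHW) order and at most 256 classes. Unlike the per-pixel loop in the first post, it walks one channel plane at a time, so memory is read contiguously; whether that is actually faster has not been measured here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Caffe): per-pixel argmax over the channel
// axis of a dense CHW float buffer. Ties resolve to the lowest channel index.
std::vector<std::uint8_t> ChannelArgmax(const float* data, std::size_t channels,
                                        std::size_t height, std::size_t width) {
    assert(channels <= 256);  // labels are stored as 8-bit values
    const std::size_t plane = height * width;
    std::vector<std::uint8_t> labels(plane, 0);
    if (channels == 0 || plane == 0) return labels;

    // Running maximum per pixel, initialised from channel 0.
    std::vector<float> best(data, data + plane);

    // Visit each remaining channel plane once, reading it contiguously.
    for (std::size_t c = 1; c < channels; ++c) {
        const float* src = data + c * plane;
        for (std::size_t i = 0; i < plane; ++i) {
            if (src[i] > best[i]) {
                best[i] = src[i];
                labels[i] = static_cast<std::uint8_t>(c);
            }
        }
    }
    return labels;
}
```

With a blob holding one image, this could presumably be called as `ChannelArgmax(blob->cpu_data(), blob->channels(), blob->height(), blob->width())`, and the result wrapped in a `cv::Mat` header of type `CV_8U` if an OpenCV image is needed.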