Can't detect objects after fusing lower-layer features with higher-layer features in Faster R-CNN

helxsz

Jul 3, 2017, 5:02:19 AM

to Caffe Users

I am currently working on Faster R-CNN, using skip connections to fuse conv3-3, conv4-3 and conv5-3 together at the RoI layer (I did not change the RPN layers). The steps are shown below:

1. Extract the feature maps of the face region at multiple scales (conv3-3, conv4-3, conv5-3).

2. Apply RoI pooling to each feature map (i.e. convert it to a fixed height and width).

3. L2-normalize each feature map, then concatenate the RoI-pooled, normalized feature maps of the face (at multiple scales) with each other, creating one tensor.

4. Apply a 1x1 convolution to the face tensor, then apply two fully connected layers to the result, creating a vector.
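Steps 3 and 4 can be sketched in plain Python to make the shapes and value ranges concrete. This is only an illustration under assumptions: it assumes the `L2Norm` layer normalizes across channels at each pooled position (the post does not say which axis it uses), and the short vectors `conv3`, `conv4`, `conv5` are made-up stand-ins for the real 256-, 512- and 512-channel VGG16 features.

```python
import math

def l2_normalize(vec, eps=1e-10):
    """Scale a channel vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in vec)) + eps
    return [v / norm for v in vec]

# Hypothetical channel vectors at ONE pooled position (toy widths, toy values).
conv3 = [3.0, 4.0]
conv4 = [0.3, 0.4, 0.0]
conv5 = [30.0, 0.0, 40.0]

# Step 3: normalize each source separately, then concatenate along channels.
normalized = [l2_normalize(v) for v in (conv3, conv4, conv5)]
fused = [x for vec in normalized for x in vec]

# With the real VGG16 widths, the concatenated tensor has this many channels.
vgg16_channels = {"conv3_3": 256, "conv4_3": 512, "conv5_3": 512}
fused_channels = sum(vgg16_channels.values())

# If energy is spread evenly over C channels, each normalized entry is about
# 1/sqrt(C) in magnitude, e.g. for the 256 channels of conv3_3:
typical_entry = 1.0 / math.sqrt(vgg16_channels["conv3_3"])

print(fused)
print(fused_channels, typical_entry)
```

After normalization the three sources have the same norm regardless of how large their raw activations were, which is the point of step 3. It also means every entry of the fused tensor is small, so it may be worth checking whether the layers that follow can cope with inputs of that magnitude.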

I used Caffe and wrote a prototxt based on the Faster R-CNN VGG16 model. The following parts were added to the original prototxt:

# roi pooling the conv3-3 layer and L2 normalize it 

layer {
  name: "roi_pool3"
  type: "ROIPooling"
  bottom: "conv3_3"
  bottom: "rois"
  top: "pool3_roi"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.25 # 1/4
  }
}

layer {
  name: "roi_pool3_l2norm"
  type: "L2Norm"
  bottom: "pool3_roi"
  top: "pool3_roi"
}

-------------

# roi pooling the conv4-3 layer and L2 normalize it 


layer {
  name: "roi_pool4"
  type: "ROIPooling"
  bottom: "conv4_3"
  bottom: "rois"
  top: "pool4_roi"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.125 # 1/8
  }
}

layer {
  name: "roi_pool4_l2norm"
  type: "L2Norm"
  bottom: "pool4_roi"
  top: "pool4_roi"
}

 --------------------------

# roi pooling the conv5-3 layer and L2 normalize it 

layer {
  name: "roi_pool5"
  type: "ROIPooling"
  bottom: "conv5_3"
  bottom: "rois"
  top: "pool5"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625 # 1/16
  }
}


layer {
  name: "roi_pool5_l2norm"
  type: "L2Norm"
  bottom: "pool5"
  top: "pool5"
}


# concat roi_pool3, roi_pool4, roi_pool5 and apply 1*1 conv


layer {
  name: "roi_concat"
  type: "Concat"
  concat_param {
    axis: 1
  }
  bottom: "pool5"
  bottom: "pool4_roi"
  bottom: "pool3_roi"
  top: "roi_concat"
}

layer {
  name: "roi_concat_1*1_conv"
  type: "Convolution"
  top: "roi_concat_1*1_conv"
  bottom: "roi_concat"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 128
    # note: with kernel_size: 1, pad: 1 grows the 7x7 map to 9x9 (pad: 0 would keep 7x7)
    pad: 1
    kernel_size: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}

layer {
  name: "fc61"
  type: "InnerProduct"
  bottom: "roi_concat_1*1_conv"
  top: "fc61"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 4096
  }
}
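As a sanity check on the listing above, the shape arithmetic can be worked through in a few lines of Python. This is a sketch, not part of the original model files: the helper `conv_output_size` and the variable names are hypothetical, and it assumes the standard VGG16 layout in which each max-pool halves the resolution and conv3_3, conv4_3 and conv5_3 have 256, 512 and 512 channels.

```python
def conv_output_size(size, kernel, pad=0, stride=1):
    """Spatial output size of a convolution without dilation."""
    return (size + 2 * pad - kernel) // stride + 1

# Number of 2x2/stride-2 max-pools applied before each layer in VGG16.
pools_before = {"conv3_3": 2, "conv4_3": 3, "conv5_3": 4}
spatial_scale = {name: 1.0 / 2 ** n for name, n in pools_before.items()}

pooled = 7                      # pooled_w and pooled_h in the RoI pooling layers
concat_channels = 256 + 512 + 512

# The 1x1 convolution as posted (pad: 1) versus the same layer with pad: 0.
out_as_posted = conv_output_size(pooled, kernel=1, pad=1)
out_no_pad = conv_output_size(pooled, kernel=1, pad=0)

# Inputs seen by fc61 (num_output of the 1x1 conv is 128).
fc_in_as_posted = 128 * out_as_posted ** 2
fc_in_no_pad = 128 * out_no_pad ** 2

print(spatial_scale)
print(concat_channels, out_as_posted, out_no_pad)
print(fc_in_as_posted, fc_in_no_pad)
```

The computed scales agree with the `spatial_scale` values in the three RoI pooling layers. The padding on the 1x1 convolution adds a border of zero-input positions around the 7x7 map, so `fc61` receives a 9x9 input rather than 7x7; whether that is intended is not clear from the post.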

However, once the model is trained and tested, it cannot detect any objects at all. I eventually found that the detection windows generated by the RPN are scattered all over the image with low confidence, as shown below.

I only modified the fast_rcnn_train stages, not the rpn_train stages. Since the two share the VGG16 layers, I suspect that the modified fast_rcnn_train stage has left the shared layers in a state where the RPN is not trained well enough to predict the right bounding boxes.

Can somebody spot the problem and suggest how to fix it? The attachments contain my modified prototxt models; I only changed stage1_fast_rcnn_train and stage2_fast_rcnn_train.

faster_rcnn_test.pt

stage1_fast_rcnn_train.pt

stage1_rpn_train.pt

stage2_fast_rcnn_train.pt

stage2_rpn_train.pt

samshi

Aug 15, 2017, 10:07:23 AM

to Caffe Users

Hello, have you found a solution, or the cause of the problem?

I'm also trying YOLOv2 for segmentation, fusing conv2-2, conv3-3, conv4-3 and conv5-3 together. The results from the net showed no improvement; they were even worse.

Mehmet Kerim Yucel

Jan 31, 2018, 9:40:20 AM

to Caffe Users

I would like to find out if you've solved this problem, as I am in a similar situation right now.
