We all know that Faster-RCNN consists of two modules: the RPN module, which produces proposals (candidate object locations, i.e. a set of rectangles), and the Fast-RCNN module, which classifies each proposal into one class (e.g. airplane) and refines the proposal into a more precise location.
My problem is this: once I have all the proposals for the image in question from the output of the RPN module, how can I use these proposals simultaneously for another task, such as image captioning? More precisely, how can I implement a new Caffe layer that exploits all these proposals at once? The Fast-RCNN module of Faster-RCNN just classifies each proposal into one class and refines it into a more precise location, one proposal at a time. And this is what Caffe does in other CNN architectures too, e.g. DeepID2, AlexNet, VGG_16. But I want to exploit these proposals simultaneously!
My motivation is that some vision tasks depend on many parts of an image at once (e.g. all objects in an image, or all facial parts of a face image). So once I have all these proposals, how can I implement a new Caffe layer that processes them simultaneously?
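To make the question concrete, here is a minimal sketch of the kind of layer I have in mind. The class name, the max-pooling aggregation, and the blob shapes are all my own assumptions, not part of Faster-RCNN; in real Caffe this would subclass `caffe.Layer` and be declared as a `Python` layer in the prototxt, but plain NumPy arrays stand in for blobs here so the idea is self-contained:

```python
import numpy as np

class ProposalAggregateLayer:
    """Hypothetical layer that consumes ALL proposal features at once.

    bottom[0] holds one feature vector per proposal (num_proposals x channels),
    e.g. the output of RoI pooling plus a fully connected layer. Instead of
    handling proposals one by one, the layer max-pools element-wise across
    all proposals, producing a single image-level vector (1 x channels) that
    a downstream task (e.g. a captioning head) could consume.
    """

    def setup(self, bottom, top):
        pass  # nothing to configure in this sketch

    def reshape(self, bottom, top):
        # output: one aggregated feature vector per image
        top[0] = np.zeros((1, bottom[0].shape[1]), dtype=bottom[0].dtype)

    def forward(self, bottom, top):
        # element-wise max over all proposals at once
        top[0][...] = bottom[0].max(axis=0, keepdims=True)

    def backward(self, top_diff, bottom, bottom_diff):
        # route each channel's gradient to the proposal that won the max
        winners = bottom[0].argmax(axis=0)  # shape: (channels,)
        bottom_diff[0][...] = 0
        cols = np.arange(bottom[0].shape[1])
        bottom_diff[0][winners, cols] = top_diff[0][0]
```

Usage with three proposals and two feature channels:

```python
layer = ProposalAggregateLayer()
feats = np.array([[1., 5.], [3., 2.], [0., 4.]], dtype=np.float32)
bottom, top = [feats], [None]
layer.reshape(bottom, top)
layer.forward(bottom, top)
print(top[0])  # [[3. 5.]] — one vector summarizing all proposals
```

Max-pooling is just one choice of aggregation; averaging, or feeding the full `num_proposals x channels` matrix to a recurrent layer, would equally treat the proposals jointly rather than one by one.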
The figure below shows how Caffe processes proposals one by one:
![](https://lh3.googleusercontent.com/-NCO80bjVyRA/Vz8Cw_3bonI/AAAAAAAAAAM/wSKV8HkaiAstv0UgLPjH0CUlr7g9Vdf2QCLcB/s320/5D%2528YZ%255B%255D9%2540%255B0%2524WV1%2560YV%2560ZUD6.png)