Hi Emmanuel,
Your solution would have been my first approach, too, and it is probably the only practicable one (without adjusting the code of the convolutional layer). As for the weight sharing: I am pretty sure that Caffe actually uses the same memory for shared blobs, so the shared weights do not take up additional space as duplicates.
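For reference, weight sharing in Caffe works by giving the `param` entries of two layers the same name; blobs with the same param name point to the same memory. A minimal sketch (layer and blob names here are made up for illustration):

```protobuf
layer {
  name: "conv1a"
  type: "Convolution"
  bottom: "data_a"
  top: "conv1a"
  param { name: "conv_shared_w" }  # same name => shared weight blob
  param { name: "conv_shared_b" }  # same name => shared bias blob
  convolution_param { num_output: 64 kernel_size: 3 }
}
layer {
  name: "conv1b"
  type: "Convolution"
  bottom: "data_b"
  top: "conv1b"
  param { name: "conv_shared_w" }
  param { name: "conv_shared_b" }
  convolution_param { num_output: 64 kernel_size: 3 }
}
```

Note that the shapes of the shared blobs must match, so both layers need identical `convolution_param` settings for the kernel.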
But if you are willing to get your hands dirty, you could look into the code and make the necessary adjustments to the convolutional layer itself. Maybe it is possible to support your scenario by adding a layer param option to the proto and making the corresponding changes in the code. Since all the bits and pieces are there, just not exactly in the form you want them, it should not be too hard to do. You could look at other places in the code to see how weight sharing and grouping are implemented.
Jan