cuda-convnet2 local convolution kernel weights


Chen-Ping Yu

Dec 1, 2016, 3:12:08 PM
to torch7
Hi,

I'm working with local convolutions, and I found that cuda-convnet2 has a very fast local convolution, so that's what I'm using right now. However, the kernel weights seem to be flattened into one big matrix instead of having separate dimensions as in nn, cunn, or cudnn. For example:

require 'nn'
require 'cunn'
require 'cutorch'
require 'ccn2'

net = nn.Sequential()
net:add(nn.Transpose({1,4}, {1,3}, {1,2}))  -- need to transpose the input, according to torch's ccn2 webpage
net:add(ccn2.SpatialConvolutionLocal(3,32,10,5))


Then, when I look at the kernel weights of the local convolution layer, I get a 2D matrix:

net.modules[2].weight:size()
 2700
     32
[torch.LongStorage of size 2]

but if you create the same network using nn.SpatialConvolutionLocal, the weights have the following format:
  6
  6
 32
  3
  5
  5
[torch.LongStorage of size 6]

which is much clearer and lets me know which weight and which dimension is which. But cuda-convnet2 flattens (6x6x3x5x5) into a single dimension of 2700. Does anyone know how I can reshape this into the correct dimensions, i.e. height x width x #-kernels x #-channels x kh x kw, like in the nn version of the local convolution? I know reshape will probably work, but in what order should I reshape the 2700 values? Thanks!
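
For reference, a minimal sketch of how that nn counterpart could be built, assuming the ccn2 layer above sees a 10x10 RGB input (inputSize = 10 and a 5x5 kernel give the 6x6 output grid in the sizes printed above):

net2 = nn.Sequential()
net2:add(nn.SpatialConvolutionLocal(3, 32, 10, 10, 5, 5))
print(net2.modules[1].weight:size())  -- 6 x 6 x 32 x 3 x 5 x 5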

Yossi Biton

Dec 2, 2016, 2:55:32 AM
to torch7
I was facing this issue too, a long time ago.
This is the great solution I got from Soumith:

For a local convolution layer in ccn2, the weight is a 2D tensor of shape:
outputSize*nInputPlane*filterSize in the 1st dimension,
nOutputPlane in the 2nd dimension.
I pulled this information directly from the code.
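
A quick sanity check of that layout against the sizes in the first post (oH = oW = 6, nInputPlane = 3, kH = kW = 5, nOutputPlane = 32 are read off the printed weight sizes):

oH, oW, nInputPlane, kH, kW, nOutputPlane = 6, 6, 3, 5, 5, 32
print(oH * oW * nInputPlane * kH * kW)  -- 2700, the 1st dimension
print(nOutputPlane)                     -- 32, the 2nd dimension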

Now, for you to get each kernel separately, you have to do this:

weight = weight:transpose(1,2):clone()  -- nOutputPlane x (outputSize*nInputPlane*filterSize)
weight = weight:view(nOutputPlane, outputSize, nInputPlane, filterSize)  -- here outputSize = oH*oW, filterSize = kH*kW
weight = weight:transpose(1,2):clone()  -- outputSize x nOutputPlane x nInputPlane x filterSize
weight = weight:view(outputHeight, outputWidth, nOutputPlane*nInputPlane, filterHeight, filterWidth)

Now, to get the set of filters at location (3,4), for example, you do:
filters = weight[3][4]
This will be a tensor of the form {nFilters, height, width} and can be directly seen as a series of images. You can visualize this with image.display or gfx.image.
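
For example, a rough sketch of displaying those filters as a grid of grayscale patches, assuming the weight tensor built by the snippet above (6 x 6 x 96 x 5 x 5 for the layer in the first post, where 96 = nOutputPlane*nInputPlane):

require 'image'
filters = weight[3][4]:float()  -- 96 x 5 x 5 (nOutputPlane*nInputPlane patches)
image.display{image = filters:view(96, 1, 5, 5), padding = 1, zoom = 8}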

Chen-Ping Yu

Dec 2, 2016, 9:46:49 AM
to torch7
Hi Yossi,

Thanks for the reply!

It seems that your solution does give weights in terms of {nFilters, height, width}, but nFilters is actually a combination of nOutputPlane*nInputPlane.

What should I do if I want to separate nFilters into two dimensions, nOutputPlane and nInputPlane, so I can get something like a 1x3xhxw tensor for image.display to visualize (assuming this is the first layer, so nInputPlane is the 3 RGB color channels)? Thanks!

Chen-Ping Yu

Dec 2, 2016, 8:43:15 PM
to torch7
Hi,

Just got a great solution from @fmassa:

weight6d = weight2d:view(oH,oW,nInputPlane,kH,kW,nOutputPlane):permute(1,2,6,3,4,5)

This is a simple one-liner that takes care of the appropriate decomposition with the correct ordering. Thanks, Yossi, for your suggestion though.
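
Putting it together for the net from the first post, a rough end-to-end sketch, assuming the sizes read off the 2700x32 and 6x6x32x3x5x5 weights printed earlier (oH = oW = 6, nInputPlane = 3, kH = kW = 5, nOutputPlane = 32):

require 'image'

oH, oW, nInputPlane, kH, kW, nOutputPlane = 6, 6, 3, 5, 5, 32
weight2d = net.modules[2].weight  -- 2700 x 32
weight6d = weight2d:view(oH, oW, nInputPlane, kH, kW, nOutputPlane):permute(1, 2, 6, 3, 4, 5)
-- weight6d is now oH x oW x nOutputPlane x nInputPlane x kH x kW (6 x 6 x 32 x 3 x 5 x 5),
-- the same layout as nn.SpatialConvolutionLocal

-- the first kernel at output location (3,4) is a 3 x 5 x 5 tensor,
-- which image.display can show as a single RGB patch:
image.display{image = weight6d[3][4][1]:float(), zoom = 8}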