I am trying to extract weights from the bvlc_alexnet model, specifically the weights of the second convolution layer (conv2). Printing the model gives:
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> output]
(1): cudnn.SpatialConvolution(3 -> 96, 11x11, 4,4)
(2): cudnn.ReLU
(3): cudnn.SpatialCrossMapLRN
(4): cudnn.SpatialMaxPooling(3x3, 2,2)
(5): cudnn.SpatialConvolution(96 -> 256, 5x5, 1,1, 2,2)
(6): cudnn.ReLU
(7): cudnn.SpatialCrossMapLRN
(8): cudnn.SpatialMaxPooling(3x3, 2,2)
(9): cudnn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)
(10): cudnn.ReLU
(11): cudnn.SpatialConvolution(384 -> 384, 3x3, 1,1, 1,1)
(12): cudnn.ReLU
(13): cudnn.SpatialConvolution(384 -> 256, 3x3, 1,1, 1,1)
(14): cudnn.ReLU
(15): cudnn.SpatialMaxPooling(3x3, 2,2)
(16): nn.View(-1)
(17): nn.Linear(9216 -> 4096)
(18): cudnn.ReLU
(19): nn.Dropout(0.500000)
(20): nn.Linear(4096 -> 4096)
(21): cudnn.ReLU
(22): nn.Dropout(0.500000)
(23): nn.Linear(4096 -> 1000)
(24): cudnn.SoftMax
}
This model has grouped convolution enabled (groups = 2). Inspecting the individual modules shows:
.
.
.
4 :
{
dH : 2
dW : 2
kW : 3
gradInput : CudaTensor - empty
iSize : LongStorage - size: 4
kH : 3
_type : "torch.CudaTensor"
padW : 0
ceil_mode : true
output : CudaTensor - empty
padH : 0
name : "pool1"
}
5 :
{
padW : 2
nInputPlane : 96
output : CudaTensor - empty
gradInput : CudaTensor - empty
_type : "torch.CudaTensor"
groups : 2
gradBias : CudaTensor - size: 256
dW : 1
nOutputPlane : 256
padH : 2
kH : 5
weight : CudaTensor - size: 256x48x5x5
gradWeight : CudaTensor - size: 256x48x5x5
bias : CudaTensor - size: 256
dH : 1
kW : 5
name : "conv2"
}
.
.
.
My question: shouldn't the weight of conv2 be 256 x 96 x 5 x 5? It is 256 x 48 x 5 x 5 instead. Also, since the input has 96 channels, does this mean the weights are applied to only half of the input? What happens to the other half?
Won't this cause confusion when re-engineering the weights, since it is unclear where the rest of the weights went?
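For reference, here is a minimal sketch of the shape arithmetic as I understand it, assuming the layer follows the usual grouped-convolution convention in which the input channels and the output filters are each split into `groups` equal, contiguous parts. The function name `grouped_conv_layout` and its return layout are my own invention for illustration and are not part of torch, nn, or cudnn. Channel ranges are 0-based and half-open.

```python
def grouped_conv_layout(n_input, n_output, k_h, k_w, groups):
    """Return the weight shape and the per-group channel ranges
    for a grouped convolution (assumed convention, see text above)."""
    if n_input % groups or n_output % groups:
        raise ValueError("channel counts must be divisible by groups")

    in_per_group = n_input // groups    # input channels seen by each filter
    out_per_group = n_output // groups  # filters belonging to each group

    # Each filter only spans the input channels of its own group,
    # so the second dimension is n_input / groups, not n_input.
    weight_shape = (n_output, in_per_group, k_h, k_w)

    assignment = []
    for g in range(groups):
        out_range = (g * out_per_group, (g + 1) * out_per_group)
        in_range = (g * in_per_group, (g + 1) * in_per_group)
        assignment.append((out_range, in_range))
    return weight_shape, assignment


# Values taken from the conv2 dump: 96 -> 256, 5x5 kernel, groups = 2.
shape, assignment = grouped_conv_layout(96, 256, 5, 5, groups=2)
print(shape)
for out_range, in_range in assignment:
    print("filters", out_range, "read input channels", in_range)
```

If that assumption holds for this layer, all 96 input channels are still used: the first 128 filters would read channels 0 to 47 and the last 128 filters would read channels 48 to 95, which would account for the 256 x 48 x 5 x 5 weight size.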
I see that groups is implemented only in the cudnn package (the Torch wrapper), whereas the cunn and nn packages do not have it. If I want to convert this same model to run only on the CPU, how am I supposed to go about it?
I'm a novice with Torch, so any suggestions would be of immense help.