Unable to extract weights from model

Venu Sharma

Dec 4, 2016, 3:42:17 AM

to torch7

Hello,

I am trying to extract the weights from the bvlc_alexnet model, specifically the weights of the second convolution layer.

nn.Sequential {

[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> output]

(1): cudnn.SpatialConvolution(3 -> 96, 11x11, 4,4)

(2): cudnn.ReLU

(3): cudnn.SpatialCrossMapLRN

(4): cudnn.SpatialMaxPooling(3x3, 2,2)

(5): cudnn.SpatialConvolution(96 -> 256, 5x5, 1,1, 2,2)

(6): cudnn.ReLU

(7): cudnn.SpatialCrossMapLRN

(8): cudnn.SpatialMaxPooling(3x3, 2,2)

(9): cudnn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)

(10): cudnn.ReLU

(11): cudnn.SpatialConvolution(384 -> 384, 3x3, 1,1, 1,1)

(12): cudnn.ReLU

(13): cudnn.SpatialConvolution(384 -> 256, 3x3, 1,1, 1,1)

(14): cudnn.ReLU

(15): cudnn.SpatialMaxPooling(3x3, 2,2)

(16): nn.View(-1)

(17): nn.Linear(9216 -> 4096)

(18): cudnn.ReLU

(19): nn.Dropout(0.500000)

(20): nn.Linear(4096 -> 4096)

(21): cudnn.ReLU

(22): nn.Dropout(0.500000)

(23): nn.Linear(4096 -> 1000)

(24): cudnn.SoftMax

}

This model has convolution groups enabled (groups = 2). Layers 4 and 5 look like this:

4 :

{

dH : 2

dW : 2

kW : 3

gradInput : CudaTensor - empty

iSize : LongStorage - size: 4

kH : 3

_type : "torch.CudaTensor"

padW : 0

ceil_mode : true

output : CudaTensor - empty

padH : 0

name : "pool1"

}

5 :

{

padW : 2

nInputPlane : 96

output : CudaTensor - empty

gradInput : CudaTensor - empty

_type : "torch.CudaTensor"

groups : 2

gradBias : CudaTensor - size: 256

dW : 1

nOutputPlane : 256

padH : 2

kH : 5

weight : CudaTensor - size: 256x48x5x5

gradWeight : CudaTensor - size: 256x48x5x5

bias : CudaTensor - size: 256

dH : 1

kW : 5

name : "conv2"

}

My question: shouldn't the weight of conv2 be 256 x 96 x 5 x 5? It is 256 x 48 x 5 x 5 instead. My other doubt: since the input has 96 channels, does this mean the weights are applied to only half of the input? What happens to the other half?

Won't this cause confusion when re-engineering the weights, since it is unclear where the rest of the weights went?
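
The shape arithmetic behind this question can be sketched in a few lines. The sketch is plain Python rather than Torch, and it assumes Caffe-style grouping: input and output channels are each split into `groups` contiguous, equal blocks, and each output block sees only its own input block. The function names are hypothetical and not part of cudnn or nn. Under that assumption no weights are missing: each of the 256 filters spans 48 input channels, and the two groups together cover all 96.

```python
def grouped_conv_weight_shape(n_input, n_output, k_h, k_w, groups):
    """Weight shape when channels are split evenly into `groups` blocks."""
    if n_input % groups or n_output % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each filter only spans the input channels of its own group.
    return (n_output, n_input // groups, k_h, k_w)


def input_channels_for_output(out_channel, n_input, n_output, groups):
    """0-based half-open range (start, stop) of input channels one filter sees.

    Assumes contiguous grouping: group g owns output channels
    [g * n_output / groups, (g + 1) * n_output / groups).
    """
    in_per_group = n_input // groups
    out_per_group = n_output // groups
    g = out_channel // out_per_group
    return (g * in_per_group, (g + 1) * in_per_group)


# conv2 from the dump: 96 -> 256, 5x5 kernel, groups = 2
print(grouped_conv_weight_shape(96, 256, 5, 5, 2))   # (256, 48, 5, 5)
print(input_channels_for_output(0, 96, 256, 2))      # (0, 48)
print(input_channels_for_output(255, 96, 256, 2))    # (48, 96)
```

In other words, filters 0 to 127 would read input channels 0 to 47 and filters 128 to 255 would read channels 48 to 95 (0-based), so both halves of the input are used.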

As far as I can see, groups is implemented only in the cudnn package (the Torch wrapper); neither cunn nor nn supports it. If I want to convert this model to run on the CPU only, how am I supposed to go about it?

I'm a novice to Torch, so any suggestions would be of immense help.
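
One possible route to a CPU-only layer, assuming contiguous Caffe-style grouping, is to expand the grouped weight into an ordinary dense one. The dense weight is zero wherever an output channel and an input channel belong to different groups, so an ungrouped convolution with it should compute the same result. This is only a sketch in plain Python on nested lists, with a hypothetical helper name and a tiny made-up weight. Copying the result into an ungrouped convolution layer is not shown, and the bias would stay unchanged.

```python
def expand_grouped_weight(weight, groups):
    """Expand a grouped weight [n_out][n_in/groups][kH][kW] into a dense
    [n_out][n_in][kH][kW] weight that is zero outside each group's block."""
    n_out = len(weight)
    in_per_group = len(weight[0])
    k_h = len(weight[0][0])
    k_w = len(weight[0][0][0])
    if n_out % groups:
        raise ValueError("output channels must be divisible by groups")
    out_per_group = n_out // groups
    n_in = in_per_group * groups

    # Start from an all-zero dense weight.
    dense = [[[[0.0] * k_w for _ in range(k_h)]
              for _ in range(n_in)]
             for _ in range(n_out)]

    for o in range(n_out):
        # First input channel of the group that output channel o belongs to.
        offset = (o // out_per_group) * in_per_group
        for i in range(in_per_group):
            dense[o][offset + i] = [row[:] for row in weight[o][i]]
    return dense


# Toy example: 2 input channels, 4 output channels, 1x1 kernels, groups = 2,
# so the grouped weight is 4 x 1 x 1 x 1 and the dense one is 4 x 2 x 1 x 1.
grouped = [[[[1.0]]], [[[2.0]]], [[[3.0]]], [[[4.0]]]]
dense = expand_grouped_weight(grouped, 2)
print(dense[0])  # [[[1.0]], [[0.0]]]
print(dense[2])  # [[[0.0]], [[3.0]]]
```

For conv2 the same expansion would turn the 256 x 48 x 5 x 5 weight into 256 x 96 x 5 x 5, with half of the entries zero. The cost is twice the memory and arithmetic of the grouped layer.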

Regards

Venu
