Unable to extract weights from model

Venu Sharma

Dec 4, 2016, 3:42:17 AM
to torch7
Hello,
I am trying to extract the weights from the bvlc_alexnet model, specifically the weights of the second convolution layer (conv2).

nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> output]
  (1): cudnn.SpatialConvolution(3 -> 96, 11x11, 4,4)
  (2): cudnn.ReLU
  (3): cudnn.SpatialCrossMapLRN
  (4): cudnn.SpatialMaxPooling(3x3, 2,2)
  (5): cudnn.SpatialConvolution(96 -> 256, 5x5, 1,1, 2,2)
  (6): cudnn.ReLU
  (7): cudnn.SpatialCrossMapLRN
  (8): cudnn.SpatialMaxPooling(3x3, 2,2)
  (9): cudnn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)
  (10): cudnn.ReLU
  (11): cudnn.SpatialConvolution(384 -> 384, 3x3, 1,1, 1,1)
  (12): cudnn.ReLU
  (13): cudnn.SpatialConvolution(384 -> 256, 3x3, 1,1, 1,1)
  (14): cudnn.ReLU
  (15): cudnn.SpatialMaxPooling(3x3, 2,2)
  (16): nn.View(-1)
  (17): nn.Linear(9216 -> 4096)
  (18): cudnn.ReLU
  (19): nn.Dropout(0.500000)
  (20): nn.Linear(4096 -> 4096)
  (21): cudnn.ReLU
  (22): nn.Dropout(0.500000)
  (23): nn.Linear(4096 -> 1000)
  (24): cudnn.SoftMax
}

This model has groups set to 2 on conv2, as this excerpt from the module dump shows:

   .
   .
   .
  4 : 
    {
      dH : 2
      dW : 2
      kW : 3
      gradInput : CudaTensor - empty
      iSize : LongStorage - size: 4
      kH : 3
      _type : "torch.CudaTensor"
      padW : 0
      ceil_mode : true
      output : CudaTensor - empty
      padH : 0
      name : "pool1"
    }
  5 : 
    {
      padW : 2
      nInputPlane : 96
      output : CudaTensor - empty
      gradInput : CudaTensor - empty
      _type : "torch.CudaTensor"
      groups : 2
      gradBias : CudaTensor - size: 256
      dW : 1
      nOutputPlane : 256
      padH : 2
      kH : 5
      weight : CudaTensor - size: 256x48x5x5
      gradWeight : CudaTensor - size: 256x48x5x5
      bias : CudaTensor - size: 256
      dH : 1
      kW : 5
      name : "conv2"
    }
   .
   .
   .

My question: shouldn't the weight of conv2 be 256 x 96 x 5 x 5? It is 256 x 48 x 5 x 5 instead. Also, since the input has 96 channels, does this mean the weights are applied to only half of the input? What happens to the other half?
Won't this cause confusion when re-engineering the weights, as to where the rest of the weights went?
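
For reference, the shape arithmetic behind that question can be sketched as follows. This is a plain Python illustration, not Torch code, and it assumes the usual Caffe/cuDNN grouping convention in which each group takes a contiguous slice of input channels and produces a contiguous slice of output channels. The helper names `grouped_conv_weight_shape` and `group_channel_ranges` are hypothetical and exist only for this example.

```python
def grouped_conv_weight_shape(n_input, n_output, k_h, k_w, groups):
    """Shape of the weight tensor of a grouped convolution."""
    if n_input % groups or n_output % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output filter only sees the input channels of its own group.
    return (n_output, n_input // groups, k_h, k_w)


def group_channel_ranges(n_input, n_output, groups):
    """1-based inclusive (input range, output range) handled by each group.

    Assumes contiguous channel slices per group (Caffe/cuDNN convention).
    """
    in_per, out_per = n_input // groups, n_output // groups
    return [((g * in_per + 1, (g + 1) * in_per),
             (g * out_per + 1, (g + 1) * out_per))
            for g in range(groups)]


# Values taken from the conv2 entry in the module dump.
conv2_shape = grouped_conv_weight_shape(96, 256, 5, 5, groups=2)
conv2_ranges = group_channel_ranges(96, 256, groups=2)
print(conv2_shape)   # (256, 48, 5, 5)
print(conv2_ranges)  # [((1, 48), (1, 128)), ((49, 96), (129, 256))]
```

Under that assumption no input is dropped: output channels 1 to 128 are computed from input channels 1 to 48, and output channels 129 to 256 from input channels 49 to 96. That is why each filter carries only 48 input planes.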

I see that groups is implemented only in the cudnn package (the Torch wrapper), whereas neither the cunn nor the nn package supports it. If I want to convert this model to run on CPU only, how am I supposed to go about it?
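
One possible approach, sketched here under stated assumptions, is to zero-pad the grouped weight into an equivalent dense weight of size 256 x 96 x 5 x 5 that an ungrouped convolution could use. The sketch below is plain Python on nested lists, not Torch code. `expand_grouped_weight` is a hypothetical helper, and it again assumes contiguous channel slices per group. It only shows the index arithmetic on a toy weight with 1x1 kernels.

```python
def expand_grouped_weight(weight, groups):
    """Zero-pad a grouped conv weight [n_out][n_in/groups][kH][kW]
    into a dense weight [n_out][n_in][kH][kW]."""
    n_out = len(weight)
    in_per = len(weight[0])                  # input planes per group
    k_h, k_w = len(weight[0][0]), len(weight[0][0][0])
    out_per = n_out // groups                # output planes per group
    dense = []
    for o in range(n_out):
        g = o // out_per                     # group that owns output o
        row = []
        for i in range(in_per * groups):
            if g * in_per <= i < (g + 1) * in_per:
                # Copy the kernel connecting input i to output o.
                row.append([list(r) for r in weight[o][i - g * in_per]])
            else:
                # No connection across groups: all-zero kernel.
                row.append([[0.0] * k_w for _ in range(k_h)])
        dense.append(row)
    return dense


# Toy example: 4 outputs, 4 inputs, groups = 2, 1x1 kernels.
toy = [[[[1.0]], [[2.0]]],
       [[[3.0]], [[4.0]]],
       [[[5.0]], [[6.0]]],
       [[[7.0]], [[8.0]]]]
dense = expand_grouped_weight(toy, groups=2)
print(dense[0])  # [[[1.0]], [[2.0]], [[0.0]], [[0.0]]]
print(dense[3])  # [[[0.0]], [[0.0]], [[7.0]], [[8.0]]]
```

The dense weight gives the same outputs as the grouped one, but half of its entries are zeros, so it costs roughly twice the memory and computation for that layer. The bias needs no change.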

I'm a novice to Torch, so any suggestions would be of immense help.

Regards
Venu