Inconsistent output by loaded caffemodel


Krishna Kumar singh

Oct 5, 2015, 4:17:28 PM
to torch7
Hi,

I am trying to load the standard CaffeNet model and do a forward pass. If I run the forward pass twice on the same input, I get different outputs, even though the parameters do not change between the two passes and I have removed the dropout layers. I have attached the code and output below.

Code:-

require 'nn'
require 'cutorch'
require 'inn'
require 'cudnn'
require 'loadcaffe'

-- load caffenet model
model = loadcaffe.load(
    '/home/vision3/Downloads/caffe-master/models/bvlc_reference_caffenet/deploy.prototxt',
    '/home/vision3/Downloads/caffe-master/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel',
    'cudnn')

-- remove dropout layers to avoid randomness
model:remove(22)
model:remove(19)
print(model)

-- declare input
inp = torch.Tensor(3,224,224)

-- transfer model and data to GPU
model = model:cuda()
inp = inp:cuda()

-- store current network parameters
p_old = model:getParameters():clone()

-- first forward pass
local o1 = model:forward(inp):clone():float()

-- store parameters after first pass
p_new = model:getParameters():clone()

-- second forward pass
local o2 = model:forward(inp):clone():float()

-- print number of output elements that differ between the two passes
print('Number of non-equal output elements: ' .. torch.sum(torch.ne(o1, o2)))

-- print number of parameter elements that differ between the two passes
print('Number of non-equal parameter elements: ' .. torch.sum(torch.ne(p_old, p_new)))






Output:-

Successfully loaded /home/vision3/Downloads/caffe-master/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
MODULE data UNDEFINED
warning: module 'data [type 5]' not found
conv1: 96 3 11 11
conv2: 256 48 5 5
conv3: 384 256 3 3
conv4: 384 192 3 3
conv5: 256 192 3 3
fc6: 1 1 9216 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> output]
  (1): cudnn.SpatialConvolution(3 -> 96, 11x11, 4,4)
  (2): cudnn.ReLU
  (3): cudnn.SpatialMaxPooling
  (4): inn.SpatialCrossResponseNormalization
  (5): cudnn.SpatialConvolution(96 -> 256, 5x5, 1,1, 2,2)
  (6): cudnn.ReLU
  (7): cudnn.SpatialMaxPooling
  (8): inn.SpatialCrossResponseNormalization
  (9): cudnn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)
  (10): cudnn.ReLU
  (11): cudnn.SpatialConvolution(384 -> 384, 3x3, 1,1, 1,1)
  (12): cudnn.ReLU
  (13): cudnn.SpatialConvolution(384 -> 256, 3x3, 1,1, 1,1)
  (14): cudnn.ReLU
  (15): cudnn.SpatialMaxPooling
  (16): nn.View
  (17): nn.Linear(9216 -> 4096)
  (18): cudnn.ReLU
  (19): nn.Linear(4096 -> 4096)
  (20): cudnn.ReLU
  (21): nn.Linear(4096 -> 1000)
  (22): nn.SoftMax
}
Number of non-equal output elements: 1000
Number of non-equal parameter elements: 0




I also found that the output is consistent up to layer 4 and starts changing from layer 5 (the second convolution layer). Can someone please explain this behavior?
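
For reference, the per-layer comparison can be done roughly like this (a minimal sketch reusing model and inp from the code above; each nn module keeps its last output in its .output field):

-- snapshot every layer's output after the first pass
model:forward(inp)
local outs = {}
for i = 1, #model.modules do
  outs[i] = model:get(i).output:clone()
end

-- run again and report how many elements differ at each layer
model:forward(inp)
for i = 1, #model.modules do
  local diff = torch.ne(outs[i]:float(), model:get(i).output:float()):sum()
  print(('layer %d (%s): %d differing elements'):format(i, torch.type(model:get(i)), diff))
end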

Francisco Vitor Suzano Massa

Oct 5, 2015, 4:56:39 PM
to torch7
Could you try initializing your input data, e.g. with inp = torch.Tensor(3,224,224):uniform()?
Maybe SpatialCrossResponseNormalization is having problems with the uninitialized data?
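
For example (a minimal sketch; any random fill should do):

inp = torch.Tensor(3,224,224):uniform()  -- uniform values in [0,1)
-- or: inp = torch.randn(3,224,224)      -- standard normal values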

Francisco Vitor Suzano Massa

Oct 5, 2015, 5:08:12 PM
to torch7
Maybe, because of the uninitialized data, you are getting NaNs after the normalization layer (inf/inf or something like that)?
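
One quick way to check: NaN compares unequal to itself, so counting self-inequalities counts NaNs (a minimal sketch using the o1 from the original code):

-- number of NaN entries in the first output; 0 means none
print('NaNs in o1: ' .. o1:ne(o1):sum())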

Krishna Kumar singh

Oct 5, 2015, 6:06:43 PM
to torch7
Hi Francisco,

Thanks for your suggestion. I tried initializing the data, but it didn't help; I am still getting different outputs.

Sergey Zagoruyko

Oct 5, 2015, 6:38:14 PM
to torch7 on behalf of Krishna Kumar singh
Hi, you should update the cudnn bindings; there was a bug in the R3 branch in how it handled groups.
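
For anyone hitting the same problem: the Torch cudnn bindings live at https://github.com/soumith/cudnn.torch, and updating is usually just a matter of reinstalling the rock, e.g.:

luarocks install cudnn

The exact procedure depends on your cuDNN release and setup.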


Sadegh Aliakbarian

Oct 5, 2015, 8:01:34 PM
to torch7
Hi,

I think it is because of the dropout in the network, which randomly drops values passed to the next layer. Dropout should only be active during training; if you want to test your model, call model:evaluate() first.
I hope it works for you.

Sadegh Aliakbarian

Oct 5, 2015, 8:06:16 PM
to torch7
I forgot to say: put model:evaluate() right after loading your model (or at least before model:forward()). The rest of the code should remain unchanged.
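
Concretely, relative to the code in the first post (a minimal sketch; prototxt_path and caffemodel_path stand for the paths used there):

model = loadcaffe.load(prototxt_path, caffemodel_path, 'cudnn')
model:evaluate()  -- puts dropout and similar modules into test mode
-- the rest of the script (cuda conversion, forward passes) stays the same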

Krishna Kumar singh

Oct 5, 2015, 11:06:34 PM
to torch7
Hi,

I have already used model:evaluate(), and it still gives different outputs. Also, the output changes from the fifth layer (the second convolution layer), well before the dropout layers. I will try updating the cudnn bindings.

Krishna Kumar singh

Oct 6, 2015, 12:51:56 AM
to torch7
Hi,

Updating the cudnn bindings worked, and now I am getting consistent output. Thanks for the help.