I have a network with three parallel branches, and I want to share all their parameters so that they are identical at the end of training:
require 'nn'

ptb = nn.ParallelTable()
ptb:add(some_model)
-- branches 2 and 3 share parameters and gradients with branch 1
ptb:add(some_model:clone('weight', 'bias', 'gradWeight', 'gradBias'))
ptb:add(some_model:clone('weight', 'bias', 'gradWeight', 'gradBias'))

triplet = nn.Sequential()
triplet:add(ptb)
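As a sanity check right after building this, I can confirm that the branches really share their parameters. A quick sketch (it assumes the first tensor returned by parameters() is a weight tensor):

-- shared tensors must point at the same underlying storage
local p1 = ptb:get(1):parameters()  -- parameter tensors of branch 1
local p2 = ptb:get(2):parameters()  -- parameter tensors of branch 2
print(torch.pointer(p1[1]:storage()) == torch.pointer(p2[1]:storage()))  -- prints true here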
I don't think the loss function is relevant, but just in case: I use nn.DistanceRatioCriterion. As for the variable some_model, it is just a standard module made of cudnn.SpatialConvolution, nn.PReLU, and nn.SpatialBatchNormalization layers. There is also an nn.SpatialDropout, but its probability is set to 0, so it has no effect.
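For reference, some_model is along these lines; this is a simplified sketch, and the layer sizes below are placeholders, not my real configuration:

require 'nn'
require 'cudnn'

some_model = nn.Sequential()
some_model:add(cudnn.SpatialConvolution(3, 64, 3, 3, 1, 1, 1, 1))
some_model:add(nn.SpatialBatchNormalization(64))
some_model:add(nn.PReLU())
some_model:add(nn.SpatialDropout(0))  -- p = 0, so it is a no-op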
Note that when I call

W, dval_dw = triplet:getParameters()

the tensors W and dval_dw have the correct size (the same size as the parameters of a single some_model), which hints that, up to this point, the sharing is working.
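Concretely, the size check I do is along these lines (a sketch):

-- W should match the parameter count of a single branch
local n = 0
for _, p in ipairs(ptb:get(1):parameters()) do
   n = n + p:nElement()
end
print(W:nElement(), n)  -- these two numbers agree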
Now, if I pass a batch {example, example, example} where all three inputs are the same, I expect all three branches to output the same thing:

res = triplet:forward({example, example, example})
-- res[1], res[2] and res[3] should be the same
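To make the comparison precise, I check the maximum absolute difference between the outputs (a quick sketch):

print((res[1] - res[2]):abs():max())  -- 0 before training
print((res[2] - res[3]):abs():max())  -- 0 before training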
This holds before I start training. But once I go through a parameter update, res[1], res[2], and res[3] take different values for the same inputs. So it seems that updating the parameters somehow cancels the sharing. What is happening? Thanks a lot for your time.
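For completeness, the parameter update I am referring to is essentially a plain SGD step on the flattened parameters from getParameters(). A minimal sketch: the real loss wiring through nn.DistanceRatioCriterion is abbreviated here, and the learning rate and the stand-in gradOutput are placeholders, not my actual code:

require 'optim'

local batch = {example, example, example}
local function feval(x)
   if x ~= W then W:copy(x) end
   dval_dw:zero()
   local out = triplet:forward(batch)
   -- the real code computes the criterion's loss and gradients here;
   -- gradOut is just a stand-in of the right shape
   local gradOut = {out[1]:clone(), out[2]:clone(), out[3]:clone()}
   triplet:backward(batch, gradOut)
   return 0, dval_dw
end
optim.sgd(feval, W, {learningRate = 0.01})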