By following this tutorial http://caffe.berkeleyvision.org/gathered/examples/siamese.html
we can build a Siamese network in Caffe that shares the weights of each layer.
But I was wondering how the Siamese network in Caffe updates the shared weights during backpropagation.
To be specific, if we have
input1 -> conv1(shared) -> output1
input2 -> conv1(shared) -> output2 ===> contrastive loss (for output1 and output2),
then does Caffe simply sum the two gradients for conv1 coming from the first and second branches?
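My intuition is that the gradient for a shared weight should be the sum of the gradients contributed by each branch. To check this, I tried a tiny numeric sketch (plain Python, not Caffe; the scalar "layer" and the squared-difference loss are just stand-ins for the shared conv1 and the contrastive loss):

```python
# Sketch: a single shared scalar weight w feeds two inputs; the loss couples
# both outputs. The total gradient on w should equal the sum of the
# per-branch gradients, which is what I assume a shared layer accumulates.

def forward(w, x1, x2):
    out1 = w * x1                # branch 1 through the shared "conv1"
    out2 = w * x2                # branch 2 through the same weight
    loss = (out1 - out2) ** 2    # stand-in for a contrastive-style loss
    return out1, out2, loss

w, x1, x2 = 0.5, 3.0, 1.0
out1, out2, loss = forward(w, x1, x2)

# Analytic gradient of the loss through each branch's use of w:
d_out = 2 * (out1 - out2)            # dL/d(out1); dL/d(out2) is -d_out
grad_branch1 = d_out * x1            # chain rule through branch 1
grad_branch2 = -d_out * x2           # chain rule through branch 2
total_grad = grad_branch1 + grad_branch2   # sum over both branches

# Numeric check: central finite difference on the shared w.
eps = 1e-6
loss_plus = forward(w + eps, x1, x2)[2]
loss_minus = forward(w - eps, x1, x2)[2]
fd_grad = (loss_plus - loss_minus) / (2 * eps)

print(total_grad, fd_grad)  # the two values should agree
```

In this toy case the finite-difference gradient matches the sum of the two branch gradients, which is consistent with what I suspect Caffe does, but I'd like confirmation of how it works internally.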
Thanks in advance for your response.