Background: I want to chain multiple neural networks. Suppose I have five networks: one shared network that feeds into the other four subnetworks. The goal is to forward-propagate (and backpropagate) through whichever subnetwork is chosen for a given input example.
For example, if input A is mapped to the 1st subnetwork, it is first passed through the shared network, whose output is then fed into the 1st subnetwork (so there are four subnetworks in total and one shared network). At the end of the 1st subnetwork, I compute the loss w.r.t. the ground truth for A. Then I backpropagate the error, updating the parameters as I go back through the layers of the 1st subnetwork and subsequently the shared network.
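The routing described above can be sketched in plain Python, with each network stubbed as a simple callable (the stubs and function names here are illustrative, not pycaffe API; in practice each would be a `caffe.Net` instance):

```python
# Sketch: one shared network plus four subnetworks, with per-example routing.
# Each "network" is stubbed as a trivial function for illustration only.

def shared(x):
    # stands in for the common network's forward pass
    return x * 2

# four subnetworks, stubbed as simple offsets
subnets = [lambda h, k=k: h + k for k in range(4)]

def route(x, idx):
    """Forward x through the shared net, then through subnetwork idx."""
    h = shared(x)
    return subnets[idx](h)

# Input A is mapped to the 1st subnetwork (index 0):
out = route(3, 0)   # shared: 3*2 = 6, subnet 0: 6+0 = 6
```

The loss would then be computed on `out` for the chosen subnetwork only, and the backward pass would follow the same route in reverse.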
Problem:
So far, I've tried this with two simple networks, N and M, where the output of N feeds into M. During backpropagation, I assign the diff of the first layer of M to the diff of the last layer of N (the last layer of N is identical to the first layer of M). After this assignment, executing N.backward does not seem to produce the correct gradients for the layers in N, and I don't know how to get around this.
# diff must be read from M before it can be passed to N.backward:
diff = M.blobs['first_layer'].diff
N.backward(output_layer=diff)
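The gradient hand-off you are attempting can be checked in isolation with plain NumPy, outside of Caffe. Below, N and M are each a single linear layer (all names and shapes are illustrative): the gradient w.r.t. M's input plays the role of `M.blobs[...].diff`, and copying it into N's output diff before backpropagating through N reproduces the correct end-to-end gradient:

```python
import numpy as np

# Two toy "networks": N (shared) and M (one subnetwork), each one linear layer.
rng = np.random.default_rng(0)
W_n = rng.standard_normal((3, 4))   # N: maps a 4-dim input to 3 dims
W_m = rng.standard_normal((2, 3))   # M: maps N's output to 2 dims

x = rng.standard_normal(4)          # input example
t = rng.standard_normal(2)          # ground truth

# Forward: x -> N -> h -> M -> y, with loss L = 0.5 * ||y - t||^2
h = W_n @ x
y = W_m @ h
loss = 0.5 * np.sum((y - t) ** 2)

# Backward through M first: dL/dy, then dL/dh.
# dh is the analogue of the diff of M's first (input) layer.
dy = y - t
dW_m = np.outer(dy, h)
dh = W_m.T @ dy

# Copy dh into N's top diff, THEN backpropagate through N.
dW_n = np.outer(dh, x)

# Sanity check against the direct gradient of L = 0.5*||W_m W_n x - t||^2:
dW_n_direct = np.outer(W_m.T @ (W_m @ W_n @ x - t), x)
assert np.allclose(dW_n, dW_n_direct)
```

If the same check fails inside Caffe, one common cause is that the diff is copied (or overwritten) after the backward call rather than before it, which is why the order of the two lines above matters.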