The core of this training rule is that a loss should update only part of the graph beneath it, not the whole graph. For example, given the graph conv1 -> conv2 -> loss and conv1 -> loss2, how can I make loss update only conv2? Setting lr = 0 for conv1 doesn't work, because conv1 still has to be updated by loss2.
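To make the intended behaviour concrete, here is a minimal sketch in plain Python. It assumes scalar weights `w1` and `w2` standing in for conv1 and conv2, and squared-error losses; the function name, targets, and learning rate are all hypothetical and not taken from any framework. It only illustrates the update I am after: the gradient of loss stops at conv2's input, and conv1 learns from loss2 alone.

```python
# Toy scalar stand-ins for the layers; all names and values are hypothetical.
def train_step(w1, w2, x, target, target2, lr=0.1):
    # Forward pass: conv1 -> conv2 -> loss, and conv1 -> loss2.
    h = w1 * x                    # "conv1" output
    y = w2 * h                    # "conv2" output
    loss = (y - target) ** 2      # top loss, fed by conv2
    loss2 = (h - target2) ** 2    # side loss, fed directly by conv1

    # Gradient of loss w.r.t. w2 only. h is treated as a constant input,
    # so nothing from loss flows back into conv1.
    grad_w2 = 2 * (y - target) * h

    # Gradient of loss2 w.r.t. w1. This is the only signal conv1 receives.
    grad_w1 = 2 * (h - target2) * x

    # Plain gradient-descent update, with the same lr for both layers.
    return w1 - lr * grad_w1, w2 - lr * grad_w2, loss, loss2
```

Calling `train_step(1.0, 1.0, 1.0, 0.0, 1.0)` gives a nonzero loss and a zero loss2, so conv2's weight moves while conv1's stays put. With ordinary backpropagation through the whole graph, loss would also have changed `w1`. In an autograd framework the same effect is usually obtained by blocking the gradient on conv2's input rather than by changing the learning rate.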