Hello,
I am trying to use a custom chainer.Function class (written by somebody else) in my PyTorch code. To this end, I intend to write a PyTorch module on top of the Chainer module, so that the Chainer module is actually called for all computations. For example, if chainerModule is a custom chainer.Function class with defined forward and backward functions, I create a PyTorch torch.autograd.Function class and a PyTorch torch.nn.Module class. The autograd.Function's forward calls chainerModule's forward, and likewise for backward. However, I observe absurd gradients returned by chainerModule's backward. Is calling chainerModule's backward in this manner inconsistent with Chainer's rules? Am I missing something by forcing the call to the backward function? If so, is there another way to use the Chainer module in PyTorch code?
I include below a bird's-eye view of my code.

```python
class chainerModule(chainer.Function):
    def __init__(self):
        ...

    def forward(self, inputs):
        ...

    def backward(self, inputs, grad_outputs):
        ...


class PytorchFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, inputs):
        ctx.save_for_backward(inputs)
        ctx.CM = chainerModule()
        # Convert inputs from PyTorch tensor to CuPy matrix here
        ...
        outputs = ctx.CM(inputs)
        return outputs

    @staticmethod
    def backward(ctx, grad_outputs):
        inputs, = ctx.saved_variables
        # Convert inputs and grad_outputs from PyTorch tensors to CuPy matrices here
        ...
        grad = ctx.CM.backward(inputs, grad_outputs)
        # Convert grad to PyTorch Variable here
        ...
        return grad


class PytorchModule(torch.nn.Module):
    def __init__(self):
        ...

    def forward(self, inputs):
        return PytorchFunction.apply(inputs)
```
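For what it's worth, one way I could localize the absurd gradients is to compare what the wrapped backward returns against a numerical gradient. A small self-contained sketch of such a check (numpy only; the lambdas f and analytic_grad are hypothetical stand-ins for the wrapped forward/backward, not my actual module):

```python
import numpy as np

def numerical_grad(f, x, eps=1e-6):
    """Central-difference gradient of a scalar-valued f at x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = eps
        grad.flat[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return grad

# Hypothetical stand-ins for the wrapped module's forward and backward.
f = lambda x: np.sum(x ** 2)      # forward, reduced to a scalar loss
analytic_grad = lambda x: 2 * x   # what backward should return for this f

x = np.random.randn(5)
num = numerical_grad(f, x)
assert np.allclose(num, analytic_grad(x), atol=1e-4), "gradients disagree"
```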
Thanks very much, and have a nice day,
Mihir