Hello,
I am trying to understand optim.checkgrad. Consider this simple function:
function f(x)
   -- vect: a tensor of ones with the same shape and type as x
   local vect = x.new(x:size()):fill(1)
   -- fx: dot product of flattened x and flattened vect, i.e. the sum of x
   local fx = torch.dot(x:view(-1), vect:view(-1))
   -- return the function value and (what I believe is) its gradient
   return fx, vect
end
fx is the dot product between flattened x and flattened vect, i.e. the sum of all elements of x, so it seems to me the gradient should simply be vect (a tensor of ones with the shape of x).
Yet when I check with optim.checkgrad:
require 'optim'

a = torch.rand(2, 5)
diff, dC, dC_est = optim.checkgrad(f, a)
dC_est evaluates to 5.0 * (a tensor of ones with the shape of x). Looking at the source of optim.checkgrad, it operates along the rows of x. What am I missing here?
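If I read the source correctly, the numeric part of the check amounts to something like the sketch below for this example (my own paraphrase with made-up variable names, not the actual implementation): the loop runs over the first dimension only, so each perturbation shifts an entire row of 5 elements at once, and the central difference for that row comes out as 5.

eps = 1e-7
dC_est = a.new(a:size())
for i = 1, a:size(1) do
   a[i] = a[i] + eps                  -- perturbs a whole row (5 elements), not one element
   local C1 = f(a)
   a[i] = a[i] - 2 * eps
   local C2 = f(a)
   a[i] = a[i] + eps                  -- restore the row
   dC_est[i] = (C1 - C2) / (2 * eps)  -- (10 * eps) / (2 * eps) = 5, written into the whole row
end

If that reading is right, passing a 1D view of the parameters, e.g. optim.checkgrad(f, a:view(-1)), should perturb one element at a time and give the all-ones estimate I expected. Is checkgrad only meant to be used with flat parameter vectors?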
PyTorch's autograd returns the result I expect, a tensor of ones.
import torch
from torch.autograd import Variable

a = torch.rand(2, 5)
av = Variable(a, requires_grad=True)
b = Variable(torch.ones(10))
c = torch.dot(av.view(-1), b)  # same scalar function: the sum of the elements of av
c.backward()
print(av.grad)  # a 2x5 tensor of ones, as I expected