Checking gradients and Hessians (even if you don't need all of Manopt)


Nicolas Boumal

May 6, 2019, 2:23:44 PM
to manopt...@googlegroups.com
Hello everyone,

This post is just to draw attention to a specific tool in Manopt that may also be helpful for coders who do not need Manopt's special features, but who need to write code for the gradient or Hessian of a function anyway:

If you have code in a function f = mycost(x) that computes the value of a function f(x), and code in a function g = mygradient(x) that computes the gradient of f at x, then you can check whether your gradient code is correct as follows (with Manopt available in your Matlab environment):

problem.M = euclideanfactory(n, 1); % if your variable x is a column vector of length n
problem.cost = @mycost;
problem.egrad = @mygradient;
checkgradient(problem);

This generates a few outputs at the command line, and a plot showing two lines that should be parallel over some interval. If any of the tests fails, this indicates the gradient is likely incorrect.
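To illustrate what this check verifies mathematically, here is a small sketch in Python rather than Matlab (the cost f(x) = sum(x^4), the point x, and the direction u are all made up for illustration, not taken from Manopt): if the gradient code is correct, the first-order Taylor error decays like t^2, which is the slope-2 behavior behind the two parallel lines in the plot.

```python
import numpy as np

# Hypothetical example cost and gradient (for illustration only):
# f(x) = sum(x^4), with gradient g(x) = 4 x^3.
def mycost(x):
    return np.sum(x**4)

def mygradient(x):
    return 4 * x**3

# First-order Taylor check: if g is the true gradient, then
# E(t) = |f(x + t*u) - f(x) - t*<g(x), u>| decays like t^2,
# so shrinking t by a factor 10 shrinks E(t) by about 100.
x = np.array([1.0, 2.0, -1.0])
u = np.array([0.5, -1.0, 0.3])  # a fixed test direction

def taylor_error(t):
    return abs(mycost(x + t*u) - mycost(x) - t * (mygradient(x) @ u))

ratio = taylor_error(1e-3) / taylor_error(1e-4)
print(ratio)  # close to 100 when the gradient code is correct
```

Roughly speaking, checkgradient automates this for many values of t and plots E(t) on a log-log scale against a reference line of slope 2: those are the two lines that should be parallel over some interval.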

If your variable x is not a column vector of length n, but more generally a matrix of size m x n, or even a higher-dimensional array, write

problem.M = euclideanfactory(m, n);

or

problem.M = euclideanfactory([n1, n2, n3]); % any number of dimensions works

instead. In any case, remember that the gradient must have the same size and shape as x itself.
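As a concrete illustration of that shape requirement, here is a small Python sketch (the quadratic cost f(X) = 0.5*||X||_F^2 is made up for illustration): its gradient is X itself, so the gradient array naturally has the same size and shape as the matrix variable.

```python
import numpy as np

# Hypothetical matrix example (for illustration only):
# f(X) = 0.5 * ||X||_F^2 has gradient grad f(X) = X,
# which has the same size and shape as X itself.
def mycost(X):
    return 0.5 * np.sum(X**2)

def mygradient(X):
    return X  # an m-by-n array, matching X

X = np.arange(6.0).reshape(2, 3)   # a 2-by-3 matrix variable
U = np.ones((2, 3))                # a test direction, same shape as X

# For this quadratic f, the first-order Taylor error is exactly (t^2/2)*||U||_F^2.
t = 1e-3
err = abs(mycost(X + t*U) - mycost(X) - t * np.sum(mygradient(X) * U))
print(mygradient(X).shape)  # (2, 3), same as X
```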

If you have code in a function [f, g] = mycostandgradient(x) which returns both the value of f(x) and the gradient of f at x, write this:

problem.costgrad = @mycostandgradient;

instead of defining problem.cost and problem.egrad. (Note: this format expects you to return the Riemannian gradient, but if your variable is unconstrained in a linear space, as in the examples above, this distinction is irrelevant and you can ignore it.)


[If your cost function and gradient check out, then since you are already set up, you can also call trustregions(problem) to optimize from a random initial guess, or trustregions(problem, x) to optimize starting from x.]


If you also have code to compute the Hessian, in a function h = myhessian(x, u) which computes the Hessian of f at x applied to the direction u (so that x, u, and h all have the same size and shape), you can check this code too by adding:

problem.ehess = @myhessian;
checkhessian(problem);

This also produces some outputs at the command line, and a plot with two lines that, here too, should have the same slope over some interval. If any of the tests fails, this indicates that the Hessian code and/or the gradient code is incorrect.
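The Hessian check extends the same idea to second order. Here is a hedged Python sketch (again with the made-up example f(x) = sum(x^4), whose Hessian applied to u is 12*x^2.*u): with a correct gradient and Hessian, the second-order Taylor error decays like t^3.

```python
import numpy as np

# Hypothetical example (for illustration only): f(x) = sum(x^4),
# gradient g(x) = 4 x^3, Hessian applied to u: H(x)[u] = 12 x^2 * u.
def mycost(x):
    return np.sum(x**4)

def mygradient(x):
    return 4 * x**3

def myhessian(x, u):
    return 12 * x**2 * u  # same size and shape as x and u

# Second-order Taylor check: with correct gradient and Hessian,
# E(t) = |f(x+t*u) - f(x) - t*<g,u> - (t^2/2)*<u, H[u]>| decays like t^3,
# so shrinking t by a factor 10 shrinks E(t) by about 1000.
x = np.array([1.0, 2.0, -1.0])
u = np.array([0.5, -1.0, 0.3])

def taylor_error2(t):
    return abs(mycost(x + t*u) - mycost(x)
               - t * (mygradient(x) @ u)
               - 0.5 * t**2 * (u @ myhessian(x, u)))

ratio = taylor_error2(1e-2) / taylor_error2(1e-3)
print(ratio)  # close to 1000 when gradient and Hessian are both correct
```

This slope-3 decay (versus slope 2 for the gradient check) is what checkhessian looks for on its log-log plot.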

If your variable x is complex, you can also handle this with

problem.M = euclideancomplexfactory(n, 1); % or any other array size as above

With x complex, grad f(x) is a complex vector / matrix / array of the same size as x, and the gradient is defined with respect to the real inner product <u, v> = real(u(:)'*v(:)).
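The same first-order check goes through in the complex case, using the real inner product above. Here is a hedged Python sketch (the cost f(x) = sum(|x|^4) and its gradient 4*|x|^2.*x with respect to that real inner product are made up for illustration):

```python
import numpy as np

# Hypothetical complex example (for illustration only):
# f(x) = sum(|x|^4). With the real inner product <u, v> = real(u' * v),
# its gradient works out to g(x) = 4 * |x|^2 * x.
def mycost(x):
    return np.sum(np.abs(x)**4)

def mygradient(x):
    return 4 * np.abs(x)**2 * x

def inner(u, v):
    # real inner product on complex arrays, as in the post
    return np.real(np.vdot(u, v))  # vdot conjugates its first argument

x = np.array([1 + 1j, 2 - 1j])
u = np.array([0.5 - 0.5j, 0.3 + 1j])  # a fixed complex test direction

def taylor_error(t):
    return abs(mycost(x + t*u) - mycost(x) - t * inner(mygradient(x), u))

ratio = taylor_error(1e-3) / taylor_error(1e-4)
print(ratio)  # close to 100: slope 2, just as in the real case
```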

If your function is a function of more than one variable, see productmanifold.m.

Let us know if you face any difficulties using these tools.

Best,
Nicolas

Nicolas Boumal

May 21, 2020, 10:49:32 AM
to Manopt
The mathematical underpinnings of these tools are explained in my book, announced here:
