Hessian of function depending on ComplexCircle vector

93 views
Skip to first unread message

stum...@gmail.com

unread,
Mar 25, 2018, 11:01:27 PM3/25/18
to Manopt
Does one have to do something special to set the Euclidean Hessian of a function that depends on a vector of ComplexCircles of length n? I used http://www.matrixcalculus.org/matrixCalculus to compute the gradient and Hessian of the cost, reinterpreting sign as phase in the output. The egrad set below passes the gradient test, but the ehess set below does not pass the Hessian test or the symmetry test. Do you have any suggestions on how I might properly set the ehess below for this cost function? Thanks, Stuart

CODE:
n = 1000;
A = randn(n)+1i*randn(n);
b = randn(n,1);
manifold = complexcirclefactory(n);
problem.M = manifold;
problem.cost = @(x) 1/2*norm(abs(A*x).^2-b)^2;
problem.egrad = @(x) 2*A'*((abs(A*x).^2-b).*(A*x));
problem.ehess = @(x, xdot) 2*A'*((2*(A*x).*(A*x)+abs(A*x).^2-b).*(A*xdot));

checkgradient(problem); figure;
checkhessian(problem); figure;


OUTPUT:
The slope should be 2. It appears to be: 1.9999.
If it is far from 2, then directional derivatives might be erroneous.
The residual should be 0, or very close. Residual: 9.30823e-08.
If it is far from 0, then the gradient is not in the tangent space.
The slope should be 3. It appears to be: 2.
If it is far from 3, then directional derivatives or the Hessian might be erroneous.
Note: if the exponential map is only approximate, and it is not a second-order approximation,
then it is normal for the slope test to reach 2 instead of 3. Check the factory for this.
If tested at a critical point, then even for a first-order retraction the slope test should yield 3.
The residual should be zero, or very close. Residual: 2.90723e-09.
If it is far from 0, then the Hessian is not in the tangent plane.
<d1, H[d2]> - <H[d1], d2> should be zero, or very close.
Value: 761099 - 200577 = 560521.
If it is far from 0, then the Hessian is not symmetric.

Nicolas Boumal

unread,
Mar 26, 2018, 10:34:16 AM3/26/18
to Manopt
Hello,

For complex numbers, it's important to keep track of complex-conjugates. The key is that the gradient is defined with respect to some inner product. In Manopt, complex manifolds are usually endowed with an inner product which treats there real and imaginary part as if they were just components of a longer real vector: <u, v> = Re(u*v), where Re() extracts real part and u* is the Hermitian-conjugate of vector u.

Here is correct (albeit inefficient) code for the Hessian:

problem.ehess = @(x, xdot) 2*A'*(  ( 2*abs(A*x).^2 - b ).*(A*xdot)  +  (A*x).^2 .* conj(A*xdot) );

Here is a picture of the derivations:

In your case, the costly operation is the computation of the matrix-vector product A*x. This product appears a number of times in the cost, gradient and Hessian. To avoid repeating this computation, it's good to use the "store" mechanism (for caching).

See: http://manopt.org/tutorial.html#costdescription -- "Caching: how to use the store structure".

I hope this helps,
Best,
Nicolas
Message has been deleted

stum...@gmail.com

unread,
Mar 29, 2018, 12:10:57 AM3/29/18
to Manopt
Is there a correct way to use store when computing the Hessian so that the Hessian is saved off? Is it useful to save off the Hessian? The code below causes the Hessian symmetry test to fail. In myhess, is there some issue with store and different values of u for the same value of x? If I uncomment the code at the bottom of myhess, the code passes the tests.

CODE:
% Generate the problem data.


n = 1000;
A = randn(n)+1i*randn(n);
b = randn(n,1);

% Create the problem structure.


manifold = complexcirclefactory(n);
problem.M = manifold;

problem.cost = @mycost;
function [f,store] = mycost(x,store)

if ~isfield(store,'Ax')
store.Ax = A*x; % The store memory is associated to a specific x
store.aAx2 = abs(store.Ax).^2;
store.R = store.aAx2-b;
end

if ~isfield(store,'f')
R = store.R;
store.f = (R'*R)/2;
end
f = store.f;

end

problem.grad = @mygrad;
function [g,store] = mygrad(x,store)

if ~isfield(store,'Ax')
store.Ax = A*x; % The store memory is associated to a specific x
store.aAx2 = abs(store.Ax).^2;
store.R = store.aAx2-b;
end

if ~isfield(store,'eg')
Ax = store.Ax;
R = store.R;
store.eg = A'*((2*R).*Ax);
end

if ~isfield(store,'g')
eg = store.eg;
store.g = manifold.egrad2rgrad(x,eg);
end
g = store.g;

end

problem.hess = @myhess;
function [h,store] = myhess(x,u,store)

if ~isfield(store,'Ax')
store.Ax = A*x; % The store memory is associated to a specific x
store.aAx2 = abs(store.Ax).^2;
store.R = store.aAx2-b;
end

if ~isfield(store,'eg')
Ax = store.Ax;
R = store.R;
store.eg = A'*((2*R).*Ax);
end


if ~isfield(store,'h')
eg = store.eg;
Ax = store.Ax;
R = store.R;
aAx2 = store.aAx2;
Au = A*u;
store.h = manifold.ehess2rhess(x,eg,2*A'*((aAx2+R).*Au+(Ax.^2).*conj(Au)),u);
end
h = store.h;


%{
eg = store.eg;
Ax = store.Ax;
R = store.R;
aAx2 = store.aAx2;
Au = A*u;
h = manifold.ehess2rhess(x,eg,2*A'*((aAx2+R).*Au+(Ax.^2).*conj(Au)),u);
%}

end

% Numerically check gradient and Hessian consistency.
figure;
checkgradient(problem);
figure;
checkhessian(problem);

OUTPUT:
The slope should be 2. It appears to be: 2.00004.


If it is far from 2, then directional derivatives might be erroneous.

The residual should be 0, or very close. Residual: 1.04939e-07.


If it is far from 0, then the gradient is not in the tangent space.

The slope should be 3. It appears to be: 3.00028.


If it is far from 3, then directional derivatives or the Hessian might be erroneous.
Note: if the exponential map is only approximate, and it is not a second-order approximation,
then it is normal for the slope test to reach 2 instead of 3. Check the factory for this.
If tested at a critical point, then even for a first-order retraction the slope test should yield 3.

The residual should be zero, or very close. Residual: 3.03604e-09.


If it is far from 0, then the Hessian is not in the tangent plane.
<d1, H[d2]> - <H[d1], d2> should be zero, or very close.

Value: -363453 - -583440 = 219987.

Nicolas Boumal

unread,
Mar 29, 2018, 9:28:33 AM3/29/18
to Manopt
Hello,

Is there a correct way to use store when computing the Hessian so that the Hessian is saved off? Is it useful to save off the Hessian?

No, there is no way to do it, and the reason is it is not useful.

This is crucial: the store is associated to a particular x (a point on the manifold). If the Hessian is called for multiple directions u at the same x, all these calls will share the same store.

The reason is: it's pretty easy for solvers to keep track of which Hessian calls they made, so they won't ask you for the Hessian at the same x and u twice. This also makes the store mechanism simpler (hence more efficient), and it uses less memory.

Your commented code is indeed the correct version. (The part ' if ~isfield(store,'h') ... ' should be removed.)

Best,
Nicolas 
Reply all
Reply to author
Forward
0 new messages