Inverse Riemannian Hessian

Carlos Feres

Oct 7, 2019, 9:46:22 PM
to Manopt

Hi all! First, just want to thank all the people involved in Manopt, it's an amazing tool.

I'm currently working on a minimization problem over Riemannian manifolds. I'm rather new to the topic, so I started with Euclidean spaces (although I'm looking to reformulate an expanded version of the problem using Grassmannian manifolds). My question relates to proving the theoretical convergence properties of trust-regions: Manopt gives very nice results (using euclideancomplexfactory and trustregions), but I want to prove that my problem satisfies the global and local convergence requirements of TR.

My cost function is defined over complex vectors, f : C^n -> R, and is phase-invariant, i.e. f(alpha*w) = f(w) for all |alpha| = 1. So the problem is actually defined on a quotient manifold (which is one of the motivations to use manifold optimization), although this is not a major issue in my actual question.


Since the domain is complex, I use Wirtinger calculus to derive the Euclidean gradients and Hessians.


I've done most of the proofs already (using the theorems of the Riemannian Trust-Regions paper): I've shown global convergence (using Corollary 4.6), superlinear local convergence (Theorem 4.13, as the Hessian is exact), and most of the conditions for local convergence (Theorem 4.12). The only missing piece for the latter is to show that the inverse Riemannian Hessian operator is bounded in a neighborhood of a local minimizer.


My problem: I haven't been able to find an expression for the inverse Riemannian Hessian.



The Wirtinger Hessian can be described as a block matrix whose blocks depend only on the argument w in C^n:


H_f(w) = [ A_w , B_w ; conj(B_w) , conj(A_w) ]


By the conventions of Wirtinger calculus, this Hessian operates on stacked vectors of the form [ u ; conj(u) ].


I computed the Riemannian Hessian with Fréchet derivatives and obtained

h = Hess f(w)[u] = A_w u + B_w conj(u)

I tried to solve for u in the equation above (i.e. to invert the operator), to no avail. I also tried to rewrite the expression with the usual C -> R^2 formulation, but had no success isolating the real and imaginary parts of u.

Any ideas on how to find the inverse Riemannian Hessian operator? Am I missing something obvious?
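
For reference, this is roughly how the problem is wired into Manopt (a sketch; cost_fun, egrad_fun, Aw and Bw are placeholders for my actual expressions):

% Sketch of the Manopt setup; placeholders stand in for the real functions.
n = 8;                                    % problem size (placeholder)
problem.M = euclideancomplexfactory(n, 1);
problem.cost  = @(w) cost_fun(w);         % f : C^n -> R, phase-invariant
problem.egrad = @(w) egrad_fun(w);        % Euclidean gradient from Wirtinger calculus
problem.ehess = @(w, u) Aw(w)*u + Bw(w)*conj(u);  % Hessian, as derived above
w = trustregions(problem);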

Nicolas Boumal

Oct 8, 2019, 8:10:24 AM
to Manopt
Hello Carlos,

Thank you for your interest in the toolbox.

I think you do not need to get the inverse of the Hessian for your stated purpose: the Hessian at the global optimizer (or any other point) is a symmetric operator, hence it has real eigenvalues. If these eigenvalues are nonzero, then the eigenvalues of the inverse of the Hessian are simply the reciprocals of the eigenvalues of the Hessian:

If lambda_1, ..., lambda_n are the eigenvalues of Hess f(x),
then 1/lambda_1, ..., 1/lambda_n are the eigenvalues of (Hess f(x))^{-1}.

If you can show that lambda_1 is strictly positive, say lambda_1 > a for some positive real number a, then it follows that all eigenvalues of the inverse of the Hessian at x are upper bounded by 1/a.

Then, simply using that Hess f(x) is continuous in x (I didn't check for your particular problem, but it's often the case), it follows that the inverse of the Hessian is continuous as well, so that there is a neighborhood around x in which the inverse of the Hessian is bounded. This doesn't tell you how big the neighborhood is, but it's enough to claim existence.
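
As a numerical sanity check, Manopt can also list the spectrum of the Riemannian Hessian for you; a sketch, assuming you still have your problem structure and use the point x returned by the solver:

% Inspect the eigenvalues of the Riemannian Hessian at x numerically.
x = trustregions(problem);
lambdas = hessianspectrum(problem, x);    % eigenvalues of Hess f(x)
fprintf('Smallest eigenvalue: %g\n', min(lambdas));
% If min(lambdas) >= a > 0, the inverse Hessian at x has norm at most 1/a.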

As a side note: an alternative to Wirtinger calculus is described here: https://groups.google.com/forum/#!topic/manopttoolbox/CLP5J8x0qmA (see the comment at the end of the post, which explains how to apply the general concept to complex variables).

I hope this helps.

Best,
Nicolas

Carlos Feres

Oct 8, 2019, 12:36:27 PM
to Manopt
Thanks Nicolas for your prompt answer. 

When using Manopt for this problem, I actually used the approach described in the link, with euclideancomplexfactory and the real trace inner product. I used Wirtinger calculus to describe where the terms of the Riemannian Hessian come from.

I'm assuming that you are listing the eigenvalues in ascending order; that is what I did for the Wirtinger Hessian. But as far as I understand, I need to derive the eigenvalues of the Riemannian Hessian operator to show that its inverse is bounded. If that's the case, I'm also having trouble describing the operator in "matrix" form, because of the conj(u) term in Hess f(w)[u], in order to derive its eigenvalues. And yes, the Riemannian Hessian is continuous (the entries of both A_w and B_w are polynomials in the real and imaginary parts of w).

Any further ideas? Thanks!

Nicolas Boumal

Oct 8, 2019, 12:42:02 PM
to Manopt
Hello Carlos,

That's correct: to apply the local convergence results from Absil's book, you need to understand the eigenvalues of the Riemannian Hessian at the limit point (the point to which you converge).

To understand the largest or smallest eigenvalue of a symmetric linear operator A, you can simply consider the maximum or minimum of the quantity <u, A(u)>, where u is a unit-norm vector. For example, A(u) = Hess f(x)[u] is the Hessian at x, and u is a tangent vector at x with norm 1.

To get a handle on <u, A(u)>, the nice thing is that you do not necessarily need to express A as a matrix: you just need to understand what A(u) is, and you probably already have a formula for that. Then, take the (real) inner product of u with A(u): this gives a real number, and you "just" need to upper or lower bound it.
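
In Manopt, that quantity is also easy to probe numerically; a quick sketch, assuming your problem structure defines the Hessian and x is the point of interest:

% Sample the Rayleigh quotient <u, Hess f(x)[u]> over random unit-norm
% tangent vectors u at x.
M = problem.M;
for k = 1 : 20
    u = M.randvec(x);                % random tangent vector (unit norm)
    Hu = getHessian(problem, x, u);  % apply the Riemannian Hessian to u
    fprintf('%g\n', M.inner(x, u, Hu));
end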

Best,
Nicolas

Carlos Feres

Oct 8, 2019, 1:50:26 PM
to Manopt
Thanks Nicolas, this helps a lot. My Riemannian Hessian operator is R-linear but not C-linear, since it has a conj(u) in the expression, but that does not seem to break the argument.
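
For completeness, stacking real and imaginary parts does give a real matrix form for the operator; a minimal sketch, with random stand-ins for A_w and B_w:

% The R-linear map u -> A*u + B*conj(u) as a real 2n x 2n matrix acting
% on [real(u); imag(u)]. A and B are random stand-ins for A_w and B_w.
n = 4;
A = randn(n) + 1i*randn(n);
B = randn(n) + 1i*randn(n);
Ar = real(A); Ai = imag(A); Br = real(B); Bi = imag(B);
Mreal = [Ar + Br, Bi - Ai;
         Ai + Bi, Ar - Br];
% Consistency check against the complex form:
u = randn(n, 1) + 1i*randn(n, 1);
h = A*u + B*conj(u);
norm(Mreal*[real(u); imag(u)] - [real(h); imag(h)])  % should be around 1e-15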