Gradient over the unitary manifold


LY

Mar 18, 2024, 11:39:36 AM
to Manopt
Hi, I have a simple difficulty: I want to optimize over the unitary manifold, but I don't know how to compute the gradient over the complex unitary manifold. For example, my cost function is 1.png, where U is a unitary matrix. How can I compute the gradient of f(U) with respect to U, when U is a complex-valued matrix?

Thanks, everyone!

LY

Mar 18, 2024, 11:45:27 AM
to Manopt
Or, for another cost function: 2.png

Nicolas Boumal

Mar 18, 2024, 12:14:56 PM
to Manopt
Hello,

When working with complex matrices, a common (real) inner product is this one:  <A, B> = real(trace(A* B)),  where A* is the conjugate-transpose of A.

Notice how we take the real part of the trace, so that this inner product is indeed always a real number. (There is a bit more information around equation (3.17) in my book.)
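If it helps to see this concretely, here is a quick NumPy sketch (Python, outside of Manopt; the variable names and the random seed are mine) checking that this pairing is real-valued, symmetric, and positive:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

# <A, B> = real(trace(A^* B)), with A^* the conjugate-transpose
inner = lambda X, Y: np.real(np.trace(X.conj().T @ Y))

print(inner(A, B))       # a real number
print(inner(B, A))       # symmetric: same value as above
print(inner(A, A) > 0)   # positive: True (zero only for A = 0)
```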

For your function f(U) = ||UA - B||^2,  I assume you mean the Frobenius norm. Then, you can check that f(U) = <UA - B, UA - B> (with the inner product above).

What happens to f(U) if we push U a little bit in the direction V? That is, what is the value of Df(U)[V], the "directional derivative" of f at U along the direction V?

If we apply the chain rule and the product rule (since <., .> is indeed a product), then we find:

Df(U)[V] = <VA, UA - B> + <UA - B, VA>

Notice that the inner product is symmetric, and also that we can move a matrix from one side to the other in the inner product simply by taking its conjugate-transpose (see my book if you're not familiar with that fact); so:

Df(U)[V] = 2 <VA, UA - B> = <V, 2(UA-B)A*>

This shows that the Euclidean gradient of f at U with respect to the inner product chosen above is:

nabla f(U) = 2(UA-B)A*

You can check that this is correct with the following Manopt code in Matlab:

problem.M = euclideancomplexfactory(5, 5);
A = randn(5)+1i*randn(5);
B = randn(5)+1i*randn(5);
problem.cost = @(U) norm(U*A-B, 'fro')^2;
problem.egrad = @(U) 2*(U*A-B)*A';
checkgradient(problem);
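If you'd rather sanity-check the formula without Manopt, a finite-difference test in Python/NumPy does the same job as checkgradient (this is just a sketch; the names and the seed are mine). Since f is quadratic, the central difference equals the directional derivative up to rounding:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

f = lambda U: np.linalg.norm(U @ A - B, 'fro') ** 2
egrad = lambda U: 2 * (U @ A - B) @ A.conj().T

inner = lambda X, Y: np.real(np.trace(X.conj().T @ Y))

U = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
V = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

t = 1e-6
fd = (f(U + t * V) - f(U - t * V)) / (2 * t)  # central difference ~ Df(U)[V]
print(abs(fd - inner(V, egrad(U))))           # essentially zero
```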

To get the Riemannian gradient on the unitary group (with the Riemannian submanifold metric, which is typical), it remains to compute the orthogonal projection of the Euclidean gradient to the tangent space.

In Manopt, you could just do this:

problem.M = unitaryfactory(5);
problem.cost = @(U) norm(U*A-B, 'fro')^2;
problem.egrad = @(U) 2*(U*A-B)*A';
checkgradient(problem);

Notice that we changed the manifold (problem.M), but that's it: all the rest is the same. That's because problem.egrad specifies the Euclidean gradient ("egrad", not "grad"), and Manopt takes care of the conversion automatically.

You can also specify the Riemannian gradient (problem.grad) by doing the projection yourself. The projector at U maps a matrix P to the matrix   U skew(U* P)   where skew(M) = (M-M*)/2 is the skew-Hermitian part of a matrix.
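To make that projector concrete, here is a NumPy sketch (again outside Manopt, with my own names) that builds proj(U, P) = U skew(U* P) and checks two properties: it is idempotent, and its output has the tangent form U*Omega with Omega skew-Hermitian:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5

# Random unitary U via QR of a complex Gaussian matrix
Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
U, _ = np.linalg.qr(Z)

skew = lambda M: (M - M.conj().T) / 2          # skew-Hermitian part
proj = lambda U, P: U @ skew(U.conj().T @ P)   # projector onto the tangent space at U

P = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Idempotent: projecting twice changes nothing (uses U^* U = I)
print(np.linalg.norm(proj(U, proj(U, P)) - proj(U, P)))  # ~0

# The result is U * Omega with Omega skew-Hermitian
Omega = U.conj().T @ proj(U, P)
print(np.linalg.norm(Omega + Omega.conj().T))            # ~0
```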

Hope this helps,
Nicolas

Nicolas Boumal

Mar 18, 2024, 12:16:45 PM
to Manopt
(Make sure to type "help unitaryfactory" in Matlab though, and to read the documentation there. There is a message under "This is important" that is, well, important to read when working with that manifold.)

LY

Mar 18, 2024, 10:30:49 PM
to Manopt
Thanks for your reply!

Actually, I'm considering a more complicated cost function, like this: 3.png. When I try to compute the egrad of G(T), I find that I have to take the gradient with respect to the complex-valued matrix T, even though the value of the cost function G(T) is real. From the standpoint of complex analysis, this gradient seems impossible.

Nicolas Boumal

Mar 19, 2024, 8:39:46 AM
to Manopt
Hello,

Try the following:

>> problem.M = unitaryfactory(5);
>> D = diag(randn(5,1)); % Lambda matrix; real diagonal
>> problem.cost = @(T) sum(exp(diag(T*D*T')));
>> problem.egrad = @(T) 2*diag(exp(diag(T*D*T')))*T*D;
>> checkgradient(problem)

Here are some details (I didn't write everything out, but this is the essence of the calculation you need):

Image from iOS (1).jpg
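The same finite-difference check as before can be done for this cost in NumPy (a sketch with my own names and seed; note that Matlab's T' is the conjugate-transpose, and D is the real diagonal matrix from the snippet above):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
D = np.diag(rng.standard_normal(n))  # real diagonal, as in the Matlab snippet

# diag(T D T^*) is real up to rounding since T D T^* is Hermitian;
# taking the real part discards the spurious imaginary bits
G = lambda T: np.sum(np.exp(np.real(np.diag(T @ D @ T.conj().T))))
egrad = lambda T: 2 * np.diag(np.exp(np.real(np.diag(T @ D @ T.conj().T)))) @ T @ D

inner = lambda X, Y: np.real(np.trace(X.conj().T @ Y))

T = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
V = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

t = 1e-6
fd = (G(T + t * V) - G(T - t * V)) / (2 * t)  # central difference ~ DG(T)[V]
print(abs(fd - inner(V, egrad(T))))           # small: the formula matches
```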

Nicolas Boumal

Mar 19, 2024, 8:51:02 AM
to Manopt
Ah, one more thing:

While it's true that mathematically your cost function is real-valued, it may very well be that numerically it will pick up small complex parts.

Actually, the text produced by "checkgradient(problem)" tells you this.

The procedure then is (a) to check that the imaginary parts are indeed machine-precision small (if not, there is a bigger problem); and then (b) to forcefully take the real part, like this:

>> problem.cost = @(T) real(sum(exp(diag(T*D*T'))));

LY

Mar 19, 2024, 10:12:58 AM
to Manopt
Thanks very much!!! I will check it carefully!