Optimization with two different manifolds


Xianghao YU

unread,
Jan 6, 2015, 1:38:37 AM
to manopt...@googlegroups.com
Hi,

I am using Manopt and find it really useful and powerful. Now I can solve optimization problems on manifolds with the help of Manopt.

However, I wonder whether Manopt can solve a problem involving two different manifolds simultaneously.

Thank you very much.

Best,
Xianghao

BM

unread,
Jan 6, 2015, 3:01:41 AM
to manopt...@googlegroups.com
Hello Xianghao,

Yes, you can use the tool productmanifold.

Here is a forum post by Nicolas 

Also, here is the tutorial in the tools section, http://manopt.org/tutorial.html#tools.
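To make the idea concrete outside of Manopt's own syntax, here is a NumPy sketch (an illustration only, not Manopt code) of what a product manifold does: a point is a pair (x, z) with each component on its own manifold, and tangent projections and inner products act factor-wise.

```python
import numpy as np

# Illustration (not Manopt code): a product of two manifolds, here the
# unit sphere in R^3 and the complex circle {z in C : |z| = 1}.
# A point on the product is a pair; every operation acts factor-wise.

def sphere_tangent_proj(x, u):
    # Project u onto the tangent space of the unit sphere at x.
    return u - np.dot(x, u) * x

def circle_tangent_proj(z, w):
    # Tangent projection for the complex circle, using the real inner
    # product Re(conj(a) * b) on C seen as R^2.
    return w - np.real(np.conj(z) * w) * z

def product_tangent_proj(point, vec):
    x, z = point
    u, w = vec
    return (sphere_tangent_proj(x, u), circle_tangent_proj(z, w))

def product_inner(v1, v2):
    # Inner product on the product = sum of the factors' inner products.
    (u1, w1), (u2, w2) = v1, v2
    return np.dot(u1, u2) + np.real(np.conj(w1) * w2)

rng = np.random.default_rng(0)
x = rng.standard_normal(3)
x /= np.linalg.norm(x)                    # a point on the sphere
z = np.exp(1j * 0.7)                      # a point on the complex circle
u, w = product_tangent_proj((x, z), (rng.standard_normal(3),
                                     rng.standard_normal()
                                     + 1j * rng.standard_normal()))
```

After projection, each component is tangent to its own factor, which is exactly what Manopt's productmanifold arranges for you.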

Let us know if you have any other queries.

Regards,
BM


seu...@gmail.com

unread,
Jan 12, 2015, 2:53:35 AM
to manopt...@googlegroups.com
Hello BM,

Thank you very much for your help.

In addition, could you please point me to some material about optimization problems on the complex circle manifold? I would like to understand some details about it.

Thank you very much.

Yours sincerely,
Xianghao

BM

unread,
Jan 12, 2015, 4:43:10 PM
to manopt...@googlegroups.com
Hello Xianghao, 

Unfortunately, I am not that familiar with the complex circle manifold myself; Nicolas might have some useful references on this.

Regards,
BM



seu...@gmail.com

unread,
Jan 13, 2015, 12:58:26 AM
to manopt...@googlegroups.com
Thank you very much for your help.

Yours sincerely,
Xianghao

Nicolas Boumal

unread,
Jan 13, 2015, 8:33:33 AM
to manopt...@googlegroups.com
Hello Xianghao,

The geometry obtained with complexcirclefactory(n) allows one to optimize cost functions defined over complex vectors of length n such that each entry of the vector has modulus 1. That is to say: it is optimization over a product of n circles in the complex plane.
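In NumPy terms (a sketch of the geometry, not the factory's actual code), a point of this manifold is a complex vector with unit-modulus entries; the tangent space at z consists of vectors u with Re(conj(z_k) u_k) = 0 entrywise, and a natural retraction renormalizes each entry:

```python
import numpy as np

# Sketch of the complex circle geometry (an illustration, not Manopt's
# complexcirclefactory code): points are complex vectors z with |z_k| = 1.

def tangent_proj(z, u):
    # With the real inner product Re(conj(a) * b), the tangent space at z
    # consists of vectors u with Re(conj(z_k) * u_k) = 0 in every entry.
    return u - np.real(np.conj(z) * u) * z

def retract(z, v):
    # Move along the tangent vector v, then renormalize each entry back
    # to modulus 1 (a projection-like retraction).
    w = z + v
    return w / np.abs(w)

rng = np.random.default_rng(1)
n = 6
z = np.exp(1j * rng.uniform(0, 2 * np.pi, n))   # n unit-modulus entries
v = tangent_proj(z, rng.standard_normal(n) + 1j * rng.standard_normal(n))
z_new = retract(z, v)
```

Note that for a tangent v, |z_k + v_k| is never zero, so the retraction above is always well defined.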

Here is one paper that addresses optimization over this manifold and gives formulas for the tangent space, the orthogonal projector and the Riemannian gradient. It does not give formulas for the Riemannian Hessian, but those can be read from the Matlab factory file:


Tightness of the maximum likelihood semidefinite relaxation for angular synchronization
Afonso S. Bandeira, Nicolas Boumal, Amit Singer
See section 4.3 "Construction of the dual certificate S".
(there is very little about the geometry, though, I'm afraid.)

I hope this helps!

Nicolas

seu...@gmail.com

unread,
Jan 13, 2015, 8:58:08 AM
to manopt...@googlegroups.com
Hello Nicolas,

Thank you for your help. I will read it and hope it helps.

I also have another question that may be a little trivial: is the result given by Manopt over the complex circle manifold a local optimum or a global optimum?

Thank you
Xianghao

Nicolas Boumal

unread,
Jan 13, 2015, 9:03:18 AM
to
If the cost function is appropriately smooth (say, twice continuously differentiable, to be safe, but you might get away with less than that), then methods such as the steepest descent method and the trust-regions method converge (from any initial guess) to critical points, i.e., the norm of the gradient goes to zero (in the limit).

That is a very weak statement. In particular, it doesn't say if you are converging to a single point (although in practice you do, "almost always"), and it doesn't even say whether you are going to a local optimizer (although in practice you do, "almost always").

There is certainly no guarantee of finding a global optimum in general. But we do observe that, on many applications, convergence is to global optima "most of the time" ... Again: little or no theory there, and certainly not for a general cost.
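A small NumPy experiment (not Manopt, and only an illustration) that matches this description: Riemannian steepest descent for f(x) = x'Ax on the unit sphere drives the gradient norm to zero, and the limit is a critical point, i.e., an eigenvector of A. Whether that critical point is the global minimizer depends on the initial guess in general, even though in practice descent typically does reach it.

```python
import numpy as np

# Fixed-step Riemannian gradient descent on the unit sphere for
# f(x) = x' A x (a sketch; Manopt's solvers use proper line searches).
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0, 5.0]) @ Q.T   # known spectrum

x = rng.standard_normal(5)
x /= np.linalg.norm(x)
for _ in range(500):
    egrad = 2 * A @ x                    # Euclidean gradient
    rgrad = egrad - (x @ egrad) * x      # project onto tangent space at x
    x = x - 0.05 * rgrad                 # descent step ...
    x /= np.linalg.norm(x)               # ... followed by retraction

# At a critical point the Riemannian gradient vanishes and x is an
# eigenvector of A; here descent reaches the smallest eigenvalue, 1.
egrad = 2 * A @ x
rgrad = egrad - (x @ egrad) * x
```

The other eigenvectors are also critical points (saddles or the maximizer), which is exactly why "gradient goes to zero" is a weaker statement than "global optimum".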

I hope this answers your question!
Best,
Nicolas

seu...@gmail.com

unread,
Jan 13, 2015, 9:15:44 AM
to manopt...@googlegroups.com
It really helps. Thank you for your answer.

Sincerely,
Xianghao

seu...@gmail.com

unread,
Mar 3, 2015, 9:30:34 AM
to manopt...@googlegroups.com
Hello Nicolas,

Could you please provide some information on how the step size alpha is determined in the steepest descent algorithm in Manopt? I saw that you wrote it is based on a simple backtracking method; is it the same as the Armijo step?

Thank you very much!
Yours sincerely,
Xianghao

Nicolas Boumal

unread,
Mar 4, 2015, 7:03:04 AM
to manopt...@googlegroups.com
Hello Xianghao,

There are multiple line-search algorithms implemented in Manopt. You can see a list here:

You can choose which one to use in the options structure passed to the solver, and you can also pass options to the line-search algorithm in this fashion.

The algorithm used by default in steepestdescent (in the version of the code at the time of writing) is:

It is indeed a backtracking line-search, which stops based on an Armijo "sufficient decrease" condition.
The way we select the initial step size to try (this is critical to making the algorithm efficient) is derived from page 59 in the book "Numerical Optimization", by Nocedal and Wright (2nd edition).
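In NumPy (a generic sketch of backtracking with the Armijo sufficient-decrease condition, not Manopt's exact implementation, which additionally uses the Nocedal and Wright initial-step heuristic mentioned above):

```python
import numpy as np

# Generic backtracking line search with the Armijo condition: shrink
# alpha until f(x + alpha*d) <= f(x) + c * alpha * <grad, d>.
# The names alpha0, c, tau are illustrative, not Manopt's option names.

def armijo_backtrack(f, x, grad, d, alpha0=1.0, c=1e-4, tau=0.5,
                     max_iter=50):
    fx = f(x)
    slope = np.dot(grad, d)      # negative for a descent direction
    alpha = alpha0
    for _ in range(max_iter):
        if f(x + alpha * d) <= fx + c * alpha * slope:
            return alpha         # sufficient decrease achieved
        alpha *= tau             # backtrack: shrink the step
    return alpha

# Example on f(x) = ||x||^2, with descent direction d = -gradient.
f = lambda x: np.dot(x, x)
x = np.array([3.0, -4.0])
g = 2 * x
alpha = armijo_backtrack(f, x, g, -g)
```

On a Riemannian manifold the trial point x + alpha*d is replaced by a retraction of alpha*d at x, but the acceptance test is the same.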

If you have a better idea of which step size should be tried first, you can use linesearch_hint as an alternate line-search algorithm. See the documentation here:
and an example here:

If you are interested in the conditions that the line-search should fulfil in order to guarantee convergence of the steepest descent method, I refer you to the book Optimization Algorithms on Matrix Manifolds by Absil, Mahony and Sepulchre (available for free online).

Hope this helps.

Cheers!
Nicolas

seu...@gmail.com

unread,
Mar 29, 2015, 2:21:05 AM
to manopt...@googlegroups.com
Hello Nicolas,

I have another question about the checkgradient function. Can this function check the gradient of a real-valued function with respect to a complex variable?

For example, I have an optimization problem on the complex sphere and I have calculated the gradient myself; can I use checkgradient to check it?

Thank you!


Yours sincerely,
Xianghao

BM

unread,
Mar 29, 2015, 6:40:06 AM
to manopt...@googlegroups.com
I think it is possible. 

Regards,
BM


seu...@gmail.com

unread,
Mar 30, 2015, 1:42:25 AM
to manopt...@googlegroups.com
I am a little bit confused by your answer. Does it mean the checkgradient function can always check the gradient with respect to a complex variable?

Thank you very much.

Xianghao

Nicolas Boumal

unread,
Mar 30, 2015, 4:48:45 AM
to manopt...@googlegroups.com
Hello Xianghao,

There are a few things to distinguish in your question:

 * Manopt has no problem working with complex numbers as a rule, just as it wouldn't have a problem working with quaternions, or cells, or whatever. The toolbox can work with any representation of a Riemannian manifold, as long as one can expose projections, retractions, inner products, etc. for it.

 * This being said, most manifolds that come out of the box with Manopt, currently, are real manifolds. Those that work with complex numbers are explicitly tagged as such. The complex sphere you mention is one of them. We hope to see complex Stiefel and Grassmann at some point (people showed interest.)

 * When considering complex functions, there are multiple notions of differentiability. You may want to look into the Cauchy-Riemann conditions and Wirtinger calculus:

 * For Manopt (that is to say, for what optimization algorithms will expect to get as a gradient, and for what checkgradient etc. will check against), what counts is that the gradient should provide a first-order approximation of the cost function as explained here:
http://www.manopt.org/tutorial.html#tools  (combine the explanation for checkdiff and checkgradient).

 * For the complex sphere, the inner product can be read in M.inner, here:
   It corresponds to seeing complex matrices of size m-by-n as elements of R^(2mn), with the canonical metric.
 * For the complex circle, the inner product corresponds to seeing the complex plane as R^2, really:

So yes, you can use checkgradient to check derivatives of real functions of complex variables on complex manifolds in Manopt; you just need to be sure what it is that you are checking, and trying to obtain. The notion implemented in Manopt is the one that was useful for all our test cases so far, but if you find that you need a different notion, that'd be interesting to know.
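Concretely, the check can be sketched in NumPy (an illustration of the principle, not checkgradient's code): if G really is the gradient of f at X for the chosen inner product, the first-order Taylor error must shrink like t^2, i.e., with slope 2 on a log-log plot.

```python
import numpy as np

# First-order check in the ambient space of complex matrices: take
# f(X) = real(trace(X' A X)), whose gradient for the real inner product
# <G, D> = real(trace(G' D)) is (A + A') X, and verify that
#   E(t) = |f(X + t D) - f(X) - t <G, D>|   decays like t^2.
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

def f(X):
    return np.real(np.trace(X.conj().T @ A @ X))

def inner(G, D):
    return np.real(np.trace(G.conj().T @ D))

X = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
D = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
G = (A + A.conj().T) @ X                  # candidate gradient

ts = np.logspace(-1, -4, 4)
errs = np.array([abs(f(X + t * D) - f(X) - t * inner(G, D)) for t in ts])
slopes = np.diff(np.log(errs)) / np.diff(np.log(ts))
```

If the candidate gradient were wrong (for instance, off by the factor 2 discussed later in this thread), the error would only decay like t, and the observed slope would be 1 instead of 2.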

I hope this helps; the different notions of derivatives of complex variables can sometimes be confusing (they shouldn't be, though; it'd be nice to give a clearer picture of this at some point).

Cheers,
Nicolas

seu...@gmail.com

unread,
Apr 5, 2015, 11:45:43 PM
to manopt...@googlegroups.com
Hello Nicolas,

Thank you for your reply; it really helps. Actually, I found a minor issue: for some quadratic-form functions of a complex variable, the gradient that the checkgradient function expects is twice the one I compute. Indeed, the gradient under the real identification is often twice the complex-domain derivative, especially for quadratic forms. However, I don't think it's a big problem, because the gradient gives a direction and the step size can absorb this factor.

Furthermore, is it possible to define a complex Euclidean space as a manifold, like the (real) Euclidean space that is already well developed in Manopt?

Nicolas Boumal

unread,
Apr 7, 2015, 5:45:03 AM
to manopt...@googlegroups.com
Hello Xianghao,

Could you give an explicit example of this doubling phenomenon you mention? A specific function f(X) = ... where this happens?
While for steepestdescent (with a specific line search) it is true that only the direction of the gradient matters, for other algorithms the magnitude matters too, so we had better get this right.

As for a complex Euclidean space, it should be easy enough. I gave it a try, and I attached the corresponding geometry to this post. Again: this really comes down to seeing C^(mxn) as just (R^(mxn))^2, with a fancy representation. It is not necessarily what you need; let us know either way. If it works for you, we will add it to the next release of Manopt (which should be coming this month, or early May).

Cheers,
Nicolas
euclideancomplexfactory.m

seu...@gmail.com

unread,
Apr 8, 2015, 5:16:26 AM
to manopt...@googlegroups.com
Hello Nicolas,

First of all, thanks a lot for your code for the complex Euclidean manifold. I had actually tried some coding on it myself, and I found a mistake in my code when I saw yours. Thank you again!

For the complex gradient issue, I am not sure whether I am right, so let me give a simple example: f(X) = tr(A*X) + tr(A*X').

In the complex domain, derivatives are taken with respect to the conjugate of the variable. The derivative of the first term of f(X) should then be zero, and the derivative of the second term is A.'. If you plug this result into Manopt, the checkgradient function gives a negative judgement. Could you please explain this?

If I have any misunderstanding, please tell me and thank you very much!

Best,
Alex

BM

unread,
Apr 8, 2015, 6:38:19 AM
to manopt...@googlegroups.com
Alex, I think that the derivative may not be completely correct. Shouldn't there be a symmetrizing operation involving A'?

Cheers,
BM


Nicolas Boumal

unread,
Apr 8, 2015, 6:44:51 AM
to manopt...@googlegroups.com
I'm with Bamdev here: the gradient should be 2*HermitianPart(A) = A + A' (conjugate-transpose).
Can you check that?

Thanks,
Nicolas

seu...@gmail.com

unread,
Apr 8, 2015, 9:32:58 PM
to manopt...@googlegroups.com
I think for the function f(X) = tr(A*X) + tr(A*X'), when A is complex and X is a real matrix, the gradient should be A + A.' (transpose, not conjugate transpose); I have already checked this with Manopt.

But for the situation where A and X are both complex, I don't think the gradient is similar. Some references can be found on complex derivatives. Even if you use A + A' (conjugate transpose) in the code, the checkgradient function gives a negative judgement. For example, I wrote the following code:

A = randn(5,5)+1i*randn(5,5);
manifold = spherecomplexfactory(5,5);
problem.M = manifold;
problem.cost = @(X) trace(A*X) + trace(A*X');
problem.egrad = @(X) A+A';
checkgradient(problem);

I hope this is not a misunderstanding on my part. Thank you very much.

Yours sincerely,
Alex

Nicolas Boumal

unread,
Apr 9, 2015, 4:41:27 AM
to manopt...@googlegroups.com
Hello Alex,

Good point. But the issue here is that your cost function is not real-valued. We need the cost to be real for optimization to make sense (at least, in the sense that Manopt, and most (all?) optimization packages I know of, can understand).

If you replace the cost by real( ... ) and use A+A' as gradient, you'll be fine:

manifold = spherecomplexfactory(5,5);
problem.M = manifold;
problem.cost  = @(X) real(trace(A*X) + trace(A*X'));
problem.egrad = @(X) A+A';
problem.ehess = @(X, Xdot) zeros(size(Xdot));
checkgradient(problem);

Does that make sense?
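As a numerical sanity check (a NumPy sketch, outside Manopt), one can verify that A + A' matches the directional derivatives of this real-valued cost under the inner product real(trace(G' D)):

```python
import numpy as np

# Check that the Euclidean gradient of
#   f(X) = real(trace(A X) + trace(A X'))
# is G = A + A' under the real inner product <G, D> = real(trace(G' D)).
rng = np.random.default_rng(4)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

def f(X):
    return np.real(np.trace(A @ X) + np.trace(A @ X.conj().T))

def inner(G, D):
    return np.real(np.trace(G.conj().T @ D))

G = A + A.conj().T                     # claimed gradient: A + A'
X = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
D = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

# f is linear in X, so a finite difference recovers the directional
# derivative essentially exactly; it must equal <G, D>.
t = 1e-6
dd = (f(X + t * D) - f(X)) / t
```

Since f is linear, the Hessian is zero, consistent with the ehess in the snippet above.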

Cheers,
Nicolas

seu...@gmail.com

unread,
Apr 9, 2015, 5:23:26 AM
to manopt...@googlegroups.com
Hello Nicolas,

It totally makes sense, thanks.
Furthermore, I found that the relationship between the derivative and the gradient is a little confusing. In the Matrix Cookbook (page 24), the gradient is twice the conjugate complex derivative. However, in the book "Complex-Valued Matrix Derivatives: With Applications in Signal Processing and Communications" (page 77), the two are equal. I think this may be the reason for the doubling phenomenon I mentioned.

Thank you very much!

Best,
Alex

Nicolas Boumal

unread,
Apr 9, 2015, 5:31:03 AM
to manopt...@googlegroups.com
As a rule, one clean way to get the gradient is to obtain the directional derivatives first, and then identify the formula with the definition.

The gradient of f at X, noted grad f(X), is the (unique, in finite dimension at least) vector satisfying the following, for all Xdot (Xdot's are vectors too):

<grad f(X), Xdot> = Df(X)[Xdot]

where <., .> is the chosen inner product. For real matrices, the typical inner product is <A, B> = trace(A.' * B). For complex matrices, a classical real inner product is <A, B> = real(trace(A' * B)); they both correspond to the Frobenius norm: norm(A, 'fro') = sqrt(<A, A>).

Getting the directional derivative is usually simple. See for example this:


(Notice that directional derivatives do not depend on the choice of inner product, whereas the gradient does depend on the inner product.)
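This recipe also explains the "factor 2" between the two book conventions. A minimal worked example (a sketch, not tied to either book's notation):

```python
# f(z) = |z|^2 with z complex. The Wirtinger derivative df/d(conj z) = z,
# while the gradient under the identification C ~ R^2 (real inner product
# Re(conj(a) * b)) is 2z, i.e. exactly twice the conjugate derivative.
# Manopt's checkgradient expects the latter (R^2) gradient.

z = 1.5 - 0.8j
f = lambda w: (w * w.conjugate()).real

grad_r2 = 2 * z                     # gradient for the real identification
wirtinger = z                       # df/d(conj z)

# Check grad_r2 against directional derivatives in two real directions.
# For direction d, we expect Re(conj(grad_r2) * d):
t = 1e-7
dd_real = (f(z + t) - f(z)) / t         # direction 1: expect 2*Re(z) = 3.0
dd_imag = (f(z + 1j * t) - f(z)) / t    # direction i: expect 2*Im(z) = -1.6
```

So neither convention is "wrong"; they simply pair with different inner products, and what checkgradient tests is the pairing <grad f(X), Xdot> = Df(X)[Xdot] above.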

seu...@gmail.com

unread,
May 6, 2015, 2:07:07 AM
to manopt...@googlegroups.com
Hello Nicolas,

Is it possible to disable the display of the solver in Manopt?

Thank you very much!
Alex

Nicolas Boumal

unread,
May 6, 2015, 3:05:58 AM
to manopt...@googlegroups.com
Yes: set options.verbosity = 0;

Let me know if this doesn't work for the solver you use.

Best,
Nicolas

ose...@gmail.com

unread,
Jun 9, 2016, 10:55:10 AM
to Manopt
Hey,
I am really interested in what you were discussing in this post. I have a question which may be trivial. +Nicolas, you mentioned that people showed interest in a complex Stiefel manifold. Is there any difference between using the complex sphere and the complex Stiefel manifold if I want to solve an optimization problem via manifolds?

BR
SAM

BM

unread,
Jun 10, 2016, 1:10:32 AM
to Manopt
Hello Sam, 

Yes, the complex sphere and Stiefel manifolds are different; the difference is the orthogonality of the vectors. The descriptions are in http://www.manopt.org/reference/manopt/manifolds/stiefel/stiefelcomplexfactory.html and http://www.manopt.org/reference/manopt/manifolds/complexcircle/complexcirclefactory.html. Nicolas would be able to shed more light on the application use cases.

Regards,
BM



BM

unread,
Jun 10, 2016, 12:24:23 PM
to Manopt
Hello Sam,

Thanks for the interest in Manopt. 

1- Do you recommend any paper showing the difference between them, with more theory and analysis?

There are papers for each of them, but comparing them directly would be comparing apples and oranges and would not make sense. The sphere case deals with vectors of unit norm. The Stiefel manifold, on the other hand, consists of matrices whose columns have unit norm and are also orthogonal to each other.

The specific use of these depends on the applications. Do you have anything specific in mind?
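The two constraint sets can be contrasted in a few lines of NumPy (a sketch, not Manopt code): the complex sphere of matrices fixes only the overall Frobenius norm, while the complex Stiefel manifold forces the columns to be orthonormal.

```python
import numpy as np

# Contrast (illustration only):
# - complex sphere of matrices: ||X||_F = 1 (one scalar constraint);
# - complex Stiefel manifold:   X' X = I   (orthonormal columns).

rng = np.random.default_rng(5)
Y = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))

X_sphere = Y / np.linalg.norm(Y)    # scale onto the sphere
X_stiefel, _ = np.linalg.qr(Y)      # thin QR: orthonormalize the columns
```

A sphere point generally does not have orthonormal columns, and a Stiefel point has Frobenius norm sqrt(p) rather than 1, so the two feasible sets really are different.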



2- This may be a bit off topic, but among the Manopt Matlab files there is a folder called examples containing truncated_svd.m. Is there any paper discussing this example? I want to understand more, because I called this function to do an optimization and it gave me a very good solution curve, but I don't have much information about it. Could you please pass me some references?

The file is pretty standard. I am not aware of any specific paper that discusses this particular example.

Nicolas might have more specific suggestions. 

Regards,
BM





On Fri, Jun 10, 2016 at 4:49 PM, SAM wrote:
Dear BM
Thank you for replying, it's really helpful. I have some questions:
1- Do you recommend any paper showing the difference between them, with more theory and analysis?
2- This may be a bit off topic, but among the Manopt Matlab files there is a folder called examples containing truncated_svd.m. Is there any paper discussing this example? I want to understand more, because I called this function to do an optimization and it gave me a very good solution curve, but I don't have much information about it. Could you please pass me some references?

BR
SAM

