A nonconvex manifold optimization with complex circle constraint


jueni...@gmail.com

unread,
Dec 16, 2016, 2:40:23 PM12/16/16
to Manopt
I'm working on the following nonconvex manifold optimization problem:
\min_{A\in M} f(A)=||P-A(A^*A)^{-1}A^*P||_{F}^2
where P is a given complex matrix, M is the complex circle manifold M = \{A : |A_{ij}| = 1 \; \forall (i,j)\}, and A^* denotes the Hermitian (conjugate) transpose of A. Moreover, the invariance f(A) = f(AD) holds, where D is any diagonal square matrix with unit-modulus diagonal entries. I would appreciate any suggestions on the following two questions:

1. I derived the closed-form Wirtinger gradient and Hessian of f(A), but it seems that Manopt does not support Wirtinger calculus.

2. Another way to tackle this problem is to let A_{ij} = e^{i\theta_{ij}} and solve the unconstrained problem over \Theta = [\theta_{ij}]. I wonder whether solving the manifold optimization directly has any benefits compared with this parameterized approach.

Nicolas Boumal

unread,
Dec 16, 2016, 3:19:23 PM12/16/16
to Manopt
Hello Juening,

For your questions:

1. complexcirclefactory only handles vectors with unit-modulus entries, not matrices, so you have to reshape manually (or modify complexcirclefactory). The gradient is defined with respect to the real inner product <u, v> = real(u'*v); however you derived your gradient, it must be compatible with this inner product. Did you try checkgradient? Did you check for frequent errors such as a missing factor of 2, a flipped sign, or a missing conjugate or conjugate transpose?
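To illustrate the reshape idea, here is a minimal sketch (n, k, and the random P are placeholders; the cost is the one from the original post):

```matlab
% Sketch: optimize over n-by-k matrices with unit-modulus entries by
% flattening them onto complexcirclefactory(n*k).
n = 8; k = 3;                         % placeholder sizes
P = randn(n, n) + 1i*randn(n, n);     % placeholder data

problem.M = complexcirclefactory(n*k);

% The solver passes a vector x of length n*k; reshape it into the matrix A.
% In MATLAB, A' is the conjugate transpose, i.e., A^*.
cost = @(A) norm(P - A * ((A'*A) \ (A'*P)), 'fro')^2;
problem.cost = @(x) cost(reshape(x, n, k));

% Once problem.egrad (or problem.grad) is supplied, verify it numerically:
% checkgradient(problem);
```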

2. I would certainly go with the complex-circle approach rather than the parameterization e^{i\theta}.

As a side note: I don't see where this kind of problem might come up. Apparently, you look for a basis of a subspace orthogonal to P, such that the vectors of the basis have unit modulus entries. That's rather specific! Would you like to tell us more?

Best,
Nicolas

jueni...@gmail.com

unread,
Dec 16, 2016, 3:39:15 PM12/16/16
to Manopt
Hello Nicolas,

Thanks for your reply. The background of my problem is the decomposition of a given matrix P into two matrices A and D, where each entry of A must satisfy the unit-modulus constraint |A_{ij}| = 1. The problem can then be formulated as

min_{A\in M,D} ||P-AD||_{F}^2

Given a full-column-rank A, this is a least-squares problem in D with optimal solution D_{opt} = (A^*A)^{-1}A^*P. Substituting D_{opt} back into the original problem yields

min_{A\in M} ||P-A(A^*A)^{-1}A^*P||_{F}^2

which is the problem I'm working on. This reformulation reduces the optimization variables from (A, D) to A alone, so I expect the number of saddle points to be reduced as well.
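As a quick numerical sanity check of the substitution (random placeholder data; A must have full column rank, which holds almost surely here):

```matlab
% Verify that D_opt = (A^*A)^{-1} A^* P minimizes ||P - A D||_F^2 over D.
n = 6; k = 3;
P = randn(n, n) + 1i*randn(n, n);
A = exp(2i*pi*rand(n, k));          % random unit-modulus matrix
D_opt = (A'*A) \ (A'*P);            % least-squares solution
f = @(D) norm(P - A*D, 'fro')^2;
% No perturbation of D_opt should decrease the cost:
assert(f(D_opt) <= f(D_opt + 1e-3*(randn(k, n) + 1i*randn(k, n))));
```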

Could you give me some hints on why you prefer the complex-circle approach over the parameterization method? Does the parameterized problem have more stationary points than the un-parameterized version?

Nicolas Boumal

unread,
Dec 16, 2016, 4:15:09 PM12/16/16
to Manopt
Hello Juening,

I expect that your problem would be a lot easier if you optimized ||P-AD||_{F}^2 jointly over A and D rather than substituting one as an expression of the other. I hear your concern about saddle points, but this simpler form is definitely worth a try.

Did you already try that?

In Manopt, you would use productmanifold to work on a product of both complexcirclefactory(n^2) (which represents nxn matrices as vectors of size n^2; use reshape to convert back and forth) and euclideancomplexfactory(n, n) to represent D, a complex matrix of size nxn.

elements.A = complexcirclefactory(n^2);
elements.D = euclideancomplexfactory(n, n);
manifold = productmanifold(elements);

problem.M = manifold;

% X is a structure with fields X.A and X.D on the corresponding manifolds.
problem.cost = @(X) ... ;

% Should return a structure G with fields G.A and G.D holding the classical
% (Euclidean) gradients with respect to A and D.
problem.egrad = @(X) ... ;
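For concreteness, one way to fill in this skeleton might look as follows (a sketch with placeholder data, written as a script with a local function, so R2016b+ or a function file; the cost and gradients are standard least-squares expressions, and checkgradient should still be run to confirm the conventions):

```matlab
n = 4;                                  % placeholder size
P = randn(n) + 1i*randn(n);             % placeholder data

elements.A = complexcirclefactory(n^2); % A stored as a vector of length n^2
elements.D = euclideancomplexfactory(n, n);
problem.M = productmanifold(elements);

problem.cost  = @(X) norm(P - reshape(X.A, n, n) * X.D, 'fro')^2;
problem.egrad = @(X) egradAD(X, P, n);

% checkgradient(problem);
% X = trustregions(problem);

function G = egradAD(X, P, n)
    A = reshape(X.A, n, n);
    R = A * X.D - P;                    % residual A*D - P
    G.A = reshape(2 * R * X.D', [], 1); % classical gradient wrt A, flattened
    G.D = 2 * A' * R;                   % classical gradient wrt D
end
```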


>> Could you give me some hints why you prefer to go with the complex circle approach than the parameterization method? Does the parameterized problem have more stationary points than the un-parameterized version?

I don't know about the number of stationary points. I do not have a strong argument to offer, aside from my personal experience and gut feeling. That's not much :).

Best,
Nicolas

BM

unread,
Dec 16, 2016, 4:19:00 PM12/16/16
to Manopt
Hello Juening and Nicolas, 

Sorry for jumping in. 

I'll leave the motivation for using the complex circle to Nicolas (I agree with him on that). I would like to share my thoughts on the optimization problem itself.

min_{A\in M, D} ||P - AD||_{F}^2
  = min_{A\in M} min_D ||P - AD||_{F}^2
  = min_{A\in M} ||P - A D_A||_{F}^2 =: f(A),

where D_A = (A^*A)^{-1}A^*P is the least-squares solution of the inner problem for a given A.

I understand that you inserted the solution D_A directly into the optimization problem so that the final problem is over A only. However, this might complicate the computation of the gradient and the Hessian, which in fact have simple formulas:

egrad = 2 (A D_A - P) D_A^*, with D_A = (A^*A)^{-1} A^* P.

Similarly,

ehess = 2 (A D_A - P) D_A_dot^* + 2 (A_dot D_A + A D_A_dot) D_A^*, where
D_A = (A^*A)^{-1} A^* P,
D_A_dot = (A^*A)^{-1} (A_dot^* P - A_dot^* A D_A - A^* A_dot D_A),
and A_dot is the search direction supplied as an argument.

Are you using these formulas? 

Careful: check that the gradient is compatible with the inner product in use.
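In MATLAB these formulas might translate as follows (a sketch, e.g., in files egradf.m and ehessf.m; A is assumed to have full column rank, and Adot is the direction supplied by the solver):

```matlab
% Euclidean gradient and Hessian-vector product of
% f(A) = ||P - A D_A||_F^2 with D_A = (A^*A)^{-1} A^* P.
function G = egradf(A, P)
    D = (A'*A) \ (A'*P);
    G = 2 * (A*D - P) * D';
end

function H = ehessf(A, Adot, P)
    AtA  = A'*A;
    D    = AtA \ (A'*P);
    Ddot = AtA \ (Adot'*P - Adot'*A*D - A'*Adot*D);
    H = 2*(A*D - P)*Ddot' + 2*(Adot*D + A*Ddot)*D';
end
```

These would be supplied as problem.egrad and problem.ehess (after the same flatten/reshape as above); Manopt should convert them to Riemannian quantities automatically when the factory provides egrad2rgrad/ehess2rhess.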

Regards,
BM

jueni...@gmail.com

unread,
Dec 16, 2016, 4:29:09 PM12/16/16
to Manopt
Hello Nicolas,

Thanks. I tried an alternating minimization method, optimizing A and D alternately. Its performance is not bad, but I think there is still room to improve the algorithm, since it sometimes converges to a saddle point. I will try the method you suggested later.

Best regards
juening

jueni...@gmail.com

unread,
Dec 16, 2016, 4:38:46 PM12/16/16
to Manopt
Hello BM,

Thanks for sharing your thoughts. Indeed, I derived the gradient and Hessian using the method you posted. The main issue is that I'm looking for an algorithm that is guaranteed to converge to a local minimum. I read some papers on saddle-escaping algorithms, e.g., "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization", but it seems these methods do not work on my problem. Since both you and Nicolas prefer to deal with the complex circle directly, I will try the trust-region method on manifolds and see what happens.
