Optimization over sphere segment

Niklas Koep

unread,

Aug 18, 2017, 5:40:20 AM8/18/17

to Manopt

Hey everyone,

let's say you want to optimize over a patch of the unit sphere given in terms of the intersection of a polyhedral cone C = {x | Ax < 0} (for some matrix A of size m-by-n) and S^{n-1}. Would something like this be possible with manopt?

I had a look at how the retraction for the sphere is derived in Prop. 4.1.2 of "Optimization algorithms on matrix manifolds". At least at a cursory glance it seems that M = S^{n-1} \cap {x | Ax < 0} = S^{n-1} \cap C and N = {x | x > 0} with phi(x,r) = xr would still apply. In that case the image of M x N under phi would be the open cone C itself, i.e., phi(M x N) = C, making it an open subset of R^n, and the inverse of phi given by phi^{-1}(u) = (u/||u||, ||u||). However, the retraction constructed in this way clearly doesn't work since x + xi easily yields points outside of C for x in M and xi in R^n. Hence, simply projecting x + xi onto S^{n-1} does not produce elements in M.

Which (probably very obvious) issue am I overlooking?

Thank you in advance for your help!

Best regards,

Niklas

Nicolas Boumal

unread,

Aug 18, 2017, 7:45:23 AM8/18/17

to manopt...@googlegroups.com

Hello Niklas,

Thank you for your question.

The short answer is that, currently, Manopt does not offer ready-to-use tools to solve optimization problems on a manifold with additional constraints (as is your case) -- but we are working on it this summer! If you would like to share some data for your problem (cost function, matrix A), we would certainly be interested in taking a look.

Until then, here are a couple ideas for how one could approach your problem, in the order that I would try them based on ease of implementation more than based on expected results:

- splitting methods: if it is easy enough to project to the cone C, you could artificially introduce a new variable y, under the seemingly pointless constraint x = y, and optimize: min f(x) + I_C(y) s.t. x on the sphere and x = y, where I_C(y) is 0 if y is in the cone C, and infinite otherwise. Then, you can optimize by alternating between x and y, and also an extra variable which will correspond to Lagrange multipliers for the constraint x = y. This rough sketch is explained much more precisely in the MADMM paper: https://link.springer.com/chapter/10.1007/978-3-319-46454-1_41 (also on arxiv) -- I didn't check this in detail but I expect this may work.

- penalty methods: you could remove the constraint that x \in C, and instead have a penalty term in the cost function, where you want to simultaneously minimize your cost and you want to minimize max([Ax ; 0]) (the largest entry of that vector which contains the entries of Ax and also 0): in practice, you minimize a weighted sum of both, and there is a need to iterate over the weights to find "the right one". Notice that max(Ax) is a nonsmooth function, and currently manopt doesn't do nonsmooth: we're also working on that. In the meantime, you can replace max([Ax ; 0]) by a smooth surrogate, using the log-sum-exp trick: https://hips.seas.harvard.edu/blog/2013/01/09/computing-log-sum-exp/. (On that last webpage's notation, z is an approximation for max(x), and one would typically scale down x by epsilon to increase differences, and scale the result back by epsilon.).

- If it is easy to initialize inside the feasible set, you could also try barrier methods, that is: you would have a penalty which goes to infinity as you approach the boundary of the cone. Something like sum(log(-Ax)) (but I didn't think about this much). Here also there would be a weight parameter to update progressively.

Note: there is an additional issue I have overlooked here, namely, that your cone is open. That's an issue for optimization, since your problem may then have no solution at all. I wrote my answer assuming you would be happy with a solution in {x : Ax <= 0} (the closure).

I hope this helps,

Nicolas

Niklas Koep

unread,

Aug 25, 2017, 8:16:21 AM8/25/17

to Manopt

Dear Nicolas,

thanks for the quick reply (and sorry for my late response). I should have clarified that I was mainly looking for a way to modify the sphere manifold to deal with the situation (if possible). Based on your response, I assume it's not as simple as I had hoped for ;) As for the openness of the cone: that was a mistake on my part; it was indeed supposed to be closed.

I've actually played around with your first and third suggestion before, in particular the log-barrier approach, but so far I wasn't very successful (which might boil down to a bad parameter choice). I guess the situation also isn't helped by the fact that the objective function in the original problem is ||x||_1 which I had to approximate using the pseudo-Huber function.

I don't think there's anything particular in the structure of the cone which would enable particularly fast projections (think A defined in terms of random hyperplanes). I've actually found a paper which presents a somewhat fast algorithm for projection on polyhedral cones but I think a proper projection will still remain the biggest bottleneck.

In any case, I'll revisit these strategies and try to post some example code next week.

Thanks for your help!

Niklas

Nicolas Boumal

unread,

Aug 25, 2017, 8:31:49 AM8/25/17

to Manopt

Thanks for the feedback and clarifications Niklas -- looking forward to your results!