Multiple "solve" needed


Emmanuel Iarussi

Jun 30, 2017, 11:56:26 AM
to Ceres Solver
Hello, 
I'm using Ceres to solve a non-linear least-squares problem. Given a starting polygonal mesh, I basically optimize its shape using some size and orientation constraints (smooth normals, tile size, etc.). My solver options look like this:

options.minimizer_type = ceres::LINE_SEARCH;
options.line_search_interpolation_type = ceres::CUBIC;
options.line_search_direction_type =  ceres::LBFGS;  
options.use_approximate_eigenvalue_bfgs_scaling = true;
options.max_lbfgs_rank = 10000;
options.max_num_iterations = 50000;
options.function_tolerance = 1e-5; 
options.parameter_tolerance = 1e-10; 
options.gradient_tolerance = 1e-7;

Here is my problem: for some examples I've noticed that the solver stops with the state "CONVERGENCE", but if I hit "solve" again (using this new point as the starting point), Ceres just goes on for a while until it converges (this time for real). Sometimes this restarting is needed more than once. I'm not an expert in non-linear solving, so it would really help me if someone could give me an intuition of what's happening. Ideally I would like to solve the problem with a single "solve" call. Why is this restarting needed?
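
In code, what I'm doing is roughly the following outer loop (problem and options are my existing ceres::Problem and ceres::Solver::Options; the restart cap and the cost-change threshold are arbitrary):

ceres::Solver::Summary summary;
for (int restart = 0; restart < 10; ++restart) {
  ceres::Solve(options, &problem, &summary);
  // Stop once another Solve() no longer reduces the cost noticeably.
  if (summary.initial_cost - summary.final_cost < 1e-8) {
    break;
  }
}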
Thanks in advance.

Sameer Agarwal

Jun 30, 2017, 11:59:17 AM
to Ceres Solver
Emmanuel, first question: why are you using the line search solver? Doesn't the trust region solver work for you in this case?

That said, what you are describing sounds like a bug to me. My guess is that this is because the LBFGS direction is poor and restarting the solver resets it to an identity matrix. 
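
For example (just a sketch, not necessarily the right configuration for your problem), a minimal trust region setup would be something like:

ceres::Solver::Options options;
options.minimizer_type = ceres::TRUST_REGION;                // the Ceres default
options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;  // DENSE_QR is also fine for ~2000 parameters
options.max_num_iterations = 500;
options.minimizer_progress_to_stdout = true;

ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);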

Sameer



Emmanuel Iarussi

Jun 30, 2017, 12:19:35 PM
to Ceres Solver


On Friday, June 30, 2017 at 12:59:17 (UTC-3), Sameer Agarwal wrote:
Emmanuel, first question: why are you using the line search solver? Doesn't the trust region solver work for you in this case?
 
Hmm... I don't really have an argument here. I was used to working with line search, so I kept it. I'm not really sure how to set up all the solver options for trust region. A quick test I just did doesn't work: the problem converges in a couple of iterations but the residual is insanely large. Why should I use trust region instead of line search?
 
That said, what you are describing sounds like a bug to me. My guess is that this is because the LBFGS direction is poor and restarting the solver resets it to an identity matrix. 

Is this related to my cost function? I'm using auto-diff. I've checked cost values during the iterations and they look ok. 

Emmanuel Iarussi

Jun 30, 2017, 3:56:10 PM
to Ceres Solver
To add some more information, this is what my last iteration before convergence looks like:

18: f: 2.085903e+01 d: 3.82e-02 g: 2.43e+01 h: 7.04e-04 s: 5.80e-03 e: 9 it: 8.97e-02 tt: 1.23e+00

TOTAL RESIDUAL: 20.859

Iteration: 18

Cost change: 0.0381875

Gradient norm: 164.299

Step norm: 0.000704222


Then the summary:


Solver Summary (v 1.12.0-eigen-(3.3.0)-lapack-suitesparse-(4.5.3)-cxsparse-(3.1.9)-no_openmp)

                                     Original                  Reduced
Parameter blocks                          720                      720
Parameters                               2160                     2160
Residual blocks                           100                      100
Residuals                                 100                      100

Minimizer                       LINE_SEARCH
Line search direction         LBFGS (10000)
Line search type                CUBIC WOLFE

                                        Given                     Used
Threads                                     8                        1

Cost:
Initial                          1.988935e+02
Final                            2.085903e+01
Change                           1.780345e+02

Minimizer iterations                       19
Line search steps                          87

Time (in seconds):
Preprocessor                           0.0014
  Residual evaluation                  0.0000
    Line search cost evaluation        0.0000
  Jacobian evaluation                  1.2085
    Line search gradient evaluation    0.9040
  Line search polynomial minimization  0.0003
Minimizer                              1.3456
Postprocessor                          0.0000
Total                                  1.3470

Termination: CONVERGENCE (Parameter tolerance reached. Relative step_norm: 0.000000e+00 <= 0.000000e+00.)

SUCCESS: true


And finally, here is what the first two iterations output when restarting the solver:


0: f: 2.085903e+01 d: 0.00e+00 g: 2.43e+01 h: 0.00e+00 s: 0.00e+00 e: 0 it: 1.57e-02 tt: 1.60e-02

TOTAL RESIDUAL: 20.8590327

Iteration: 0

Cost change: 0

Gradient norm: 164.299095

Step norm: 0


1: f: 2.033823e+01 d: 5.21e-01 g: 5.26e+01 h: 5.41e-03 s: 3.30e-05 e: 12 it: 1.24e-01 tt: 1.46e-01

TOTAL RESIDUAL: 20.3382269

Iteration: 1

Cost change: 0.520805814

Gradient norm: 190.316416

Step norm: 0.00541373248


My question is: why is Ceres able to continue minimizing when the step norm is 0 and the solver has already converged?

Sameer Agarwal

Jul 2, 2017, 3:47:28 PM
to Ceres Solver
Emmanuel,

In Iteration 0 no step is computed; it is just the setup for the solver, so I would not worry about the step norm being zero. Iteration 1 is the first real iteration.

The bigger issue here is that the LBFGS approximation seems to have reached a state where it is producing poor search directions.

Is there a way for you to share your code/optimization problem so that I can run it myself and see what is going on?

Sameer


Sameer Agarwal

Jul 3, 2017, 10:16:06 AM
to Ceres Solver
Another thing: I suspect your problem is poorly conditioned, given the way the trust region solver behaved. It may be worth looking into this.


Alex Stewart

Jul 4, 2017, 10:38:34 AM
to ceres-...@googlegroups.com
So there are several things here:

1) This:

Line search direction LBFGS (10000)

means you are using max_lbfgs_rank = 10000.  That’s a big number, and actually means that:

a) You’re actually (albeit indirectly) running BFGS (which I assume is what you wanted).
b) You’re actually allocating more memory than you would if you ran with BFGS, in fact considerably more, as we pre-allocate max_lbfgs_rank x num_parameters in LowRankInverseHessian, whereas the BFGS line search direction only allocates num_parameters x num_parameters.

2) Whether you’re running with LBFGS or BFGS, the Hessian (approximation) at termination != the initial Hessian (approximation) if the solver is restarted.  By definition, at termination the Hessian is built from the last max_lbfgs_rank iterations (LBFGS) or all previous iterations (BFGS), whereas when the solver is (re)started the inverse Hessian is initialised with the identity (optionally scaled if approximate eigenvalue scaling is used).  No internal state is maintained within Ceres between solves (irrespective of the solver).  As a result, the Hessian approximations are obviously not the same, typically they would not even be close to being the same and so the solver would usually continue to take steps when restarted.

From this:

Termination:  CONVERGENCE (Parameter tolerance reached. Relative step_norm: 0.000000e+00 <= 0.000000e+00.)

I would assume that your problem isn’t well conditioned, in that it looks like your accumulated LBFGS Hessian approximation isn’t yielding a descent direction.  Note that depending upon your problem, it can be the case that using a smaller max_lbfgs_rank may help here.

Intuitively, BFGS (or LBFGS if max_lbfgs_rank > num_iterations) uses information from places which can be ‘far away’ from the current operating point (i.e. from many iterations ago).  This information may not be helpful, particularly if your problem is not well conditioned, and so may actually be resulting in a poor Hessian approximation.  This is almost certainly what’s happening given that when you re-run the solver it takes quite a few steps.
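
For concreteness, a sketch of the two alternatives discussed above (requesting BFGS directly per point 1, or keeping LBFGS with a much smaller history; the best rank is problem-dependent):

options.minimizer_type = ceres::LINE_SEARCH;

// Option A: run BFGS directly. max_lbfgs_rank only applies to LBFGS and is ignored here.
options.line_search_direction_type = ceres::BFGS;

// Option B: keep LBFGS but limit the history to the Ceres default of 20 correction pairs.
// options.line_search_direction_type = ceres::LBFGS;
// options.max_lbfgs_rank = 20;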

-Alex

Emmanuel Iarussi

Jul 26, 2017, 3:00:19 PM
to ceres-...@googlegroups.com
Hello Alex, 
Thanks a lot for your detailed response, and sorry for the late reply. I've been trying to figure out what the problem was. I'll answer inline:

2017-07-04 11:38 GMT-03:00 Alex Stewart <alex...@gmail.com>:
So there are several things here:

1) This:

Line search direction LBFGS (10000)

means you are using max_lbfgs_rank = 10000.  That’s a big number, and actually means that:

a) You’re actually (albeit indirectly) running BFGS (which I assume is what you wanted).
b) You’re actually allocating more memory than you would if you ran with BFGS, in fact considerably more, as we pre-allocate max_lbfgs_rank x num_parameters in LowRankInverseHessian, whereas the BFGS line search direction only allocates num_parameters x num_parameters.

I see, I should directly use BFGS then.

2) Whether you’re running with LBFGS or BFGS, the Hessian (approximation) at termination != the initial Hessian (approximation) if the solver is restarted.  By definition, at termination the Hessian is built from the last max_lbfgs_rank iterations (LBFGS) or all previous iterations (BFGS), whereas when the solver is (re)started the inverse Hessian is initialised with the identity (optionally scaled if approximate eigenvalue scaling is used).  No internal state is maintained within Ceres between solves (irrespective of the solver).  As a result, the Hessian approximations are obviously not the same, typically they would not even be close to being the same and so the solver would usually continue to take steps when restarted.

Makes total sense now. Is there a way to flush this internal state on each iteration?  
 

From this:

Termination:  CONVERGENCE (Parameter tolerance reached. Relative step_norm: 0.000000e+00 <= 0.000000e+00.)

I would assume that your problem isn’t well conditioned, in that it looks like your accumulated LBFGS Hessian approximation isn’t yielding a descent direction.  Note that depending upon your problem, it can be the case that using a smaller max_lbfgs_rank may help here.

Intuitively, BFGS (or LBFGS if max_lbfgs_rank > num_iterations) uses information from places which can be ‘far away’ from the current operating point (i.e. from many iterations ago).  This information may not be helpful, particularly if your problem is not well conditioned, and so may actually be resulting in a poor Hessian approximation.  This is almost certainly what’s happening given that when you re-run the solver it takes quite a few steps.


In my problem I'm computing normals and comparing them. I've discovered that Ceres "crashes" because during the optimization some of the mesh triangles get distorted too much (due to some energy terms), and their normals get flipped or vanish. I've tried to prevent Ceres from taking these "wrong steps" by returning false from the cost function when I detect that this has happened. An alternative I've also tried is to return a very high cost whenever normals/geometry are flipped. In both cases I get the same result from Ceres: it stops in a local minimum with very high energy. 
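
Roughly, the "return false" variant looks like the sketch below. This is a simplified stand-in for my actual residuals (here just penalizing the deviation of a face normal from a fixed target normal, with v0/v1/v2 and target_normal standing in for my real vertex blocks and data); the degenerate-triangle check is the relevant part:

#include <cmath>
#include <ceres/ceres.h>

struct NormalResidual {
  explicit NormalResidual(const double* target_normal) {
    for (int i = 0; i < 3; ++i) target_[i] = target_normal[i];
  }

  template <typename T>
  bool operator()(const T* const v0, const T* const v1, const T* const v2,
                  T* residual) const {
    using std::sqrt;
    // Un-normalized face normal n = (v1 - v0) x (v2 - v0).
    T e1[3], e2[3], n[3];
    for (int i = 0; i < 3; ++i) {
      e1[i] = v1[i] - v0[i];
      e2[i] = v2[i] - v0[i];
    }
    n[0] = e1[1] * e2[2] - e1[2] * e2[1];
    n[1] = e1[2] * e2[0] - e1[0] * e2[2];
    n[2] = e1[0] * e2[1] - e1[1] * e2[0];
    const T norm = sqrt(n[0] * n[0] + n[1] * n[1] + n[2] * n[2]);
    // Triangle has (nearly) collapsed: declare this evaluation invalid.
    if (norm <= T(1e-12)) {
      return false;
    }
    for (int i = 0; i < 3; ++i) {
      residual[i] = n[i] / norm - T(target_[i]);
    }
    return true;
  }

  double target_[3];
};

// Added to the existing problem with one 3-vector parameter block per vertex:
// problem.AddResidualBlock(
//     new ceres::AutoDiffCostFunction<NormalResidual, 3, 3, 3, 3>(
//         new NormalResidual(target_normal)),
//     nullptr, v0, v1, v2);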

I'm not sure what the best way to tackle this problem is. Is there any example involving normals and geometry that I can compare against mine? Thanks again!

Emmanuel

-Alex


Alex Stewart

Jul 30, 2017, 8:00:57 AM
to ceres-...@googlegroups.com
Emmanuel,

Makes total sense now. Is there a way to flush this internal state on each iteration?  

Why would you want to do this?  It’s possible, in that you could set max_num_iterations to 1, but doing so would be exactly equivalent to gradient descent, as resetting the internal state on each iteration would result in the inverse Hessian (approximation) always being the identity.  For a well-posed problem this should be significantly slower than BFGS, as the selected descent direction will not capture the shape of the cost function well.
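
Concretely, that restart-every-iteration scheme would amount to an outer loop like the sketch below (a sketch only, and not something I would recommend):

ceres::Solver::Options options;
options.minimizer_type = ceres::LINE_SEARCH;
options.line_search_direction_type = ceres::BFGS;
options.max_num_iterations = 1;  // one minimizer iteration per Solve() call

ceres::Solver::Summary summary;
for (int i = 0; i < 1000; ++i) {
  // Each Solve() starts from a fresh (identity) inverse Hessian approximation,
  // so this is just line-search gradient descent in disguise.
  ceres::Solve(options, &problem, &summary);
  if (summary.termination_type == ceres::CONVERGENCE) {
    break;
  }
}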

In my problem I'm computing normals and comparing them. I've discovered that Ceres "crashes" because during the optimization some of the mesh triangles get distorted too much (due to some energy terms), and their normals get flipped or vanish. I've tried to prevent Ceres from taking these "wrong steps" by returning false from the cost function when I detect that this has happened. An alternative I've also tried is to return a very high cost whenever normals/geometry are flipped. In both cases I get the same result from Ceres: it stops in a local minimum with very high energy. 

Firstly, it’s important to understand that Ceres will only ever give you a local optimum.  If your problem has a single minimum, then in that special case it will return the global minimum, but in general you will only get a local minimum - which one you get (if there are multiple) will be a function of your initial conditions.

I don’t know the specifics of your problem, but I think Sameer is right that your current formulation is poorly conditioned.  Andrew Fitzgibbon and Richard Stebbing did some work on fitting/deforming meshes which used Ceres [1][2] - how they structured their problem might be a good place to start for ideas about how to reformulate yours.

-Alex


