Inner iterations best practices


Austin Conner

Jun 29, 2017, 6:07:31 PM
to Ceres Solver
Hello,

I am using Ceres Solver and trying to make the most of the inner iteration feature. First of all, some statements I believe are true (correct me if I'm wrong):

1. The solver runs faster the better its information about jacobian sparsity is.
2. Information about jacobian sparsity is only passed to ceres by dividing the parameters into blocks. (For parameters in a block, ceres must assume dense jacobians)
3. Inner iterations are another way of exploiting the structure of the problem (In particular, if parameters can be divided into groups over which the residuals are linear, or more generally where the hessian entries are zero pairwise within a group).
4. Inner iteration groups are more effective the larger they are (subject to the property above).
5. Inner iteration groups are passed to ceres using a 'parameter block ordering,' which is an ordered partition of the parameter blocks (a minimal sketch of this API follows the list).
6. A restriction on a legal inner iteration group is that the parameter blocks of a group cannot co-occur in a particular residual.
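For reference, here is a minimal sketch of how such an ordering is specified; x and y are placeholder parameter blocks that are assumed not to share any residual block (this is illustrative, not code from the thread):

#include <ceres/ceres.h>

// Sketch: enable inner iterations and put two parameter blocks that never
// co-occur in a residual block into the same group.
void ConfigureInnerIterations(ceres::Solver::Options* options,
                              double* x, double* y) {
  options->use_inner_iterations = true;
  ceres::ParameterBlockOrdering* ordering = new ceres::ParameterBlockOrdering;
  // Group 0: legal only because x and y do not share a residual block.
  ordering->AddElementToGroup(x, 0);
  ordering->AddElementToGroup(y, 0);
  options->inner_iteration_ordering.reset(ordering);
}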

My question then is about the tension between the last point and the previous points. Suppose there is a set of parameters for some problem for which all the residuals are linear. However, a particular residual only depends on a small subset of these parameters, so there's a lot of jacobian sparsity as well. (I am trying to solve such a problem.) Trying to exploit the jacobian sparsity requires splitting this parameter set into different blocks. The last point then prevents using an inner iteration to optimize these parameters simultaneously, because then the small individual blocks co-occur in some of the residuals. Alternatively, I could have made this parameter set into a large parameter block to use inner iterations, but then I'm not exploiting the jacobian sparsity.

My question is basically: why the restriction on parameter block orderings? If it's because ceres has no other way to know the hessian has the claimed property, is there a way to assure ceres of this fact? Is it possible to change a line of code in the ceres source to remove this restriction forever?

Thanks,
Austin

Sameer Agarwal

Jun 30, 2017, 10:13:29 AM
to ceres-...@googlegroups.com
Austin,
My answers to your questions are inline. But a high level question first.

What kind of problems are you trying to solve, in terms of number of variables, size of parameter blocks, number of residuals/residual blocks. 

On Thu, Jun 29, 2017 at 3:07 PM Austin Conner <aconn...@gmail.com> wrote:
Hello,

I am using Ceres Solver and trying to make the most of the inner iteration feature. First of all, some statements I believe are true (correct me if I'm wrong):

1. The solver runs faster the better its information about jacobian sparsity is.

yes, generally speaking this should be true. 
 
2. Information about jacobian sparsity is only passed to ceres by dividing the parameters into blocks. (For parameters in a block, ceres must assume dense jacobians)

yes.
 
3. Inner iterations are another way of exploiting the structure of the problem (In particular, if parameters can be divided into groups over which the residuals are linear, or more generally where the hessian entries are zero pairwise within a group).

Sort of. Inner iterations are meant to generalize the notion of variable projection (which requires partial linearity) to fully non-linear problems. The idea is quite simple: we intersperse iterations of the trust region algorithm, which works on the full problem, with inner iterations performing coordinate descent on individual parameter blocks.
 
4. Inner iteration groups are more effective the larger they are (subject to the property above).

Not necessarily true, because in the limit you are just re-creating the trust region algorithm's work. So, for example, if you have a single parameter block in your problem, there is no point in doing inner iterations.
 
5. Inner iteration groups are passed to ceres using a 'parameter block ordering,' which is an ordered partition of the parameter blocks.

yes.
 
6. A restriction on a legal inner iteration group is that the parameter blocks of a group cannot co-occur in a particular residual.

Yes. But they can occur across inner iteration groups.
 

My question then is on the tension between the last point and the previous points.

As you can see, there isn't a tension between these two points.
 
Suppose there is a set of parameters for some problem for which all the residuals are linear. However, a particular residual only depends on a small subset of these parameters, so there's a lot of jacobian sparsity as well. (I am trying to solve such a problem.) Trying to exploit the jacobian sparsity requires splitting this parameter set into different blocks. The last point then prevents using an inner iteration to optimize these parameters simultaneously, because then the small individual blocks co-occur in some of the residuals. Alternatively, I could have made this parameter set into a large parameter block to use inner iterations, but then I'm not exploiting the jacobian sparsity.

My question is basically: why the restriction on parameter block orderings?

Because we do coordinate descent on each parameter block in parallel, and for that it is important that, at any given point in time, the set of parameter blocks we are doing descent on do not share a residual block, because coordinate descent assumes that all other parameters are held constant (otherwise it would violate the descent property).

Sameer

If it's because ceres has no other way to know the hessian has the claimed property, is there a way to assure ceres of this fact? Is it possible to change a line of code in the ceres source to remove this restriction forever?

Thanks,
Austin


Austin Conner

Jun 30, 2017, 12:59:55 PM
to Ceres Solver
Thanks for the reply, Sameer. I'm trying to use Ceres to solve systems of polynomial equations; the most recent one in my study has 812 equations in 130 complex variables. My strategy so far has been to make a fine-grained blocking of the variables, grouping only the real and imaginary parts of a particular variable into a parameter block. Same with the residuals: a residual block is simply the real and imaginary parts of an equation. This arrangement is useful not only for the typical sparsity of the jacobian, but also for later steps when I try to 'sparsify' the solution by fixing some variables and optimizing over the remaining ones (a great Ceres feature).
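As an illustration of this blocking (a sketch with a made-up equation, not Austin's actual code), a single complex equation such as x*y - c = 0 becomes a 2-dimensional residual block over two (real, imaginary) parameter blocks:

#include <ceres/ceres.h>

// One complex equation x * y - c = 0, with each complex variable stored as a
// 2-vector (real, imaginary). Residual block size 2: real and imaginary parts.
struct ComplexProductResidual {
  ComplexProductResidual(double c_re, double c_im) : c_re_(c_re), c_im_(c_im) {}

  template <typename T>
  bool operator()(const T* const x, const T* const y, T* residual) const {
    // (x_re + i x_im)(y_re + i y_im) - (c_re + i c_im)
    residual[0] = x[0] * y[0] - x[1] * y[1] - T(c_re_);
    residual[1] = x[0] * y[1] + x[1] * y[0] - T(c_im_);
    return true;
  }

  const double c_re_, c_im_;
};

// Usage:
//   problem.AddResidualBlock(
//       new ceres::AutoDiffCostFunction<ComplexProductResidual, 2, 2, 2>(
//           new ComplexProductResidual(1.0, 0.0)),
//       nullptr, x, y);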

For my particular problem, some of my colleagues try the so-called "alternating least squares" method, which applies to what I believe you call a separable non-linear least squares problem. The method is simply to alternate between the groups of linear variables, optimizing each exactly with linear methods while keeping the others constant. I had mistaken the inner iteration feature for a generalization of this method (the documentation makes reference there to separable problems). Hence my belief in point 4.

Let me ask this, then. Suppose I do want to do something similar to ALS where, after each trust region iteration, I want to descend further on one or more subgroups of variables (and I don't insist on doing this in parallel).

1. Can this be achieved by a Solve call in an IterationCallback?
2. To what extent will the trust region or line search solvers solve a linear least squares problem efficiently if the problem has this structure? Should I instead detect this case and apply a specialized solver?

Austin

Sameer Agarwal

Jul 2, 2017, 3:40:01 PM
to ceres-...@googlegroups.com

Thanks for the details Austin, my comments are inline.

On Fri, Jun 30, 2017 at 9:59 AM Austin Conner <aconn...@gmail.com> wrote:
Thanks for the reply, Sameer. I'm trying to use Ceres to solve systems of polynomial equations; the most recent one in my study has 812 equations in 130 complex variables. My strategy so far has been to make a fine-grained blocking of the variables, grouping only the real and imaginary parts of a particular variable into a parameter block. Same with the residuals: a residual block is simply the real and imaginary parts of an equation. This arrangement is useful not only for the typical sparsity of the jacobian, but also for later steps when I try to 'sparsify' the solution by fixing some variables and optimizing over the remaining ones (a great Ceres feature).

This sounds very reasonable.

 
For my particular problem, some of my colleagues try the so-called "alternating least squares" method, which applies to what I believe you call a separable non-linear least squares problem. The method is simply to alternate between the groups of linear variables, optimizing each exactly with linear methods while keeping the others constant. I had mistaken the inner iteration feature for a generalization of this method (the documentation makes reference there to separable problems). Hence my belief in point 4.

I think you are mixing two things. Alternating least squares and VARPRO (variable projection) are sort of related but not quite the same thing. And yes, the inner iterations feature is trying to approximate the VARPRO algorithm, albeit poorly, for the separable use case.

My suggestion, if you actually have linear variables, is to just eliminate them and have a dense polynomial system purely in terms of the non-linear variables, and feed it to ceres. Your system is small enough that sparsity is not an issue. DENSE_QR as the linear solver will solve it very competently.
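A sketch of that configuration, assuming a ceres::Problem named "problem" that already holds the reduced system in the nonlinear variables only:

#include <ceres/ceres.h>

void SolveReducedProblem(ceres::Problem* problem) {
  ceres::Solver::Options options;
  options.linear_solver_type = ceres::DENSE_QR;
  options.minimizer_progress_to_stdout = true;
  ceres::Solver::Summary summary;
  ceres::Solve(options, problem, &summary);
}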
 
Let me ask this, then. Suppose I do want to do something similar to ALS where, after each trust region iteration, I want to descend further on one or more subgroups of variables (and I don't insist on doing this in parallel).

1. Can this be achieved by a Solve call in an IterationCallback?

No, you are not allowed to modify the problem during the solve.
 
2. To what extent will the trust region or line search solvers solve a linear least squares problem efficiently if the problem has this structure? Should I instead detect this case and apply a specialized solver?

See my suggestion above. You can implement a cost function which only exposes the nonlinear variables and their jacobian to the solver, and every time the solver asks you to evaluate, you can do a solve for the linear variables and eliminate them.
More precisely, consider F(x, y) = 0, where x is a set of linear variables and y is a set of nonlinear variables. In this case

F(x, y) = A(y) x + B(y) = 0,

so create a cost function which only exposes y as the variables. Every time you get some y to evaluate, first evaluate A(y) and B(y); then

x = -pinv(A(y)) B(y)

is the solution for the linear variables, and

[I - A(y) pinv(A(y))] B(y)

is the residual. Its derivative w.r.t. y is the jacobian, which can be obtained by going back to the original VARPRO algorithm:

G. H. Golub and V. Pereyra, "The Differentiation of Pseudo-Inverses and Nonlinear Least Squares Problems Whose Variables Separate," SIAM Journal on Numerical Analysis, Vol. 10, No. 2 (Apr. 1973), pp. 413-432.

or a more modern reference:

http://www.cs.umd.edu/~oleary/software/varpro.pdf
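A minimal sketch of this scheme, assuming you can evaluate A(y) and B(y) yourself (eval_A and eval_B below are hypothetical, problem-specific callbacks); for brevity it obtains the jacobian by numeric differentiation rather than the analytic formula from the references above:

#include <functional>
#include <ceres/ceres.h>
#include <Eigen/Dense>

// Residual r(y) = [I - A(y) pinv(A(y))] B(y); the linear variables
// x = -pinv(A(y)) B(y) are solved for and eliminated inside each evaluation.
struct ProjectedResidual {
  // Problem-specific callbacks (hypothetical): build A(y) and B(y) for a given y.
  std::function<Eigen::MatrixXd(const double*)> eval_A;
  std::function<Eigen::VectorXd(const double*)> eval_B;

  bool operator()(double const* const* parameters, double* residuals) const {
    const double* y = parameters[0];
    const Eigen::MatrixXd A = eval_A(y);
    const Eigen::VectorXd B = eval_B(y);
    // Least-squares solve of A x = -B, i.e. x = -pinv(A(y)) B(y).
    const Eigen::VectorXd x = A.colPivHouseholderQr().solve(-B);
    // Projected residual: what remains after the optimal linear fit.
    Eigen::Map<Eigen::VectorXd>(residuals, B.size()) = A * x + B;
    return true;
  }
};

// Usage (num_y and num_residuals are placeholders for the actual sizes):
//   auto* cost = new ceres::DynamicNumericDiffCostFunction<ProjectedResidual>(
//       new ProjectedResidual{eval_A, eval_B});
//   cost->AddParameterBlock(num_y);
//   cost->SetNumResiduals(num_residuals);
//   problem.AddResidualBlock(cost, nullptr, y);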

Sameer
 

Ajay Karthik

Jul 13, 2022, 5:36:00 AM
to Ceres Solver
Hi Sameer,

I have a small question about point 6 of the best practices for inner iteration groups, just to get more clarity on this topic.
6. A restriction on a legal inner iteration group is that the parameter blocks of a group cannot co-occur in a particular residual.
  • Does it restrict us to a whole residual block, or just to a particular residual within a vector of residuals?
Thank you,
Ajay

Sameer Agarwal

Jul 14, 2022, 9:15:18 AM
to ceres-...@googlegroups.com
On Wed, Jul 13, 2022 at 2:36 AM Ajay Karthik <ajayka...@gmail.com> wrote:
Hi Sameer,

I have a small question about point 6 of the best practices for inner iteration groups, just to get more clarity on this topic.
6. A restriction on a legal inner iteration group is that the parameter blocks of a group cannot co-occur in a particular residual.
  • Does it restrict us to a whole residual block, or just to a particular residual within a vector of residuals?

I am not sure I understand your question.

Sameer

 

Ajay Karthik

Jul 15, 2022, 3:12:20 PM
to Ceres Solver
Hi Sameer,

Sorry about the question; I was able to answer it myself.
I was asking whether the parameter blocks in an inner iteration group could co-occur in the same residual block as long as they affect different entries of the residual vector.
But I don't think that is possible, since ceres evaluates a complete residual block at a time, so the restriction applies to whole residual blocks.

Thanks,
Ajay.
