Non-deterministic parameter re-ordering

347 views
Skip to first unread message

Richard Stebbing

unread,
Sep 17, 2013, 6:13:37 AM9/17/13
to ceres-...@googlegroups.com
Hi all,

I recently ran into an issue that running Ceres on the same problem, single-threaded, produced slightly different results per run. This was so noticeable that the number of iterations to convergence could vary by 10-20 between runs, and the final minimum was also different. After checking through all residuals, and Jacobians, and ensuring they are identical at Evaluate, I found the problem is actually as a result of non-deterministic parameter re-ordering.

As background, the problem I am working on has a Jacobian with a large block-diagonal structure on the left (the "E" block) and then a dense section on the right (the "F" block). To setup the Problem, I first add the parameter blocks manually and then add the residuals in the correct order so that they are lexicographically ordered and this block structure is established. I also set the parameter blocks pertaining to "E" to be group 0, and the parameter blocks to "F" to be in group 1. Given this setup I did not expect any re-ordering to take place, however currently in Ceres this is not guaranteed.

In `Solve`->`SolverImpl::TrustRegionSolve`->`SolverImpl::CreateReducedProgram`, `original_problem` initially has `parameter_blocks_` where each parameter block has its correct (original) index. However, after the initial call to `RemoveFixedBlocksFromProgram` all `index` values are set to 1. Now, while `parameter_blocks_` still has its initial ordering, the actual recorded indices are lost. Going forward, in `ReorderProgramForSchurTypeLinearSolver`->`ApplyUserOrdering`, `parameter_blocks_` is cleared and the parameter blocks are then re-added based on the groups. This ensures that the user-ordering of "E" and "F" blocks is respected, but it doesn't guarantee deterministic ordering within each group. This is because parameter blocks are added to `parameter_blocks_` based on their ordering within the std::set<double*> for each group. Since this ordering is based off the address of each parameter block, this can be quite arbitrary especially if the variables are heap-allocated. As a result, even if column re-ordering takes place afterwards, the final permutations of columns can be different because of this different initialisation. Personally, I think this is a bug because the permutation shouldn't depend on the address of the parameter blocks.

I will try to put a small example together to illustrate this, but I am currently pretty busy and this has consumed a pretty substantial amount of time already. In saying that, I don't mean to sound unappreciative as I am enjoying all of the other benefits of this framework and I massively appreciate it being available at all!
 
Please let me know if you need any more information/clarification.
 
Kind Regards,
 
Richard

 
 
 
 

Sameer Agarwal

unread,
Sep 17, 2013, 12:27:14 PM9/17/13
to ceres-...@googlegroups.com
Richard,
First of all, thats fantastic bit of debugging. 

While the reason this non-determinism is showing up is because of change in the non-deterministic ordering behaviour, the real cause is the poor conditioning of your problem. Generally speaking (and by that I mean exact arithmetic) these orderings should not have any real effect on the quality of your solution but if the Jacobian is particularly poorly conditioned, even small changes like the order in which a variable is eliminated will affect the value of the solution. This is one of the joys of floating point arithmetic.

As to the determinism of the ordering, I will look to see if I can improve the stability of the preprocessing of the problem. While that will help to make things more deterministic, it won't change the fact that the output you are getting is extremely sensitive to initial conditions. For example, if you enable threading in the linear solver, you are probably going to get different results every execution depending on how the threads are scheduled. 

Sameer





--
You received this message because you are subscribed to the Google Groups "Ceres Solver" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ceres-solver...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ceres-solver/e0565c2b-4fda-4cdd-b3f8-3aea696ac07a%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Sameer Agarwal

unread,
Sep 17, 2013, 12:59:45 PM9/17/13
to ceres-...@googlegroups.com
One more thing.

As background, the problem I am working on has a Jacobian with a large block-diagonal structure on the left (the "E" block) and then a dense section on the right (the "F" block). To setup the Problem, I first add the parameter blocks manually and then add the residuals in the correct order so that they are lexicographically ordered and this block structure is established. I also set the parameter blocks pertaining to "E" to be group 0, and the parameter blocks to "F" to be in group 1. Given this setup I did not expect any re-ordering to take place, however currently in Ceres this is not guaranteed.

The schur complement matrix if you are using a sparse solver will get re-ordered to get a fill reducing ordering. So just because you added parameters in a certain order and created a group 1 does not guarantee that the ordering will be preserved. Indeed this can't be the case, otherwise we won't be able to do AMD on the schur complement at all. 

Sameer

Richard Stebbing

unread,
Sep 17, 2013, 4:43:10 PM9/17/13
to ceres-...@googlegroups.com
Hi Sameer,
 
Thanks for the feedback!
 
You are absolutely right about the conditioning of the problem. To be honest, the final minimum wasn't that different between runs, and it wasn't having any real effect on the efficacy of the model or validity of the solution. I was just worried that I had done something silly on my end ...
 
I also understand about the intentional re-ordering by Ceres, and I'm glad for this! However, I just expected the re-ordering to be repeatable across runs which isn't always the case if the initial ordering is quite different.
 
Anyway, thanks again for getting back to me so quickly. I definitely appreciate it!
 
Cheers,
 
R

Sameer Agarwal

unread,
Sep 17, 2013, 5:05:28 PM9/17/13
to ceres-...@googlegroups.com
Richard,
As soon as I have some time, I am going to see what I can do to preserve as much of the original problem structure through the preprocessor as possible.
Sameer



Richard Stebbing

unread,
Sep 17, 2013, 5:53:12 PM9/17/13
to ceres-...@googlegroups.com
Not a problem. For the mean time I just added a #ifdef to prevent applying the user-ordering which essentially leaves `parameter_blocks` untouched (which is fine for my problem right now).
 
R

de...@dronedeploy.com

unread,
Apr 13, 2017, 6:39:36 PM4/13/17
to Ceres Solver
Sorry to resurrect an old thread, but I just ran into this issue when making an incremental SLAM system where the dynamically allocated keyframe objects resulted in the pointer order of the keyframe parameters (a transform) sometimes not matching the frame order. This resulted in non-deterministic operation of the solver as the parameter ordering would depend not on the input data but on the memory allocation environment at the time. Perhaps some kind of warning can be put in the documentation (or maybe it's already there and I missed it)? 

It would be nice if there was some automated way to detecting this, but given that users can add parameter blocks in any order I'm not sure how this would happen.

Thanks,

Sameer Agarwal

unread,
Jun 26, 2017, 3:58:38 AM6/26/17
to Ceres Solver
Devin,
Sorry I have been meaning to reply to this thread for a while, but I am confused. What is it that you are asking for here?
Sameer


Reply all
Reply to author
Forward
0 new messages