Problems of using multiple threads to speed up linear solver?

934 views
Skip to first unread message

Rui Yu

unread,
Mar 4, 2015, 12:44:34 PM3/4/15
to ceres-...@googlegroups.com
Hi all,

I am working on a dense bundle adjustment optimization problem at the moment, and trying to use multiple threads to speed up. But it turns out that there is not much difference in the time spent in the linear solver using 1 thread or 8 thread.

Here is the full report of the solver summary:

1 thread case:

iter      cost      cost_change  |gradient|   |step|    tr_ratio  tr_radius  ls_iter  iter_time  total_time
   0  6.228076e+04    0.00e+00    8.33e+06   0.00e+00   0.00e+00  1.00e+08        0    8.52e-02    2.68e-01
   1  3.191745e+04    3.04e+04    6.13e+06   2.35e+01   1.03e+00  1.00e+08        1    1.28e+00    1.55e+00
   2  1.501028e+04    1.69e+04    3.92e+06   2.09e+01   9.76e-01  1.00e+08        1    1.84e+00    3.39e+00
   3  6.627890e+03    8.38e+03    2.06e+06   3.05e+01   9.78e-01  1.00e+08        1    1.85e+00    5.24e+00
   4  3.739790e+03    2.89e+03    9.00e+05   2.60e+01   9.07e-01  1.00e+08        1    1.84e+00    7.08e+00
   5  3.164137e+03    5.76e+02    3.44e+05   1.23e+01   8.31e-01  1.00e+08        1    1.84e+00    8.92e+00
   6  3.092726e+03    7.14e+01    1.29e+05   4.57e+00   7.30e-01  1.00e+08        1    1.84e+00    1.08e+01

Solver Summary (v 1.10.0-lapack-suitesparse-cxsparse-openmp)

                                     Original                  Reduced
Parameter blocks                        84998                    84998
Parameters                              85002                    85002
Residual blocks                        113160                   113160
Residual                               253662                   253662

Minimizer                        TRUST_REGION

Sparse linear algebra library    SUITE_SPARSE
Trust region strategy     LEVENBERG_MARQUARDT

                                        Given                     Used
Linear solver          SPARSE_NORMAL_CHOLESKY   SPARSE_NORMAL_CHOLESKY
Threads                                     1                        1
Linear solver threads                       1                        1

Cost:
Initial                          6.228076e+04
Final                            3.092726e+03
Change                           5.918804e+04

Minimizer iterations                        6
Successful steps                            6
Unsuccessful steps                          0

Time (in seconds):
Preprocessor                           0.1833

  Residual evaluation                  0.1423
  Jacobian evaluation                  0.5402
  Linear solver                       11.5674
Minimizer                             12.3533

Postprocessor                          0.0089
Total                                 12.5455

Termination:                      CONVERGENCE (Parameter tolerance reached. Relative step_norm: 5.050391e-05 <= 1.000000e-04.)

Time = 12.7127 sec.



8 threads case:


iter      cost      cost_change  |gradient|   |step|    tr_ratio  tr_radius  ls_iter  iter_time  total_time
   0  6.228076e+04    0.00e+00    8.33e+06   0.00e+00   0.00e+00  1.00e+08        0    5.79e-02    2.38e-01
   1  3.191745e+04    3.04e+04    6.13e+06   2.35e+01   1.03e+00  1.00e+08        1    1.26e+00    1.50e+00
   2  1.501028e+04    1.69e+04    3.92e+06   2.09e+01   9.76e-01  1.00e+08        1    1.86e+00    3.35e+00
   3  6.627890e+03    8.38e+03    2.06e+06   3.05e+01   9.78e-01  1.00e+08        1    1.84e+00    5.20e+00
   4  3.739790e+03    2.89e+03    9.00e+05   2.60e+01   9.07e-01  1.00e+08        1    1.85e+00    7.04e+00
   5  3.164137e+03    5.76e+02    3.44e+05   1.23e+01   8.31e-01  1.00e+08        1    1.85e+00    8.89e+00
   6  3.092726e+03    7.14e+01    1.29e+05   4.57e+00   7.30e-01  1.00e+08        1    1.84e+00    1.07e+01

Solver Summary (v 1.10.0-lapack-suitesparse-cxsparse-openmp)

                                     Original                  Reduced
Parameter blocks                        84998                    84998
Parameters                              85002                    85002
Residual blocks                        113160                   113160
Residual                               253662                   253662

Minimizer                        TRUST_REGION

Sparse linear algebra library    SUITE_SPARSE
Trust region strategy     LEVENBERG_MARQUARDT

                                        Given                     Used
Linear solver          SPARSE_NORMAL_CHOLESKY   SPARSE_NORMAL_CHOLESKY
Threads                                     8                        8
Linear solver threads                       8                        8

Cost:
Initial                          6.228076e+04
Final                            3.092726e+03
Change                           5.918804e+04

Minimizer iterations                        6
Successful steps                            6
Unsuccessful steps                          0

Time (in seconds):
Preprocessor                           0.1805

  Residual evaluation                  0.0675
  Jacobian evaluation                  0.3485
  Linear solver                       11.8125
Minimizer                             12.3484

Postprocessor                          0.0102
Total                                 12.5391

Termination:                      CONVERGENCE (Parameter tolerance reached. Relative step_norm: 5.050391e-05 <= 1.000000e-04.)

Time = 12.7313 sec.


I wonder what might be the reason ? Am I doing something wrong ?


Best

Rui Yu

Sameer Agarwal

unread,
Mar 4, 2015, 12:55:46 PM3/4/15
to ceres-...@googlegroups.com
SPARSE_NORMAL_CHOLESKY is not threaded.

if you are doing bundle adjustment use one of the bundle adjustment oriented linear solvers
DENSE_SCHUR, SPARSE_SCHUR and there you should see the speedup.

The jacobian eval is fast enough that you will not see a lot of speedup.

Sameer


--
You received this message because you are subscribed to the Google Groups "Ceres Solver" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ceres-solver...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ceres-solver/ceffbd3c-8575-420d-83ee-70c333ff180e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kichang Kim

unread,
Mar 10, 2015, 10:34:11 PM3/10/15
to ceres-...@googlegroups.com
Hi. You mean that general nonlinear problem can't obtain multithreading but can only bundle adjustment?

I have large scale problem for image processing. With SPARSE_NORMAL_CHOLESKY, It always uses 10% cpu resources for solving LE.

Does DENSE_NORMAL_CHOLESKY or CGNR takes multithreading?

2015年3月5日木曜日 2時55分46秒 UTC+9 Sameer Agarwal:

Sameer Agarwal

unread,
Mar 10, 2015, 10:43:06 PM3/10/15
to ceres-...@googlegroups.com

Yes that is exactly what I am saying.
And no the other solvers do not support threading either.


Reply all
Reply to author
Forward
0 new messages