Multithreading issues

Manuel Hofer

Oct 9, 2017, 5:41:28 AM
to Ceres Solver
Hi

We have an interesting issue when running Ceres multi-threaded (OpenMP), i.e. with the options "num_threads" and "num_linear_solver_threads" set to a value greater than one.

When using a statically built Ceres 1.12, it works fine on Machine A (the one Ceres was built on). But when we use the static lib on Machine B it only uses one thread. Linking and compiling works fine though.
Both machines A and B are identical Lenovo ThinkStations with Ubuntu 16.04 LTS.

When we instead build Ceres 1.12 or 1.13 as a static lib on machine B, the result runs multi-threaded on neither A nor B. Again, compiling and linking works fine. OpenMP is also correctly found by the Ceres CMake script.

So it would appear that there is an issue with the build environment on machine B. However, there are no complaints by CMake that OpenMP is not found. It builds perfectly fine, without complaints.
Also, other executables work fine with OpenMP. Only Ceres doesn't use multiple threads...

Any ideas what could be wrong? We are running out of ideas :-)

Thanks!
Best,
Manuel

P.S.: we use ceres::SPARSE_SCHUR and ceres::SUITE_SPARSE

Johannes Schönberger

Oct 9, 2017, 6:23:47 AM
to ceres-...@googlegroups.com
Hi Manuel,

Did you set any of the OpenMP environment variables (e.g. OMP_NUM_THREADS) on one machine but not the other?

Cheers,
Johannes
> --
> You received this message because you are subscribed to the Google Groups "Ceres Solver" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to ceres-solver...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/ceres-solver/f5b70b82-ecb7-4599-9944-64170b2f484c%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
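
One way to answer the question above on both machines is to print the relevant variables from inside the process itself, since a shell profile or launcher script may set them without anyone noticing. This is only a minimal sketch: the helper names EnvOrUnset and ThreadingEnvironment are made up for illustration, and the list of variables is an assumption about what is worth checking (the OMP_* names come from the OpenMP specification, the other two are read by OpenBLAS and MKL respectively).

```cpp
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Returns the value of an environment variable, or "(unset)" if it is absent.
// (Hypothetical helper, not part of Ceres or OpenMP.)
std::string EnvOrUnset(const char* name) {
  const char* value = std::getenv(name);
  return value ? std::string(value) : std::string("(unset)");
}

// Collects a few variables that commonly influence thread counts.
// The selection is an assumption; extend it for your own BLAS/LAPACK build.
std::vector<std::pair<std::string, std::string>> ThreadingEnvironment() {
  const char* names[] = {"OMP_NUM_THREADS", "OMP_DYNAMIC", "OMP_THREAD_LIMIT",
                         "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"};
  std::vector<std::pair<std::string, std::string>> result;
  for (const char* name : names) {
    result.emplace_back(name, EnvOrUnset(name));
  }
  return result;
}
```

Printing the returned pairs at program start on machine A and machine B and diffing the output would show whether the two environments really are identical.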

Manuel Hofer

Oct 9, 2017, 6:28:02 AM
to Ceres Solver
Hi

No, we didn't set any specific environment variables.
Also, OpenMP in general seems to work on all machines, just not within Ceres.

E.g. we call parallel for-loops multiple times in our codebase like this:

omp_set_dynamic(0);
omp_set_num_threads(num_threads);
#pragma omp parallel for
for(...){}

Best,
Manuel
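
To confirm that OpenMP is really active in a given translation unit, rather than inferring it from CPU usage, the pattern above can be turned into a small check. This is a sketch under assumptions: ObservedOpenMPThreads is a hypothetical helper name, and the function falls back to 1 when the file is compiled without OpenMP support (for example without -fopenmp on GCC), which is exactly the silent failure mode worth ruling out.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// Returns the number of threads that actually entered a parallel region
// after `requested` threads were asked for. Returns 1 if this translation
// unit was compiled without OpenMP support. (Hypothetical helper.)
int ObservedOpenMPThreads(int requested) {
  int observed = 1;
#ifdef _OPENMP
  omp_set_dynamic(0);               // do not let the runtime shrink the team
  omp_set_num_threads(requested);   // request a fixed team size
#pragma omp parallel
  {
#pragma omp single
    observed = omp_get_num_threads();  // size of the team that really ran
  }
#else
  (void)requested;  // OpenMP pragmas are ignored, so only one thread runs
#endif
  return observed;
}
```

Calling ObservedOpenMPThreads(4) in the same executable that links the static Ceres library, on both machines, would show whether the application side of the build behaves the same. It says nothing about how the Ceres library itself was compiled.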

Alex Stewart

Oct 9, 2017, 8:29:24 AM
to ceres-...@googlegroups.com
Do you mean that in Solver::Summary::FullReport() you see something like:

                                        Given                     Used
Threads                                     N                        1

or that you see N for both in the above, but that via some other metric you’ve determined that only one thread is being used?  I’m assuming the former.

-Alex

Manuel Hofer

Oct 9, 2017, 8:39:23 AM
to Ceres Solver
Hi

We determined that only one thread is running by checking the CPU usage and by comparing the runtimes between the two machines.

I will check the full report, thanks for the hint.

Best,
Manuel
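
The CPU-usage comparison can also be made from inside the program, which removes the guesswork of reading top. The sketch below divides process CPU time by wall-clock time around a piece of work: a ratio near 1 suggests one busy thread, a ratio near 4 suggests four. It assumes a Linux/glibc system, where std::clock() reports CPU time summed over all threads of the process (on some other platforms it reports wall time instead). CpuToWallRatio and BusyLoop are hypothetical names used only here.

```cpp
#include <chrono>
#include <ctime>

// Runs `work` and returns (process CPU time) / (wall-clock time).
// Assumes std::clock() accumulates CPU time over all threads (true on Linux).
template <typename Work>
double CpuToWallRatio(Work work) {
  const std::clock_t cpu_start = std::clock();
  const auto wall_start = std::chrono::steady_clock::now();
  work();
  const std::clock_t cpu_end = std::clock();
  const auto wall_end = std::chrono::steady_clock::now();

  const double cpu_seconds =
      static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
  const double wall_seconds =
      std::chrono::duration<double>(wall_end - wall_start).count();
  return wall_seconds > 0.0 ? cpu_seconds / wall_seconds : 0.0;
}

// Single-threaded busy loop, useful as a baseline measurement.
// `volatile` keeps the compiler from optimizing the loop away.
double BusyLoop(long iterations) {
  volatile double sink = 0.0;
  for (long i = 0; i < iterations; ++i) {
    sink = sink + 1e-9 * static_cast<double>(i);
  }
  return sink;
}
```

Wrapping the solve call in the lambda passed to CpuToWallRatio, and comparing against the BusyLoop baseline, gives a number that can be logged on both machines.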

Sameer Agarwal

Oct 9, 2017, 10:08:15 AM
to ceres-...@googlegroups.com
btw, you should check which code is actually using threading.
It is possible that Ceres is not using threading but your underlying BLAS implementation is, and that's the difference between the two machines.

Manuel Hofer

Oct 10, 2017, 3:28:13 AM
to Ceres Solver
Okay, I checked the full report:

                                        Given                     Used
Linear solver                    SPARSE_SCHUR             SPARSE_SCHUR
Threads                                     4                        4
Linear solver threads                       4                        4
Linear solver ordering              AUTOMATIC               3443,20423
Schur structure                         3,3,3                    d,d,d

Still, it only runs at 100% CPU usage on machine B, versus 400% on machine A.

@Sameer: do you happen to know how to check if BLAS is parallelizing?

Best,
Manuel

Sameer Agarwal

Oct 24, 2017, 11:19:27 PM
to ceres-...@googlegroups.com
Manuel,
Sorry this thread fell through the cracks. Having read your earlier email more carefully, I am not sure if BLAS is at fault here. Something else is going on. Did you make any progress in diagnosing it?
Sameer


Manuel Hofer

Oct 30, 2017, 6:52:54 AM
to Ceres Solver
Hi

No, we did not yet make any progress regarding this problem...

Best,
Manuel

Sameer Agarwal

Nov 6, 2017, 12:52:54 AM
to ceres-...@googlegroups.com
weird, I am not sure what advice to offer here.

Giuseppe Trapani

Nov 10, 2017, 7:11:59 AM
to Ceres Solver
Hi Manuel,

I ran into a similar "problem". I am using LAPACK as the dense linear algebra library for the linear solver. Using top with per-thread display for my process, I saw that in some situations only the problem evaluation was multithreaded while the linear solver was not, and vice versa.

For me it all depends on the BLAS / LAPACK build, and I am still trying to find the best compilation and linking strategy.

For example, I managed to get a multithreaded underlying linear solver by compiling OpenBLAS with OpenMP and then hard linking it to Ceres when building Ceres. Unfortunately this makes my problem evaluation single-threaded, because OpenBLAS multithreading conflicts with Ceres multithreading (as the installation guides of both libraries report).

I know this is not much help, but if you want my two cents: check how your dependencies are linked together. Since you are using SuiteSparse, check which linear algebra libraries it is linked against (the ones you compiled yourself, or the ones that come out of the box from the Linux repositories).
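
A rough way to see which shared libraries a running process has actually loaded is to read /proc/self/maps on Linux and filter for names such as "blas", "lapack" or "gomp". The sketch below does that. It is an illustration only: PathFromMapsLine and LoadedLibrariesMatching are hypothetical helpers, the keyword list is up to you, and statically linked code (such as a static Ceres build) will not show up because it is part of the executable itself.

```cpp
#include <fstream>
#include <set>
#include <string>
#include <vector>

// Extracts the mapped file path from one line of /proc/<pid>/maps, or
// returns an empty string if the line has no path (anonymous mapping).
// The fields before the path (addresses, perms, offset, dev, inode)
// never contain '/', so the first '/' marks the start of the path.
std::string PathFromMapsLine(const std::string& line) {
  const std::string::size_type slash = line.find('/');
  return slash == std::string::npos ? std::string() : line.substr(slash);
}

// Returns the distinct mapped files of the current process whose path
// contains at least one of `keywords`. Linux-specific; returns an empty
// set if /proc/self/maps cannot be opened.
std::set<std::string> LoadedLibrariesMatching(
    const std::vector<std::string>& keywords) {
  std::set<std::string> result;
  std::ifstream maps("/proc/self/maps");
  std::string line;
  while (std::getline(maps, line)) {
    const std::string path = PathFromMapsLine(line);
    if (path.empty()) continue;
    for (const std::string& keyword : keywords) {
      if (path.find(keyword) != std::string::npos) {
        result.insert(path);
        break;
      }
    }
  }
  return result;
}
```

Printing LoadedLibrariesMatching({"blas", "lapack", "omp"}) after the first solve on each machine would show whether both processes pick up the same BLAS / LAPACK and OpenMP runtime libraries.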