Dear all,
I have been struggling with an issue for several days. My current platform is a multi-core aarch64 system using Ceres Solver. When enabling Ceres multithreading, the CPU utilization distribution across cores is uneven (70%, 10%, 10%, 10%). I aim to achieve more balanced workload allocation.
After reviewing the documentation and forum discussions (see the References below), I identified the current configuration as Ceres multithreading with BLAS single-threading. To improve CPU balance, I attempted to switch to Ceres single-threading with BLAS multithreading.
I compiled a multithreaded OpenBLAS build and verified its behavior with cblas_dgemm, which confirmed that the number of threads matched the compile-time options. However, when I link this build into Ceres, BLAS multithreading does not activate: the total number of threads stays equal to Ceres' thread count, even with Ceres set to single-threading.
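For anyone who wants to reproduce the thread-count observation, here is a minimal sketch of one way to measure it, assuming a Linux system where /proc/self/status is available. The helper names ParseThreadCount and CurrentThreadCount are hypothetical and are not part of Ceres or OpenBLAS; they only read the "Threads:" field that the kernel reports for the running process.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: extracts the value of the "Threads:" field from text
// in /proc/<pid>/status format. Returns -1 if the field is not present.
int ParseThreadCount(const std::string& status_text) {
  std::istringstream lines(status_text);
  std::string line;
  const std::string key = "Threads:";
  while (std::getline(lines, line)) {
    if (line.compare(0, key.size(), key) == 0) {
      // std::stoi skips the leading tab/space before the number.
      return std::stoi(line.substr(key.size()));
    }
  }
  return -1;
}

// Hypothetical helper: returns the number of OS threads currently alive in
// this process, or -1 if /proc/self/status cannot be read (assumption: Linux).
int CurrentThreadCount() {
  std::ifstream status("/proc/self/status");
  if (!status) {
    return -1;
  }
  std::stringstream buffer;
  buffer << status.rdbuf();
  return ParseThreadCount(buffer.str());
}
```

Calling CurrentThreadCount() before and after a standalone cblas_dgemm call, and again before and after the solve in bundle_adjuster, would show whether worker threads are created in one case but not the other. Note that this counts threads that exist, not threads that are busy, so it complements the per-core utilization figures rather than replacing them.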
Question: Is the "Ceres single-threading & BLAS multithreading" configuration feasible? If so, what are the correct implementation steps?
Implementation:
I tested this configuration in examples/bundle_adjuster with the following approach:
Compilation Options for OpenBLAS:
Added the following flags to OpenBLAS' compilation:
-DUSE_THREAD=1 -DUSE_OPENMP=1 -DNUM_THREADS=8
Compilation Options for Ceres Solver:
Enabled the following flags in Ceres' compilation:
-DCERES_NO_OPENMP=ON
-DBLAS_VENDOR=OpenBLAS
Code Modification in bundle_adjuster:
Updated the thread configuration as follows:
DEFINE_int32(num_threads, 1, "...");
One more thing: I noticed the following line in bundle_adjuster:
DEFINE_string(sparse_linear_algebra_library, "suite_sparse", "...");
Could the inability to enable OpenBLAS multithreading in Ceres be related to the SuiteSparse configuration?
Looking forward to your response. Thanks.