On Tue, 2015-09-15 at 03:03 -0700,
craigwa...@gmail.com wrote:
>
> gives a master with 3 threads and a worker with 3 threads.
So did you try Lisandro's suggestion to use omp_set_num_threads()?
Also, I would check what omp_get_num_threads() returns afterwards from
inside a parallel block to see if it has worked (outside a parallel
region it always returns 1). I provide a sample code below; you can wrap
it using ctypes, Cython, or cffi. I'm not sure what your tools are
reporting, or whether you are interpreting their reports correctly...
Finally, it's generally not a very good idea to make the master use fewer
threads than the rest in an OpenMP / MPI scenario. If you are running it
on one machine, it won't make a difference anyway, because creating the
few extra threads takes hardly any time, and they'll just sit idle. If
you are running on a large cluster, then breaking the symmetry might
lead to unexpected interactions with the queuing system, and it's often
difficult or impossible to request such allocations in the first place.
Sample code:
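    #include <stddef.h>
    #ifdef _OPENMP
    #include <omp.h>
    #endif

    /* Request n threads, then return the team size actually granted. */
    size_t omp_set_num_threads_native(const size_t n) {
        size_t result = 1;  /* without OpenMP there is only one thread */
    #ifdef _OPENMP
        omp_set_num_threads((int)n);
        #pragma omp parallel
        {
            /* Let a single thread write, so there is no race on result. */
            #pragma omp single
            result = (size_t)omp_get_num_threads();
        }
    #endif
        return result;
    }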
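
A related sketch, in case it helps with interpreting what you see:
omp_get_num_threads() reports the size of the *current* team, so in
serial code it returns 1 no matter what was requested, and only a call
made inside a parallel region reflects omp_set_num_threads(). The struct
and the function name below are hypothetical, made up for illustration;
only the omp_* calls are real OpenMP API.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper type, for illustration only. */
struct team_sizes {
    int outside;  /* omp_get_num_threads() called from serial code */
    int inside;   /* omp_get_num_threads() called in a parallel region */
};

/* Request n threads and record the team size seen in both contexts. */
struct team_sizes probe_team_sizes(int n) {
    struct team_sizes r = {1, 1};  /* values when built without OpenMP */
#ifdef _OPENMP
    omp_set_num_threads(n);
    r.outside = omp_get_num_threads();  /* serial code: team of one */
    #pragma omp parallel
    {
        /* One thread records the value; all threads see the same size. */
        #pragma omp single
        r.inside = omp_get_num_threads();
    }
#endif
    return r;
}
```

If "inside" comes back smaller than what you asked for, the runtime has
granted fewer threads than requested (thread limits or dynamic
adjustment can do that), which is worth ruling out before trusting what
an external tool reports.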
--
Sincerely yours,
Yury V. Zaytsev