Dear all,
I am computing the steady-state solution for the Hamiltonian describing a system of N emitters coupled to a single-mode cavity, as a function of the cavity losses. To reduce the calculation time I enabled multithreading in OpenBLAS, as suggested in your documentation, by adding the following lines at the beginning of the script:
import os
os.environ['OPENBLAS_NUM_THREADS'] = '8'
os.environ['MKL_NUM_THREADS'] = '8'
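As far as I understand, these variables only take effect if they are set before numpy (and therefore the BLAS library) is first imported. Here is a minimal sketch of a guard for that, using only the standard library. The helper name `set_blas_threads` is hypothetical and is not part of numpy, QuTiP or OpenBLAS:

```python
import os
import sys


def set_blas_threads(n):
    """Set the BLAS thread-count variables and report whether it was too late.

    Hypothetical helper: returns True if numpy was already imported, in which
    case the BLAS library has most likely already read its settings and the
    new values may be ignored.
    """
    already_loaded = "numpy" in sys.modules
    for var in ("OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(n)
    return already_loaded


# Call this at the very top of the script, before any numpy/scipy/qutip import.
too_late = set_blas_threads(8)
if too_late:
    print("Warning: numpy was imported before the thread limit was set.")
```

Returning a flag instead of raising an error keeps the script running while still making it visible when the ordering is wrong.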
I verified that numpy is compiled against OpenBLAS, but something strange happens: the calculation time seems to increase with the number of threads used!
Here are some timings for N=4 when using a single thread:
real 0m27.511s
user 0m26.885s
sys 0m0.597s
or when using 24 threads (the max on our workstation):
real 0m49.670s
user 8m24.114s
sys 9m12.644s
So it seems that the CPUs spend a lot of time in the sys state, and the total calculation takes longer.
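To make the comparison concrete, the two sets of timings above can be reduced to total CPU time, the average number of busy cores (CPU time divided by wall-clock time), and the share of CPU time spent in sys. This is just a small sketch of that arithmetic. The function names `to_seconds` and `summarize` are my own and are not part of any library:

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '8m24.114s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def summarize(real, user, sys_time):
    """Return total CPU seconds, average busy cores and the sys share."""
    real_s, user_s, sys_s = (to_seconds(t) for t in (real, user, sys_time))
    cpu_s = user_s + sys_s
    return {
        "cpu_s": cpu_s,                  # user + sys
        "busy_cores": cpu_s / real_s,    # average cores kept busy
        "sys_fraction": sys_s / cpu_s,   # share of CPU time spent in sys
    }


# Timings copied from the two runs quoted above (N=4).
single = summarize("0m27.511s", "0m26.885s", "0m0.597s")
multi = summarize("0m49.670s", "8m24.114s", "9m12.644s")

for label, stats in (("1 thread", single), ("24 threads", multi)):
    print(f"{label}: cpu={stats['cpu_s']:.1f}s "
          f"busy_cores={stats['busy_cores']:.2f} "
          f"sys_fraction={stats['sys_fraction']:.2f}")
```

With these numbers the 24-thread run uses roughly 1057 s of CPU time against about 27.5 s for the single-thread run, and more than half of it is sys time.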
Do you have any idea what is going on and how I can solve this problem?
Thanks a lot,
cheers
Sofia