Hi Edwin,
A few things to consider. Firstly, more cores are not always
better if there is insufficient work to distribute between them,
given that there is always some overhead in increasing the
number of threads. You may find that, even if you have 24 cores,
using only 12 of them, say, is still faster. Or use the
opportunity of having more cores to run with more particles, if
that's something that will help you.
Secondly, do you have 24 physical cores, or has this number been doubled by hyperthreading? In my experience, it's worth setting the number of threads to the number of physical cores, which is half the number of logical cores reported by Linux if you have hyperthreading enabled.
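If it helps, here is a rough Python sketch for checking the physical core count on Linux. It assumes the usual x86 layout of /proc/cpuinfo, where each logical processor has "physical id" and "core id" fields; the function names are my own and nothing here is part of LibBi. On systems without those fields it just falls back to the logical count.

```python
def count_physical_cores(cpuinfo_text):
    """Return (logical, physical) core counts from /proc/cpuinfo-style text.

    Assumes the x86 Linux format: blank-line-separated blocks, one per
    logical processor, each with "physical id" and "core id" fields.
    """
    logical = 0
    cores = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        if "physical id" in fields and "core id" in fields:
            # A physical core is identified by its socket and core id;
            # hyperthreaded siblings share the same pair.
            cores.add((fields["physical id"], fields["core id"]))
    # Fall back to the logical count if the fields are missing.
    return logical, (len(cores) or logical)


def read_core_counts(path="/proc/cpuinfo"):
    """Read the counts from the running Linux system."""
    with open(path) as f:
        return count_physical_cores(f.read())
```

Calling read_core_counts() on the machine in question returns the two counts; if the first is twice the second, hyperthreading is on and the second is the number to pass to --nthreads.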
Thirdly, make sure you have the right command-line options set if you're at the point where performance matters. A good rule of thumb:
--disable-assert --enable-avx --nthreads 12
Adjust the number of threads accordingly, and for an older CPU, use --enable-sse instead of --enable-avx.
It can take a little tuning to get the optimal configuration. Cache sizes and memory speed can matter a lot too, depending on the model.
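One way to do that tuning is simply to time the same run at several thread counts. The following is a rough Python sketch rather than anything built into LibBi: the helper name is my own, and it only assumes that the command accepts --nthreads as above.

```python
import subprocess
import time


def time_thread_counts(base_cmd, thread_counts, repeats=1):
    """Run base_cmd once per thread count and return {count: best seconds}.

    base_cmd is a list of arguments; "--nthreads N" is appended for each N.
    """
    results = {}
    for n in thread_counts:
        cmd = list(base_cmd) + ["--nthreads", str(n)]
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            # check=True raises if the command fails, so a broken run
            # is not mistaken for a fast one.
            subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
            best = min(best, time.perf_counter() - start)
        results[n] = best
    return results
```

For example, time_thread_counts(["libbi", "filter", "@filter.conf", "--disable-assert", "--enable-avx"], [4, 8, 12, 24]) would compare four settings, where filter.conf is a placeholder for your own configuration file. Note that the first run includes compilation time if the model has not been built yet, so use repeats=2 or more, or do a warm-up run first.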
Finally, if you have the Intel C++ compiler available, use it.
Its threading overhead seems significantly less than that of gcc.
Cheers,
Lawrence
On that note and just out of interest, if `nthreads>1` but `nparticles==1` does any parallelisation occur?
Yes, it does.