Dear David,
Without knowing exactly which components of deal.II you are using, the first thing I would check is whether you are using a multi-threaded BLAS or multithreading within deal.II. As a first test, you could set
export DEAL_II_NUM_THREADS=1
export OMP_NUM_THREADS=1
or disable multithreading when compiling deal.II (and link against serial BLAS/LAPACK libraries), and then check again. The behavior you describe looks like a combination of good speedup in some parts of the solver and little to none in other parts.
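To illustrate how a mix of well-scaling and non-scaling parts limits the overall speedup, here is a rough sketch based on Amdahl's law. The helper name amdahl_speedup and the fractions 0.90 and 0.50 are made up for illustration; they are not measurements of your code.

```shell
# Amdahl's law: overall speedup on n cores when a fraction p of the
# single-core run time scales ideally and the rest does not scale at all.
# (amdahl_speedup is a hypothetical helper, not part of deal.II.)
amdahl_speedup() {
  awk -v p="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 1 / ((1 - p) + p / n) }'
}

amdahl_speedup 0.90 48   # 90% of the run time scales: prints 8.42
amdahl_speedup 0.50 48   # 50% of the run time scales: prints 1.96
```

So even if most of the solver scales perfectly, a modest non-scaling part (for example, one that is slowed down by oversubscribed threads) is enough to keep the total speedup in the single digits.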
My second suspicion would be memory bandwidth limitations within
the node. But even if you are fully memory bound, you should see a
speedup of roughly 10-12x when going from 1 to 48 cores on a
node (or a bit less if the processor has full turbo frequency
turned on and thus clocks higher with 1 core loaded than with all
24 cores loaded per socket), whereas you observe much less than
that.
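As a back-of-the-envelope sketch of where such a factor comes from: for a fully memory-bound code, the speedup is capped by the ratio of the node's total memory bandwidth to what a single core can draw. The helper name and the bandwidth values below (200 GB/s per node, 18 GB/s per core) are hypothetical placeholders, not numbers for your machine; a STREAM-type benchmark would give the real ones.

```shell
# Upper bound on the speedup of a fully memory-bound code:
# min(number of cores, node bandwidth / single-core bandwidth).
# (membound_speedup is a hypothetical helper; bandwidths in GB/s are made up.)
membound_speedup() {
  awk -v n="$1" -v node="$2" -v core="$3" \
    'BEGIN { s = node / core; if (s > n) s = n; printf "%.1f\n", s }'
}

membound_speedup 48 200 18   # prints 11.1
membound_speedup 4 200 18    # prints 4.0 (not yet bandwidth limited)
```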
Best,
Martin
--
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en