Hi
I assume you have a working CP2K build that can take advantage of MPI, OpenMP, and GPUs.
I would try the linear scaling setup. That way you will be limited by matrix multiplications, which
in turn will make full use of the GPUs.
I would also use at least 32, preferably 64, MPI tasks and only 2 or 4 OpenMP threads per task.
Have a look at tests/xTB/regtest-4 for some example inputs.
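As a rough sketch of what that task/thread split could look like for the job below: the node count and cores per node are taken from the hardware listed in the original message, while reading "64 MPI tasks" as the total across both nodes, and pairing it with 4 threads, is an assumption. The script is a dry run that only prints the launch line; the variable names are made up for illustration.

```shell
#!/bin/bash
# Dry-run sketch: checks a task/thread layout and prints the launch line.
# Nothing is actually launched here.

NODES=2               # from select=2 in the original script
CORES_PER_NODE=128    # 2 x 64-core EPYC 7763 per node
MPI_TASKS=64          # assumption: total MPI tasks across all nodes
OMP_THREADS=4         # assumption: 4 OpenMP threads per MPI task

TOTAL_CORES=$((NODES * CORES_PER_NODE))
USED_CORES=$((MPI_TASKS * OMP_THREADS))
TASKS_PER_NODE=$((MPI_TASKS / NODES))

# Refuse layouts that need more threads than there are physical cores.
if [ "$USED_CORES" -gt "$TOTAL_CORES" ]; then
    echo "oversubscribed: $USED_CORES threads on $TOTAL_CORES cores" >&2
    exit 1
fi

# Each MPI task reads its OpenMP thread count from this variable.
export OMP_NUM_THREADS=$OMP_THREADS

LAUNCH="mpirun -n $MPI_TASKS cp2k.psmp -o cp2k.out cp2k.inp"
echo "tasks per node: $TASKS_PER_NODE, threads per task: $OMP_NUM_THREADS"
echo "$LAUNCH"
```

The scheduler request (the #PBS select line) would have to be changed to match; the exact resource names depend on the site's PBS configuration, so they are left out here.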
regards
JH
________________________________________
From: cp...@googlegroups.com <cp...@googlegroups.com> on behalf of Fabian Årén <fabbe...@gmail.com>
Sent: Thursday, August 18, 2022 3:42 PM
To: cp2k
Subject: [CP2K:17510] Setting up MPI, OMP and accelerator resources intelligently
Hello,
I've been experimenting with the compute resource settings for my CP2K runs. How should one think about combining MPI, OMP, and accelerators? I have a system of ~4000 atoms on which I want to run MD calculations using xTB.
My hardware is
* 2x AMD EPYC™ 7763, 64-core, 2.45 GHz processors per node
* 1024 GB DDR4 3200MT/s of physical memory per node
* 8x GPU accelerator NVIDIA A100 per node, 320GB HBM2 memory per node
and my bash script looks as follows:
#PBS -l select=2:mpiprocs=2:openmpthreads=64
#PBS -m a
module load CP2K/8.2-fosscuda-2020b
mpirun -n 2 cp2k.psmp -o cp2k.out cp2k.inp
This, however, seems to cause severe I/O limitations. Any suggestions?