Does CP2K allow a multi-GPU run?


ASSIDUO Network

Apr 20, 2021, 9:00:17 AM
to cp2k
I currently have access to 4 GPUs to run an AIMD simulation, but only one of the GPUs is being used. Is there a way to use the other 3, and if so, can you tell me how to set it up with a PBS job?

ASSIDUO Network

Apr 20, 2021, 1:44:15 PM
to cp2k
I'm asking, since the administrator running my country's HPC is saying that although I'm requesting access to 4 GPUs, CP2K is only using 1. I checked the following output:
 DBCSR| ACC: Number of devices/node                                            4

And it shows that CP2K is picking up 4 GPUs.
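A quick way to confirm whether all four GPUs are actually doing work (assuming nvidia-smi is available on the compute nodes) is to watch the utilization while the job runs:

watch -n 1 nvidia-smi

The per-GPU utilization column and the process list at the bottom show which devices the CP2K ranks are really attached to.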

Alfio Lazzaro

Apr 21, 2021, 3:22:35 AM
to cp2k
The way to use 4 GPUs per node is to use 4 MPI ranks. How many ranks are you using?

Lenard Carroll

Apr 21, 2021, 3:26:53 AM
to cp...@googlegroups.com
This is what my PBS file looks like:

#!/bin/bash
#PBS -P <PROJECT>
#PBS -N <JOBNAME>
#PBS -l select=1:ncpus=40:ngpus=4
#PBS -l walltime=08:00:00
#PBS -q gpu_4
#PBS -m be
#PBS -M none

module purge
module load chpc/cp2k/8.1.0/cuda10.1/openmpi-4.0.0/gcc-7.3.0
source $SETUP
cd $PBS_O_WORKDIR

cp2k.psmp -i gold.inp -o gold_pbc.out


Alfio Lazzaro

Apr 22, 2021, 3:47:57 AM
to cp2k
Hi,
Your command to run CP2K doesn't mention MPI (mpirun, mpiexec, ...). Are you running with multiple ranks?

You can check those lines in the output:

 GLOBAL| Total number of message passing processes                            32
 GLOBAL| Number of threads for this process                                    4

And check your numbers.
My guess is that you have 1 rank and 40 threads.
To use 4 GPUs you need 4 ranks (and fewer threads per rank), i.e. something like

export OMP_NUM_THREADS=10
mpiexec -n 4 ./cp2k.psmp -i gold.inp -o gold_pbc.out
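
For context, the tail of the PBS script posted above would then look roughly like this (only a sketch; the 4 ranks x 10 threads split assumes the ncpus=40:ngpus=4 request and may need adjusting to what PBS actually hands out):

module purge
module load chpc/cp2k/8.1.0/cuda10.1/openmpi-4.0.0/gcc-7.3.0
source $SETUP
cd $PBS_O_WORKDIR

export OMP_NUM_THREADS=10
mpiexec -n 4 cp2k.psmp -i gold.inp -o gold_pbc.out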

Please check with your sysadmin on how to run with multiple MPI ranks.

Hope it helps.

Alfio

Lenard Carroll

Apr 22, 2021, 3:52:41 AM
to cp...@googlegroups.com
He suggested I try out:
mpirun -n 10 cp2k.psmp -i gold.inp -o gold_pbc.out

as he is hoping that will get the run to use 10 CPUs spread over the selected 4 GPUs.


Lenard Carroll

Apr 22, 2021, 4:15:47 AM
to cp...@googlegroups.com
It says the following for me:
 GLOBAL| Total number of message passing processes                             1
 GLOBAL| Number of threads for this process                                   32

Lenard Carroll

Apr 22, 2021, 4:17:27 AM
to cp...@googlegroups.com
Correction, he told me to use:

mpirun -np 10 cp2k.psmp -i gold.inp -o gold_pbc.out

but it didn't run correctly.

Alfio Lazzaro

Apr 22, 2021, 7:22:32 AM
to cp2k
Could you try what I suggested:

export OMP_NUM_THREADS=10
mpirun -np 4 ./cp2k.psmp -i gold.inp -o gold_pbc.out

Please check the corresponding log.

As I said above, you need one MPI rank per GPU, and you told us that you have 4 GPUs, so you need 4 ranks (or a multiple of 4). With 10 ranks you get an imbalance.
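(For example, 10 ranks spread over 4 GPUs leaves two GPUs serving 3 ranks each and two serving only 2, whereas 4 or 8 ranks divide evenly.)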

Lenard Carroll

Apr 22, 2021, 7:27:59 AM
to cp...@googlegroups.com
Shall do. I already set it up, but it's in a long queue.

Alfio Lazzaro

Apr 22, 2021, 7:34:37 AM
to cp2k
Wait, I see you have 32 threads in total, so you need 32/4 = 8 threads per rank.
Please change it to

export OMP_NUM_THREADS=8

Lenard Carroll

Apr 22, 2021, 7:42:16 AM
to cp...@googlegroups.com
I am using 30 threads now over 3 GPUs, so I used:

export OMP_NUM_THREADS=10
mpiexec -n 3 cp2k.psmp -i gold50.inp -o gold50.out


Alfio Lazzaro

Apr 22, 2021, 9:15:32 AM
to cp2k
And does it work? Check the output and the performance... It may be that your particular test case doesn't use the GPU at all, so could you attach the log (at least the final part of it)?

Lenard Carroll

Apr 22, 2021, 9:24:13 AM
to cp...@googlegroups.com
Not sure yet. The job is still in the queue. As soon as it is finished I'll post the log file info here.

Lenard Carroll

Apr 22, 2021, 11:45:06 AM
to cp...@googlegroups.com
Here's the log file. The job ended prematurely.
gold.txt

Alfio Lazzaro

Apr 22, 2021, 1:31:09 PM
to cp2k
I'm sorry, I cannot assist you on that, I'm not an expert on how to use CP2K (I'm not a domain scientist). Without the full log, I cannot help you...
I assume you should have a log file from PBS where you can see the error message. My guess is that it is a memory limit.
Have you tried executing on CPU only?



Lenard Carroll

Apr 22, 2021, 1:36:35 PM
to cp...@googlegroups.com
Oh you meant the error file. Please find it attached.

I have run on CPU only and with one GPU. Both work.

error.txt

fabia...@gmail.com

Apr 22, 2021, 2:02:01 PM
to cp2k
Hi,

CP2K is crashing when COSMA tries to access a GPU ("error: GPU API call : invalid resource handle"). On Cray systems there is the environment variable "export CRAY_CUDA_MPS=1" that has to be set; otherwise only one MPI rank can access a given GPU device. Maybe there is a similar setting for your cluster?

Also, CP2K can be memory hungry. Setting "ulimit -s unlimited" is often needed.
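
As a rough sketch, both settings could go into the job script just before the mpirun/mpiexec line (CRAY_CUDA_MPS applies to Cray machines only; on other clusters the NVIDIA MPS daemon, started with nvidia-cuda-mps-control -d, plays a similar role and usually has to be enabled by the admins):

ulimit -s unlimited        # CP2K can be memory hungry; raise the stack limit
export CRAY_CUDA_MPS=1     # Cray-specific: lets several MPI ranks share one GPU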

I hope this helps,
Fabian

Lenard Carroll

Apr 22, 2021, 2:08:20 PM
to cp...@googlegroups.com
I'll play around with it a bit, but ultimately I'll have to forward this to an HPC administrator.

rongpei cui

Apr 23, 2021, 1:19:12 AM
to cp2k
Hello, I am also working on an HPC system. I would like to ask some questions about running the test files.

Lenard Carroll

Apr 23, 2021, 4:00:35 AM
to cp...@googlegroups.com
Dear Fabian, COSMA wasn't installed with CP2K, so that can't be the issue. The HPC system is not a Cray, but I did ask the HPC admin to look into it.


Alfio Lazzaro

Apr 23, 2021, 4:24:57 AM
to cp2k
The error in the log says that COSMA is used:

#7  0x2dfb43b in check_runtime_status
at /apps/chpc/chem/gpu/cp2k/8.1.0/tools/toolchain/build/cosma-2.2.0/libs/Tiled-MM/src/Tiled-MM/util.hpp:17
#8  0x2dfb43b in _ZNK3gpu13device_stream13enqueue_eventEv
at /apps/chpc/chem/gpu/cp2k/8.1.0/tools/toolchain/build/cosma-2.2.0/libs/Tiled-MM/src/Tiled-MM/device_stream.hpp:62
#9  0x2dfb43b in _ZN3gpu11round_robinIdEEvRNS_12tiled_matrixIT_EES4_S4_RNS_13device_bufferIS2_EES7_S7_iiiS2_S2_RNS_9mm_handleIS2_EE
at /apps/chpc/chem/gpu/cp2k/8.1.0/tools/toolchain/build/cosma-2.2.0/libs/Tiled-MM/src/Tiled-MM/tiled_mm.cpp:248

....
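
A quick way to double-check which libraries a given binary was actually built with is to print its version banner; the cp2kflags line there lists the compiled-in packages, and cosma should appear in it if COSMA was included:

cp2k.psmp --version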

Lenard Carroll

Apr 23, 2021, 4:52:37 AM
to cp...@googlegroups.com
I see. The HPC admin said he compiled it without COSMA. I have informed him of this.

ASSIDUO Network

Apr 25, 2021, 5:06:04 PM
to cp2k
Thanks for letting me know. I forwarded this to the admin, and he confirmed that COSMA had been compiled into CP2K even though he had specified that it should not be. He has fixed this now, and I've been able to use all my GPU resources.

Thanks for the help.