Hello Tiziano,
Thank you again for your help! After compiling the psmp version of CP2K (Open MPI based), both SIRIUS and ELPA are included. After the compilation I benchmarked both binaries (ssmp and psmp, both compiled with CUDA enabled) and got some results that I do not understand. For testing I used my desktop PC: Ubuntu 20.04, Intel Core i7-4790 (4 cores, 2 threads per core), GTX 1660 Super and 16 GB of RAM. As a benchmark I used H2O-64.inp. Please find below what I got:
1) time cp2k.ssmp -o H2O-64.out H2O-64.inp
real 8m7,011s
user 27m34,937s
sys 1m2,519s
2) time mpirun -np 4 cp2k.psmp -o H2O-64.out H2O-64.inp
(no load on the GPU, only video memory was in use)
real 49m20,818s
user 361m47,512s
sys 2m30,998s
Why do I observe such a huge difference? I ran it twice and got the same result. After that I started to experiment a little.
3) time mpirun -np 4 -x OMP_NUM_THREADS=2 cp2k.psmp -o H2O-64.out H2O-64.inp
real 7m59,329s
user 41m36,361s
sys 0m47,529s
4) time mpirun -np 4 --map-by slot:PE=1 -x OMP_PLACES=threads -x OMP_PROC_BIND=SPREAD -x OMP_NUM_THREADS=2 cp2k.psmp -o H2O-64.out H2O-64.inp
real 7m50,873s
user 41m25,381s
sys 0m46,038s
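To compare the four runs more easily, the ratio of user time to real time gives a rough idea of how many hardware threads were busy on average. Below is a minimal sketch in Python, assuming the `time` output format shown above (minutes, then seconds with a comma as decimal separator). The helper names `to_seconds` and `busy_cores` are hypothetical and used only for this illustration.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time` value such as '8m7,011s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    minutes, secs, frac = match.groups()
    return int(minutes) * 60 + float(f"{secs}.{frac}")


def busy_cores(real: str, user: str) -> float:
    """Average number of busy hardware threads: user time / real time."""
    return to_seconds(user) / to_seconds(real)


# (real, user) pairs copied from the four runs above
runs = {
    "1) ssmp": ("8m7,011s", "27m34,937s"),
    "2) psmp, -np 4": ("49m20,818s", "361m47,512s"),
    "3) psmp, -np 4, OMP_NUM_THREADS=2": ("7m59,329s", "41m36,361s"),
    "4) psmp, -np 4, pinned, OMP_NUM_THREADS=2": ("7m50,873s", "41m25,381s"),
}

for label, (real, user) in runs.items():
    print(f"{label}: real {to_seconds(real):.0f} s, "
          f"user/real = {busy_cores(real, user):.1f}")
```

With these numbers the ratio comes out at roughly 3.4, 7.3, 5.2 and 5.3 for runs 1 to 4, so run 2 kept the CPU busiest while being by far the slowest.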
Tiziano, do you know why such huge discrepancies are observed? Thank you in advance for your comments!
Best regards,
Krzysztof