You received this message because you are subscribed to the Google Groups "cp2k" group.
--
Dr Pierre Cazade, PhD
AD3-023, Bernal Institute, University of Limerick,
Plassey Park Road, Castletroy, Co. Limerick, Ireland
GLOBAL| Total number of message passing processes 1
GLOBAL| Number of threads for this process 1
GLOBAL| This output is from process 0

export OMP_NUM_THREADS=8
mpirun -n 6 /lib/CP2K/cp2k/exe/local/cp2k.psmp -i project.inp -o output.out
Hello Nikhil,
With the command "mpirun -n 42 cp2k.popt -i inp.inp -o out.out", you are requesting 42 MPI processes, not 42 OpenMP threads. MPI usually relies on replicated data, which means that, for poorly programmed software, it will request a total amount of memory equal to the memory required by a scalar execution times the number of processes. This can very quickly become problematic, in particular for QM calculations. OpenMP, however, relies on shared memory: the data is normally not replicated but shared between threads, so in an ideal scenario the amount of memory needed for 42 OpenMP threads is the same as for a single one.
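To make the replicated-data point concrete, here is a rough sketch; the 4 GB per-process figure is a made-up illustration, not a measurement from this system:

```shell
#!/bin/sh
# Illustrative only: worst-case memory for a replicated-data MPI run.
# If a single-process run needs MEM_PER_PROC_GB, a fully replicated
# run with NRANKS processes can need NRANKS times that.
MEM_PER_PROC_GB=4      # hypothetical footprint of one process
NRANKS=42              # ranks requested with "mpirun -n 42"
TOTAL_GB=$((MEM_PER_PROC_GB * NRANKS))
echo "worst-case total memory: ${TOTAL_GB} GB"  # 168 GB, more than 128 GB of RAM
```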
This might explain why your calculation freezes: you are out of memory. On your workstation, you should only use the executable "cp2k.ssmp", which is the OpenMP-only version. Then you don't need the mpirun command:

cp2k.ssmp -i inp.inp -o out.out
To control the number of OpenMP threads, set the environment variable OMP_NUM_THREADS, e.g. in bash: export OMP_NUM_THREADS=48
Now, if you need to balance between MPI and OpenMP, you should use the executable named cp2k.psmp. Here is an example:
export OMP_NUM_THREADS=24
mpirun -n 2 cp2k.psmp -i inp.inp -o out.out
In this example, I am requesting two MPI processes, each of which can use up to 24 OpenMP threads.
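A quick sanity check before launching a hybrid run is that ranks times threads per rank should not exceed the hardware threads on the node. A small sketch (nproc reports logical CPUs on Linux; the rank/thread numbers are just the example above):

```shell
#!/bin/sh
# Sanity check for hybrid MPI/OpenMP runs: NRANKS * OMP_NUM_THREADS
# should not exceed the logical CPUs available on the node.
NRANKS=2
export OMP_NUM_THREADS=24
NCPUS=$(nproc)                      # logical CPUs on this node (Linux)
USED=$((NRANKS * OMP_NUM_THREADS))  # 48 in this example
if [ "$USED" -gt "$NCPUS" ]; then
  echo "oversubscribed: $USED threads requested, $NCPUS available"
else
  echo "ok: $USED of $NCPUS logical CPUs"
fi
```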
Hope this clarifies things for you.
Regards,
Pierre
On 20/09/2019 14:09, Nikhil Maroli wrote:
Dear all,

I have installed all the versions of CP2K on my workstation, which has two 12-core processors (48 threads in total).
I want to run CP2K in parallel using 42 threads; can anyone share the commands that I can use?
I have tried
mpirun -n 42 cp2k.popt -i inp.inp -o out.out
After this command there is a rise in memory usage to 100% and the whole system freezes (I have 128 GB of RAM).

Any suggestions will be greatly appreciated.
GLOBAL| Total number of message passing processes 1
GLOBAL| Number of threads for this process 1
GLOBAL| This output is from process 0
export OMP_NUM_THREADS=8
mpirun -n 6 /lib/CP2K/cp2k/exe/local/cp2k.popt -i project.inp -o output.out
NPROC_REP {Integer}

Specify the number of processors to be used per replica environment (for parallel runs). In case of mode-selective calculations, more than one replica will start a block Davidson algorithm to track more than one frequency.

This keyword cannot be repeated and it expects precisely one integer.

Default value: 1
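For context, NPROC_REP belongs in the VIBRATIONAL_ANALYSIS section of the input file. A minimal sketch (the value is a placeholder, not a recommendation):

```
&VIBRATIONAL_ANALYSIS
   NPROC_REP 4   # processors per replica environment; placeholder value
&END VIBRATIONAL_ANALYSIS
```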
Hello,

Maybe you received the six files because of this:
NPROC_REP {Integer}

Specify the number of processors to be used per replica environment (for parallel runs). In case of mode-selective calculations, more than one replica will start a block Davidson algorithm to track more than one frequency. This keyword cannot be repeated and it expects precisely one integer. Default value: 1
https://manual.cp2k.org/trunk/CP2K_INPUT/VIBRATIONAL_ANALYSIS.html#list_NPROC_REP

Greetings from Paris,
apsi
--
Ari Paavo Seitsonen / Ari.P....@iki.fi / http://www.iki.fi/~apsi/
Hi,

Vibrational analysis is done numerically for each mode. With many atoms, it takes a very long time. NPROC_REP requests that you split this job into smaller segments and use N processors for each of those tasks. For example,

mpiexec_mpt -np 2304 cp2k.7.0.psmp foo.inp > foo.log

with

NPROC_REP 576

dumps 4 files (numbered 0 to 3). These files follow the SCF procedure for each calculation. The final spectral data is written to the main output file (foo.log, above) at the very end, not piecewise. You should also get a Molden-format file for visualizing the spectrum. I'm using OMP_NUM_THREADS=1 in the example above; for jobs on a single node, I'd go with OMP_NUM_THREADS=1 when you already parallelize your calculations by splitting them up like this.

-T
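The arithmetic behind the four files: the number of replica environments is the total MPI rank count divided by NPROC_REP. A sketch with the numbers from the example above:

```shell
#!/bin/sh
# Number of replica environments = total MPI ranks / NPROC_REP.
# With the run above: 2304 ranks split into groups of 576 processors.
TOTAL_RANKS=2304
NPROC_REP=576
NREP=$((TOTAL_RANKS / NPROC_REP))
echo "replicas: $NREP"   # 4 replicas -> files numbered 0 to 3
```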
Hi Matthew,
Unfortunately, there's no single way to determine the best MPI/OpenMP balance. It depends on the system, the calculation type, and the hardware. I recommend testing the performance. The first thing you could try is to check whether your CPUs are multithreaded. For example, if a node has 34 physical cores with 2 hardware threads per core (68 logical cores in total), you could try OMP_NUM_THREADS=2 and keep mpirun -np at 34 times the number of nodes.
Roughly speaking, MPI creates multiple replicas of the calculation (called processes), each replica dealing with part of the calculation. CP2K is efficiently parallelized with MPI. OpenMP generates multiple threads on the fly, generally to parallelize a loop. OpenMP can be used within an MPI process, but not the other way around. Typically, the same number of MPI processes consumes more memory than OpenMP threads. To use multiple nodes, MPI is mandatory and more efficient. These are generalities and, again, combining both is best, but the ideal ratio varies. Testing is the best course of action: check which combination yields the largest number of ps/day with the minimum hardware resources. Doubling the hardware does not double the output, so increasing the number of nodes becomes a waste of resources at some point. As a rule of thumb, if the increase in output is less than 75-80% of the ideal case, it is not worth it.

As you can see, there is a lot of trial and error; no systematic rule, I am afraid.
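The 75-80% rule of thumb above can be written down: a larger job is worth it only if the measured speedup reaches at least ~75% of the ideal (linear) speedup. A sketch using awk for the arithmetic (the ps/day figures are hypothetical benchmark results, not measurements):

```shell
#!/bin/sh
# Rule-of-thumb scaling check: going from 1 node to 2 nodes, the ideal
# speedup is 2x; accept the larger job only if measured >= 0.75 * ideal.
PSDAY_1=100    # hypothetical throughput on 1 node (ps/day)
PSDAY_2=140    # hypothetical throughput on 2 nodes (ps/day)
awk -v a="$PSDAY_1" -v b="$PSDAY_2" 'BEGIN {
  ideal = 2.0                 # 2 nodes vs 1 node
  measured = b / a            # 1.4x here
  if (measured >= 0.75 * ideal) print "worth it"
  else print "not worth it"   # 1.4 < 1.5, so this example prints "not worth it"
}'
```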
Regards,
Pierre