Questions about parallel computations when using CP2K + i-PI


cc l

Jun 10, 2024, 7:15:27 AM
to ipi-users
Hello everyone,
I have a question about parallel computations on HPC when running i-PI with CP2K:
When I run rr.sh, it opens two clients for the CP2K calculation, corresponding to the number of beads I set. However, they compute serially instead of computing the two beads at the same time. I wonder whether I have a wrong understanding of how i-PI runs, or whether these scripts are wrong.
Thanks,
Li
Attachments: input.xml, rr.sh, fdu.inp, test2.slurm

Mariana Rossi

Jun 10, 2024, 7:51:20 AM
to ipi-users
Dear Li,

You are submitting Slurm jobs to launch the two CP2K instances. Did you check whether the Slurm jobs were actually running at the same time, or did one start running only after the other? That would explain why you see serial computation. It would also help if you shared the log output of i-PI.
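A quick way to check is to look at the job states and start/end times, for example (the job IDs below are placeholders):

 squeue -u $USER                                                # are both jobs in state R at the same time?
 sacct -j <jobid1>,<jobid2> --format=JobID,Start,End,Elapsed    # do the Start/End windows overlap?

If instead you launch both clients from a single job script, they need to be put in the background so that the second one does not wait for the first to finish, roughly like this (a simplified sketch, not your rr.sh; in practice each client would usually run in its own directory and get its own share of the allocation via srun):

 cp2k.psmp -i fdu.inp -o bead-0.out &
 cp2k.psmp -i fdu.inp -o bead-1.out &
 wait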

cc l

Jun 10, 2024, 9:35:57 AM
to ipi-users
Dear Mariana,
Sorry, I replied in an incorrect way; this is my first time using the Google group.
Here is my log output of i-PI:
 
Best wishes,
Li
Attachment: log.ipi

Mariana Rossi

Jun 10, 2024, 10:14:49 AM
to ipi-users
Good, thanks - why do you say that the code ran both beads serially? What is your diagnostic for that?
The handshaking happened with both clients, and i-PI used them simultaneously:

 @ForceField: Starting the polling thread main loop.
 # i-PI loaded input file:  input.xml
 @SOCKET:   Client asked for connection from ('10.7.11.24', 40158). Now hand-shaking.
 @SOCKET:   Handshaking was successful. Added to the client list.
 @SOCKET:   Client asked for connection from ('10.7.11.31', 49360). Now hand-shaking.
 @SOCKET:   Handshaking was successful. Added to the client list.
 @SOCKET: 24/06/10-21:19:55 Assigning [ none] request id    0 to client with last-id None (  0/  2 : ('10.7.11.24', 40158))
 @SOCKET: 24/06/10-21:20:04 Assigning [ none] request id    1 to client with last-id None (  1/  2 : ('10.7.11.31', 49360))
 @SOCKET: 24/06/10-21:20:13 Assigning [match] request id    0 to client with last-id    0 (  0/  2 : ('10.7.11.24', 40158))
 @SOCKET: 24/06/10-21:20:15 Assigning [match] request id    1 to client with last-id    1 (  1/  2 : ('10.7.11.31', 49360))
 # Average timings at MD step       0. t/step: 4.35847e+00
 @SOCKET: 24/06/10-21:20:18 Assigning [match] request id    0 to client with last-id    0 (  0/  2 : ('10.7.11.24', 40158))
 @SOCKET: 24/06/10-21:20:19 Assigning [match] request id    1 to client with last-id    1 (  1/  2 : ('10.7.11.31', 49360))
 # Average timings at MD step       1. t/step: 3.92318e+00
 @SOCKET: 24/06/10-21:20:21 Assigning [match] request id    0 to client with last-id    0 (  0/  2 : ('10.7.11.24', 40158))
 @SOCKET: 24/06/10-21:20:23 Assigning [match] request id    1 to client with last-id    1 (  1/  2 : ('10.7.11.31', 49360))
 # Average timings at MD step       2. t/step: 4.04197e+00
 @SOCKET: 24/06/10-21:20:25 Assigning [match] request id    0 to client with last-id    0 (  0/  2 : ('10.7.11.24', 40158))
 @SOCKET: 24/06/10-21:20:27 Assigning [match] request id    1 to client with last-id    1 (  1/  2 : ('10.7.11.31', 49360))
 # Average timings at MD step       3. t/step: 4.15361e+00
 @SOCKET: 24/06/10-21:20:30 Assigning [match] request id    0 to client with last-id    0 (  0/  2 : ('10.7.11.24', 40158))
 @SOCKET: 24/06/10-21:20:32 Assigning [match] request id    1 to client with last-id    1 (  1/  2 : ('10.7.11.31', 49360))
 # Average timings at MD step       4. t/step: 5.78658e+00
SOFTEXIT CALLED FROM THREAD <_MainThread(MainThread, started 47678474139840)>  @ SIMULATION: Exiting cleanly.
 !W! Soft exit has been requested with message: ' @ SIMULATION: Exiting cleanly.
I-PI reports success. Restartable as is: NO.'. Cleaning up.
 @SOCKET: Shutting down the driver interface.
SOFTEXIT: Saving the latest status at the end of the step

cc l

Jun 10, 2024, 10:20:58 PM
to ipi-users
Within an MD step, the second CP2K client does not start running until the SCF of the first bead has converged, as shown in the attached screenshot (pic0611.jpg).

Mariana Rossi

Jun 11, 2024, 1:30:44 AM
to ipi-users
I am not sure I understand, sorry :-) Did you check whether running with only one CP2K client and running with two CP2K clients results in a simulation that takes the same time?

The CP2K output screenshot you sent does not show any information about whether one client is waiting for the other or not. With two clients, the time per MD step in the log of i-PI should be double that of running with 2 clients. If the timings are the same, there is indeed something going wrong. Could you test that?

Mariana Rossi

Jun 11, 2024, 1:33:10 AM
to ipi-users
Sorry, I meant: "With one client, the time per MD step should be double that of running with two clients"... etc. In any case, a quick question - are you using the latest version of i-PI?
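A quick way to compare the two runs is to average the reported t/step over each i-PI log, e.g. (assuming you keep the log of each run under a separate name):

 awk '/t\/step/ {sum += $NF; n++} END {print sum/n, "s/step"}' log.ipi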

cc l

Jun 11, 2024, 4:53:01 AM
to ipi-users
My i-PI version is 2.6.3. With two clients, the time per MD step in the i-PI log is about 4 s, while it is about 16 s with one client.
However, for the same material, the time per MD step is about 50 s when I use 24 beads with 24 clients.

Mariana Rossi

Jun 11, 2024, 5:23:19 AM
to ipi-users
Ok, those are weird timings, but we may have fixed the source of this oddity in the new version. Could you please use the current 3.0-beta version https://github.com/i-pi/i-pi/releases/tag/v3.0.0-beta - or simply the current state of the master branch in the repository - and redo the tests? If you still see strange timings, let us know.
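If you work from the repository, something like the following should be enough to switch versions (env.sh sets up the PATH/PYTHONPATH for that checkout):

 git clone https://github.com/i-pi/i-pi.git
 cd i-pi
 git checkout v3.0.0-beta    # or stay on the master branch
 source env.sh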

cc l

Jun 11, 2024, 7:14:18 AM
to ipi-users
Dear Mariana,
Thanks a lot.
The current 3.0-beta version works. Now the CP2K clients compute in parallel.
Best wishes,
Li