Running remd with MPI-enabled driver?

dfipi

unread,

Jun 22, 2019, 7:35:05 AM6/22/19

to ipi-users

Hi all,

I'm running remd with dftb+ as the driver. My dftb+ client is compiled with MPI support thus it can run in parallel. The problem is when I launch N instances of dftb+ (mpirun -np 2 dftb+) where N equals the number of replicas, the %CPU drops to 50% for each process.

Thus it seems that the overall parallel efficiency is pretty bad.

My OMP_NUM_THREADS=1 as suggested by the dftb+ community for MPI runs.

Anyone can point me what am I doing wrong?

This is the output of 'top' running 4 replicas system with dftb+ (i.e. in total 8 cores are at work).

I run dftb+ with: nohup mpirun.openmpi -np 2 dftb+ &

========================================================

35426 user    20   0 423384 40404 18248 R 50.5 0.0   0:37.69 dftb+
35434 user    20   0 423080 39720 18248 R 50.2 0.0   0:05.70 dftb+
35396 user    20   0 422044 38660 17644 R 49.8 0.0   1:21.35 dftb+
35402 user    20   0 423384 40460 18304 R 49.8 0.0   1:09.54 dftb+
35403 user    20   0 422044 38872 17848 R 49.8 0.0   1:16.15 dftb+
35427 user    20   0 422044 39048 18032 R 49.8 0.0   0:32.93 dftb+
35435 user    20   0 421732 38164 17716 R 49.8 0.0   0:09.07 dftb+
35395 user    20   0 423384 40360 18204 R 49.5 0.0   1:13.18 dftb+
=========================================================

Running the same simulation but with only 1 client for all replicas (nohup mpirun.openmpi -np 8 dftb+ &) yields:

=========================================================

35837 user    20   0 441868 35196 19112 R 100.3 0.0   0:04.90 dftb+
35828 user    20   0 442580 35676 19120 R 100.0 0.0   0:04.83 dftb+
35829 user    20   0 442544 36024 19432 R 100.0 0.0   0:04.91 dftb+
35830 user    20   0 441912 35356 19360 R 100.0 0.0   0:04.91 dftb+
35831 user    20   0 441688 35048 19304 R 100.0 0.0   0:04.90 dftb+
35833 user    20   0 441904 35656 19500 R 100.0 0.0   0:04.91 dftb+
35841 user    20   0 441516 34824 19076 R 100.0 0.0   0:04.89 dftb+
35843 user    20   0 441224 34456 18868 R 99.7 0.0   0:04.89 dftb+
=========================================================

Michele Ceriotti

unread,

Jun 22, 2019, 7:44:54 AM6/22/19

to ipi-users

It is not clear if this is an i-PI issue or a DFTB+ issue. To be clear, each DFTB+ instance is meant to act as a single client, so the behavior when you run a single client seems correct. Check what happens if you run multiple stand-alone DFTB+ instances, seems to me more a problem of the parallel setup on the HPC system than something specific of i-PI.

dfipi

unread,

Jun 22, 2019, 9:44:10 AM6/22/19

to ipi-users

The problem seems to be DFTB+. Multiple stand-alone DFTB+ instances behave similarly wrong! I posted the same question to their mailing list and will update here in case anybody else will stumble upon this. Thanks Michele

dfipi

unread,

Jun 23, 2019, 5:34:54 AM6/23/19

to ipi-users

Hi all,

Update: after updating to openmpi-4.0.1, recompiling dftb+ and detaching binding of processes using --bind-to none

I could successfully run 4 instances with 100% load - as should be.

Best,

David

Reply all

Reply to author

Forward