GA 5.6.1 on Stampede2 at TACC

jlee...@gmail.com

Feb 25, 2019, 10:08:52 AM

to hpctools

Hi Group,

Sorry if I missed a previous post on the subject. Can anyone recommend a reasonable first choice of networking configuration for GA 5.6.1 on the TACC Stampede2 machine? I want decent performance with the least amount of installation drama. Note that I am using ga++.

Many thanks,

jeff

Palmer, Bruce J

Feb 25, 2019, 12:47:03 PM

to hpctools

We generally recommend the MPI Progress Rank runtime for large-scale computing (--with-mpi-pr). It gives good performance and is very robust. If you can get it to work, the InfiniBand port is usually the highest-performing runtime on InfiniBand networks, but it has issues if you are allocating large arrays.

The ga++ interface is a lightweight wrapper on top of the core library and the underlying runtime will not affect it much, one way or the other.

Bruce

Tilson, Jeffrey L

Feb 25, 2019, 12:49:17 PM

to hpct...@googlegroups.com

Thanks Bruce. The "if" statement reads ominous.

Jeff Hammond

Feb 25, 2019, 2:34:30 PM

to hpctools

My experience is that the OpenIB port is fast as heck until you start using close to 50% of the node memory, at which point you start seeing segfaults due to IB page registration. MPI-PR is far more robust because it relies on MPI send-recv under the hood and thus exploits all of the extensively debugged and tuned MPI functionality, including the IB page registration cache.

You should not have any installation drama from MPI-PR. Just remember that you need to run mpirun -N Napp+Nnodes, where Napp is the number of processes your app uses and Nnodes is the number of nodes, because one process per node is split off to run the progress engine.
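As a rough illustration of that process-count arithmetic, here is a small Python sketch. The function names are hypothetical helpers made up for this example (they are not part of GA or any MPI launcher), and the only assumption is the rule stated above: MPI-PR reserves one rank per node for the progress engine.

```python
def mpi_pr_total_ranks(app_ranks: int, nodes: int) -> int:
    """Total MPI ranks to launch under MPI-PR.

    Assumes one rank per node is reserved for the progress engine,
    so the launcher must be asked for app_ranks + nodes processes.
    """
    if app_ranks < 1 or nodes < 1:
        raise ValueError("app_ranks and nodes must both be at least 1")
    return app_ranks + nodes


def mpi_pr_app_ranks(total_ranks: int, nodes: int) -> int:
    """Ranks left for the application when total_ranks are launched on nodes."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if total_ranks <= nodes:
        raise ValueError("need more ranks than nodes; one per node goes to progress")
    return total_ranks - nodes


if __name__ == "__main__":
    # An application that wants 64 working processes on 4 nodes
    # has to be launched with 68 MPI processes in total.
    print(mpi_pr_total_ranks(64, 4))  # prints 68
```

Going the other way, if the job script launches 68 processes on 4 nodes, the application itself sees 64 of them.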

Jeff

Jeff Hammond
jeff.s...@gmail.com
http://jeffhammond.github.io/