Building mpi4py against cuda

John Liefeld

Sep 15, 2022, 5:30:59 PM
to mpi4py
I have seen this on the net a few times …

"… mpi4py now provides (experimental) support for passing CuPy arrays to MPI calls, provided that mpi4py is built against a CUDA-aware MPI implementation"

Are there any examples or instructions for how to build mpi4py against CUDA? I am working on a GPU node that does not have direct access to the net and downloaded an mpi4py release. So far, the hello world works (as do 53 of the 56 unit tests), but I always get a segfault when mpi4py.MPI.Intracomm calls Allgatherv(). My guess is that it's not built correctly.

Lisandro Dalcin

Sep 15, 2022, 5:57:31 PM
to mpi...@googlegroups.com
On Fri, 16 Sept 2022 at 00:31, John Liefeld <jlie...@cloud.ucsd.edu> wrote:
I have seen this on the net a few times …

"… mpi4py now provides (experimental) support for passing CuPy arrays to MPI calls, provided that mpi4py is built against a CUDA-aware MPI implementation"


 
Are there any examples or instructions for how to build mpi4py against cuda? 

There is nothing special, really. You just need to build mpi4py with a CUDA-aware MPI, and then cross your fingers that your backend MPI will communicate CUDA buffers as promised.
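
One way to find out whether the backend MPI "communicates CUDA buffers as promised" is a small smoke test that exercises the same call that crashes. The following is only a sketch: it assumes CuPy is installed and that mpi4py is recent enough to accept CuPy arrays directly. The function names `displacements` and `allgatherv_smoke_test` are made up for this example.

```python
def displacements(counts):
    """Exclusive prefix sum: offset in the receive buffer where each rank's data starts."""
    displs, offset = [], 0
    for c in counts:
        displs.append(offset)
        offset += c
    return displs


def allgatherv_smoke_test():
    """Gather rank+1 float64 values per rank, sending and receiving from GPU memory."""
    # Imported lazily so the helper above can be used without MPI or a GPU.
    from mpi4py import MPI
    import cupy as cp

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    counts = [r + 1 for r in range(size)]  # rank r contributes r+1 elements
    displs = displacements(counts)

    sendbuf = cp.full(counts[rank], rank, dtype=cp.float64)
    recvbuf = cp.empty(sum(counts), dtype=cp.float64)
    # Make sure the GPU has finished filling sendbuf before MPI reads it.
    cp.cuda.get_current_stream().synchronize()

    comm.Allgatherv(sendbuf, [recvbuf, counts, displs, MPI.DOUBLE])

    expected = cp.concatenate(
        [cp.full(c, r, dtype=cp.float64) for r, c in enumerate(counts)]
    )
    assert cp.array_equal(recvbuf, expected)
    if rank == 0:
        print("Allgatherv on CuPy buffers OK")
```

Call `allgatherv_smoke_test()` at the bottom of a script and launch it with something like `mpiexec -n 2 python script.py`. If this segfaults while the same test with NumPy arrays passes, the MPI library is most likely not handling device pointers.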
 
I am working on a gpu node that does not have direct access to the net and downloaded an mpi4py release but so far, while the hello world works (and 53/56 of the unit tests) it always throws a segv when the mpi4py.MPI.Intracomm calls Allgatherv(). 

There you have it. Run `ldd` on the `MPI.*.so` extension module (use `print(MPI.__file__)` to get its exact location) and make sure the extension module is linked to the right MPI libraries. And of course, make sure these MPI libraries actually support communication of CUDA buffers (maybe ask your cluster's sysadmins?).
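
The `ldd` check can be scripted so that it is easy to repeat after a rebuild. This is a rough sketch for Linux; the helper names are invented for this example, and matching on "mpi" in the library name is just a heuristic, not anything mpi4py itself provides.

```python
import subprocess


def mpi_extension_path():
    """Return the path of the compiled mpi4py.MPI extension module."""
    from mpi4py import MPI  # lazy import: the parsers below work without mpi4py
    return MPI.__file__


def parse_ldd(ldd_output):
    """Map each shared library name in `ldd` output to its resolved path.

    Libraries that the loader cannot find map to None.
    """
    libs = {}
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # skip entries such as linux-vdso or the dynamic loader
        name, _, target = line.partition("=>")
        target = target.strip()
        if target.startswith("not found"):
            libs[name.strip()] = None
        else:
            # Drop the trailing load address, e.g. "(0x00007f...)".
            libs[name.strip()] = target.split(" (")[0]
    return libs


def linked_mpi_libraries(ldd_output):
    """Keep only the entries whose name looks like an MPI library (heuristic)."""
    return {
        name: path
        for name, path in parse_ldd(ldd_output).items()
        if "mpi" in name.lower()
    }


def report():
    """Run ldd on the mpi4py extension and print the MPI libraries it resolves to."""
    path = mpi_extension_path()
    out = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    ).stdout
    print(path)
    for name, target in linked_mpi_libraries(out).items():
        print(f"  {name} -> {target or 'NOT FOUND'}")
```

Calling `report()` inside the same environment (modules loaded, `LD_LIBRARY_PATH` set) that the job uses shows which MPI installation the extension actually picks up at run time, which may differ from the one it was compiled against.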
 
My guess is that it's not built correctly.

Double-check that mpi4py is linked to the right libraries. If it is, then the problem is in your backend MPI, and you should perhaps ask the staff that administers the cluster for help.
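
If the backend happens to be Open MPI, you can also ask it directly whether it was built with CUDA support. The sketch below assumes Open MPI and that the `ompi_info` on your PATH belongs to the same installation that mpi4py is linked to; other MPI implementations report this differently. The helper names are made up for this example.

```python
import subprocess


def mpi_library_version():
    """Return the version string of the MPI library that mpi4py loaded."""
    from mpi4py import MPI  # lazy import: the parser below works without mpi4py
    return MPI.Get_library_version()


def ompi_built_with_cuda(ompi_info_output):
    """Read the CUDA flag from `ompi_info --parsable --all` output.

    Returns True or False, or None if the flag is not present at all.
    """
    key = "mpi_built_with_cuda_support:value:"
    for line in ompi_info_output.splitlines():
        if key in line:
            # The value is the last colon-separated field on the line.
            return line.rsplit(":", 1)[1].strip().lower() == "true"
    return None


def check_open_mpi():
    """Run ompi_info and report whether this Open MPI build claims CUDA support."""
    out = subprocess.run(
        ["ompi_info", "--parsable", "--all"],
        capture_output=True, text=True, check=True,
    ).stdout
    return ompi_built_with_cuda(out)
```

A result of `False` or `None` from `check_open_mpi()` would point at the MPI build rather than at mpi4py. Printing `mpi_library_version()` from Python is a quick way to confirm that both checks are looking at the same library.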

Regards,

--
Lisandro Dalcin
============
Senior Research Scientist
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

Ted (aka John) Liefeld

Sep 15, 2022, 6:01:34 PM
to mpi...@googlegroups.com
Lisandro,

Thanks. Your suggestions are in line with what I thought. I did contact our admins this morning but posted here this afternoon because I am impatient… ;-). I think I need the admins to confirm whether it's picking up the 'right' MPI libraries.
