CUDA-Aware on OpenMPI v4 with CUDA IPC buffers


Mike Adams

May 24, 2025, 11:56:50 AM
to Open MPI users
Hi, I'm using OpenMPI 4.0.5 with CUDA support on PSC Bridges-2.  I'm calling collectives like MPI_Allreduce on buffers that have already been shared between ranks via cudaIpcGetMemHandle/cudaIpcOpenMemHandle.
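
In case it helps, here is a stripped-down sketch of the pattern (not my actual code - buffer names and sizes are made up, error checking is dropped, and it assumes all ranks share one node):

#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const size_t count = 1 << 20;
    float *shared = NULL;   /* device pointer; IPC-opened on ranks != 0 */
    cudaIpcMemHandle_t handle;

    if (rank == 0) {
        cudaMalloc((void **)&shared, count * sizeof(float));
        cudaIpcGetMemHandle(&handle, shared);   /* export the allocation */
    }

    /* Ship the opaque handle to the other ranks (all assumed on one node). */
    MPI_Bcast(&handle, sizeof(handle), MPI_BYTE, 0, MPI_COMM_WORLD);

    if (rank != 0) {
        cudaIpcOpenMemHandle((void **)&shared, handle,
                             cudaIpcMemLazyEnablePeerAccess);
    }

    /* CUDA-aware collective on a device pointer that, on every rank except 0,
       came from cudaIpcOpenMemHandle; this is where the warning shows up. */
    float *result = NULL;
    cudaMalloc((void **)&result, count * sizeof(float));
    MPI_Allreduce(shared, result, (int)count, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);

    cudaFree(result);
    if (rank != 0)
        cudaIpcCloseMemHandle(shared);
    else
        cudaFree(shared);

    MPI_Finalize();
    return 0;
}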

On these buffers, I receive the following message, and communication at some sizes fails:

--------------------------------------------------------------------------
The call to cuIpcGetMemHandle failed. This means the GPU RDMA protocol
cannot be used.
  cuIpcGetMemHandle return value:   1
  address: 0x147d54000068
Check the cuda.h file for what the return value means. Perhaps a reboot
of the node will clear the problem.
--------------------------------------------------------------------------

If I pass in the two mca parameters to disable OpenMPI IPC, everything works.

I'm wondering two things:
1. Is this failure to handle IPC buffers in OpenMPI 4 a known issue?
2. When I disable OpenMPI CUDA IPC with mca parameters, does OpenMPI still use GPUDirect RDMA?

Thanks,

Mike Adams

Tomislav Janjusic US

May 30, 2025, 12:54:23 PM
to Open MPI users, Mike Adams
Hi,

I'm not sure if it's a known issue - possibly in v4.0; I'm not sure about v4.1 or v5.0 - can you try?
As far as CUDA IPC goes - how are you disabling it? I don't remember the mca params in v4.0.
If it's disabled either through the ucx pml or through smcuda, then no, it won't use it.
-Tommy

Mike Adams

May 30, 2025, 1:05:04 PM
to Open MPI users, Tomislav Janjusic US, Mike Adams
Hi Tommy,

I'm setting btl_smcuda_use_cuda_ipc_same_gpu 0 and btl_smcuda_use_cuda_ipc 0.
So, are you saying that with these params, it is also not using GPUDirect RDMA?

PSC Bridges 2 only has v4 OpenMPI, but they may be working on installing v5 now.  Everything works on v5 on NCSA Delta - I'll try to test on an older OpenMPI.

Mike Adams

Dmitry N. Mikushin

May 30, 2025, 1:21:16 PM
to us...@lists.open-mpi.org, Tomislav Janjusic US, Mike Adams
There is a relevant explanation of the same issue reported for Julia: https://github.com/JuliaGPU/CUDA.jl/issues/1053


Mike Adams

May 30, 2025, 1:39:49 PM
to Open MPI users, Dmitry N. Mikushin, Tomislav Janjusic US, Mike Adams
Dmitry, 

I'm not too familiar with the internals of OpenMPI, but I just tried 4.1.5 on NCSA Delta and received the same IPC errors (no mca flags switched). This time, though, the calls still performed the actual operation instead of failing, so maybe that's an improvement from v4.0.x to v4.1.x?

Thanks,

Mike Adams

Mike Adams

May 31, 2025, 2:26:58 PM
to Open MPI users, Mike Adams, Dmitry N. Mikushin, Tomislav Janjusic US
Interestingly, I made an error - Delta on 4.1.5 did fail like some of the cases on Bridges-2 on 4.0.5, but at 16 ranks per GPU, which with the node's 4 GPUs matches the core count of the AMD processor on Delta. So, it looks like Bridges-2 needs an OpenMPI upgrade (to v5, which works for me on Delta).

Tommy, I'm still curious about your input: do those mca parameters I mentioned yesterday disable GPUDirect RDMA as well?

Thank you both for your help!

Mike Adams

Tomislav Janjusic US

Jun 3, 2025, 10:03:16 AM
to Open MPI users, Mike Adams, Dmitry N. Mikushin, Tomislav Janjusic US
Can you post the full mpirun command, or at least the relevant MPI mca params?


" I'm still curious about your input on whether or not those mca parameters I mentioned yesterday are disabling GPUDirect RDMA as well?"
Even if you disable smcuda's CUDA IPC, it's possible you're still using CUDA IPC via UCX, for example.
The mentioned mca params disable it for the smcuda btl, but UCX doesn't use smcuda as a transport, so they're irrelevant for the ucx pml.
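For example (just an illustration, and only relevant if the ucx pml is the one that got selected), CUDA IPC under UCX is steered through UCX's own transport selection rather than through those smcuda parameters, e.g. something like:

mpirun -x UCX_TLS=^cuda_ipc ...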
Do you know which pml you're using?
-Tommy 

Mike Adams

Jun 3, 2025, 1:44:00 PM
to Open MPI users, Tomislav Janjusic US, Mike Adams, Dmitry N. Mikushin
mpirun --mca btl_smcuda_use_cuda_ipc_same_gpu 0 --mca btl_smcuda_use_cuda_ipc 0 --map-by ppr:2:numa --bind-to core --rank-by slot --display-map --display-allocation --report-bindings ./multilane_ring_allreduce
where there is 1 GPU per NUMA region.

I am not sure which pml I'm using, but since those parameters cause my program to succeed on OpenMPI 4.0.5 on PSC Bridges 2, I guess it is not UCX.  Can you point me in the right direction to determine the pml in use?

Thank you for your assistance!

Mike Adams

Tomislav Janjusic US

Jun 3, 2025, 3:08:17 PM
to Open MPI users, Mike Adams, Tomislav Janjusic US, Dmitry N. Mikushin
Add --mca pml_base_verbose 90 and you should see something like this:
[rock18:3045236] select: component ucx selected
[rock18:3045236] select: component ob1 not selected / finalized
Or whatever your ompi instance selected.
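
For example, tacked onto the command line you posted earlier (piping through grep is just one way to pull out the selection lines):

mpirun --mca pml_base_verbose 90 --mca btl_smcuda_use_cuda_ipc_same_gpu 0 --mca btl_smcuda_use_cuda_ipc 0 --map-by ppr:2:numa --bind-to core ./multilane_ring_allreduce 2>&1 | grep "select:"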
-Tommy