Can't Enable Loopback on ConnectX-4/5/6

Amanda Baran

Nov 11, 2023, 11:26:58 PM
to cloudlab-users
Hi,

I have been unable to create a loopback connection on the RDMA cards of nodes with ConnectX-4/5/6 NICs. It only works on the r320s, which have the ConnectX-3, where setting the port number to 1 for the loopback connection is enough. The same approach doesn't work on the newer cards, such as those in the r6525 or c6525, nor does using an assigned local IP address.

I installed MFT to check the mlxlink output on these nodes, and it shows that the loopback mode is set to "No Loopback". The other options for configuring loopback (https://docs.nvidia.com/networking/display/mftv4181/mlxlink+utility) are physical or external. I couldn't find much about this online, but I tried setting both with the --loopback flag, and neither changes the mlxlink output from "No Loopback". I also tried rebooting afterwards in case that was needed for the changes to take effect.

I am trying to run a very simple experiment: a distributed KV store shared across the cluster that uses RDMA CAS (compare-and-swap) operations to lock and unlock different keys. I really need this to work on the newer cards.
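
For reference, a CAS-based key lock of this kind can be modelled locally as in the sketch below. This is not the code from the experiment and it does not touch any RDMA hardware: it is a plain Python stand-in, assuming one 64-bit lock word per key, with 0 meaning unlocked and a nonzero node ID meaning locked by that node. The names LockTable, try_lock, and unlock are hypothetical, chosen only for illustration. The one property it borrows from RDMA atomics is that a compare-and-swap always returns the value that was in memory before the operation, which is how the requester learns whether it won the lock.

```python
import threading

UNLOCKED = 0  # assumed convention: 0 means "no owner"; node IDs must be nonzero


class LockTable:
    """Local stand-in for a remote memory region with one lock word per key."""

    def __init__(self, num_keys):
        self._words = [UNLOCKED] * num_keys
        # Models the atomicity that the NIC would provide for a real RDMA CAS.
        self._atomic = threading.Lock()

    def compare_and_swap(self, index, expected, desired):
        """Swap in `desired` only if the word equals `expected`.

        Always returns the value held before the operation, like an
        RDMA atomic compare-and-swap does.
        """
        with self._atomic:
            previous = self._words[index]
            if previous == expected:
                self._words[index] = desired
            return previous


def try_lock(table, index, node_id):
    # The lock is acquired only if the word was UNLOCKED before the swap.
    return table.compare_and_swap(index, UNLOCKED, node_id) == UNLOCKED


def unlock(table, index, node_id):
    # Only the current owner's ID matches, so only the owner can release.
    return table.compare_and_swap(index, node_id, UNLOCKED) == node_id
```

In this model, a second node calling try_lock on a key that is already held gets False back and has to retry, and an unlock from a node that is not the owner leaves the word unchanged.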

Thanks for any help in advance!