Server NIC not recognised as RDMA


Steve Eppert

Aug 30, 2023, 4:35:16 AM8/30/23
to beegfs-user
Hi

I was checking our BeeGFS system and found that it doesn't use RDMA (we use RoCE), so I consulted the documentation to find out why.
It looks like I need to install the libbeegfs-ib package (which I had not done), yet the client NICs are still detected as RDMA:

D62-64EDA994-node1.cluster [ID: 1196].
   Ports: UDP: 8004; TCP: 0
   Interfaces: enp33s0f0np0(RDMA) enp33s0f0np0(TCP)

On the meta server, the interface is not recognised as RDMA, even though the Mellanox drivers are installed correctly.

meta1.cluster [ID: 1].
   Ports: UDP: 8005; TCP: 8005
   Interfaces: enp129s0(TCP)

ibv_devices
    Device Node GUID
    ------ ----------------
    mlx5_0 0c42a10a00421288
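For reference, a quick way to see which network interface sits behind a verbs device like mlx5_0 is MOFED's ibdev2netdev, or the sysfs link it relies on. A minimal sketch (the SYSFS_ROOT override is hypothetical, only there so the snippet can be exercised outside a real /sys):

```shell
# Hypothetical helper: list the netdev(s) behind an RDMA device via sysfs,
# similar to what MOFED's ibdev2netdev reports.
ibdev_to_netdev() {
    local ibdev="$1"
    # Each netdev bound to the device appears as a directory under device/net/
    ls "${SYSFS_ROOT:-/sys}/class/infiniband/${ibdev}/device/net" 2>/dev/null
}

# Example: ibdev_to_netdev mlx5_0
```

If this prints enp129s0 on the meta server, the verbs device and the interface BeeGFS lists are at least wired together at the driver level.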

Clients are currently connecting to the servers (over TCP, as expected):
Metadata
==========
meta1.cluster [ID: 1]: accessible at 192.168.1.11:8005 (protocol: TCP)


So these questions came up:

- Why are the client NICs detected as RDMA even though the libbeegfs-ib package is not installed?
- Do I need to install the package on the clients and/or servers? (Could this be the reason the server NICs are not detected as RDMA?)
- If the libbeegfs-ib package is unrelated to the server NIC problem, how can I check why BeeGFS does not recognise the server NIC as RDMA?

Thanks in advance!

Tore H. Larsen

Aug 31, 2023, 6:08:35 PM8/31/23
to fhgfs...@googlegroups.com

What is the status of 

cat /sys/devices/{pci-bus-address}/roce_enable


To find the Ethernet devices and verify RoCE is actually configured, cat the sysfs entry:

root@n018:~# lspci | grep Mella
23:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
23:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
a1:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
a1:00.1 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]
c1:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
c1:00.1 Infiniband controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01)
c1:00.2 DMA controller: Mellanox Technologies MT42822 BlueField-2 SoC Management Interface (rev 01)

root@n018:~# lspci | grep Mella | grep -i ethern
23:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
23:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
c1:00.0 Ethernet controller: Mellanox Technologies MT42822 BlueField-2 integrated ConnectX-6 Dx network controller (rev 01) 
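The pseudo-path above can be turned into a small helper that, given a PCI address from lspci, prints the roce_enable flag. A sketch only; the exact sysfs layout can vary between driver versions, and the SYSFS_ROOT override is hypothetical, just there so the snippet can be tested outside a real /sys:

```shell
# Hypothetical helper: print the roce_enable flag for one PCI function.
# /sys/bus/pci/devices/<addr> is a symlink to the full pci0000:XX/... path.
check_roce() {
    local addr="$1"
    local f="${SYSFS_ROOT:-/sys}/bus/pci/devices/${addr}/roce_enable"
    if [ -r "$f" ]; then
        echo "${addr}: roce_enable=$(cat "$f")"
    else
        echo "${addr}: no roce_enable entry"
    fi
}

# Example: check_roce 0000:23:00.0
```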



Brgds,
Tore



Tore H. Larsen

Aug 31, 2023, 6:12:19 PM8/31/23
to fhgfs...@googlegroups.com

And verify firewall settings.

Quote from a flash storage vendor's manual:

"If you use firewalls in your network configuration, ensure that traffic is open for TCP port 21455 and UDP ports 4791, 21451 and 21452. Node-to-node communications with RDMA-capable Ethernet connections use TCP port 21455 for data traffic and UDP port 21451 and 21452 for service discovery on the system. If you are using the RoCE protocol, ensure that traffic is also open for UDP port. Additionally, RDMA-capable Ethernet ports use Internet Group Management Protocol (IGMP) for group multicast communication for service discovery, so ensure that IGMP traffic is enabled on the firewall for redundant site configurations"
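Note that the port numbers in that quote are the vendor's own; BeeGFS uses different defaults (this thread shows client UDP 8004 and meta 8005; storage is typically 8003 and management 8008), and RoCE v2 itself uses the well-known UDP port 4791. A hypothetical firewalld sketch, if a firewall were in play:

```shell
# Sketch only: open the default BeeGFS service ports plus the RoCE v2 UDP port.
# Adjust to the connPort* values actually set in your beegfs-*.conf files.
for p in 8003 8004 8005 8008; do
    firewall-cmd --permanent --add-port=${p}/tcp --add-port=${p}/udp
done
firewall-cmd --permanent --add-port=4791/udp   # RoCE v2 well-known UDP port
firewall-cmd --reload
```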



--
Kind Regards / Mvh,
Tore HLarsen

Chief Research Engineer HPC                                      Email:  to...@simula.no

Simula Research Laboratory - HPC department        Mobile: +47 918 33 670

Kristian Augusts gate 23, 0164 Oslo, Norway

Tore H. Larsen

Aug 31, 2023, 6:23:11 PM8/31/23
to fhgfs...@googlegroups.com

Steve Eppert

Sep 1, 2023, 3:34:35 AM9/1/23
to beegfs-user
Everything looks as it should:

# lspci | grep Mella
81:00.0 Ethernet controller: Mellanox Technologies MT27700 Family [ConnectX-4]
/sys/devices/pci0000:80/0000:80:03.1/0000:81:00.0/roce_enable

Firewall is disabled on all systems. 

Clients (where the devices are detected as RDMA) and servers use the same switch, so the switch configuration is OK.

So I assume this is a BeeGFS issue.
I also noticed some strange behaviour on the servers (not the clients):
beegfs-ctl reports "Unrecoverable error: No connAuthFile configured", even though connAuthFile is configured on all meta and storage servers and on the clients. BeeGFS itself is working fine (7.3.3), and since connAuthFile is mandatory in 7.3.3, I assume it is configured correctly.

Steve

Tore H. Larsen

Sep 1, 2023, 5:51:08 AM9/1/23
to fhgfs...@googlegroups.com

Which version of MOFED is it?  


connAuthFile                  = /etc/beegfs/connauthfile
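For reference, the shared secret that config line points at is commonly generated with dd and then copied verbatim to every node. A sketch (writing to a local demo path rather than the real /etc/beegfs location):

```shell
# Sketch: create a 128-byte random shared secret for connAuthFile.
# In production this would be /etc/beegfs/connauthfile, identical on all
# meta, storage, management and client nodes.
dd if=/dev/urandom of=connauthfile bs=128 count=1 2>/dev/null
chmod 400 connauthfile
```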
 

--Tore


Steve Eppert

Sep 1, 2023, 6:16:21 AM9/1/23
to beegfs-user
All machines (servers and clients) use the same CentOS 8.2 installation with MOFED 5.1.
connAuthFile exists and is set in the server config; otherwise the server would not start (or would it?)

Steve

Tore H. Larsen

Sep 1, 2023, 6:27:41 AM9/1/23
to fhgfs...@googlegroups.com

> MOFED 5.1

# rpm -qa |grep -i  mlnx-ofed-kernel-dkms


> connAuthFile exists and is configured in the server config, otherwise the server would not start (or?)

strace shows that the beegfs-ctl command opens the connAuthFile referenced in beegfs-client.conf. Do you have beegfs-client installed on the servers?

Maybe that's why it fails for you. 


--Tore


Steve Eppert

Sep 1, 2023, 6:31:55 AM9/1/23
to beegfs-user
Ah, that makes sense. There is no client on the servers, so the message seems normal.

# rpm -qa |grep -i  mlnx-ofed-kernel-dkms
mlnx-ofa_kernel-5.1-OFED.5.1.2.3.7.1.rhel8u2.x86_64
mlnx-ofa_kernel-modules-5.1-OFED.5.1.2.3.7.1.kver.4.18.0_193.19.1.el8_2.x86_64.x86_64
mlnx-ofa_kernel-devel-5.1-OFED.5.1.2.3.7.1.rhel8u2.x86_64

In general, RDMA works fine between servers and clients: ib_send_lat works and shows ~2 µs latency. The clients, where BeeGFS correctly identifies the NICs as RDMA, run the same software versions.

Steve