We're migrating our HPC storage from Lustre to BeeGFS 7.3.0, running on top of InfiniBand.
Since we already had one shared SAN array for storage and another for metadata, we're keeping that layout. For HA we use two storage servers and two metadata servers. Each storage server sees two volumes on the array, giving us an active-active setup; the same goes for metadata.
The setup and the failover work well. The only "strange" thing I see is that the virtual IPs are not being used; the real IPs are used instead.
# pcs resource
* Resource Group: beegfs_storage1:
* VIP-storage1 (ocf::heartbeat:IPaddr2): Started oss-1
* disk-storage1 (ocf::heartbeat:Filesystem): Started oss-1
* beegfs-storage1 (systemd:beegfs-storage@storage1): Started oss-1
* Resource Group: beegfs_storage2:
* VIP-storage2 (ocf::heartbeat:IPaddr2): Started oss-2
* disk-storage2 (ocf::heartbeat:Filesystem): Started oss-2
* beegfs-storage2 (systemd:beegfs-storage@storage2): Started oss-2
The metadata resource groups look the same.
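For reference, each of the storage groups was created along these lines (a sketch; the netmask, device path, mount point, and filesystem type here are placeholders, not our exact values):
# pcs resource create VIP-storage1 ocf:heartbeat:IPaddr2 ip=10.1.7.4 cidr_netmask=24 --group beegfs_storage1
# pcs resource create disk-storage1 ocf:heartbeat:Filesystem device=/dev/mapper/storage1 directory=/data/storage1 fstype=xfs --group beegfs_storage1
# pcs resource create beegfs-storage1 systemd:beegfs-storage@storage1 --group beegfs_storage1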
I'm using connNetFilterFile, which as I understand it only applies to destination IP addresses. Is that correct? It does not filter on source IP addresses, and it's NOT something like bindToIP.
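For context, my understanding is that the file is simply a list of allowed subnets in CIDR notation, one per line; something like this for our IB network (the /24 is an assumption about the netmask):
10.1.7.0/24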
The nodes register with their real IPs:
[ID: 1]: reachable at 10.1.7.21:8003
[ID: 2]: reachable at 10.1.7.22:8004
# beegfs-ctl --listnodes --nodetype=storage --nicdetails
storage1-ib0.example.com
Ports: UDP: 8003; TCP: 8003
+ ib0[ip addr: 10.1.7.21; type: RDMA] (REAL IP HERE)
+ ib0[ip addr: 10.1.7.4; type: RDMA] (VIP HERE)
+ ib0[ip addr: 10.1.7.21; type: TCP] (REAL IP HERE)
+ ib0[ip addr: 10.1.7.4; type: TCP] (VIP HERE)
storage2-ib0.example.com
Ports: UDP: 8004; TCP: 8004
+ ib0[ip addr: 10.1.7.22; type: RDMA] (REAL IP HERE)
+ ib0[ip addr: 10.1.7.5; type: RDMA] (VIP HERE)
+ ib0[ip addr: 10.1.7.22; type: TCP] (REAL IP HERE)
+ ib0[ip addr: 10.1.7.5; type: TCP] (VIP HERE)
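(Side note: to see which of the addresses above a client actually ends up connecting to, running beegfs-net on a client lists the established connections per server node.)
# beegfs-net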
So, is there any way to force BeeGFS to use only the VIPs?
Listing only the VIPs in connNetFilterFile does not work (see the attempt below).
connInterfacesFile already lists only ib0.
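The VIP-only attempt looked like this (assuming /32 entries are the right way to express single hosts in that file):
10.1.7.4/32
10.1.7.5/32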
If there is no such option, I probably don't need VIPs at all.
Or maybe I don't need the real IPs, although even in that case, after a failover I would see both VIPs on the surviving storage server, which would still be wrong.
I'm running Rocky Linux 8.5, since that is the latest version supported by MLNX_OFED 4.9-0.1.7.0, which is what my ConnectX-3 cards need.
Is anyone using the inbox drivers from RHEL instead of MLNX_OFED?
Are they stable?
Performance-wise, is there a big difference?
I would like to do updates faster this time, since in the past both MLNX_OFED and Lustre kept the cluster quite outdated.