TennisBowling
Jan 27, 2025, 1:00:01 PM
to beegfs-user
Hello all,
I am experiencing unexpectedly low IOPS performance with my BeeGFS cluster and would greatly appreciate any insights or troubleshooting advice you can offer.
Problem:
My BeeGFS setup delivers far fewer IOPS than I expect, especially compared to what the NVMe SSDs achieve directly. Running `fio` with a random read/write 4 KB workload at iodepth 32 against my BeeGFS mount (`/mnt/beegfs/fast`), I get around 6,000 read and 6,000 write IOPS. Testing the NVMe SSDs directly on the storage servers with the same `fio` parameters, I get around 70,000 read and write IOPS. Mounting one drive on another node via NFS over RDMA costs only a small fraction of that, nowhere near the loss I see with BeeGFS.
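As a rough back-of-the-envelope check on the numbers above, the sketch below converts the reported IOPS into throughput and into an implied average per-I/O latency using Little's law (latency ≈ outstanding I/Os / IOPS). It rests on assumptions not stated in the post: a single `fio` job keeping all 32 I/Os in flight, 4 KiB blocks, and the direct NVMe figure of 70,000 being per direction (read and write each). The function names are purely illustrative.

```python
# Back-of-the-envelope comparison of the reported fio results.
# Assumptions (not from the post): one fio job, queue kept full at iodepth 32,
# 4 KiB blocks, and "70,000 read and write IOPS" meaning 70,000 each.

IODEPTH = 32
BLOCK_BYTES = 4 * 1024


def implied_latency_ms(iodepth: int, total_iops: float) -> float:
    """Average per-I/O latency in ms from Little's law: L = lambda * W."""
    return iodepth / total_iops * 1000.0


def throughput_mib_s(total_iops: float, block_bytes: int) -> float:
    """Aggregate throughput in MiB/s for a fixed block size."""
    return total_iops * block_bytes / (1024 * 1024)


results = {
    "BeeGFS mount": 6_000 + 6_000,      # read + write IOPS from the post
    "direct NVMe": 70_000 + 70_000,     # assumed to be per direction
}

for label, iops in results.items():
    print(
        f"{label}: {iops} IOPS total, "
        f"{throughput_mib_s(iops, BLOCK_BYTES):.1f} MiB/s, "
        f"~{implied_latency_ms(IODEPTH, iops):.2f} ms per I/O"
    )
```

Under these assumptions the throughput is tiny compared to the link speed, which suggests the gap is a per-operation latency cost rather than a bandwidth limit.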
My Setup:
- Nodes: Two nodes running Ubuntu Server (latest LTS):
  - `node1` (192.168.10.1): Storage Service, Client
  - `node2` (192.168.10.2): Management Service, Metadata Service, Storage Service, Client
- Network:
  - Mellanox ConnectX-3 (mlx4_0) cards on both machines
  - Running in Ethernet mode (RoCE), with `connUseRDMA = true` in the BeeGFS configs
  - Two 40 Gbps ports bonded as `bond0` in `balance-rr` mode. Intended for 80 Gbps aggregate; iperf shows only about 50 Gbps, which is fine for my purposes.
- Storage:
  - 3 NVMe SSDs: 1 on node1, 2 on node2
- BeeGFS Configuration:
  - BeeGFS version: 7.4.5 (installed from the official repository)
  - Volume: "fast-pool" using NVMe targets from both servers, mounted at `/mnt/beegfs/fast`
  - Stripe pattern: RAID0, chunk size 128K (for IOPS testing), currently testing with 3 storage targets from the NVMe pool.
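To illustrate what this stripe setting means for the 4 KB test, here is a minimal sketch of generic round-robin RAID0 striping with a 128 KiB chunk size and 3 targets. It assumes chunks are assigned to targets in simple round-robin order starting at target 0; the function names are hypothetical and this is not BeeGFS code.

```python
# Generic RAID0 striping sketch (assumption: plain round-robin chunk placement;
# this is illustrative, not taken from BeeGFS itself).

CHUNK_BYTES = 128 * 1024
NUM_TARGETS = 3


def target_for_offset(offset_bytes: int,
                      chunk_bytes: int = CHUNK_BYTES,
                      num_targets: int = NUM_TARGETS) -> int:
    """Index of the target holding the chunk that contains this file offset."""
    return (offset_bytes // chunk_bytes) % num_targets


def chunks_touched(offset_bytes: int, length_bytes: int,
                   chunk_bytes: int = CHUNK_BYTES) -> int:
    """Number of distinct chunks a single I/O of this length spans."""
    first = offset_bytes // chunk_bytes
    last = (offset_bytes + length_bytes - 1) // chunk_bytes
    return last - first + 1


# A 4 KiB-aligned 4 KiB I/O never crosses a 128 KiB chunk boundary,
# so each request is served by exactly one target.
for offset in (0, 4096, 128 * 1024, 256 * 1024, 384 * 1024):
    print(offset, target_for_offset(offset), chunks_touched(offset, 4096))
```

With aligned 4 KB requests, each I/O goes to a single target, so striping spreads the random load across the three targets but does not split individual requests.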
Questions:
- Does the low IOPS performance seem unusual for this type of setup?
- What could be the potential bottlenecks in my configuration?
- Are there any specific BeeGFS tuning parameters or best practices for maximizing IOPS with RDMA and NVMe storage that I should consider?
- Any suggestions for further troubleshooting steps or diagnostic tools I can use?
- Has anyone experienced similar IOPS limitations with Mellanox ConnectX-3 cards and BeeGFS over RDMA?
Any help or suggestions would be greatly appreciated. Thank you in advance for your time and expertise.
All The Best,
TennisBowling