Performance of test setup


Sebastian

Jun 7, 2024, 6:38:38 AM
to beegfs-user
Hello,

I would like to use BeeGFS as our cluster scratch file system, so I set up BeeGFS on one storage server to see whether it is an improvement over the NFS we had before.
So mgmtd, meta, and storage are all on this one server.
The network is 2x10G on the test client side and 4x10G on the server.

For storage I have 14x2.9T NVMe in md-raid6, and for meta 2x128G Optane PMem in md-raid0.
I know this is not enough for a proper production setup, but I cannot request any money before I solve the performance problem.

The performance is bad not only over the network or due to missing RDMA; it is also bad when I start the beegfs-client directly on the storage server. I ran several tests with fio, but the simplest way to show the problem is:

dd if=/dev/zero of=zero1.tmp bs=32M count=32 conv=fdatasync
32+0 records in
32+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 713.606 s, 1.5 MB/s
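
For reference, a roughly equivalent fio job would be the following; the mount point /mnt/beegfs is assumed, not taken from the actual setup:

# sequential 32M writes with fdatasync after every write, like the dd test above
fio --name=seqwrite --directory=/mnt/beegfs --rw=write --bs=32M --size=1G --fdatasync=1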

This is so incredibly bad that there must be a major problem that cannot be solved by tuning or better hardware.

There is nothing in the logs, the CPU is almost idle, and the server stats are:
Total results for 1 nodes:
time_index write_KiB  read_KiB   reqs   qlen bsy
1717752190      6144         0     12      0   2
1717752191      4096         0      8      0   2
1717752192      5120         0     10      0   2
1717752193      4608         0      9      0   2
1717752194      5120         0     10      0   2
1717752195      9728         0     19      0   4
1717752196      4096         0      8      0   2
1717752197      6656         0     13      0   2
1717752198      4096         0      8      0   2
1717752199      4096         0      8      0   2
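
Stats in this format are what beegfs-ctl prints in its server-stats mode; a command along these lines (the one-second interval is assumed) produces them:

beegfs-ctl --serverstats --nodetype=storage --interval=1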

Any idea what I did wrong and where I can start looking?



Fedor Pollak

Jun 7, 2024, 7:08:52 AM
to beegfs-user
Not sure what the problem is, but try the same dd command directly on the file system, without BeeGFS. If the result is just as bad, then maybe test without md-raid (disk by disk), etc.
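
A minimal sketch of such a per-disk check; the device name is a placeholder, and the job only reads, so no data is overwritten:

# read-only throughput test against one raw NVMe device
fio --name=disk-read --filename=/dev/nvme0n1 --rw=read --bs=1M --direct=1 --runtime=10 --time_based --readonly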

Best regards,
Fedor Pollak

Sebastian

Jun 7, 2024, 8:19:54 AM
to beegfs-user
Thank you for the quick reply.
I tested the underlying file systems, and the performance is much better.
For example, on the storage volume:
dd if=/dev/zero of=zero1.tmp bs=32M count=32 conv=fdatasync
32+0 records in
32+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.65583 s, 648 MB/s

fio shows similar results; the file system is much faster than BeeGFS in every test.

Andreas Skau

Jun 7, 2024, 8:20:00 AM
to fhgfs...@googlegroups.com
Just a few initial questions that might point you at the bottleneck:

- Did you check the performance directly on the storage volume (not via BeeGFS)?
- How did you create and tune the storage volume? Which fs, which mount options, etc.? (see the sketch below)
- Have you done performance tuning on the host OS?

Also, I think you would be better off using zfs instead of md-raid6 (but your numbers are so bad that I don't think this explains all of it).
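
As a sketch of what creating and tuning the volume might look like, assuming xfs on the md device (device, paths, and geometry are placeholders; su/sw must match the real raid layout):

# e.g. 14-disk raid6 = 12 data disks, 512k chunk; adjust to the actual array
mkfs.xfs -d su=512k,sw=12 /dev/md0
mount -o noatime,nodiratime /dev/md0 /data/beegfs_storage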

Hope this helps,
Andreas


Scooter Morris

Jun 11, 2024, 5:36:20 PM
to fhgfs...@googlegroups.com

Adding a couple of thoughts here.

First, the local filesystem will always be faster than BeeGFS -- that's a given, and no surprise.

Second, your setup is sort of a worst case. BeeGFS is also slower than an NFS server serving a small number of clients. BeeGFS is designed to be a parallel file system for when you have hundreds of clients accessing it. You will only see a performance advantage from BeeGFS when you have multiple clients accessing multiple metadata servers and multiple storage servers. We regularly get > 2-3 million IOPS and aggregate file system performance of 5 GiB/s, with peaks of over 18 GiB/s. This is using 6 metadata pairs and 24 storage servers, serving ~500 servers and an aggregate storage of 12 PB. So, in order to test BeeGFS, I would definitely separate the metadata from the storage servers and use multiple storage servers to get the advantage of parallelism.
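
As a rough sketch of splitting the roles with the standard setup scripts (hostnames, numeric IDs, and paths below are placeholders):

# dedicated management node
/opt/beegfs/sbin/beegfs-setup-mgmtd -p /data/beegfs_mgmtd
# each metadata node: numeric node id plus the management host
/opt/beegfs/sbin/beegfs-setup-meta -p /data/beegfs_meta -s 2 -m mgmt01
# each storage node: node id, target id, management host
/opt/beegfs/sbin/beegfs-setup-storage -p /data/beegfs_storage -s 3 -i 301 -m mgmt01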

-- scooter 
