I don't know your deployment, but everything (NFSoRDMA + BeeGFS) looks too slow given that you say you use 24 NVMe drives plus a single NVMe.
When you compare writing a single 100GB file over NFSoRDMA with writing it to BeeGFS, which uses the "default" 512k chunks, BeeGFS writes >200000 object files for it,
so there are still a lot of extra inode allocations and communication overhead, which is normally spread over a couple of hosts/daemons,
and a single storage pool is expected to be slower every time.
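
For reference, the >200000 figure is just the file size divided by the chunk size. A minimal sketch of that arithmetic, assuming "100GB" means 100 GiB and "512k" means 512 KiB, and taking one object file per chunk as given (the function and constant names are only illustrative, not anything from BeeGFS):

```python
def chunk_count(file_bytes: int, chunk_bytes: int) -> int:
    """Number of chunks needed to hold file_bytes, rounding up."""
    return -(-file_bytes // chunk_bytes)  # ceiling division on integers


KIB = 1024
GIB = 1024 ** 3

file_size = 100 * GIB    # assumption: "100GB" read as 100 GiB
chunk_size = 512 * KIB   # assumption: "512k" read as 512 KiB

print(chunk_count(file_size, chunk_size))  # 204800
```

If you read 100GB as 100 * 10^9 bytes instead, the count comes out a bit under 200000, so the order of magnitude is the same either way.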
You should look for your bottlenecks (CPU, I/O, network) from the ground up, and only last go to BeeGFS to configure the number of daemons (chunk size and stripe settings are useless in your single-target case).