GPU memory bandwidth relationship to performance

Jared Sabre

4:48 PM
to beast-users
Hi all. I work in IT at a university and I'm trying to help one of our labs speed up their BEAST runs. I believe the postdocs in the lab are following a sort of "tribal knowledge" script to run BEAST on our somewhat dated HPC cluster, and they don't seem to know what resources they are requesting. The two datasets they gave me reportedly took 7 and 20 days, respectively, to run.

My question about GPU VRAM bandwidth comes from the following observation.

Taking the "7 day" XML file, running on a desktop with an RTX 4090, I am initially seeing ~5.4min/million states while NVTOP has GPU usage locked in at 100%.
Running the same "7 day" XML file on a desktop with a new RTX Pro 6000 Max-Q, using all the same settings I am seeing ~2.6min/million states but NVTOP shows GPU usage ~43-48%.
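
As a rough sanity check, the observed speedup (5.4/2.6, about 2.1x) is not far off the ratio of the two cards' published memory bandwidths (roughly 1.8x, going by spec-sheet numbers I have not verified myself), which is part of why I suspect bandwidth. A quick back-of-the-envelope in Python:

rtx_4090_min_per_mstates = 5.4
rtx_pro_6000_min_per_mstates = 2.6

# Spec-sheet memory bandwidths (GB/s), not measured values
rtx_4090_bw_gbs = 1008       # RTX 4090
rtx_pro_6000_bw_gbs = 1792   # RTX Pro 6000 Max-Q

observed_speedup = rtx_4090_min_per_mstates / rtx_pro_6000_min_per_mstates
bandwidth_ratio = rtx_pro_6000_bw_gbs / rtx_4090_bw_gbs
print(f"observed speedup: {observed_speedup:.2f}x")  # ~2.08x
print(f"bandwidth ratio:  {bandwidth_ratio:.2f}x")   # ~1.78x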

In both cases I made sure double precision and CUDA (not OpenCL) were flagged. I also read in a paper that the likelihood kernels can be heavily memory-bandwidth bound, so I'd like to know: what are the limits on performance gains as memory bandwidth scales up? If the kernels are mostly waiting on memory operations, would it stand to reason that tracking down an H100 NVL, with ~4 TB/s of memory bandwidth, would give an improvement over my card's 1.8 TB/s?
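
If the kernels really are bandwidth bound, my naive mental model is that run time would improve at most in proportion to the bandwidth ratio, something like the sketch below (the H100 NVL figure is the per-GPU spec bandwidth, and this ignores compute, latency, and host-to-device transfer effects entirely):

current_min_per_mstates = 2.6   # observed on the RTX Pro 6000 Max-Q
current_bw_tbs = 1.8            # RTX Pro 6000 Max-Q spec bandwidth (TB/s)
h100_nvl_bw_tbs = 3.9           # H100 NVL spec bandwidth per GPU (TB/s)

# Best case, assuming a purely memory-bandwidth-bound kernel
best_case = current_min_per_mstates * current_bw_tbs / h100_nvl_bw_tbs
print(f"best case on H100 NVL: ~{best_case:.1f} min/million states")  # ~1.2

Obviously that's only an upper bound, which is why I'm asking where the real limits are.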

Thank you!

