GPU commands

Simey

Jul 17, 2018, 4:27:07 PM

to beast-users

Hello,

I recently compared a large BEAST v2.5 run on two Linux servers with Nvidia GP100 GPUs and a Windows machine.

The Windows machine has 64 GB RAM, two Xeon E5-2650 v4 2.2 GHz CPUs + Nvidia GeForce GTX 1050 Ti GPU

Ubuntu server 1 has 1 TB RAM, four Xeon E5-4657L v2 2.4 GHz CPUs + Nvidia GP100 GPU

Ubuntu server 2 has 256 GB RAM, four AMD EPYC 7601 CPUs + Nvidia GP100 GPU

For the Windows run I used the BEAST GUI (checked "Use BEAGLE library" and "Prefer use GPU"; thread pool size = auto; precision = double).

For the server 1 run I used:

~/beast2/beast -threads 8 abbreviated_phylogeneny_with_fossils_0_outgroups.xml >& results &

For the server 2 GPU run I used:

~/beast2/beast -beagle_GPU -beagle_order 1,2,1,2,1,2,1,2 -threads 8 abbreviated_phylogeneny_with_fossils_0_outgroups.xml >& results

I expected the server 2 GPU run to be the fastest, but the Windows GPU run is significantly faster:

Windows GPU = 33s/Msamples

Server 1 CPU = 5m39s/Msamples

Server 2 GPU = 21m1s/Msamples

My server commands must be wrong.

What would be the equivalent Linux command for the Windows GPU run?

Thanks
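To put the three timings above on one scale, here is a small sketch that converts them to seconds and prints the slowdown ratios. The helper names `to_seconds` and `slowdown` are made up for this illustration, and the durations are assumed to use the `NmNs` form shown in the post.

```shell
#!/bin/sh
# Convert a duration such as "5m39s" or "33s" to whole seconds.
# Assumes plain decimal fields without leading zeros.
to_seconds() {
  case $1 in
    *m*) min=${1%%m*}; rest=${1#*m} ;;
    *)   min=0;        rest=$1 ;;
  esac
  sec=${rest%s}
  echo $(( min * 60 + ${sec:-0} ))
}

# slowdown BASE OTHER: how many times slower OTHER is than BASE.
slowdown() {
  awk -v b="$(to_seconds "$1")" -v o="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f\n", o / b }'
}

# Timings per million samples, as reported in the post.
echo "Server 1 CPU vs Windows GPU: $(slowdown 33s 5m39s)x slower"
echo "Server 2 GPU vs Windows GPU: $(slowdown 33s 21m1s)x slower"
echo "Server 2 GPU vs Server 1 CPU: $(slowdown 5m39s 21m1s)x slower"
```

On these numbers the server 2 GPU run is roughly 38 times slower than the Windows GPU run and also slower than the CPU-only run on server 1, which is what makes the result look like a configuration problem rather than a hardware difference.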

Miller, Mark

Jul 17, 2018, 5:20:39 PM

to beast...@googlegroups.com

Hi,

We find the speedup depends very much on the data characteristics.

Here is our table for optimal configuration for nucleotide data:

Data        Data            Slurm                                   Other beagle
partitions  patterns        partition   -threads  -instances  GPUs  parameters
1           <750            shared      1         1           -     -beagle_SSE
1           750-4,999       shared      3         3           -     -beagle_SSE
1           5,000-9,999     shared      6         6           -     -beagle_SSE
1           10,000-49,999   gpu-shared  1         1           1     -beagle_GPU
1           >=50,000        gpu         4         4           4     -beagle_GPU -beagle_order 1,2,3,4
2 or 3      <750            shared      1         1           -     -beagle_SSE
2 or 3      750-2,999       shared      3         3           -     -beagle_SSE
2 or 3      >=3,000         shared      4         4           -     -beagle_SSE
>=4         <1,200          shared      1         1           -     -beagle_SSE
>=4         1,200-2,999     shared      3         3           -     -beagle_SSE
>=4         >=3,000         shared      4         4           -     -beagle_SSE

And for amino acids:

Partitions  Patterns  Queue       -threads  -instances  GPUs  Other
1           <5,000    gpu-shared  1         1           1     -beagle_GPU
1           >=5,000   gpu         4         4           4     -beagle_GPU -beagle_order 1,2,3,4
2 to 39     any       gpu-shared  1         1           1     -beagle_GPU
>=40        any       shared      4         4           -     -beagle_SSE
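The two tables can also be read as a lookup rule. The sketch below encodes them as a shell function that prints the suggested flags for a data type, partition count, and pattern count. The function names (`recommend_flags` and the three helpers) are hypothetical, the thresholds are copied from the tables above, and the Slurm partition/queue column is left out because it is specific to the cluster the tables were written for.

```shell
#!/bin/sh
# Helpers that print the flag sets used in the tables.
sse_flags()  { echo "-beagle_SSE -threads $1 -instances $1"; }
gpu1_flags() { echo "-beagle_GPU -threads 1 -instances 1"; }
gpu4_flags() { echo "-beagle_GPU -beagle_order 1,2,3,4 -threads 4 -instances 4"; }

# recommend_flags TYPE PARTITIONS PATTERNS
# TYPE: nt = nucleotide, aa = amino acid (labels chosen for this sketch).
recommend_flags() {
  dtype=$1 parts=$2 pats=$3
  if [ "$dtype" = aa ]; then
    # Amino acid table: pattern count only matters for a single partition.
    if   [ "$parts" -ge 40 ];  then sse_flags 4
    elif [ "$parts" -ge 2 ];   then gpu1_flags
    elif [ "$pats" -lt 5000 ]; then gpu1_flags
    else gpu4_flags
    fi
    return
  fi
  # Nucleotide table.
  if [ "$parts" -eq 1 ]; then
    if   [ "$pats" -lt 750 ];   then sse_flags 1
    elif [ "$pats" -lt 5000 ];  then sse_flags 3
    elif [ "$pats" -lt 10000 ]; then sse_flags 6
    elif [ "$pats" -lt 50000 ]; then gpu1_flags
    else gpu4_flags
    fi
  elif [ "$parts" -le 3 ]; then
    if   [ "$pats" -lt 750 ];  then sse_flags 1
    elif [ "$pats" -lt 3000 ]; then sse_flags 3
    else sse_flags 4
    fi
  else
    if   [ "$pats" -lt 1200 ]; then sse_flags 1
    elif [ "$pats" -lt 3000 ]; then sse_flags 3
    else sse_flags 4
    fi
  fi
}

# Example: single-partition nucleotide data with 20,000 patterns.
recommend_flags nt 1 20000
```

Note that under these tables the GPU is only suggested for nucleotide data with a single partition and at least 10,000 patterns; everything else is sent to the SSE CPU path.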

Best,

Mark


Artem B

Jul 18, 2018, 3:52:03 AM

to beast-users

Hi Mike! What does 'data partitions' mean?

On Wednesday, July 18, 2018 at 5:20:39 UTC+8, Mark Miller wrote:

Miller, Mark

Jul 18, 2018, 10:41:50 AM

to beast...@googlegroups.com

Hi.

This is just the number of partitions your data set has, as set up in BEAUti.

Simey

Jul 18, 2018, 12:52:51 PM

to beast-users

Thanks Mark, good info. But I would like a direct comparison between the GPU cards, and had hoped someone would know the command-line equivalent of the Windows GUI run. I checked the log file and cannot find the command there.

Remco Bouckaert

Jul 18, 2018, 4:01:59 PM

to beast...@googlegroups.com

The command you used is correct, but the difference in timing is suspiciously large. One possibility is that the GPU on the server is not accessible (for example, due to library path settings). You can verify that by looking at the screen output, which should contain 8 lines with something resembling:

  Using BEAGLE version: 2.1.2 resource 0: GPU

if it uses GPUs, but if BEAGLE cannot be found, it will contain lines with the following instead:

TreeLikelihood(treeLikelihood) uses BeerLikelihoodCore4
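Since the server runs above redirected screen output to a file (`>& results`), this check can be scripted. The sketch below is only an illustration: the function name `check_beagle` and its output labels are invented here, and the two search patterns are taken from the example lines quoted in this message, so they may need adjusting if the actual output is worded differently.

```shell
#!/bin/sh
# check_beagle LOGFILE: classify a captured BEAST screen log.
# Patterns are assumptions based on the two example lines quoted above.
check_beagle() {
  log=$1
  [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }
  gpu=$(grep -c 'Using BEAGLE.*GPU' "$log")    # BEAGLE instances on a GPU resource
  any=$(grep -c 'Using BEAGLE' "$log")         # BEAGLE instances on any resource
  java=$(grep -c 'BeerLikelihoodCore' "$log")  # lines reported when BEAGLE is not found
  if [ "$gpu" -gt 0 ]; then
    echo "gpu $gpu"
  elif [ "$any" -gt 0 ]; then
    echo "beagle-without-gpu $any"
  elif [ "$java" -gt 0 ]; then
    echo "no-beagle $java"
  else
    echo "unknown"
  fi
}
```

For the server 2 run this would be called as `check_beagle results`. With `-threads 8` the expected answer is `gpu 8`; `no-beagle` points at the library path problem described above.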

Another possibility is that the threading overhead outweighs the gain from running in parallel. Running with fewer threads (possibly just 1) may speed things up as well.
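One way to test the thread-count suggestion is to repeat the same run with several `-threads` values and compare the reported time per Msamples. The sketch below only prints the commands rather than running them. The function name `print_thread_sweep`, the default thread counts, and the per-run `threads_N` directories are assumptions made for this illustration (separate directories keep the runs from writing to the same output files), and the `-beagle_order` list from the original command is left out, so add it back if it is needed.

```shell
#!/bin/sh
# print_thread_sweep XML [THREADS...]: print one BEAST command per thread count.
# Defaults to 1 2 4 8 threads when no counts are given.
print_thread_sweep() {
  xml=$1; shift
  [ $# -gt 0 ] || set -- 1 2 4 8
  for n in "$@"; do
    # Each run gets its own directory; the XML is referenced from the parent.
    echo "mkdir -p threads_$n && (cd threads_$n && ~/beast2/beast -beagle_GPU -threads $n ../$xml > results 2>&1)"
  done
}

print_thread_sweep abbreviated_phylogeneny_with_fossils_0_outgroups.xml
```

The printed lines can be inspected first and then pasted or piped to `sh` once they look right for the machine in question.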

Remco

