GPU commands

Simey

Jul 17, 2018, 4:27:07 PM
to beast-users
Hello,
I recently compared a large BEAST v2.5 run on two Linux servers with Nvidia GP100 GPUs and on a Windows machine.
The Windows machine has 64 GB RAM, two Xeon E5-2650 v4 2.2 GHz CPUs, and an Nvidia GeForce GTX 1050 Ti GPU.
Ubuntu server 1 has 1 TB RAM, four Xeon E5-4657L v2 2.4 GHz CPUs, and an Nvidia GP100 GPU.
Ubuntu server 2 has 256 GB RAM, four AMD EPYC 7601 CPUs, and an Nvidia GP100 GPU.

For the Windows run I used the BEAST GUI (checked "use BEAGLE library" and "Prefer use GPU", thread pool size = auto, precision = double).

For the server 1 run I used:
~/beast2/beast -threads 8 abbreviated_phylogeneny_with_fossils_0_outgroups.xml >& results &

For the server 2 GPU run I used:
~/beast2/beast -beagle_GPU -beagle_order 1,2,1,2,1,2,1,2 -threads 8 abbreviated_phylogeneny_with_fossils_0_outgroups.xml  >& results

I expected the server 2 GPU run to be the fastest, but the Windows GPU run is significantly faster:
Windows GPU = 33s/Msamples
Server 1 CPU = 5m39s/Msamples
Server 2 GPU = 21m1s/Msamples
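
To put the three timings on a common scale, here is a small Python sketch that converts them to seconds and expresses each as a multiple of the Windows run. The helper name `to_seconds` and the `timings` dictionary are illustrative only; the numbers are the ones listed above.

```python
import re


def to_seconds(duration):
    """Convert a string such as '33s', '5m39s' or '1h2m3s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", duration)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognised duration: {duration!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


# Time per million samples, as reported for the three runs.
timings = {
    "Windows GPU": "33s",
    "Server 1 CPU": "5m39s",
    "Server 2 GPU": "21m1s",
}

baseline = to_seconds(timings["Windows GPU"])

# How many times slower each run is than the Windows GPU run.
slowdown = {name: to_seconds(t) / baseline for name, t in timings.items()}

for name, factor in slowdown.items():
    print(f"{name}: {to_seconds(timings[name])} s/Msamples, {factor:.1f}x")
```

On these figures the server 2 GPU run is roughly 38 times slower than the Windows run and close to 4 times slower than the server 1 CPU run.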

My server commands must be wrong. 
What would be the equivalent Linux command for the Windows GPU run?
Thanks



Miller, Mark

Jul 17, 2018, 5:20:39 PM
to beast...@googlegroups.com

Hi,

We find the speedup depends very much on the data characteristics.

Here is our table of optimal configurations for nucleotide data:

Data        Data            Slurm                                    Other BEAGLE
partitions  patterns        partition   -threads  -instances  GPUs   parameters

1           <750            shared      1         1                  -beagle_SSE
1           750-4,999       shared      3         3                  -beagle_SSE
1           5,000-9,999     shared      6         6                  -beagle_SSE
1           10,000-49,999   gpu-shared  1         1           1      -beagle_GPU
1           >=50,000        gpu         4         4           4      -beagle_GPU -beagle_order 1,2,3,4

2 or 3      <750            shared      1         1                  -beagle_SSE
2 or 3      750-2,999       shared      3         3                  -beagle_SSE
2 or 3      >=3,000         shared      4         4                  -beagle_SSE

>=4         <1,200          shared      1         1                  -beagle_SSE
>=4         1,200-2,999     shared      3         3                  -beagle_SSE
>=4         >=3,000         shared      4         4                  -beagle_SSE

 

 

And for amino acids:

Partitions  Patterns        Queue       -threads  -instances  GPUs   Other

1           <5,000          gpu-shared  1         1           1      -beagle_GPU
1           >=5,000         gpu         4         4           4      -beagle_GPU -beagle_order 1,2,3,4
2 to 39     any             gpu-shared  1         1           1      -beagle_GPU
>=40        any             shared      4         4                  -beagle_SSE
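
For anyone who wants to script the choice, the two tables above can be encoded as a small lookup. This is only a sketch in Python: the names `TABLES`, `suggest_options` and `beast_arguments` are made up for illustration, blank GPU cells are read as 0, and the queue column is left out because queue names differ between clusters. The tables do not say whether pattern counts are per partition or in total, so the function simply takes whatever number you pass in.

```python
import math

SSE = "-beagle_SSE"
GPU = "-beagle_GPU"
GPU_X4 = "-beagle_GPU -beagle_order 1,2,3,4"
INF = math.inf

# Each entry: (min_partitions, max_partitions, rows). Each row is
# (exclusive upper bound on patterns, threads and instances, GPUs, flags).
# In every row of the tables, -threads and -instances have the same value.
TABLES = {
    "nucleotide": [
        (1, 1, [(750, 1, 0, SSE), (5000, 3, 0, SSE), (10000, 6, 0, SSE),
                (50000, 1, 1, GPU), (INF, 4, 4, GPU_X4)]),
        (2, 3, [(750, 1, 0, SSE), (3000, 3, 0, SSE), (INF, 4, 0, SSE)]),
        (4, INF, [(1200, 1, 0, SSE), (3000, 3, 0, SSE), (INF, 4, 0, SSE)]),
    ],
    "amino_acid": [
        (1, 1, [(5000, 1, 1, GPU), (INF, 4, 4, GPU_X4)]),
        (2, 39, [(INF, 1, 1, GPU)]),
        (40, INF, [(INF, 4, 0, SSE)]),
    ],
}


def suggest_options(data_type, partitions, patterns):
    """Return (threads, instances, gpus, flags) from the matching table row."""
    if partitions < 1 or patterns < 0:
        raise ValueError("partitions must be >= 1 and patterns >= 0")
    for low, high, rows in TABLES[data_type]:
        if low <= partitions <= high:
            for upper, n, gpus, flags in rows:
                if patterns < upper:
                    return n, n, gpus, flags
    raise ValueError("no matching row")


def beast_arguments(data_type, partitions, patterns):
    """Format the suggested options as a BEAST argument string."""
    threads, instances, _gpus, flags = suggest_options(
        data_type, partitions, patterns)
    return f"-threads {threads} -instances {instances} {flags}"


print(beast_arguments("nucleotide", 1, 12000))
print(beast_arguments("amino_acid", 40, 800))
```

The thresholds are copied from the tables as posted; they describe one cluster and may not transfer to other hardware.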

 

Best,

Mark


Artem B

Jul 18, 2018, 3:52:03 AM
to beast-users
Hi Mark! What does 'data partitions' mean?

On Wednesday, July 18, 2018 at 5:20:39 UTC+8, Mark Miller wrote:

Miller, Mark

Jul 18, 2018, 10:41:50 AM
to beast...@googlegroups.com

Hi.

This is just the number of partitions your data set has, as set up in BEAUti.

Simey

Jul 18, 2018, 12:52:51 PM
to beast-users
Thanks Mark, good info. But I would like a direct comparison between the GPU cards, and had hoped someone would know the command-line equivalent of the Windows GUI run. I checked the log file and cannot find the command there.

Remco Bouckaert

Jul 18, 2018, 4:01:59 PM
to beast...@googlegroups.com
The command you used is correct, but the difference in timing is suspiciously large. One possibility is that the GPU on the server is not accessible (for example, due to library path settings). You can verify that by looking at the screen output, which should contain 8 lines with something resembling:

  Using BEAGLE version: 2.1.2 resource 0: GPU

if it uses GPUs. If BEAGLE cannot be found, the output will instead contain lines like this:

TreeLikelihood(treeLikelihood) uses BeerLikelihoodCore4
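
If the screen output was redirected to a file, that check can be scripted. The following Python sketch only looks for the two kinds of lines quoted above; the function name `summarize_beagle_usage` and the regular expressions are assumptions based on those quoted lines, and the real output may be worded differently, so it returns the resource descriptions for you to inspect rather than deciding by itself whether a GPU was used.

```python
import re

# Modelled on the line "Using BEAGLE version: 2.1.2 resource 0: GPU".
BEAGLE_LINE = re.compile(
    r"Using BEAGLE version:\s*(\S+)\s+resource\s+(\d+):\s*(.*)")

# Modelled on "TreeLikelihood(treeLikelihood) uses BeerLikelihoodCore4".
FALLBACK_LINE = re.compile(r"uses\s+BeerLikelihoodCore")


def summarize_beagle_usage(log_text):
    """Collect BEAGLE resource lines and count non-BEAGLE likelihood cores."""
    resources = []  # (resource index, description) per BEAGLE instance
    fallbacks = 0   # likelihoods that did not go through BEAGLE
    for line in log_text.splitlines():
        match = BEAGLE_LINE.search(line)
        if match:
            resources.append((int(match.group(2)), match.group(3).strip()))
        elif FALLBACK_LINE.search(line):
            fallbacks += 1
    return {"beagle_resources": resources, "java_fallbacks": fallbacks}


sample = """\
  Using BEAGLE version: 2.1.2 resource 0: GPU
TreeLikelihood(treeLikelihood) uses BeerLikelihoodCore4
"""
print(summarize_beagle_usage(sample))
```

For the run in question you would pass in the contents of the `results` file that the command redirected its output to. An empty `beagle_resources` list together with a non-zero `java_fallbacks` count would point to BEAGLE not being found.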

Another possibility is that the threading overhead outweighs the gain from running in parallel. Running with fewer threads (possibly just 1) may speed things up as well.

Remco

