Help on BEAST (2.4.5) command line usage


Terry Jones

Feb 20, 2017, 5:23:08 PM
to beast-users
Is there a document or web page anywhere that describes how to run BEAST (v2) from the command line?  I can run it with -help and see the various options, but I do not know what they do exactly or in which combination to use them. There doesn't seem to be anything on the beast2.org web site and in searching this group I don't see anything that helps with my specific issue.  I'm running BEAST on an Amazon p2.8xlarge (see https://aws.amazon.com/ec2/instance-types/) instance, which has 8 GPUs. But I don't know for sure that BEAST (via BEAGLE) is making use of them all. I've guessed at a command line:

$ time beast -beagle -beagle_GPU -threads 32

but that was just guesswork, in part based on reading posts in this group. In the output of that command, I see sections like

Filter 102-202
124 taxa
101 sites
77 patterns
  Using BEAGLE version: 2.1.2 resource 1: Tesla K80
    Global memory (MB): 11520
    Clock speed (Ghz): 0.82
    Number of cores: 2496
    with instance flags:  PRECISION_SINGLE COMPUTATION_SYNCH EIGEN_REAL SCALING_MANUAL SCALERS_RAW VECTOR_NONE THREADING_NONE PROCESSOR_GPU FRAMEWORK_CUDA
  Ignoring ambiguities in tree likelihood.
  Ignoring character uncertainty in tree likelihood.
  With 77 unique site patterns.
  Using rescaling scheme : dynamic

so I know at least one GPU is being used. $ beast -beagle_info lists the 16 GPU resources. So maybe my command line is fine?

In the output of beast -help, I see this line:

 -beagle_GPU BEAGLE: use GPU instance if available

Can someone tell me what the BEAGLE part of that line means? The command runs without error using just -beagle_GPU, but maybe I am supposed to replace BEAGLE with something? Later in the help output there is a similar line for -beagle_order BEAGLE, where it seems one must replace BEAGLE with something, but with what? I guess there must be some documentation on this, but I'm not seeing it - sorry!

Thanks for any help!

Terry

Terry Jones

Feb 25, 2017, 2:41:12 PM
to beast...@googlegroups.com
Is there anyone on this list using BEAST with GPUs? 

Terry


--
You received this message because you are subscribed to the Google Groups "beast-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beast-users+unsubscribe@googlegroups.com.
To post to this group, send email to beast...@googlegroups.com.
Visit this group at https://groups.google.com/group/beast-users.
For more options, visit https://groups.google.com/d/optout.

Remco Bouckaert

Feb 26, 2017, 3:09:19 PM
to beast...@googlegroups.com
Hi Terry,

You can tell which resources BEAGLE is using from this line in the output:

 Using BEAGLE version: 2.1.2 resource 1: Tesla K80

which states that it uses resource 1, which is the first available GPU. 
If you have multiple partitions, each partition should use its own GPU, but you can use the “-beagle_order” option (with a comma-separated list of BEAGLE resource numbers) to specify the order in which GPUs are assigned.
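As an illustration of how that assignment works (a Python sketch, not BEAST's actual code; the partition names and helper are made up), -beagle_order amounts to handing out the listed resources to partitions round-robin:

```python
from itertools import cycle

def assign_resources(partitions, beagle_order):
    """Round-robin assignment of partitions to BEAGLE resource numbers.

    Illustrative only: BEAST's real scheduling lives inside its BEAGLE
    tree-likelihood code and may differ in detail.
    """
    resources = cycle(beagle_order)
    return {p: next(resources) for p in partitions}

# Two partitions with "-beagle_order 1,2" put one partition on each GPU:
print(assign_resources(["Filter_1-101", "Filter_102-202"], [1, 2]))
# → {'Filter_1-101': 1, 'Filter_102-202': 2}
```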

The “-threads” option only helps with calculating partition likelihoods in parallel; having more threads than partitions neither helps nor hinders when using GPUs. So for most common types of analyses, your command line looks OK.

The “BEAGLE” part in the help only refers to the fact that the option is specific to BEAGLE, so it is of no use when BEAGLE is not installed.

Hope this helps,

Remco




Terry Jones

Feb 26, 2017, 3:42:12 PM
to beast...@googlegroups.com
Hi Remco

Yes, thanks, that's very helpful!  It sounds like having more GPUs than you have partitions is of no added value - is that right?

Terry



Remco Bouckaert

Feb 26, 2017, 7:33:02 PM
to beast...@googlegroups.com
Hi Terry,

It depends…

It is possible to split the calculation for a single partition over 2 or more GPUs using threading. If your alignment has few patterns, the overhead can be larger than the gain, but if there are many patterns it is possible to get improved performance. To see whether your data benefits from splitting, you just have to try.

To split calculations over alignments, use the “-instances” option. Make sure to use at least as many threads as there are BEAGLE instances, so for example

beast -instances 2 -threads 4 -beagle_GPU beast.xml

for an analysis with 2 partitions will use 4 threads (2 for each partition) and 4 GPUs.
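A rough way to picture the split (a hypothetical helper, not BEAST internals): each partition's site patterns are divided as evenly as possible across its BEAGLE instances, which is why partitions with few patterns gain little from extra instances:

```python
def split_patterns(n_patterns, n_instances):
    """Divide n_patterns site patterns as evenly as possible across
    n_instances BEAGLE instances (an illustration, not BEAST's code)."""
    base, extra = divmod(n_patterns, n_instances)
    return [base + (1 if i < extra else 0) for i in range(n_instances)]

# 77 patterns (as in the run quoted above) over 2 instances:
print(split_patterns(77, 2))  # → [39, 38]
```

With chunks this small, per-instance overhead can easily swamp the saved computation.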

Cheers,

Remco


Terry Jones

Mar 26, 2018, 10:54:15 AM
to beast-users
Hi again Remco

I've just run into this old thread & see I didn't respond or say thanks. So: thanks!

I've now managed to get BEAGLE to see one or more GPUs on the cluster I am trying to use, which is great.  So returning to your mail below, can you make a couple of things clearer for me?

1. I don't know what you mean by "patterns". I guess you don't mean sites, but maybe sites with some (diversity) property?
2. If I have just one partition and 4 GPUs, it seems I should run with -instances 4 -threads 4 -beagle_GPU -beagle file.xml

Do I also need -beagle_order 1,2,3,4 ?

Thanks again,
Terry

Remco Bouckaert

Mar 27, 2018, 3:37:34 PM
to beast...@googlegroups.com
Hi Terry,

> I've now managed to get BEAGLE to see one or more GPUs on the cluster I am trying to use, which is great.  So returning to your mail below, can you make a couple of things clearer for me?

How did you make this happen?

> 1. I don't know what you mean by "patterns". I guess you don't mean sites, but maybe sites with some (diversity) property?

A site pattern is a unique assignment of characters to taxa at a site. Since sites are assumed to be independent, calculating the likelihood for two sites that have the same pattern is not necessary: if you know the likelihood for the first site, it will be the same for the second. So, instead of calculating the likelihood for every site individually, the likelihood of each pattern is calculated once and its contribution is multiplied by the number of occurrences of that pattern. In general, there are fewer patterns than sites, so this saves a bit of calculation time.
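A minimal sketch of that idea (toy alignment with made-up sequences; BEAST does this compression internally):

```python
from collections import Counter

def site_patterns(alignment):
    """Count unique site patterns (columns) in an alignment given as
    a list of equal-length sequences, one per taxon."""
    columns = zip(*alignment)   # one tuple of characters per site
    return Counter(columns)     # pattern -> number of sites showing it

# 4 sites but only 3 unique patterns: the likelihood of the repeated
# column (C,C,T) is computed once, then weighted by its count of 2.
aln = ["ACCA",
       "ACCA",
       "ATTT"]
patterns = site_patterns(aln)
print(len(patterns), sum(patterns.values()))  # → 3 4
```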

> 2. If I have just one partition and 4 GPUs, it seems I should run with -instances 4 -threads 4 -beagle_GPU -beagle file.xml

That looks OK.

> Do I also need -beagle_order 1,2,3,4 ?

If -beagle_info shows the four GPU instances numbered 1, 2, 3 and 4, that would work, but it might be unnecessary if you use -beagle_GPU. At the start of the screen output, BEAST reports which BEAGLE resources are used, including the type of each resource (CPU or GPU), so have a look there to be sure it works as you want.

Cheers,

Remco



Terry Jones

Mar 29, 2018, 4:57:03 AM
to beast...@googlegroups.com
Thanks Remco

It turned out BEAGLE had (somehow) been built with gcc 5.4.0 on the cluster I was using. I don't know how they managed that, as the compile complains and fails for me in that case (maybe the gcc version check was only added recently?). Anyway, building my own with gcc 4.9.2 and setting LD_LIBRARY_PATH and BEAST_EXTRA_LIBS to point to the BEAGLE dir resolved the issue.

Thanks for the explanation re patterns. Makes sense. I guess I should learn a bit more!  Re number of GPUs, 4 are detected when I use -threads 4, but not (IIRC) if I just use -beagle_order 1,2,3,4. However, running with 4 GPUs is no faster on my data set (around 500 patterns), at least according to the intermediate output from BEAST.

Terry


Terry Jones

Mar 29, 2018, 5:25:21 AM
to Terry Jones, beast...@googlegroups.com
Sorry, s/BEAST_EXTRA_LIBS/BEAGLE_EXTRA_LIBS/ in my previous mail.

Terry

Andrew Rambaut

Mar 29, 2018, 7:07:16 AM
to beast...@googlegroups.com
GPU cores are much slower than CPU cores, but there are many more of them. To get good speed-ups you generally need to use all the cores at once; otherwise the overheads of running the threads on the GPUs outweigh the benefits. You don't say what type of GPU you have, but 500 patterns wouldn't fully occupy any modern GPU, so there would be no benefit (and potentially some cost) to distributing them over multiple GPUs.

Andrew


Terry Jones

Mar 29, 2018, 7:13:20 AM
to beast...@googlegroups.com
OK, thanks Andrew, that makes sense (now).  The cluster has NVIDIA Tesla P100 GPUs, each with 10,752 cores. I guess I should stop working on viruses, or at least those with such short genomes :-)

Terry

Andrew Rambaut

Mar 29, 2018, 9:35:25 AM
to beast...@googlegroups.com
Nice hardware. But yes, your 500 site patterns are not going to make much of a dent in these. On the plus side, you should be able to run lots of independent runs on these in parallel (memory allowing).

A.