Installing Beagle to run BEAST on multiple cores

albernererhelge

Apr 11, 2016, 5:02:59 AM

to beast-users

Hey,

I was trying to figure out how to make BEAST run faster by using more than one core. Is it enough to install BEAGLE via the BEAGLE installer file, or do I also need to install OpenCL by Intel? I don't really know what OpenCL is for, and therefore whether I need it. Will BEAST use multiple cores automatically once BEAGLE is installed?

Regards
Chris

bander...@gmail.com

Apr 11, 2016, 5:41:04 AM

to beast-users

Hi Chris,

As far as I understand it, it depends on what graphics card(s) you have. Even if you have no graphics cards with nice GPUs, BEAGLE will still allow BEAST to run on multiple cores of your CPU, so my guess is that you don't need OpenCL to run BEAST in parallel. I had some issues getting it to use multiple cores (I don't have nice GPUs), and I had to manually edit the XML to include useThreads="true" for the likelihood. After that, I ran BEAST as:
>beast -beagle_SSE -instances 4 -threads -1 input.xml
and it seems to work OK.

Hope that helps!

Ben

Andrew Rambaut

Apr 11, 2016, 7:07:14 AM

to beast...@googlegroups.com

Hi Chris & Ben,

If you install BEAGLE without OpenCL then BEAST will divide up data partitions amongst cores. If your data has multiple partitions (genes, or codon position partitioning) then this will use as many cores as you have partitions. If you have one large partition you can use the “-beagle_instances X” option to divide the partition into X equal parts, which will then be distributed amongst cores (if you have 2 partitions and use -beagle_instances 2, it will divide both of those partitions, giving you 4 partitions). As you divide up partitions you get a trade-off between the parallelization and the overhead of small partitions (i.e., diminishing returns).
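To make the arithmetic above concrete, here is a minimal Python sketch. It is not part of BEAST or BEAGLE; the function names and the pattern count of 1000 are made up for illustration. It assumes, as described above, that every partition is split by the same factor X into roughly equal parts.

```python
import math


def likelihood_units(n_partitions: int, instances: int) -> int:
    """Number of pieces that can be spread across cores.

    Assumes each data partition is split into `instances` parts,
    as described for -beagle_instances above.
    """
    if n_partitions < 1 or instances < 1:
        raise ValueError("both arguments must be at least 1")
    return n_partitions * instances


def patterns_per_unit(site_patterns: int, instances: int) -> int:
    """Approximate site patterns in each piece of one partition (rounded up)."""
    if site_patterns < 0 or instances < 1:
        raise ValueError("invalid arguments")
    return math.ceil(site_patterns / instances)


# The example from the message: 2 partitions, -beagle_instances 2.
print(likelihood_units(2, 2))      # 4
# Hypothetical partition with 1000 site patterns split 4 ways.
print(patterns_per_unit(1000, 4))  # 250
```

The second number is the one to watch: as the pieces get smaller, the per-piece overhead takes up a larger share of the work, which is where the diminishing returns come from.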

Installing OpenCL gives you another option, which will automatically parallelize a single large partition across all the available CPU cores. This may not give any benefit over using -beagle_instances to divide the data unless you have a large number of cores (e.g., 32 or so).

If you have a GPU it will also allow you to parallelize a single partition on that too. If you have an AMD GPU or an Intel Xeon Phi board you can use OpenCL to exploit those boards too, although I have heard that they give limited benefit for BEAGLE.

If you have an NVIDIA GPU then you can just install CUDA. You can also install OpenCL, but OpenCL will just use CUDA anyway (you will see your NVIDIA board coming up as both a CUDA resource and an OpenCL resource, but direct CUDA is generally a little bit faster). NVIDIA boards work well with BEAGLE if the board is a top-spec one (with 1000s of cores and good double-precision performance) and you have lots of data (site patterns).

The other case where OpenCL might help is a very large state-space model (i.e., phylogeography with 10s or 100s of discrete states but a single column of data). Here BEAGLE parallelizes some of the calculations for the big matrices required.

Andrew.


albernererhelge

Apr 12, 2016, 4:03:52 AM

to beast-users

Dear Mr. Rambaut and Ben,

Thank you both for your detailed answers. This is more complicated than I thought. I will try to work through it step by step. Since I don't have a "nice GPU", I will first install BEAGLE and see what happens. So my first question is: how do I check that more than one core is being used after the installation? Only from the remaining time BEAST reports, or is there a command for it?

Regards
Chris

Andrew Rambaut

Apr 12, 2016, 4:24:26 AM

to beast...@googlegroups.com

Are you running on Linux? I usually use the command ‘top’, which shows all the processes running. The CPU column has the % of a CPU core being used (so 400% means it is using roughly 4 cores). It is a bit more complicated than that because processes and threads get moved around between cores dynamically by the OS to balance the load.
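As a rough illustration of reading that column, here is a small Python sketch that turns the %CPU figure from one row of top output into an approximate core count. The function name is made up, the sample row is invented rather than taken from a real run, and it assumes the usual Linux top column layout with a "%CPU" header and a dot as the decimal separator.

```python
def cores_from_top(header: str, row: str) -> float:
    """Estimate cores in use from one process row of `top` output.

    Finds the %CPU column by name in the header line and divides the
    value by 100, since 100% corresponds to one fully used core.
    """
    columns = header.split()
    fields = row.split()
    cpu_percent = float(fields[columns.index("%CPU")])
    return cpu_percent / 100.0


# Invented sample row for illustration only (not real output).
header = "PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND"
row = "4321 chris 20 0 9.1g 1.2g 25m S 398.7 7.5 12:34.56 java"
print(round(cores_from_top(header, row), 1))  # 4.0
```

Because the OS moves threads between cores, the figure fluctuates from one refresh to the next, so it is best treated as an estimate.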

On the Mac there is a GUI program called "Activity Monitor" that will give you this information.

Andrew

albernererhelge

Apr 13, 2016, 4:07:16 AM

to beast-users

I'm running on Windows 7 Enterprise x64.

Graham

Apr 13, 2016, 8:32:33 AM

to beast-users

Control Panel - Performance Information and Tools - Advanced tools - Resource Monitor.
