Hi Chris & Ben,
If you install BEAGLE without OpenCL then BEAST will divide up data partitions amongst cores. If your data has multiple partitions (for example, separate genes or codon-position partitioning) then this will use as many cores as you have partitions. If you have 1 large partition you can use the “-beagle_instances X” option to divide up the partition into X equal parts which will then be distributed amongst cores (if you have 2 partitions and do -beagle_instances 2 then it will divide both those partitions, giving you 4 partitions). As you divide up partitions you will get a trade-off between the parallelization and the overhead of small partitions (i.e., diminishing returns).
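To make the arithmetic of the -beagle_instances example concrete, here is a small Python sketch. It only illustrates the counting described above and is not how BEAST or BEAGLE implement the split internally; the function name split_partitions and the site-pattern counts are made up for illustration.

```python
def split_partitions(pattern_counts, instances):
    """Split each partition's site patterns into `instances` near-equal parts.

    pattern_counts: number of site patterns in each data partition
                    (hypothetical values, for illustration only).
    instances:      the X given to -beagle_instances.
    Returns the size of every resulting sub-partition.
    """
    parts = []
    for count in pattern_counts:
        base, extra = divmod(count, instances)
        # The first `extra` parts take one additional pattern so that
        # the sizes add up to the original count.
        parts.extend(base + (1 if i < extra else 0) for i in range(instances))
    return parts


# Two partitions with -beagle_instances 2 give four sub-partitions.
parts = split_partitions([1000, 600], 2)
print(len(parts), parts)  # prints: 4 [500, 500, 300, 300]
```

Each sub-partition gets smaller as X grows, which is where the diminishing returns come from: the fixed overhead per part stays the same while the work per part shrinks.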
Installing OpenCL gives you another option, which will automatically parallelize a single large partition across all the available CPU cores. This may not give any benefit over using -beagle_instances to divide the data unless you have a large number of cores (e.g., 32 or so).
If you have a GPU, OpenCL will also allow you to parallelize a single partition on that too. If you have an AMD GPU or an Intel Xeon Phi board you can use OpenCL to exploit those boards too - although I have heard that they give limited benefit for BEAGLE.
If you have an NVIDIA GPU then you can just install CUDA. You can also install OpenCL, but OpenCL will just use CUDA anyway (you will see your NVIDIA board coming up as both a CUDA resource and an OpenCL resource, but direct CUDA is generally a little bit faster). NVIDIA boards work well with BEAGLE if it is a top-spec one (with 1000s of cores and good double precision performance) and you have lots of data (site patterns).
The other case where OpenCL might help is for a very large state space model (e.g., phylogeography with 10s or 100s of discrete states but a single column of data). Here BEAGLE parallelizes some of the calculations for the big matrices required.
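As a rough illustration of why the state space matters, the Python sketch below counts the entries in one transition probability matrix for a model with S discrete states, which is S x S. This is only back-of-the-envelope arithmetic, not anything taken from BEAGLE itself; the function name and the example state counts are assumptions for illustration.

```python
def transition_matrix_entries(num_states):
    """Entries in one S x S transition probability matrix."""
    return num_states * num_states


NUCLEOTIDE_STATES = 4  # baseline for comparison

for states in (4, 20, 100):  # hypothetical state counts
    entries = transition_matrix_entries(states)
    ratio = entries / transition_matrix_entries(NUCLEOTIDE_STATES)
    print(f"{states} states: {entries} entries per matrix, "
          f"{ratio:g}x a nucleotide model")
```

With 100 states each matrix has 10000 entries, 625 times more than a 4-state nucleotide model, so there is work to parallelize even with only one column of data.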
Andrew.