starBeast3 parameters do not converge

Stephania Sandoval Arango

May 27, 2022, 2:03:15 PM
to beast-users
Hello all,

I have been trying to run starbeast3 on a dataset that contains 100 UCE loci and 33 individuals in 7 putative species. I started with a GTR+G substitution model, a Yule model with default parameters, and a species tree relaxed clock. After running 8 replicates for 200 million generations each, I haven't been able to reach convergence for the tree height and length statistics of either the species tree or the individual gene trees. Even after combining all the runs, the ESS values for these parameters are still below 200. The other statistics seem fine. I tried reducing the dataset to 50 loci to see if that would help, but the problem persists. Any suggestions on how to improve this situation?

Another question is how to handle visualization of the species trees and gene trees. I have been trying to visualize the preliminary trees from these runs using UglyTrees, but it seems like way too much information to display at once. The species tree looks good with maybe 5-10 gene trees, but adding more than that gets really messy and hard to interpret, and it also crashes the webpage. Since I am running multiple independent runs, I will need to combine these files, which makes them massive and hard to handle. Has anyone run into this same issue?

Thank you all!
Stephania

Jordan Douglas

May 29, 2022, 5:31:57 PM
to beast-users
Hi Stephania,


The tree lengths/heights are usually among the slowest parameters to converge under the multispecies coalescent, so you are definitely not alone on this. You may have to run the chains for longer. If you have access to many CPUs, running on more threads (e.g., 16 or 32) can help a lot, especially for the 100-gene dataset. Also, a strict clock model will converge a lot faster than a relaxed clock (despite being less biologically realistic). What is your estimate of the relaxed clock standard deviation? If it's quite small (e.g., less than 0.3), then a strict clock may be okay.
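
For example, if you launch runs through the command line, you can set the thread count with the standard BEAST2 -threads option (the XML filename below is just a placeholder for your own analysis file):

    beast -threads 16 starbeast3_analysis.xml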


Regarding visualisation, as far as I am aware UglyTrees is the only application that can show gene trees inside a species tree. In terms of choosing a visualisation tool for your dataset, this depends on your question. UglyTrees is useful for debugging your model and confirming that things are behaving as expected, as well as for learning where the gene tree coalescence events are occurring. But if you are interested in discordance between the gene and species trees, then you will be better off using a tanglegram program (e.g., Dendroscope). This can plot the species tree next to one gene tree at a time so you can easily compare topologies. Overall, the multispecies coalescent is incredibly complex and can easily give an information overload (or crash the web browser!). Its visualisation is not a trivial problem.

BTW: in UglyTrees, if you find the species tree branch widths (population sizes) are making things too big/small, then open 'Species tree' and set 'Species node width top' and 'bottom' to 'Select annotation...'. They default to population sizes, but if one node has a very large population size it can give unwanted behaviour.


Hope this helps,
Jordan

Stephania Sandoval Arango

May 31, 2022, 2:12:26 PM
to beast-users
Hi Jordan,

Thank you so much for the answers. I will try the new settings and see how it goes. In the meantime, I was trying to explore speedemon but could not get it to run; I get the following error:

Error 1017 parsing the xml input file
Class could not be found. Did you mean snap.util.SkylineAnalyser?
Perhaps a package required for this class is not installed?

Error detected about here:
  <beast>
      <run id='mcmc' spec='MCMC'>
          <distribution id='posterior' spec='util.CompoundDistribution'>
              <distribution id='prior' spec='util.CompoundDistribution'>
                  <distribution id='SPEEDEMONYuleSkylineCollapse.t:Species' spec='speedemon.YuleSkylineCollapse'>

... moving on

I created the XML file in BEAUti and had previously installed both starbeast3 and speedemon; both are updated to the latest version. I wonder if this is because I am trying to run this using CIPRES? Their BEAST2 version is 2.6.6.

Thank you in advance!
Stephania

Jordan Douglas

May 31, 2022, 5:54:26 PM
to beast-users
Hi Stephania,

speedemon is still in prerelease (and not yet peer-reviewed). I am not so familiar with CIPRES, but I suspect they will wait for the main release before making the package available. We hope to make the v1.0.0 release in the next few months. In the meantime, speedemon can only be installed through the BEAST2 package manager.


Cheers,
Jordan

Pfeiffer, Wayne

Jun 8, 2022, 9:46:02 PM
to beast...@googlegroups.com, Pfeiffer, Wayne
Hi Jordan,

I help maintain the CIPRES gateway and would be willing to make SPEEDEMON 0.1.1 available if it can be added to BEAST2 as a package.

* If so, please let me know how to do that.

Also, you mentioned that runs with the multispecies coalescent can make use of more than a few cores. Right now we only use more than 6 cores for SNAPP or for AA alignments with 40 or more partitions.

* Please let me know what packages should scale to larger numbers of cores, and I will run some benchmarks for them on the compute nodes that we use. Based on the results that I obtain, we can then increase the number of cores for CIPRES runs to make them go faster.

Thanks,  Wayne

Miller, Mark

Jun 9, 2022, 11:44:20 AM
to beast...@googlegroups.com

Wayne,

Here is a link to some starbeast jobs:

https://object.cloud.sdsc.edu/v1/AUTH_cipres/lopezuribelab/

 

Mark

Jordan Douglas

Jun 9, 2022, 5:35:56 PM
to beast-users
Hi Wayne,

Thank you! It would be great if you could add speedemon to CIPRES.

In general, BEAST2 packages in prerelease are typically specified here:

After their release, we include them in the latest package directory. For 2.6 this is:

Installing prerelease packages requires adding packages-extra to the package repository list on your machine. This can all be done using BEAUti (see the speedemon GitHub tutorial: https://github.com/rbouckaert/speedemon ).
Alternatively, through the command line you can edit ~/.beast/2.6/beauti.properties by adding the following line: "packages.url=https\://raw.githubusercontent.com/CompEvol/CBAN/master/packages-extra.xml". Then speedemon (or any other package in the repository) can be installed with ~/beast/bin/packagemanager -add speedemon
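
Putting those two steps together, a minimal command-line sketch (assuming a BEAST 2.6 installation at ~/beast on Linux; adjust the paths for your system):

    # point the package manager at the packages-extra repository (prerelease packages)
    echo "packages.url=https\://raw.githubusercontent.com/CompEvol/CBAN/master/packages-extra.xml" >> ~/.beast/2.6/beauti.properties
    # install speedemon through the package manager
    ~/beast/bin/packagemanager -add speedemon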

 

Regarding multithreading, I am aware of the following packages:
- StarBeast3 - can support up to one thread per locus (users typically have up to a few hundred loci). I usually use 8, 16, or 32 threads, but the majority (~80%?) of the wall-time is single-threaded.
- SNAPPER - a fast approximation of SNAPP
- speedemon (but only via the above two packages)



Cheers,
Jordan

Pfeiffer, Wayne

Jun 9, 2022, 6:11:17 PM
to beast...@googlegroups.com, s.sand...@gmail.com, mpr...@gmail.com, Miller, Mark, Pfeiffer, Wayne
Hi Jordan,

Thanks for the installation instructions. SPEEDEMON 0.1.1 is now installed and available for BEAST2 via CIPRES :)

* Please let me know if you have a couple of benchmark data sets for SPEEDEMON that will benefit from using more than 6 cores.

As for StarBeast3 and SNAPPER, I should be able to find benchmark data sets for them from previous submissions by CIPRES users.

After I run the benchmarks, I will adjust the number of cores used by CIPRES runs depending upon the size of the data sets.

Thanks again,  Wayne

Jordan Douglas

Jun 9, 2022, 6:52:57 PM
to beast-users
Hi Wayne,

Thanks for the installation :)

There are two large datasets here:

Both have around 100 loci. Note that when speedemon uses multithreading, it does so through the same avenue as starbeast3 or snapper, since speedemon is a tree prior that plugs into these two packages.

We also have some large starbeast3 datasets here:

 

Thanks,
Jordan

bander...@gmail.com

Jun 10, 2022, 3:33:06 AM
to beast-users
Correct me if I'm wrong, but I don't believe SNAPPER multithreading currently works.

Pfeiffer, Wayne

Jun 15, 2022, 8:27:06 PM
to beast...@googlegroups.com, Pfeiffer, Wayne
Hi Jordan,

I tried to analyze three of your suggested data sets – barrows.xml, bryson.xml, and hamilton.xml – using StarBeast3 1.0.5 with BEAST2 2.6.6, and all of them failed with the following error message:

Error 1017 parsing the xml input file

Class could not be found. Did you mean beast.util.TimeLogger?
Perhaps a package required for this class is not installed?

Error detected about here:
  <beast>
      <run id='mcmc' spec='beast.core.MCMC'>
          <logger id='RuntimeLogger' spec='beast.core.Logger'>
              <log id='runtime' spec='poetry.util.RuntimeLoggable'>

... moving on

* Please advise.

Thanks,  Wayne

Jordan Douglas

Jun 16, 2022, 9:44:37 PM
to beast-users
Hi Wayne,

Thanks for pointing that out. That was some old XML code I forgot to remove. I just pushed the correct version to GitHub.

Jordan

Pfeiffer, Wayne

Jun 17, 2022, 11:59:56 AM
to beast...@googlegroups.com, Pfeiffer, Wayne
Hi Jordan,

My runs for these three data sets still fail. Attached is the stderr file for the bryson data set.

Meanwhile, I made benchmark runs for your two example SPEEDEMON data sets and the three largest StarBeast3 data sets run so far via CIPRES. A spreadsheet with my results for these data sets and other similar ones is also attached.

These DNA data sets have 18 or more partitions. Until now, all such data sets used 4 threads when run via CIPRES. Based upon the new results, either 1 thread, 4 threads, or 6 threads will be used depending upon the number of patterns. Using more threads (and cores) than 6 is not cost-effective since the parallel efficiency drops below 0.50.

Each Expanse node that we use for CIPRES has 128 AMD Rome cores. Jobs in the compute partition have exclusive access to all cores and are charged for the full node, whereas jobs in the shared partition are charged only for the cores specified. CIPRES jobs run in the shared partition because that is much more cost-effective. However, shared jobs are generally slower than compute jobs because of memory contention from other jobs on the node. That can be seen by comparing the times in Column I for compute jobs to the times in Columns J, K, and L for shared jobs. For each row in the spreadsheet, either three or five shared jobs were run, and the shared times listed are the fastest, median, and slowest. The slowdown from memory contention is often significant.

Once you get me working XML files for the barrow, bryson, and hamilton data sets, I can run benchmarks for them as well. However, based upon their numbers of partitions and patterns, I do not expect that it will be cost-effective to use more than 4 threads for any of them.

Best regards,  Wayne



bryson.t4.i1.13535392.exp-1-13.err
BEAST2.BinDNA18.220617.xlsx

Jordan Douglas

Jun 20, 2022, 8:06:05 PM
to beast-users
Hi Wayne,

Thank you for the benchmarking! I am not surprised that cost-efficiency drops above 6 threads (but the number of effective samples per hour continues to increase well beyond 16 threads). The way the parallelisation works is: the loci are evenly partitioned across n threads, and MCMC is run on the n partitions independently. The length of each MCMC chain is determined by the 'chainCoverage' parameter in the XML file under the 'ParallelMCMCTreeOperator' class. This defaults to 1, meaning that the total MCMC chain length (across all n partitions) is equal to 1 × the number of parameters being estimated. Thus, more threads means each chain is shorter. If this term were increased to, say, 2, then perhaps 12 threads would be where cost-efficiency drops off.
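
For example, to double the coverage, the operator element in the XML could be edited along these lines (an illustrative fragment only: the id is a placeholder and the operator's other required inputs are elided):

  <!-- raise chainCoverage from its default of 1 to 2 (longer per-thread chains) -->
  <operator id="parallelMCMCTreeOperator" spec="ParallelMCMCTreeOperator" chainCoverage="2">
      ...
  </operator>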


Thanks for posting the example file error. It turns out the error only occurred when running with BEAGLE, which is why I did not notice it at first. I have updated the example files; I hope they are working for you now.

Thanks again,
Jordan  

Pfeiffer, Wayne

Jun 20, 2022, 10:12:36 PM
to beast...@googlegroups.com, Pfeiffer, Wayne
Hi Jordan,

My runs for the three sample StarBeast3 data sets still fail in similar ways. Attached is the latest stderr file for the bryson data set.

* Please let me know when you think you have good xml files.

Thanks,  Wayne



bryson.t4.i1.13700900.exp-1-03.err

Jordan Douglas

Jun 20, 2022, 11:30:02 PM
to beast-users
Hi Wayne,

My apologies, I did not run the example sessions for long enough to see the error. The files have been corrected. I hope it works this time.

Jordan

Pfeiffer, Wayne

Jun 21, 2022, 9:43:06 AM
to beast...@googlegroups.com, Pfeiffer, Wayne
Hi Jordan,

Thanks for the corrected xml files for the three StarBeast3 data sets :)

I made benchmark runs for all three and found

- no speedup when using more than one thread for the barrow and bryson data sets and
- 1.5x speedup going from one to four threads for the hamilton data set.

The numbers of patterns for these data sets are

- 2,849 for barrow,
- 4,193 for bryson, and
- 12,520 for hamilton.

More patterns are needed to get better speedup, especially when there are many partitions.

Best regards,  Wayne

Jordan Douglas

Jun 21, 2022, 5:34:42 PM
to beast-users
Hi Wayne,

That all makes sense. Barrow/Bryson are fairly small datasets that did not take long to converge, but Hamilton was particularly slow. We used larger datasets when benchmarking speedemon (~100 loci, compared with ~50) because the strict clock model was used instead of the relaxed clock; the former mixes a lot faster.

Thanks again,
Jordan