Subject: StarBEAST3 uncalibrated UCLN guide tree — scale non-identifiability and fix validation request

Jesse Barrington

Apr 2, 2026, 2:36:22 PM
to beast-users

Dear BEAST2/StarBEAST3 developers,

I am running an uncalibrated StarBEAST3 (v1.2.1) Stage I guide tree analysis in BEAST 2.7.7 with CoupledMCMC v1.2.2 as part of a DELINEATE species delimitation pipeline. The guide tree is intentionally uncalibrated: only the topology is needed for downstream BPP analysis. I am writing to request validation of our final configuration after an extended convergence debugging process.

Dataset:

  • 13 loci: 1 mtDNA COI + 12 nuclear AHE (BUTTERFLY2.0 probe kit)
  • 165 specimens mapped to 36 species tree lineages (34 ingroup + 2 outgroups)
  • ~165 gene tree tips per locus
  • Hardware: Apple M4 Max, BEAST 2.7.7, BEAGLE CPU, dynamic scaling

Stepping-stone sampling result: UCLN relaxed clock was favored over strict clock across loci. We therefore cannot discard UCLN entirely.

Problem history:

Initially, all 13 loci used UCLN clocks with default priors. TreeHeight differed by 12 orders of magnitude between independent replicates, the posterior showed catastrophic excursions to extreme values, and ESS on TreeHeight remained at 3-5 after 450M states.

Attempted fixes that failed:

  1. Fixing COI meanClockRate=1.0 with estimate=false - TreeHeight drift persisted at 9-12 orders of magnitude
  2. Tightening ucldStdev priors from Exponential(mean=0.5) to Exponential(mean=0.1) - drift reduced to ~3 orders of magnitude but ucldStdev for COI was sampling values of ~4.0 despite the tight prior, causing posterior excursions

Root cause identified: The UCLN standard deviation parameter (ucldStdev) provides a mathematical escape hatch. When ucldMean is fixed to 1.0 but ucldStdev is free, the MCMC drives ucldStdev to ~4.0. For a lognormal rate distribution with mean 1 in real space and log-space standard deviation S, the median is exp(-S²/2), so S ≈ 4.0 drops the median branch rate to exp(-8) ≈ 0.00033 and inflates TreeHeight by ~3000× while maintaining the same genetic distance. The coalescent likelihood reward for an inflated species tree overwhelms the prior penalty.
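
As a rough check of the arithmetic above, here is a minimal Python sketch. It assumes that ucldStdev is the log-space standard deviation of the lognormal rate distribution and that the mean (fixed to 1.0) is specified in real space; the function name lognormal_median is only illustrative and is not part of BEAST.

```python
import math

def lognormal_median(mean: float, sigma: float) -> float:
    """Median of a lognormal with real-space mean `mean` and log-space std dev `sigma`.

    Assumption: mean = exp(mu + sigma^2 / 2), so mu = ln(mean) - sigma^2 / 2,
    and the median of a lognormal is exp(mu).
    """
    return math.exp(math.log(mean) - sigma ** 2 / 2.0)

# Values of ucldStdev mentioned in this thread (prior means, the bound, and the runaway value)
for s in (0.1, 0.5, 1.0, 4.0):
    med = lognormal_median(1.0, s)
    # With genetic distance held constant, tree height scales roughly as 1 / median rate
    print(f"ucldStdev={s}: median rate={med:.3g}, approx. height inflation={1.0 / med:.3g}x")
```

Under these assumptions, ucldStdev = 1.0 gives a median rate of exp(-0.5) ≈ 0.61, so the hard upper bound limits this particular inflation route to well under a factor of 2.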

Final configuration (seeking validation):

  • COI (L01): Strict clock, rate fixed at 1.0, removed from UpDownOperator, all UCLN-specific parameters and operators removed
  • Nuclear loci L02-L13: UCLN relaxed clock, ucldStdev prior Exponential(mean=0.1), hard upper bound ucldStdev ≤ 1.0
  • CoupledMCMC: chains=6, deltaTemperature adaptive (optimise=true)
  • Species-tree clock: no separate species-tree clock rate estimated
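
To put the ucldStdev bound in context, the following Python sketch computes how much prior mass an Exponential prior places above a given threshold. It assumes the prior is parameterised by its mean, as written above; exp_tail_mass is a hypothetical helper name, not a BEAST function.

```python
import math

def exp_tail_mass(mean: float, bound: float) -> float:
    """P(X > bound) for X ~ Exponential with the given mean (rate = 1 / mean)."""
    return math.exp(-bound / mean)

# (prior mean, threshold) pairs taken from the values mentioned in this post
cases = [(0.5, 4.0), (0.1, 4.0), (0.1, 1.0), (0.1, 0.5)]
for mean, bound in cases:
    mass = exp_tail_mass(mean, bound)
    print(f"Exponential(mean={mean}): P(ucldStdev > {bound}) = {mass:.3g}")
```

With Exponential(mean=0.1), a hard bound at 1.0 removes only about 4.5e-5 of the prior mass, and a bound at 0.5 removes about 0.0067, so in both cases the truncation barely alters the prior itself and mainly blocks the excursions described above.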

Questions:

  1. Is strict clock on COI + UCLN on 12 nuclear loci with ucldStdev ∈ [0,1] the correct approach for an uncalibrated StarBEAST3 guide tree? Is there a more standard solution we should have used?

  2. Should the species-tree clock rate be explicitly fixed anywhere in the XML, or is the absence of a species-tree clock rate parameter sufficient?

  3. Is the ucldStdev upper bound of 1.0 appropriate, or should it be tighter (0.5) or removed in favor of a tighter prior alone?

  4. Are posterior calculation corrections (~1 per 1.5-3M states, magnitudes <200) from BEAGLE dynamic scaling on Apple M4 Max expected at this dataset scale, or do they indicate a remaining structural problem?

We have also consulted the Taming the BEAST StarBEAST3 tutorial and the StarBEAST3 GitHub documentation throughout this process.

Thank you for any guidance.

Jesse Barrington
Master's Candidate, Computational Systematics
CCNY

Jesse Barrington

Apr 3, 2026, 1:35:08 PM
to beast-users

Hello again!

Following up on my previous post about uncalibrated StarBEAST3 UCLN guide tree convergence issues.

After resolving numerical instability (strict clock on the COI anchor locus, beagle_scaling always, clockRate lower bound 0.001 on nuclear loci), the runs are now numerically clean: zero posterior corrections, negative log-likelihoods as expected, and TreeHeight in substitutions/site units.

However, topology convergence is failing. After 200M states across 3 replicates:

  • ASDSF between all pairs: 0.33-0.38
  • ASDSF is getting worse over time, not better
  • CoupledMCMC (chains=6, optimise=true) has driven deltaTemperature down to ~2E-4, meaning all 6 chains are effectively at identical temperature
  • Swap rate is stable at 0.235 but the heated chains are providing no topology-crossing power
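
For reference, this is a minimal Python sketch of how an ASDSF value can be computed from per-run split frequencies. It is not necessarily the exact procedure behind the numbers above: the 0.1 minimum-frequency cutoff and the use of the sample standard deviation are assumptions, and the toy split frequencies are made up for illustration only.

```python
from statistics import stdev

def asdsf(runs, min_freq=0.1):
    """Average standard deviation of split frequencies across runs.

    runs: list of dicts mapping a split (frozenset of taxon names) to its
    posterior frequency in that run. A split absent from a run counts as 0.
    Assumption: only splits reaching min_freq in at least one run are kept.
    """
    all_splits = set().union(*runs)
    kept = [s for s in all_splits if any(r.get(s, 0.0) >= min_freq for r in runs)]
    if not kept:
        return 0.0
    # Sample standard deviation (n - 1 denominator) across runs, averaged over splits
    return sum(stdev([r.get(s, 0.0) for r in runs]) for s in kept) / len(kept)

# Hypothetical toy data: two runs that disagree on where taxon C attaches
run1 = {frozenset("AB"): 0.95, frozenset("ABC"): 0.90}
run2 = {frozenset("AB"): 0.93, frozenset("ABD"): 0.85}
print(f"ASDSF = {asdsf([run1, run2]):.3f}")
```

Splits that are well supported in one replicate and absent from another contribute the most, which is why a value that stays high or grows over time points to replicates sitting on different topologies rather than to sampling noise.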

Dataset: 36 species tree tips, 13 loci (1 strict clock COI + 12 UCLN nuclear AHE), 165 specimens in gene trees, Apple M4 Max, BEAST 2.7.7, StarBEAST3 v1.2.1, CoupledMCMC v1.2.2.

Questions:

  1. When CoupledMCMC optimises deltaTemperature to near zero for a StarBEAST3 MSC, does this indicate the posterior landscape is unimodal and standard MCMC would suffice, or does it indicate the heated chains are failing to cross topology barriers?
  2. What deltaTemperature starting value is appropriate for a 36-tip StarBEAST3 MSC to maintain meaningful topology mixing?
  3. Should we disable optimisation and fix deltaTemperature manually, and if so at what value?
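
To make the deltaTemperature questions concrete, the sketch below tabulates the chain heats for a few candidate values. It assumes the incremental heating scheme beta_i = 1 / (1 + i * deltaTemperature) for chain i (with i = 0 the cold chain), which is how I understand CoupledMCMC to space its chains; this should be checked against the installed version. The candidate values other than 2e-4 are illustrative, not recommendations.

```python
def chain_betas(n_chains: int, delta: float) -> list:
    """Heat (beta) of each chain under the assumed scheme beta_i = 1 / (1 + i * delta)."""
    return [1.0 / (1.0 + i * delta) for i in range(n_chains)]

# 2e-4 is the value the optimiser converged to; the others are for comparison only
for delta in (2e-4, 0.01, 0.05, 0.1):
    betas = chain_betas(6, delta)
    ladder = ", ".join(f"{b:.4f}" for b in betas)
    print(f"deltaTemperature={delta}: betas = [{ladder}]")
```

Under this assumption, deltaTemperature = 2e-4 with 6 chains puts the hottest chain at beta ≈ 0.999, so its target is almost indistinguishable from the cold chain's posterior.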