Hello everyone,
I have been trying to perform BETS (https://doi.org/10.1093/molbev/msaa163) on my dataset to formally test for temporal signal. I want to compare a model with a strict clock and no tip dates (i.e., all tip dates set to 0) against a model with a relaxed clock and tip dates included. I am assuming a constant population size based on the ecology and history of my virus. I have been running these with generalised stepping-stone sampling (GSS) for the marginal likelihood estimation, using the product of exponential distributions as the tree working prior. The models run fine and give clean traces and high ESS values, but after the chain finishes I get the following error for the strict clock model:
Operator analysis
Operator Tuning Count Time Time/Op Pr(accept)
scale(CP1+2.kappa) 0.57 861738 220145 0.26 0.2361
scale(CP3.kappa) 0.574 862584 185757 0.22 0.2366
frequencies 0.069 863024 216329 0.25 0.2392
scale(CP1+2.alpha) 0.582 862732 220102 0.26 0.2355
scale(CP3.alpha) 0.55 862216 185369 0.21 0.2364
scale(nodeHeights(treeModel)) 0.81 2587709 884091 0.34 0.2334
allNus 0.107 2587090 645221 0.25 0.2405
subtreeSlide(treeModel) 0.02 25861347 1596126 0.06 0.2318
Narrow Exchange(treeModel) 25867090 1491738 0.06 0.1799
Wide Exchange(treeModel) 2586065 123270 0.05 0.0017
wilsonBalding(treeModel) 2587423 240745 0.09 0.0035
scale(treeModel.rootHeight) 0.154 2587305 192216 0.07 0.24
uniform(nodeHeights(treeModel)) 25853038 2202356 0.09 0.4342
scale(exponential.popSize) 0.64 2584777 120932 0.05 0.2361
exponential.growthRate 13.194 2585862 119868 0.05 0.2404
8.017392521906503, CP1+2.kappa=[0.0, Infinity]
1
12.509066895581492, CP3.kappa=[0.0, Infinity]
1
0.5851754970785747[0.0,1.0], 0.4148245029214019[0.0,1.0]
2
0.24350298680410806, frequencies=[0.0, 1.0], 0.3523928115652138[0.0, 1.0], 0.2060185133307498[0.0, 1.0], 0.19808568829992856[0.0, 1.0]
4
0.255648428292721, CP1+2.alpha=[0.0, Infinity]
1
0.6590643762660736, CP3.alpha=[0.0, Infinity]
1
1.0, clock.rate=[-Infinity, Infinity]
1
1.4449411369650107, exponential.popSize=[0.0, Infinity]
1
WARNING: Likelihood component, null, created but not used in the MCMC
WARNING: Likelihood component, null, created but not used in the MCMC
WARNING: Likelihood component, null, created but not used in the MCMC
WARNING: Likelihood component, exponentials, created but not used in the MCMC
WARNING: Likelihood component, null, created but not used in the MCMC
WARNING: Likelihood component, null, created but not used in the MCMC
WARNING: Likelihood component, null, created but not used in the MCMC
WARNING: Likelihood component, null, created but not used in the MCMC
WARNING: Likelihood component, null, created but not used in the MCMC
WARNING: Likelihood component, treeLikelihood, created but not used in the MCMC
WARNING: Likelihood component, null, created but not used in the MCMC
Creating the Marginal Likelihood Estimator chain:
chainLength=1000000
pathSteps=100
pathScheme=betaQuantile(0.3)
Attempting theta (1/101) = 1.0 for 1000000 iterations + 100000 burnin.
Exception in thread "Thread-1" java.lang.IllegalArgumentException: The initial likelihood is zero:
CompoundLikelihood(compoundModel)=(
LogNormal(CP1+2.kappa)=-3.5981,
LogNormal(CP3.kappa)=-4.4142,
MultivariateDistributionLikelihood(dirichletDistribution)=1.7918,
Exponential(CP1+2.alpha)=0.1819,
Exponential(CP3.alpha)=-0.625,
MultivariateDistributionLikelihood(dirichletDistribution)=0.0,
OneOnX(exponential.popSize)=-0.3681,
Laplace(exponential.growthRate)=-8.3978,
CoalescentLikelihood(coalescentLikelihood[coalescent])=-175.4109,
StrictClockBranchRates(strictClockBranchRates[branchRates])=0.0
Total = -190.84033926912733
),
CompoundLikelihood(compoundModel)=(
TreeDataLikelihood(TreeDataLikelihood)=-4167.0428,
TreeDataLikelihood(TreeDataLikelihood)=-4192.2151
Total = -8359.25787084048
)
Total = -8550.098210109607:
LogTransformedNormalKDE(CP1+2.kappa)=-0.6163,
LogTransformedNormalKDE(CP3.kappa)=-1.4498,
MultivariateDistributionLikelihood(multivariateKDE)=3.9551,
MultivariateDistributionLikelihood(multivariateKDE)=8.1766,
LogTransformedNormalKDE(CP1+2.alpha)=2.9608,
LogTransformedNormalKDE(CP3.alpha)=0.6647,
LogTransformedNormalKDE(clock.rate)=NaN,
StrictClockBranchRates(strictClockBranchRates[branchRates])=0.0,
LogTransformedNormalKDE(exponential.popSize)=0.6865,
NormalKDE(exponential.growthRate)=-1.9509,
ExponentialProductPosteriorMeansLikelihood(treeModel)=-181.9166
Total = NaN
at dr.inference.markovchain.MarkovChain.runChain(Unknown Source)
at dr.inference.mcmc.MarginalLikelihoodEstimator.integrate(Unknown Source)
at dr.inference.mcmc.MarginalLikelihoodEstimator.run(Unknown Source)
at java.base/java.lang.Thread.run(Thread.java:829)
1 file(s) found with marginal likelihood samples
Jan 15, 2025 6:45:00 PM dr.app.beast.BeastMain <init>
SEVERE: Parsing error - poorly formed BEAST file, SAT1_st1_Iso_Sct_Exp.xml:
Error parsing '<generalizedSteppingStoneSamplingAnalysis>' element with id, 'null':
Incorrect file format, no sample is found !
Error thrown at: dr.inference.trace.GeneralizedSteppingStoneSamplingAnalysis$1.parseXMLObject(Unknown Source)
Does anyone have any idea what I am doing wrong? I am using BEAST v.1.10.4.
Thanks,
Avery
Hi Avery,
Try to find a line in your XML similar to the one below and delete it:
<logTransformedNormalReferencePrior fileName="no_teouma_sidron_SC_DATES.log" parameterColumn="clock.rate" burnin="5000000">
    <parameter idref="clock.rate"/>
</logTransformedNormalReferencePrior>
Also, please make sure that your priors are proper (i.e., that they integrate to 1). For example, do not use a 1/x distribution, and if you set a uniform prior on the clock rate, also give it finite bounds (e.g., [0, 1]). This prevents the marginal likelihood estimation from crashing.
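For reference, a proper bounded uniform prior on the clock rate in BEAST 1.x XML looks roughly like the fragment below. This is only a sketch: the parameter id and the upper bound are illustrative and should be adapted to your own analysis.

```xml
<!-- Sketch of a proper (bounded) uniform prior on the clock rate.
     The upper bound of 1.0 is illustrative; choose one sensible for
     your data. An unbounded uniform or a 1/x prior is improper and
     can make GSS marginal likelihood estimation fail. -->
<uniformPrior lower="0.0" upper="1.0">
    <parameter idref="clock.rate"/>
</uniformPrior>
```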
Best,
Artem
Hi Artem,
Thank you, this fixed it! Just out of curiosity, why does this work?
Many thanks,
Avery
Hello again,
Now that I have looked at and thought about the results from these models, I have some additional questions.
First, some of my MLE log files (but not all of them) have two separate likelihood values, which are sometimes similar and sometimes massively different. Are these two independent estimates? Shouldn't these values be approximately the same? And is it a problem when a log file has only one?
Second, some of these values have enormous magnitudes, e.g., -1.5551871444796084E163. Does this also indicate a problem?
This is all new for me and I’m very unsure how to interpret these results or diagnose any issues.
Thank you in advance for any input!