Error with marginal likelihood estimation


Avery Holmes

Jan 16, 2025, 12:21:39 PM
to beast-users

Hello everyone, 



I have been trying to perform BETS (https://doi.org/10.1093/molbev/msaa163) on my dataset to formally test for clock-like signal. I want to compare a model with a strict clock and no tip dates (i.e., all tip dates set to 0) against a model with a relaxed clock and tip dates included. I am assuming a constant population size based on the ecology and history of my virus. I have been running these with generalised stepping-stone sampling (GSS) for the marginal likelihood estimation, using the product of exponential distributions as the tree working prior. The models run fine and give clean traces and high ESS values, but once the chain finishes I get the following error for the strict clock model:
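(For context, the comparison I am aiming for just reduces to a log Bayes factor between the two GSS log marginal likelihood estimates. A minimal sketch, with made-up placeholder numbers rather than values from my runs:)

```python
# BETS comparison: difference of log marginal likelihoods from the two GSS runs.
# Both values below are hypothetical placeholders, not results from my analysis.
log_ml_relaxed_dated = -8550.10   # relaxed clock, tip dates included
log_ml_strict_undated = -8612.47  # strict clock, tip dates all set to 0

# log Bayes factor in favour of the dated (heterochronous) model
log_bf = log_ml_relaxed_dated - log_ml_strict_undated
print(log_bf)  # positive values favour keeping the tip dates
```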


Operator analysis

Operator                                          Tuning   Count      Time     Time/Op  Pr(accept) 

scale(CP1+2.kappa)                                0.57    861738     220145   0.26     0.2361      

scale(CP3.kappa)                                  0.574   862584     185757   0.22     0.2366      

frequencies                                       0.069   863024     216329   0.25     0.2392      

scale(CP1+2.alpha)                                0.582   862732     220102   0.26     0.2355      

scale(CP3.alpha)                                  0.55    862216     185369   0.21     0.2364      

scale(nodeHeights(treeModel))                     0.81    2587709    884091   0.34     0.2334      

allNus                                            0.107   2587090    645221   0.25     0.2405      

subtreeSlide(treeModel)                           0.02    25861347   1596126  0.06     0.2318      

Narrow Exchange(treeModel)                                25867090   1491738  0.06     0.1799      

Wide Exchange(treeModel)                                  2586065    123270   0.05     0.0017      

wilsonBalding(treeModel)                                  2587423    240745   0.09     0.0035      

scale(treeModel.rootHeight)                       0.154   2587305    192216   0.07     0.24        

uniform(nodeHeights(treeModel))                           25853038   2202356  0.09     0.4342      

scale(exponential.popSize)                        0.64    2584777    120932   0.05     0.2361      

exponential.growthRate                            13.194  2585862    119868   0.05     0.2404      


8.017392521906503, CP1+2.kappa=[0.0, Infinity]

1

12.509066895581492, CP3.kappa=[0.0, Infinity]

1

0.5851754970785747[0.0,1.0], 0.4148245029214019[0.0,1.0]

2

0.24350298680410806, frequencies=[0.0, 1.0], 0.3523928115652138[0.0, 1.0], 0.2060185133307498[0.0, 1.0], 0.19808568829992856[0.0, 1.0]

4

0.255648428292721, CP1+2.alpha=[0.0, Infinity]

1

0.6590643762660736, CP3.alpha=[0.0, Infinity]

1

1.0, clock.rate=[-Infinity, Infinity]

1

1.4449411369650107, exponential.popSize=[0.0, Infinity]

1

WARNING: Likelihood component, null, created but not used in the MCMC

WARNING: Likelihood component, null, created but not used in the MCMC

WARNING: Likelihood component, null, created but not used in the MCMC

WARNING: Likelihood component, exponentials, created but not used in the MCMC

WARNING: Likelihood component, null, created but not used in the MCMC

WARNING: Likelihood component, null, created but not used in the MCMC

WARNING: Likelihood component, null, created but not used in the MCMC

WARNING: Likelihood component, null, created but not used in the MCMC

WARNING: Likelihood component, null, created but not used in the MCMC

WARNING: Likelihood component, treeLikelihood, created but not used in the MCMC

WARNING: Likelihood component, null, created but not used in the MCMC


Creating the Marginal Likelihood Estimator chain:

  chainLength=1000000

  pathSteps=100

  pathScheme=betaQuantile(0.3)



Attempting theta (1/101) = 1.0 for 1000000 iterations + 100000 burnin.

Exception in thread "Thread-1" java.lang.IllegalArgumentException: The initial likelihood is zero: 

CompoundLikelihood(compoundModel)=(

  LogNormal(CP1+2.kappa)=-3.5981, 

  LogNormal(CP3.kappa)=-4.4142, 

  MultivariateDistributionLikelihood(dirichletDistribution)=1.7918, 

  Exponential(CP1+2.alpha)=0.1819, 

  Exponential(CP3.alpha)=-0.625, 

  MultivariateDistributionLikelihood(dirichletDistribution)=0.0, 

  OneOnX(exponential.popSize)=-0.3681, 

  Laplace(exponential.growthRate)=-8.3978, 

  CoalescentLikelihood(coalescentLikelihood[coalescent])=-175.4109, 

  StrictClockBranchRates(strictClockBranchRates[branchRates])=0.0

  Total = -190.84033926912733

), 

CompoundLikelihood(compoundModel)=(

  TreeDataLikelihood(TreeDataLikelihood)=-4167.0428, 

  TreeDataLikelihood(TreeDataLikelihood)=-4192.2151

  Total = -8359.25787084048

)

Total = -8550.098210109607: 

LogTransformedNormalKDE(CP1+2.kappa)=-0.6163, 

LogTransformedNormalKDE(CP3.kappa)=-1.4498, 

MultivariateDistributionLikelihood(multivariateKDE)=3.9551, 

MultivariateDistributionLikelihood(multivariateKDE)=8.1766, 

LogTransformedNormalKDE(CP1+2.alpha)=2.9608, 

LogTransformedNormalKDE(CP3.alpha)=0.6647, 

LogTransformedNormalKDE(clock.rate)=NaN, 

StrictClockBranchRates(strictClockBranchRates[branchRates])=0.0, 

LogTransformedNormalKDE(exponential.popSize)=0.6865, 

NormalKDE(exponential.growthRate)=-1.9509, 

ExponentialProductPosteriorMeansLikelihood(treeModel)=-181.9166

Total = NaN

at dr.inference.markovchain.MarkovChain.runChain(Unknown Source)

at dr.inference.mcmc.MarginalLikelihoodEstimator.integrate(Unknown Source)

at dr.inference.mcmc.MarginalLikelihoodEstimator.run(Unknown Source)

at java.base/java.lang.Thread.run(Thread.java:829)

1 file(s) found with marginal likelihood samples

Jan 15, 2025 6:45:00 PM dr.app.beast.BeastMain <init>

SEVERE: Parsing error - poorly formed BEAST file, SAT1_st1_Iso_Sct_Exp.xml:

Error parsing '<generalizedSteppingStoneSamplingAnalysis>' element with id, 'null':

Incorrect file format, no sample is found !


Error thrown at: dr.inference.trace.GeneralizedSteppingStoneSamplingAnalysis$1.parseXMLObject(Unknown Source)



Does anyone have any idea what I am doing wrong? I am using BEAST v.1.10.4.



Thanks,

Avery


Artem B

Jan 24, 2025, 2:20:20 AM
to beast-users

Hi Avery,

Try to find a line similar to the line below and delete it:

<logTransformedNormalReferencePrior fileName="no_teouma_sidron_SC_DATES.log" parameterColumn="clock.rate" burnin="5000000"> <parameter idref="clock.rate"/> </logTransformedNormalReferencePrior>

Also, please make sure that your priors are proper (i.e., they integrate to 1). For example, do not use a 1/x distribution, and if you set a uniform prior on the clock rate, you should also give it finite bounds (e.g., [0, 1]). This prevents the marginal likelihood estimation from crashing.
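As a rough illustration, a proper (bounded) uniform prior on the clock rate in the priors block would look something like this; element names follow BEAST 1.x XML, and you should adapt the idref and bounds to your own file:

```xml
<!-- Sketch only: a bounded uniform prior so the density integrates to 1.
     Adjust the idref and the bounds to match your own analysis. -->
<uniformPrior lower="0.0" upper="1.0">
    <parameter idref="clock.rate"/>
</uniformPrior>
```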

Best,
Artem

On Friday, January 17, 2025 at 01:21:39 UTC+8, Avery Holmes wrote:

Avery Holmes

Jan 28, 2025, 1:04:46 PM
to beast-users

Hi Artem,



Thank you, this fixed it! Just out of curiosity, why does this work?



Many thanks,

Avery 

Artem B

Feb 2, 2025, 9:09:14 PM
to beast-users
Hi Avery,

If I remember correctly, this bug happens when you fix the substitution rate: the XML is still written out with a working prior for the rate, even though the rate has no prior once it is fixed. Credit to Sebastian Duchene, who helped me solve this problem a while ago :)

All the best,
Artem

On Wednesday, January 29, 2025 at 02:04:46 UTC+8, Avery Holmes wrote:

Avery Holmes

Feb 7, 2025, 1:02:07 PM
to beast-users

Hello again, 


Now that I have looked at and thought about the results from these models, I have some additional questions. 


First, some of my MLE log files (but not all of them) contain two separate likelihood values, which are sometimes similar and sometimes massively different. Are these two independent estimates? Shouldn't they be approximately the same? And is it a problem when a log file has only one?


Second, some of these values are extremely large in magnitude, e.g., -1.5551871444796084E163. Does this also indicate a problem?


This is all new to me, and I am very unsure how to interpret these results or diagnose any issues.


Thank you in advance for any input! 
