Comparing Beast output (AICM, SS/PS) and general sanity check


Kevin Daly

Jun 22, 2016, 7:18:30 AM
to beast-users
Hi,

I have been playing around with BEAUti and BEAST2 over the past week, using a mitochondrial genome dataset and some calibrated samples. Having never used BEAST before, I'll describe what I've been doing in case there are glaring errors. I estimated a site substitution model using ModelGenerator. I have run several BEAST analyses, varying some combination of the clock rate (using a known prior; uniform distribution), the clock model (strict or relaxed lognormal) and the partitioning scheme (noncoding + coding vs. no partition; when using the partitioned data, I apply the same site model, the one suggested by ModelGenerator, to both partitions). For the substitution rate and clock rate, I set prior values where appropriate and kept the "estimate" box checked (as I want a calibrated estimate of the mutation rate). For the site model, I use the gamma category count, shape, proportion invariant and kappa suggested by ModelGenerator, though I have kappa set to "estimate" and use empirically-derived frequencies, which may be incorrect. I run everything for 50M iterations twice (5M burn-in), then merge the runs using LogCombiner.

Using Tracer, I have been naively comparing the different models (having only learned about AICM and SS/PS yesterday). For example, my partitioned dataset using two rate priors gave very healthy ESS values for all parameters (>1000), whereas the runs with, say, no partitions + a prior on the clock rate + a strict clock, or no partitions + a prior on the clock rate + a relaxed lognormal clock, give very low ESS for the posterior, the prior and HyperPrior.hyperNormal-sigma-ClockPrior (all ~10). I interpreted these low ESS values as indicating a poor fit of those models to the data, and did not run them for longer. Again, these may all be mistakes on my part, and please correct me if that's the case.

An aside: for all of these models I get a single mutationRate parameter, always fixed at "1" with a "-" ESS. Is this expected? How does "mutationRate" differ from "clockRate"?

When I learned about AICM as a method of model comparison, I ran it, and the results supported my very naive interpretation of the Tracer logs. However, reading previous posts in this group, I saw there are issues with using AIC (which is different from AICM, correct?) when comparing models, and that SS/PS are the gold standards. There were several references to a tutorial on how to run these via BEAST; however, the "How To" section of the BEAST2 Model Comparison page is empty (likewise for the BEAST1 website). Is using AICM acceptable for model comparison, or are PS/SS required for publication-level analyses? If anyone could explain how to run these using BEAST, that would be fantastic.

Thanks for your time and this excellent resource,

Kevin

Alexei Drummond

Jun 22, 2016, 8:06:47 AM
to beast...@googlegroups.com
Dear Kevin,

In answer to your question about whether ESS can tell you about model fit: the short answer is no.

Long answer: the rate of mixing of the MCMC chain doesn’t tell you anything about the appropriateness of the model. It only tells you that the MCMC algorithm is not as efficient at sampling some posterior distributions as others. Equivalent MCMC algorithms can vary greatly in mixing, but, given long enough, they still produce the same posterior distribution for the same data under the same model. Furthermore, an alternative but equivalent MCMC algorithm to the ones implemented in BEAST2 may well mix quickly on the models that you find BEAST2 mixing slowly on, and vice versa.

The only thing that you can infer from low ESS values is that you have to run the chain longer if you want to assess the fit of that particular model. Having said that, another small possibility is that the models that are mixing badly are doing so because of an error made in specifying the model. Examples of such an error could be (i) the use of an improper prior, or (ii) the introduction of non-identifiability by overparameterization.
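The relationship between autocorrelation and ESS can be illustrated with a small stand-alone sketch. This is not Tracer's exact algorithm (Tracer's estimator differs in its details); it is a simplified initial-positive-sequence ESS estimate, applied to two toy chains that target the same N(0,1) distribution but mix very differently:

```python
import random

def autocorr(x, lag):
    """Sample autocorrelation of a chain at a given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    if var == 0:
        return 0.0
    cov = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var

def ess(x, max_lag=None):
    """Effective sample size: chain length divided by the integrated
    autocorrelation time.  Sums positive autocorrelations and stops at the
    first non-positive one (a simplification of what Tracer reports)."""
    n = len(x)
    max_lag = max_lag or n // 2
    tau = 1.0
    for lag in range(1, max_lag):
        rho = autocorr(x, lag)
        if rho <= 0:
            break
        tau += 2.0 * rho
    return n / tau

random.seed(1)

# A well-mixing "chain": independent draws, so ESS is close to the length.
iid = [random.gauss(0, 1) for _ in range(2000)]

# A poorly mixing chain: strong AR(1) autocorrelation gives a much lower
# ESS, even though both chains sample the very same N(0,1) target.
sticky = [0.0]
for _ in range(1999):
    sticky.append(0.95 * sticky[-1]
                  + random.gauss(0, 1) * (1 - 0.95 ** 2) ** 0.5)

print(round(ess(iid)), round(ess(sticky)))
```

Both chains converge to the same distribution; the low ESS of the second only says that more iterations are needed per effectively independent sample, not that its target fits the data worse.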

Cheers
Alexei


Guy Baele

Jun 22, 2016, 8:26:07 AM
to beast-users
Hi Kevin,

We have only shown that the AICM may be suitable (while still being outperformed by PS/SS) for comparing simple parametric coalescent models, e.g. constant population size versus exponential growth. For comparing molecular clock models, the AICM performs terribly. So yes, PS/SS are the current standards for Bayesian model selection and are suitable for publication-level analyses. Newer methods are on the way, such as GSS, which offers more accurate marginal likelihoods in the case of vague priors and requires less computation time, but these are not yet fully available for all data types and models (we are working on it, though).
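The idea behind stepping-stone sampling can be shown on a toy model where the marginal likelihood is known in closed form. This is only a sketch of the estimator, not BEAST's implementation: here the power posteriors are conjugate, so we can draw from them directly, whereas BEAST runs an MCMC chain at each power-posterior step:

```python
import math
import random

random.seed(7)

# Toy conjugate model with a known answer:
#   theta ~ N(0, 1) (prior),  y | theta ~ N(theta, 1),  observed y = 1.2.
# Analytically, p(y) = N(y; 0, 2), so the estimate can be checked exactly.
y = 1.2
true_log_ml = -0.5 * math.log(2 * math.pi * 2) - y * y / (2 * 2)

def log_lik(theta):
    return -0.5 * math.log(2 * math.pi) - (y - theta) ** 2 / 2

def sample_power_posterior(beta, n):
    """Draws from prior x likelihood^beta.  In this conjugate toy model
    that is N(beta*y/(1+beta), 1/(1+beta)); BEAST would use MCMC here."""
    mean = beta * y / (1 + beta)
    sd = math.sqrt(1 / (1 + beta))
    return [random.gauss(mean, sd) for _ in range(n)]

def stepping_stone(betas, n_per_step=5000):
    """Stepping-stone estimate of log p(y): for each adjacent pair of
    beta values, add the log mean of likelihood^(beta_next - beta) under
    the power posterior at beta."""
    log_ml = 0.0
    for b0, b1 in zip(betas[:-1], betas[1:]):
        thetas = sample_power_posterior(b0, n_per_step)
        terms = [(b1 - b0) * log_lik(t) for t in thetas]
        m = max(terms)  # log-mean-exp for numerical stability
        log_ml += m + math.log(sum(math.exp(t - m) for t in terms)
                               / len(terms))
    return log_ml

# A cubic schedule concentrates steps near the prior (beta = 0), a common
# choice for power-posterior paths.
betas = [(i / 20) ** 3 for i in range(21)]
print(round(stepping_stone(betas), 3), round(true_log_ml, 3))
```

The estimate lands very close to the analytic value; with real phylogenetic models the same sum is taken over MCMC samples at each beta, which is why PS/SS cost many extra chain runs.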

Best regards,
Guy


On Wednesday, June 22, 2016 at 1:18:30 PM UTC+2, Kevin Daly wrote:

Kevin Daly

Jun 22, 2016, 11:13:42 AM
to beast-users
Thank you both! Would you suggest running the problematic models for longer, then? With ESS values this low, it would take runs of ~500M iterations to get close to an ESS of 100, assuming there is no issue of improper priors or overparameterization.

For model comparison, then, I could run PathSampler (following these instructions: http://beast2.org/path-sampling/ and http://beast2.org/2014/07/14/path-sampling-with-a-gui/). Would one first run a model to ensure the parameters are mixing, then rerun it for the same number of iterations (e.g. 50M) by modifying the BEAUti-generated XML file as described in the links? Is there no way to exploit the fact that you have already run the model in BEAST? I assume you would do this for all models, then select the model with the highest log marginal likelihood (i.e. the one closest to 0) as your best-supported model?
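Once each model's log marginal likelihood has been estimated, the comparison itself is simple arithmetic. A minimal sketch, using invented log-ML values purely for illustration (the model names and numbers below are hypothetical, not from any actual run):

```python
import math

# Hypothetical log marginal likelihoods, one per candidate model, of the
# kind a PathSampler analysis reports at the end of each run.  The values
# here are made up for illustration only.
log_ml = {
    "strict_clock_unpartitioned": -10234.7,
    "relaxed_clock_unpartitioned": -10221.3,
    "relaxed_clock_partitioned": -10198.9,
}

# The best-supported model is the one with the highest (least negative)
# log marginal likelihood.
best = max(log_ml, key=log_ml.get)
print("best-supported model:", best)

for model, lml in sorted(log_ml.items(), key=lambda kv: -kv[1]):
    # 2 * log Bayes factor of the best model over this one; on the usual
    # interpretation scale, values above ~10 count as very strong support.
    print(model, "2lnBF vs best:", round(2 * (log_ml[best] - lml), 1))
```

The key point is that models are ranked by log marginal likelihood directly; the differences (log Bayes factors) quantify how much better the winner is supported.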

Thanks again 