bModelTest use and settings

Edward B

Jul 22, 2018, 1:35:27 PM
to beast-users
Hi folks (and Remco),

Another quick question.

I was running a test on 10 bird taxa (individual mtDNA regions spread over several individuals from several families), with two calibration points from previous studies, running two replicates of each. I am getting good ESS values well above 200 thanks to runs of 10 million+ states, and after burn-in the runs look good, BUT when comparing the two runs in Tracer I seem to be getting bimodality BETWEEN them (not within each run) in several parameters (namely proportion invariant, ucld stdev, and rate coefficient of variation). Another poster (Remco) explained that this indicates non-convergence and suggested using bModelTest. Having read the PDF and done the tutorial (with the primate mtDNA dataset), I have several questions regarding this program and its settings:
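As an aside for readers wondering what an ESS "well above 200" actually measures: it is the number of effectively independent draws in an autocorrelated MCMC trace. Below is a rough Python sketch of a Tracer-style estimator (the truncation rule here is simplified; Tracer's exact algorithm differs, and the data are simulated, not from a BEAST run):

```python
import numpy as np

def effective_sample_size(trace):
    """Crude ESS estimate: N / (1 + 2 * sum of autocorrelations),
    truncating the sum at the first non-positive autocorrelation."""
    x = np.asarray(trace, dtype=float)
    n = len(x)
    x = x - x.mean()
    var = x.var()
    if var == 0.0:
        return float(n)
    tau = 1.0  # integrated autocorrelation time
    for lag in range(1, n // 2):
        rho = np.dot(x[:-lag], x[lag:]) / ((n - lag) * var)
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau

rng = np.random.default_rng(42)
iid = rng.normal(size=10_000)  # independent draws: ESS should be near N
print(f"ESS of independent samples: {effective_sample_size(iid):.0f}")
```

A strongly autocorrelated chain of the same length would report a much smaller ESS, which is why long (10 million+) runs are needed to push ESS past 200.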

1. Am I correct in my understanding that bModelTest can be used to select the most appropriate model for the dataset AND to find the correct consensus tree at the same time, i.e. alleviating the need to specify a substitution model (HKY, GTR, etc.) as you would normally do?

2. I am also assuming that you have burn-in periods when using bModelTest; is this correct?

3. Is there any reason that I cannot use my two calibration sets (set up in the Priors section) with bModelTest, AND if it is OK, are there any specific settings I should be aware of?

4. Regarding settings: as my data is mtDNA, I have partitioned each region's dataset by codon position (C1/2/3) and run these unlinked, BUT have linked their clock model and tree model, reasoning that each region (the whole mitogenome, in fact) is a single gene sharing one history and so should be linked... is this correct?

5. As my dataset is not intraspecific, am I correct to use a relaxed clock log-normal model rather than a strict clock model? I originally chose the relaxed clock log-normal model rather than the relaxed clock exponential model as I read in the literature that it was most appropriate, BUT would the exponential model be more appropriate, and if so can someone explain why?

6. Finally, am I best leaving the rest of the priors at their default settings, OR should I tweak specific ones (and if so, which)?


Sorry for so many questions, but hopefully (or hopefully not) someone else is going through a similar scenario and this will help them at the same time. Thanks for your comments/help.


Edward



Remco Bouckaert

Jul 22, 2018, 3:52:05 PM
to beast...@googlegroups.com
Hi Edward,

On 23/07/2018, at 5:35 AM, Edward B <egb...@gmail.com> wrote:

> 1. Am I correct in my understanding that bModelTest can be used to select the most appropriate model for the dataset AND to find the correct consensus tree at the same time, i.e. alleviating the need to specify a substitution model (HKY, GTR, etc.) as you would normally do?

Yes — bModelTest actually does model averaging, so there is no need to commit to a particular model.
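To make the model-averaging idea concrete, here is a toy Python sketch. The models, posterior weights, and parameter estimates are invented for illustration; bModelTest actually jumps between substitution models inside the MCMC rather than combining fixed per-model runs, but the resulting estimate has this weighted-average interpretation:

```python
# Toy Bayesian model averaging: rather than picking one substitution
# model, average a quantity of interest across models, weighting each
# by its posterior probability. All numbers below are made up.
posterior = {
    # model: (posterior probability, estimate of some parameter under it)
    "JC69": (0.05, 2.10),
    "HKY":  (0.60, 2.45),
    "GTR":  (0.35, 2.50),
}

averaged = sum(p * est for p, est in posterior.values())
print(f"model-averaged estimate: {averaged:.3f}")
```

In a real bModelTest run, the per-model weights fall out of how often the sampler visits each model, and the averaging happens automatically over the posterior sample.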

> 2. I am also assuming that you have burn-in periods when using bModelTest; is this correct?

Yes, but this is no different from any other MCMC run.

> 3. Is there any reason that I cannot use my two calibration sets (set up in the Priors section) with bModelTest, AND if it is OK, are there any specific settings I should be aware of?

There should be no restriction on how to set up calibrations because of the choice of site model. Perhaps you ran into a bug and something else went wrong during setting up the analysis. If you set up the analysis again, does the problem persist? If so, what is shown in BEAUti under the menu Help/Messages?

> 4. Regarding settings: as my data is mtDNA, I have partitioned each region's dataset by codon position (C1/2/3) and run these unlinked, BUT have linked their clock model and tree model, reasoning that each region (the whole mitogenome, in fact) is a single gene sharing one history and so should be linked... is this correct?

Yes, the tree should be linked for mtDNA genes, since they share the same history.

> 5. As my dataset is not intraspecific, am I correct to use a relaxed clock log-normal model rather than a strict clock model? I originally chose the relaxed clock log-normal model rather than the relaxed clock exponential model as I read in the literature that it was most appropriate, BUT would the exponential model be more appropriate, and if so can someone explain why?

Perhaps the best way to choose a clock model is through model selection. Using nested sampling (https://github.com/BEAST2-Dev/nested-sampling/wiki) or stepping stone analysis, you can estimate the marginal likelihoods for each of the clock models and base your choice on the best fitting model.
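As an illustration of that workflow, suppose nested-sampling or stepping-stone runs returned log marginal likelihoods for the three clock models (the numbers below are made up purely for illustration). The comparison then reduces to log Bayes factors between models:

```python
# Hypothetical log marginal likelihoods from separate model-selection
# runs (e.g. nested sampling); the values are invented for this sketch.
log_ml = {
    "strict": -10234.7,
    "relaxed_lognormal": -10221.3,
    "relaxed_exponential": -10225.9,
}

best = max(log_ml, key=log_ml.get)
print("best-fitting clock model:", best)

# Log Bayes factor of the best model against each alternative;
# a ln BF above roughly 5 is conventionally read as strong support.
for model, ml in log_ml.items():
    if model != best:
        ln_bf = log_ml[best] - ml
        print(f"ln BF({best} vs {model}) = {ln_bf:.1f}")
```

Note that marginal-likelihood estimates carry their own standard errors, so small differences between models should not be over-interpreted.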

> 6. Finally, am I best leaving the rest of the priors at their default settings, OR should I tweak specific ones (and if so, which)?

The unwelcome answer is that this falls in the 'it depends' category. You know your data best, so you should include all prior information in the analysis, which typically involves some (or even a lot of) thought. If you are not sure, you can always test how sensitive the analysis is to your prior by running it under different priors and comparing the feature of the analysis that you are interested in.

Cheers,

Remco

Edward B

Jul 23, 2018, 9:14:27 AM
to beast-users
Hi Remco,

thanks for your reply.

Regarding 1/2 below, it's reassuring to know that I have it right... or am finally thinking along the right lines... phew, BEAST is a long learning curve!

Regarding 3 below, there wasn't an issue; I was just making sure I wasn't doing something I shouldn't be.

Regarding 5 below, I looked at the literature and tutorials online and, to be honest, didn't understand what I had to do (and/or how to interpret the output), so I would prefer not to apply this approach in this situation. What I did find, though, was BMA, and it appears this will help me choose between the three relaxed clock models. One question here: does BMA work in a similar way to bModelTest, in that it averages across models RATHER than testing each individually and giving an output that suggests the best one to choose and use in subsequent runs? AND if the former, is there any reason BMA and bModelTest cannot be used in the same run?

Thanks for your efforts on this issue.

Edward

Edward B

Aug 8, 2018, 8:06:21 AM
to beast-users
Hi All,

just for clarification, can anyone help with my final question? Namely:

Does BMA work in a similar way to bModelTest, in that it averages across models RATHER than testing each individually and giving an output that suggests the best one to choose and use in subsequent runs? AND if the former, is there any reason BMA and bModelTest cannot be used in the same run?

Regards,

Edward

