Lognormal calibration priors

Martin Dohrmann

Jan 28, 2008, 2:08:36 PM

to beast...@googlegroups.com

Dear BEAST-users,

I'm trying to estimate divergence times between multiple species from a DNA alignment using some fossil calibrations. The most appropriate prior distribution for the calibration points would be lognormal in my case (following Ho 2007: J. Avian Biol. 38). However, with lognormal priors (and also exponential priors) BEAST tells me that "The initial model is invalid because state has a zero probability". This occurs with both the exponential and lognormal relaxed clock models. When I use normal distributions (all else being equal), the analysis runs just fine. Substitution model is GTR+I+G(4), starting tree is user-specified. Any idea what's going on?

Also, I'm not sure how to choose the stdev and mean of the lognormal distribution in a meaningful way. Could anybody explain this to a "non-mathematical" person?

Cheers

Martin Dohrmann

PhD candidate

Department of Geobiology

Geoscience Centre Göttingen (GZG)

University of Göttingen

Goldschmidtstr. 3

37077 Göttingen

Germany

mdoh...@gwdg.de

Simon Ho

Jan 30, 2008, 3:38:58 AM

to beast-users

Hi Martin,

Did you create your input file using BEAUti? If not, then you need to
check that the starting tree satisfies the calibration constraint.
With a lognormal calibration prior, there is zero probability of the
nodal age being more recent than the offset (which is a 'hard' minimum
constraint). One way to check whether or not this is causing the
problem is to remove the calibration prior and see if
BEAST starts running. Actually, removing components of
the prior can be a good way of investigating any analyses that have an
initial posterior of zero.
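
A minimal Python sketch of this check, not BEAST's own code: it evaluates the log density of an offset lognormal and flags calibrated nodes whose starting age falls at or below the offset. The function names, the node name `cladeA`, and all parameter values are hypothetical, and `mu` and `sigma` are assumed to be the log-scale mean and standard deviation.

```python
import math


def offset_lognormal_logpdf(age, offset, mu, sigma):
    """Log density of offset + Lognormal(mu, sigma); -inf at or below the offset."""
    x = age - offset
    if x <= 0:
        # The offset acts as a hard minimum: zero probability, so log density is -inf.
        return float("-inf")
    z = (math.log(x) - mu) / sigma
    return -math.log(x * sigma * math.sqrt(2.0 * math.pi)) - 0.5 * z * z


def violated_calibrations(start_ages, offsets):
    """Names of calibrated nodes whose starting age has zero prior density."""
    return [name for name, off in offsets.items() if start_ages[name] <= off]


# Hypothetical example: a starting tree with node heights in substitutions
# per site (a phylogram) against a fossil offset given in Myr.
start_ages = {"cladeA": 0.35}
offsets = {"cladeA": 445.0}
print(violated_calibrations(start_ages, offsets))
print(offset_lognormal_logpdf(0.35, 445.0, 2.0, 0.5))
```

Node heights in a phylogram are measured in substitutions per site, so they will typically sit far below a fossil offset expressed in Myr and give the initial state a zero prior probability.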

In regard to your question about choosing the stdev and mean of the
lognormal distribution, I attempted (probably not too successfully) to
provide an explanation in a previous thread:
http://groups.google.com/group/beast-users/browse_thread/thread/1dde79dfaab20b19/309abe34d7f482a5?lnk=gst&q=lognormal+calibration#309abe34d7f482a5

If the above link doesn't work, you can just do a search for
'lognormal calibration' and it should be the first hit.

Cheers,
Simon

Martin Dohrmann

Jan 30, 2008, 12:40:44 PM

to drsi...@gmail.com, beast...@googlegroups.com

Hi Simon,

I used BEAUti, but I pasted a user-defined starting tree (a phylogram) into the XML file. Somebody suggested off-list that I use a chronogram produced by a simple r8s analysis instead, and it worked fine (thanks again, Joseph!).

Regarding the lognormal distribution, I had read that thread before but it didn't really help (sorry). It really seems to be a very subjective thing - fortunately BEAUti provides graphs of the prior distributions so one can play around until the distribution looks "reasonable" (whatever that means...).

Cheers,

Martin

mdoh...@gwdg.de

Andrew Rambaut

Jan 30, 2008, 2:54:56 PM

to mdoh...@gwdg.de, drsi...@gmail.com, beast...@googlegroups.com

When specifying a prior, BEAUti will report the upper and lower 1%,
2.5% and 5% tails of the distribution. This allows you to try
different parameters for the distribution to achieve a particular
probabilistic statement (e.g., "I am 95% sure that the divergence is
not older than X" or "I am 95% sure the split lies between X and Y").
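
The same tail values can be reproduced outside BEAUti. The sketch below is an illustration only, assuming an offset lognormal whose `mu` and `sigma` are the log-scale mean and standard deviation; whether those match the fields shown in a given BEAUti version is an assumption to verify. The function names and the example parameters are hypothetical.

```python
import math
from statistics import NormalDist


def offset_lognormal_quantile(p, offset, mu, sigma):
    """Age below which a fraction p of the offset lognormal prior lies."""
    return offset + math.exp(mu + sigma * NormalDist().inv_cdf(p))


def tail_table(offset, mu, sigma, tails=(0.01, 0.025, 0.05)):
    """Map each tail probability to its (lower, upper) quantile pair."""
    return {
        t: (
            offset_lognormal_quantile(t, offset, mu, sigma),
            offset_lognormal_quantile(1.0 - t, offset, mu, sigma),
        )
        for t in tails
    }


# Hypothetical parameters, chosen only to show the output format.
for tail, (lower, upper) in tail_table(offset=100.0, mu=2.0, sigma=0.5).items():
    print(f"{tail:.1%} tails: lower {lower:.2f}, upper {upper:.2f}")
```

Reading the upper 5% value as X gives the statement "I am 95% sure that the divergence is not older than X"; the lower and upper 2.5% values together bracket a 95% interval.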

Andrew

___________________________________________________________________
Andrew Rambaut
Institute of Evolutionary Biology University of Edinburgh
Ashworth Laboratories Edinburgh EH9 3JT
EMAIL - a.ra...@ed.ac.uk TEL - +44 131 6508624

Simon Ho

Jan 30, 2008, 6:55:23 PM

to beast-users

Hi Martin,

As Andrew mentioned, you can play with the parameters in order to
reflect your confidence in the fossil evidence. You can choose values
so that only 5% of the distribution lies above a certain value, X,
which can be translated as 'There is a 5% chance that the split is
actually older than X'. This is analogous to a 'soft' maximum bound,
sensu Yang and Rannala (2006).

Choosing this value will be influenced by quantifiable factors
(preservation probability, stratigraphic completeness, and radiometric
error). But it will also be influenced by non-quantifiable factors
(confidence in taxonomic assignment), making it difficult to formulate
an objective function for generating an appropriate prior
distribution. So it is probably inevitable that there will be an
element of subjectivity.

With a lognormal prior, the other parameter that needs to be chosen is
the mean (or median). This is also difficult to choose because it is a
subjective statement about the age at which the split most likely
occurred. Alternatively, you could avoid having to choose this
parameter by using an exponential prior.
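
With an offset exponential prior the mean is the only free parameter, so a single statement such as "there is a 5% chance that the split is older than X" determines it. A minimal sketch, assuming the prior is offset + Exponential(mean); the function names and the numbers (offset 100, soft maximum 130) are made up for illustration.

```python
import math


def exponential_mean_for_soft_max(offset, soft_max, upper_tail=0.05):
    """Mean of an offset exponential with P(age > soft_max) = upper_tail."""
    if soft_max <= offset:
        raise ValueError("soft_max must be older than the offset")
    # Survival function: P(age > x) = exp(-(x - offset) / mean); solve for mean.
    return (soft_max - offset) / -math.log(upper_tail)


def exponential_upper_tail(age, offset, mean):
    """P(node age > age) under the offset exponential prior."""
    if age <= offset:
        return 1.0
    return math.exp(-(age - offset) / mean)


mean = exponential_mean_for_soft_max(offset=100.0, soft_max=130.0)
print(f"mean = {mean:.3f}")
print(f"P(age > 130) = {exponential_upper_tail(130.0, 100.0, mean):.3f}")
```

The trade-off is that the exponential always places its highest density right at the offset, which is itself a statement about where the split most likely lies.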

Cheers,
Simon

Martin Dohrmann

Jan 31, 2008, 7:58:35 AM

to drsi...@gmail.com, beast...@googlegroups.com

Hi Simon,

Actually, there are two sources of uncertainty in my calibrations that I want to model. First, the literature gives only vague statements about the age of the fossils, such as "Upper Ordovician". As I see it, a normal distribution over the range of the stratigraphic interval would be appropriate to describe this uncertainty. Second, the fossil gives only a minimum age, i.e., there is a non-zero probability that the node is older than the interval given in the literature. Although there is no way of objectively deciding how much older it might be (and with what probability), I thought a lognormal prior would be most appropriate in this case. Am I on the right track, or is there a distribution better suited to these kinds of uncertainty?

Martin

mdoh...@gwdg.de

Simon Ho

Feb 1, 2008, 1:41:33 AM

to beast-users

Hi Martin,

It really depends on how conservative you wish to be with your fossil
evidence. The most conservative approach would be to put a hard
minimum constraint of 445 Myr on the node, as this represents the end
of the Upper Ordovician.

Adding any information beyond this would be subjective, so it is
ultimately up to you to decide how you would like to select a 'soft'
maximum bound. You could use the maximum age of the Upper Ordovician
(461 Myr) as a soft maximum, in the absence of any other information.
The median of the prior distribution could be chosen to match the
midpoint of the Upper Ordovician. Both of these would be somewhat
artificial, however, and there is no strong justification for using
this approach.
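
To make the arithmetic concrete, here is a minimal sketch that turns these three numbers (offset 445 Myr, median at the 453 Myr midpoint, 5% of the prior older than 461 Myr) into lognormal parameters. It assumes `mu` and `sigma` are the log-scale mean and standard deviation of the age minus the offset; the function name is hypothetical, and how the values map onto BEAUti's input fields is an assumption to check against your version.

```python
import math
from statistics import NormalDist


def lognormal_from_median_and_soft_max(offset, median, soft_max, upper_tail=0.05):
    """Solve for (mu, sigma) of an offset lognormal from its median and a soft maximum."""
    if not offset < median < soft_max:
        raise ValueError("expected offset < median < soft_max")
    # The median of offset + Lognormal(mu, sigma) is offset + exp(mu).
    mu = math.log(median - offset)
    # The (1 - upper_tail) quantile is offset + exp(mu + sigma * z).
    z = NormalDist().inv_cdf(1.0 - upper_tail)
    sigma = (math.log(soft_max - offset) - mu) / z
    return mu, sigma


mu, sigma = lognormal_from_median_and_soft_max(445.0, 453.0, 461.0)
print(f"mu = {mu:.4f}, sigma = {sigma:.4f}")
```

Here mu = ln(8) and sigma = ln(2) / z, where z is the standard normal 95% quantile (about 1.645), so sigma comes out at roughly 0.42.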

A good option at this point would be to contact a palaeontologist
familiar with the fossil group, and s/he would probably be able to
give you an informed estimate of the soft maximum bound and perhaps
suggest an appropriate shape for the prior distribution.

Cheers,
Simon

Martin Dohrmann

Feb 4, 2008, 7:42:53 AM

to drsi...@gmail.com, beast...@googlegroups.com

Hi Simon,

Thanks for your advice. How can I implement a hard minimum constraint in BEAST? BEAUti does not seem to provide that option, or am I missing something?

Cheers,

Martin

mdoh...@gwdg.de

Simon Ho

Feb 4, 2008, 5:44:00 PM

to beast-users

Hi Martin,

You can implement a minimum constraint by modifying the XML file
manually, but it's probably easier just to choose a uniform prior in
BEAUti. You can set the minimum bound of the interval to equal your
fossil constraint, and set the maximum bound to a very high number.
For example, you could use a uniform prior of (445, 1000).

Cheers,
Simon
