I am attempting to determine which relaxed clock model (lognormal or
exponential) is a better fit for my data. I have come across a
problem that has been alluded to in previous posts, but never
specifically addressed. I am using four different genes from the
same organism (distinct data sets with different BEAST runs) that
have been serially sampled. Under the exponential clock model, the
coefficient of variation for the four different genes ranges from
9.2 to 9.8 and the ESS is always exactly 4.781 (rather low?). Looking
at the trace, the values fluctuate within an extremely narrow
interval (StdDev = 4E-14) for the entire run. When I use the
lognormal clock everything looks kosher: the coefficient of variation
is different for each gene (0.3 - 0.6) and the ESS values are all well
above
200. The BEAST runs are 25 million generations (sampled every 2500),
and the other ESS values are generally very good.
Bayes Factor measurements indicate a lower marginal likelihood for
the exponential model, but given the general preference for LogNormal
indicated by the authors and this strange behavior of the coefficient
of variation under the exponential model, I am inclined to use the
lognormal model. Does anyone have any insight into this problem?
Thank you,
Joel Wertheim
I think you mean between 0.92 and 0.98. The exponential distribution
of rates across branches will always have a coefficient of variation
of 1.0. It is a one-parameter distribution (the parameter determines
both the mean and the variance). The only reason that BEAST returns a
number slightly under one is because of the way the distribution is
discretized across the branches in the tree. So you should ignore the
ESS estimates for this statistic when using the exponential model. If
you don't think that the coefficient of variation is about one for your
data, then you probably shouldn't use the exponential model.
Cheers
Alexei
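
To illustrate Alexei's point about discretization, here is a minimal
sketch in Python. It assumes the rate categories are taken at the
midpoint quantiles (i + 0.5)/n of an exponential distribution, with one
category per branch; that quantile scheme and the function name
discretized_exponential_cv are assumptions made for illustration, not
taken from the BEAST source. Under that assumption the coefficient of
variation comes out a little below 1 and approaches 1 as the number of
categories grows, regardless of the mean.

```python
import math


def discretized_exponential_cv(n_categories, mean=1.0):
    """CV of an exponential distribution discretized into n quantile categories.

    Assumption: each category takes the rate at the midpoint quantile
    (i + 0.5) / n_categories. This is an illustrative scheme only.
    """
    # Inverse CDF (quantile function) of the exponential: -mean * ln(1 - p)
    rates = [
        -mean * math.log(1.0 - (i + 0.5) / n_categories)
        for i in range(n_categories)
    ]
    avg = sum(rates) / n_categories
    variance = sum((r - avg) ** 2 for r in rates) / n_categories
    return math.sqrt(variance) / avg


# The CV stays below 1 and creeps towards 1 as categories are added.
for n in (10, 50, 200, 1000):
    print(n, round(discretized_exponential_cv(n), 4))
```

Because the value is fixed by the number of categories rather than by
the data, the statistic barely moves during a run, which is consistent
with the tiny standard deviation and meaningless ESS reported above.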
One further point to Alexei's - and this may just be a typo in your
email - but a lower marginal likelihood for the exponential model would
favour the lognormal model (unless by 'lower' you mean 'less negative').
Andrew
___________________________________________________________________
Andrew Rambaut
Institute of Evolutionary Biology University of Edinburgh
Ashworth Laboratories Edinburgh EH9 3JT
EMAIL - a.ra...@ed.ac.uk TEL - +44 131 6508624
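
As a small illustration of Andrew's point about the direction of the
comparison, the sketch below computes a log Bayes factor from two log
marginal likelihoods. The numbers are hypothetical placeholders, not
values from this thread, and the function names are made up for the
example. The model with the higher (less negative) log marginal
likelihood is the one favoured.

```python
def log_bayes_factor(log_ml_a, log_ml_b):
    """Log Bayes factor of model A over model B.

    Positive values favour A, negative values favour B.
    """
    return log_ml_a - log_ml_b


def favoured_model(name_a, log_ml_a, name_b, log_ml_b):
    """Return the name of the model with the higher log marginal likelihood."""
    return name_a if log_bayes_factor(log_ml_a, log_ml_b) > 0 else name_b


# Hypothetical log marginal likelihoods, for illustration only.
log_ml_lognormal = -10234.5
log_ml_exponential = -10241.2

lbf = log_bayes_factor(log_ml_lognormal, log_ml_exponential)
print("log BF (lognormal vs exponential):", round(lbf, 2))
print(
    "favoured:",
    favoured_model(
        "lognormal", log_ml_lognormal, "exponential", log_ml_exponential
    ),
)
```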