Yule Speciation Model

cadhla


Nov 13, 2007, 9:23:35 AM

to beast...@googlegroups.com

Hi,

I want to estimate the TMRCA for the split of a group of species within a genus using BEAST. Does anyone know whether I need to use the Yule process for this, since they are not a single species? Or can I use the regular population growth models?

Thanks!

lynn

alexei....@gmail.com


Nov 29, 2007, 10:42:41 PM

to beast-users

Dear Lynn,

If you have a single sequence from each species then the Yule process
is a reasonable tree prior to choose. It assumes a pure birth process
but the distribution of branch lengths it assumes is very similar to
that of a birth-death process with the equivalent net birth rate. It
always makes sense to investigate the sensitivity of your estimates to
the choice of prior, so I would recommend trying more than one tree
prior if you are uncertain which one to use.

Cheers
Alexei

alexei....@gmail.com


Dec 2, 2007, 3:32:30 PM

to beast-users

Graham Jones emailed me this about my post below:

----------------------------------------------------
Hmmm... I was looking at Yang and Rannala 97, and came to a different
conclusion. From their equation (3), assuming the root time t_1 is
fixed, the distribution of the node times is proportional to a product of
things like

f(t) = [1 + ((S/s)(d/v) - 1)exp(-dt)]^-2 exp(-dt)

where S = total number of extant species descended from the LCA of the
sampled species, s is number of sampled species, v = speciation rate,
d = diversification rate (= v-u, where u is extinction rate).

For some analyses (S/s)(d/v) could be close to 1, so f(t) is close to
exp(-dt), as for a pure birth process with rate d, but it seems it
could easily be much bigger or smaller than 1 in others, and then f(t)
can be very different from exp(-dt). Have I missed something?
----------------------------------------------------
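As a rough numerical illustration of Graham's point (an editorial sketch, not part of his email): the code below evaluates f(t) exactly as written above and divides it by exp(-dt). The function names and the shorthand c = (S/s)(d/v) are assumptions introduced here for illustration only.

```python
import math

def node_time_kernel(t, c, d):
    """f(t) as written in the quoted email, with c = (S/s)(d/v)
    and d the diversification rate. Unnormalised."""
    decay = math.exp(-d * t)
    return decay / (1.0 + (c - 1.0) * decay) ** 2

def ratio_to_pure_birth(t, c, d):
    """f(t) divided by exp(-dt), the pure-birth kernel with rate d."""
    return node_time_kernel(t, c, d) / math.exp(-d * t)

if __name__ == "__main__":
    # c = 1 reproduces exp(-dt); c far from 1 distorts it most near t = 0.
    for c in (0.1, 1.0, 10.0):
        ratios = [ratio_to_pure_birth(t, c, d=1.0) for t in (0.0, 0.5, 2.0, 5.0)]
        print(c, [round(r, 4) for r in ratios])
```

At t = 0 the ratio is c^-2, and it tends to 1 as t grows, so under this formula the departure from exp(-dt) is concentrated at recent node times.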

And he is correct that the Yule model can be quite different from the
birth-death-sampling model. The Yule model is a simple one-parameter
model nested inside the general birth-death-sampling model of Yang and
Rannala 97. I agree with Graham that the sampling proportion can
definitely make a big difference to the prior if it is very small. But
if we put the sampling proportion aside for a moment then Yule versus
birth-death isn't as big a difference in priors as Yule versus
coalescent. The coalescent prior varies quadratically with the number
of lineages spanning an internode time whereas the Yule prior varies
linearly. As I remember, the birth-death model is most different to
the Yule model near the root...
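To make the quadratic-versus-linear contrast concrete, here is a minimal sketch. It assumes the standard constant-size coalescent, where k lineages coalesce at total rate k(k-1)/(2N), and a Yule process where k lineages split at total rate k times the birth rate. The function names and the parameters pop_size and birth_rate are hypothetical, chosen only for this example.

```python
def coalescent_rate(k, pop_size=1.0):
    """Total coalescence rate with k lineages (constant population size,
    time measured in the same units as pop_size): quadratic in k."""
    return k * (k - 1) / (2.0 * pop_size)

def yule_rate(k, birth_rate=1.0):
    """Total speciation rate with k lineages under pure birth: linear in k."""
    return k * birth_rate

if __name__ == "__main__":
    # Expected internode time is 1 / rate in both models.
    for k in (2, 5, 10, 50):
        print(k, 1.0 / coalescent_rate(k), 1.0 / yule_rate(k))
```

With many lineages the expected coalescent intervals shrink much faster than the Yule intervals, which is why the two priors favour differently shaped trees.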

Suffice to say, Yule and birth-death priors *are* different. We
implemented birth-death models in BEAST quite a long time ago but
haven't satisfied ourselves about their behaviour sufficiently to
release them in BEAUti yet. It is on my list :-) The main wrinkle is
that the formulation in Yang & Rannala 1997 doesn't impose *any* prior
on the first internode time after the root. We don't think this is
right, because sampling trees under that prior gives a distribution of
trees with root nodes that tend towards infinity (clearly not what you
would get if you simulated trees forward in time under the same
model).

Cheers
Alexei

Rodrigo Colpo


Dec 4, 2007, 3:56:12 PM

to beast...@googlegroups.com

Hello!

I'm a new user of the BEAST package, and I'm having some difficulty analysing Tracer output. I think this is a very important step, so I assumed a Tracer manual would exist. But I can only find "A Rough Guide to BEAST 1.4", and it doesn't contain much information about the analysis.

Where can I find a detailed manual for this application, one that explains each of the parameters?

Thanks
Rodrigo


Graham Jones


Dec 8, 2007, 9:31:53 AM

to BEAST users

Thanks for the reply - I'm glad you confirmed what I thought. Some further
comments...

If (S/s)(d/v) is small this makes nodes more likely to be recent, as
compared to the Yule model. It becomes more like the coalescent model. This
would be the situation if you sampled all or nearly all extant species in
some clade, and d/v is small, because there have been nearly as many
extinctions as speciations since the LCA. It might be appropriate if you
sampled all 20-odd crocodiles for example.

If (S/s)(d/v) is big, the opposite happens, and recent nodes are more
unlikely. It might be appropriate if you sampled 20 randomly chosen birds,
for example.

So roughly speaking, the models line up like

....coalescent....crocs....Yule....birds....

On your `wrinkle', a prior for the first node time (Yang & Rannala's t_1), I
agree they have no prior for t_1 and that this is a problem. Yang & Rannala
take t_1 to be 1.0 and only make inferences about u and v relative to t_1.
The same seems to be true of Nee 2001 which is referenced in the BEAST code.

Nee's equation (3) is for the joint probability for t_2...t_s' and s, given
t_1 and v (where s' = s-1), ie

Pr(t_2,...t_s', s | t_1, v)

That does not seem the correct quantity for Bayesian phylogenetic analysis
to me. I think one should use

Pr(t_1,...t_s' | s, v)

if you regard s as chosen by the researcher, or possibly
Pr(t_1,...t_s', s | v) if you regard s as a random variable produced by an
evolutionary process. In any case, t_1 is surely a random variable in almost
all phylogenetic analyses. What's missing from Y & R and Nee is an
expression for Pr(t_1 | s, v).

In the case u=0, and no subsampling, ie the Yule model, you can calculate a
distribution for t_1. Assume the S=s species are sampled just before the
next speciation, ie just before there are s+1 species. Then t_1 = X_2 + ...
+ X_s, where X_n is exponential with parameter nv, and t_1 has density
proportional to

f(t) = exp(-2vt) * (1-exp(-vt))^(s-2).
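A quick numerical check of this density (an editorial sketch under stated assumptions, not from the original post): the normalising constant s(s-1)v used below is not given in the post; it follows from integrating the expression above over t. The code verifies that the resulting density integrates to 1 and that its mean matches the sum of the exponential means 1/(nv) for n = 2..s. All function names are hypothetical.

```python
import math

def yule_root_age_density(t, s, v):
    """Density of t_1 = X_2 + ... + X_s with X_n ~ Exponential(n*v).
    The constant s*(s-1)*v is an assumption derived for this sketch."""
    e = math.exp(-v * t)
    return s * (s - 1) * v * e ** 2 * (1.0 - e) ** (s - 2)

def trapezoid(f, lo, hi, steps=200_000):
    """Plain composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

def check(s, v, upper=60.0):
    """Return (total mass, numerical mean, expected mean) for given s and v."""
    mass = trapezoid(lambda t: yule_root_age_density(t, s, v), 0.0, upper / v)
    mean = trapezoid(lambda t: t * yule_root_age_density(t, s, v), 0.0, upper / v)
    expected_mean = sum(1.0 / (n * v) for n in range(2, s + 1))
    return mass, mean, expected_mean

if __name__ == "__main__":
    print(check(s=10, v=1.0))
```

The numerical mean agreeing with the sum of 1/(nv) is consistent with the sum-of-exponentials argument above.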

Then multiplying this by Nee's equation (5), you get

Pr(t_1,...t_s' | s, v) proportional to exp(-v(2*t_1 + t_2 +...t_s')).

This is an identical formula to Nee's equation (3), but it is for a
different quantity. So, I think you have the right formula in BEAST, but
maybe the comment needs work ;-).

For the birth-death model, I *think* that given s,v,u, then t_1 has density
proportional to

(1-E)^(s-2) * ( 1 - (u/v)*E )^(-s-1) * E^2

where E = exp(t*(u-v)), with a more complicated expression if sampling is
also modelled. This should be multiplied by Yang and Rannala 97, equation
(3) to give a value for
Pr(t_1,...t_s' | S, s, v, u).
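One consistency check on the tentative birth-death expression above (an editorial sketch; function names are hypothetical, and both kernels are left unnormalised): setting u = 0 gives E = exp(-vt), so the expression should collapse to the Yule kernel exp(-2vt) * (1-exp(-vt))^(s-2) from earlier in this post.

```python
import math

def bd_root_age_kernel(t, s, v, u):
    """Unnormalised birth-death kernel for t_1 as written in the post,
    with E = exp(t*(u - v)). Assumes u < v."""
    E = math.exp(t * (u - v))
    return (1.0 - E) ** (s - 2) * (1.0 - (u / v) * E) ** (-s - 1) * E ** 2

def yule_root_age_kernel(t, s, v):
    """Unnormalised Yule kernel exp(-2vt) * (1 - exp(-vt))^(s-2)."""
    e = math.exp(-v * t)
    return e ** 2 * (1.0 - e) ** (s - 2)

if __name__ == "__main__":
    s, v = 8, 1.5
    for t in (0.1, 0.5, 1.0, 3.0):
        # With u = 0 the two columns should agree.
        print(t, bd_root_age_kernel(t, s, v, u=0.0), yule_root_age_kernel(t, s, v))
```

This only confirms that the two formulas agree in the pure-birth limit; it does not verify the birth-death expression for u > 0.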