Divergence times with an average rate across genes

Santiago Sánchez

unread,

Apr 28, 2022, 2:29:11 PM4/28/22

to beast...@googlegroups.com

Dear BEAST developers,

I remember the is/used to be an option in BEAST1 to set an average rate of substitution across genes rather than specifying the rate for one genes and co-estimating the rate for the rest of the genes. The rates of all the genes are then drawn from a normal distribution with that fixed mean rate.

I'm wondering if anyone has some code they can share or a way to implement this in BEAST2?

I want to run starBEAST with several hundred genes but I only have an average rate that I estimated with a different data set.

Thanks,

Santiago

Jordan Douglas

unread,

Apr 28, 2022, 6:17:31 PM4/28/22

to beast-users

Hi Santiago,

Using the Starbeast3 package, this can be done with the Clock Model tab in beauti.

First, open 'Mode' and disable "Automatic set clock rate"

Then, open the Clock Model tab. Here you can fix clock.rate at some constant, or you can give it a prior and estimate it.

If you data supports it, you could use the relaxed clock model instead (where there is 1 substitution rate per branch in the species tree, and the gene tree inherits branch rates from the species tree). This model is only implemented in StarBeast2 and 3.

Hope this helps,

Jordan

Santiago Sánchez

unread,

Apr 28, 2022, 8:37:42 PM4/28/22

to beast...@googlegroups.com

Hi Jordan,

Thanks for the input!

Would this be assuming a single clock model for all gene partitions?

I was kind of hoping to have each gene have it’s own clock model and just have, say the ucld.mean parameter, be sampled from an hyperparameter distribution that has the global mean.

But now that I think about it having a single clock model might actually be a better solution and have values drawn for each gene from it.

Thanks again,

Santiago

--
You received this message because you are subscribed to the Google Groups "beast-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beast-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beast-users/25db445d-95c6-442a-915b-f552cd03e421n%40googlegroups.com.

Jordan Douglas

unread,

Apr 29, 2022, 12:57:38 AM4/29/22

to beast-users

Hi Santiago,

No worries :)

By default, each gene will have its own clock rate and these clock rates will be multiplied by the clock.rate I described above (which I think is close to what you originally wanted).

At the XML level, the gene tree likelihoods will look like this (one gene clock rate rate per gene):

And these gene tree rates will be multiplied by the scalar clock rate of the clock model:

By default, the gene clock rates are sampled from a fairly tight lognormal distribution with a mean of 1. In theory, this could lead to non-identifiability problems between heights and rates, but this has not been an issue in my experience