Definition of substitution rate vs. mutation rate


Eric Ma

May 9, 2014, 6:08:29 PM
to beast...@googlegroups.com
Hello,

I am a newcomer here, but I have a few small questions I wanted to clarify. I hope this isn't too troublesome.

Is the substitution rate of an organism/virus the same as its polymerase's mutation rate? Which one are we inferring from BEAST? The substitution rate should incorporate the effect of selection, while the polymerase mutation rate should be independent of any selection effects, is that correct?

If I were to try simulating the process of mutation over time, where at each time step, I allow a nucleotide sequence to mutate with a probability, how would I use the inferred substitution or mutation rate? For example, I know that substitution rates are given as substitutions/(site.year), so if we were simulating a week's worth of substitutional processes, would we divide the substitution rate by 52 to get the substitutions/(site.week)?

Thank you!

Cheers,
Eric

Mark Ravinet

May 9, 2014, 11:43:11 PM
to beast...@googlegroups.com
Hi Eric,

I'm not an expert on this, but in the interests of supporting the BEAST community I thought I'd weigh in with some answers to your questions.

> Is the substitution rate of an organism/virus the same as its polymerase's mutation rate? Which one are we inferring from BEAST? The substitution rate should incorporate the effect of selection, while the polymerase mutation rate should be independent of any selection effects, is that correct?

The short answer is no: the mutation rate is not the same as the substitution rate. If you measure a mutation rate (i.e. the rate of unrepaired damage to a nucleotide sequence), then you are observing all genetic differences. This is typically done using pedigree and lineage studies, and it includes all genetic differences except highly lethal mutations. In a pedigree study across a single generation, for example, we would be able to measure all mutations except those that are lethal to offspring from the first point of development.

A substitution rate is typically measured over a long-term time scale. As you mention, this means substitution rates are influenced by selection and, potentially, genetic drift. The effects of these processes increase with time, so substitution rates can decay; i.e. they are often much lower than mutation rates, by several orders of magnitude in some cases. I would strongly recommend you read this review by Simon Ho and others (http://onlinelibrary.wiley.com/doi/10.1111/j.1365-294X.2011.05178.x/full) for an in-depth insight into this.

Mutation rates can only be measured directly, i.e. by looking at DNA sequence differences across lineage or pedigree experiments. In BEAST we are typically estimating substitution rates. I suppose you could use sequence data from a pedigree in BEAST to get a mutation rate estimate, but I don't know if anyone has or whether this is worthwhile. Perhaps one of the BEAST developers can comment on this.

> If I were to try simulating the process of mutation over time, where at each time step, I allow a nucleotide sequence to mutate with a probability, how would I use the inferred substitution or mutation rate? For example, I know that substitution rates are given as substitutions/(site.year), so if we were simulating a week's worth of substitutional processes, would we divide the substitution rate by 52 to get the substitutions/(site.week)?

I'm not 100% certain of your question here. If you don't place any calibrations on the internal nodes and you specify a clock rate of 1.0, substitution rate estimates will be in units of substitutions per site across the length of the tree. However, I am not familiar with analysis of heterochronous sequence data, so perhaps others can be more informative here.
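
On the simulation part of the question, here is a minimal Python sketch of the rescaling being asked about. It is only an illustration under stated assumptions, not anything BEAST itself does. It assumes the rate is constant through time (so a rate in substitutions/(site.year) can simply be divided by the number of steps per year), that a year is roughly 52 weeks, that events at a site follow a Poisson process, and that every base is equally likely to change to any other base. The function names and the example rate of 5.2e-3 are made up for illustration.

```python
import math
import random


def per_step_rate(rate_per_site_per_year, steps_per_year=52):
    """Rescale a per-year rate to a per-step rate (assumes a constant rate)."""
    return rate_per_site_per_year / steps_per_year


def substitution_probability(rate_per_step):
    """Probability of at least one event at a site in one step (Poisson assumption).

    For small rates this is almost identical to the rate itself.
    """
    return 1.0 - math.exp(-rate_per_step)


def simulate(sequence, rate_per_site_per_year, n_steps, steps_per_year=52, rng=None):
    """Let each site change with the per-step probability, for n_steps steps.

    Assumption: a change picks one of the other three bases uniformly, and
    multiple hits at a site within one step are ignored.
    """
    rng = rng or random.Random()
    p = substitution_probability(per_step_rate(rate_per_site_per_year, steps_per_year))
    seq = list(sequence)
    for _ in range(n_steps):
        for i, base in enumerate(seq):
            if rng.random() < p:
                seq[i] = rng.choice([b for b in "ACGT" if b != base])
    return "".join(seq)


if __name__ == "__main__":
    hypothetical_rate = 5.2e-3  # substitutions/(site.year), made-up value
    print(per_step_rate(hypothetical_rate))  # rate in substitutions/(site.week)
    print(simulate("ACGT" * 10, hypothetical_rate, n_steps=52, rng=random.Random(1)))
```

Whether the rate you plug in should be a substitution rate or a mutation rate depends on what you want to simulate: a substitution rate already has selection and drift folded into it, so a simulation that applies selection separately would probably want a mutation rate instead.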

