Calculating bisulfite conversion rates?


watermelon

Dec 2, 2017, 10:59:10 AM
to methylkit_discussion
Hi,

I am trying to calculate the bisulfite conversion rate for a whole-genome bisulfite sequencing dataset.

The bisulfite conversion rate at each non-CpG base can be calculated as T / (T + C) * 100, where T and C are the numbers of reads showing thymine and cytosine, respectively, at that base.


Below is an excerpt from the file for one sample (CHH context), aligned with Bismark and methylation-called with methylKit.

chrBase         chr        base  strand  coverage  freqC  freqT
scaffold1.1005  scaffold1  1005  F       12        0.00   100.00
scaffold1.1006  scaffold1  1006  F       13        0.00   100.00
scaffold1.1016  scaffold1  1016  F       17        0.00   100.00
scaffold1.1024  scaffold1  1024  F       18        0.00   100.00
scaffold1.1039  scaffold1  1039  F       17        11.76  88.24
scaffold1.1046  scaffold1  1046  F       16        0.00   100.00
scaffold1.1067  scaffold1  1067  F       23        0.00   100.00
.....
 
  
To calculate the overall conversion rate of this sample, I think I should compute the following (freqC and freqT are percentages, so I divide by 100 to get read counts):
1. all C = sum of (freqC / 100 * coverage) in this sample
2. all T = sum of (freqT / 100 * coverage) in this sample
3. overall conversion rate = all T / (all C + all T) * 100
 
Is this correct?



Also, what is the acceptable range of non-conversion rates?

 
Thank you very much!
 

Altuna Akalin

Dec 3, 2017, 5:05:39 AM
to methylkit_...@googlegroups.com
The conversion rate is calculated under the assumption that every C in a non-CpG context (CHH and CHG) is unmethylated and will therefore be converted to T. That is not a reasonable assumption for samples with high non-CpG methylation, but it is pretty much the only proxy for the conversion rate when there are no spike-in experiments.

So the conversion rate can be computed either as the sum of all non-CpG Ts divided by the sum of non-CpG (T + C), or as the average per-site conversion rate, i.e. the mean of T / (C + T) over all non-CpG sites. I forget how it is calculated in methylKit; probably the latter. It is somewhere in the C++ code.
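
For what it is worth, here is a minimal Python sketch of the two options, using the seven rows pasted in the question. It is not methylKit's own code: the function name `conversion_rates` and the tuple layout are made up for illustration, and it assumes freqC and freqT are percentages of the coverage column, as the excerpt suggests.

```python
# Rows copied from the excerpt in the question: (coverage, freqC, freqT).
# Assumption: freqC and freqT are percentages of `coverage`.
rows = [
    (12, 0.00, 100.00),
    (13, 0.00, 100.00),
    (17, 0.00, 100.00),
    (18, 0.00, 100.00),
    (17, 11.76, 88.24),
    (16, 0.00, 100.00),
    (23, 0.00, 100.00),
]


def conversion_rates(rows):
    """Return (pooled, per_site_mean) conversion rates in percent.

    Hypothetical helper for illustration only, not part of methylKit.
    """
    total_c = 0
    total_t = 0
    per_site = []
    for coverage, freq_c, freq_t in rows:
        # Recover integer read counts from the rounded percentages.
        c = round(coverage * freq_c / 100)
        t = round(coverage * freq_t / 100)
        if c + t == 0:
            continue  # no informative reads at this site
        total_c += c
        total_t += t
        per_site.append(100 * t / (c + t))
    if not per_site:
        raise ValueError("no sites with C or T reads")
    # Option 1: pool all reads, so high-coverage sites weigh more.
    pooled = 100 * total_t / (total_c + total_t)
    # Option 2: average the per-site rates, so every site weighs the same.
    per_site_mean = sum(per_site) / len(per_site)
    return pooled, per_site_mean


pooled, per_site_mean = conversion_rates(rows)
print(f"pooled: {pooled:.2f}%  per-site mean: {per_site_mean:.2f}%")
```

On these seven rows the two numbers come out at about 98.28% (pooled) and 98.32% (per-site mean). They differ because the pooled version weights each site by its coverage, while the per-site mean treats every site equally.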


Best,
Altuna
