Do DP4 values represent converted or unconverted values?


annie...@gmail.com

Nov 5, 2014, 10:04:58 AM
to bissn...@googlegroups.com
BisSNP is supposed to call the most likely genotype of a SNP, taking the bisulfite conversion rate into account. My question is: do the values in DP4 ("Reads supporting ALT, only keep good base. Number of 1) forward ref alleles; 2) reverse ref; 3) forward non-ref; 4) reverse non-ref alleles") in the VCF file represent the most likely "original" base, or the base after bisulfite conversion? For instance, take a site with ref C and alt T. I might get a genotype of 0/0 while the value in the DP4 field is 6,0,11,0. To me, that looks more like 0/1 than 0/0. Now, if the site were 50% methylated, I would expect that the Ts were likely not a true allele but rather the result of conversion, hence the 0/0 call. So, are the values in DP4 the probable alleles, or what was actually read on the sequencer? I am just trying to make sense of cases where the genotype call seems to differ drastically from the ratios of the DP4 values. I don't know which value to "trust" more.

ping

Nov 5, 2014, 12:07:40 PM
to bissn...@googlegroups.com
DP4 is not used for genotyping in WGBS; it is there only to satisfy the VCF v4.1 format requirement. BCR6 is more informative for bisulfite genotyping. For the position you describe, if the reference base is C, the counts could be 6 C reads and 11 T reads, which should give a much higher likelihood for the C genotype (0/0).
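
To illustrate why a raw 6:11 ratio does not by itself point to 0/1, here is a minimal sketch. It is not BisSNP's actual model: the function names are hypothetical, sequencing error is ignored, and conversion is assumed complete. It treats each cytosine-strand read as showing C only when it comes from a C allele that was not converted.

```python
import math

def prob_read_c(c_allele_fraction, methylation, conversion=1.0):
    """Probability that a cytosine-strand read shows C at the site.

    Assumption (toy model, not BisSNP's): a read shows C only if it comes
    from a C allele and that C was methylated, or unmethylated but left
    unconverted. Sequencing error is ignored.
    """
    stays_c = methylation + (1.0 - methylation) * (1.0 - conversion)
    return c_allele_fraction * stays_c

def log_likelihood(n_c, n_t, p_c):
    """Binomial log-likelihood (constant term dropped) of n_c C and n_t T reads."""
    return n_c * math.log(p_c) + n_t * math.log(1.0 - p_c)

n_c, n_t = 6, 11  # forward-strand counts from the DP4 example above

# Homozygous C/C with about 35% methylation
hom_ref = log_likelihood(n_c, n_t, prob_read_c(1.0, 6 / 17))
# Heterozygous C/T with about 71% methylation on the C allele
het = log_likelihood(n_c, n_t, prob_read_c(0.5, 12 / 17))

print(hom_ref, het)
```

Under these assumptions both genotypes explain the same counts equally well, each with a different methylation level, so reads from this strand alone cannot separate a C/T SNP from partial methylation. A genotyper then has to lean on other evidence, such as reads from the opposite strand (where a true C>T SNP appears as G>A and is unaffected by bisulfite conversion) or a prior that favors the reference genotype.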




--
Yaping Liu, Ph.D.

Postdoctoral Associate
Manolis Kellis Lab
Computer Science and Artificial Intelligence Lab (CSAIL)
Massachusetts Institute of Technology
Broad Institute of MIT and Harvard

