Large discrepancy between DP and DP4 in VCF file


Sep 24, 2014, 3:15:03 PM
I tried SNP calling on a targeted bisulfite sequencing data set, using all of the default settings in BisSNP BisulfiteGenotyper. What I am noticing is that my DP is sometimes 50% larger than the sum of the values in the DP4 field. I understand that DP4 doesn't count low-quality bases, but it seems more is getting filtered out than that. I took a look at some of my sites in IGV and scrolled across the reads to check their mapping quality and base quality scores. For most of them both were very high. I am not sure whether those are the only settings that can affect DP4. The problem is that I am getting genotype calls that just don't seem right when you look at the raw allele frequencies. Is there something I am missing? Is there a setting that I should relax, or some way to figure out why reads that appear good in IGV are being treated as poor by the genotype caller? I would appreciate any suggestions.
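To see how widespread the gap is, rather than checking sites one at a time in IGV, the DP and DP4 values can be compared directly from the VCF text. The following is a minimal sketch, not part of BisSNP: the function names and the 1.5 ratio cutoff are hypothetical, and it assumes DP and DP4 may sit either in the INFO column or in the first sample's FORMAT fields, since the thread does not say which.

```python
def parse_depths(vcf_line):
    """Return (DP, sum of DP4) for one VCF data line, or None if either is absent."""
    cols = vcf_line.rstrip("\n").split("\t")
    # INFO column (index 7): semicolon-separated key=value pairs.
    info = dict(item.split("=", 1) for item in cols[7].split(";") if "=" in item)
    # FORMAT column (index 8) paired with the first sample column (index 9), if present.
    sample = {}
    if len(cols) > 9:
        sample = dict(zip(cols[8].split(":"), cols[9].split(":")))

    def lookup(key):
        # Assumption: prefer INFO, fall back to the first sample's fields.
        return info.get(key, sample.get(key))

    dp, dp4 = lookup("DP"), lookup("DP4")
    if dp in (None, ".") or dp4 in (None, "."):
        return None
    return int(dp), sum(int(x) for x in dp4.split(","))


def flag_discrepancies(lines, min_ratio=1.5):
    """Yield (chrom, pos, DP, DP4 sum) for sites where DP >= min_ratio * DP4 sum."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        depths = parse_depths(line)
        if depths is None:
            continue
        dp, dp4_sum = depths
        if dp4_sum == 0 or dp / dp4_sum >= min_ratio:
            cols = line.split("\t")
            yield cols[0], int(cols[1]), dp, dp4_sum
```

Passing an open VCF file handle to `flag_discrepancies` lists the sites where a large share of the reads counted in DP did not make it into DP4, which are the sites worth inspecting first.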



Sep 24, 2014, 4:55:02 PM
Hi Annie,
Could you let me know which reads you think are good but are missing from the methylation/genotype call? Please show me the read information from the BAM file and the IGV browser. Many read filter rules are applied during genotype calling, and the IGV browser lets you visualize only some of them.




You received this message because you are subscribed to the Google Groups "bissnp-help" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
To post to this group, send email to
Visit this group at
For more options, visit

Yaping Liu Ph.D.

Postdoctoral Associate
Manolis Kellis Lab
Computer Science and Artificial Intelligence Lab (CSAIL)
Massachusetts Institute of Technology
Broad Institute of MIT and Harvard

Sep 25, 2014, 8:25:59 AM
Upon further review of the data, I do think the reads are being filtered out primarily because the default mmq (mapping quality) threshold is set at 40. Obviously I can't change the quality of my data; I have to work with what I have. I've attached 2 SAM files with the reads that overlap the particular site I was looking at. I can see that the mapping quality scores are very low for a good majority of the reads. Can you give me any advice on how far I might lower that threshold and still be fairly accurate? Maybe 40 is best, but would 20 be decent? I am not trying to identify novel SNPs; I am only interested in differences in genotypes at known SNP sites. I would rather more sites be included that weren't ideal than think I have different genotypes where I really don't.
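One way to judge a candidate threshold is to count how many of the reads overlapping the site would pass it. The sketch below reads SAM text and tallies reads by MAPQ (column 5 of the SAM format). The function name is hypothetical, and it assumes a read passes when its MAPQ is greater than or equal to the threshold; whether BisSNP uses that exact comparison is not confirmed in this thread.

```python
def mapq_survival(sam_lines, thresholds=(40, 20)):
    """Count alignment records whose MAPQ (SAM column 5) meets each threshold."""
    total = 0
    kept = {t: 0 for t in thresholds}
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip SAM header and blank lines
        fields = line.rstrip("\n").split("\t")
        mapq = int(fields[4])  # MAPQ is the 5th mandatory SAM field
        total += 1
        for t in thresholds:
            # Assumption: a read is kept when MAPQ >= threshold.
            if mapq >= t:
                kept[t] += 1
    return total, kept
```

For example, `total, kept = mapq_survival(open("site_reads.sam"))` (file name hypothetical) returns the number of records and, per threshold, how many would be retained, so the effect of moving from 40 to 20 can be seen before rerunning the genotyper.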



Sep 25, 2014, 9:55:38 AM
Hi Annie,
You could try this option to adjust the mapping quality threshold: -mmq 20
and this option could help with the base quality score threshold: -mbq 5

