Hi Emma,
You are correct: the GQ field in populations.snps.vcf is derived from the p-value of the likelihood ratio test on the values given in the GL field, i.e. with --gt-alpha=0.01 all GQs will be at least 20.
--var-alpha impacts the calling of polymorphisms (whether a particular site is variable rather than fixed/invariant across all individuals). Note that for the marukilow model no multiple-testing correction is needed, but for the marukihigh and snp models you should in principle apply one, or at least use a conservative threshold.
By AB do you mean allele frequencies? At the moment for simplicity we do not report marukilow's optimized allele frequencies, but you can just count the alleles (this will be skewed if your coverage is low, though).
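As a rough illustration of "just count the alleles", here is a minimal Python sketch that estimates the alternate allele frequency at one biallelic site from the called GT strings. The function name and the input list are hypothetical, and as noted above the estimate will be skewed at low coverage.

```python
def alt_allele_frequency(genotypes):
    """Estimate the ALT allele frequency at a biallelic site by counting
    alleles in called genotypes (GT strings such as '0/1' or '1|1').

    Missing alleles ('.') are skipped. Returns None if nothing was called.
    """
    ref_count = 0
    alt_count = 0
    for gt in genotypes:
        # Treat phased ('|') and unphased ('/') genotypes the same way.
        for allele in gt.replace("|", "/").split("/"):
            if allele == "0":
                ref_count += 1
            elif allele == "1":
                alt_count += 1
            # Anything else (e.g. '.') is ignored in this sketch.
    total = ref_count + alt_count
    return alt_count / total if total else None


# Hypothetical genotypes for four samples at one site.
print(alt_allele_frequency(["0/0", "0/1", "1/1", "./."]))
```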
Best,
Nicolas
Hi,
I am in the process of filtering my reference aligned ddRAD data (stacks v. 2.0Beta9) and wanted to ask about the stacks vcf output and quality score estimation.
Can you please confirm how the genotype quality that is exported as part of the vcf file is calculated? The gt-alpha parameter sets a minimum threshold for the likelihood-ratio test statistic, but if a p-value is calculated for each genotype, is it used to estimate the quality score, e.g., if the alpha value is 0.01, would the minimum quality score be equivalent to 20? Or are these just not comparable? Also, can you please confirm that the genotype quality score in the populations.snps.vcf file is for the genotype, not the individual SNP? Or is this only calculated for SNPs if you change the default genotyping model from marukilow to snp?
Additionally, information like allele balance (AB) per SNP is not exported in the vcf file, but presumably would be available? However, this information is integrated into the genotype calling algorithm of Maruki & Lynch, so I guess that's why it's not exported?
Thanks for building and maintaining such a great pipeline!
Emma
Stacks website: http://catchenlab.life.illinois.edu/stacks/
---
You received this message because you are subscribed to the Google Groups "Stacks" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stacks-users...@googlegroups.com.
Visit this group at https://groups.google.com/group/stacks-users.
For more options, visit https://groups.google.com/d/optout.
Hi Carol,
Yes, a gt-alpha of 0.01 corresponds to a GQ of ≥20. And the
p-value that is compared to the alpha is derived from a likelihood
ratio test on the likelihoods of the two most likely genotypes. In
the VCF standard, GQ is defined somewhat loosely because no particular
statistical framework is assumed, but in our case it is
-10 × log10(likelihood ratio test p-value)
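To make the alpha-to-GQ correspondence concrete, here is a minimal Python sketch of the conversion. It assumes Phred scaling (a factor of 10 in front of the log10), which is what the stated match between an alpha of 0.01 and a GQ of 20 implies. The function names are illustrative and are not part of Stacks.

```python
import math


def phred_from_pvalue(p_value):
    """Phred-scale a p-value: -10 * log10(p)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -10.0 * math.log10(p_value)


def pvalue_from_phred(gq):
    """Inverse conversion: the p-value implied by a Phred-scaled score."""
    return 10.0 ** (-gq / 10.0)


# An alpha of 0.01 maps to a quality of about 20; stricter alphas map higher.
for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(phred_from_pvalue(alpha), 2))
```

Under this scaling, requiring p < alpha is the same as requiring GQ > -10 × log10(alpha), so lowering gt-alpha raises the minimum GQ of the genotypes that are retained.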
Best,
Nicolas
To view this discussion on the web visit https://groups.google.com/d/msgid/stacks-users/1af12800-88cb-4353-95f4-8280c1668a46%40googlegroups.com.
For per-allele coverage, the AD (allele depth) field is available
and more informative than the lone allele balance figure.
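If a single allele balance number is still wanted, it can be derived from AD. The Python sketch below assumes a biallelic site whose AD value is a comma-separated "ref,alt" pair of read depths, and it defines allele balance as the ALT fraction of reads. Both the helper name and that definition are assumptions, since conventions for AB differ between tools.

```python
def allele_balance(ad_field):
    """Return the fraction of reads supporting the ALT allele, given a
    per-sample AD value such as '12,8' (ref depth, alt depth).

    Returns None when the field is missing or no reads were observed.
    """
    parts = ad_field.split(",")
    if len(parts) != 2 or "." in parts:
        return None
    ref_depth, alt_depth = int(parts[0]), int(parts[1])
    total = ref_depth + alt_depth
    return alt_depth / total if total else None


# Hypothetical AD values for three samples.
for ad in ("12,8", "0,0", "."):
    print(ad, allele_balance(ad))
```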
Regarding your second point, you are correct.
Regards,
Nicolas