Hi Julian (and everyone),
I’m confused about how to report the nucleotide diversity and heterozygosity values generated by the populations module.
I’ve run the Stacks pipeline, exported a VCF file using populations (parameters: -p 3 -r 0.2 --write-single-snp), and then filtered the VCF SNP data set with vcftools (for coverage, missing data, minor allele frequency, etc.).
I would like to report the nucleotide diversity and heterozygosity values derived from populations (variant and fixed positions). However, since I filtered the data with vcftools AFTER populations had generated these diversity indices, are the values still meaningful?
Any thoughts or guidance you (or anyone) may have on this topic would be helpful.
Many thanks,
Andrea
Hi Andrea,
If you filter with vcftools after exporting from populations, you will change all of those values, of course: the statistics populations reported were calculated on the unfiltered data, so they no longer describe the filtered SNP set. You can instead do your filtering directly in populations. It can filter on missing data and minor allele frequency (MAF), and indirectly on coverage (by adjusting the required p-value of genotype calls).
julian
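
To illustrate why the numbers shift after filtering, here is a minimal Python sketch that recomputes per-site observed heterozygosity, expected heterozygosity, and nucleotide diversity directly from the GT fields of a VCF, so the same calculation can be run on the file before and after vcftools. It is not how populations itself computes these statistics: the function names (site_stats, vcf_stats) are hypothetical, the estimator is the common pi = n/(n-1) * (1 - sum of squared allele frequencies), diploid genotypes are assumed, and only variant sites present in the VCF are averaged (fixed positions are not in the file).

```python
def site_stats(genotypes):
    """Diversity statistics for one site.

    genotypes: list of diploid GT strings such as '0/1', '1|1' or './.'.
    Returns None when fewer than one genotype is called.
    """
    alleles = []
    n_called = 0
    n_het = 0
    for gt in genotypes:
        parts = gt.replace("|", "/").split("/")
        if len(parts) != 2 or "." in parts:
            continue  # skip missing or non-diploid calls
        n_called += 1
        n_het += parts[0] != parts[1]
        alleles.extend(parts)
    n = len(alleles)  # number of sampled allele copies
    if n < 2:
        return None
    freqs = [alleles.count(a) / n for a in set(alleles)]
    exp_het = 1.0 - sum(f * f for f in freqs)
    # Assumed estimator: sample-size corrected expected heterozygosity.
    pi = exp_het * n / (n - 1)
    return {"obs_het": n_het / n_called, "exp_het": exp_het, "pi": pi}


def vcf_stats(lines):
    """Average site_stats over all data lines of a VCF (iterable of str)."""
    sites = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        gt_idx = fields[8].split(":").index("GT")
        gts = [sample.split(":")[gt_idx] for sample in fields[9:]]
        stats = site_stats(gts)
        if stats is not None:
            sites.append(stats)
    if not sites:
        return None
    return {key: sum(s[key] for s in sites) / len(sites) for key in sites[0]}
```

Calling vcf_stats(open(path)) on the unfiltered and the filtered VCF and comparing the two results shows directly how much the filtering step changed the averages.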