Dear all,
I am currently running the ref_map pipeline, and after processing my data through populations, I noticed an issue with the resulting VCF file. When inspecting the quality scores (QUAL column), I found that all sites have a value of -1. This result seems unusual, and I am not entirely sure what could be causing it. I would greatly appreciate your insight into what might be responsible for this and any recommendations you might have to resolve the issue.
Thank you very much in advance for your time and support!
Hi Emanuele,
It is not clear what Stacks output file you are referencing in your message. Stacks will output VCF files, of course, but they would have columns such as
CHROM POS ID REF ALT QUAL FILTER...
And the software does not populate the QUAL field (it would be a “.”). For further troubleshooting, we would need to know which version of Stacks you ran and which commands you executed. What sort of data are you processing?
Best,
Julian
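
One way to settle what populations actually wrote is to tabulate the raw QUAL column of the VCF before any other tool parses it. The following is a minimal Python sketch, not part of Stacks or vcftools; the function name count_qual_values is hypothetical, and it assumes a standard tab-delimited VCF with QUAL as the sixth column.

```python
import gzip
from collections import Counter


def count_qual_values(vcf_path):
    """Count each distinct raw string found in the QUAL column of a VCF file."""
    # Open gzip-compressed files transparently; treat anything else as plain text.
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    counts = Counter()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Skip meta-information lines (##...), the #CHROM header, and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            # VCF column order: CHROM POS ID REF ALT QUAL FILTER INFO ...
            if len(fields) > 5:
                counts[fields[5]] += 1
    return counts
```

Calling count_qual_values("populations.snps.vcf") and printing the result shows whether the file itself contains "." or a numeric value in every record. If only "." appears, any -1 seen later was introduced by the downstream tool.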
When I run the command
vcftools --vcf input.vcf --maf 0.05 --max-missing 0.8 --minQ 30 --recode --out filtered_combined
I end up with an "empty" VCF file as output (with the message: "After filtering, kept 0 out of a possible 11,532 sites"). The issue seems to be related to the use of the --minQ parameter. I believe this is because the quality scores (which I checked with vcftools --gzvcf populations.snps.vcf --site-quality --out $OUT) are set to -1 for all sites, as I mentioned in my previous email.
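
The reasoning above can be illustrated with a toy model. This Python sketch does not reproduce vcftools' code; it only assumes that a missing QUAL (".") is read as -1, which is consistent with the --site-quality output described in this message. All function names are hypothetical.

```python
def parse_qual(field):
    """Return QUAL as a float; a missing value (".") becomes -1.0 (assumed convention)."""
    return -1.0 if field == "." else float(field)


def passes_min_qual(field, min_q):
    """True if the site's QUAL meets the minimum-quality threshold."""
    return parse_qual(field) >= min_q


def kept_sites(qual_fields, min_q=None):
    """Count sites that survive an optional minimum-quality filter."""
    if min_q is None:
        # No quality threshold requested: every site is kept.
        return len(qual_fields)
    return sum(passes_min_qual(q, min_q) for q in qual_fields)
```

Under this assumption, a file whose QUAL column is "." everywhere keeps no sites with any positive threshold (kept_sites(["."] * 5, min_q=30) gives 0) and keeps all of them when the threshold is omitted, which matches the "kept 0 out of a possible 11,532 sites" message.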
Hi Emanuele,
I can’t help you with your execution of vcftools (that said, it does not make sense to run vcftools seven times, connected by UNIX pipes, while supplying the same input file, --gzvcf populations.snps.vcf, to each stage). If you can show me that the populations program output a VCF file with -1 in the QUAL field, I could look for software bugs. However, if vcftools is making the conversion, then you need to modify your use of that program.
Dear Angel,
Thank you very much for your explanation and your help! I was asking about this because I had seen in some papers that part of the filtering on the VCF was performed after populations, so the presence of the "." value in the QUAL column had surprised me. In any case, I will follow your advice regarding the alternatives to assess the quality of the VCF and filter it accordingly.
Many thanks for your help!
Best
Emanuele