335523 loci, composed of 31138856 sites; 162739 variant sites with no -r flag, or
2557 loci, composed of 241607 sites; 976 of those sites were filtered, 8640 variant sites (with -r 0.8).
I then tried gstacks on a subset of 12 samples, after aligning with bwa (default parameters). The gstacks log files show that ~30% of reads are being discarded because of excessive soft-clipping. I guess this is because the reference genome is of poor quality, and there is not much I can do to improve that? I could make the aligner less tolerant of soft-clipping, but presumably that would just exclude the soft-clipped reads at a different stage?
With such poor retention of aligned reads (only 40-45% are retained overall in gstacks), is there any point in proceeding with the post hoc catalogue alignment, or should I stick to denovo_map or try to get a better genome?
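
To make the soft-clipping question concrete, here is a rough sketch of how the clipped fraction of each read could be computed from its CIGAR string (for example, from column 6 of `samtools view` output). The function names and the 0.20 cutoff are assumptions for illustration only; the cutoff gstacks actually applies should be checked in `gstacks --help`.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clipped_fraction(cigar):
    """Return the fraction of a read's bases that are soft-clipped.

    Only operations that consume read bases (M, I, S, =, X) count
    towards the read length; hard clips (H) are not in the read sequence.
    An unparseable CIGAR such as "*" yields 0.0.
    """
    clipped = 0
    read_len = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "MIS=X":
            read_len += n
        if op == "S":
            clipped += n
    return clipped / read_len if read_len else 0.0

def fraction_over_threshold(cigars, max_clipped=0.20):
    """Return the share of reads whose clipped fraction exceeds max_clipped.

    max_clipped=0.20 is an assumed, illustrative cutoff, not a confirmed
    gstacks default.
    """
    cigars = list(cigars)
    if not cigars:
        return 0.0
    over = sum(1 for c in cigars if soft_clipped_fraction(c) > max_clipped)
    return over / len(cigars)

if __name__ == "__main__":
    # Made-up CIGAR strings, purely to show the call.
    example = ["30S70M", "100M", "10S80M10S", "5S95M"]
    print(fraction_over_threshold(example))
```

Running this over the CIGAR strings of one sample before gstacks would show whether the clipping is spread evenly across reads or concentrated in a subset, which may help decide whether changing aligner settings is worth trying.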
best, Harriet
--
Stacks website: http://catchenlab.life.illinois.edu/stacks/