Hello Stacks users!
I have received paired-end, single-digest GBS data that were digested with ApeKI and sequenced on an Illumina NovaSeq (150 bp reads). I am using these data for a population genetics study. After QC, which included removing low-quality samples and retaining only reads longer than 100 bp, I have 160 samples. Following parameter optimization, I ran the denovo_map.pl wrapper in Stacks v2.64 with m = 3, M = 2 and n = 2. The run completed successfully and produced the gstacks output pasted below.
After completing the run, I learned that using paired-end reads from single-digest GBS data is not advised in Stacks, because it will not be able to assemble the paired-end loci correctly. In the gstacks output, however, I see that Stacks assembled a paired-end contig for the vast majority of my loci (5,402,145 of 5,410,759, about 99.8%). How is this possible, given that my GBS data are not double digested? I have looked around and could not find any information on this, so I would be grateful for any insight into what is going on.
All the best,
Rob

gstacks output:
0 loci had no or almost no paired-end reads
8614 loci had paired end reads that couldn't be assembled into a paired-end contig
For the remaining 5402145 loci, a paired-end contig was assembled
Average contig size was 175.7 bp
4869151 paired-end contigs overlapped the forward region
Out of 176750520 paired-end reads in these loci (mean 32.4 reads per locus), 174773434 were successfully aligned
Mean insert length was 161.8, stdev: 48.2
Genotyped 5402104 loci
effective per-sample coverage: mean=8.5x, stdev=1.4x, min=5.9x, max=12.8x
mean number of sites per locus: 174.5
a consistent phasing was found for 1005419 out of 1147664 (87.6%) diploid loci needing phasing
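
For convenience, the proportions implied by the log above can be worked out with a short Python sketch. The counts are copied from the gstacks output; the variable names and the `pct` helper are my own labels for illustration, not Stacks fields or tools.

```python
# Counts copied from the gstacks log above (variable names are illustrative).
loci_no_pairs = 0              # loci with no or almost no paired-end reads
loci_unassembled = 8614        # loci whose paired-end reads could not be assembled
loci_assembled = 5402145       # loci with an assembled paired-end contig
contigs_overlapping = 4869151  # contigs that overlapped the forward region
pe_reads_total = 176750520     # paired-end reads in the assembled loci
pe_reads_aligned = 174773434   # paired-end reads successfully aligned


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


total_loci = loci_no_pairs + loci_unassembled + loci_assembled
summary = {
    "loci_with_contig_pct": pct(loci_assembled, total_loci),
    "contigs_overlapping_forward_pct": pct(contigs_overlapping, loci_assembled),
    "pe_reads_aligned_pct": pct(pe_reads_aligned, pe_reads_total),
}

for name, value in summary.items():
    print(f"{name}: {value}%")
```

So roughly 99.8% of loci got a paired-end contig, about 90% of those contigs overlapped the forward region, and close to 99% of the paired-end reads aligned.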