question about using different Illumina sequencers


h.h.hemm...@gmail.com

Dec 30, 2022, 10:23:53 AM
to Stacks
Hi All, 

Would it be a problem to use a data set in which the samples were all extracted and digested the same way, but were sequenced on different models of Illumina sequencer, leading to two different read lengths?

I ran process_radtags, aligned to a high quality reference genome, and ran gstacks. Unfortunately I'm getting spurious results regardless of how I filter the data, even though a subset of the same data worked well previously.

I'm just wondering if there is something obvious I'm missing here. I didn't truncate the reads during process_radtags because I thought it would be better to align them first. 

Thanks,
Heather

Catchen, Julian

Jan 2, 2023, 11:04:16 AM
to stacks...@googlegroups.com

Hi Heather,

 

You did not specify what “spurious results” means with respect to your data. However, if you are running the pipeline with two different read lengths, for different sets of samples, you will create a set of SNP calls that are only present in the samples with the longer read length and which appear as “missing data” in the samples with the shorter read lengths. This will affect how filtering works, particularly if read length differences are correlated with populations in your popmap, though without knowing what you are trying to do it is hard to be more detailed.
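
To illustrate the read-length effect, here is a minimal Python sketch. It is not part of Stacks; the function name, sample names, read lengths, and SNP positions are all hypothetical. It assumes single-end reads that all start at the cut site, so a SNP at a given offset can only be called in samples whose reads reach that offset.

```python
def missing_by_read_length(read_lengths, snp_columns):
    """Map each SNP offset to the samples whose reads are too short to cover it.

    read_lengths: {sample_name: read length in bp} (hypothetical values)
    snp_columns:  0-based offsets of SNPs from the start of the read
    """
    missing = {}
    for col in snp_columns:
        # A read of length L covers offsets 0 .. L-1 only.
        missing[col] = sorted(
            sample for sample, length in read_lengths.items() if col >= length
        )
    return missing


# Hypothetical data set: two samples sequenced at 100 bp, two at 150 bp.
read_lengths = {"s1": 100, "s2": 100, "s3": 150, "s4": 150}
snp_columns = [12, 87, 120, 143]

result = missing_by_read_length(read_lengths, snp_columns)
for col, samples in result.items():
    print(f"SNP at offset {col}: missing in {samples}")
```

In this toy case the SNPs at offsets 120 and 143 are missing in every 100 bp sample and present in every 150 bp sample. If the short-read samples all belong to one population in the popmap, any filter based on the proportion of samples or populations genotyped will treat those SNPs differently from the rest. Truncating all reads to the shorter length before building loci is one way to avoid this, at the cost of discarding the extra sequence.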

 

Best,

 

julian
