reads of different length

Steve

Mar 11, 2013, 5:27:31 AM
to stacks...@googlegroups.com
Hi,

I am using ustacks to try to create a catalogue of loci from a population with 4 haplotypes. I am turning the deleveraging algorithm off and setting --max_locus_stacks to 4, with -M 3 and -m 3. The data set consists of ~80 million reads for this sample, generated after restriction with ApeKI. There is no random shearing or size selection during library prep, so while a good proportion of fragments will be longer than the read length, some will be shorter (but no less than 40 bp). I would like to make use of these, so I have padded the reads that are shorter than the read length with A's. I have been running ustacks for a few days on 20 CPUs, but I don't remember it taking this long before. Ustacks splits each read into kmers to find overlap between primary stacks, so is it likely that the A tails on some of my reads are causing problems?

Thanks for any help!
   

Julian Catchen

Mar 14, 2013, 12:29:13 AM
to stacks...@googlegroups.com, sbyr...@gmail.com
Hi,

Your reads may slow down the matching algorithm depending on the number of
mismatches allowed, as the As may create a lot of kmer matches between all reads
with runs of As, causing lots of extra alignments.
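
To illustrate the kmer point, here is a minimal sketch. It is not the ustacks
implementation; the kmer length of 5, the two example reads and the function
names are all assumptions chosen for illustration. It shows that two reads with
no kmers in common start sharing a kmer as soon as both are padded with As.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(a, b, k):
    """Count the distinct kmers that occur in both reads."""
    return len(kmers(a, k) & kmers(b, k))


def pad_with_a(seq, length):
    """Pad a short read with As up to the given length (hypothetical helper)."""
    return seq + "A" * (length - len(seq))


# Two made-up short reads that share no 5-mers.
read1 = "CGTCGGTCTGCC"
read2 = "GTGCTCCGTTGG"

print(shared_kmers(read1, read2, 5))                                   # 0
print(shared_kmers(pad_with_a(read1, 20), pad_with_a(read2, 20), 5))   # 1
```

The single shared kmer after padding is the all-A kmer, so every pair of padded
reads would become a candidate for alignment even when the real sequences are
unrelated.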

I don't imagine that reads with As would effectively match against reads without
As, as there will be too many mismatches on the tail of the stack. However, if
you manage to get enough reads with similar runs of As, they may match up, but
then I expect you will get a lot of spurious SNP calls along the boundary where
the As have been added. This extra SNP calling may also slow things down.

In principle, it should complete; however, I would be interested to know what
the output looks like.

You may be better off plotting the trimmed lengths, then choosing a length and
trimming all reads to that length (it depends on how many of your reads were
shorter than the read length).
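
A rough sketch of that approach, assuming the unpadded reads are already loaded
as plain strings (FASTQ parsing and quality strings are left out, and the
function names are hypothetical): tabulate the read lengths, check how many
reads survive a candidate cutoff, then truncate everything to that cutoff.

```python
from collections import Counter


def length_histogram(reads):
    """Map each read length to the number of reads with that length."""
    return Counter(len(r) for r in reads)


def reads_retained(reads, length):
    """Count the reads that are at least `length` bases long."""
    return sum(1 for r in reads if len(r) >= length)


def trim_to_length(reads, length):
    """Drop reads shorter than `length` and truncate the rest to it."""
    return [r[:length] for r in reads if len(r) >= length]


# Made-up reads, before any A padding.
reads = ["ACGTACGT", "ACGTAC", "ACGT", "ACGTACGTAC"]

for length, count in sorted(length_histogram(reads).items()):
    print(length, count)

cutoff = 6  # chosen after looking at the length distribution
print(reads_retained(reads, cutoff), "of", len(reads), "reads kept")
print(trim_to_length(reads, cutoff))
```

Reads shorter than the cutoff are discarded rather than padded, so the choice of
cutoff is a trade-off between read length and the number of reads kept.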

Best,

julian