alignment software and indels in Stacks

Michelle Sanford

Sep 13, 2012, 6:02:56 PM
to stacks...@googlegroups.com
Hi Everyone,

I am working my way through my first RAD project (48 individuals, 5 populations, single-end 100 bp, HiSeq) and I am struggling a bit with interpreting the results. I am trying a couple of different approaches to the analysis. Since we have a reference genome, I am running BWA-aligned SAM files through the ref_map.pl pipeline with the default settings. I suspect I am getting fewer markers than I expected, so we decided to try the denovo_map.pl pipeline, but this seems to give us an unusually high number of tags and SNPs. We suspect that there are a lot of indels in our genome, and this was the initial reason we went with BWA (to handle the gaps). However, most of the RAD literature that had a reference genome used Bowtie for alignments, which originally was not able to align across indels. From my reading of this group's archives, I understand that Stacks also doesn't handle indels. So I was wondering whether using Bowtie helps alleviate potential problems with Stacks?

With respect to the possibility of underestimating tags and SNPs, is it possible that a poor-quality alignment might be to blame? Or would this be more indicative of a problem with library preparation? The quality of the sequencing was pretty good (as observed with FastQC and quality filtering).

Thanks for your help and suggestions,
Michelle

Julian Catchen

Sep 13, 2012, 10:00:48 PM
to stacks...@googlegroups.com
Hi Michelle,

Stacks should have no problems with gapped alignments. We regularly use it
with GSNAP and have used BWA in the past. Make sure you have a current version
of the software.

The first things I would want to know about underestimating tags are: how many
raw reads did you have, how many passed quality filtering, and how many aligned
to the genome? If you had poor library preparation, many reads will likely be
filtered for quality and many will not align to the genome. Since you have a
reference, have you checked in silico to see how many RAD sites you expect to have?
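
As a rough illustration of those two checks, here is a minimal Python sketch. It is not part of Stacks, and the function names are made up for this example. It assumes a palindromic recognition site (the SbfI site CCTGCAGG is used only as a placeholder, since the enzyme is not stated in this thread), a plain-text FASTA reference, and an uncompressed SAM file. With a single-enzyme RAD library and single-end reads, each cut site can yield up to about two tags, one on each side of the cut.

```python
import re


def count_cut_sites(fasta_lines, site="CCTGCAGG"):
    """Count occurrences of a recognition site in FASTA-formatted lines.

    The default site is only a placeholder (SbfI); substitute your enzyme.
    For a palindromic site, scanning the forward strand is sufficient.
    """
    # Lookahead so that overlapping occurrences are also counted.
    pattern = re.compile("(?=" + re.escape(site.upper()) + ")")
    total = 0
    chunks = []

    def flush():
        # Count sites in the sequence collected so far, then reset.
        nonlocal total
        if chunks:
            total += len(pattern.findall("".join(chunks).upper()))
            chunks.clear()

    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            flush()  # a new record begins; finish the previous one
        elif line:
            chunks.append(line)
    flush()
    return total


def count_mapped_reads(sam_lines):
    """Return (mapped, total) primary read records from SAM-formatted lines."""
    mapped = total = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue
        flag = int(fields[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        total += 1
        if not flag & 0x4:  # 0x4 is set when the read is unmapped
            mapped += 1
    return mapped, total
```

To use it, pass open file handles, for example `count_cut_sites(open("genome.fa"))` and `count_mapped_reads(open("sample.sam"))` (both file names are hypothetical). Comparing roughly twice the site count against the number of loci ref_map.pl reports, and the mapped fraction against the number of reads that passed filtering, should indicate whether reads are being lost at library preparation, filtering, or alignment.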

Have you installed the web interface? This tool is very useful to get a look at
the alignments made and what your stacks look like.

Cheers,

julian