Your thoughts on different alignment tools?

Harriet Hunt

Apr 16, 2018, 12:07:30 PM
to Stacks
Dear Julian,

I'm currently running a GbS data set (single digest, single-end sequencing) for a species with a pretty poor-quality reference genome through Stacks. When I say poor quality, I mean currently there are over 300,000 scaffolds, many very short.

When I align with bowtie2 I get ~30% of reads aligning with a single best match for each sample. When I use bwa I get ~85% of reads aligning, but a lot of these have extensive soft-clipping - I guess maybe related to the short scaffold length. I know it's not possible to suppress terminal alignments with bwa.

I notice in some of your recent Stacks courses you have switched between using bwa/bowtie2/gsnap as the aligner - I was wondering if this is actually a change of opinion on your part on what the most suitable alignment tool is, or whether you just did this for the sake of variety!?

As I understand your recent paper, there is a strong argument for using denovo_map.pl as well as or instead of ref_map.pl, even when a reference genome is available. So maybe I will try this too.

all best, Harriet

Julian Catchen

Apr 19, 2018, 4:18:02 PM
to 'Harriet Hunt' via Stacks
Hi Harriet,

Yes, we recently switched internally from Gsnap to BWA. I have
never really used Bowtie, though not for technical reasons; it seems to
work well. We did see an improvement in our results when we made the
switch; however, we are working with a pretty good reference. With your
reference, the aligner may not be what makes the difference.

My advice would be to build your data de novo with Stacks 2, and then
try to place the resulting data back on your reference genome after the
fact. We include a script in Stacks 2 to accomplish this goal.

Best,

julian

Harriet Hunt

Apr 30, 2018, 9:25:27 AM
to Stacks
Hi Julian,

Thanks - is this the script stacks-integrate-alignments? I can't find any direct reference to it in the manual or in your recent paper with Nicolas Rochette, but typing 'more stacks-integrate-alignments' suggests this may be the tool. I have output files from denovo_map.pl now so if you can confirm, I'll give it a go.

best, Harriet

Julian Catchen

Apr 30, 2018, 4:33:13 PM
to 'Harriet Hunt' via Stacks
Hi Harriet, yes, that is the correct program. We will hopefully be
improving the documentation soon.
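
For anyone following the thread, here is a minimal sketch of the two steps discussed above (align the de novo catalog to the reference, then hand the alignments to stacks-integrate-alignments). It only builds the command strings and runs nothing. The catalog file name (catalog.fa.gz), the output BAM name, and the -P/-B/-O flags are assumptions that should be checked against `stacks-integrate-alignments -h` for your Stacks version; the function name is hypothetical.

```python
import shlex
from pathlib import PurePosixPath


def integration_commands(bwa_index, stacks_dir, out_dir, threads=4):
    """Build (but do not run) the shell commands for a post hoc catalog alignment.

    Assumptions, not confirmed in this thread: the catalog consensus file is
    named catalog.fa.gz inside stacks_dir, and stacks-integrate-alignments
    accepts -P (Stacks directory), -B (BAM of aligned catalog loci) and
    -O (output directory).
    """
    stacks = PurePosixPath(stacks_dir)
    out = PurePosixPath(out_dir)
    catalog = stacks / "catalog.fa.gz"      # assumed catalog file name
    bam = out / "catalog_aligned.bam"       # hypothetical output name
    q = shlex.quote                         # guard against spaces in paths

    # Step 1: align catalog loci with default bwa mem settings, sort to BAM.
    align = (
        f"bwa mem -t {int(threads)} {q(str(bwa_index))} {q(str(catalog))} "
        f"| samtools sort -o {q(str(bam))} -"
    )
    # Step 2: place the de novo loci on the reference coordinates.
    integrate = (
        f"stacks-integrate-alignments -P {q(str(stacks))} "
        f"-B {q(str(bam))} -O {q(str(out))}"
    )
    return [align, integrate]


if __name__ == "__main__":
    for cmd in integration_commands("genome", "stacks", "integrated"):
        print(cmd)
```

The output directory has to exist before the first command is run, since the sorted BAM is written into it.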

ido bar

Jun 7, 2018, 10:41:59 PM
to Stacks
Hi Julian,

Any particular recommendations for BWA parameters optimized for the task of aligning the catalog.fa file against the reference genome?

Thanks, Ido

Harriet Hunt

Jun 8, 2018, 9:51:17 AM
to stacks...@googlegroups.com
This doesn't directly answer Ido's question, but to follow up on my original question: I have tried the denovo_map.pl approach (on a full set of 160 samples, after optimising with M = n on a test sample set), which gives me:

335523 loci, composed of 31138856 sites; 162739 variant sites with no -r flag, or

2557 loci, composed of 241607 sites; 976 of those sites were filtered, 8640 variant sites (with -r 0.8).


I then tried gstacks on a subset of 12 samples, following alignment with bwa (default parameters). The gstacks logfiles show ~30% of reads are being discarded because of excessive soft-clipping. I guess this is because the quality of the reference genome is poor and there is not much I can do to improve it? I could reduce the tolerance for soft-clipping in the alignment program, but presumably this just excludes the soft-clipped reads at a different stage? 
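
One way to see how soft-clipping is distributed before gstacks discards anything is to tally it straight from the SAM records. The sketch below is only illustrative: it reads CIGAR strings and counts reads whose soft-clipped fraction exceeds a cutoff. The function names are hypothetical, and the 0.20 cutoff is an assumed value, so check `gstacks -h` for the threshold your version actually applies.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_fraction(cigar):
    """Fraction of the read's bases that are soft-clipped (S operations)."""
    ops = CIGAR_OP.findall(cigar)
    # M, I, S, = and X are the operations that consume read bases.
    read_len = sum(int(n) for n, op in ops if op in "MIS=X")
    clipped = sum(int(n) for n, op in ops if op == "S")
    return clipped / read_len if read_len else 0.0


def tally_clipping(sam_lines, max_clipped=0.20):
    """Count reads by fate; max_clipped is an assumed, adjustable cutoff."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):            # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, cigar = int(fields[1]), fields[5]
        if flag & 0x4 or cigar == "*":      # read is unmapped
            counts["unmapped"] += 1
        elif flag & 0x900:                  # secondary or supplementary
            counts["secondary_or_supplementary"] += 1
        elif soft_clipped_fraction(cigar) > max_clipped:
            counts["over_threshold"] += 1
        else:
            counts["kept"] += 1
    return counts
```

For example, `tally_clipping(open("sample.sam"))` on the text output of `samtools view -h` gives per-sample counts, which can be compared between short and long scaffolds to test whether scaffold length is the cause.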


With such poor retention of aligned reads (I get 40-45% overall retained in gstacks) is there any point proceeding with the post hoc catalogue alignment, or should I stick to denovo_map/try getting a better genome?


best, Harriet