Increase in unmapped short reads with alignEndsType parameter


Rosannah Cameron

Oct 13, 2016, 3:10:28 PM
to rna-star
Hi there,

I have been using the STAR aligner as part of the rMATS analysis pipeline, and I noticed a higher percentage of reads in the "unmapped: too short" category when I run rMATS than when I run STAR alone: around 26% of my reads with rMATS compared to 14% with STAR alone. The difference seems to be related to the alignEndsType parameter. I had it set to Local when I ran STAR on its own, whereas rMATS sets it to EndToEnd. rMATS requires EndToEnd to keep the read length consistent, so I can't change this. Do you have any suggestions for reducing the number of reads unmapped as too short while using EndToEnd? I've tried clipping the FASTQ files, but that doesn't seem to make a difference. Any help would be greatly appreciated.

Cheers
Rosannah

Alexander Dobin

Oct 18, 2016, 4:19:50 PM
to rna-star
Hi Rosannah,

First, I would try to identify the reason the reads are soft-clipped with the Local alignment option. You can look at a few of the soft-clipped reads and try to see a pattern. The most common possibilities are:
1. Adapter sequence at the ends of the reads. This means that the reads truly have varying lengths, and I cannot think of any way to mitigate it.
2. Poor-quality tails. This could, in principle, be mitigated by increasing the number of allowed mismatches with --outFilterMismatchNmax (=10 by default).
3. Short splice overhangs. Annotations and/or 2-pass mapping should reduce those.
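
To look for a pattern among the soft-clipped reads from the Local run, something like the following Python sketch could help. It is not part of STAR; the function names `clipped_ends` and `summarize_clips` and the `min_len` and `prefix_len` thresholds are made up for illustration. It assumes uncompressed SAM text with the CIGAR in column 6 and the read sequence in column 10, and it tallies the sequences that were soft-clipped at either end.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_ends(sam_line):
    """Return (left_clip, right_clip): bases soft-clipped at each end of SEQ.

    SEQ is stored in reference orientation, so for reverse-strand reads the
    3' end of the original read shows up on the left, reverse-complemented.
    """
    fields = sam_line.rstrip("\n").split("\t")
    cigar, seq = fields[5], fields[9]
    # Hard-clipped bases (H) are not present in SEQ, so ignore those operations.
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:  # unmapped record, CIGAR is "*"
        return "", ""
    left = seq[:ops[0][0]] if ops[0][1] == "S" else ""
    right = seq[-ops[-1][0]:] if len(ops) > 1 and ops[-1][1] == "S" else ""
    return left, right


def summarize_clips(sam_lines, min_len=5, prefix_len=12):
    """Count the bases adjacent to the aligned part of each soft clip."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        left, right = clipped_ends(line)
        if len(left) >= min_len:
            counts[("left", left[-prefix_len:])] += 1
        if len(right) >= min_len:
            counts[("right", right[:prefix_len])] += 1
    return counts
```

Calling `summarize_clips(open("Aligned.out.sam")).most_common(20)` would list the most frequent clipped sequences. One sequence dominating would point towards adapters (case 1), while many unrelated sequences would be more consistent with poor-quality tails (case 2).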

Please post or send the Log.final.out files from both the EndToEnd and Local runs; there might be some hints in the summary statistics.
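
For comparing the two Log.final.out files side by side, a rough Python sketch is below. It assumes the usual layout of that file, with one `name | value` pair per line, and the statistic names in `KEYS` are the ones I would expect to find there; adjust them if your STAR version labels them differently. The function names are made up for illustration.

```python
KEYS = (
    "Uniquely mapped reads %",
    "Average mapped length",
    "Mismatch rate per base, %",
    "% of reads unmapped: too short",
)


def parse_log_final(text):
    """Parse Log.final.out text into a {statistic name: value string} dict."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:  # section titles and blank lines carry no value
            continue
        key, _, value = line.partition("|")
        key, value = key.strip(), value.strip()
        if value:
            stats[key] = value
    return stats


def compare_runs(local_text, e2e_text, keys=KEYS):
    """Return {name: (local, end_to_end, difference)} for the chosen statistics."""
    local = parse_log_final(local_text)
    e2e = parse_log_final(e2e_text)
    rows = {}
    for key in keys:
        if key in local and key in e2e:
            a = float(local[key].rstrip("%"))
            b = float(e2e[key].rstrip("%"))
            rows[key] = (a, b, b - a)
    return rows
```

To use it, read each file with `open(path).read()` and pass the two strings to `compare_runs`. If the mismatch rate is noticeably higher in the EndToEnd run, or the average mapped length is well below the read length in the Local run, that would help narrow down which of the three cases applies.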

Cheers
Alex