Interpreting the Log.final.out file - % of reads unmapped: too short?


Alex Chitsazan

Sep 15, 2016, 7:53:41 PM
to rna-...@googlegroups.com
Hey Alex,

We are having trouble with a sample we just sequenced. This is a pilot study to check that our protocol works on a new organism, so sample prep is most likely the culprit. However, I'm having trouble understanding our low alignment rate (~50%). In our Log.final.out, the "% of reads unmapped: too short" value was about 48%, and we thought it was because our reads were short (maybe primer dimers). However, a histogram of read lengths showed that wasn't the case. Could you explain the Log.final.out in a bit more detail? Specifically, why is the "too short" percentage so high? I have attached the Log.final.out. For experimental background, the reads are Illumina NextSeq paired-end 75 bp reads.

Thank you very much,

Alex
Log.final.out
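
For readers who want to pull the "too short" figure out of Log.final.out programmatically, here is a minimal Python sketch. It assumes the usual layout of that file, where each statistic is a "name | value" row; the helper names are hypothetical, and the sample text is an illustrative excerpt built from the ~48% quoted above, not the attached file.

```python
def parse_star_final_log(text):
    """Map each 'name | value' row of a Log.final.out to its value string."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers such as "UNIQUE READS:" have no separator
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def percent(stats, key):
    """Convert a percentage entry such as '48.00%' to a float."""
    return float(stats[key].rstrip("%"))


# Illustrative excerpt only (assumed layout, value taken from the post above).
sample = (
    "                                    UNIQUE READS:\n"
    "                 % of reads unmapped: too short |\t48.00%\n"
)

stats = parse_star_final_log(sample)
print(percent(stats, "% of reads unmapped: too short"))
```

To use it on a real run, read the file contents with open(...).read() and pass them to parse_star_final_log.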

Alexander Dobin

Sep 20, 2016, 5:51:41 PM
to rna-star
Hi Alex,

"too short" means that the best alignments STAR found were too short to pass the filters.
This is controlled by --outFilterScoreMinOverLread and --outFilterMatchNminOverLread, which are both set to 0.66 by default. This means that ~2/3 of the total read length (sum of mates) should be mapped.
You can try to reduce these parameters to see how many more reads will be mapped.
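
To make the 0.66 threshold concrete for 2x75 bp reads, here is a rough Python sketch of the arithmetic. It is a simplification: STAR compares the alignment score and the number of matched bases against the threshold, while this sketch treats both as a single "mapped bases" number and ignores any internal rounding. The function names are hypothetical.

```python
def min_mapped_bases(mate_lengths, min_over_lread=0.66):
    """Approximate number of bases that must map: fraction * sum of mate lengths."""
    return min_over_lread * sum(mate_lengths)


def passes_length_filter(mapped_bases, mate_lengths, min_over_lread=0.66):
    """Simplified check: does the best alignment cover enough of the read pair?"""
    return mapped_bases >= min_mapped_bases(mate_lengths, min_over_lread)


mates = (75, 75)  # paired-end 75 bp reads, total length 150

# With the default 0.66, roughly 99 of the 150 bases have to be mapped.
print(min_mapped_bases(mates))

# A pair where only one 75 bp mate aligns covers 75/150 = 0.5 of the total,
# so under this simplified model it is reported as "too short".
print(passes_length_filter(75, mates))

# The same alignment passes once the threshold is lowered to 0.3.
print(passes_length_filter(75, mates, min_over_lread=0.3))
```

Under this model, a pair in which only one mate aligns well never reaches the default threshold, which is one way a high "too short" fraction can arise even when the reads themselves are full length.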

One possibility is that you have short inserts, but it looks like you have trimmed the reads before mapping?

Also, you can try mapping reads 1 and 2 separately to see if one of the mates is of poorer quality.
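
The two experiments suggested above (relaxing the filters, and mapping each mate on its own) could be set up along these lines. This Python sketch only builds the STAR command lines and does not run them. The genome directory, FASTQ names, output prefixes, thread count, and the 0.3 value are all placeholders chosen for illustration, not recommendations from this thread.

```python
import shlex


def star_command(genome_dir, read_files, out_prefix, min_over_lread=None, threads=4):
    """Build a STAR argument list; read_files holds one file (single-end) or two (paired)."""
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", *read_files,
        "--outFileNamePrefix", out_prefix,
    ]
    if min_over_lread is not None:
        # Lower both filters together, as suggested in the reply above.
        cmd += [
            "--outFilterScoreMinOverLread", str(min_over_lread),
            "--outFilterMatchNminOverLread", str(min_over_lread),
        ]
    return cmd


genome = "star_index/"                  # hypothetical path
r1, r2 = "sample_R1.fastq", "sample_R2.fastq"  # hypothetical file names

runs = {
    "paired_relaxed": star_command(genome, [r1, r2], "paired_relaxed_", min_over_lread=0.3),
    "mate1_only": star_command(genome, [r1], "mate1_"),
    "mate2_only": star_command(genome, [r2], "mate2_"),
}

for name, cmd in runs.items():
    print(name, "->", shlex.join(cmd))
```

Comparing the "too short" percentage in the Log.final.out of each run should show whether one mate is dragging the pair below the threshold.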

Cheers
Alex