Hi,
First of all, thanks for hosting this forum. I can't believe I missed it earlier when I was doing my analysis; finding it has already answered a bunch of my questions.
However, I have a fairly specific question about my project. A while back I used the STAR aligner with default parameters to align RNA-seq reads from a non-model organism to its closest relative with a fully sequenced genome. Initially this yielded a very low alignment rate (a total of 0.91% of reads mapped). I then increased the mismatch parameters, but that did not seem to have much impact on the results.
Anyway, after I found this forum, I saw that you can change --outFilterMatchNminOverLread and --outFilterScoreMinOverLread to increase mappability. I tried the values 0.10, 0.25 and 0.5, with 0.10 giving the highest number of mapped reads. What is the trade-off between setting a lower value for these parameters and getting a higher mapping rate? Obviously my average mapped read length has decreased. Is there also a way I can check which parameter setting gives me good-quality results?
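In case it helps to see what I'm comparing, this is roughly how I've been pulling the headline stats out of each run's final Log report (STAR's Log.final.out); the run directory names below are just placeholders for my three parameter settings:

```python
# Sketch: parse STAR Log.final.out files from several runs so different
# --outFilterMatchNminOverLread settings can be compared side by side.
# The run directory names are hypothetical placeholders.

def parse_star_log(path):
    """Return a dict mapping field name -> value from a STAR Log.final.out file."""
    stats = {}
    with open(path) as fh:
        for line in fh:
            if "|" not in line:
                continue  # section headers (e.g. "UNIQUE READS:") have no "|"
            key, _, value = line.partition("|")
            stats[key.strip()] = value.strip()
    return stats

if __name__ == "__main__":
    runs = ["run_0.10/Log.final.out", "run_0.25/Log.final.out", "run_0.50/Log.final.out"]
    for run in runs:
        s = parse_star_log(run)
        print(run,
              s.get("Uniquely mapped reads %"),
              s.get("% of reads mapped to multiple loci"),
              s.get("Average mapped length"))
```

Each Log.final.out line is "field name | value", so splitting on the first "|" recovers both parts.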
With the 0.10 setting, I got an overall unique mapping rate of 63.27% and a multi-mapping rate of 32.66%. Both the reference genome and my non-model organism are known to have high repeat content, so the multi-mapping rate doesn't surprise me, but is this a good mapping result for a non-model organism aligned to a distant relative?
I've attached the screenshots of both the final Log reports for reference.
Any help on this would be great :)
cheers,
Urwah