Slow mapping of total RNA samples using STAR

Kevin

Nov 25, 2019, 3:59:02 PM

to rna-...@googlegroups.com

Hi,

I have been using STAR to analyze my data. I recently did some total RNA sequencing and found that STAR takes an extremely long time to finish mapping. For an 8 MB fastq file, I ran it on a server with 8 cores and 250 GB of RAM, and it did not finish mapping within the job's 12-hour time limit. I used the primary assembly FASTA and GTF files from GENCODE. Based on my previous experience, the sample has a ~25% unique mapping rate and a ~60% multi-mapping rate (mostly from rRNA).

Could someone help me with this?

Thanks!

Kevin

Log.out

Log.progress.out

Alexander Dobin

Nov 26, 2019, 10:52:55 AM

to rna-star

Hi Kevin,

a few things to try:

1. Re-generate the genome index without --genomeChrBinNbits 0 . The default value of this parameter should work fine.

2. If this does not help, please try to map a "good" human or mouse dataset (e.g. ENCODE) to your combined genome and check the speed. This will show whether the slowdown is specific to your sample.

3. If the good dataset maps quickly, then the slowdown is likely caused by reads in your dataset that map poorly. In this case, please try reducing --seedPerWindowNmax from the default of 50 to 30, or even lower (20 or 10).
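For reference, steps 1 and 3 might look roughly like the dry-run sketch below. It only prints the STAR commands rather than running them. All file and directory names are hypothetical placeholders, and the thread count simply mirrors the 8 cores mentioned above; adjust them to your setup.

```shell
#!/bin/sh
# Dry-run sketch: prints STAR commands instead of executing them.
# Every path below is a hypothetical placeholder, not taken from the thread.
GENOME_DIR="star_index"
FASTA="genome.primary_assembly.fa"
GTF="annotation.primary_assembly.gtf"
READS="sample.fastq"
THREADS=8

# Step 1: rebuild the index WITHOUT --genomeChrBinNbits,
# so STAR falls back to its default value for that parameter.
index_cmd() {
  echo STAR --runMode genomeGenerate \
    --runThreadN "$THREADS" \
    --genomeDir "$GENOME_DIR" \
    --genomeFastaFiles "$FASTA" \
    --sjdbGTFfile "$GTF"
}

# Step 3: map with a reduced --seedPerWindowNmax (first argument).
# The output prefix encodes the value so runs do not overwrite each other.
map_cmd() {
  echo STAR --runThreadN "$THREADS" \
    --genomeDir "$GENOME_DIR" \
    --readFilesIn "$READS" \
    --seedPerWindowNmax "$1" \
    --outFileNamePrefix "seeds$1_"
}

index_cmd
for n in 30 20 10; do
  map_cmd "$n"
done
```

To actually run a step, remove the leading `echo` from the corresponding function. Comparing the mapping speed in Log.progress.out across the three runs should show whether lowering the value helps.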

Cheers

Alex

Kevin

Nov 27, 2019, 12:57:17 AM

to rna-star

Hi Alex,

Thanks a lot! Your first suggestion solved my problem! Now STAR works like a charm! Thank you for developing this wonderful tool for the community!