Help understanding STAR algorithm

sent...@gmail.com

Feb 1, 2018, 3:41:37 PM

to rna-star

Hi,

Please correct me if I am wrong.

The first step of alignment is to build an hg19 index from the hg19 FASTA, i.e. to create an "uncompressed suffix array (SA)" of the whole hg19 FASTA (the human reference genome).

Then, for every read (or read pair), STAR looks for "exact matches"/"matches" in the index and thereby maps the sequencing read to the genome.

My question is: every time STAR runs on a sample, does it spend time sorting the "uncompressed suffix array (SA)"? If so, why doesn't STAR sort the index when it builds it, so that a new run does not spend time sorting the SA and starts directly at the step of matching the reads to the index?

Alexander Dobin

Feb 5, 2018, 12:24:38 PM

to rna-star

Hi @sentisci

Indeed, the suffix array is created only once, with the --runMode genomeGenerate option.

In the mapping runs, the suffix array and other indexes are simply loaded into RAM.
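
To make the build-once, load-many pattern concrete, here is a toy Python sketch. It is not STAR's implementation (STAR is written in C++ and uses far more efficient construction and additional index structures). The function names, the pickle file, and the tiny "genome" string are all hypothetical and chosen only for illustration. The sorting happens only in the generation step; the mapping step reads the sorted array back and binary-searches it.

```python
import pickle
import tempfile
from pathlib import Path


def build_suffix_array(genome: str) -> list[int]:
    """Return the start positions of all suffixes in lexicographic order.

    Naive construction by sorting whole suffixes; only suitable for toy input.
    """
    return sorted(range(len(genome)), key=lambda i: genome[i:])


def save_index(genome: str, index_path: Path) -> None:
    """'Genome generation' step (hypothetical): sort once, write to disk."""
    index = {"genome": genome, "sa": build_suffix_array(genome)}
    with index_path.open("wb") as fh:
        pickle.dump(index, fh)


def load_index(index_path: Path) -> tuple[str, list[int]]:
    """'Mapping' step (hypothetical): load the sorted array; nothing is re-sorted."""
    with index_path.open("rb") as fh:
        index = pickle.load(fh)
    return index["genome"], index["sa"]


def find_exact_matches(genome: str, sa: list[int], read: str) -> list[int]:
    """Binary-search the suffix array for all exact occurrences of `read`."""
    k = len(read)

    # Lower bound: first suffix whose length-k prefix is >= read.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if genome[sa[mid]:sa[mid] + k] < read:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-k prefix is > read.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if genome[sa[mid]:sa[mid] + k] <= read:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        index_path = Path(tmp) / "toy_index.pkl"  # hypothetical file name
        save_index("ACGTACGTTAGC", index_path)    # done once
        genome, sa = load_index(index_path)       # done on every mapping run
        print(find_exact_matches(genome, sa, "ACG"))  # [0, 4]
```

Because the suffixes are already in sorted order on disk, each lookup in this sketch needs only a logarithmic number of string comparisons, and the cost of sorting is paid a single time when the index is written.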

Cheers

Alex
