Help understanding STAR algorithm

sent...@gmail.com

Feb 1, 2018, 3:41:37 PM
to rna-star
Hi,

Please correct me if I am wrong.
The first step of alignment is to build an hg19 index from the hg19 FASTA, i.e. to create an "uncompressed suffix array (SA)" of the whole hg19 FASTA (the human reference genome).
Then, for every read (or read pair), STAR looks for "exact matches"/"matches" in the index and thereby maps the sequencing read to the genome.

My question is: every time STAR runs on a sample, does it spend time sorting the "uncompressed suffix array (SA)"? If so, why doesn't STAR sort the index when it builds it, so that a new run can skip sorting the SA and start directly at the step of matching the reads to the index?
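
For readers unfamiliar with the two steps described above, here is a toy Python sketch of an uncompressed suffix array and an exact-match lookup against it. It is only an illustration under simplifying assumptions: the function names `build_suffix_array` and `find_exact_matches` are hypothetical, the genome is a short string, and this is not STAR's actual implementation or search strategy.

```python
def build_suffix_array(genome: str) -> list[int]:
    """Return the start positions of all suffixes of `genome`,
    ordered so that the suffixes are in lexicographic order."""
    # Naive construction: fine for a toy string, far too slow for hg19.
    return sorted(range(len(genome)), key=lambda i: genome[i:])


def find_exact_matches(genome: str, suffix_array: list[int], read: str) -> list[int]:
    """Return the sorted genome positions where `read` occurs exactly."""
    k = len(read)
    lo, hi = 0, len(suffix_array)
    # Binary search for the first suffix whose first k characters are >= read.
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if genome[start:start + k] < read:
            lo = mid + 1
        else:
            hi = mid
    # All suffixes that start with `read` are adjacent in the sorted array.
    hits = []
    while lo < len(suffix_array):
        start = suffix_array[lo]
        if genome[start:start + k] != read:
            break
        hits.append(start)
        lo += 1
    return sorted(hits)


if __name__ == "__main__":
    genome = "ACGTACGTTAGC"
    sa = build_suffix_array(genome)
    print(find_exact_matches(genome, sa, "ACGT"))
```

Because the array is sorted, every occurrence of a read sits in one contiguous block, which is why the sort only needs to happen when the index is built.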

Alexander Dobin

Feb 5, 2018, 12:24:38 PM
to rna-star
Hi @sentisci

Indeed, the suffix array is created only once, with the --runMode genomeGenerate option.
On the mapping runs, the suffix array and other indexes are just loaded into RAM.

Cheers
Alex
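
The build-once, load-many pattern described in this reply can be sketched in a few lines of Python. This is a hedged illustration only: `generate_index`, `load_index`, and the JSON file layout are hypothetical stand-ins, not STAR's on-disk index format or its code.

```python
import json
import os
import tempfile


def generate_index(genome: str, index_path: str) -> None:
    """One-time step: sort the suffix array and write it to disk."""
    suffix_array = sorted(range(len(genome)), key=lambda i: genome[i:])
    with open(index_path, "w") as fh:
        json.dump({"genome": genome, "suffix_array": suffix_array}, fh)


def load_index(index_path: str) -> tuple[str, list[int]]:
    """Per-run step: read the already-sorted suffix array back into memory."""
    with open(index_path) as fh:
        index = json.load(fh)
    return index["genome"], index["suffix_array"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "toy_index.json")
        generate_index("ACGTACGTTAGC", path)  # done once
        for sample in ("sample_1", "sample_2"):  # each run only loads
            genome, suffix_array = load_index(path)
            print(sample, len(suffix_array))
```

No sorting happens inside `load_index`; each simulated mapping run pays only the cost of reading the file.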