STARlong parameters

Dorit Hockman

May 9, 2016, 11:39:25 AM

to rna-star

Hi,

I am using STARlong to map a de novo Trinity assembly back to a genome, with the parameters specified in the supplementary data of the STAR paper (see below). Could you explain why these parameters were chosen? I am finding it difficult to interpret them from the manual alone.

--outFilterMismatchNmax 100
--seedSearchLmax 30   --seedSearchStartLmax 30
--seedPerReadNmax 100000   --seedPerWindowNmax 100
--alignTranscriptsPerReadNmax 100000
--alignTranscriptsPerWindowNmax 10000

Thank you for your help.

- Dorit Hockman

Alexander Dobin

May 9, 2016, 5:36:15 PM

to rna-star

Hi Dorit,

--outFilterMismatchNmax 100 : raises the maximum number of allowed mismatches to 100; longer reads need a larger mismatch allowance.

--seedSearchStartLmax 30 : increases the number of seed search start positions in the read; this is important for reads with a high error rate.

--seedSearchLmax 30 : similar to the above; limits the maximum length of the seeds. At present, I do not recommend changing this parameter.

--seedPerReadNmax 100000   --seedPerWindowNmax 100 : increase the number of seeds allowed per read and per alignment window; longer reads need more seeds to be stored.

--alignTranscriptsPerReadNmax 100000   --alignTranscriptsPerWindowNmax 10000 : increase the number of alignments allowed per read and per alignment window; longer reads need more putative alignments to be stored.
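For reference, the parameters above could be combined into a single STARlong call roughly as follows. This is only a dry-run sketch: the index directory, the Trinity FASTA file name, and the output prefix are hypothetical placeholders, and the script prints the command rather than running it. The remaining options (--runMode, --genomeDir, --readFilesIn, --outFileNamePrefix) are the standard STAR mapping options, not part of the long-read settings discussed here.

```shell
#!/bin/sh
# Dry-run sketch: assemble a STARlong command line with the long-read
# parameters from this thread and print it.
# The three paths below are hypothetical placeholders; adjust to your data.
GENOME_DIR="${GENOME_DIR:-/path/to/star_index}"
TRANSCRIPTS_FA="${TRANSCRIPTS_FA:-Trinity.fasta}"
OUT_PREFIX="${OUT_PREFIX:-trinity_vs_genome_}"

# Build the argument list step by step so each option can carry a comment.
set -- STARlong --runMode alignReads
set -- "$@" --genomeDir "$GENOME_DIR"
set -- "$@" --readFilesIn "$TRANSCRIPTS_FA"
set -- "$@" --outFileNamePrefix "$OUT_PREFIX"
set -- "$@" --outFilterMismatchNmax 100            # allow more mismatches per read
set -- "$@" --seedSearchStartLmax 30               # more seed search start positions
set -- "$@" --seedSearchLmax 30                    # cap the seed length
set -- "$@" --seedPerReadNmax 100000               # more seeds stored per read
set -- "$@" --seedPerWindowNmax 100                # more seeds stored per window
set -- "$@" --alignTranscriptsPerReadNmax 100000   # more alignments per read
set -- "$@" --alignTranscriptsPerWindowNmax 10000  # more alignments per window

# Print the assembled command instead of executing it.
CMDLINE="$*"
echo "$CMDLINE"
```

To actually run the mapping, replace the final `echo "$CMDLINE"` with `"$@"` once the placeholder paths point at a real genome index and assembly.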

Cheers

Alex

Dorit Hockman

May 10, 2016, 6:10:37 AM

to rna-star

Thank you! That is very helpful.

-Dorit
