STARlong parameters


Dorit Hockman

May 9, 2016, 11:39:25 AM
to rna-star
Hi,

I am using STARlong to map a de novo Trinity assembly back to a genome, with the parameters specified in the supplementary data of the STAR paper (see below). Could you explain why these parameters were chosen? I am finding it difficult to interpret them using the manual alone.

--outFilterMismatchNmax 100
--seedSearchLmax 30   --seedSearchStartLmax 30
--seedPerReadNmax 100000   --seedPerWindowNmax 100
--alignTranscriptsPerReadNmax 100000
--alignTranscriptsPerWindowNmax 10000

Thank you for your help.

- Dorit Hockman

Alexander Dobin

May 9, 2016, 5:36:15 PM
to rna-star
Hi Dorit,

--outFilterMismatchNmax 100 : raises the number of allowed mismatches to 100; longer reads need a larger mismatch allowance.
--seedSearchStartLmax 30 : increases the number of seed search start positions in the read, which is important for reads with a high error rate.
--seedSearchLmax 30 : similar to the above; limits the maximum length of the seeds. At present, I do not recommend changing this parameter.
--seedPerReadNmax 100000   --seedPerWindowNmax 100 : increase the number of seeds allowed per read and per alignment window; longer reads need more seeds to be stored.
--alignTranscriptsPerReadNmax 100000 --alignTranscriptsPerWindowNmax 10000 : increase the number of alignments allowed per read and per alignment window; longer reads need more putative alignments to be stored.
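
As a rough sketch, the options above could be combined into a single STARlong call along the following lines. The genome index path, input file name, thread count, and output prefix are placeholders assumed for illustration, not values from this thread. The script only prints the assembled command so it can be checked before anything is run.

```shell
#!/bin/sh
# Sketch only: the three values below are hypothetical placeholders.
genome_dir="/path/to/genome_index"   # assumed: an index built beforehand with --runMode genomeGenerate
transcripts="Trinity.fasta"          # assumed: the de novo assembly to map back to the genome
threads=8                            # assumed: adjust to the machine

# Assemble the command as positional parameters so that quoting is preserved.
# The first five options are generic run settings (assumed for this sketch);
# the remaining seven are the long-read options discussed in this thread.
set -- STARlong \
  --runMode alignReads \
  --runThreadN "$threads" \
  --genomeDir "$genome_dir" \
  --readFilesIn "$transcripts" \
  --outFileNamePrefix starlong_ \
  --outFilterMismatchNmax 100 \
  --seedSearchStartLmax 30 \
  --seedSearchLmax 30 \
  --seedPerReadNmax 100000 \
  --seedPerWindowNmax 100 \
  --alignTranscriptsPerReadNmax 100000 \
  --alignTranscriptsPerWindowNmax 10000

# Print the command for inspection instead of executing it.
printf '%s ' "$@"
printf '\n'

# To actually run the alignment, uncomment the next line:
# "$@"
```

Keeping the long-read options together in one place makes it easier to compare them against the defaults listed in the manual and to change one at a time when experimenting.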

Cheers
Alex

Dorit Hockman

May 10, 2016, 6:10:37 AM
to rna-star
Thank you! That is very helpful.

-Dorit