Problem running STAR 2 pass


Mauricio Losilla

Jun 8, 2015, 3:38:26 PM
to rna-...@googlegroups.com

Hello,

I am trying to run STAR 2.4.1d, but I get an error when I enable 2-pass mode (--twopassMode Basic). The run starts and completes the initial steps (including the entire first pass), but then fails at the step "Inserting junctions into the genome indices". At least this is what I gather from the attached STARalignMouse file, generated by the HPCC I am running STAR on. I am also attaching the testLog.out file generated by STAR.

Thanks
Mau
testLog.out
STARalignMouse.txt

Mauricio Losilla

Jun 9, 2015, 11:42:04 AM
to rna-...@googlegroups.com

Hi,

I tested this again on my personal computer, to see if the problem lies within STAR or in the HPCC installation. Due to RAM restrictions, I only used one chromosome as the reference genome, but I don't think that's a problem. 

The run failed again, and I think it failed in the same steps as before. I am attaching the testLog.out file, plus a screenshot of my terminal with explanations of each run I did.

Is this a bug in STAR, or am I doing something wrong?

Thanks
Mau
STARfail.png
testLogLinux.out

Alexander Dobin

Jun 11, 2015, 6:45:16 PM
to rna-...@googlegroups.com, mlosi...@gmail.com
Hi Mauricio,

which parameters did you use to generate the genome? It seems --sjdbOverhang was set to 0, which could crash STAR.
If you do not use annotations at the genome generation step, do not specify the --sjdbOverhang parameter.

The logic of selecting the --sjdbOverhang parameter is somewhat convoluted. 
If you specify it at the genome generation step, but do not specify it for the mapping step, the genome generation value will be used.

If it still does not work, please send me the Log.out file, as well as stdout again.

Cheers
Alex

Mauricio Losilla

Jun 12, 2015, 10:31:55 AM
to rna-...@googlegroups.com, mlosi...@gmail.com
Hi Alex,

Thank you for your reply. I have never specified the --sjdbOverhang parameter, neither during genome indexing nor during mapping. To be honest, I don't fully understand what that parameter does, and I didn't look into it much, because my genome isn't annotated.


My genome indexing command was:

STAR --runThreadN 4 --runMode genomeGenerate --genomeFastaFiles /path/to/genome/file --genomeDir /path/to/output/folder

So I guess it ran with the default value of 100 (according to the manual for version 2.4.1a, which is the newest I found). My reads are 76 bp long; I wonder if this matters.

Upon checking the attached genomeParameters.txt file, generated during genome indexing, it says sjdbOverhang 0.


I hope this helps!

Thanks a lot
Mau
genomeParameters.txt
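
A minimal sketch of the check described above: reading the sjdbOverhang value that genome generation recorded in genomeParameters.txt. The helper name genome_sjdb_overhang is hypothetical, and the snippet assumes the file holds one whitespace-separated name/value pair per line, as the "sjdbOverhang 0" entry quoted above suggests.

```shell
# Print the sjdbOverhang value recorded in a genome directory.
# Argument: the directory passed to --genomeDir at genome generation.
# Assumption: genomeParameters.txt has one "name value" pair per line.
genome_sjdb_overhang() {
    awk '$1 == "sjdbOverhang" { print $2; exit }' "$1/genomeParameters.txt"
}

# Example with the placeholder path from the indexing command:
# genome_sjdb_overhang /path/to/output/folder
```

If this prints 0, the genome was most likely built without annotations, which is the situation discussed in this thread.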

Mauricio Losilla

Jun 12, 2015, 11:07:06 AM
to rna-...@googlegroups.com

I played with the --sjdbOverhang parameter. I tried to set it to 75 during the genome index step, but I got this error: 

Jun 12 10:52:04 ..... Started STAR run

EXITING because of FATAL INPUT PARAMETER ERROR: when generating genome without annotations (--sjdbFileChrStartEnd or --sjdbGTFfile options)
do not specify >0 --sjdbOverhang
SOLUTION: re-run genome generation without --sjdbOverhang option

Jun 12 10:52:04 ...... FATAL ERROR, exiting


Then I tried to set it in the mapping step. I added --sjdbOverhang 75 to my previous mapping command, and it seems to complete beautifully! 


Is this what I should do to properly run 2-pass mode on unannotated genomes? Does --sjdbOverhang default to 0?
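
For reference, the mapping-step workaround described above might look like the following sketch. The original mapping command was not posted, so everything except --twopassMode Basic and --sjdbOverhang is an assumption: the helper name star_twopass_cmd is hypothetical, the thread count is borrowed from the indexing command, the paths are placeholders, and the overhang is computed as read length minus 1 (the convention suggested in the STAR manual, which gives the 75 used here for 76 bp reads). The helper only prints the command so it can be inspected before running.

```shell
# Print (do not run) a two-pass mapping command that sets --sjdbOverhang
# explicitly. Arguments: genome directory, reads file, read length in bases.
# Assumption: overhang = read length - 1 (76 bp reads -> 75).
# Note: paths containing spaces are not quoted in the printed command.
star_twopass_cmd() {
    genome_dir=$1
    reads=$2
    read_len=$3
    overhang=$((read_len - 1))
    echo "STAR --runThreadN 4 --genomeDir $genome_dir --readFilesIn $reads" \
         "--twopassMode Basic --sjdbOverhang $overhang"
}

# Placeholder paths; adjust before use.
star_twopass_cmd /path/to/output/folder /path/to/reads.fastq 76
```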

Alexander Dobin

Jun 12, 2015, 11:49:49 AM
to rna-...@googlegroups.com, mlosi...@gmail.com
Hi Mauricio,

if you generated the genome with an older version of STAR, the default --sjdbOverhang would indeed be 0, and this would create a problem for 2-step mapping.
I changed the default value of sjdbOverhang as of 2.4.1a, and unfortunately it created a lot of confusion. 2.4.1d looks stable, and I strongly recommend using it instead of 2.4.1a-c,
both for genome generation and mapping.

The --sjdbOverhang option can only be used at the genome generation step if you use annotation (e.g. gtf file).
At the mapping step, --sjdbOverhang option will work only if you are doing 2-step mapping or inserting annotations on the fly.

Cheers
Alex
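
The genome generation rule stated above (--sjdbOverhang only together with an annotation) can be sketched as a small command builder. This is an illustration under assumptions, not STAR's own logic: the helper name star_genome_generate_cmd is hypothetical, the paths are placeholders, and the command is only printed, not run. The flags themselves all appear earlier in this thread.

```shell
# Print (do not run) a genome generation command.
# Arguments: genome FASTA, output directory, optional GTF, optional overhang.
# --sjdbOverhang is only emitted together with an annotation file; asking for
# it without one is rejected, mirroring the rule stated above.
# Note: paths containing spaces are not quoted in the printed command.
star_genome_generate_cmd() {
    fasta=$1
    genome_dir=$2
    gtf=${3:-}
    overhang=${4:-}
    cmd="STAR --runThreadN 4 --runMode genomeGenerate"
    cmd="$cmd --genomeFastaFiles $fasta --genomeDir $genome_dir"
    if [ -n "$overhang" ] && [ -z "$gtf" ]; then
        echo "no annotation given: do not set --sjdbOverhang here" >&2
        return 1
    fi
    if [ -n "$gtf" ]; then
        cmd="$cmd --sjdbGTFfile $gtf"
    fi
    if [ -n "$overhang" ]; then
        cmd="$cmd --sjdbOverhang $overhang"
    fi
    echo "$cmd"
}

# Unannotated genome, as in this thread (placeholder paths):
star_genome_generate_cmd /path/to/genome/file /path/to/output/folder
```

With an annotation, the GTF path and overhang would be passed as the third and fourth arguments.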

Mauricio Losilla

Jun 12, 2015, 12:02:49 PM
to rna-...@googlegroups.com, mlosi...@gmail.com

No, I have only used v2.4.1d

Is the default supposed to be 0? That is the value according to the genomeParameters file I got.

thanks
Mau

Alexander Dobin

Jun 12, 2015, 12:27:30 PM
to rna-...@googlegroups.com, mlosi...@gmail.com
Hi Mauricio,

sorry, I am confusing everyone including myself with the --sjdbOverhang logic.
I have checked the code again and this is what happens.

The default value of --sjdbOverhang is 100. This is used if you generate genome with annotations.
If you generate genome without annotations, STAR forces --sjdbOverhang to 0 and records it in the genomeParameters file.

For the 2-step mapping, if you do not specify the --sjdbOverhang value, the value is taken from the genomeParameters file, and if this value is 0, it crashes STAR.
I will add checking for this condition. 

At the moment, the safest thing is to specify --sjdbOverhang at the mapping step.

Cheers
Alex
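
Until such a check exists in STAR itself, the logic described above can be approximated with a pre-flight check in a wrapper script. This is only a sketch under assumptions: the helper name effective_sjdb_overhang is hypothetical, and genomeParameters.txt is assumed to hold one whitespace-separated name/value pair per line. It resolves the value a two-pass run would use (the mapping-step value if given, otherwise the one recorded at genome generation) and fails early if that value is missing, not a number, or 0.

```shell
# Resolve the sjdbOverhang a two-pass run would use and fail early if it
# is unusable. Arguments: path to genomeParameters.txt, then (optionally)
# the value that will be passed as --sjdbOverhang at the mapping step.
effective_sjdb_overhang() {
    params_file=$1
    user_value=${2:-}
    if [ -n "$user_value" ]; then
        # A value given at the mapping step takes precedence.
        value=$user_value
    else
        # Otherwise fall back to the value recorded at genome generation.
        value=$(awk '$1 == "sjdbOverhang" { print $2; exit }' "$params_file")
    fi
    case $value in
        ''|*[!0-9]*|0)
            echo "sjdbOverhang is '$value': pass --sjdbOverhang at the mapping step" >&2
            return 1
            ;;
    esac
    echo "$value"
}
```

A wrapper could call it as effective_sjdb_overhang /path/to/output/folder/genomeParameters.txt 75 and launch STAR only if the call succeeds.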

Mauricio Losilla

Jun 12, 2015, 2:08:41 PM
to rna-...@googlegroups.com, mlosi...@gmail.com

That makes perfect sense! 

Thank you very much for your help, I would never have figured it out myself!


Many Thanks
Mau