STAR failed to build genome index


Quincy Mo

Nov 19, 2014, 1:38:03 AM
to rna-...@googlegroups.com
Hi Alex,

I tried to build the index for 90 bp paired-end reads using the latest STAR as follows on Linux. 

>mkdir /work/02457/qmo/Genome/Homo_sapiens/UCSC/hg19/Sequence/starIndex90
>STAR --runThreadN 6 \
     --runMode genomeGenerate \
     --genomeDir  /work/02457/qmo/Genome/Homo_sapiens/UCSC/hg19/Sequence/starIndex90 \
     --genomeFastaFiles /work/02457/qmo/Genome/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa \
     --sjdbGTFfile /work/02457/qmo/Genome/Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf \
     --sjdbOverhang 89

There was no error message. However, the index is only 4.6G, and STAR reported a FATAL ERROR when I used it.

>du -ch starIndex90/
4.6G    starIndex90/
4.6G    total

I was able to run STAR on Mac OS, but not on Linux. I would greatly appreciate it if you could help me out.

Thank you very much for your time.

Quincy

Quincy Mo

Nov 19, 2014, 10:56:37 AM
to rna-...@googlegroups.com
 
Hi Alex,

Below are the details.
 
I tried to build the index for 90 bp paired-end reads using the latest STAR as follows on Linux. 

>mkdir /work/02457/qmo/Genome/Homo_sapiens/UCSC/hg19/Sequence/starIndex90
>STAR --runThreadN 6 \
     --runMode genomeGenerate \
     --genomeDir  /work/02457/qmo/Genome/Homo_sapiens/UCSC/hg19/Sequence/starIndex90 \
     --genomeFastaFiles /work/02457/qmo/Genome/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa \
     --sjdbGTFfile /work/02457/qmo/Genome/Homo_sapiens/UCSC/hg19/Annotation/Genes/genes.gtf \
     --sjdbOverhang 89

Nov 18 23:12:08 ..... Started STAR run
Nov 18 23:12:08 ... Starting to generate Genome files
Nov 18 23:13:23 ... finished processing splice junctions database ...
Nov 18 23:13:47 ... starting to sort  Suffix Array. This may take a long time...
Nov 18 23:14:10 ... sorting Suffix Array chunks and saving them to disk...
Nov 18 23:41:55 ... loading chunks from disk, packing SA...


There was no error message. However, the index is only 4.6G.

>du -ch starIndex90/
4.6G    starIndex90/
4.6G    total

> ls starIndex90/
chrLength.txt      chrName.txt   exonInfo.tab  genomeParameters.txt  sjdbInfo.txt              sjdbList.out.tab
chrNameLength.txt  chrStart.txt  Genome        SA_19                 sjdbList.fromGTF.out.tab  transcriptInfo.tab

It seems that STAR failed to merge all the SA_* chunks into the final SA file; only SA_19 was left. When I used the index, STAR reported a FATAL ERROR as follows.

EXITING because of FATAL ERROR: could not open genome file /work/02457/qmo/Genome/Homo_sapiens/UCSC/hg19/Sequence/starIndex90/SA

SOLUTION: check that the path to genome files, specified in --genomDir is correct and the files are present, and have user read permsissions

Nov 19 09:39:21 ...... FATAL ERROR, exiting
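The symptom above (no SA file, a leftover SA_19 chunk) can be detected before mapping. A minimal sketch, assuming a complete index should contain non-empty Genome, SA and SAindex files; check_star_index is a hypothetical helper, not part of STAR:

```shell
# Hypothetical helper, not part of STAR: report whether an index directory
# looks complete. Assumes Genome, SA and SAindex must all be present.
check_star_index() {
  dir=$1
  status=0
  for f in Genome SA SAindex; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $dir/$f"
      status=1
    fi
  done
  # Leftover SA_<n> chunks suggest the suffix array was never merged.
  for chunk in "$dir"/SA_*; do
    if [ -e "$chunk" ]; then
      echo "leftover chunk: $chunk"
      status=1
    fi
  done
  return "$status"
}

# Usage: check_star_index /path/to/starIndex90 && echo "index looks complete"
```

The function returns non-zero when anything is missing, so it can gate a mapping job in a batch script.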

Quincy Mo

Nov 19, 2014, 10:36:01 PM
to rna-...@googlegroups.com
I tried to build the genome index again.  STAR_2.4.0f1 generated the following error message.


BUG: next index is smaller than previous, EXITING

Nov 19 18:03:15 ...... FATAL ERROR, exiting

Alexander Dobin

Nov 20, 2014, 12:49:00 AM
to rna-...@googlegroups.com
Hi Quincy,

Please remove all files from the genome directory and try to re-run it. Also, make sure that you have enough space on the disk (~100GB) and that any previous instances of STAR jobs have been killed.
If this still does not complete, please send me the links to the fasta and gtf files you are using.

Cheers
Alex
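The checks suggested above could be scripted roughly as follows. This is only a sketch: reset_genome_dir is a hypothetical helper, not part of STAR, and it assumes the directory holds nothing except index files, because everything in it is deleted.

```shell
# Hypothetical helper, not part of STAR: reset an index directory after a
# failed genomeGenerate run and report what might interfere with a rerun.
reset_genome_dir() {
  dir=$1
  [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }

  # Remove partial outputs (for example leftover SA_* chunks).
  rm -rf -- "${dir:?}"/*

  # Show free space on the filesystem holding the index.
  df -h "$dir"

  # List STAR processes still running under the current user, if any.
  if command -v pgrep >/dev/null 2>&1; then
    pgrep -l -u "$(id -un)" STAR || echo "no running STAR processes found"
  fi
}

# Usage: reset_genome_dir /path/to/starIndex90
```

Any STAR processes it lists still have to be stopped by hand (or through the cluster scheduler) before the rerun.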

Quincy Mo

Nov 21, 2014, 1:25:45 AM
to rna-...@googlegroups.com
Hi Alex,

Thank you for your help.

I might have found the problem. On one of our compute clusters, each node has only 32GB of RAM, so STAR probably ran out of memory while building the index.
It worked when I added '--genomeSAsparseD 2' to the arguments. However, when I added only '--limitGenomeGenerateRAM 24000000000', it still did not work.
In addition, I have not figured out why STAR failed on another cluster that has 64GB of RAM per node.

Quincy
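For reference, the workaround described in the post above amounts to the original command plus one flag. The sketch below reuses the paths quoted earlier in the thread; the RUN_STAR switch is an assumption added here so the command is only printed by default.

```shell
# Paths are the ones quoted earlier in this thread; adjust for your system.
base=/work/02457/qmo/Genome/Homo_sapiens/UCSC/hg19
genome_dir="$base/Sequence/starIndex90"

# Same arguments as the original run, plus --genomeSAsparseD 2, which makes
# the suffix array sparser and so lowers the RAM needed.
set -- STAR --runThreadN 6 \
  --runMode genomeGenerate \
  --genomeDir "$genome_dir" \
  --genomeFastaFiles "$base/Sequence/WholeGenomeFasta/genome.fa" \
  --sjdbGTFfile "$base/Annotation/Genes/genes.gtf" \
  --sjdbOverhang 89 \
  --genomeSAsparseD 2

# Print the command; run it only if RUN_STAR=1 is set and STAR is installed.
cmd_line="$*"
echo "$cmd_line"
if [ "${RUN_STAR:-0}" = 1 ] && command -v STAR >/dev/null 2>&1; then
  "$@"
fi
```

A sparser suffix array trades some mapping speed for the lower memory footprint, so it is mainly worth using on nodes where the default build does not fit in RAM.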


