STAR 2.4.2a: 2nd-pass mapping is extremely slow


Anil Kesarwani

Oct 2, 2015, 11:07:53 AM
to rna-star
Hi,

The 2nd-pass run is extremely slow compared to the 1st-pass STAR run, and I have not been able to figure out why. For context, here is information about the RNA-seq samples and the STAR parameters used:

I have a total of 200 human RNA-seq samples (100 bp PE reads, ~100 million reads per sample).

The 1st-pass run, with the genome generated from a GTF file, was quite fast (~30 min per sample). For this run, the "sjdbList.out.tab" and "sjdbList.fromGTF.out.tab" files inside "genomeDir" contained 347,743 and 347,819 lines, respectively. The parameters used were as follows:

STAR --genomeDir $genome_dir --readFilesIn $indir/$fq1 $indir/$fq2 --readFilesCommand zcat --outFileNamePrefix $outdir/$lib_name/$lib_name. --runThreadN 20 --outSAMtype BAM SortedByCoordinate --limitBAMsortRAM 115000000000 --outFilterType BySJout --outFilterMismatchNmax 8 --alignIntronMax 1000000 --alignIntronMin 20 --alignMatesGapMax 1000000 --alignSJoverhangMin 8 --alignSJDBoverhangMin 1 --chimSegmentMin 20 --chimJunctionOverhangMin 20 --outReadsUnmapped Fastx --outSAMstrandField intronMotif --outFilterMultimapNmax 20

The 2nd-pass run was executed with the merged "SJ.out.tab" files from all 200 samples of the 1st pass. The "sjdbList.out.tab" inside the "._STARgenome" directory now contains 3,175,064 lines (about 9 times as many splice junctions). The following parameters were appended to the 1st-pass parameters for the 2nd pass:

--sjdbFileChrStartEnd $sjdb --limitSjdbInsertNsj 10000000 --sjdbInsertSave All --sjdbOverhang 100

The 2nd-pass run took around 10 hrs, which is painfully slow. I would highly appreciate it if someone could guide me in this regard.


Thanks a lot.

Best regards,
Anil


Alexander Dobin

Oct 5, 2015, 4:21:59 PM
to rna-star
Hi Anil,

One documented cause of slow 2nd-pass runs is junctions in the mitochondrial genome:
https://groups.google.com/d/msg/rna-star/O1oDItDltjY/cyp7mlhRjP4J
Please try removing them from the SJ.out.tab files before concatenating them for the 2nd-pass genome generation.
If this does not work, I will need to try to replicate the problem.
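
A minimal sketch of that filtering step in shell, for illustration only. It assumes the mitochondrial contig is named chrM or MT (check the contig names in your own genome), and it relies only on the first column of SJ.out.tab being the chromosome name. The function name, the MITO_NAMES variable, and the file paths in the usage line are hypothetical, not part of STAR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of STAR): concatenate 1st-pass SJ.out.tab files
# while dropping junctions located on the mitochondrial contig.
# Assumption: the contig is named chrM or MT; set MITO_NAMES (comma-separated)
# if your genome uses a different name.
filter_mito_junctions() {
    awk -F'\t' -v names="${MITO_NAMES:-chrM,MT}" '
        BEGIN {
            # Build a lookup table of mitochondrial contig names.
            n = split(names, a, ",")
            for (i = 1; i <= n; i++) mito[a[i]] = 1
        }
        # Print only rows whose chromosome (column 1) is not mitochondrial.
        !($1 in mito)
    ' "$@"
}
```

Usage might look like: filter_mito_junctions pass1/*/*.SJ.out.tab > merged.noMT.SJ.tab, with the resulting file then given to --sjdbFileChrStartEnd in place of the unfiltered merge. All other columns are passed through unchanged, so the output has the same layout as the input files.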

Cheers
Alex