Segmentation fault - STARlong

Etai Jacob

Oct 26, 2016, 10:40:50 AM
to rna-star

Hello,

I am trying to align 150 bp paired-end RNA-seq reads using STARlong but keep getting a segmentation fault (core dumped). This does not occur with STAR. RAM usage increases with --alignWindowsPerReadNmax, but the run always fails in the end (e.g. with the --alignWindowsPerReadNmax value in the command below, RAM reaches 0.52 TB, which is the machine limit).

STARlong --runMode alignReads --runThreadN 32 --genomeDir ReferenceData/STAR/GRCh38_Gencode.v25.ERCC92.readLength149 --twopassMode None --sjdbOverhang 149 --outFilterMultimapNmax 20 --alignSJoverhangMin 8 --alignSJDBoverhangMin 2 --outFilterMismatchNmax 999 --outFilterMismatchNoverLmax 0.1 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --outFilterType BySJout --outFilterScoreMinOverLread 0.33 --outFilterMatchNminOverLread 0.33 --limitSjdbInsertNsj 1200000 --readFilesIn 1_S1_L001_R1_001.fastq.gz 1_S1_L001_R2_001.fastq.gz --readFilesCommand zcat --outFileNamePrefix ./1_S1_L001_R1_001.fastq.gz. --outSAMstrandField intronMotif --outFilterIntronMotifs None --alignSoftClipAtReferenceEnds No --quantMode TranscriptomeSAM GeneCounts --outSAMtype BAM SortedByCoordinate --limitBAMsortRAM 171798691840 --outSAMunmapped Within --genomeLoad NoSharedMemory --chimSegmentMin 15 --chimJunctionOverhangMin 15 --outSAMattributes NH HI AS nM NM --outSAMattrRGline ID:rg1 SM:sm1 --alignWindowsPerReadNmax 2000000

Oct 25 18:48:05 ..... started STAR run
Oct 25 18:48:12 ..... loading genome
Oct 25 18:48:24 ..... started mapping
Segmentation fault (core dumped)

Thanks!

Alexander Dobin

Oct 28, 2016, 11:50:45 AM
to rna-star
Hi Etai,

the --alignWindowsPerReadNmax 2000000 seems to be too large; why would you need to increase it so much?
This parameter pre-allocates memory to store putative alignments, and 2M feels like too much.

Cheers
Alex

Etai Jacob

Nov 4, 2016, 1:53:59 PM
to rna-star
Hello Alex,

I just tried different values of alignWindowsPerReadNmax to see if it helps to solve the problem (it did not - problem remains for different values of alignWindowsPerReadNmax).

Thanks,
Etai

Alexander Dobin

Nov 4, 2016, 5:49:18 PM
to rna-star
Hi Etai,

please send me the Log.out file of the failed run with the smallest --alignWindowsPerReadNmax value you used.

Cheers
Alex

Etai Jacob

Nov 14, 2016, 7:50:33 PM
to rna-star
Hi Alex,

Please find the attached Log.out for the following command (which resulted in: [1]+  Segmentation fault      (core dumped) )

STARlong --runMode alignReads --runThreadN 32 --genomeDir ../../ReferenceData/STAR/GRCh38_Gencode.v25.ERCC92.readLength149 --twopassMode None --sjdbOverhang 149 --outFilterMultimapNmax 20 --alignSJoverhangMin 8 --alignSJDBoverhangMin 2 --outFilterMismatchNmax 999 --outFilterMismatchNoverLmax 0.1 --alignIntronMin 20 --alignIntronMax 1000000 --alignMatesGapMax 1000000 --outFilterType BySJout --outFilterScoreMinOverLread 0.33 --outFilterMatchNminOverLread 0.33 --limitSjdbInsertNsj 1200000 --readFilesIn ../Stamatis/results_latest/1_S1_L001_R1_001.fastq.gz ../Stamatis/results_latest/1_S1_L001_R2_001.fastq.gz --readFilesCommand zcat --outFileNamePrefix ./1_S1_L001_R1_001.fastq.gz. --outSAMstrandField intronMotif --outFilterIntronMotifs None --alignSoftClipAtReferenceEnds No --quantMode TranscriptomeSAM GeneCounts --outSAMtype BAM SortedByCoordinate --limitBAMsortRAM 171798691840 --outSAMunmapped Within --genomeLoad NoSharedMemory --chimSegmentMin 15 --chimJunctionOverhangMin 15 --outSAMattributes NH HI AS nM NM --outSAMattrRGline ID:rg1 SM:sm1 1>&1_S1_L001_R1_001.fastq.gz.stdouterr

Thanks,
Etai
1_S1_L001_R1_001.fastq.gz.Log.out
1_S1_L001_R1_001.fastq.gz.stdouterr

Shani Amarasinghe

Nov 15, 2016, 11:35:33 AM
to rna-star
Hi Alex,

I was wondering whether Etai's issue was resolved. I also am facing a similar situation with my mapping. However, I'm using the default --alignWindowsPerReadNmax value. My genome is quite large (barley) so I'm using the default (14) for --genomeSAindexNbases as well. I'm creating a tmp file for all the inputs as well.

I used STAR before with Arabidopsis genome, and it never let me down.

If you could please check the attached Log.out file and comment on how I can fix this issue, that would be great. I really like STAR, and I hate to lose so much time on this when what I should actually be looking at are the incredibly well mapped reads at an incredibly fast speed.

Thanks so very much. 
Shani
C8H26ANXX_L10A_S6_Log.out

Shani Amarasinghe

Nov 15, 2016, 8:08:14 PM
to rna-star
Hi Alex,

Sorry for the false alarm. I started using the newest version of STAR (2.4.2a_mod) and it seems to be working fine for the moment (no error and from log_out it seems that mapping is in action).

And just thinking of Etai's issue. Maybe using STARlong with the option --outSAMstrandField intronMotif as seen in the original script could be what's causing the segmentation error.

Thanks a lot
Shani.

Etai Jacob

Nov 16, 2016, 12:14:55 PM
to rna-star
Hi,

Thanks Shani.
I tried it without this option - the segmentation fault occurs either with or without this option.

Etai

Shani Amarasinghe

Nov 18, 2016, 4:29:47 AM
to rna-star
Hi Etai,

Then you might have to wait for Alex to get back to you with a solution. Sorry I couldn't be of any help.

Shani

Alexander Dobin

Nov 18, 2016, 6:43:24 PM
to rna-star
Hi Etai,

I have tested your parameters, and it looks like the combination of chimeric detection (--chimSegmentMin 15 --chimJunctionOverhangMin 15)
with --outSAMstrandField intronMotif caused the seg-fault for STARlong.
I will try to fix the bug next week, but in the meantime you could try to use only one of these options.
Also, how long are your reads? If they are <300b, you may get good (or even better) results with standard STAR, not STARlong.

Cheers
Alex
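
The workaround above can be sketched as a small pre-flight check on the argument list. This is only an illustration under stated assumptions: `has_conflict` and `drop_chimeric` are hypothetical shell helpers, not part of STAR, and they treat any occurrence of --chimSegmentMin or --chimJunctionOverhangMin as enabling chimeric detection, whatever value follows.

```shell
#!/bin/sh
# Hypothetical helpers (not part of STAR) for the option combination
# reported to seg-fault in STARlong: chimeric detection together with
# --outSAMstrandField intronMotif.

# Succeeds (exit 0) if the arguments contain both a chimeric option and
# "--outSAMstrandField intronMotif". The body runs in a subshell so its
# variables do not leak into the caller.
has_conflict() (
    chim=0
    motif=0
    prev=""
    for arg in "$@"; do
        case "$arg" in
            --chimSegmentMin|--chimJunctionOverhangMin) chim=1 ;;
        esac
        if [ "$prev" = "--outSAMstrandField" ] && [ "$arg" = "intronMotif" ]; then
            motif=1
        fi
        prev="$arg"
    done
    [ "$chim" -eq 1 ] && [ "$motif" -eq 1 ]
)

# Prints the arguments one per line, leaving out the two chimeric
# options and the value that follows each of them.
drop_chimeric() (
    skip=0
    for arg in "$@"; do
        if [ "$skip" -eq 1 ]; then
            skip=0
            continue
        fi
        case "$arg" in
            --chimSegmentMin|--chimJunctionOverhangMin) skip=1 ;;
            *) printf '%s\n' "$arg" ;;
        esac
    done
)

# Example with the relevant options from the failing command.
if has_conflict --chimSegmentMin 15 --chimJunctionOverhangMin 15 \
        --outSAMstrandField intronMotif; then
    echo "chimeric detection + intronMotif: use only one of them with STARlong"
fi
```

Dropping the chimeric options is one of the two choices; removing --outSAMstrandField intronMotif instead would be the other.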

Etai Jacob

Nov 18, 2016, 7:20:04 PM
to Alexander Dobin, rna-star
Thanks Alex.
My reads are 150x2 (paired end).
Could you briefly explain why it would even be better?
Etai

Alexander Dobin

Nov 21, 2016, 11:07:46 AM
to rna-star, ado...@gmail.com
Hi Etai,

STARlong is designed for mapping very long reads (>300b) such as PacBio. It is somewhat less accurate in finding all multimapping loci for short reads.
Also it's less accurate for chimeric detection. It is sometimes useful to use it for 250-300b Illumina reads as it is much faster than STAR for this range. For 2x150 reads standard STAR is definitely a better option.

Cheers
Alex
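
The rule of thumb above can be sketched as a quick read-length check before choosing the binary. Assumptions, none of which come from STAR itself: `max_read_len` and `pick_aligner` are hypothetical helpers, the FASTQ file is gzipped with four lines per record, only the first 10,000 reads are sampled, and the 300-base figure is used as a hard cutoff. The 250-300b range, where STARlong may be worth it for speed, is a judgment call this sketch does not encode.

```shell
#!/bin/sh
# Hypothetical helper: longest read among the first 10,000 records
# (40,000 lines) of a gzipped FASTQ file. Sequence lines are every
# 4th line starting at line 2.
max_read_len() {
    gzip -dc < "$1" | head -n 40000 |
        awk 'NR % 4 == 2 { if (length($0) > m) m = length($0) } END { print m + 0 }'
}

# Hypothetical helper: reads longer than 300 bases go to STARlong,
# everything else to standard STAR.
pick_aligner() {
    if [ "$1" -gt 300 ]; then
        echo "STARlong"
    else
        echo "STAR"
    fi
}

pick_aligner 150   # prints STAR
```

With a real file this would be called as `pick_aligner "$(max_read_len 1_S1_L001_R1_001.fastq.gz)"`; for paired-end data the length is per mate, so 2x150 reads count as 150.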
