Jellyfish running out of memory on an HPC cluster, --JM no longer available


mira...@umn.edu

Mar 17, 2015, 9:59:22 PM
to trinityrn...@googlegroups.com
Dear all,

I have been banging my head against my desktop for a while trying to figure out why my RNA-seq assemblies on my institution's cluster crash when I include more than one library's reads per assembly (60M PE reads; it runs fine with 20M PE). I followed lots of threads and blog questions and added "--verbose" to see more detail on the errors. It looks like Jellyfish is running out of memory (it fails to allocate what it asks for), but in v2 I can't adjust that with the old --JM option. I requested 32 cores and 125 G of RAM for this analysis but cannot change Jellyfish's memory. Is there a way?

Could anybody help me? I concatenated all the PE reads no problem to make the both.fa file...

This is my output window and run.log errors:

>>>>>>

#######################################

Running Java Tests

Tuesday, March 17, 2015: 21:40:09       CMD: java -Xmx64m -jar /home/applications/trinity/2.0.3/util/support_scripts/ExitTester.jar 0

Picked up _JAVA_OPTIONS: -Xms128m -Xmx1024m

Error, do not understand options: --JM 70G --bflyHeapSpaceMax 125G --bflyHeapSpaceInit 3900M --bflyCPU 32 --inchworm_cpu 32CMD finished (1 seconds)

Tuesday, March 17, 2015: 21:40:10 CMD: java -Xmx64m -jar /home/applications/trinity/2.0.3/util/support_scripts/ExitTester.jar 1

Picked up _JAVA_OPTIONS: -Xms128m -Xmx1024m

-we properly captured the java failure status, as needed.  Looking good.

Java tests succeeded.

###################################




----------------------------------------------------------------------------------

-------------- Trinity Phase 1: Clustering of RNA-Seq Reads  ---------------------

----------------------------------------------------------------------------------


-------------------------------------------
----------- Jellyfish  --------------------
-- (building a k-mer catalog from reads) --
-------------------------------------------

Tuesday, March 17, 2015: 21:40:20 CMD: /home/applications/trinity/2.0.3/trinity-plugins/jellyfish/bin/jellyfish count -t 32 -m 25 -s 17531127145  both.fa
terminate called after throwing an instance of 'jellyfish::large_hash::array_base<jellyfish::mer_dna_ns::mer_base_static<unsigned long, 0>, unsigned long, atomic::gcc, jellyfish::large_hash::array<j$
  what():  Failed to allocate 67527304624 bytes of memory
Error, cmd: /home/applications/trinity/2.0.3/trinity-plugins/jellyfish/bin/jellyfish count -t 32 -m 25 -s 17531127145  both.fa died with ret 134 at /home/applications/trinity/2.0.3/Trinity line 2110.

Trinity run failed. Must investigate error above.
<<<<<<<

###If I try the old solution my script won't run
Error, do not understand options: --JM 70G --bflyHeapSpaceMax 125G --bflyHeapSpaceInit 3900M --bflyCPU 32 --inchworm_cpu 32


Thanks for any insight!

--Hernán

Tiago Hori

Mar 18, 2015, 6:07:01 AM
to mira...@umn.edu, trinityrn...@googlegroups.com
You are looking for the --max_memory flag. Check out the v2 documentation on GitHub.

T.


mira...@umn.edu

Mar 18, 2015, 7:03:25 AM
to trinityrn...@googlegroups.com, mira...@umn.edu

Thanks for responding. That's not it. I did set the maximum memory to 125G. Here is the entire Trinity command I've been using. The issue is still that Jellyfish cannot allocate enough memory, and I can't find in the documentation how to change the memory for Jellyfish, as you used to do with the --JM flag:

Trinity --seqType fq --max_memory 125G --SS_lib_type RF --verbose --left ../1_R1.fastq.1P_TruSeq3.fastq.gz,../2_R1.fastq.1P_TruSeq3.fastq.gz,../3_R1.fastq.1P_TruSeq3.fastq.gz --right ../1_R2.fastq.2P_TruSeq3.fastq.gz,../2_R2.fastq.2P_TruSeq3.fastq.gz,../3_R2.fastq.2P_TruSeq3.fastq.gz --CPU 32 --bflyHeapSpaceMax 125G --bflyHeapSpaceInit 3900M --bflyCPU 32 --inchworm_cpu 32 > run.log 2>&1

Tiago Hori

Mar 18, 2015, 7:06:56 AM
to mira...@umn.edu, trinityrn...@googlegroups.com
--max_memory replaces the --JM flag. I think your problem is that your Butterfly heap space is too large. I think those flags set the heap space per thread, so the roughly 4 GB you are setting times 32 CPUs gets you over 125 GB.
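
The arithmetic behind this suggestion can be sketched in shell. This assumes, as suggested above and not confirmed anywhere in this thread, that each concurrent Butterfly process gets its own Java heap; `heap_total_mb` is a hypothetical helper name.

```shell
# Worst-case Java heap if every Butterfly process claims its own heap.
# Assumption: the heap flags apply per process, not to the run as a whole.
heap_total_mb() {
    # $1 = heap per process in MB, $2 = number of concurrent processes
    echo $(( $1 * $2 ))
}

# --bflyHeapSpaceInit 3900M with --bflyCPU 32, as in the command above
heap_total_mb 3900 32
```

If that assumption holds, the initial heaps alone come close to the whole 125G request, and a per-process --bflyHeapSpaceMax of 125G could never be honored by 32 processes at once.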

T.


mira...@umn.edu

Mar 18, 2015, 8:18:57 AM
to trinityrn...@googlegroups.com, mira...@umn.edu
Thank you Tiago. I appreciate your time.
The same problem persists with Jellyfish. I reduced the number of CPUs to 16 and increased max_memory to 126G, constraining the heap space for Butterfly to 70G, and Jellyfish still crashes. I requested 7G per core so that it doesn't exceed 126G (16X7=112G). How come you can no longer control Jellyfish's memory allocation individually? Reducing the Java heap space for Butterfly seems like it should be its own separate setting...

Do you know if Butterfly applies heapSpaceMax per core or in total? That'd be good to know! Otherwise 50X16 will of course make everything crash. I'll give it a try.

Here are my commands and error outputs if you have more ideas:
Trinity --seqType fq --max_memory 126G --SS_lib_type RF --verbose --left ../1_R1.fastq.1P_TruSeq3.fastq.gz,../2_R1.fastq.1P_TruSeq3.fastq.gz,../3_R1.fastq.1P_TruSeq3.fastq.gz --right ../1_R2.fastq.2P_TruSeq3.fastq.gz,../2_R2.fastq.2P_TruSeq3.fastq.gz,../3_R2.fastq.2P_TruSeq3.fastq.gz --CPU 16 --bflyHeapSpaceMax 50G --bflyHeapSpaceInit 4000M --bflyCPU 16 --inchworm_cpu 16 > run.log 2>&1

Errors:

Paired mode requires bowtie. Found bowtie at: /home/applications/bowtie/1.0.0/bin/bowtie

 and bowtie-build at /home/applications/bowtie/1.0.0/bin/bowtie-build


Found samtools at: /home/applications/samtools/0.1.19/bin/samtools

-since butterfly will eventually be run, lets test for proper execution of java
#######################################
Running Java Tests
Wednesday, March 18, 2015: 07:20:59 CMD: java -Xmx64m -jar /home/applications/trinity/2.0.3/util/support_scripts/ExitTester.jar 0
Picked up _JAVA_OPTIONS: -Xms128m -Xmx1024m
CMD finished (0 seconds)
Wednesday, March 18, 2015: 07:20:59 CMD: java -Xmx64m -jar /home/applications/trinity/2.0.3/util/support_scripts/ExitTester.jar 1
Picked up _JAVA_OPTIONS: -Xms128m -Xmx1024m
-we properly captured the java failure status, as needed.  Looking good.
Java tests succeeded.
###################################



----------------------------------------------------------------------------------
-------------- Trinity Phase 1: Clustering of RNA-Seq Reads  ---------------------
----------------------------------------------------------------------------------

-------------------------------------------
----------- Jellyfish  --------------------
-- (building a k-mer catalog from reads) --
-------------------------------------------

Wednesday, March 18, 2015: 07:21:10 CMD: /home/applications/trinity/2.0.3/trinity-plugins/jellyfish/bin/jellyfish count -t 16 -m 25 -s 17684518834  both.fa
terminate called after throwing an instance of 'jellyfish::large_hash::array_base<jellyfish::mer_dna_ns::mer_base_static<unsigned long, 0>, unsigned long, atomic::gcc, jellyfish::large_hash::array<jellyfish::mer_dna_ns::mer_base_static<unsigned long, 0>, unsigned long, atomic::gcc, allocators::mmap> >::ErrorAllocation'
  what():  Failed to allocate 68118146720 bytes of memory
Error, cmd: /home/applications/trinity/2.0.3/trinity-plugins/jellyfish/bin/jellyfish count -t 16 -m 25 -s 17684518834  both.fa died with ret 134 at /home/applications/trinity/2.0.3/Trinity line 2110.

Trinity run failed. Must investigate error above.
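
Comparing the two logs in this thread suggests that --max_memory does feed into the Jellyfish call: raising it from 125G to 126G raised the `-s` hash size. A minimal sketch of that comparison follows. The values are copied from the two `jellyfish count` commands above; the interpretation (about 7 bytes budgeted per hash entry) is inferred from these numbers only, not taken from Trinity's documentation.

```shell
# -s values copied from the two Jellyfish commands in the logs above
s_125=17531127145   # run with --max_memory 125G
s_126=17684518834   # run with --max_memory 126G

# How much the hash size grew when max_memory grew by 1G
delta=$(( s_126 - s_125 ))

# If the extra 1G (taken as 2^30 bytes) went entirely to the hash,
# this is the implied budget per entry (integer division)
bytes_per_entry=$(( 1073741824 / delta ))

echo "hash size grew by $delta entries"
echo "implied bytes per entry: $bytes_per_entry"
```

So there may be no separate Jellyfish memory knob to look for in v2: on this evidence, the hash size simply follows --max_memory.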

-- Hernán

Tiago Hori

Mar 18, 2015, 8:22:12 AM
to mira...@umn.edu, trinityrn...@googlegroups.com
I am sorry. I thought you were crashing at butterfly for some reason. :)

How many reads do you have?

Every time I have seen that Jellyfish error, it was because I had too many reads.

Also, can I ask why you are not normalizing your data?

T.


Ultimate stresses sportsmanship and fair play. Competitive play is encouraged, but NEVER AT THE EXPENSE OF RESPECT BETWEEN PLAYERS, adherence to the rules and the BASIC JOY OF PLAY.

Tiago S. F. Hori, M.Sc., Ph.D.


mira...@umn.edu

Mar 18, 2015, 8:31:40 AM
to trinityrn...@googlegroups.com, mira...@umn.edu
Yes, fiddling with the --bfly memory parameters didn't work.

Trinity is crashing at the Jellyfish stage. I didn't think I had too many reads, only ~60 million PE reads from 3 libraries (~20M each). Is that too many already? I thought normalization was only necessary with more than a hundred million reads. Do you think normalizing would solve my issue?

Thanks Tiago!

Hernán

Tiago Hori

Mar 18, 2015, 8:36:19 AM
to mira...@umn.edu, trinityrn...@googlegroups.com
Hi Hernan,

125G is more than enough for 60M PE reads. The only thing I can think of is that your cluster is not actually allocating you the memory you need. If you look at your error, Jellyfish is trying to allocate only about 68G of RAM and failing to do so. That leads me to think you have some memory restriction on the cluster. Can you verify with your sys admin?
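
A quick way to put numbers on this from inside the job itself, as a rough sketch: convert the failed allocation sizes from the two error messages, then print the virtual memory limit the shell is running under. `to_gb` and `to_gib` are hypothetical helper names. Checking the virtual memory limit is a guess at where the restriction sits, prompted by the `allocators::mmap` in the error text; the actual cap on a given cluster may be enforced elsewhere.

```shell
# Convert a byte count to whole decimal GB and whole GiB (both truncated)
to_gb()  { echo $(( $1 / 1000000000 )); }
to_gib() { echo $(( $1 / 1073741824 )); }

# Sizes copied from the two "Failed to allocate" messages above
for bytes in 67527304624 68118146720; do
    echo "$bytes bytes = $(to_gb "$bytes") GB = $(to_gib "$bytes") GiB"
done

# Limit the current shell (and anything it launches) runs under.
# "unlimited" means no cap is set at this level.
echo "virtual memory limit (KB): $(ulimit -v)"
```

Running the last line inside the batch job, rather than on a login node, shows what the job was actually given.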

As for normalization, I would always normalize. For one thing, it reduces the memory footprint down the road, but more importantly it reduces the number of spurious transcripts generated from poorly supported k-mers.
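
As a sketch of what turning normalization on might look like, the snippet below only prints a command (a dry run). It assumes this Trinity version accepts `--normalize_reads`, which is worth confirming with `Trinity --show_full_usage_info` before relying on it. `left.fq.gz` and `right.fq.gz` are placeholder file names, not files from this thread.

```shell
# Dry run: print the command instead of launching it.
# Assumption: --normalize_reads is available in this Trinity version.
# left.fq.gz / right.fq.gz are placeholder inputs.
cmd="Trinity --seqType fq --max_memory 125G --SS_lib_type RF --normalize_reads --left left.fq.gz --right right.fq.gz --CPU 16"
echo "$cmd"
```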

T.



mira...@umn.edu

Mar 18, 2015, 8:57:45 AM
to trinityrn...@googlegroups.com, mira...@umn.edu
I was hoping it wasn't something related to my LSF cluster script, but yes, it might be that I'm not getting enough memory for Jellyfish. I'll talk to my cluster people and post the solution here so others don't run into the same wall.
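
The fix itself is never posted in this thread, so the following is only a sketch of the kind of LSF request under discussion. The `#BSUB` values are illustrative placeholders: the units for `-M` and `rusage[mem=...]`, and whether the reservation counts per core or per job, depend on how the site configures LSF, so check with the cluster admins before copying any number.

```shell
#!/bin/bash
# Illustrative LSF header; every value is a placeholder, units are site-dependent.
#BSUB -n 16                    # number of cores
#BSUB -R "span[hosts=1]"       # keep all cores on a single node
#BSUB -R "rusage[mem=8000]"    # memory reservation (per core or per job, depending on the site)
#BSUB -M 128000                # memory limit (units set by the site's LSF configuration)

# Record what the job actually received before starting the assembly.
# LSB_DJOB_NUMPROC is set by LSF inside a job; "unknown" is printed elsewhere.
cores=${LSB_DJOB_NUMPROC:-unknown}
echo "cores granted: $cores"
```

Comparing what is echoed here with what was requested is a cheap first check before asking Trinity for memory the job may not have.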

I most certainly will normalize soon! This assembly in full has ~300 M reads so I must normalize everything.

Thank you for the insight and clues!

Hernán

mira...@umn.edu

Mar 22, 2015, 10:47:21 AM
to trinityrn...@googlegroups.com, mira...@umn.edu
Tiago, you pointed me in the right direction, thank you very much! The issue is the way RAM is requested and allocated on the cluster. I will start a new post with the answer so that people who run into the same issue can find it easily and hopefully solve it.

Hernán

Mohammed Kanchwala

Dec 13, 2016, 11:26:33 AM
to trinityrnaseq-users, mira...@umn.edu
Hello Mira,

Could you please point out your solution to this problem? Maybe a link to the new discussion?

Thanks.

Caroline Judy

May 25, 2017, 11:01:04 AM
to trinityrnaseq-users, mira...@umn.edu
Is there a solution to this problem? 