Failure of samtools sort


Matt Romero

May 30, 2022, 7:39:57 AM
to 3D Genomics
Hello,
I am running Juicer (CPU version) on an AWS instance and ran into the following error:
[bam_sort_core] merging from 1216 files and 64 in-memory blocks...
[E::hts_open_format] Failed to open file "./samtools.225856.9824.tmp.1021.bam" : Too many open files
samtools sort: fail to open "./samtools.225856.9824.tmp.1021.bam": Too many open files
***! Failure during chimera handling of /data/juicer/splits/HiC-lib-Combined-H9-SMPC-I4_S2_L003_001.fastq.gz

I added the "-m 3G" flag to the samtools sort command in the juicer script, and it is currently running. Will this fix the issue? If not, is there something else I can try? I'm using an AWS instance with 256 GB of memory and 64 vCPUs.
Thanks!
-Matt
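
For context on why -m might help: samtools sort spills sorted chunks to temporary files whenever its memory budget fills up (-m is per thread, 768 MiB by default), and the merge step shown in the log then has to open those files together. A larger -m should therefore mean fewer temporary files. The sketch below is a rough estimate only. It assumes the temp-file count scales inversely with -m and that the failing run used the default value, neither of which the log confirms. The function name is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: scale an observed temp-file count to a new -m value.
# Assumption: the file count is inversely proportional to per-thread memory.
estimate_tmp_files() {
    local observed_files=$1 old_mem_mib=$2 new_mem_mib=$3
    echo $(( observed_files * old_mem_mib / new_mem_mib ))
}

# 1216 files comes from the log above; 768 MiB is the samtools sort default for -m.
estimate=$(estimate_tmp_files 1216 768 3072)
echo "estimated temp files with -m 3G: ${estimate}"
echo "open-file limit in this shell:   $(ulimit -n)"
```

With the numbers from the log the estimate is 304 files, which would sit well below a typical default limit of 1024 open files.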

Moshe Olshansky

Jun 1, 2022, 8:13:48 AM
to 3D Genomics
Hi Matt,

I think that at the moment the only option is to do what you did, i.e. edit the -m flag in juicer.sh. We should probably add a command-line flag for this, or even set it automatically based on the number of threads and the available RAM.

Best regards,
Moshe.
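
One way such an automatic setting could look, purely as a sketch: divide a share of the available RAM by the thread count and pass the result to -m. The 75% headroom factor and the function name are assumptions for illustration, not anything juicer.sh does today, and 256 GB is treated as 256 x 1024 MiB for simplicity.

```shell
#!/usr/bin/env bash
# Hypothetical helper: per-thread memory (MiB) to pass to samtools sort -m.
# Arguments: total RAM in MiB, thread count, percent of RAM to give to sorting.
per_thread_mem_mib() {
    local total_mib=$1 threads=$2 percent=$3
    echo $(( total_mib * percent / 100 / threads ))
}

# Matt's instance: 256 GB of memory and 64 vCPUs; 75% is an assumed safety margin.
mem=$(per_thread_mem_mib $(( 256 * 1024 )) 64 75)
echo "samtools sort -@ 64 -m ${mem}M"
```

For that instance the result is 3072M, the same 3G that Matt chose by hand.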

苏茜薇

Jul 3, 2022, 1:59:36 AM
to 3D Genomics
Hi, did "-m 3G" in samtools sort work for you? I ran into the same problem, but after I added "-m 3G" the sort has been running for over 70 hours and still hasn't finished. My SAM file is ~700 GB, and I'm not sure whether this is normal. If it worked for you, how long did samtools sort take, and how big was your SAM file?
Thanks for your patience!

Matt Romero

Jul 10, 2022, 7:43:35 PM
to 3D Genomics
I did have success with the -m 3G addition to the script. I also set the number of threads for the juicer script with the -t flag. There is a separate -T flag that tells the juicer script how many threads to use during .hic file generation, so be careful not to mix the two up. Maybe this would help, if you haven't done this already.

Jacob SS

Aug 11, 2023, 8:24:59 PM
to 3D Genomics
Hi Matt and Moshe,

I am using Juicer 2.0 and am running into a similar error. I have 3 technical replicates (1F and 1R for each) in the fastq directory. When I run the following command (modified to use "path/to" instead of the full path):

"juicer.sh -d path/to/Juicer -t 20  -z path/to/Pt_aquil_hifiasm_output_may8_nohic.asm.bp.p_ctg.fa  -D path/to/juicer/ -p path/to/chrom.sizes   &> juicer.log"

Juicer completes all alignment and chimera handling for replicates 1 and 2, but then fails on the third technical replicate with this error:

"merging from 1400 files and 20 in-memory blocks...
[E::hts_open_format] Failed to open file ./samtools.20402.5490.tmp.1021.bam
samtools sort: fail to open "./samtools.20402.5490.tmp.1021.bam": Too many open files
***! Failure during chimera handling of..."


My open-file limit is 2048 (set with "ulimit -n 2048").


I have not tried adding -m 3G to the samtools sort command in juicer.sh, partly because I am unsure exactly where it belongs and I do not want to break the script.


Thank you for your help,

Jacob
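
Two things might be worth checking here, sketched below under some assumptions. First, the failure at temp file number 1021 would be consistent with an effective limit of 1024 in the shell that actually runs samtools, rather than 2048; that is an inference from the log, not something it states. Second, grep can show where samtools sort is called in juicer.sh, so that -m can be added among the options on that line. The helper name and the margin of 16 spare descriptors are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical check: is the open-file limit high enough to merge N temp files?
# The margin covers stdin/stdout/stderr and other descriptors (16 is a guess).
limit_is_sufficient() {
    local needed_files=$1 limit=$2 margin=16
    [ "$limit" = "unlimited" ] && return 0
    [ $(( needed_files + margin )) -le "$limit" ]
}

# Run these in the same shell or job script that launches juicer.sh.
echo "soft limit: $(ulimit -Sn), hard limit: $(ulimit -Hn)"

if limit_is_sufficient 1400 "$(ulimit -Sn)"; then
    echo "limit looks sufficient for 1400 temp files"
else
    echo "limit too low; consider: ulimit -Sn \"\$(ulimit -Hn)\""
fi

# To find the line to edit (the path is a placeholder):
#   grep -n "samtools sort" path/to/juicer.sh
```

Raising the soft limit only works up to the hard limit, so if the hard limit is also low it would need to be changed by an administrator or in the system's limits configuration.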
