Dear Juicer team,
I’m running Juicer 2.0 in SLURM mode on in-situ Hi-C data from 19 libraries. Eighteen processed successfully, but one is repeatedly OOM-killed during the merging, sorting, dedup, dup-split, or dup-merge steps, even with up to 512 GB RAM and 16–48 CPUs on partitions that allow >900 GB. The same pipeline and parameters work on all other samples.
Key facts:
Input size: ~400 million read pairs, producing 400–500 GB SAM files.
Merge/sort failures:
samtools sort -t cb -n -O SAM -@ 8 -m 2G … > …sam
→ Killed … OOM_kill event
Dup-split failures:
Split jobs finish at ~700 K reads (expected ~940 M), so no shards are produced → no merged_dedup.sam.
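For context on the sort failure above: samtools sort allocates roughly its -m per-thread buffer for each of the -@ threads, plus merge overhead, so the command shown needs well over 16 GB before the SLURM request is even considered. A minimal sketch of that arithmetic (values taken from the command above; the overhead comment is an assumption, not a measured figure):

```shell
# samtools sort uses roughly (-@ threads) x (-m per-thread) of RAM,
# plus extra for merging temporary files; uncompressed -O SAM output
# also inflates temp-file I/O on a ~400-500 GB input.
THREADS=8
PER_THREAD_G=2
TOTAL_G=$((THREADS * PER_THREAD_G))
echo "sort buffers alone need ~${TOTAL_G}G; request SLURM --mem comfortably above this"
```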
Dup-merge failures:
Auto-generated merge script requests --mem=50G → OOM kill.
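As a stopgap for the 50 GB request above, I have been editing the generated script before resubmitting. A sketch of that workaround (the script path and the 256G value are placeholders; the real script lives in the run’s output directory):

```shell
# Hypothetical sketch: raise the --mem request in Juicer's auto-generated
# dup-merge script before resubmitting it with sbatch. A temp file stands
# in for the real generated script here.
script=$(mktemp)
printf '#!/bin/bash\n#SBATCH --mem=50G\n' > "$script"
sed -i 's/--mem=50G/--mem=256G/' "$script"
grep -- '--mem' "$script"
```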
Dup-check error:
***! Error! The sorted file and dups/no dups files do not add up, or were empty.
Partition & resources:
Tested wall times of 24, 48, and 72 h and 256–512 GB RAM on both intel-g4 and edr9; intel-g4 often OOMs, while edr9 succeeds.
Attempts:
Increased RAM demand gradually from 96 GB to 512 GB.
Applied the “--mem=8g” fix from issue #345 ( https://github.com/aidenlab/juicer/issues/345 ) to the dup-split SBATCH directives; this resolved OOM kills for other files but had no effect on this one.
Verified FASTQ integrity (line counts, headers, quickcheck).
No evidence of file corruption or missing site file.
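The integrity checks mentioned above amount to something like the following sketch (the file here is a tiny synthetic FASTQ; substitute the library’s actual files):

```shell
# Sketch of the FASTQ sanity checks: total line count must be divisible
# by 4, and every record header line must start with '@'.
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGC\n+\nIIII\n' > sample_R1.fastq
lines=$(wc -l < sample_R1.fastq)
reads=$((lines / 4))
bad=$(awk 'NR%4==1 && $0 !~ /^@/' sample_R1.fastq | wc -l)
echo "reads=$reads malformed_headers=$bad"
```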
Questions:
Is there a known Juicer fix or workaround beyond adjusting --mem in split_rmdups_sam.awk?
How can I determine if the problem lies in the sequence or in Juicer’s handling of this specific sample?
Any suggestions or workarounds for getting this file through the pipeline would be much appreciated.
Thanks in advance,
Amey Vaijapurkar.