Good day,
I'm running juicer_tools pre (v1.22.01) from the juicer.sh script (v1.6) on a cluster.
The script has finished creating merged_nodups.txt and is currently writing my .hic files. The problem is that it has been more than a week since it started writing them.
I didn't realize that juicer.sh doesn't parallelize this step, even though juicer_tools pre is capable of it, until I noticed how long the step was taking.
Consequently, I'd like to re-run juicer_tools pre with parallelization enabled, but I'm not sure which parameters to use to optimize it.
What would be suitable values for (1) -j (number of CPU threads), (2) --threads (number of threads), (3) the free memory, and (4) the virtual memory needed per CPU, given that my merged_nodups.txt is 241 GB, my genome assembly is about 2.4 Gb, and I'm running on a cluster with ample resources?
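For reference, this is roughly the command I'm planning to run. The -j and --threads flags are the ones I mentioned above; the thread count, the Java heap size (-Xmx), and the jar/file names are placeholders I would adjust based on your advice, not values I'm confident in.

```shell
# Sketch of a parallel juicer_tools pre run -- resource numbers are
# placeholders to be tuned, not recommendations.
java -Xmx64g -jar juicer_tools.1.22.01.jar pre \
    -j 16 \
    --threads 16 \
    merged_nodups.txt \
    output.hic \
    mygenome.chrom.sizes
```

In particular, I'm unsure how -Xmx should scale with the number of threads when the input is this large (241 GB).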
Thank you for your comments
Andrea