Juicer pre slow and .hic file unreadable in Juicebox


ricardogue...@gmail.com

Mar 27, 2019, 7:14:30 AM
to 3D Genomics

Hello,

I only want one simple thing: quick heatmaps of contact probability for multiple genome assemblies. But Juicer pre takes multiple weeks to finish creating a ~3 GB .hic file, and then this file does not load any heatmap in Juicebox.

What are the limits of Juicer? I am trying to apply it to a genome assembly with 23,000 contigs, an N50 of 500 kb, an L50 of 33 and an L75 of 309 contigs. Is this too many contigs?

I am now running it on LSF (with which I am not very familiar). Before that, it took 5 weeks on CPU on a cluster without a queuing system.

The bsub commands run without errors (for long_queue I chose multicore20, with a maximum of 20 cores).

Here is the resource consumption:


Can you help me? I just want quick heatmaps to get a general idea of assembly errors; I don't need anything else.

Cheers,
Ricardo

ricardogue...@gmail.com

Mar 27, 2019, 7:38:03 AM
to 3D Genomics

Additional information:

- I think the problem is that pre is barely using any RAM. Is it supposed to be that way?

- I am running juicer.sh with the -x flag to make it faster. Here is the command:

LSF/scripts/juicer.sh -p fastq/salsa_it2_cleaned_ChrIDs.txt -x -S final -g Potato_salsa \
    -z references/salsa/scaffolds_it2_cleaned.fasta \
    -y restriction_sites/salsa_it2_cleaned_DpnII.txt

Olga Dudchenko

Mar 27, 2019, 7:12:55 PM
to 3D Genomics
Hi Ricardo,

Please see page 5 of Genome Assembly Cookbook, hosted here, on how to visualize draft genomes.

Best,
Olga

ricardogue...@gmail.com

Mar 28, 2019, 11:16:21 AM
to 3D Genomics
Thanks Olga!

This is much better. So what is the difference between the .hic file created by juicer pre and the one created by run-assembly-visualizer.sh? The speed difference is striking.


Kind regards,
Ricardo

Olga Dudchenko

Mar 28, 2019, 8:14:31 PM
to 3D Genomics
Ricardo,

The 3D-DNA visualizer is a wrapper around juicer pre that does not try to handle the boundaries of scaffolds as individual entities. Rather, the whole assembly is treated as a single sandbox, and the scaffold boundaries are tracked by the .assembly file. See the tutorial video on dnazoo.org/methods for instructions on how to load it.

Best,
Olga
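
For readers wondering what "a single sandbox" means in practice, here is a minimal Python sketch of the general idea. It is not the 3D-DNA implementation and does not read or write the real .assembly format. The function names (build_offsets, to_global, to_scaffold) and the scaffold lengths are hypothetical, chosen only for illustration. The assumption is that every scaffold is laid end to end on one long coordinate axis, so the heatmap has one "chromosome" instead of thousands, and a separate table of start offsets keeps track of where each scaffold begins.

```python
from bisect import bisect_right


def build_offsets(scaffolds):
    """Lay scaffolds end to end; return parallel lists of names and global starts.

    `scaffolds` is a list of (name, length) tuples in assembly order.
    """
    names, starts = [], []
    cursor = 0
    for name, length in scaffolds:
        names.append(name)
        starts.append(cursor)  # 0-based global start of this scaffold
        cursor += length
    return names, starts


def to_global(names, starts, scaffold, pos):
    """Convert a 0-based position on a scaffold to a global coordinate."""
    return starts[names.index(scaffold)] + pos


def to_scaffold(names, starts, global_pos):
    """Convert a global coordinate back to (scaffold, 0-based position)."""
    i = bisect_right(starts, global_pos) - 1
    return names[i], global_pos - starts[i]


# Hypothetical example: three scaffolds with made-up lengths.
scaffolds = [("scaf1", 1000), ("scaf2", 500), ("scaf3", 250)]
names, starts = build_offsets(scaffolds)

g = to_global(names, starts, "scaf2", 10)
print(g)                                # 1010
print(to_scaffold(names, starts, g))    # ('scaf2', 10)
```

With this layout a contact between two scaffolds becomes a pair of positions on the same axis, and the boundary table is consulted only when the scaffold edges need to be drawn or looked up.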