Hi-C map is too sparse to find many domains via Arrowhead

antarik...@gmail.com

Feb 8, 2021, 3:34:36 PM
to 3D Genomics
Hi Neva,

My .hic.txt file has ~250 million contacts. Converting it to .hic results in ~140 million intra-chromosomal contacts. However, running juicer_tools arrowhead gives a warning that the matrix is too sparse: "Warning: Hi-C map is too sparse to find many domains via Arrowhead"

Also, ignoring sparsity does not result in any output.

Could you please suggest the minimum number of contacts required for juicer_tools analysis, and how much data that would correspond to?
Attached are the straw output of my .hic file and the Arrowhead output.


Thanks in advance
Ant
arrowhead_out.txt
read_hic_header.txt
straw_out_all_chromosomes.txt
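
For readers who want to reproduce the intra- vs inter-chromosomal tally mentioned above, here is a minimal sketch. It assumes the contact list is whitespace-delimited with the two chromosome names in fixed columns; the defaults (columns 1 and 5, zero-based) follow the Juicer "short" format (str1 chr1 pos1 frag1 str2 chr2 pos2 frag2), which may not match the .hic.txt produced by a given pipeline. The function name `count_contacts` and the example rows are hypothetical.

```python
from collections import Counter
from typing import Iterable


def count_contacts(lines: Iterable[str], chr1_col: int = 1, chr2_col: int = 5) -> Counter:
    """Tally intra- vs inter-chromosomal contacts in a whitespace-delimited contact list.

    Column indices are zero-based and are an assumption about the file layout.
    """
    counts = Counter()
    needed = max(chr1_col, chr2_col)
    for line in lines:
        fields = line.split()
        # Skip blank lines, comment lines, and rows too short to hold both columns.
        if not fields or fields[0].startswith("#") or len(fields) <= needed:
            continue
        kind = "intra" if fields[chr1_col] == fields[chr2_col] else "inter"
        counts[kind] += 1
    return counts


# Tiny made-up rows for illustration only (not real data).
example = [
    "0 chr1 1000 0 16 chr1 52000 1",
    "0 chr1 3000 0 0 chr2 9000 1",
    "16 chr2 500 0 0 chr2 70500 1",
]
summary = count_contacts(example)
print(summary["intra"], summary["inter"])
```

For a real file, pass an open file handle instead of the list, for example `with open(path) as fh: count_contacts(fh)`, where `path` points at the contact list.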

Neva Durand

Feb 8, 2021, 3:45:35 PM
to antarik...@gmail.com, 3D Genomics
What command did you use? The filename for the arrowhead output is strange.


--
Neva Cherniavsky Durand, Ph.D. | she, her, hers
Assistant Professor |  Molecular and Human Genetics
Aiden Lab | Baylor College of Medicine

Antariksh Tyagi

Feb 8, 2021, 3:57:40 PM
to Neva Durand, 3D Genomics
This is the command

java -Xmx500g -jar juicer_tools_1.22.01.jar arrowhead --threads 24 -r 50000 --ignore-sparsity NlaIII_run01_UCSC_hg38.hic arrowhead_out

arrowhead_out is the output directory.

Thanks

--
antariksh

Neva Durand

Feb 8, 2021, 4:09:35 PM
to Antariksh Tyagi, 3D Genomics
You should not do it at 50kb resolution. Get rid of the “-r” flag. 

Antariksh Tyagi

Feb 8, 2021, 5:27:06 PM
to Neva Durand, 3D Genomics
I did rerun:
java -Xmx100g -jar juicer_tools_1.22.01.jar arrowhead --ignore-sparsity --threads 8 hic_file/NlaIII_run01_UCSC_hg38.hic juicertools_out

Got the same stdout message:

0 domains written to file: /lower_bay/home/antariksh.tyagi/softwares/Pore-C-Snakemake/combined_replicates_all_runs_till_Dec_2020_post_analysis/juicertools_out/10000_blocks.bedpe

And there is no output file (10000_blocks.bedpe) at this path.
Just to let you know, this is a Pore-C run.

Thanks


--
antariksh

Muhammad Saad Shamim

Feb 9, 2021, 5:41:25 PM
to Antariksh Tyagi, Neva Durand, 3D Genomics
Hey Antariksh,

Have you looked at the hic file in Juicebox? 
It seems the map is too sparse for Arrowhead to find any statistically significant domains.

Best,


Antariksh Tyagi

Feb 9, 2021, 6:10:56 PM
to Muhammad Saad Shamim, Neva Durand, 3D Genomics
Hi,

Yes, I can view the file in Juicebox. Here is a screen grab:
image.png

Thanks
--
antariksh

Muhammad Saad Shamim

Feb 9, 2021, 6:15:12 PM
to Antariksh Tyagi, Neva Durand, 3D Genomics
What does it look like at 5kb for a particular region?
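
As a rough way to reason about sparsity at a given bin size, the sketch below estimates the average number of contact ends landing in each genomic bin. It uses the ~140 million intra-chromosomal contacts reported earlier in the thread and assumes a genome length of about 3.1 Gb for hg38 (an approximation, not a value from the thread). This is back-of-the-envelope arithmetic only: it is not the sparsity test Arrowhead applies internally, and the function name is hypothetical. For context, a commonly cited heuristic for map resolution (Rao et al., 2014) asks for at least 1,000 contacts in 80% of bins.

```python
def mean_contact_ends_per_bin(n_contacts: int, genome_length_bp: int, resolution_bp: int) -> float:
    """Average number of contact ends per bin, assuming contacts spread evenly.

    Each contact has two ends, so it contributes to two bins (or twice to one bin).
    """
    n_bins = -(-genome_length_bp // resolution_bp)  # ceiling division
    return 2 * n_contacts / n_bins


INTRA_CONTACTS = 140_000_000      # from the thread (~140 million intra-chromosomal contacts)
GENOME_LENGTH = 3_100_000_000     # assumption: approximate hg38 length in bp

for resolution in (5_000, 10_000, 50_000):
    avg = mean_contact_ends_per_bin(INTRA_CONTACTS, GENOME_LENGTH, resolution)
    print(f"{resolution:>6} bp bins: about {avg:.0f} contact ends per bin on average")
```

Real maps are far from uniform (coverage varies by region and most contacts fall near the diagonal), so an average like this only indicates the order of magnitude at each resolution.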

Antariksh Tyagi

Feb 9, 2021, 6:37:20 PM
to Muhammad Saad Shamim, Neva Durand, 3D Genomics
Ok, so here are some for random regions on Chr 1
image.png

image.png

image.png

Let me know if you want me to pull up any other region/resolution

--
antariksh

Antariksh Tyagi

Feb 11, 2021, 10:41:44 AM
to Muhammad Saad Shamim, Neva Durand, 3D Genomics
Hello Muhammad Saad and Neva,

What would you suggest after seeing these matrices? Do they look too sparse?

Thanks
Ant
--
antariksh

alan zhou

Sep 14, 2023, 12:19:09 PM
to 3D Genomics
Hey,
I am getting the same issue. I went through all the conversations here about Arrowhead finding 0 domains, but couldn't find a method to solve it. My data has around 6 M contacts. The inter_30.hic file is larger than 10 GB, but I still get no output from Arrowhead. I also tried --ignore-sparsity, with the same result. I was able to see the contacts using Juicer visualization.
Another concern: I tried the example data HIC003_R1.fastq.gz and still got empty output from Arrowhead. Is there a standard output for the HIC003 demo data that we can use to check our installation?
Thanks