Resolution <5kb?


Anthony D'Ippolito

Feb 29, 2016, 11:08:43 AM
to 3D Genomics
Hello,

I had a few questions about the resolution settings for Juicer/HiCCUPS:

1. Are there any visualization options for resolutions <5kb? 

2. It doesn't appear that HiCCUPS has an option for calling loops at <5kb resolution. Do you know if it is effective to call loops at a higher resolution, or does the 5kb option already reach the limits of the assay with respect to calling the shortest possible interaction distance?

Thanks for your help!

Tony

Suhas Rao

Feb 29, 2016, 12:00:51 PM
to 3D Genomics
Hi Tony,

re: 1. Currently there aren't any fixed-bin visualization options for resolutions <5kb. You can, however, view maps at fragment-delimited resolutions by clicking on the 'Resolution' header in Juicebox to toggle; for MboI/DpnII, that goes down to ~400bp at single-fragment resolution. We may add fixed-bin resolutions to Juicebox at a later date.

re: 2. There is no inherent reason that HiCCUPS can't be run at resolutions <5kb, other than the fact that those resolutions aren't generated when creating the .hic file. If we do end up adding higher resolutions to .hic files in Juicebox, HiCCUPS could be run at those higher resolutions. However, keep in mind that significant coverage (i.e. ~billions of contacts) would be needed to make loop calls at <5kb resolution.

In addition, since you mentioned that you were interested in making loop calls at the shortest possible interaction distance, keep in mind that the hardest place to make loop calls is very close to the diagonal of a proximity ligation map (between two fragments very close together in 1D), because the interaction background from random polymer fluctuations is so high. As you get closer to the diagonal, the ratio of looping signal to random polymer noise drops, so, counterintuitively, you can often be sure of looping interactions farther from the diagonal well before you can be certain of looping interactions between loci that are very close together in 1D. This is, of course, true of any proximity ligation experiment (3C, 4C, 5C, etc.).

Hope that helps,
Suhas
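
Two points in the reply above can be checked on the back of an envelope: finer bins spread the same reads over many more pixels, and background is highest near the diagonal. The Python sketch below is a toy illustration only. The function names, the 1/s background decay, the `scale` constant, and the Poisson-style z-score are all assumptions made for illustration; none of them come from Juicer or HiCCUPS.

```python
import math


def pixels_per_window(window_bp, res_bp):
    """Number of bin pairs (pixels) covering a window_bp x window_bp map region."""
    bins = math.ceil(window_bp / res_bp)
    return bins * bins


def toy_zscore(extra_contacts, separation_bp, scale=1e6):
    """Toy model (assumption): Poisson background B(s) = scale / s.

    A loop adding `extra_contacts` on top of that background stands out
    by roughly extra / sqrt(B(s)) standard deviations.
    """
    background = scale / separation_bp
    return extra_contacts / math.sqrt(background)


# Going from 5 kb to 1 kb bins spreads the same reads over 25x more pixels.
ratio = pixels_per_window(1_000_000, 1_000) / pixels_per_window(1_000_000, 5_000)
print(ratio)

# The same enrichment is easier to detect 200 kb from the diagonal than 20 kb.
print(toy_zscore(10, 200_000) > toy_zscore(10, 20_000))
```

In this toy model the background term is what makes near-diagonal calls hard: the same number of extra contacts is buried in more noise as the 1D separation shrinks.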

Anthony D'Ippolito

Mar 2, 2016, 4:31:10 PM
to 3D Genomics
Hey Suhas,

Thanks for the info. If I wanted to give loop calling a try at finer (<5kb) resolutions, do you have any suggestions on how I might modify the code for .hic file creation and HiCCUPS loop calling to achieve this? For example, are there specific segments of code I should focus on, or any obvious dependency/resource/performance issues that might arise?

Thanks for your help!
Tony 

Suhas Rao

Mar 3, 2016, 11:10:40 PM
to 3D Genomics
Hey Tony,

If you adjust the static array bpBinSizes (and bpBinSizeNames) to add higher resolutions, that should be sufficient. Nothing needs to be changed in the HiCCUPS code; you'll just need to make sure to pass all the parameters, because defaults aren't set.

Cheers,
Suhas

Anthony D'Ippolito

Mar 7, 2016, 9:15:44 AM
to 3D Genomics
Hey Suhas,

Thanks for the info. I had a question about preparing .hic files to view data at restriction fragment resolution. I'm a little confused about the format of the input data, and the format of the fragment file used with the "-f" option for the juicebox "pre" command. Just to clarify: 

The fragment file should have 1 line per chromosome, starting with the chromosome name, then a tab-delimited list of each chromosome coordinate where the enzyme cuts, ending with the chromosome size, in numerical order.

For the input file, should the "frag" field be N, designating the Nth restriction fragment determined by the cut sites in the fragment file? If so, is this fragment number in the context of each chromosome, or the entire genome? In other words, does the fragment numbering start over for each chromosome?

Thanks for all your help,
Tony 

Neva Durand

Mar 8, 2016, 9:59:45 AM
to Anthony D'Ippolito, 3D Genomics
Hi Tony,

Your description of the fragment file is correct. It doesn't need to be tab-delimited, though tabs will work fine.

Each chromosome starts a new fragment numbering, so the fragment number in the input file is with respect to that chromosome (i.e. it starts over for each chromosome). You should be able to determine the fragment number easily from the chromosome, position, and restriction site file. This is what the Perl script "fragment.pl" does.

Best
Neva
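
The per-chromosome numbering described above can be sketched in a few lines of Python. This is not a port of fragment.pl: the function names are made up for illustration, and both the 0-based indexing and the choice to count a position equal to a cut site as belonging to the next fragment are assumptions, so check them against fragment.pl before relying on the numbers.

```python
from bisect import bisect_right


def parse_fragment_file(lines):
    """Map chromosome name -> sorted cut-site coordinates (last entry = chromosome size)."""
    sites = {}
    for line in lines:
        fields = line.split()  # any whitespace works; tabs are not required
        if fields:
            sites[fields[0]] = [int(x) for x in fields[1:]]
    return sites


def fragment_index(sites, chrom, pos):
    """Fragment index of `pos` on `chrom`; numbering restarts for every chromosome.

    Assumed convention: 0-based, counting the cut sites at or before `pos`.
    """
    return bisect_right(sites[chrom], pos)


# Made-up coordinates, one line per chromosome, ending with the chromosome size.
example = ["chr1\t100\t250\t400", "chr2 80 200"]
sites = parse_fragment_file(example)
print(fragment_index(sites, "chr1", 300))
print(fragment_index(sites, "chr2", 50))
```

A binary search keeps the lookup fast even with hundreds of thousands of cut sites per chromosome, which matters when every read in the input file needs a fragment number.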

--
You received this message because you are subscribed to the Google Groups "3D Genomics" group.
To unsubscribe from this group and stop receiving emails from it, send an email to 3d-genomics...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/3d-genomics/71495daa-15d3-4eaa-83de-56f63bc7f4aa%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Neva Cherniavsky Durand, Ph.D.
Staff Scientist, Aiden Lab

Anthony D'Ippolito

Mar 17, 2016, 10:23:32 AM
to 3D Genomics
Hey Suhas,

I'm trying to figure out the cause of some of the errors I'm getting when running HiCCUPS (I'm using a script with the modifications you suggested to call loops at finer resolutions). Here is the output:

Fri Mar 11 10:06:27 EST 2016
Reading file: file.hic
HiC file version: 8
Running HiCCUPS for resolution 4000
2% 
4% 
6% 
8% 
10% 
12% 
14% 
16% 
18% 
20% 
22% 
25% 
27% 
29% 
31% 
33% 
35% 
37% 
39% 
41% 
43% 
45% 
47% 
50% 
52% 
54% 
56% 
58% 
60% 
62% 
64% 
66% 
68% 
70% 
72% 
75% 
77% 
79% 
81% 
83% 
85% 
87% 
89% 
91% 
93% 
95% 
97% 
100% 
Running HiCCUPS for resolution 5000
2% 
4% 
6% 
8% 
10% 
12% 
14% 
16% 
Data not available for 4 at 5000 resolution
18% 
20% 
22% 
25% 
27% 
29% 
31% 
33% 
35% 
37% 
39% 
41% 
43% 
45% 
47% 
50% 
52% 
54% 
56% 
58% 
60% 
62% 
64% 
66% 
Data not available for 4 at 5000 resolution
68% 
70% 
72% 
75% 
77% 
79% 
81% 
83% 
85% 
87% 
89% 
91% 
93% 
95% 
97% 
100% 
Running HiCCUPS for resolution 10000
2% 
4% 
6% 
8% 
10% 
12% 
14% 
16% 
18% 
20% 
22% 
25% 
27% 
29% 
31% 
33% 
35% 
37% 
39% 
41% 
43% 
45% 
47% 
50% 
52% 
54% 
56% 
58% 
60% 
62% 
64% 
66% 
68% 
70% 
72% 
75% 
77% 
79% 
81% 
83% 
85% 
87% 
89% 
91% 
93% 
95% 
97% 
100% 
24_24 key not found for centroids. NN. Possible error?
Centroid: [10_10, 15_15, 21_21, 9_9, 19_19, 22_22, 16_16, 5_5, 11_11, 6_6, 12_12, 3_3, 23_23, 2_2, 17_17, 1_1, 7_7, 13_13, 8_8, 20_20, 14_14, 18_18]
Actual: [15_15, 10_10, 21_21, 9_9, 19_19, 16_16, 22_22, 5_5, 11_11, 6_6, 12_12, 3_3, 23_23, 2_2, 1_1, 17_17, 7_7, 13_13, 4_4, 24_24, 8_8, 20_20, 14_14, 18_18]
4_4 key not found for centroids. NN. Possible error?
Centroid: [10_10, 15_15, 21_21, 9_9, 19_19, 22_22, 16_16, 5_5, 11_11, 6_6, 12_12, 3_3, 23_23, 2_2, 17_17, 1_1, 7_7, 13_13, 8_8, 20_20, 14_14, 18_18]
Actual: [15_15, 10_10, 21_21, 9_9, 19_19, 16_16, 22_22, 5_5, 11_11, 6_6, 12_12, 3_3, 23_23, 2_2, 1_1, 17_17, 7_7, 13_13, 4_4, 24_24, 8_8, 20_20, 14_14, 18_18]
7547 loops written to file: file.loops
HiCCUPS complete
Fri Mar 11 18:42:34 EST 2016



Thanks for your help,
Tony

Muhammad Shamim

Jun 29, 2016, 8:42:20 PM
to 3D Genomics
(Answered via email; posting response to forum as well)

Also, this appears to be printout from an outdated jar.
Please use the latest jar from:

---------- Forwarded message ----------
From: Muhammad Saad Shamim
Date: Thu, May 5, 2016 at 3:53 PM
Subject: Re: Resolution <5kb?
To: Anthony D'Ippolito


Hey Anthony,

My sincere apologies for the late response.
Were you already able to resolve this issue?

In short, that was an old warning message left in the code which warns if enriched pixels are found at one resolution but not another.
We'll certainly edit the description to be more helpful.

The 4_4 key wasn't found because of the earlier output (Data not available for 4 at 5000 resolution).
You may want to double check that you are able to view chromosome 4 at balanced (KR) normalization using Juicebox - the hic file may not have built completely.

The 24_24 key likely is due to a small number of loops found on chromosome Y at one resolution, but not another.

Please let me know if there is anything else I can help with.
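
To see which chromosomes and resolutions triggered the messages explained above, the log can be scanned mechanically. A minimal Python sketch, assuming the log lines look like the ones quoted in this thread; the regular expressions and function names are illustrative and are not part of Juicer Tools.

```python
import re

# Matches e.g. "Data not available for 4 at 5000 resolution".
MISSING = re.compile(r"Data not available (?:for|at) (\S+) at (\d+) resolution")
# Matches the "Centroid: [...]" and "Actual: [...]" lines of the warning.
KEYS = re.compile(r"^(?:Centroid|Actual): \[(.*)\]$")


def missing_data(log_lines):
    """Set of (chromosome, resolution) pairs the log reports as unavailable."""
    found = set()
    for line in log_lines:
        match = MISSING.search(line)
        if match:
            found.add((match.group(1), int(match.group(2))))
    return found


def keys_missing_from_centroids(centroid_line, actual_line):
    """Keys listed under 'Actual' but absent from 'Centroid'."""
    def parse(line):
        match = KEYS.match(line.strip())
        if not match:
            return set()
        return {key for key in match.group(1).split(", ") if key}
    return parse(actual_line) - parse(centroid_line)


log = [
    "16% ",
    "Data not available for 4 at 5000 resolution",
    "18% ",
    "Data not available for 4 at 5000 resolution",
]
print(missing_data(log))
print(keys_missing_from_centroids(
    "Centroid: [10_10, 15_15, 8_8]",
    "Actual: [15_15, 10_10, 4_4, 24_24, 8_8]",
))
```

Cross-referencing the two results shows which warning keys are explained by missing data (here chromosome 4) and which are not (here 24_24).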

Richard Gill

Mar 12, 2018, 6:08:21 PM
to 3D Genomics
Hello,
Would you please elaborate on how to adjust bpBinSizes and bpBinSizeNames? They don't seem to be allowable flags in hiccups.

Thanks,

Richard

Muhammad Saad Shamim

Mar 12, 2018, 7:08:08 PM
to Richard Gill, 3D Genomics
Hi Richard,

You'd need to build the hic file first using your custom resolutions with the -r flag in pre:

You can then specify those custom resolutions and custom parameters for them with the hiccups flags (specifically the -r f i d p flags):

No need to edit bpBinSizes or bpBinSizeNames.

Best,


Richard Gill

Mar 12, 2018, 7:29:07 PM
to Muhammad Saad Shamim, 3D Genomics
Hi Muhammad,

1) Is there a way to do this using Juicer? I'm running things on AWS. And if the new .hic file can be created from deduped reads, please let me know.

2) What hiccups flags would you recommend for 2-kb bins?

Thanks,

Richard

Muhammad Saad Shamim

Mar 12, 2018, 7:39:46 PM
to Richard Gill, 3D Genomics
Sure,

1) Juicer Tools is the Java jar used in Juicer. Usage is in the previous link; you can call
java -jar juicer_tools.jar pre ... [takes the merged_nodups.txt file as input]
java -jar juicer_tools.jar hiccups ... [flags/parameters]

2) We haven't used 2kb resolution before, so you may need to experiment a bit.
You can also try extrapolating from our existing defaults in the prior link. 
Defaults:
-r 5000,10000,25000 
-f .1,.1,.1 
-p 4,2,1 
-i 7,5,3 
-d 20000,20000,50000 

Maybe try:
-r 2000,5000,10000,25000
-f .1,.1,.1,.1
-p 7,4,2,1
-i 10,7,5,3
-d 20000,20000,20000,50000

Let us know if any bugs or issues are encountered.

Muhammad Saad Shamim

Mar 12, 2018, 8:13:49 PM
to Richard Gill, 3D Genomics
Actually p should be 10 or 12 for 2kb. See VI.a.4. of Rao and Huntley et al. 2014 for additional description of how the parameters were selected.

Richard Gill

Mar 13, 2018, 10:19:12 AM
to Muhammad Saad Shamim, 3D Genomics
Hi Muhammad,

1) Pre worked:
java -jar /opt/juicer/scripts/juicer_tools.7.0.jar pre -r 2000 -q 30 aligned/merged_nodups.txt test_2K.hic hg38
Not including fragment map
Start preprocess
Writing header
Writing body
......................................................................................................................................................................................................................................................................................................................................
Writing footer

Finished preprocess

Calculating norms for zoom BP_2000
Writing expected
Writing norms
Finished writing norms

2) HICCUPS did not work:
java -jar /opt/juicer/scripts/juicer_tools.7.0.jar hiccups -m 1024 -r 2000 -k KR -f .1 -p 10 -i 10 -d 20000 test_2K.hic test_hiccups/
Reading file: test_2K.hic
Discarding invalid configuration: Config res: 2000 peak: 10 window: -1 fdr: 10% radius: 20000
No valid configurations specified, using default settings
Unable to assess map sparsity; continuing with HiCCUPS
Default settings for 5kb, 10kb, and 25kb being used
Running HiCCUPS for resolution 5000
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.elementData(ArrayList.java:422)
at java.util.ArrayList.get(ArrayList.java:435)
at juicebox.data.Dataset.getZoom(Dataset.java:240)
at juicebox.data.Dataset.getZoomForBPResolution(Dataset.java:244)
at juicebox.tools.clt.juicer.HiCCUPS.runHiccupsProcessing(HiCCUPS.java:389)
at juicebox.tools.clt.juicer.HiCCUPS.run(HiCCUPS.java:359)
at juicebox.tools.HiCTools.main(HiCTools.java:98)

Please let me know if I should tweak anything. When I add the other lower resolutions you provided, it says they are not available, because I didn't specify them in the Pre step.

Thanks,

Richard

Suhas Rao

Mar 13, 2018, 1:29:32 PM
to 3D Genomics
Richard, 

You can't set -p and -i to the same value. Read section VI.a.4 of the Experimental Procedures of Rao and Huntley et al, 2014 to understand what the different parameters are. 

I would try setting -p to 10 and -i to 20. 

Cheers,
Suhas
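
The comma-separated flags are positional, so the nth value of -f, -p, -i and -d belongs to the nth resolution in -r. A hedged Python sketch of keeping them aligned follows. The `hiccups_flags` helper is hypothetical, and the check that -i must be larger than -p is an inference from the advice above and from the quoted defaults (where -i always exceeds -p), not a documented rule. The 5kb, 10kb and 25kb rows are the defaults quoted earlier in the thread; the 2kb row uses the -p 10 and -i 20 suggested above, with the tentative -f and -d values proposed earlier.

```python
def hiccups_flags(configs):
    """Turn per-resolution settings into comma-separated -r/-f/-p/-i/-d flags."""
    for cfg in configs:
        # Inferred sanity check (assumption): the window (-i) must be
        # wider than the peak (-p).
        if cfg["i"] <= cfg["p"]:
            raise ValueError(
                f"resolution {cfg['r']}: -i ({cfg['i']}) must exceed -p ({cfg['p']})"
            )
    flags = []
    for name in ("r", "f", "p", "i", "d"):
        flags += [f"-{name}", ",".join(str(cfg[name]) for cfg in configs)]
    return flags


configs = [
    {"r": 2000, "f": 0.1, "p": 10, "i": 20, "d": 20000},
    {"r": 5000, "f": 0.1, "p": 4, "i": 7, "d": 20000},
    {"r": 10000, "f": 0.1, "p": 2, "i": 5, "d": 20000},
    {"r": 25000, "f": 0.1, "p": 1, "i": 3, "d": 50000},
]
print(" ".join(hiccups_flags(configs)))
```

Building the flags from one record per resolution makes it hard to end up with lists of different lengths, which is easy to do when editing five comma-separated strings by hand.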

Muhammad Saad Shamim

Mar 13, 2018, 1:32:20 PM
to Richard Gill, 3D Genomics
Also the .hic file should have all the required resolutions, not just 2000.

pre ... -r 2000,5000,10000,25000,50000,100000,500000,2500000 ...

Muhammad Saad Shamim

Mar 13, 2018, 1:33:12 PM
to Richard Gill, 3D Genomics
(unless you're only running at 2kb and ignoring loop calls at lower resolutions)
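
One way to avoid the mismatch described above is to build the pre command from an explicit resolution list and compare it with the resolutions HiCCUPS will be asked for. A minimal Python sketch; the helper names are hypothetical, and the only pre options used (-r and -q, followed by input file, output file and genome) are the ones that appear in the command quoted earlier in this thread.

```python
def pre_command(jar, input_txt, output_hic, genome, resolutions, mapq=None):
    """Argument list for `juicer_tools pre` with an explicit -r resolution list."""
    cmd = ["java", "-jar", jar, "pre", "-r", ",".join(str(r) for r in resolutions)]
    if mapq is not None:
        cmd += ["-q", str(mapq)]
    return cmd + [input_txt, output_hic, genome]


def unavailable(hiccups_res, pre_res):
    """Resolutions requested for HiCCUPS that pre never wrote to the .hic file."""
    return sorted(set(hiccups_res) - set(pre_res))


pre_res = [2000, 5000, 10000, 25000, 50000, 100000, 500000, 2500000]
cmd = pre_command(
    "juicer_tools.jar", "aligned/merged_nodups.txt", "test_2K.hic", "hg38",
    pre_res, mapq=30,
)
print(" ".join(cmd))

# A file built with only -r 2000 cannot serve the other three resolutions.
print(unavailable([2000, 5000, 10000, 25000], [2000]))
```

The argument list can be passed to `subprocess.run(cmd, check=True)` once the paths are adjusted for the local installation.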

Richard Gill

Mar 13, 2018, 2:07:17 PM
to Muhammad Saad Shamim, 3D Genomics
OK thanks very much Suhas and Muhammad.

Richard

Richard Gill

Mar 16, 2018, 9:28:09 AM
to 3D Genomics
Hi, when I run HICCUPS at 2000,5000,10000,25000 (-f .1,.1,.1,.1 -p 10,4,2,1 -i 20,7,5,3 -d 20000,20000,20000,50000)

The job finishes but I get these messages:

Data not available at 3 at 2000 resolution
Data not available at 3 at 5000 resolution

I don't get these messages at the lower resolutions. Is this due to the parameters or the data? I tried a p of 12 at 2-kb and the error persisted.

Please advise.

Thanks,

Richard

Neva Durand

Mar 16, 2018, 9:31:40 AM
to Richard Gill, 3D Genomics
Check that you have normalization vectors at that resolution (can check via dump or by looking in Juicebox).



Richard Gill

Mar 16, 2018, 9:53:18 AM
to 3D Genomics
OK it looks like the KR algorithm failed to converge for chr3 at the higher resolutions. The solution would be to run HICCUPS with VC normalization instead?

Thanks,

Richard

Neva Durand

Mar 16, 2018, 10:03:32 AM
to Richard Gill, 3D Genomics
Yes, exactly.
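
The fallback confirmed above can be written down as a small helper. This is only a sketch in Python: `pick_norm` and `hiccups_command` are hypothetical names, the set of missing (chromosome, resolution) pairs is assumed to have been collected by hand or from the log, and the command layout copies the hiccups call quoted earlier in the thread with only the -k value changed.

```python
def pick_norm(missing_pairs, preferred="KR", fallback="VC"):
    """Use the fallback normalization if any (chromosome, resolution) pair lacks vectors."""
    return fallback if missing_pairs else preferred


def hiccups_command(jar, hic_file, out_dir, flags, norm="KR"):
    """Argument list for `juicer_tools hiccups` with the chosen -k normalization."""
    return ["java", "-jar", jar, "hiccups", "-k", norm, *flags, hic_file, out_dir]


# Pairs reported as "Data not available" in the run described above.
missing = {("3", 2000), ("3", 5000)}
cmd = hiccups_command(
    "juicer_tools.jar", "test_2K.hic", "test_hiccups/",
    ["-r", "2000,5000,10000,25000"],
    norm=pick_norm(missing),
)
print(" ".join(cmd))
```

Switching the whole run to VC, instead of mixing normalizations per chromosome, keeps the loop calls comparable across chromosomes.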


xian...@gmail.com

Sep 8, 2023, 10:16:30 PM
to 3D Genomics
Hi Neva,

Could you please provide a recommended setting for 1kb resolution (-r 1000) using hiccups? Thanks a lot!

I used:
-k KR -r 1000 -f 0.1 -i 16 -d 2500 -p 22
I got an error message below:
Discarding invalid configuration: Config res: 1000 peak: 22 window: 16 fdr: 10% radius: 2500

Best,
Xiang
