hicpro2juicebox question

TH

Dec 8, 2016, 2:15:29 PM
to HiC-Pro

Hi Nicolas,

I've been a big fan of the HiC-Pro pipeline, which is super fast and reliable. Recently I've become interested in using some of the cool functions in the Juicebox command line tools, e.g. peak calling, APA. First I tried the lazy way: just use the hicpro2juicebox script to convert the validPairs file from HiC-Pro into Juicebox's XXX.hic file. It looks good when I visualize the contact matrix in Juicebox, but I couldn't proceed to other applications like dump, APA, etc. I got errors like this:

"
HiC file version: 8
Exception in thread "main" java.lang.NullPointerException
at juicebox.tools.clt.old.Dump.extractChromosomeRegionIndices(Dump.java:512)
at juicebox.tools.clt.old.Dump.readArguments(Dump.java:412)
at juicebox.tools.HiCTools.main(HiCTools.java:82)

"

Any idea how to solve it? My colleague suggests I redo all the work with Juicer... you know it would be a pain in the ass... In addition, Juicer doesn't seem to be compatible with our institute's clusters. Thanks!

-TH

nservant

Dec 8, 2016, 2:38:43 PM
to HiC-Pro
Hi,
Thank you very much for using HiC-Pro.
And to be honest, I never tried to use the Juicebox functions with the HiC-Pro output... but I will.
Could you please give me an example of a command that you run on the .hic file, so that I can try it and investigate what's going on?
I do not know if Juicebox provides a test .hic file. Did you try to run the same command on a "pure" Juicer .hic file?
Thank you very much
Nicolas

nservant

Dec 8, 2016, 4:29:18 PM
to HiC-Pro
Hi again,

I did some tests and it works well.
First I generated the .hic file from the HiC-Pro output:

>>~/Apps/HiC-Pro_2.7.9/bin/utils/hicpro2juicebox.sh -i test_results/hic_results/data/dixon_2M/dixon_2M_allValidPairs -g ~/Apps/HiC-Pro_2.7.9/annotation/chrom_hg19.sizes -j /bioinfo/local/build/juicebox/juicebox_clt_1.4.jar -r ~/Apps/HiC-Pro_2.7.9/annotation/HindIII_resfrag_hg19.bed

I also shared this file here :
http://zerkalo.curie.fr/partage/HiC-Pro/

Then I ran the Juicebox tools.
>>java -jar /bioinfo/local/build/juicebox/juicebox_clt_1.4.jar dump observed NONE ./dixon_2M_allValidPairs.hic chr1 chr1 BP 1000000 test.txt
HiC file version: 8

It works well ...
>>head test.txt
0    0    1.0
1000000    1000000    4.0
1000000    2000000    3.0
2000000    2000000    1.0
2000000    3000000    2.0
3000000    3000000    5.0
2000000    4000000    1.0
4000000    4000000    6.0
2000000    5000000    1.0
3000000    5000000    1.0

>>java -jar /bioinfo/local/build/juicebox/juicebox_clt_1.4.jar arrowhead -c chr1 -m 2000 -r 40000 -k NONE ./dixon_2M_allValidPairs.hic domains.txt
Reading file: ./dixon_2M_allValidPairs.hic
HiC file version: 8
HiC contact map is too sparse to run Arrowhead, exiting.

Well, here it failed, but this is just because my test data are too small. Otherwise it should work.
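
As a rough check of how much data a dump file holds, something like the sketch below could be used. It is only an illustration: dump_summary is a made-up helper name, and it assumes the three-column text output shown above (bin start, bin start, count) produced with NONE normalization. It says nothing about the threshold Arrowhead itself applies.

```shell
# Hypothetical helper: summarize a sparse "bin1 bin2 count" dump file.
dump_summary() {
    # $1: text file written by the dump command (three columns assumed)
    awk 'NF == 3 { n++; total += $3 }
         END { printf "%d entries, %.0f contacts\n", n, total }' "$1"
}

# Usage (file name taken from the dump command above):
#   dump_summary test.txt
```

Lines that do not have exactly three fields are skipped, so a stray header or log line in the file does not affect the totals.
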
So here are my points :
* Juicebox and the command line tools can be installed independently. I would suggest reinstalling the command line tools (clt) part.
* You should also try the examples on their website. They should work!
http://aidenlab.org/commandlinetools/docs.html
* Which version of hicpro2juicebox.sh are you using? It would be nice to regenerate the .hic file with the latest version.
* In the same way, which version of HiC-Pro are you using? Could you check whether you have all the required fields in the allValidPairs file? i.e.

>>head  test_results/hic_results/data/dixon_2M/dixon_2M_allValidPairs
SRR400264.167192    chr1    802423    +    chr1    807061    -    381    HIC_chr1_209    HIC_chr1_211    37    37   
SRR400264.42007    chr1    802814    -    chr7    25699822    +    306    HIC_chr1_210    HIC_chr7_7361    42    42   
SRR400264.148854    chr1    815430    +    chr4    134939390    +    320    HIC_chr1_212    HIC_chr4_41057    0    37   
SRR400264.148022    chr1    1497612    +    chr1    41645460    +    365    HIC_chr1_288    HIC_chr1_9208    42    37   
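
If it helps, the field count can be checked with a small shell function like the one below. This is a sketch, not part of HiC-Pro: check_valid_pairs is a hypothetical name, and the minimum of 12 tab-separated fields is simply what the example lines above contain.

```shell
# Hypothetical helper: report lines of a validPairs file that have fewer
# than the 12 tab-separated fields seen in the example above.
check_valid_pairs() {
    # $1: path to an allValidPairs file
    # Prints one message per short line; exit status is 1 if any was found.
    awk -F'\t' 'NF < 12 { printf "line %d: %d fields\n", NR, NF; bad = 1 }
                END { exit bad + 0 }' "$1"
}

# Usage (path taken from the head command above):
#   check_valid_pairs test_results/hic_results/data/dixon_2M/dixon_2M_allValidPairs
```

No output and a zero exit status mean every line has at least 12 fields; it does not validate what the fields contain.
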

Nicolas

TH

Dec 8, 2016, 4:48:43 PM
to HiC-Pro

Hi Nicolas,

I've been dumb! I didn't type the "chr" in "chrX" (their website claims we don't need to type it...)!! Now it works beautifully! Thank you very much!
I just want to try their Arrowhead and APA functions. I've tried HOMER's peak calling and hist functions, but the results look weird. Do you have any plans to incorporate these functions into Pro (PowerPro!!)? That would be sweet! No need to convert file formats again and again...

Anyway, it works now!

Best,
TH

nservant

Dec 9, 2016, 4:15:55 AM
to HiC-Pro
Hi,
Yes, take care with the chromosome names.
Actually, when you generate the .hic file, you can specify the organism to Juicebox (hg19, mm9, etc.). In this case, it will use its own annotation file.
However, we realized a few months ago that, depending on the genome, the chromosome names used by Juicebox are not the same. Sometimes you have chr1, chr2, chr3, ... and for other species you have 1, 2, 3, 4.
I guess they are not using the same annotation source (Ensembl or NCBI) for all genomes...

That's why the hicpro2juicebox.sh utility now uses the HiC-Pro annotation (the -g chromosome_size option). It ensures that the chromosome names in your .hic file are the ones used by HiC-Pro and described in the chromosome sizes file.
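
When scripting Juicebox calls, one way to avoid the chr/no-chr mismatch is to look the name up in that same chromosome sizes file before calling the tool. The function below is a minimal sketch under that assumption; resolve_chrom is a hypothetical helper, not something shipped with HiC-Pro or Juicebox.

```shell
# Hypothetical helper: print a chromosome name as it is spelled in the
# first column of a chromosome sizes file. It tries the name as given,
# then the variant with a "chr" prefix added or removed.
resolve_chrom() {
    # $1: chromosome sizes file, $2: name typed by the user (e.g. X or chrX)
    awk -v q="$2" '
        BEGIN { alt = (q ~ /^chr/) ? substr(q, 4) : ("chr" q) }
        $1 == q   { exact = $1 }
        $1 == alt { other = $1 }
        END {
            if (exact != "")      print exact
            else if (other != "") print other
            else                  exit 1   # name not found in either form
        }' "$1"
}

# Usage with the dump command from earlier in this thread:
#   chrom=$(resolve_chrom ~/Apps/HiC-Pro_2.7.9/annotation/chrom_hg19.sizes X) &&
#   java -jar /bioinfo/local/build/juicebox/juicebox_clt_1.4.jar dump observed NONE \
#       ./dixon_2M_allValidPairs.hic "$chrom" "$chrom" BP 1000000 test.txt
```

The lookup only helps if the sizes file is the one that was passed to hicpro2juicebox.sh with -g when the .hic file was built.
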

Regarding your point about adding downstream analysis tools to HiC-Pro, this is currently not planned ;)
My idea was to develop a tool for data processing and quality controls only. One tool for one task!
The recent changes that I made were more about making HiC-Pro compatible with other protocols or C-based techniques such as DNase Hi-C, Micro-C, capture Hi-C, etc.
And other tools are starting to be compatible with HiC-Pro outputs, such as the R package HiTC and the HiCPlotter visualization package.

If you develop other tools to convert the HiC-Pro format, let me know, I would be happy to add them to the utilities ;)
Best
N


