Error when generating hic file


不息淑

Sep 5, 2023, 11:51:40 PM
to 3D Genomics
Hello everyone,
        When I use the latest version of juicer_tools.2.20.00.jar to generate the hic file, I get the following error:

 WARN [2023-09-05T14:37:07,686]  [Globals.java:138] [main]  Development mode is enabled
Using 200 CPU thread(s) for primary task
Using 10 CPU thread(s) for secondary task
No mndIndex provided
Using single threaded preprocessor
Not including fragment map
Start preprocess
Writing header
Writing body
..
Writing footer
nBytesV5: 91158102
masterIndexPosition: 12502261790

Finished preprocess

Calculating norms for zoom BP_312500java.lang.NullPointerException
        at juicebox.data.iterator.ListOfListIterator.hasNext(ListOfListIterator.java:44)
        at juicebox.data.iterator.IteratorContainer.getNumberOfContactRecords(IteratorContainer.java:54)
        at juicebox.data.iterator.ListOfListIteratorContainer.getIsThereEnoughMemoryForNormCalculation(ListOfListIteratorContainer.java:56)
        at juicebox.tools.utils.norm.NormalizationCalculations.<init>(NormalizationCalculations.java:59)
        at juicebox.tools.utils.norm.GenomeWideNormalizationVectorUpdater.getWGVectors(GenomeWideNormalizationVectorUpdater.java:167)
        at juicebox.tools.utils.norm.GenomeWideNormalizationVectorUpdater.updateHicFileForGWfromPreAddNormOnly(GenomeWideNormalizationVectorUpdater.java:132)
        at juicebox.tools.utils.norm.NormalizationVectorUpdater.updateHicFile(NormalizationVectorUpdater.java:159)
        at juicebox.tools.clt.old.AddNorm.launch(AddNorm.java:83)
        at juicebox.tools.clt.old.PreProcessing.run(PreProcessing.java:185)
        at juicebox.tools.HiCTools.main(HiCTools.java:97)

My command line is:
java -Xms500000m -Xmx700000m -jar juicer_tools.2.20.00.jar pre -q 30 -r 312500,125000,62500,31250,12500,6250,3125,1250,625,125 ../temp.ST.contigs.0.asm_mnd.txt ST.contigs.0.hic test.chrom.sizes --threads 200

My input file looks like this:
$ head ../temp.ST.contigs.0.asm_mnd.txt
0 assembly 858708677 0 16 assembly 858708657 1 30 23S127M R1 33 11S139M R2 Id1 Id2
0 assembly 858708675 0 16 assembly 858708657 1 34 7S137M R1 34 7S137M R2 Id1 Id2
0 assembly 858708680 0 16 assembly 858708652 2 22 150M R1 43 141M9S R2 Id1 Id2
0 assembly 858708678 0 16 assembly 858708652 2 35 29S121M R1 42 150M R2 Id1 Id2
0 assembly 858708677 0 16 assembly 858708646 2 35 26S124M R1 32 150M R2 Id1 Id2
0 assembly 858708677 0 16 assembly 858708638 2 35 26S124M R1 8 150M R2 Id1 Id2
0 assembly 858708676 0 16 assembly 858708653 2 34 17S133M R1 43 150M R2 Id1 Id2
0 assembly 858708678 0 0 assembly 858701042 160 32 150M R1 60 54M2D96M R2 Id1 Id2
0 assembly 858708679 0 0 assembly 858697218 268 0 47M103S R1 60 150M R2 Id1 Id2
0 assembly 858708674 1 16 assembly 858708654 2 43 150M R1 43 150M R2 Id1 Id2

$ cat test.chrom.sizes
assembly       2068158601

test.chrom.sizes was calculated based on this command: "bash ${juicebox} pre -q ${mapq} ${add_options} ${remapped_mnd} ${genomeid}${mapqsuf}.hic <(echo "assembly "$((totlength / scale)))"
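
The `totlength / scale` expression in the quoted command can be illustrated with a small sketch. It assumes the scale factor is picked so that the scaled assembly length stays under a cap of 2,100,000,000 (just below the signed 32-bit limit); that cap, the function names, and the total length used here are assumptions for illustration and are not taken from this thread.

```python
# Sketch of the size-scaling arithmetic; the cap and the inputs are assumptions.
MAX_SCALED_LENGTH = 2_100_000_000  # assumed cap, just under the signed 32-bit limit


def scale_factor(total_length: int, cap: int = MAX_SCALED_LENGTH) -> int:
    """Integer divisor chosen so that total_length // scale stays below cap."""
    return 1 + total_length // cap


def chrom_sizes_line(total_length: int, name: str = "assembly") -> str:
    """Build the single chrom.sizes line for the concatenated assembly."""
    scale = scale_factor(total_length)
    return f"{name} {total_length // scale}"


if __name__ == "__main__":
    # Hypothetical total length (about 16.5 Gb), chosen for illustration only.
    total = 16_545_268_808
    print(scale_factor(total))      # 8
    print(chrom_sizes_line(total))  # assembly 2068158601
```

Under these assumptions, a genome of roughly 16 Gb is represented at one eighth of its real length, which is in line with the single "assembly" entry in the sizes file above.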

In addition, my contig-level genome is very fragmented, with about 10,000 contigs, and the genome size is 16 Gb.

Can anyone help me out? It would be much appreciated.



Olga Dudchenko

Sep 8, 2023, 3:39:47 PM
to 3D Genomics
Consider using the 3d-dna visualize script built for the purpose. -Olga

不息淑

Sep 24, 2023, 10:10:54 PM
to 3D Genomics
OK, thanks a lot. I solved the problem with the latest version of run-asm-visualizer.sh. My command was:
/software/3d-dna-201008/visualize/run-asm-visualizer.sh -r 2500000,1000000,500000,250000,100000,50000 -q 30 -m temp.ST.contigs.0.asm_mnd.txt ST.contigs.0.cprops ST.contigs.0.assembly merged_nodups.txt

The log file is as follows:
:) -r flag was triggered, starting calculations for resolution list: 2500000,1000000,500000,250000,100000,50000
:) -q flag was triggered, starting calculations for 30 threshold mapping quality
:) Skipping remap step and using temp.ST.contigs.0.asm_mnd.txt as premapped input
...Building track files
:( Assembly file does not match cprops file. Exiting!
...Building the hic file
Sep 11, 2023 2:20:38 AM java.util.prefs.FileSystemPreferences$1 run
INFO: Created user preferences directory.

Not including fragment map
Start preprocess
Writing header
Writing body
..
Writing footer

Finished preprocess
HiC file version: 8

Calculating norms for zoom BP_156250
Calculating norms for zoom BP_62500
Calculating norms for zoom BP_31250
Calculating norms for zoom BP_15625
Calculating norms for zoom BP_6250
Calculating norms for zoom BP_3125
Writing expected
Writing norms
Finished writing norms

The resulting file can indeed be imported into the latest version of Juicebox for adjustment. Thank you again.

Olga Dudchenko

Oct 10, 2023, 6:23:45 PM
to 3D Genomics
You are using the asm visualizer with the assembly file. This is causing the error. Use either the run-assembly visualizer with the assembly file, or the run-asm visualizer with the cprops/asm files (older version).

Nasreen Bano

Dec 18, 2023, 4:59:40 PM
to 3D Genomics
Hi all,

When I use the latest version of juicer_tools.2.20.00.jar to generate the hic file, I get the errors below:

WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
WARN [2023-12-18T11:04:30,987]  [Globals.java:138] [main]  Development mode is enabled
Using 1 CPU thread(s) for primary task
Using 10 CPU thread(s) for secondary task
Start preprocess
Writing header
Writing body
............java.lang.ArrayIndexOutOfBoundsException: Index -1 out of bounds for length 23
        at juicebox.data.ChromosomeHandler.getChromosomeFromIndex(ChromosomeHandler.java:314)
        at juicebox.tools.utils.original.MatrixPP.<init>(MatrixPP.java:73)
        at juicebox.tools.utils.original.Preprocessor.writeBody(Preprocessor.java:756)
        at juicebox.tools.utils.original.Preprocessor.preprocess(Preprocessor.java:452)
        at juicebox.tools.clt.old.PreProcessing.run(PreProcessing.java:176)
        at juicebox.tools.HiCTools.main(HiCTools.java:97)


My command is:

/appl/juicer-1.6/scripts/juicer_tools pre -r 500,1000,2500,5000,10000,15000,25000,50000,100000,250000,500000,1000000 merged_nodups.txt YY1FF.hic sizes2.genome -f mm10_GATC_GANTC.txt

My input file is:

$ head merged_nodups.txt
0 chr1 3000148 0 16 chr1 3000446 2 60 43S108M TCTGTATGCTTCCCTGCAGAGCTCTCACCTGCTTAATAAGATCGATCCTGATTTAGCTTTGGTACCTGGTATCTGTCTAGGAAGTTGTCCATTTCATCCAGGTTTTCCTGGTTTTTTTTTAGTATAGCCTTTCATAGTAGAATCTGATGAT 60 151M GATGTTTTTGATATCCTCATGTTCTGTTGTTATGTCTCCTTTTTCATTTCTGATTTTGTTAATTATAGTACAGTCCCTATGCCCTCTAGTTAGTCTGGCTAAGGGTTTATCTATCTTGTTGACTTTCTCAAAGAACCAGCTACTAGTTTGG VH00824:3:AAATCGGHV:1:2506:44173:1303/1 VH00824:3:AAATCGGHV:1:2506:44173:1303/2 

0 chr1 3000162 0 16 chr1 3000493 3 60 29S122M AGCACATCACTTTCATCGATGGACAGATCGATCCTGATTTAGCTTTGGTACCTGGTATCTGTCTAGGAAGTTGTCCATTTCATCCAGGTTTTCCTGGTTTTTTTTTAGTATAGCCTTTCATAGTAGAATCTGATGATGTTTTTGATTTCCT 60 150M TCTGATTTTGTTAATTATAGTACAGTCCCTATGCCCTCTAGTTAGTCTGGCTAAGGGTTTATCTATCTTGTTGACTTTCTCAAAGAACCAGCTACTAGTTTGGTTGATTCTTTGAATATTTCTTTTTGTTTCCACTTGGTTGATTTCAGC VH00824:3:AAATCGGHV:2:2412:39988:5222/1 VH00824:3:AAATCGGHV:2:2412:39988:5222/2 

0 chr1 3000166 0 16 chr1 3000475 3 60 25S126M TTTTGTTAAAACTAAGTTTATGATCGATCCTGATTTAGCTTTGGTACCTGGTATCTGTCTAGGAAGTTGTCCATTTCATCCAGGTTTTCCTGGTTTTTTTTTAGTATAGCCTTTCATAGTAGAATCTGATGATGTTTTTGATATCCTCATG 60 150M TATGTCTCCTTTTTCATTTCTGATTTTGTTAATTATAGTACAGTCCCTATGCCCTCTAGTTAGTCTGGCTAAGGGTTTATCTATCTTGTTGACTTTCTCAAAGAACCAGCTACTAGTTTGGTTGATTCTTTGAATATTTCTTTTTGTTTC VH00824:3:AAATCGGHV:2:2405:33323:54670/2 VH00824:3:AAATCGGHV:2:2405:33323:54670/1 

$ head sizes2.genome
chr1 195471971
chr2 182113224
chr3 160039680
chr4 156508116
chr5 151834684
chr6 149736546
chr7 145441459
chr8 129401213
chr9 124595110
chr10 130694993


Any help would be highly appreciated.

Thank you,

Nasreen
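
The thread does not establish a cause for this second error. One possibility worth ruling out, offered only as an assumption, is that the contacts file contains a chromosome name that is missing from the sizes file. The sketch below compares the names in columns 2 and 6 of merged_nodups-style lines (as in the sample above) against the first column of a chrom.sizes file; the function names and the in-memory sample data are hypothetical.

```python
import io
from typing import Iterable, Set, TextIO


def read_chrom_names(sizes: TextIO) -> Set[str]:
    """Collect chromosome names (first column) from a chrom.sizes stream."""
    return {line.split()[0] for line in sizes if line.strip()}


def unknown_chromosomes(mnd: Iterable[str], known: Set[str]) -> Set[str]:
    """Return names from columns 2 and 6 of contact lines that are not in known."""
    missing = set()
    for line in mnd:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        for name in (fields[1], fields[5]):
            if name not in known:
                missing.add(name)
    return missing


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the real files (illustrative data only).
    sizes = io.StringIO("chr1 195471971\nchr2 182113224\n")
    mnd = [
        "0 chr1 3000148 0 16 chr1 3000446 2 60",
        "0 chr2 5000 0 16 chrUn_example 700 1 60",
    ]
    print(sorted(unknown_chromosomes(mnd, read_chrom_names(sizes))))
    # ['chrUn_example']
```

For real data, the same functions could be fed open file handles for merged_nodups.txt and sizes2.genome. An empty result would suggest the mismatch lies elsewhere.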

