Juicer tools Pre not fully compatible with 4DN DCIC format

Charles Vejnar

Apr 6, 2022, 11:25:13 PM
to 3D Genomics
Hi,

Thanks for developing Juicer tools. I am trying to use Pre to convert pairs in the 4DN DCIC format as described in [1] with a command like:

java -Xms512m -Xmx2048m -jar juicer_tools.jar pre -f restriction_sites_MboI.txt aln.pairs aln.hic chrom_length.tab

However, it fails immediately:

```
java.lang.ArrayIndexOutOfBoundsException: 3
        at juicebox.tools.utils.original.mnditerator.ComplexLineParser.generateBasicPair(ComplexLineParser.java:56)
        at juicebox.tools.utils.original.mnditerator.MNDFileParser.parseDCICFormat(MNDFileParser.java:118)
        at juicebox.tools.utils.original.mnditerator.MNDFileParser.parse(MNDFileParser.java:83)
        at juicebox.tools.utils.original.mnditerator.GenericPairIterator.advance(GenericPairIterator.java:56)
        at juicebox.tools.utils.original.mnditerator.GenericPairIterator.next(GenericPairIterator.java:46)
        at juicebox.tools.utils.original.Preprocessor.computeWholeGenomeMatrix(Preprocessor.java:587)
        at juicebox.tools.utils.original.Preprocessor.writeBody(Preprocessor.java:674)
        at juicebox.tools.utils.original.Preprocessor.preprocess(Preprocessor.java:436)
        at juicebox.tools.clt.old.PreProcessing.run(PreProcessing.java:165)
        at juicebox.tools.HiCTools.main(HiCTools.java:94)
```

when the aln.pairs file contains:

```
## pairs format v1.0
#chromsize: chr1 249250621
#columns: readID chr1 pos1 chr2 pos2 strand1 strand2
EAS139:136:FC706VJ:2:2104:23462:197393 chr1 10000 chr1 20000 + +
EAS139:136:FC706VJ:2:8762:23765:128766 chr1 50000 chr1 70000 + +
```

But it doesn't fail with:

```
## pairs format v1.0
#columns: readID chr1 pos1 chr2 pos2 strand1 strand2
EAS139:136:FC706VJ:2:2104:23462:197393 chr1 10000 chr1 20000 + +
EAS139:136:FC706VJ:2:8762:23765:128766 chr1 50000 chr1 70000 + +
```

[Both examples are copy-pasted from [1]]

As described in [1], the #chromsize line (and the other header lines that describe the pairs) is part of the specification.
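A possible stopgap, going only by the two examples above, would be to drop every header line except the two that appear in the file that works. The Python below is a minimal sketch of that idea. The function names and the output file name are hypothetical, and it assumes that the extra `#` header lines are the only thing triggering the exception, which I have not confirmed.

```python
def strip_extra_headers(lines):
    """Return the lines with all headers removed except the two kinds
    present in the example that Pre accepted."""
    kept = []
    for line in lines:
        if line.startswith("#"):
            # Assumption: only these two header kinds are safe, because
            # they are the ones in the file that did not fail.
            if line.startswith("## pairs format") or line.startswith("#columns:"):
                kept.append(line)
            continue  # drop #chromsize and any other header line
        kept.append(line)  # data lines pass through unchanged
    return kept


def strip_file(src_path, dst_path):
    """Write a copy of src_path without the extra header lines (hypothetical helper)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(strip_extra_headers(src))


# Example (file names are placeholders):
# strip_file("aln.pairs", "aln.stripped.pairs")
```

The stripped file could then be passed to Pre in place of aln.pairs, though this loses the header metadata that the specification expects.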

Could you please help with this problem? Does Pre not expect the #chromsize (etc.) header lines? Thanks.

Best,
Charles

[1] https://github.com/4dn-dcic/pairix/blob/master/pairs_format_specification.md