Dear development team,
Thank you for developing and maintaining Juicer and its tools for the community.
I've encountered an issue while processing public Hi-C datasets, specifically with the Genome ID stored in the .hic file header.
For instance, in the public dataset GM12878.hic, the Genome ID is set to:
Genome ID: /var/lib/cwl/stgd70fe0e1-9845-41d7-b59b-e2b2dd1ef6a9/4DNFI823LSII.chrom.sizes
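This path, which seems to have been set by the dataset creators, does not exist on my machine and causes an error in hic-emt. Judging by the stack trace (FileOutputStream in Excision.writeOutCustomCDS), hic-emt appears to use the Genome ID as the path of a chrom.sizes file. The command and error message are as follows: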
java -Xmx50g -jar hic_emt_1.10.2.jar excise -r 5000 --subsample 1246601193 --cleanup \
GM12878_alt_dilution_hic.hic ./GM12878_ds_50pct GM12878_ds_50pct
java.io.FileNotFoundException: /var/lib/cwl/stgd70fe0e1-9845-41d7-b59b-e2b2dd1ef6a9/4DNFI823LSII.chrom.sizes (No such file or directory)
at java.base/java.io.FileOutputStream.open0(Native Method)
at java.base/java.io.FileOutputStream.open(FileOutputStream.java:289)
at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:230)
at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:118)
at emt.main.Excision.writeOutCustomCDS(Excision.java:72)
at emt.main.Excision.buildTempFiles(Excision.java:41)
at emt.main.FileBuildingMethod.tryToBuild(FileBuildingMethod.java:57)
at emt.clt.tools.Excise.run(Excise.java:71)
at emt.Tools.main(Tools.java:75)
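For anyone who wants to check the Genome ID of their own file, here is a minimal sketch in Python. It assumes the .hic header layout as I understand it from the format description: the magic string "HIC" with a trailing null byte, a little-endian 32-bit version, a 64-bit master index offset, and then the Genome ID as a null-terminated string. The function name read_hic_genome_id is my own and is not part of any Juicer tool. It only reads the field and does not change it.

```python
import struct

def read_hic_genome_id(path):
    """Return (version, genome_id) read from the start of a .hic file.

    Assumed header layout: b"HIC\\0", int32 version, int64 master index
    offset, then the genome ID as a null-terminated string.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"HIC\x00":
            raise ValueError("not a .hic file: unexpected magic string")
        (version,) = struct.unpack("<i", f.read(4))
        f.read(8)  # skip the master index offset; not needed here
        raw = bytearray()
        while True:
            byte = f.read(1)
            if byte in (b"", b"\x00"):  # stop at end of file or terminator
                break
            raw += byte
        return version, raw.decode("utf-8", errors="replace")
```

Calling read_hic_genome_id("GM12878_alt_dilution_hic.hic") should return the file version together with the Genome ID string shown above, if the layout assumption holds.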
Is there a way to change the Genome ID to a commonly used genome name (or to a different path) so that these public datasets are easier to process? Thank you very much.
Best,
Elaine