Hi Olga and 3D-DNA team,
I'm revisiting this from January. I recently acquired high-quality PacBio HiFi (CCS) reads and built a new contig-level draft assembly with hifiasm in Hi-C mode, which worked better than the previous Nanopore + Illumina assembly. I then ran Juicer on this new assembly with the same Hi-C sequencing data and scaffolded with 3D-DNA using the default settings.
Here are some estimated characteristics of my genome:
Size: 2n = 1.85 Gb (diploid); the current assembly is haploid
Chromosome Count: 2n = 52 (diploid)
Here is the output from inter.txt (from Juicer):
Sequenced Read Pairs: 542,109,179
Normal Paired: 273,878,423 (50.52%)
Chimeric Paired: 113,699,827 (20.97%)
Chimeric Ambiguous: 142,711,672 (26.33%)
Unmapped: 11,819,257 (2.18%)
Ligation Motif Present: 161,328,676 (29.76%)
Alignable (Normal+Chimeric Paired): 387,578,250 (71.49%)
WARN [2021-05-19T10:13:34,181] [Globals.java:138] [main] Development mode is enabled
Unique Reads: 322,278,283 (59.45%)
PCR Duplicates: 64,860,844 (11.96%)
Optical Duplicates: 439,123 (0.08%)
Library Complexity Estimate: 1,022,358,395
Intra-fragment Reads: 14,719,719 (2.72% / 4.57%)
Below MAPQ Threshold: 158,503,171 (29.24% / 49.18%)
Hi-C Contacts: 149,055,393 (27.50% / 46.25%)
Ligation Motif Present: 39,622,702 (7.31% / 12.29%)
3' Bias (Long Range): 62% - 38%
Pair Type %(L-I-O-R): 25% - 25% - 25% - 25%
Inter-chromosomal: 90,961,595 (16.78% / 28.22%)
Intra-chromosomal: 58,093,798 (10.72% / 18.03%)
Short Range (<20Kb): 37,998,143 (7.01% / 11.79%)
Long Range (>20Kb): 20,075,680 (3.70% / 6.23%)
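As a quick sanity check on the numbers above, here is a minimal Python sketch that recomputes the headline percentages from the reported counts. The variable names and the `pct` helper are my own, not part of Juicer. Note that because this is a contig-level assembly, I am assuming "Inter-chromosomal" here effectively means contacts between different contigs.

```python
# Counts copied from the inter.txt output above.
sequenced = 542_109_179
unique = 322_278_283
intra_fragment = 14_719_719
below_mapq = 158_503_171
hic_contacts = 149_055_393
inter = 90_961_595
intra = 58_093_798


def pct(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Unique reads split into intra-fragment, below-MAPQ, and Hi-C contacts.
assert intra_fragment + below_mapq + hic_contacts == unique
# Hi-C contacts split into inter- and intra-chromosomal.
assert inter + intra == hic_contacts

print("Hi-C contacts, % of sequenced:", pct(hic_contacts, sequenced))
print("Hi-C contacts, % of unique:", pct(hic_contacts, unique))
print("Below MAPQ, % of unique:", pct(below_mapq, unique))
print("Inter-contig share of Hi-C contacts:", pct(inter, hic_contacts))
```

The counts are internally consistent, so the two things that stand out to me are that roughly half of the unique reads fall below the MAPQ threshold and that most of the remaining contacts are between contigs rather than within them.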
After running the 3D-DNA pipeline in default mode ('./run-asm-pipeline.sh ../hifiasm.asm.hic.p_ctg.fa ../juicer/aligned/merged_nodups.txt'), I get two very different Hi-C maps for the "0" and "rawchrom" outputs. Do you happen to know what might be going on here, and which parameters I could tune to get a better scaffolding result?
Any insights would be super useful!
Thank you again,
Colin