Hi,
I am new to 3D-DNA and need a little help with some final tweaking! I am assembling a plant genome that is known to be ~330 Mb with 8 chromosomes. I used hifiasm for the initial assembly, which had a BUSCO completeness of 98.5% for Embryophyta, and ran Juicer to align my Hi-C data to the contigs. At a colleague’s suggestion, I ran 3D-DNA with the parameters --editor-saturation-centile 10 --editor-coarse-resolution 100000 --editor-coarse-region 400000 --editor-repeat-coverage 50. (We did this to reduce false-positive edits from misjoin detection, and it definitely worked better than the default parameters, which had produced a very fragmented assembly.)
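For reference, here is a minimal sketch of how that run can be put together, written in Python just to make the flags explicit. The file names draft.fasta and merged_nodups.txt are placeholders, build_command is a made-up helper, and I am assuming the usual run-asm-pipeline.sh layout of options followed by the draft FASTA and the Juicer merged_nodups file.

```python
import shlex

# Editor settings quoted in the post above.
EDITOR_OPTIONS = {
    "--editor-saturation-centile": 10,
    "--editor-coarse-resolution": 100000,
    "--editor-coarse-region": 400000,
    "--editor-repeat-coverage": 50,
}


def build_command(fasta, mnd, script="run-asm-pipeline.sh"):
    """Return the argument list for one 3D-DNA run.

    Assumption: options come first, followed by the draft FASTA and the
    Juicer merged_nodups file as positional arguments.
    """
    args = [script]
    for flag, value in EDITOR_OPTIONS.items():
        args += [flag, str(value)]
    args += [fasta, mnd]
    return args


# Placeholder file names, not real paths.
cmd = build_command("draft.fasta", "merged_nodups.txt")
print(shlex.join(cmd))
```

Keeping the settings in one dictionary makes it easy to change a single value and rerun, which is what I expect to be doing while tuning.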
In the resulting final assembly, you can visually see the 8 chromosomes; however, the assembler is erroneously grouping contigs from separate chromosomes together into larger ones. I am planning to use JBAT to manually adjust the chromosome boundaries later, but I suspect that 3D-DNA could do a better job before I curate manually, if I can just figure out which parameters to tweak. Also, there are some low-coverage (repetitive?) sequences that aren't included in the final assembly, and I wonder if they could be part of the chromosomes, especially the centromeres or telomeres. I found your (Olga’s) response to Colin here useful but did not understand what you meant by “shoot-outs” associated with the centers of the chromosomes. I would love to have that clarified.
I am attaching the final map with the coverage tracks loaded and the assembly shown. I am also attaching the .0.hic map with the coverage tracks and the other annotations loaded.
Please let me know if you have any tips for how I can deal with the mis-partitioning of chromosomes and the low-coverage sequences. Thank you very much!!
Best,
Arielle
Hi Olga,
Thanks so much for the quick response!
I had read the Genome Assembly Cookbook and the supplement to Dudchenko et al. 2017 but was still having trouble. As a novice, I had some difficulty with the explanations of the parameters. For example, the only explanation for --editor-coarse-stringency in the Cookbook is “misjoin editor stringency parameter. Default: 55”. In the SI for Dudchenko et al. 2017, the only mention of stringency comes in an equation: “we annotate a locus as a putative misjoin whenever the misjoin score for that locus satisfies [equation], where k is an arbitrary stringency parameter such that 0 ≤ k < 1”. This is super confusing to me, as 55 is definitely not between 0 and 1, so I am not sure what the --editor-coarse-stringency parameter is actually modifying! And I have no clue how it might change the final assembly. Do you know where I could find a more intuitive explanation of what the parameters change?
Are there any parameters that do affect the chromosome boundaries?
Thanks very much!
Best,
Arielle