Tweaking parameters for 330 Mb plant Hi-C run


Arielle Johnson

Jul 2, 2021, 1:24:46 PM
to 3D Genomics

Hi,

I am new to 3D-DNA and need a little help with some final tweaking! I am assembling a plant genome that is known to be ~330 Mb with 8 chromosomes. I used hifiasm for the initial assembly, which had a BUSCO completeness of 98.5% for Embryophyta, and ran Juicer to align my Hi-C data to the contigs. At a colleague’s suggestion, I ran 3D-DNA with the parameters --editor-saturation-centile 10 --editor-coarse-resolution 100000 --editor-coarse-region 400000 --editor-repeat-coverage 50. (We did this to try to reduce false-positive edits from misjoin detection, and it definitely worked better than the default parameters, which had produced a very fragmented assembly.)
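For reference, the invocation described above could be written out roughly as follows. This is only a sketch: the four editor flags and their values are the ones quoted in the post, but the script name `run-asm-pipeline.sh`, the order of the two positional arguments, the helper `build_command`, and the file names `contigs.fasta` and `merged_nodups.txt` are assumptions for illustration, not details given in this thread.

```python
import shlex

# Editor settings exactly as quoted in the post above.
EDITOR_OPTIONS = {
    "--editor-saturation-centile": 10,
    "--editor-coarse-resolution": 100000,
    "--editor-coarse-region": 400000,
    "--editor-repeat-coverage": 50,
}


def build_command(fasta, mnd, options, script="run-asm-pipeline.sh"):
    """Return a 3D-DNA command line as one shell-quoted string.

    The script name and the positional order (draft FASTA first, then the
    Juicer merged_nodups file) are assumptions, not taken from the thread.
    """
    args = [script]
    for flag, value in options.items():
        args += [flag, str(value)]
    args += [fasta, mnd]
    return shlex.join(args)


# Placeholder file names; substitute your own paths.
print(build_command("contigs.fasta", "merged_nodups.txt", EDITOR_OPTIONS))
```

Keeping the options in one dictionary makes it easier to record exactly which settings produced which map when comparing runs.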

In the resulting final assembly, you can visually see the 8 chromosomes; however, the assembler is erroneously grouping the contigs together into larger chromosomes. I am planning to use JBAT to manually adjust the chromosome boundaries later, but I suspect that 3D-DNA can do a better job before I manually curate, if I can just figure out which parameters to tweak. Also, there are some low-coverage (repetitive?) sequences that aren't included in the final assembly, but I wonder if they could be part of the chromosomes, especially the centromeres or telomeres. I found your (Olga’s) response to Colin here useful but did not understand what you meant by “shoot-outs” associated with the centers of the chromosomes; I would love to have that clarified.

I am attaching the final map with the coverage tracks loaded, and with the assembly shown.  I am also attaching the .0.hic map with the coverage tracks loaded, and with the other annotations loaded. 

Please let me know if you have any tips for how I can deal with the mis-partitioning of chromosomes and the low-coverage sequences.   Thank you very much!!

Best, 

Arielle

step0.AlexParameters.HiCImage.withCoverage.pdf
step0.AlexParameters.HiCImage.withAnnotations.pdf
AlexParameters.HiCImage.withCoverage.pdf
AlexParameters.HiCImage.withAssembly.pdf

Olga Dudchenko

Jul 6, 2021, 2:01:09 AM
to 3D Genomics
Hi Arielle,

To adjust the parameters more effectively, I suggest you look into the logic of how they work. For this, you might want to take a look at either the Genome Assembly Cookbook (it would be in chapter 4) or the supplement to Dudchenko et al., 2017. Once you do, you should be able to guide the choice of your parameters. Right now I cannot say whether those parameters are good or bad for you. I can say that none of them seems to have anything to do with figuring out the chromosome boundaries; they more or less just turn off the misjoin detection. If that was your goal, you could also have done it with -r 0.
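As a rough sketch of the -r 0 alternative mentioned above: the flag itself comes from this reply, while the script name `run-asm-pipeline.sh`, the positional argument order, the helper `build_no_edit_command`, and the file names are assumptions for illustration.

```python
import shlex


def build_no_edit_command(fasta, mnd, script="run-asm-pipeline.sh"):
    """Return a 3D-DNA command line that passes -r 0.

    Only the -r 0 option comes from the reply above; the script name and
    the positional order are assumptions.
    """
    return shlex.join([script, "-r", "0", fasta, mnd])


# Placeholder file names; substitute your own paths.
print(build_no_edit_command("contigs.fasta", "merged_nodups.txt"))
```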

Best,
Olga

Arielle Johnson

Jul 6, 2021, 8:59:24 AM
to 3D Genomics

Hi Olga,

Thanks so much for the quick response! 

I had read the Genome Assembly Cookbook and the supplement to Dudchenko et al., 2017 but was still having trouble. As a novice, I had some difficulty with the explanations of the parameters. For example, the only explanation for --editor-coarse-stringency in the Cookbook is “misjoin editor stringency parameter. Default: 55”. In the SI for Dudchenko et al., 2017, the only mention of stringency comes in an equation: “we annotate a locus as a putative misjoin whenever the misjoin score for that locus satisfies [equation], where k is an arbitrary stringency parameter such that 0 ≤ k < 1”. This is super confusing to me, as 55 is definitely not between 0 and 1, so I am not sure what the --editor-coarse-stringency parameter is actually modifying! And I definitely have no clue how it might change the final assembly. Do you know where I could find a more intuitive explanation of what the parameters change?

Are there any parameters that do affect the chromosome boundaries?

Thanks very much!

Best,

Arielle 
