Juicebox - adjusting chromosome boundaries from very fragmented assembly file


Trypdal

Mar 13, 2019, 12:20:14 PM
to 3D Genomics
Hi

I'm having some issues finding the best way to adjust my chromosome boundaries in Juicebox. I have read section 'chromosome boundaries' (p.38 of the tutorial) but I think I would still need some help if that's ok.

So the .fasta assembly file (and related final.assembly file) I'm using as a starting point for my manual correction in Juicebox is extremely fragmented. 

This, if I understand correctly, translates to a Juicebox representation in which the blue square annotations are much smaller than chromosome size, where by 'chromosome size' I mean the square portrayed by the Hi-C data. The following screenshot should give an idea of what I'm talking about:

[Screenshot: hicisample.png]


So my understanding of how to proceed, so far, has been as follows: I manually zoom in on the point of contact between two of these minuscule blue squares, move the mouse pointer around a bit until the cursor changes into an angle, and click. This causes the two adjacent blue squares to join, forming a larger blue square.


I repeat the above until the blue square perfectly overlaps the chromosome-wide signal produced by Hi-C. When I have completed this, I will save the assembly and get a fasta file out of it.


I am having two issues with this:


1. The sheer number of blue squares to adjust. Is there a faster way to achieve this? Is there a setting in the upstream 3D-DNA pipeline that would make these squares bigger and closer, in size and number, to the Hi-C-derived chromosomal squares? It is probably useful to add that I do have a reference genome for this Hi-C-aided assembly project, and I know the sequences of the reference chromosomes. Would there be a way to feed the expected number and sizes of chromosomes into the 3D-DNA pipeline to get a better starting set of blue squares?


2. It seems that, no matter how hard I try, some blue box boundaries just won't disappear. In other words, for some boundaries I don't get the angle-shaped cursor. E.g., here's an example at max magnification:

[Screenshot: Screenshot_2019-03-13_16-58-07.png]

Am I doing something wrong? How should I deal with these instances?


Thanks so much for your awesome work on this!



Olga Dudchenko

Mar 15, 2019, 2:08:26 AM
to 3D Genomics
Hi Trypdal,

There are several ways to address this.

1) [recommended] address this in 3D-DNA: the step responsible for splitting the scaffolded output into chromosomes is the split step. All files related to this step have the suffix .split. I'd suggest loading the relevant map and the associated split.mismatch_wide.bed and split.mismatch_wide.wig tracks to help investigate why there are so many breaks. (The most likely reason is that the .wig track isn't saturated: perhaps the map is relatively sparse at the resolution examined by default for this step.) Tweaking the --splitter options (use ./run-asm-pipeline.sh --help for the full list of options) will help you get the track to look how it is supposed to (see page 28 of the cookbook for an example). Use the -s|--step option to fast-forward to the split part of the pipeline.

2) [address this in JBAT]: as you have already experienced, this is going to be tricky. E.g., the angle cursor does not appear in your example because what you are looking at is not in fact two scaffolds. It's three: if you select the problematic region, you'll see three entries in the tooltip text on the right. The middle one is just too small to be seen (smaller than the bin size). If you want to go this route, the best way is to clone Juicebox from GitHub and run it from your dev tool: the main branch contains some options that have not yet made it into the app, like "Remove chr boundaries" in the selection, which can help in this scenario.

3) [address this manually]: you can open the .assembly file and remove the newline characters that mark the blue boundaries.
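
To find the tiny groups mentioned in 2) without hunting for them in the map, something like the following rough sketch may help. It assumes the usual .assembly layout (header lines of the form ">name index length", followed by lines of signed indices, one line per blue square). The function name, the toy input, and the BIN_SIZE value are made up for illustration; use the bin size of the resolution you are actually viewing.

```python
def superscaffold_sizes(assembly_text):
    """Return the total length in bp of each order line of a .assembly file."""
    lengths = {}  # contig index -> length in bp, taken from the ">" header lines
    sizes = []
    for line in assembly_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if line.startswith(">"):
            # Assumed header layout: >name index length
            lengths[int(fields[1])] = int(fields[2])
        else:
            # Order line: signed indices, the sign only encodes orientation
            sizes.append(sum(lengths[abs(int(tok))] for tok in fields))
    return sizes


# Toy input, not real data
example = """>ctg1 1 500000
>ctg2 2 3000
>ctg3 3 750000
1
-2
3
"""

BIN_SIZE = 25000  # hypothetical bin size in bp

for n, size in enumerate(superscaffold_sizes(example), start=1):
    flag = "  <- smaller than one bin" if size < BIN_SIZE else ""
    print(f"group {n}: {size} bp{flag}")
```

Groups flagged this way are likely candidates for boundaries where the angle cursor never shows up.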

Thanks for the kind words,

Olga

Trypdal

Mar 15, 2019, 7:10:52 AM
to 3D Genomics
Hi Olga

Thanks for this. I started tinkering with the .assembly file and it seems I got it to look good. Finding the right line and joining two adjacent lines, as you suggested in 3), did the job. In the future I guess I will keep 1) in mind. Thanks!
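
For anyone wanting to script the line-joining instead of doing it in a text editor, here is a minimal sketch. It assumes the same .assembly layout as above: ">" header lines first, then one line of signed indices per blue square. The function name and the toy input are hypothetical, and it is worth working on a copy of the file.

```python
def join_groups(assembly_text, first_group):
    """Merge order line number `first_group` (1-based) with the line after it."""
    lines = assembly_text.splitlines()
    headers = [l for l in lines if l.startswith(">")]
    groups = [l.strip() for l in lines if l.strip() and not l.startswith(">")]
    i = first_group - 1
    if not 0 <= i < len(groups) - 1:
        raise IndexError("no group after the one requested")
    # Dropping the newline between two order lines removes the boundary between them
    groups[i:i + 2] = [groups[i] + " " + groups[i + 1]]
    return "\n".join(headers + groups) + "\n"


# Toy input, not real data
example = """>ctg1 1 500000
>ctg2 2 3000
>ctg3 3 750000
1
-2
3
"""

print(join_groups(example, 1))
```

Calling the function again on its own output joins the next pair, so a run of small squares can be collapsed step by step.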