editing .assembly files


Michel Moser

Jun 16, 2020, 9:17:48 AM
to 3D Genomics
Hi, 

I have mapped Hi-C data to our chromosome-level genome assembly and would like to add contig information to the assembly file. 
I wrote a short wrapper script that transforms a BED file of contig locations into a .assembly-like file. 

Unfortunately, Juicebox has trouble reading the file. I must be missing something about the structure that Juicebox expects.

I created fragments from the initial chromosomes, like this: 

Initial assembly file: 

>ssa01 1 174673756
>ssa09 2 160669013
>ssa10 3 126124147
>ssa13 4 114504115
>ssa11 5 111966310

Modified assembly file: 

>ssa01:::fragment_1 1 129591
>ssa01:::fragment_2 2 2857436
>ssa01:::fragment_3 3 781246
>ssa01:::fragment_4 4 44728362
>ssa01:::fragment_5 5 56399376
>ssa01:::fragment_6 6 1104360
>ssa01:::fragment_7 7 362161
>ssa01:::fragment_8 8 1212759
>ssa01:::fragment_9 9 3047345


I made sure that each identifier (second column) appears only once and that the fragment lengths add up to the original chromosome sizes. 
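For reference, the BED-to-assembly conversion and the two checks described above could be sketched roughly as follows. This is not the actual wrapper script: the function name `bed_to_assembly` is hypothetical, the contig intervals are assumed to tile each chromosome without gaps, and the trailing lines that list the fragment identifiers of each scaffold in order reflect one understanding of the .assembly layout that is not confirmed in this thread.

```python
from collections import OrderedDict


def bed_to_assembly(bed_lines, chrom_sizes):
    """Turn BED contig intervals into .assembly-like text (hypothetical helper).

    bed_lines:   iterable of 'chrom start end' strings (0-based, half-open).
    chrom_sizes: dict mapping chromosome name to its total length.
    """
    # Group intervals per chromosome, keeping first-seen chromosome order.
    by_chrom = OrderedDict()
    for line in bed_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end = line.split()[:3]
        by_chrom.setdefault(chrom, []).append((int(start), int(end)))

    header, orders, index = [], [], 0
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        pos, ids = 0, []
        for n, (start, end) in enumerate(intervals, 1):
            # Assumption: fragments tile the chromosome, no gaps or overlaps.
            if start != pos:
                raise ValueError(f"{chrom}: gap or overlap at {start}")
            index += 1  # identifiers are unique because they only count up
            # Assumption: an unsplit chromosome keeps its plain name.
            name = chrom if len(intervals) == 1 else f"{chrom}:::fragment_{n}"
            header.append(f">{name} {index} {end - start}")
            ids.append(str(index))
            pos = end
        # Fragment lengths must add up to the original chromosome size.
        if pos != chrom_sizes[chrom]:
            raise ValueError(f"{chrom}: fragments cover {pos}, expected {chrom_sizes[chrom]}")
        # Assumption: one line per scaffold listing its fragment ids in order.
        orders.append(" ".join(ids))
    return "\n".join(header + orders) + "\n"
```

If the contigs are separated by gaps, the gap intervals would need their own fragment entries for the tiling check to pass.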


Could you advise me on which characteristics a Juicebox assembly file must have for a given .hic file? 
I could imagine other people being interested in such a back-and-forth conversion between assembly and BED files as well, and I would be happy to share the code once it works. 

By just manually manipulating an assembly file, I can make Juicebox display pretty much everything (see screenshot below), so it's just a matter of finding the semantics =) 


[Screenshot: original (left) and edited (right)]


Thank you, 
Michel

 

Olga Dudchenko

Jun 17, 2020, 1:47:53 PM
to 3D Genomics
Hey Michel,

I am not entirely sure I follow what you are trying to do here. Are there no gaps between your contigs? Note that there is a generate-gap-bed or something in the utils, you can use that to create a gap bed, and there should be a bed-to-annotations and then edit assembly based on annotations. 

Cut in relevant places manually in JBAT and export to see the proper annotation for the procedure.

Olga 

Michel Moser

Jun 18, 2020, 7:11:53 AM
to 3D Genomics
Dear Olga, 

OK, I probably did not frame that clearly. 

In my .hic file, the scaffolds are actually already scaffolded chromosomes, and I would like to see whether disagreements between the Hi-C contacts and those chromosomes co-locate with the underlying contig borders of those scaffolded chromosomes. 

Therefore I would like to display contig borders in addition to the chromosome-like scaffolds. 
I can easily transform the scaffolds into chromosomes by editing the assembly file and importing it as a modified assembly file, but I have not managed to add scaffolds to them by dividing them into fragments. 

I also tried to load 2D annotation bedpe files, but they seem to disappear as soon as I load an assembly on top of them. I tried moving the annotation to the top, but they still don't show up. 
Is this expected behaviour? 
I am running Juicebox 1.11.08 on Windows. 
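For reference, a 2D annotation file of this kind could be produced from the contig BED file with a minimal sketch like the one below. The function name `bed_to_bedpe` is hypothetical, and the column layout (`chr1 x1 x2 chr2 y1 y2 color`, tab-separated, with a `#` header line) is an assumption about what Juicebox accepts rather than something confirmed in this thread. Each contig becomes a square on the diagonal, so its borders are visible in the map.

```python
def bed_to_bedpe(bed_lines, color="0,0,255"):
    """Convert BED contig intervals to bedpe-style 2D annotation rows.

    Hypothetical helper; the header and column order are assumptions.
    """
    rows = ["#chr1\tx1\tx2\tchr2\ty1\ty2\tcolor"]
    for line in bed_lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, start, end = line.split()[:3]
        # Same interval on both axes: a square centred on the diagonal.
        rows.append("\t".join([chrom, start, end, chrom, start, end, color]))
    return "\n".join(rows) + "\n"
```

The coordinates here are in the original chromosome coordinate system; whether they still line up after an assembly is loaded on top is exactly the open question above.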
 
Could you point me to the bed-to-annotations file? I could not find it, and I might find the solution there. 

Thank you, 
Michel




