Juicer query regarding chrom.sizes file preparation


Shilpi Singh

May 4, 2022, 7:34:15 PM
to 3D Genomics
G'day,

I have a 202 Mb draft assembly from Illumina reads and have recently received Phase Genomics Hi-C sequence files for scaffolding the draft genome assembly to chromosome level.

I am trying to follow the command-line protocol to create .hic files with Juicer. As the first step of running juicer.sh, I have to generate the chrom.sizes file for my reference genome. My draft assembly is at contig level, not chromosome level, so I cannot create a file of chromosome names and lengths. Can you please suggest how I can predict the individual chromosome sizes? I know the number of chromosomes (9).

I followed the protocol at https://www.dnazoo.org/methods.
There is one similar post in this Google group, but it suggests providing the chrom.sizes file in the format below:

Chr_seq    seq_length(1000000)

My confusion is this: as my draft assembly is at contig level, I don't have predicted chromosome sizes for my assembly, so how does one go about producing a chrom.sizes file for a contig-level assembly?

Can I use an 11,000-line chrom.sizes file that lists the lengths of my 11,000 contigs?

My query is: if I generate the lengths of my contigs as below

contig1                 10000
contig2                 12340
contig3                 189090
.
.
contig11000         20090

Is this acceptable?
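
For reference, here is a minimal sketch of how a two-column contig name/length table like the one above could be produced from the assembly. It assumes the draft assembly is a plain, uncompressed FASTA file; the function names `fasta_lengths` and `write_sizes` are hypothetical and not part of Juicer or 3D-DNA. It only illustrates the table layout and says nothing about whether Juicer needs the file for a given workflow.

```python
import io


def fasta_lengths(handle):
    """Yield (name, length) for each record in a FASTA text stream."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                yield name, length
            # Keep only the first whitespace-delimited token of the header.
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        elif name is not None:
            # Sequence lines may be wrapped, so lengths are accumulated.
            length += len(line)
    if name is not None:
        yield name, length


def write_sizes(fasta_handle, out_handle):
    """Write one tab-separated 'name<TAB>length' line per FASTA record."""
    for name, length in fasta_lengths(fasta_handle):
        out_handle.write(f"{name}\t{length}\n")


# Tiny in-memory demonstration (hypothetical sequences, not real data).
demo = io.StringIO(">contig1 some description\nACGT\nAC\n>contig2\nGGG\n")
out = io.StringIO()
write_sizes(demo, out)
print(out.getvalue(), end="")
```

For a real assembly, the same function could be called with file objects opened on the FASTA and on the output path instead of the in-memory strings used here.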

Any help and suggestions are really appreciated.

Many thanks in advance.

Regards
Shilpi

Shilpi Singh

May 4, 2022, 7:34:28 PM
to 3D Genomics
G'day,

I have a 202 Mb draft assembly from Illumina reads and have recently received Phase Genomics Hi-C sequence files for scaffolding the draft genome assembly to chromosome level.

I am trying to follow the command-line protocol to create .hic files with Juicer. As the first step of running juicer.sh, I have to generate the chrom.sizes file for my reference genome. My draft assembly is at contig level, not chromosome level, so I cannot create a file of chromosome names and lengths. Can you please suggest how I can predict the individual chromosome sizes? I know the number of chromosomes (9).

As my draft assembly is at contig level, I cannot have 9 lines corresponding to chromosome name and sequence length in my chrom.sizes file. I can produce the sequence lengths for my contigs, which would give a chrom.sizes file of 11,000 lines for my ~11,000 contigs.
Please let me know how to proceed to produce a chrom.sizes file for a contig-level draft assembly with 11,000 contigs.

Shilpi Singh

May 4, 2022, 7:34:38 PM
to 3D Genomics
G'day,

I have a 202 Mb draft assembly from Illumina reads and have recently received Phase Genomics Hi-C sequence files for scaffolding the draft genome assembly to chromosome level.

I am trying to follow the command-line protocol to create .hic files with Juicer. As the first step of running juicer.sh, I have to generate the chrom.sizes file for my reference genome. My draft assembly is at contig level, not chromosome level, so I cannot create a file of chromosome names and lengths. Can you please suggest how I can predict the individual chromosome sizes? I know the number of chromosomes (9).

My question is: what should the chrom.sizes file contain for a contig-level draft assembly with 11,000 contigs (a chrom.sizes file of 11,000 lines)?

Olga Dudchenko

May 11, 2022, 10:17:54 AM
to 3D Genomics
Hello Shilpi,

I've responded to this elsewhere. To reiterate, you do not need to pass chrom.sizes when running Juicer for assembly purposes. Depending on the Juicer version you should either use -S early [coupled or not with --assembly] or -p assembly, as described in the Cookbook. This will result in Juicer stopping after the creation of the merged_nodups.txt file, which contains the deduplicated Hi-C alignments and is the input to 3D-DNA. The chrom.sizes file is necessary only if one wants Juicer to build the contact maps. For assembly, all visualizations will be handled by 3D-DNA and do not require a chrom.sizes file.

Best,
Olga  

Shilpi Singh

May 13, 2022, 3:55:12 AM
to 3d-ge...@googlegroups.com
Hi Olga, 

Many thanks for getting back. My purpose is to scaffold my contig-level Illumina assembly using Hi-C contact map data. Could you please suggest how to proceed for scaffolding?

Regards
Shilpi


Olga Dudchenko

May 13, 2022, 4:22:23 PM
to 3D Genomics
Depending on the Juicer version you should either use -S early [coupled or not with --assembly] or -p assembly, as described in the Cookbook. There is no need to pass the chrom.sizes file. Best, -Olga