G'day,
I have a 202 Mb draft assembly from Illumina reads and have recently received Phase Genomics Hi-C sequence files for scaffolding the draft genome assembly to chromosome level.
I am trying to move ahead with the command-line protocol to create Hi-C files in Juicer. As the first step of running juicer.sh, I have to generate the chrom.sizes file for my reference genome. My draft assembly is at contig level, not chromosome level, so I cannot create a file of chromosome names and lengths. Can you please suggest how I can predict the individual chromosome sizes? I know the number of chromosomes (9).
There is one similar post in the Google forum, but it suggests providing the chrom.sizes file in the format below:
Chr_seq seq_length(1000000)
My confusion is this: as my draft assembly is at contig level, I don't have predicted chromosome sizes for it. How does one go about producing a chrom.sizes file for a contig-level assembly?
Can I use an 11,000-line chrom.sizes file that lists the lengths of my 11,000 contigs?
That is, if I generate the lengths of my contigs as below:
contig1 10000
contig2 12340
contig3 189090
.
.
contig11000 20090
Is this acceptable?
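For context, here is a minimal sketch of how I would produce such a per-contig file from the assembly FASTA. The function names and file names are my own and purely illustrative, and writing the two columns tab-separated is an assumption on my part about what juicer.sh expects.

```python
def fasta_lengths(lines):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, length
            # Keep only the first whitespace-delimited token of the header.
            tokens = line[1:].split()
            name, length = (tokens[0] if tokens else ""), 0
        elif name is not None:
            # Sequence line: add its bases to the current record.
            length += len(line)
    if name is not None:
        yield name, length


def write_sizes(fasta_path, out_path):
    """Write one 'name<TAB>length' line per contig (tab is an assumption)."""
    with open(fasta_path) as fin, open(out_path, "w") as fout:
        for name, length in fasta_lengths(fin):
            fout.write(f"{name}\t{length}\n")
```

Calling write_sizes("draft_assembly.fasta", "chrom.sizes") (both file names hypothetical) would give one line per contig, in the same order as the FASTA.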
Any help and suggestions are really appreciated.
Many thanks in advance.
Regards
Shilpi