SOAP Scaffolding Inconsistencies

kaboroevich

Sep 2, 2012, 9:19:31 PM
to bgi-...@googlegroups.com
Hello all,

Sorry if this is a repost. I was sure I posted this last week, but I cannot find the original using Google Groups search.

I am getting conflicting results from the SOAPdenovo scaffolding process. I found that if I scaffold from the output of the prepare tool (http://soap.genomics.org.cn/down/prepare.tgz), I get different, arguably better, results.
After error correction (http://soap.genomics.org.cn/down/correction.tar.gz), I am left with 242,857,233 reads in 113 libraries. Running pregraph and contig with a kmer of 49, I generate “original.contig”. Following the normal steps, I ran map and scaff, resulting in “original.scafSeq”. I also copied “original.contig” to a new directory and ran “prepare -g rescaf -K 49 -c original.contig”. Running map and scaff on “rescaf.contig” results in the file “rescaf.scafSeq”. The following is a summary of the assemblies.

File              Contigs  Total_length Longest   N50
original.contig   2459270     144845786    5254    54
rescaf.contig     2459270     144845786    5254    54
original.scafSeq    15289       4964403    5254   532
rescaf.scafSeq       9875       5416552   36403  2579
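
For reference, the columns above can be recomputed with something like the following minimal Python sketch. It assumes plain FASTA input and the common N50 definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length). The function names are only illustrative and are not part of SOAPdenovo.

```python
def parse_fasta_lengths(lines):
    """Return the length of every sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """N50 under the assumed definition; returns 0 for an empty list."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


def summarize(lengths):
    """Collect the four columns used in the summary table."""
    return {
        "count": len(lengths),
        "total_length": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }
```

To use it on a real file, pass an open file handle, for example `summarize(parse_fasta_lengths(open("original.scafSeq")))`.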

Does anyone know the explanation for this difference? The only difference between “original.contig” and “rescaf.contig” is the lack of “cvg_X.X_tip_X” in the latter. If any information on the accompanying files is needed, I can provide that as well.
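
In case anyone wants to check that claim on their own data, here is a rough Python sketch of one way to compare the two contig files while ignoring the extra header fields. It assumes the first whitespace-delimited token on each FASTA header line is the contig ID, which may not hold for every SOAPdenovo version; the function names are hypothetical.

```python
def read_sequences(lines):
    """Map contig ID (assumed: first token of the header) to its sequence."""
    seqs = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs


def differing_ids(lines_a, lines_b):
    """Return sorted IDs that are missing from one file or differ in sequence."""
    a = read_sequences(lines_a)
    b = read_sequences(lines_b)
    diffs = []
    for contig_id in set(a) | set(b):
        # Compare case-insensitively; a missing ID counts as a difference.
        if a.get(contig_id, "").upper() != b.get(contig_id, "").upper() \
                or (contig_id in a) != (contig_id in b):
            diffs.append(contig_id)
    return sorted(diffs)
```

An empty list from `differing_ids(open("original.contig"), open("rescaf.contig"))` would mean the sequences match and only the header annotations differ.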

Regards, Keith