Duplication between the contig edges

whchung

Jan 13, 2012, 6:32:04 AM
to ABySS
Dear all,

The latest version of ABySS (1.3.2) gave good results for one of my
microbial genome projects.

On closer analysis of the contigs, I found that the ends of the contigs
duplicate one another; the duplications range from hundreds to thousands.

I think this happens because each contig is extended too far.

I do not know how to reduce this duplication.

Any explanation or suggestion would be very helpful for my work.

Best,

Won-Hyong Chung
Korean Bioinformation Center, South Korea

Shaun Jackman

Jan 24, 2012, 5:12:28 PM
to whchung, ABySS
Hi Won-Hyong,

With a microbial assembly, ABySS will output lots of duplicated
overlapping sequence. You have a couple of options. The first is to not use
the paired-end assembly. The file ${name}-3.fa should have much less
duplication, as overlapping contigs will overlap by exactly k-1 bp.
Next, you can increase the parameter s, which will result in smaller
contigs but less duplication. Finally, the file ${name}-contigs.dot
indicates which contigs overlap and by how much.
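
To make the last point concrete, here is a minimal sketch of listing the overlaps recorded in a dot file. It assumes each edge is written as `"u" -> "v" [d=N]`, with a negative `d` meaning the two contigs overlap by `-d` bp; check this against your own ${name}-contigs.dot before relying on it. The function name `find_overlaps` and the sample graph are hypothetical and only for illustration.

```python
import re

# Assumed edge syntax: "12+" -> "15-" [d=-30]
EDGE_RE = re.compile(r'"([^"]+)"\s*->\s*"([^"]+)"\s*\[([^\]]*)\]')
# Assumed attribute: d=<distance>; a negative distance is taken as an overlap.
DIST_RE = re.compile(r'\bd=(-?\d+)')


def find_overlaps(dot_text, min_overlap=1):
    """Return (source, target, overlap_bp) for edges with a negative distance."""
    overlaps = []
    for match in EDGE_RE.finditer(dot_text):
        src, dst, attrs = match.groups()
        dist_match = DIST_RE.search(attrs)
        if dist_match is None:
            continue  # edge carries no distance attribute
        dist = int(dist_match.group(1))
        if dist < 0 and -dist >= min_overlap:
            overlaps.append((src, dst, -dist))
    return overlaps


# Hypothetical sample graph, not taken from a real assembly.
sample = '''digraph g {
"0+" [l=5000]
"0+" -> "1+" [d=-63]
"1+" -> "2-" [d=-850]
"2-" -> "3+" [d=12]
}'''

for src, dst, bp in find_overlaps(sample, min_overlap=100):
    print(f"{src} overlaps {dst} by {bp} bp")
```

Raising `min_overlap` filters out short overlaps so that only the long duplicated ends are reported. To run it on a real assembly, read the file contents into a string and pass that string to `find_overlaps`.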

Sorry for the delay in my reply. Google Groups stopped sending me email
notifications of messages waiting for moderation.

Cheers,
Shaun
