Hi ZH,
Sorry if this was answered already, but the final assembly you are looking for is the unitigs.fa file. The bubbles.fa and indel.fa contain variant sequences, which were removed to make the overall assembly more contiguous.
Cheers,
Tony
________________________________________
From: abyss...@googlegroups.com [abyss...@googlegroups.com] On Behalf Of ZH [zh9...@gmail.com]
Sent: Monday, July 22, 2013 12:48 PM
To: abyss...@googlegroups.com
Cc: ZH
Subject: Re: ABYSS single-end assembly result
Sorry, I forgot to thank you for the useful message.
By the way, the result files I got include several .fa files named bubbles.fa, indel.fa and unitigs.fa. The unitigs.fa file is the final result, right? The sequences I used for the assembly are hundreds or even thousands of bases long; they are actually not reads but partial genome sequences of a species. So how can I get the whole assembled sequence from the result? In the result, the sequences are still in separate pieces.
ZH
On Wednesday, July 17, 2013 4:58:00 PM UTC-6, Ben Vandervalk wrote:
Hi ZH,
Kmer coverage is an approximation of read coverage based on kmers (sequences of length k). It is the "number of read kmers per contig kmer".
The formula for kmer coverage is:
Ck = sum (multiplicity(kmer_i)) / (L - k + 1)
where
Ck = kmer coverage
multiplicity(kmer_i) = number of times the i-th kmer of the contig occurs in the reads
L = length of contig
k = kmer size
and the sum runs from i = 1 to i = (L - k + 1), i.e. over every kmer in the contig.
I made another error in my previous reply -- the third number is actually not KMER_COVERAGE, but the numerator of the formula above, i.e. sum(multiplicity(kmer_i)).
Kmer coverage is a useful approximation for read coverage because it can be computed without aligning the reads to the assembly.
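As a rough illustration of the formula above, here is a minimal Python sketch. It is not how ABySS computes the value internally; the function names are made up for this example, and it ignores reverse complements and sequencing errors for simplicity. The second function assumes you already have the contig length and the kmer multiplicity sum (the "third number" mentioned above).

```python
from collections import Counter


def count_read_kmers(reads, k):
    """Count how many times each kmer occurs across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_coverage(contig, reads, k):
    """Ck = sum(multiplicity(kmer_i)) / (L - k + 1), straight from the formula.

    Simplification (an assumption of this sketch): kmers are matched on the
    forward strand only.
    """
    n_kmers = len(contig) - k + 1  # number of kmers in the contig
    if n_kmers <= 0:
        raise ValueError("contig is shorter than k")
    counts = count_read_kmers(reads, k)
    total = sum(counts[contig[i:i + k]] for i in range(n_kmers))
    return total / n_kmers


def coverage_from_sum(length, multiplicity_sum, k):
    """Recover Ck when the contig length and the multiplicity sum are known."""
    return multiplicity_sum / (length - k + 1)


if __name__ == "__main__":
    contig = "ACGTAC"
    reads = ["ACGTA", "CGTAC", "ACGTAC"]
    print(kmer_coverage(contig, reads, k=3))
    print(coverage_from_sum(length=6, multiplicity_sum=10, k=3))
```

With k = 3 the toy contig has 4 kmers (ACG, CGT, GTA, TAC), which occur 2, 3, 3 and 2 times in the reads, so both calls give 10 / 4 = 2.5.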
- Ben