Assembly size reduced after Sealer



Jul 7, 2017, 2:07:22 PM
to ABySS

I am running Sealer on a large genome assembly (1.4 GB). The run completes without any issues. However, the assembly size after Sealer has decreased by 4.6 MB, and the length of the largest contig has decreased as well. I do not understand what Sealer is doing internally that would reduce the assembly size; I was expecting it to fill the N's in the assembly. I thought Sealer might be joining contigs, but the number of contigs is the same before and after. Can anyone please explain the reason for the reduction in assembly size after Sealer?

Input to Sealer:    293,653 scaffolds in 1.468 GB with maximum contig size 1.068 MB.
Output from Sealer: 293,653 scaffolds in 1.462 GB with maximum contig size 1.067 MB.


Ben Vandervalk

Jul 7, 2017, 2:17:33 PM
to Sagar, ABySS
Hi Sagar,

Sealer fills in gaps (sequences of 'N's) in the input assembly contigs/scaffolds, replacing the N's with the real sequence.  It does not join contigs/scaffolds together, so the number of input and output contigs/scaffolds should always be the same.

Gap sizes are usually estimated by the assembler/scaffolder, so they may contain some error with respect to the true gap lengths. Sealer corrects the gap lengths as it fills in the sequence in the contigs. Most likely, the overall assembly size decreased because the upstream assembler/scaffolder overestimated the gap sizes.
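
One way to check this explanation is to compare the number of sequences, the total length, and the total number of N's before and after gap filling. The following is a minimal sketch under stated assumptions, not part of Sealer or ABySS: the function names (read_fasta, summarize) and the toy sequences are hypothetical, and the input is assumed to be plain FASTA.

```python
import io
import re

GAP = re.compile(r"[Nn]+")  # a gap is any run of consecutive N's


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def summarize(handle):
    """Count sequences, total bases, gap runs, and gap bases."""
    stats = {"sequences": 0, "total_bp": 0, "longest_bp": 0,
             "gaps": 0, "gap_bp": 0}
    for _, seq in read_fasta(handle):
        stats["sequences"] += 1
        stats["total_bp"] += len(seq)
        stats["longest_bp"] = max(stats["longest_bp"], len(seq))
        for match in GAP.finditer(seq):
            stats["gaps"] += 1
            stats["gap_bp"] += len(match.group())
    return stats


# Toy data (hypothetical): a gap estimated at 10 N's is closed with 3 bases.
before = summarize(io.StringIO(">s1\nACGT" + "N" * 10 + "ACGT\n>s2\nAAAA\n"))
after = summarize(io.StringIO(">s1\nACGTGGGACGT\n>s2\nAAAA\n"))

for key in before:
    print(f"{key:>10}: {before[key]:>4} -> {after[key]:>4}")
```

In the toy data the sequence count is unchanged while the total length drops by 7 bases, because 10 N's were replaced by 3 real bases. On real data, pass the opened input and output FASTA files to summarize() in the same way; if the gap sizes were overestimated, the drop in gap_bp should be larger than the drop in total_bp.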


- Ben



Jul 10, 2017, 9:07:39 AM
to ABySS
Thank you very much for the quick reply. This makes perfect sense. 