ABySS assembly stuck at 'generating adjacency' during de novo whole-genome assembly of about 300 GB of data


Manmohan Pandey

Jun 26, 2015, 11:50:49 AM
to abyss...@googlegroups.com
I am running a de novo whole-genome assembly using about 300 GB of Illumina NextSeq paired-end data for a eukaryotic genome of approximately 1.45 Gb (genome size). My server has 512 GB of RAM, a 7.5 TB hard disk, and 64 processors. My problem is that the assembly has been stuck at the 'generating adjacency' step for the past 30-35 days. I started the assembly on 7 May, so it has now been running for around 48 days. I am unsure whether to abort the assembly or let it continue. Kindly guide me. I am not using the MPI option since we do not have a cluster. I would also like to know whether there is any procedure or code to speed up the assembly.

Ben Vandervalk

Jun 26, 2015, 12:49:04 PM
to Manmohan Pandey, abyss...@googlegroups.com

Hi @way2manmohan,

Wow... more than 30 days!  Something is wrong for sure.  (I have assembled human genomes with the MPI version, and the first stage typically takes about 24 hours.)

@cyril-cros (on BioStar post https://www.biostars.org/p/148276/#148355) may be right about running out of RAM.  It is good advice to watch the process in "top" or "htop".  By "stuck", do you mean that you are no longer seeing any new messages in the log output?  It would be helpful to post your full log output to a github gist and link to it here or on your BioStar post.  Also, please enable the verbose option, if possible (add "v=-v" to the abyss-pe command line).
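As a rough sketch of the "watch the process" advice, the snippet below prints the resident memory and elapsed run time of a process using `ps`. The helper name `proc_usage` is made up for illustration, and the process name `ABYSS` is an assumption; check `top` or `htop` for the name actually shown on your system.

```shell
# proc_usage is a hypothetical helper: print resident memory (GiB) and
# elapsed run time for one PID. ps reports rss in KiB, hence the division.
proc_usage() {
    ps -o rss= -o etime= -p "$1" |
        awk '{ printf "rss_gib=%.2f elapsed=%s\n", $1 / 1048576, $2 }'
}

# Assumption: the assembler shows up as a process named ABYSS.
# If pgrep finds nothing, the loop simply prints nothing.
for pid in $(pgrep -x ABYSS); do
    proc_usage "$pid"
done
```

If the reported memory sits close to the 512 GB of RAM on the machine, that would be consistent with the out-of-RAM explanation; running `free -g` alongside shows whether swap is in use.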

It is possible to use the MPI version even on a single machine, and it will speed things up greatly.  When using the non-MPI version, you are only using 1 of 64 available cores.  To use the MPI version you just need to have a recent OpenMPI library installed on your system (along with the OpenMPI development headers).  If you have a system administrator, they should be able to do that for you very easily.
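For illustration, a single-machine MPI run might look roughly like the sketch below. The assembly name, the k value, and the read file names are placeholders rather than details from this thread, and `np=64` simply assumes one MPI process per core; a smaller value may be more sensible depending on memory. The `assemble` wrapper is a made-up helper so that the command can be printed before it is run.

```shell
# Placeholder settings: replace with your own assembly name, k, and reads.
name=asm
k=64
np=64   # number of MPI processes (assumed: one per core)
reads='reads_1.fastq.gz reads_2.fastq.gz'

# Warn if no MPI launcher is on the PATH (OpenMPI provides mpirun).
command -v mpirun >/dev/null 2>&1 || echo "mpirun not found; install OpenMPI first" >&2

# assemble is a hypothetical wrapper. Any arguments are used as a command
# prefix, so "assemble echo" prints the abyss-pe command instead of running it.
assemble() {
    "$@" abyss-pe name="$name" k="$k" np="$np" v=-v in="$reads"
}

assemble echo   # dry run: show the command
# assemble      # uncomment to start the assembly
```

The `v=-v` argument is the verbose option mentioned above, so the log should show progress messages while the MPI stage runs.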

- Ben



