abyss-1.5.2 problem with memory

Luca Cozzuto

Oct 3, 2014, 6:35:39 AM

to abyss...@googlegroups.com

Dear all,

I noticed that abyss is consuming much more RAM than in the past.

Using the same command line and the same dataset, abyss 1.3.7 finished the assembly with a peak of 40 GB, while with 1.5.2 I was unable to assemble even with 120 GB available.

Any hint on this?

Thank you

Luca

Anthony Raymond

Oct 3, 2014, 5:36:30 PM

to Luca Cozzuto, abyss...@googlegroups.com

Hi Luca,

I can't think of any reason why this would happen. Please send me the log output of both assemblies.

Thanks,

Tony

Anthony Raymond

Oct 3, 2014, 5:41:22 PM

to Luca Cozzuto, abyss...@googlegroups.com

Just wanted to add that the log should be generated with the verbose option of abyss-pe. This will show memory usage of many of the programs in abyss. In case you didn't run the assemblies with this option enabled, please rerun like so:

abyss-pe v=-v …

Thanks,

Tony

Luca Cozzuto

Oct 6, 2014, 7:29:56 AM

to abyss...@googlegroups.com, lucac...@gmail.com

Dear Tony,

I'm running the two versions of abyss with 120 GB and 16 CPUs.

Here is a "tail" of the log before the program crashes (1.5.2):

==> test.log <==

0: Read 8900000 reads. 0: Hash load: 232351264 / 536870912 = 0.433 using 8.09 GB

0: Read 8700000 reads. 0: Hash load: 229847765 / 536870912 = 0.428 using 8 GB

0: Read 8900000 reads. 0: Hash load: 232351264 / 536870912 = 0.433 using 8.09 GB

0: Read 8700000 reads. 0: Hash load: 229847765 / 536870912 = 0.428 using 8 GB

0: Read 8100000 reads. 0: Hash load: 221936385 / 536870912 = 0.413 using 7.74 GB

0: Read 8900000 reads. 0: Hash load: 232351264 / 536870912 = 0.433 using 8.09 GB

0: Read 8200000 reads. 0: Hash load: 223297651 / 536870912 = 0.416 using 7.78 GB

0: Read 8700000 reads. 0: Hash load: 229847765 / 536870912 = 0.428 using 8 GB

And here is a tail of the log from the old version (1.3.7), which is still running (but I already know it will finish):

==> test_old.log <==

0: Read 111600000 reads. 0: Hash load: 35823087 / 268435456 = 0.133 using 1.4 GB

1: Read 109300000 reads. 1: Hash load: 35238609 / 268435456 = 0.131 using 1.38 GB

0: Read 111700000 reads. 0: Hash load: 35831868 / 268435456 = 0.133 using 1.4 GB

1: Read 109400000 reads. 1: Hash load: 35247089 / 268435456 = 0.131 using 1.38 GB

0: Read 111800000 reads. 0: Hash load: 35840588 / 268435456 = 0.134 using 1.4 GB

1: Read 109500000 reads. 1: Hash load: 35254985 / 268435456 = 0.131 using 1.38 GB

0: Read 111900000 reads. 0: Hash load: 35849269 / 268435456 = 0.134 using 1.4 GB

1: Read 109600000 reads. 1: Hash load: 35263042 / 268435456 = 0.131 using 1.38 GB

0: Read 112000000 reads. 0: Hash load: 35857394 / 268435456 = 0.134 using 1.4 GB

1: Read 109700000 reads. 1: Hash load: 35271415 / 268435456 = 0.131 using 1.38 GB

Do you need anything else?

Thanks again

Luca
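
For readers comparing the two excerpts above, here is a minimal Python sketch that pulls the fields out of the "Hash load" lines and summarizes them. The regular expression, the function names and the two sample lists are illustrative assumptions; only the log lines themselves come from this thread, and this is not a tool shipped with ABySS.

```python
import re

# Matches lines such as:
# "0: Read 8900000 reads. 0: Hash load: 232351264 / 536870912 = 0.433 using 8.09 GB"
LINE_RE = re.compile(
    r"^(?P<rank>\d+): Read (?P<reads>\d+) reads\. \d+: Hash load: "
    r"(?P<used>\d+) / (?P<slots>\d+) = [\d.]+ using (?P<mem>[\d.]+) GB"
)

def parse_hash_load(line):
    """Return the fields of one 'Hash load' line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "rank": int(m["rank"]),      # the number printed before the colon
        "reads": int(m["reads"]),
        "used": int(m["used"]),      # occupied hash table slots
        "slots": int(m["slots"]),    # total hash table slots
        "mem_gb": float(m["mem"]),
    }

def summarize(lines):
    """Collect the distinct rank prefixes, the table size and the peak memory."""
    rows = [r for r in map(parse_hash_load, lines) if r is not None]
    return {
        "ranks": sorted({r["rank"] for r in rows}),
        "slots": max(r["slots"] for r in rows),
        "peak_mem_gb": max(r["mem_gb"] for r in rows),
    }

# A few lines copied from the two excerpts in this message.
new_log = [
    "0: Read 8900000 reads. 0: Hash load: 232351264 / 536870912 = 0.433 using 8.09 GB",
    "0: Read 8700000 reads. 0: Hash load: 229847765 / 536870912 = 0.428 using 8 GB",
    "0: Read 8100000 reads. 0: Hash load: 221936385 / 536870912 = 0.413 using 7.74 GB",
]
old_log = [
    "0: Read 111600000 reads. 0: Hash load: 35823087 / 268435456 = 0.133 using 1.4 GB",
    "1: Read 109300000 reads. 1: Hash load: 35238609 / 268435456 = 0.131 using 1.38 GB",
]

print("1.5.2:", summarize(new_log))
print("1.3.7:", summarize(old_log))
```

On these excerpts, every 1.5.2 line carries the prefix 0 and a table of 536870912 slots, while the 1.3.7 lines carry prefixes 0 and 1 and a table of 268435456 slots. The thread does not establish what causes that difference, so the summary is only a way to make the comparison explicit.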

Luca Cozzuto

Oct 7, 2014, 5:34:26 AM

to abyss...@googlegroups.com

Dear all,

It looks like a memory problem, since every process ends up asking for 8 GB of RAM or more, while the previous version does not.

Example:

0: Read 8700000 reads. 0: Hash load: 229847765 / 536870912 = 0.428 using 8 GB

Any hint on this?

Thanks
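
A quick back-of-the-envelope check of the figures in this message, as a hedged Python sketch. It assumes one process per CPU (16, as mentioned earlier in the thread) and that each process holds about 8 GB at the point shown in the log; both the one-process-per-CPU mapping and the function name are assumptions for illustration.

```python
def total_memory_gb(processes, gb_per_process):
    """Total resident memory if every process holds the same amount."""
    return processes * gb_per_process

PROCESSES = 16         # assumption: one process per CPU mentioned in the thread
PER_PROCESS_GB = 8.0   # from the log line "... using 8 GB"
AVAILABLE_GB = 120     # memory available on the machine

total = total_memory_gb(PROCESSES, PER_PROCESS_GB)
print(f"{PROCESSES} x {PER_PROCESS_GB} GB = {total} GB (available: {AVAILABLE_GB} GB)")
print("fits" if total <= AVAILABLE_GB else "does not fit")
```

Under these assumptions the processes together would already need about 128 GB, which is more than the 120 GB available, and the hash table is only about 43% full at that point in the log.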

Anthony Raymond

Oct 7, 2014, 2:36:58 PM

to Luca Cozzuto, abyss...@googlegroups.com

Hi Luca,

Sorry for not replying yesterday, I was away sick.

How exactly (commands and all) did you compile both ABySS versions? Please send me the commands you used to run both ABySS versions too.

Thanks,

Tony

Luca Cozzuto

Oct 8, 2014, 3:56:30 AM

to abyss...@googlegroups.com, lucac...@gmail.com

Hi Tony,

take care! :)

Here is my configuration, which is the same for both versions:

./configure --prefix=$HOME/abyss --with-mpi=/usr/include/openmpi-x86_64/ CPPFLAGS=-I/software/bi/el6.3/sparsehash-2.0.2/include --enable-maxk=96

(I also used 63 as maxk but I had the same problem)

Then the command line:

abyss-pe v=-v name=test k=23 n=10 np=16 l=30 ALIGNER_OPTIONS='-k24' lib='pe1' pe1='fastq1.fq fastq2.fq' > test.log

I tried several k values to reduce the memory usage...

Thanks again,

Luca

Anthony Raymond

Oct 9, 2014, 2:00:10 PM

to Luca Cozzuto, abyss...@googlegroups.com

OK, I had a hypothesis that this was related to the maxk, but that's not the case.

Please send me the whole test.log (in a private email if you'd like).

Thanks,

Tony

marc

Dec 15, 2014, 2:57:52 PM

to abyss...@googlegroups.com, lucac...@gmail.com

Hello,

Just wondering, was this problem ever resolved?

Anthony Raymond

Dec 15, 2014, 4:05:40 PM

to marc, abyss...@googlegroups.com, lucac...@gmail.com

Hi Marc,

Not that I'm aware of. Are you seeing the same issue?

Thanks,

Tony

marc

Dec 15, 2014, 4:43:42 PM

to abyss...@googlegroups.com, mto...@asu.edu, lucac...@gmail.com

I am having what I think is a similar issue with both versions 1.3.7 and 1.5.2.

To compare contig assemblies of different kmers, I generally run the unitig step (using mpirun -np and ABYSS-P) on many cores across a few WestmereEP computers with 96 GB RAM each, and when they are finished I run abyss-fac to obtain N50 statistics and such. For paired-end assembly, I run abyss-pe with the best performing kmer value. I have access to a large shared memory computer with Intel(R) Xeon(R) CPU X7560 @ 2.27GHz and 1 TB of RAM. However, the hash table loading step seems to take a very long time, which I know it should, but it also seems to use very little of the available RAM, so I am wondering why that could be.

This is the command I run for abyss-pe:

/home/mtollis/genome_assembly/abyss-1.3.7/bin/abyss-pe np=32 k=63 E=0 s=200 n=3 v=-v l=27 name=Aapl20A_k63 C=/scratch/mtollis/Aapl_v3.0/genome/ABYSS/unitig/63 lib='pe180_L2 pe180_L8 pe1kb pe653 pe500' pe180_L2="$PE_180_L2_R1 $PE_180_L2_R2" pe180_L8="$PE_180_L8_R1 $PE_180_L8_R2" pe1kb="$pe1kb_R1 $pe1kb_R2" pe653="$pe653_R1 $pe653_R2" pe500="$pe500bp_R1 $pe500bp_R2" se="$se180_L2_flash $se180_L2_notcomb $se180_L2_S1 $se180_L2_S2 $se180_L8_flash $se180_L8_notcomb $se180_L8_S1 $se180_L8_S1 $se180_L8_S2 $se1kb_S1 $se1kb_S2 $se1kb_S3 $se653kb_S1 $se653kb_S2 $se653kb_S3 $se500bp_S1 $se500bp_S2 $se500bp_S3"

Here is some of the output after it successfully builds the .dist files for two libraries:

Reading from standard input...

Reading `Aapl20A_k63-3.fa'...

Using 1.82 GB of memory and 86.2 B/sequence.

Reading `Aapl20A_k63-3.fa'...

Building the suffix array...

Building the Burrows-Wheeler transform...

Building the character occurrence table...

Read 3.97 GB in 21151601 contigs.

Using 36.7 GB of memory and 9.24 B/bp.

Read 7 alignments. Hash load: 3 / 5 = 0.6 using 545 kB.

Read 10 alignments. Hash load: 6 / 11 = 0.545455 using 545 kB.

Read 484344 alignments. Hash load: 12 / 23 = 0.521739 using 545 kB.

Read 1000000 alignments. Hash load: 2 / 23 = 0.0869565 using 545 kB.

Read 2000000 alignments. Hash load: 2 / 23 = 0.0869565 using 545 kB.

Read 3000000 alignments. Hash load: 2 / 23 = 0.0869565 using 545 kB.

Read 4000000 alignments. Hash load: 4 / 23 = 0.173913 using 545 kB.

Read 5000000 alignments. Hash load: 2 / 23 = 0.0869565 using 545 kB.

Read 6000000 alignments. Hash load: 4 / 23 = 0.173913 using 545 kB.

Read 7000000 alignments. Hash load: 4 / 23 = 0.173913 using 545 kB.

Read 8000000 alignments. Hash load: 0 / 23 = 0 using 545 kB.

Is there any way to know why it is only using 545 kB of RAM?

Marc

Anthony Raymond

Dec 15, 2014, 7:14:50 PM

to marc, abyss...@googlegroups.com, lucac...@gmail.com

Hi Marc,

That log isn't very clear to read. This is because abyss-map and abyss-fixmate are both running at the same time during the log snippet you've attached, and memory is being reported for both. The line "Using 36.7 GB of memory and 9.24 B/bp." is from abyss-map, which is reporting the size of the FM-index of Aapl20A_k63-3.fa in memory. The lines like "Read 8000000 alignments. Hash load: 0 / 23 = 0 using 545 kB." are from abyss-fixmate, which is reporting the amount of memory needed to pair up alignments. abyss-fixmate is using little memory because the read pairs were either interleaved in a single file, or are being given in two separate files.
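
To illustrate why pairing can need so little memory when mates arrive next to each other, here is a toy Python model. It is not abyss-fixmate's actual implementation; the function name and the idea of keying on a read ID are assumptions made for the sketch. An alignment waits in a table only until its mate shows up, so the table stays tiny when mates are adjacent in the input.

```python
def peak_pending(read_ids):
    """Pair alignments by read ID and return the largest number of
    alignments that had to wait in memory for their mate."""
    pending = set()
    peak = 0
    for rid in read_ids:
        if rid in pending:
            # The mate was already seen: the pair is complete and can be dropped.
            pending.remove(rid)
        else:
            # First mate of this pair: it must wait until the other one arrives.
            pending.add(rid)
            peak = max(peak, len(pending))
    return peak

n = 1000
# Mates adjacent in the stream: r0 r0 r1 r1 ...
adjacent = [i for i in range(n) for _ in range(2)]
# All first mates, then all second mates: every first mate has to wait.
separated = list(range(n)) + list(range(n))

print(peak_pending(adjacent))   # 1
print(peak_pending(separated))  # 1000
```

In this model the adjacent ordering never holds more than one waiting alignment, which matches the spirit of the near-empty hash loads in the log above.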

I see that you have many pe libraries. You could increase the amount of computation done at once using the make option --jobs (or -j):

abyss-pe <your parameters> -j5

Note that if you do this and specify np=32, 5 jobs will run at once, each trying to use 32 CPUs. I'd specify j=6 so that you don't oversubscribe your machine. Also, the memory usage of the whole job will increase to ~5 × 37 GB (185 GB).
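
The arithmetic behind that advice can be sketched in a few lines of Python. The core count of 32 is an assumption taken from the np=32 in Marc's command (the thread does not state how many cores the machine has), and the function name is hypothetical.

```python
def plan(make_jobs, threads_per_job, cores, gb_per_job):
    """Estimate total threads and memory when several jobs run at once."""
    threads = make_jobs * threads_per_job
    return {
        "threads": threads,
        "oversubscribed": threads > cores,
        "memory_gb": make_jobs * gb_per_job,
    }

CORES = 32       # assumption: taken from np=32, not stated in the thread
GB_PER_JOB = 37  # rounded from "Using 36.7 GB of memory" in the log

# -j5 with 32 threads per job: 160 threads on 32 cores.
print(plan(5, 32, CORES, GB_PER_JOB))
# -j5 with j=6: 30 threads, which fits within 32 cores.
print(plan(5, 6, CORES, GB_PER_JOB))
```

Either way the memory estimate is 5 × 37 GB = 185 GB, since each concurrent job loads its own index.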