Segmentation fault with SimpleGraph when running abyss-pe


maheshi dassanayake

Mar 6, 2011, 2:24:31 PM
to ABySS
Dear ABySS developers/users,
I was trying to run abyss-pe with an input read set of single and paired-end reads with multiple insert sizes (FASTA reads from 454 and FASTQ reads from Illumina). I'm using ABySS 1.2.6. The genome size is about 140 Mb and I have ~30x coverage.
The command I used is:
mpirun -np 4 abyss-pe k=31 n=6 s=300 name=tp lib='lib1 lib2 lib3 lib4' lib1='per86_1_1.cor.fq per86_1_2.cor.fq' lib2='pe3kb.fa' lib3='pe8kb.fa' lib4='pe20kb.fa' se='s678.cor.fq per86_1_1.cor_single.fq per86_1_2.cor_single.fq se_l.fa s_from_pe.fa'

I get a segmentation fault with SimpleGraph:

abyss-joindist lib1-3.dist lib2-3.dist lib3-3.dist lib4-3.dist >tp-3.dist
Overlap   -k31 -g tp-4.adj -o tp-4.fa tp-3.fa tp-3.adj tp-3.dist
abyss-joindist lib1-3.dist lib2-3.dist lib3-3.dist lib4-3.dist >tp-3.dist
Overlap: 1067
Scaffold: 582
No overlap: 406
Insignificant (<5bp): 388
Homopolymer: 1198
Motif: 22
Ambiguous: 163
SimpleGraph   -j2 -k31 -o tp-4.path1 tp-4.adj tp-3.dist
Overlap   -k31 -g tp-4.adj -o tp-4.fa tp-3.fa tp-3.adj tp-3.dist
make: *** [tp-4.path1] Segmentation fault
make: *** Deleting file `tp-4.path1'
Overlap: 1316
Scaffold: 678
No overlap: 636
Insignificant (<5bp): 611
Homopolymer: 1932
Motif: 29
Ambiguous: 198
SimpleGraph   -j2 -k31 -o tp-4.path1 tp-4.adj tp-3.dist
make: *** [tp-4.path1] Segmentation fault
make: *** Deleting file `tp-4.path1'
abyss-joindist lib1-3.dist lib2-3.dist lib3-3.dist lib4-3.dist >tp-3.dist
Overlap   -k31 -g tp-4.adj -o tp-4.fa tp-3.fa tp-3.adj tp-3.dist
Overlap: 1316
Scaffold: 678
No overlap: 636
Insignificant (<5bp): 611
Homopolymer: 1932
Motif: 29
Ambiguous: 198
SimpleGraph   -j2 -k31 -o tp-4.path1 tp-4.adj tp-3.dist
make: *** [tp-4.path1] Segmentation fault
make: *** Deleting file `tp-4.path1'
abyss-joindist lib1-3.dist lib2-3.dist lib3-3.dist lib4-3.dist >tp-3.dist
Overlap   -k31 -g tp-4.adj -o tp-4.fa tp-3.fa tp-3.adj tp-3.dist
Overlap: 1316
Scaffold: 678
No overlap: 636
Insignificant (<5bp): 611
Homopolymer: 1932
Motif: 29
Ambiguous: 198
SimpleGraph   -j2 -k31 -o tp-4.path1 tp-4.adj tp-3.dist
make: *** [tp-4.path1] Segmentation fault
make: *** Deleting file `tp-4.path1'

Any comments or feedback on how to avoid this would be appreciated.
Thanks.
Maheshi.

maheshi dassanayake

Mar 8, 2011, 10:27:24 AM
to ABySS
Dear ABySS developers and users,
I get the same error in my repeated attempts to run ABySS on my current data set. I have used it successfully a number of times before.
The error message I get is:

make: *** [tp-4.path1] Segmentation fault
make: *** Deleting file `tp-4.path1'

The command line I used:
$ mpirun -np 4 abyss-pe -j4 k=33 n=5 c=6 s=300 v=-v name=tp lib='lib1 lib2 lib3 lib4' lib1='per86_1_1.cor.fq per86_1_2.cor.fq' lib2='pe3kb.fa' lib3='pe8kb.fa' lib4='pe20kb.fa' se='s678.cor.fq per86_1_1.cor_single.fq per86_1_2.cor_single.fq se_l.fa s_from_pe.fa'

Any feedback would be very helpful.
 

Here's a bit more about my input data. I have reads from both 454 and Illumina. The 454 reads are trimmed to the bases recommended by the 454 software and converted to FASTA sequences. I have three 454 PE libraries with three insert sizes (3, 8, and 20 kb), and also 454 shotgun reads treated the same way. The Illumina reads are corrected with the program Quake, and I have 1 kb PE and single-end reads.


The machine I'm using is 64-bit, with 48 processors and a total of 250 GB RAM. The combined input file size is 35.68 GB. Do you think a segmentation fault could occur due to system capacity limitations?

The target genome is ~140Mb.

If this keeps failing, I would have to try something like assembling the 454 data with a very high k value, assembling the Illumina reads separately with a smaller k, and then trying to combine the two assemblies later. But that would not make the best use of the hybrid sequencing, so I really hope I can solve this problem.

I would really appreciate it if anybody who has come across a similar issue could share how it was solved, or if the developers could give me some insight. I'm attaching the detailed messages from the failed run with the -v option.

Maheshi.

 
abyssrun_log.txt

Shaun Jackman

Mar 8, 2011, 1:37:13 PM
to maheshi dassanayake, ABySS
Hi Maheshi,

Can you report the output of
ulimit -s

Try increasing the stack size limit like so:
ulimit -s unlimited
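
Before changing anything, it can help to look at both the soft limit (what the shell currently enforces) and the hard limit (the ceiling the soft limit may be raised to). A minimal sketch, assuming a POSIX-style shell such as bash or dash on Linux; the variable names are only illustrative:

```shell
# Query the soft and hard stack-size limits of the current shell.
# Values are reported in KiB, or as the word "unlimited".
soft=$(ulimit -S -s)
hard=$(ulimit -H -s)
echo "soft stack limit: $soft"
echo "hard stack limit: $hard"
```

The limit applies to the current shell and the processes it starts, so it would need to be set in the same shell session that launches abyss-pe.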

Cheers,
Shaun

maheshi dassanayake

Mar 8, 2011, 1:48:39 PM
to Shaun Jackman, ABySS
Hi Shaun,
It was set to:
$ ulimit -s
8192

Perhaps this is too low? Anyway, I'll try it with -s unlimited. Thanks a lot for the suggestion.
Maheshi.

maheshi dassanayake

Mar 9, 2011, 8:26:15 AM
to ABySS
Hi Shaun,
I get the same error even after changing the stack size limit to unlimited. I wouldn't think so, but could it be that I have too many different types of reads, or too much input data, for this assembly to be viable on the 250 GB machine I'm running it on? Also, I haven't removed any duplicate reads from my input set. Is that something others do as pre-assembly processing?
Thanks.
Maheshi.


On Tue, Mar 8, 2011 at 1:37 PM, Shaun Jackman <sjac...@bcgsc.ca> wrote:

John Donners

Mar 9, 2011, 9:57:38 AM
to maheshi dassanayake, ABySS
Hi Maheshi,

SimpleGraph is multi-threaded, so it could be somewhat more complicated to
change the stack size.
ulimit -s can be used to set the stack size of a process, but not of a thread.
Each thread of a process gets its own stack.
The stack size of threads is determined by the compiler (in case of OpenMP) or
the library (in case of pthreads). See also:

http://stackoverflow.com/questions/2340093/how-is-stack-size-of-process-on-linux-related-to-pthread-fork-and-exec

so either you could set a higher limit with ulimit (but not unlimited), or add a call to
pthread_attr_setstacksize.
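
One likely reason a finite value behaves differently from 'unlimited': on Linux with glibc, as described in the pthread_create(3) man page, new threads take their default stack size from the soft stack limit in effect when the program starts, unless that limit is 'unlimited', in which case a fixed built-in default is used instead. A minimal sketch of the ulimit route, assuming a POSIX-style shell; run_with_stack is a hypothetical helper name, not part of ABySS:

```shell
# Hypothetical helper: run a command with a finite soft stack limit (in KiB).
# The limit is set inside a subshell, so the calling shell keeps its own limit.
run_with_stack() {
    limit_kib=$1
    shift
    (
        # Give up if the requested value exceeds the hard limit.
        ulimit -S -s "$limit_kib" || exit 1
        "$@"
    )
}
```

For example, `run_with_stack 1048576 SimpleGraph -j2 -k31 -o tp-4.path1 tp-4.adj tp-3.dist` would run SimpleGraph with a 1 GiB stack limit. The value that is actually needed depends on the data set, so treat the number as a placeholder.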

Cheers,
John

Shaun Jackman

Mar 9, 2011, 12:56:09 PM
to maheshi dassanayake, ABySS
Hi Maheshi,

On the topic of multithreading, try disabling multithreading when
running SimpleGraph by setting the option -j1. Enable core dumps like
so:
ulimit -c unlimited
run SimpleGraph until it crashes, and get a backtrace from the core
dump:
gdb SimpleGraph core.*
bt
which you can post on the mailing list.
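
The steps above can be wrapped so that the core-dump setting does not leak into the rest of the session. A minimal sketch, assuming a POSIX-style shell; with_core_dumps is a hypothetical helper name, and where the core file ends up depends on the system's core-dump configuration:

```shell
# Hypothetical helper: run a command with core dumps enabled.
# The limit is raised inside a subshell, so the calling shell is unaffected.
with_core_dumps() {
    (
        # Warn instead of aborting if the hard limit forbids core files.
        ulimit -S -c unlimited 2>/dev/null ||
            echo "warning: could not enable core dumps" >&2
        "$@"
    )
}
```

Usage would look like `with_core_dumps SimpleGraph -j1 -k31 -o tp-4.path1 tp-4.adj tp-3.dist`. After a crash, `gdb -batch -ex bt SimpleGraph core.*` prints the backtrace without an interactive session.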

Cheers,
Shaun

maheshi dassanayake

Mar 10, 2011, 11:40:01 AM
to ABySS
Hi Shaun,
I'm copying the backtrace from gdb below. Any help to proceed from here would be really appreciated.
Thanks.
maheshi.

Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `SimpleGraph -v -j1 -k41 -o tp-4.path tp-4.adj tp-3.dist'.
Program terminated with signal 11, Segmentation fault.
#0  constrainedSearch (g=..., u=..., constraints=..., nextConstraint=...,
    satisfied=<value optimized out>, path=..., solutions=..., distance=24457,
    visitedCount=@0x2aea519beaac) at ConstrainedSearch.cpp:54
54      {
(gdb) bt
#0  constrainedSearch (g=..., u=..., constraints=..., nextConstraint=...,
    satisfied=<value optimized out>, path=..., solutions=..., distance=24457,
    visitedCount=@0x2aea519beaac) at ConstrainedSearch.cpp:54
#1  0x000000000040e006 in constrainedSearch (g=..., u=<value optimized out>,
    constraints=<value optimized out>, nextConstraint=...,
    satisfied=<value optimized out>, path=..., solutions=..., distance=24497,
    visitedCount=@0x2aea519beaac) at ConstrainedSearch.cpp:99
#2  0x000000000040e006 in constrainedSearch (g=..., u=<value optimized out>,
    constraints=<value optimized out>, nextConstraint=...,
    satisfied=<value optimized out>, path=..., solutions=..., distance=24494,
    visitedCount=@0x2aea519beaac) at ConstrainedSearch.cpp:99
#3  0x000000000040e006 in constrainedSearch (g=..., u=<value optimized out>,
    constraints=<value optimized out>, nextConstraint=...,
    satisfied=<value optimized out>, path=..., solutions=..., distance=24493,
    visitedCount=@0x2aea519beaac) at ConstrainedSearch.cpp:99
#4  0x000000000040e006 in constrainedSearch (g=..., u=<value optimized out>,
    constraints=<value optimized out>, nextConstraint=...,
    satisfied=<value optimized out>, path=..., solutions=..., distance=24490,
    visitedCount=@0x2aea519beaac) at ConstrainedSearch.cpp:99
#5  0x000000000040e006 in constrainedSearch (g=..., u=<value optimized out>,
    constraints=<value optimized out>, nextConstraint=...,
    satisfied=<value optimized out>, path=..., solutions=..., distance=24489,
    visitedCount=@0x2aea519beaac) at ConstrainedSearch.cpp:99

Shaun Jackman

Mar 10, 2011, 4:28:28 PM
to maheshi dassanayake, ABySS
Hi Maheshi,

Could you send me the files
tp-4.adj tp-3.dist
compressed in an email off the list?

Thanks,
Shaun

Shaun Jackman

Mar 10, 2011, 6:53:32 PM
to John Donners, maheshi dassanayake, ABySS
Hi Maheshi,

I ran SimpleGraph on my machine here and it ran to completion with no
trouble. So, the most likely explanation is that you are running out of
stack space -- which I wish gave a more informative error message than a
segmentation fault. I'm not sure why your default thread stack size is
different than my default thread stack size, or how you change it. Which
compiler are you using?

I noticed that a number of path searches resulted in multiple valid paths, which
often means that a lot of bubbles remain in your data.
Try decreasing p (the minimum identity) to 0.8 (the default is 0.9).
Make sure that you're using ABySS 1.2.6.

Cheers,
Shaun

$ time SimpleGraph -k41 -o tp-4.path tp-4.adj tp-3.dist
Total paths attempted: 85032
Unique path: 14785
No possible paths: 36460
No valid paths: 530
Repetitive: 50
Multiple valid paths: 10461
Too many solutions: 1469
Too complex: 21277

The minimum number of pairs in a distance estimate is 5.
The minimum number of pairs used in a path is 5.

real 5m40.751s
user 2m9.387s
sys 0m0.566s

On Wed, 2011-03-09 at 06:57 -0800, John Donners wrote:

Shaun Jackman

Mar 10, 2011, 8:32:14 PM
to maheshi dassanayake, ABySS
Hi Maheshi,

That's correct. Use
abyss-pe --dry-run ...
to check that it's not going to start over from the beginning. If it
looks good, rerun your original abyss-pe command (without the
--dry-run). The option --dry-run is an option of GNU make, also known as
-n, --just-print, --dry-run, --recon. See `man make' for more details.

Cheers,
Shaun

On Thu, 2011-03-10 at 17:26 -0800, maheshi dassanayake wrote:
> Thanks a lot, Shaun. Should I give the full abyss-pe original command
> to continue from where it stopped?
> maheshi.
>
> On Thu, Mar 10, 2011 at 6:54 PM, Shaun Jackman <sjac...@bcgsc.ca>
> wrote:
> Hi Maheshi,
>
> Here's the output of SimpleGraph. You can use this file to
> continue the
> assembly from where it failed.
> /home/sjackman/Desktop/tp/tp-4.path1.gz
>
> Cheers,
> Shaun
>
>
> On Thu, 2011-03-10 at 14:05 -0800, maheshi dassanayake wrote:
> > Sure, here's the link to both files. Thanks so much for
> taking the
> > time to look into this.
> > Maheshi.
> >
> > p.s: Please let me know if you find any issues with
> downloading from
> > this dropbox link.
> >
> > On Thu, Mar 10, 2011 at 4:28 PM, Shaun Jackman

> > --
> > Maheshi Dassanayake
> > University of Illinois
> > Plant Biology
> > 196 ERML, 1201 W Gregory Ave
> > Urbana, IL 61801
>
>
>

maheshi dassanayake

Mar 13, 2011, 2:00:31 PM
to ABySS, Jeffrey Haas
Hi:
I did finally manage to run ABySS without getting this segmentation fault. We were running Ubuntu 10.04 LTS 64-bit server edition on our system, and ABySS was compiled with g++ 4.4.
As Shaun suggested, we needed to change the stack size with ulimit, and as John advised, it should be set not to 'unlimited' but to a finite value.
For our system it worked when we typed:

ulimit -s 33554432
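
For scale: the shell's ulimit -s takes its value in units of 1024 bytes, so the number above is a limit in KiB rather than bytes, and it works out to 32 GiB. A small sketch of the conversion, assuming bash or another POSIX-style shell on Linux (the variable names are illustrative):

```shell
# ulimit -s values are in KiB (1024-byte units).
stack_kib=33554432
# KiB -> MiB -> GiB using integer arithmetic.
stack_gib=$((stack_kib / 1024 / 1024))
echo "ulimit -s $stack_kib allows up to $stack_gib GiB of stack"
```

A smaller finite value may also be enough on other systems; this one is simply what worked here.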

Thanks a lot Shaun, John, and Jeff (who figured out the numbers and settings for our system) for all the help and feedback.
Maheshi.

Shaun Jackman

Mar 14, 2011, 2:43:06 PM
to maheshi dassanayake, ABySS, Jeffrey Haas
Thanks, Maheshi, for reporting the solution, and to John and Jeff for
finding it. I'm glad that you were able to get it working.

Cheers,
Shaun
