Segmentation fault (core dumped)

taslima haque

Mar 9, 2018, 4:42:19 PM

to Stacks

Hi there

I'm running stacks-2.0Beta9 right now. I have some aligned and sorted BAM files which I want to run through gstacks individually, like this:

stacks-2.0Beta9/gstacks -B 405_S381.bam -O /path/4way_cross/

I got the following message, ending with a segmentation fault:

Logging to '/path/gstacks.log'.

Locus/sample distributions will be written to '/path/4way_cross/gstacks.log.distribs'.

Configuration for this run:

Input mode: reference-based

Input files: 1, e.g. '405_S381.bam'

Output to: '/path/4way_cross/'

Model: marukilow (var_alpha: 0.05, gt_alpha: 0.05)

Reading BAM headers...

Processing all loci...

Segmentation fault (core dumped)

I have tried files of different sizes but got the same result.

If I run the same BAM file using the input directory option (-I), it runs fine:

stacks-2.0Beta9/gstacks -I BAM/ -M SampID -O /path/4way_cross/

Logging to '/path/4way_cross/gstacks.log'.

Locus/sample distributions will be written to '/path/4way_cross/gstacks.log.distribs'.

Configuration for this run:

Input mode: reference-based

Population map: 'SampID'

Input files: 1, e.g. 'BAM/405_S381.bam'

Output to: '/path/4way_cross/'

Model: marukilow (var_alpha: 0.05, gt_alpha: 0.05)

Reading BAM headers...

Processing all loci...

1K...

2K...

5K...

10K...

20K...

50K...

100K...

Any idea why the -B option is not working properly? I have around 400 samples, so I want to run them individually.

Best,

Taslima

Nicolas Rochette

Mar 9, 2018, 6:32:21 PM

to Stacks

Hi Taslima,

Thanks for the report. I've found a bug in the code for -B; replacing the v2.0Beta9 "locus_readers.h" file (in the "src/" directory) with the attached one and recompiling should fix it.

But can I ask you why you want to process each sample separately? This just sounds wrong to me.

Best,

Nicolas

--
Stacks website: http://catchenlab.life.illinois.edu/stacks/

Attachment: locus_readers.h

taslima haque

Mar 9, 2018, 6:50:52 PM

to Stacks

Hi Nicolas

Thanks a lot for your quick reply. I have limited time to complete the job on the server, so I was trying to spread the jobs across different nodes. Is there any difference between running each sample separately and running them all together? Isn't this process independent for each sample?

Best,

Taslima

Nicolas Rochette

Mar 9, 2018, 7:00:30 PM

to Stacks

No, the process is not independent. Even in ref-based mode, the existence of a SNP at a particular position is a population-wide property.
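
As a rough illustration of why per-sample runs are not equivalent to a joint run, here is a toy Python sketch. It is not the marukilow model that gstacks uses: it assumes a plain binomial test against a made-up sequencing error rate, and the read counts are invented. With one alternate read out of ten, no single sample gives enough evidence for a variant site, but the same reads pooled across samples do.

```python
from math import comb

def binom_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

ERROR_RATE = 0.01  # assumed per-base sequencing error rate (illustrative only)
ALPHA = 0.05       # illustrative threshold; not the semantics of gstacks' var_alpha

# Invented (alt_reads, total_reads) at one site for three samples.
reads = {"sample_1": (1, 10), "sample_2": (1, 10), "sample_3": (1, 10)}

# Per-sample test: could the alternate reads be explained by errors alone?
per_sample_p = {
    name: binom_tail(alt, total, ERROR_RATE) for name, (alt, total) in reads.items()
}

# Joint test: pool the reads of all samples at the site before testing.
pooled_alt = sum(alt for alt, _ in reads.values())
pooled_total = sum(total for _, total in reads.values())
pooled_p = binom_tail(pooled_alt, pooled_total, ERROR_RATE)

for name, p in per_sample_p.items():
    print(f"{name}: p = {p:.4f}, variant called alone: {p < ALPHA}")
print(f"pooled: p = {pooled_p:.4f}, variant called jointly: {pooled_p < ALPHA}")
```

In this toy setting the site is missed in every single-sample run and detected only when the samples are considered together.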

How much time are you allowed? Have you tried increasing the number of threads to use?

Best,

Nicolas

taslima haque

Mar 9, 2018, 7:05:54 PM

to Stacks

I am allowed to run for 48 hours, for 384 individuals. On top of that, I have very big alignment files for the parents, which will not finish within that time limit. Do you think I can run the parents and the population separately?

Best,

Taslima

Nicolas Rochette

Mar 9, 2018, 7:18:02 PM

to Stacks

I would just start it with 8 threads and 48 hours. I don't know how big your dataset is, but I think there is a good chance that it will complete anyway.
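
A joint, multi-threaded run over the whole directory could be assembled along these lines. This is only a sketch: it reuses the directory, population map, and output path from the earlier command in this thread, passes the thread count with -t, and build_gstacks_cmd is a hypothetical helper, not part of Stacks.

```python
import shlex

def build_gstacks_cmd(bam_dir, popmap, out_dir, threads):
    """Assemble a gstacks command line for one joint run (hypothetical helper)."""
    return ["gstacks", "-I", bam_dir, "-M", popmap, "-O", out_dir, "-t", str(threads)]

# Paths and population map name are the placeholders used earlier in the thread.
cmd = build_gstacks_cmd("BAM/", "SampID", "/path/4way_cross/", threads=8)
print(shlex.join(cmd))

# To actually launch the run (requires gstacks on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```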

However, note that v2 doesn't support crosses yet (although you may or may not be able to work with the non-cross genotypes a posteriori).

Best,

Nicolas

taslima haque

Mar 9, 2018, 7:23:19 PM

to Stacks

So, in that case, are you suggesting that I stay with version 1? Also, how do you set the posterior probability in version 2?

Best,

Taslima

John Garbe

Mar 12, 2018, 11:17:59 AM

to Stacks

I recompiled v2.0Beta9 with the new locus_readers.h file, but I'm getting the same core dump in gstacks:

Logging to '/home/umgc-staff/jgarbe/test/stacks2/stacks/gstacks.log'.

Locus/sample distributions will be written to '/home/umgc-staff/jgarbe/test/stacks2/stacks/gstacks.log.distribs'.

Configuration for this run:

Input mode: denovo

Population map: '/home/umgc-staff/jgarbe/test/stacks2/popmaps/popmap'

Input files: 2, e.g. '/home/umgc-staff/jgarbe/test/stacks2/stacks/R.70b.0.matches.bam'

Output to: '/home/umgc-staff/jgarbe/test/stacks2/stacks/'

Model: marukilow (var_alpha: 0.05, gt_alpha: 0.05)

Reading BAM headers...

Processing all loci...

1%...

2%...

5%...

10%...

20%...

50%...

gstacks: src/locus.cc:362: void CLocAlnSet::hard_clip_right_Ns(): Assertion `!ref_.empty() && ref_[0] != Nt4::n' failed.

stacks2.sh: line 58: 8506 Aborted (core dumped) gstacks -P $src/stacks/ -M $src/popmaps/popmap -t 8

I'm running the pipeline by hand because I have 700 samples to process and I want to control the number of samples used to create the catalog.

-John

Julian Catchen

Mar 12, 2018, 12:27:48 PM

to stacks...@googlegroups.com, jga...@gmail.com

Hi John,

The loci have already been defined by cstacks in denovo mode. So, if you
want to limit the number of individuals in the catalog, you do that in
cstacks.

The gstacks program will require access to all 700 of your samples at
one time, so it can process each locus in the context of all the data in
the metapopulation for that locus (e.g. the SNP calling prior is
generated from the entire population, and then each individual is
genotyped).

So, the fix that Nicolas posted (re: locus_readers.h) relates to a
different input mode than the one you are using, so this may be
a separate bug.

julian

John Garbe

Mar 12, 2018, 1:34:30 PM

to Stacks

OK, thanks. I think the gstacks error I ran into was caused by my using other tools to filter and trim the FASTQ files instead of process_radtags. After running my files through process_radtags, I didn't encounter any problems.
