Segmentation fault in process_radtags and ustacks


Todd Pierson

Jun 4, 2014, 5:22:58 PM
to stacks...@googlegroups.com
Hello! I've successfully used both process_radtags and ustacks on one batch of fastq files, but another consistently gives me segmentation faults in both programs. The two sets of fastq files both have ends trimmed down to the enzyme overhangs and don't seem to differ in any significant way. An example of each can be found in this Dropbox folder. The file beginning with "2D..." seems to work just fine, but "17831..." causes the error. If it's helpful, I'm using the latest version of Stacks.

I'd greatly appreciate any advice.

Thanks!
Todd

Julian Catchen

Jun 4, 2014, 8:10:58 PM
to stacks...@googlegroups.com, twpi...@gmail.com
Hi Todd,

How are you running the programs and what are they outputting on the
console when you run them?

julian

Todd Pierson wrote:
> Hello! I've successfully used both process_radtags and ustacks on one
> batch of fastq files, but another consistently gives me segmentation
> faults in both programs. The two sets of fastq files both have ends
> trimmed down to the enzyme overhangs and don't seem to differ in any
> significant way. An example of each can be found in this Dropbox folder
> <https://www.dropbox.com/sh/ay2l2iutpeir1je/AAD1gVYkeDk3YlWOmVeJkuRza>.

Todd Pierson

Jun 9, 2014, 9:37:07 AM
to stacks...@googlegroups.com, twpi...@gmail.com, jcat...@uoregon.edu
Julian,

Thanks for taking the time to respond!

I'm running Stacks through our university's cluster, so here's the shell script for process_radtags:

#! /bin/bash
cd /home/tcglab/tcg_tech/Todd/Stacks/
export LD_LIBRARY_PATH=/usr/local/gcc/4.7.1/lib:/usr/local/gcc/4.7.1/lib64:${LD_LIBRARY_PATH}
time /usr/local/stacks/latest/bin/process_radtags -P -p ./Plethodon/* -o ./output/ -c -q -r -t 150 --renz_1 xbaI --renz_2 ecoRI -i fastq

...and the resulting error when I use the problematic input files:

No barcodes specified, files will not be demultiplexed.
Using Phred+33 encoding for quality scores.
Reads will be truncated to 150bp
Found 1 paired input file(s).
Barcode type unspecified, assuming unbarcoded data.
Processing file 1 of 1 [17831_S92_L001_R1_001_trimmed.fastq]
  Reading data from:
  ./Plethodon/17831_S92_L001_R1_001_trimmed.fastq and
  ./Plethodon/17831_S92_L001_R2_001_trimmed.fastq
*** glibc detected *** /usr/local/stacks/latest/bin/process_radtags: munmap_chunk(): invalid pointer: 0x000000000f003bb0 ***
======= Backtrace: =========
/lib64/libc.so.6(cfree+0x166)[0x339d275b66]
/usr/local/stacks/latest/bin/process_radtags[0x409fff]
/usr/local/stacks/latest/bin/process_radtags[0x406ce3]
/usr/local/stacks/latest/bin/process_radtags[0x410ace]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x339d21d9c4]
/usr/local/stacks/latest/bin/process_radtags(__gxx_personality_v0+0x109)[0x4032e9]
======= Memory map: ========
00400000-00426000 r-xp 00000000 00:19 494637290                          /panfs/pstor.storage/rcclocal/zcluster/stacks/1.19/bin/process_radtags

Here's the shell script for ustacks:

#! /bin/bash
cd /home/tcglab/tcg_tech/Todd/Stacks/
export LD_LIBRARY_PATH=/usr/local/gcc/4.7.1/lib:/usr/local/gcc/4.7.1/lib64:${LD_LIBRARY_PATH}
time /usr/local/stacks/latest/bin/ustacks -t fastq -f ./Plethodon/17831_S92_L001_R1_001_trimmed.fastq -o ./output3/ -i 1
 
...and the resulting error when I use the problematic input files:

Min depth of coverage to create a stack: 3
Max distance allowed between stacks: 2
Max distance allowed to align secondary reads: 4
Max number of stacks allowed per de novo locus: 3
Deleveraging algorithm: disabled
Removal algorithm: disabled
Model type: SNP
Alpha significance level for model: 0.05
Parsing ./Plethodon/17831_S92_L001_R1_001_trimmed.fastq
  Loading RAD-Tag 0       ^M/var/spool/uge/the_zcluster/compute-9-17/job_scripts/5380559: line 4: 29783 Segmentation fault      (core dumped) /usr/local/stacks/latest/bin/ustacks -t fastq -f ./Plethodon/17831_S92_L001_R1_001_trimmed.fastq -o ./output3/ -i 1

real    0m0.813s
user    0m0.034s
sys     0m0.030s

Both of these shell scripts work just fine with the other set of files (e.g. 2D...) that I mentioned. Let me know what you think.

I greatly appreciate the help! Thanks!
Todd

Todd Pierson

Jun 15, 2014, 10:59:30 PM
to stacks...@googlegroups.com, twpi...@gmail.com, jcat...@uoregon.edu
Hey all,

Any ideas? Even confirming that my file is also problematic for someone else would be informative and helpful.

Thanks!
Todd

ananta acharya

Jun 16, 2014, 10:55:00 AM
to stacks...@googlegroups.com


It seems the -b argument (for barcodes) is missing from the process_radtags command.




Julian Catchen

Jun 25, 2014, 12:25:56 AM
to stacks...@googlegroups.com, twpi...@gmail.com
Hi Todd,

In your command, I'm not sure what effect specifying "-p ./Plethodon/*" as the path to your files would have. In principle, the shell will replace the wildcard with a list of file paths from that directory, but process_radtags isn't prepared to parse a list like that, so I don't know whether it would cause the program to stop processing command-line options. Just specify the directory alone: "-p ./Plethodon/".

Second, your files are labeled "_trimmed"; how have they been modified before you give them to process_radtags?
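
The wildcard point can be illustrated with a small sketch. It does not call process_radtags at all; count_args is a hypothetical stand-in that only reports how many arguments the shell handed to it, and the file names are made up for the demo:

```shell
# Hypothetical demo: count_args just reports how many arguments it received.
count_args() { echo "$#"; }

# Temporary directory with two made-up FASTQ file names.
demo_dir=$(mktemp -d)
touch "$demo_dir/sample_R1.fastq" "$demo_dir/sample_R2.fastq"

# Unquoted wildcard: the shell expands it before the program starts,
# so the program sees -p followed by two separate file paths.
count_args -p "$demo_dir"/*     # prints 3

# Directory alone: the program sees -p followed by a single path.
count_args -p "$demo_dir"/      # prints 2
```

With the wildcard, only the first expanded path is in the position where the value of -p is expected; how the remaining paths are treated depends on the program's own option parsing.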

Best,

julian

Todd Pierson

Jun 25, 2014, 7:58:33 AM
to stacks...@googlegroups.com, twpi...@gmail.com, jcat...@uoregon.edu
Hey Julian,

Thanks for taking a look. Since I originally posted the issue, I've refined the script. I removed the wildcard, and it's had no effect. Here's an updated script:

#! /bin/bash

cd /home/tcglab/tcg_tech/Todd/Stacks/Plethodon

export LD_LIBRARY_PATH=/usr/local/gcc/4.7.1/lib:/usr/local/gcc/4.7.1/lib64:${LD_LIBRARY_PATH}

time /usr/local/stacks/latest/bin/process_radtags -p ./rawdata/ -o ./process_output/ -c -q -r -t 230 --disable_rad_check -i fastq


We've been trimming off the adapters and cut sites before putting our already-demultiplexed data into process_radtags, as the adapters are variable in length (containing internal indices) and we're using more than two enzymes. So, in effect, we're just using process_radtags to remove low-quality reads and trim to a fixed length. We've successfully used it this way to go all the way through the denovo_map pipeline, and things are working swimmingly, except with that same batch of files, which repeatedly gives us the segmentation fault. We're still perplexed.

If you have any other ideas, we'd love to hear them.

Thanks!
Todd

Todd Pierson

Jun 25, 2014, 8:19:44 PM
to stacks...@googlegroups.com, twpi...@gmail.com, jcat...@uoregon.edu
Never figured out what the hangup was, but I've bypassed the trimming step and tweaked the way we run process_radtags to accommodate it. Everything's working well now, and I think there must've been something odd going on with the way the fastq file was re-written after trimming.
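
One way to look for something odd in a re-written FASTQ file is a quick structural check. The following is only a sketch: check_fastq is a hypothetical helper, not part of Stacks, and it assumes uncompressed FASTQ with exactly four lines per record.

```shell
# check_fastq: hypothetical helper. Reads uncompressed 4-line FASTQ on stdin,
# prints the first structural problem it finds (exit status 1), else "OK".
check_fastq() {
    awk '
    NR % 4 == 1 && substr($0, 1, 1) != "@" {
        printf "line %d: header does not start with @\n", NR; bad = 1; exit
    }
    NR % 4 == 2 {
        seqlen = length($0)
        if (seqlen == 0) { printf "line %d: empty sequence\n", NR; bad = 1; exit }
    }
    NR % 4 == 3 && substr($0, 1, 1) != "+" {
        printf "line %d: separator does not start with +\n", NR; bad = 1; exit
    }
    NR % 4 == 0 && length($0) != seqlen {
        printf "line %d: quality length differs from sequence length\n", NR; bad = 1; exit
    }
    END {
        if (!bad && NR % 4 != 0) {
            print "file does not end on a complete 4-line record"; bad = 1
        }
        if (!bad) print "OK"
        exit (bad ? 1 : 0)
    }'
}

# Example usage (the file name is a placeholder):
# check_fastq < some_trimmed_reads.fastq
```

Running it on one file that works and one that fails would show whether the two really differ only in content and not in structure.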

Thanks!
Todd

Enrique Ortega

Jul 31, 2019, 11:45:32 AM
to Stacks
Hello everyone !

I figured out what the problem was: when running cutadapt I used the default minimum length parameter, which is 0, so the resulting fastq.gz contained empty reads. As far as I understand, a segmentation fault (core dump) means that a program tried to access memory it is not allowed to, so if a read is an empty string ('^$') and the program asks for its ACTG contents, the access is denied and causes the error: 'Processing RAD-Tags...Segmentation fault (core dumped)'

For some reason it didn't work when I tried to use grep -v to remove them, so I re-ran cutadapt with a minimum length of 2 and it all ran OK.

I submitted a ticket on the cutadapt git repository asking to change the default minimum length, to help avoid these problems. In the meantime, keep your eyes peeled.
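
A quick way to check a file for the empty reads described above is to count records whose sequence line is empty. This is a minimal sketch; count_empty_reads is a hypothetical helper name, and it assumes four lines per FASTQ record:

```shell
# count_empty_reads: hypothetical helper. Prints the number of 4-line FASTQ
# records on stdin whose sequence line (line 2 of each record) is empty.
count_empty_reads() {
    awk 'NR % 4 == 2 && length($0) == 0 { n++ } END { print n + 0 }'
}

# Example usage on a gzipped file (the file name is a placeholder):
# gzip -dc reads_R2.fastq.gz | count_empty_reads
```

Any result other than 0 means the file contains zero-length reads.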

Tech details and code a bit lower:

Tech details:
cutadapt 2.4 running on python 3.6.5 on a virtual environment, installed using pip
process_radtags version 2.41

Code which caused the error (bash), and how to find it
adapter1=ACACTCTTTCCCTACACGACGCTCTTCCGATCT
adapter2=AATTAGATCGGAAGAGCGAGAACAA
path=/home/$USER/work/project/test/  ## use your own path; keep the trailing slash, since ${path} is joined directly to the directory names below

cutadapt -a $adapter1 -A $adapter2 -o ${path}cutadapt_out/Idx1_S1_L001_R1_001cut.fastq.gz -p ${path}cutadapt_out/Idx1_S1_L001_R2_001cut.fastq.gz ${path}raw_data_ln/Idx1_S1_L001_R1_001.fastq.gz ${path}raw_data_ln/Idx1_S1_L001_R2_001.fastq.gz

process_radtags -p ${path}raw_data_ln/ -P -o ${path}out_radtags_e1/ -b ${path}barcode_names.tsv -e pstI -r -c -q

## I used fastqc to check the quality;
## only the R2 reads had empty reads (shown in the overrepresented sequences),
## not the R1
fastqc -t 20 --noextract -o ${path}qual/qual_raw_cut ${path}cutadapt_out/*

## This will be 4 times the number of empty reads shown by fastqc
zgrep '^$' -A 2 -B 1 ${path}cutadapt_out/Idx1_S1_L001_R2_001cut.fastq.gz | wc -l

## I don't know why the -v option didn't do its work
zgrep '^$' -v -A 2 -B 1 ${path}cutadapt_out/Idx1_S1_L001_R2_001cut.fastq.gz | gzip > Idx1_S1_L001_R2_001cut_noEmpty.fastq.gz

## The later idea was to re-pair the R1 and R2 using repair.sh from BBMap
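
A likely explanation for the -v result: with -v, grep selects every non-empty line, and -A/-B then print the lines around each selected line as context, so the empty lines are written out anyway. An alternative that also keeps R1 and R2 in sync is to filter both files record by record. The sketch below is only an illustration: drop_empty_pairs is a hypothetical helper, and it assumes uncompressed FASTQ, four lines per record, and the same number of records in the same order in both files.

```shell
# drop_empty_pairs: hypothetical helper for uncompressed, 4-line-per-record FASTQ.
# Writes only those read pairs in which both mates have a non-empty sequence.
# Usage: drop_empty_pairs in_R1.fastq in_R2.fastq out_R1.fastq out_R2.fastq
drop_empty_pairs() {
    : > "$3"; : > "$4"    # create (or truncate) both outputs up front
    awk -v r2="$2" -v o1="$3" -v o2="$4" '
    {
        i = (NR - 1) % 4
        a[i] = $0                                   # current R1 line
        if ((getline b_line < r2) > 0) b[i] = b_line
        else b[i] = ""                              # R2 ran out: treat as empty
        # At the end of each record, keep the pair only if both sequences exist.
        if (i == 3 && length(a[1]) > 0 && length(b[1]) > 0) {
            for (k = 0; k < 4; k++) {
                print a[k] >> o1
                print b[k] >> o2
            }
        }
    }' "$1"
}
```

For gzipped input, the files would first need to be decompressed (for example with gzip -dc into temporary files), since the helper reads plain text.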


The code that worked, the option -m <INT> did the trick:

adapter1=ACACTCTTTCCCTACACGACGCTCTTCCGATCT
adapter2=AATTAGATCGGAAGAGCGAGAACAA
path=/home/$USER/work/project/test/  ## use your own path; keep the trailing slash

cutadapt -m 10 -a $adapter1 -A $adapter2 -o ${path}cutadapt_out/Idx1_S1_L001_R1_001cut.fastq.gz -p ${path}cutadapt_out/Idx1_S1_L001_R2_001cut.fastq.gz ${path}raw_data_ln/Idx1_S1_L001_R1_001.fastq.gz ${path}raw_data_ln/Idx1_S1_L001_R2_001.fastq.gz

process_radtags -p ${path}raw_data_ln/ -P -o ${path}out_radtags_e1/ -b ${path}barcode_names.tsv -e pstI -r -c -q


Cheers !
Enrique


