process_radtags throws segmentation fault only when using more than 4 threads

Ethan Baldwin

Aug 30, 2023, 2:37:29 PM
to Stacks
I am demultiplexing a large amount of GBS data using process_radtags and am running into an issue with multithreading. Whenever I run it single-threaded, I get no errors, but if I run it with more than 4 threads (I've tried 4, 6, 8, and 12), I get a segmentation fault as it starts to process the reads. This happens regardless of the resources I give the job; I've tried all of these with 64 GB of memory and with double the number of threads I request in the command.

I am using version 2.61 and here is an example command:

process_radtags -1 $R1 -2 $R2 -P -b $sample_sheet --inline_null --renz-1 mspI --renz-2 pstI -o radtags2 -q -c --disable-rad-check --threads 12

Example error log:

Processing paired-end data.
Using Phred+33 encoding for quality scores.
Found 1 paired input file(s).
Searching for single-end, inlined barcodes.
Loaded 192 barcodes (5-10bp).
Setting number of threads to 12; 1 reader thread; 4 writer threads; 7 worker threads.
Processing file 1 of 1 [Undetermined_S0_R1_001.1_of_4.fastq.gz]
  Reading data from:
  Undetermined_S0_R1_001.1_of_4.fastq.gz and
  Undetermined_S0_R2_001.1_of_4.fastq.gz
  Processing pairs of RAD-Tags.../var/lib/slurmd/job24465817/slurm_script: line 27: 168043 Segmentation fault      (core dumped) process_radtags -1 $R1 -2 $R2 -P -b $sample_sheet --inline_null --renz-1 mspI --renz-2 pstI -o radtags2 -q -c --disable-rad-check --threads 12

Thanks,
Ethan

Catchen, Julian

Aug 30, 2023, 5:25:51 PM
to stacks...@googlegroups.com

Hi Ethan,

Thanks for the bug report. Does the segfault happen immediately, or does it run for a while first? Does it run to completion single-threaded?

I would ask you to upgrade to the latest release and let me know if you still see the segfault. If so, we can discuss possibly troubleshooting it, if you are up for it.

Best,
julian

Ethan Baldwin

Sep 2, 2023, 10:48:50 AM
to stacks...@googlegroups.com
Hi Julian,

Thanks for the quick response! I tested the latest version as well. It runs for maybe a minute before the segfault, writing some reads into the sample FASTQ files. Both the 4-thread and single-threaded runs went to completion. I did some troubleshooting myself and figured out that it has to do with my data; multithreading works fine on other datasets. Interestingly, I do not get the segfault if I run it on non-gzipped reads.
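
For anyone who wants to try the non-gzipped route as a stopgap, here is a minimal sketch of the decompression step. The helper name decompress_fastq is made up for illustration, and the commented usage simply reuses the variables from the original command. Whether the input type has to be stated explicitly for plain FASTQ may depend on the version, so check the process_radtags help output first.

```shell
# Decompress one gzipped FASTQ file next to the original and print the new path.
# The .gz file is left in place so the compressed run can still be reproduced.
# NOTE: decompress_fastq is a hypothetical helper, not part of Stacks.
decompress_fastq() (
    gz="$1"
    out="${gz%.gz}"              # strip the trailing .gz from the file name
    gzip -dc -- "$gz" > "$out"   # -d decompress, -c write to stdout
    printf '%s\n' "$out"
)

# Usage (R1, R2 and sample_sheet as in the original command):
#   R1_plain=$(decompress_fastq "$R1")
#   R2_plain=$(decompress_fastq "$R2")
#   process_radtags -1 "$R1_plain" -2 "$R2_plain" -P -b "$sample_sheet" \
#       --inline_null --renz-1 mspI --renz-2 pstI -o radtags2 -q -c \
#       --disable-rad-check --threads 12
```

Keep in mind that the decompressed files can be several times larger than the .gz originals, so this needs the scratch space to match.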

I'm running process_radtags on the entire dataset over the weekend, so as long as it finishes I will not need to get multithreading working. However, I'd be happy to do some additional light testing for you or send you a subset of the troublesome dataset if that would be helpful for you.

Thanks,
Ethan

--
Stacks website: http://catchenlab.life.illinois.edu/stacks/
---
You received this message because you are subscribed to a topic in the Google Groups "Stacks" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/stacks-users/mva7F-0WWBE/unsubscribe.
To unsubscribe from this group and all its topics, send an email to stacks-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/stacks-users/SN6PR11MB2557BE7BF036A0BA80065BAAA7E6A%40SN6PR11MB2557.namprd11.prod.outlook.com.

Catchen, Julian

Sep 5, 2023, 6:12:23 PM
to stacks...@googlegroups.com

Hi Ethan,

If you could prepare a subset of your data that causes the segfault, and put it on Dropbox (or similar), I could download it and troubleshoot the segfault. If you want to try troubleshooting, you could run the program in the debugger, gdb, if you have it installed on your system. We have a script that will do this for you, stacks-gdb. Basically, you run:

stacks-gdb process_radtags…. <rest of your command here>

It will run and when the segfault hits it should print out a backtrace telling us where the problem is. That said, fixing multithreading bugs can be complex, so one backtrace from you may not be enough to figure things out (though it may help).
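
If the stacks-gdb script is not at hand, a roughly similar backtrace can be captured with plain gdb in batch mode. The following is a minimal sketch and not the contents of stacks-gdb; the helper name run_with_backtrace and the log file name in the usage comment are made up for illustration. Because the crash only shows up with several threads, it also dumps the stacks of all threads.

```shell
# Run a command under gdb in batch mode and print backtraces if it crashes.
# NOTE: run_with_backtrace is a hypothetical helper, not the stacks-gdb script.
run_with_backtrace() {
    # -batch                      exit once the listed commands have finished
    # -ex run                     start the program
    # -ex bt                      backtrace of the thread that was stopped
    # -ex 'thread apply all bt'   backtraces of every thread
    # --args                      everything after this is the program and its arguments
    gdb -batch -ex run -ex bt -ex 'thread apply all bt' --args "$@"
}

# Usage (keep the rest of the command exactly as in the failing run):
#   run_with_backtrace process_radtags <rest of your command here> 2>&1 | tee gdb-backtrace.log
```

A binary built with debugging symbols generally gives a more readable backtrace, with function names and line numbers instead of bare addresses.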

Best,
julian

Brian D.

Nov 29, 2023, 12:39:44 PM
to Stacks
Hi all,

Just wanted to post that I've seen the same behavior on my Mac Studio M1 running process_radtags v2.62. Running on 20 threads fails with a segmentation fault, but running on 4 threads goes to completion. I didn't try any other numbers of threads. I am happy to provide more info if needed.

Brian D.

sedg...@gmail.com

Apr 9, 2024, 10:44:41 AM
to Stacks
Hi,

Did anything come of this? I've just run into the same issue on process_radtags 2.65. I can start with the debugging steps on 2.66 if the issue hasn't been isolated yet.

- tyler

Catchen, Julian

Apr 9, 2024, 2:00:26 PM
to stacks...@googlegroups.com

Hi Tyler,

I believe this was fixed in Stacks v2.66; it was related to retaining discarded reads in parallel mode. Please upgrade and try your data set again. If you still see a segfault with 2.66, we could go through either of the debugging steps I outlined in my earlier post.

Best,
Julian

sedg...@gmail.com

Apr 10, 2024, 1:08:06 PM
to Stacks
Hi Julian,

Turns out Stacks 2.66 was easy to install from source, and my new job with 8 CPUs is running without complaint.

Thanks!

Tyler

sedg...@gmail.com

Apr 10, 2024, 1:22:03 PM
to Stacks
Hi again,

I spoke too soon; the job core dumped after 3 minutes.

I have a backtrace for you, attached. Let me know what else I can do to help!

Thanks!

Tyler

stacks-gdb.log