>90% barcode not found drops!

Brian A

Apr 15, 2019, 3:32:47 PM

to Stacks

Hello, I'm new to Stacks and have been using it to analyze some single-digest (SbfI) paired-end RADseq data I inherited. I'm not sure if this matters, but the data are from 2 species. The samples use 8-nt barcodes, and according to Rochette & Catchen (2017), I should expect each read to start with <8-nt barcode>TGCAGG<unique sequence> when I visually inspect some of the raw reads in the FASTQ file. Instead, I don't see any such shared pattern at the start of the reads.

I ran a command to extract the 5 nucleotides immediately downstream of the barcode, and for the most part the most common sequence is TGCAN, where N is a variable nucleotide. This is a little encouraging, since TGCA is part of the SbfI cut site, but the variable nucleotide that follows is troubling.
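
A tally of this kind can be sketched in Python as follows. This is only an illustration, not the command used above: the function names, the fixed 8-nt barcode length, and the example reads are all assumptions made for the sketch.

```python
from collections import Counter

def fastq_sequences(lines):
    """Yield the sequence line (2nd of every 4 lines) from FASTQ text lines."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line

def tally_after_barcode(reads, barcode_len=8, k=5):
    """Count the k bases that follow a fixed-length inline barcode."""
    counts = Counter()
    for seq in reads:
        seq = seq.strip().upper()
        # Skip reads too short to contain the barcode plus k bases.
        if len(seq) >= barcode_len + k:
            counts[seq[barcode_len:barcode_len + k]] += 1
    return counts

# Made-up reads (hypothetical): an 8-nt barcode followed by the bases of interest.
example = ["AACCGGTTTGCAGGACGT", "AACCGGTTTGCAGTTTTT", "AACCGGTTTGCATCCCCC"]
print(tally_after_barcode(example).most_common())
```

On a real file, the sequences from `fastq_sequences(open(path))` could be passed in place of `example`, where `path` is whatever FASTQ file is being inspected.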

When I run process_radtags, 97% of the reads are dropped as "barcode not found". The command I used and its output are as follows:

process_radtags -p /home/brallen/Pine_RAD/Pine4/clonefilter -b /home/brallen/Pine_RAD/Pine1/barcode1.txt -o /home/brallen/Pine_RAD/Pine4/clonefilter/process_radtags -e sbfI -P -r -c -q

54502372 total sequences
53065872 barcode not found drops (97.4%)
25879 low quality read drops (0.0%)
702424 RAD cutsite not found drops (1.3%)
708197 retained reads (1.3%)
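
As a quick sanity check on the numbers above, the four categories should add up to the total, and the percentages should follow from the counts. A minimal sketch (the dictionary and the function name are just for illustration; the counts are copied from the output above):

```python
# Counts copied from the process_radtags output above.
counts = {
    "barcode not found drops": 53065872,
    "low quality read drops": 25879,
    "RAD cutsite not found drops": 702424,
    "retained reads": 708197,
}
total = 54502372

def percentages(counts, total):
    """Return each category as a percentage of the total, to one decimal place."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

# The categories are mutually exclusive, so they should sum to the total.
assert sum(counts.values()) == total

for name, pct in percentages(counts, total).items():
    print(f"{counts[name]} {name} ({pct}%)")
```

The counts are internally consistent, so the log itself is not misreporting; nearly all of the loss happens at the barcode-matching step.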

I've also tried to run clone_filter first, but that didn't improve anything. Any troubleshooting advice on what may be going wrong is appreciated! Specifically, I'm trying to determine if there was a problem with the sequencing or my analysis.

Thank you in advance for any help!

Brian A

Dennis Larsson

Apr 20, 2019, 2:44:12 PM

to Stacks

It looks to me like your reads were cut with PstI. Try the same command but replace sbfI with pstI (pst + capital 'i'). I am pretty sure my PstI-cut reads start with TGCA.
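
One way to check this before rerunning is to count how many reads carry each enzyme's cut-site remnant right after the barcode. The sketch below is an assumption-laden illustration: it takes the remnants to be TGCAGG for SbfI and TGCAG for PstI (from their recognition sites), assumes a fixed 8-nt inline barcode, and uses made-up reads; the function and variable names are hypothetical.

```python
# Assumed cut-site remnants left at the start of the read after the barcode.
REMNANTS = {"sbfI": "TGCAGG", "pstI": "TGCAG"}

def remnant_fractions(reads, barcode_len=8):
    """Return the fraction of reads with each remnant directly after the barcode."""
    total = 0
    hits = dict.fromkeys(REMNANTS, 0)
    for seq in reads:
        seq = seq.strip().upper()
        total += 1
        for enzyme, remnant in REMNANTS.items():
            # startswith with an offset checks the bases right after the barcode.
            if seq.startswith(remnant, barcode_len):
                hits[enzyme] += 1
    return {e: (n / total if total else 0.0) for e, n in hits.items()}

# Made-up reads (hypothetical): 8-nt barcode, then the start of the insert.
example = [
    "AACCGGTTTGCAGGACGT",  # matches both remnants
    "AACCGGTTTGCAGTTTTT",  # matches the PstI remnant only
    "AACCGGTTTGCATCCCCC",  # matches neither
    "AACCGGTTTGCAGAAAAA",  # matches the PstI remnant only
]
print(remnant_fractions(example))
```

Because TGCAGG begins with TGCAG, every read counted for sbfI is also counted for pstI, so the informative signal is a large gap between the two fractions.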
