barcode issues with process_radtags


Nathan Jones

Apr 28, 2017, 12:21:09 PM
to Stacks
Hi all,

I am new to Stacks and to ddRADseq data processing. I have a lot of experience with metagenomics but am a newbie at this. I am trying to preprocess (filter and demultiplex) two pools of paired-end Illumina data, but process_radtags is discarding all my sequences as ambiguous barcodes. Any help is greatly appreciated.

Stacks version 1.44. My command is as follows:

process_radtags -P -p ./raw_pool_1 -b ./barcodes/pool_1_barcodes_v2.txt -o ./samples_pool_1/ \
    -c -q -r --inline_index --renz_1 nlaIII --renz_2 mluCI \
    -i fastq

My sequence file headers look like this
@K00179:78:HJ2KFBBXX:6:1101:1377:1209 1:N:0:GCCGCG+CGCGGC

My barcodes are present inline in the R1 sequence only. My barcode file looks like this:
GCATG
AACCA
CGATC
TCGAT
TGCAT
CAACC
GGTTG
AAGGA
AGCTA
ACACA
AATTA
ACGGT
ACTGG
ACTTC
ATACG
ATGAG
ATTAC
CATAT
CGAAT
CGGCT
CGGTA
CGTAC
CGTCG
CTGAT

The log file from process_radtags looks like this:

process_radtags -P -p ./raw_pool_1 -b ./barcodes/pool_1_barcodes_v2.txt -o ./samples_pool_1/ -q -r --inline_index --renz_1 nlaIII --renz_2 mluCI  -c  -i fastq
process_radtags version 1.44 executed 2017-04-28 11:09:44

File                             Retained Reads  Low Quality  Ambiguous Barcodes  Ambiguous RAD-Tag  Total
17090Pool1_S9_L006_R1_001.fastq  0               0            51362358            0                  51362358
17090Pool1_S9_L007_R1_001.fastq  0               0            51187232            0                  51187232

Total Sequences 102549590
Ambiguous Barcodes 102549590
Low Quality 0
Ambiguous RAD-Tag 0
Retained Reads 0

Barcode      Filename  Total  No RadTag  Low Quality  Retained
GCATG-AACCA  CGATC     0      0          0            0

Sequences not recorded
Barcode Total
AATTA-CGCGG 9304968
ACACA-CGCGG 9264464
AGCTA-CGCGG 8649976
CAACC-CGCGG 7674144
AAGGA-CGCGG 7480068
GCATG-CGCGG 7479678
GGTTG-CGCGG 6973130
AACCA-CGCGG 6182070

Nicolas Rochette

Apr 28, 2017, 1:26:22 PM
to stacks...@googlegroups.com
Hi Nathan,

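By specifying --inline_index, you are telling process_radtags that your reads have barcodes both inline and in the adapters (the index). That is why every unrecorded barcode in your log ends in -CGCGG, which comes from the index in your read headers. If I understood you correctly, what you want is --inline_null.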
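
A quick way to check where the barcodes actually sit is to tally the first few bases of the R1 reads. The sketch below is not part of Stacks: count_leading_kmers is a hypothetical helper name, and the default length of 5 is an assumption based on the 5-base barcodes listed earlier in the thread. It also assumes an uncompressed FASTQ file with four lines per record.

```shell
# Tally the leading bases of every read in a 4-line-per-record FASTQ file.
# Usage: count_leading_kmers FILE [LENGTH]   (LENGTH defaults to 5, an assumption)
count_leading_kmers() {
    fastq="$1"
    len="${2:-5}"
    # The sequence is line 2 of each 4-line record; keep its first $len bases,
    # then count each distinct prefix and list the most frequent first.
    awk -v n="$len" 'NR % 4 == 2 { print substr($0, 1, n) }' "$fastq" \
        | sort | uniq -c | sort -k1,1nr -k2,2
}
```

For instance, running `count_leading_kmers ./raw_pool_1/17090Pool1_S9_L006_R1_001.fastq | head` (path assumed from the command and log above) should list the barcodes from the barcode file near the top if they are inline at the start of R1.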

Best,
Nicolas
--
Stacks website: http://catchenlab.life.illinois.edu/stacks/

Nathan Jones

Apr 28, 2017, 2:44:12 PM
to Stacks, roch...@illinois.edu
Thanks Nicolas. I changed that option, and I think I also have a problem with my barcode file. In the output of the earlier failed run, process_radtags only sees one barcode (GCATG-AACCA). GCATG is the first barcode in the file, but I don't understand where -AACCA comes from or why it doesn't see the other barcodes in the file.

When I tried this:
process_radtags -P -p ./raw_pool_1 -b ./barcodes/pool_1_barcodes_v2.txt -o ./samples_pool_1/ \
    -c -q -r --inline_null --renz_1 nlaIII --renz_2 mluCI \
    -i fastq

process_radtags begins the analysis but just quits after searching for inline barcodes....

How do you prepare your barcode files? I had them in Excel and saved them as a tab-delimited text file. Was this the wrong approach?

Cheers,
Nathan

Nicolas Rochette

Apr 28, 2017, 3:32:46 PM
to Stacks
Hi Nathan,

Could you post the log of the command that uses --inline_null? Preparing the file as you described works, but could you also show the result of

cat pool_1_barcodes_v2.txt | od -c | head

(n.b. Windows's '\r' character is not a problem.)
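
As a complement to od -c, a rough count of unexpected bytes can flag a problem file quickly. This is only a sketch: count_stray_bytes is a hypothetical helper, and it assumes the barcode file contains nothing but barcodes made of A, C, G, and T, separated by tabs and newlines (a file with sample names in an extra column would need a wider allowed set). Carriage returns are left out of the count because, as noted above, they are not a problem.

```shell
# Print the number of bytes in FILE that are not A, C, G, T, tab, newline,
# or carriage return. A clean barcode-only file should report 0.
count_stray_bytes() {
    # LC_ALL=C makes tr work byte by byte, so multi-byte debris such as a
    # UTF-8 byte-order mark is counted as well.
    LC_ALL=C tr -d 'ACGT\t\n\r' < "$1" | wc -c | tr -d ' '
}
```

A non-zero result from `count_stray_bytes pool_1_barcodes_v2.txt` suggests looking at the od -c output to see exactly which bytes are involved.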

Nicolas



Nathan Jones

Apr 28, 2017, 3:45:45 PM
to Stacks, roch...@illinois.edu
Hi Nicolas,

The command with --inline_null didn't produce any output other than to stdout, but I think I've found the problem. The barcode file had weird binary characters in it that were interfering with basic Linux tools: I could open and view the file in a text editor, but more and head would only show the first two lines. For some reason my Excel was adding non-standard characters to the tab-delimited barcode file.
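
For anyone hitting the same thing, one possible cleanup is sketched below. It is not necessarily how the file in this thread was fixed, and the exact stray characters were never identified. clean_barcodes is a hypothetical helper that assumes a barcode-only file (A, C, G, T, tabs, newlines); it would strip sample names if the file had them.

```shell
# Write a cleaned copy of a barcode-only file.
# Usage: clean_barcodes INPUT OUTPUT
clean_barcodes() {
    # 1. Turn every carriage return into a newline, so both CR-only and
    #    CRLF line endings end up as newline-terminated lines.
    # 2. Delete every byte that is not A, C, G, T, tab, or newline.
    # 3. Drop the empty lines left behind by step 1.
    LC_ALL=C tr '\r' '\n' < "$1" \
        | LC_ALL=C tr -cd 'ACGT\t\n' \
        | awk 'NF' > "$2"
}
```

Writing to a new file rather than editing in place keeps the original export available for comparison with od -c.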

It seems to work now.  Thanks so much for the help! I'm sure I'll have more questions when I get to actually analyzing some data and not just cleaning up! ;)

Cheers,

Nathan