Dear All,
I'm using the Adapterama III (3RAD) protocol, design 3 (PstI and DdeI, with NsiI as the dimer-cutting enzyme). The data have been demultiplexed by the sequencing company using the Illumina indexes, and I'm using inline barcodes to separate individuals.
To my surprise, process_radtags recognizes the cut sites only when the barcodes include the first letter of the cut site (the base before the overhang). Does `--renz1` match only the overhang?
This is not discussed in the manual, so I am not sure whether I am doing it right. It is not very intuitive, so I prefer to ask.
Another thing: even with this "non-intuitive" solution (part of the cut site included in the barcode), about 50% of the i5 reads (forward, starting with the PstI site) are dropped due to a missing cut site. When I examined some random reads, I found no issues with the sequences. Could this be a result of using the second cutting enzyme in the protocol? Should I be worried about it?
Thanks in advance,
Maciek
-----------------------
Below you'll find some examples:
PstI cut site:
CTGCA (overhang: TGCA)
Working barcodes (the final C and G are the first letters of the cut sites of the two enzymes used, PstI and DdeI):
CCGAATC CACATGTCG BCK312
TTAGGCAC TGTGCACGAG BCK313
AACTCGTCC GCATCAG BCK314
GGTCTACGTC ATGCTGTG BCK315
Not working barcodes (taken from the publication):
CCGAAT CACATGTC BCU05
TTAGGCA TGTGCACGA BCU06
AACTCGTC GCATCA BPU13
GGTCTACGT ATGCTGT BPU14
Sample of input file (each read starts with the inline barcode, followed by the PstI/NsiI cut site):
@A00627:690:HKM2VDSX7:2:2150:9778:1141 1:N:0:ATTCAGAA+AGGCTATA
GATACCCTGCAGGTTTGCACTACACAACATCAGGAAGATCAGGCCCTTTCTGACAGAGCAT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF
@A00627:690:HKM2VDSX7:2:2150:10357:1141 1:N:0:ATTCAGAA+AGGCTATA
TTAGGCACTGCATTAAAGTAAATCGGGTCAGTTTTGATATCATGTTGATCTTAATCGTAACC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFF:FFF
@A00627:690:HKM2VDSX7:2:2150:22706:1141 1:N:0:ATTCAGAA+AGGCTATA
CCGAATCTGCAGCTAATGTGCTTCCTCGAGGCGCCTTGTGTTTGTGTGTCTGTAAGTGTGT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00627:690:HKM2VDSX7:2:2150:28203:1141 1:N:0:ATTCAGAA+AGGCTATA
TTAGGCACTGCAGTTTTTCTTTGGGTGCGATGAATGAAACTGAAATACTAGAAGCAGTAGA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00627:690:HKM2VDSX7:2:2150:28492:1141 1:N:0:ATTCAGAA+AGGCTATA
GATACCCTGCAGGACGCCGCTTCATGGCCTGACCGCTACCCACTACTGTTCAACGGGCTGG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00627:690:HKM2VDSX7:2:2150:30337:1141 1:N:0:ATTCAGAA+AGGCTATA
CTGCAACTCTGCATAAATCAAAGCAGACAGAAGGAAGCAGCTATGTAGCACAATTACAGTC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00627:690:HKM2VDSX7:2:2150:31891:1141 1:N:0:ATTCAGAA+AGGCTATA
CCGAATCTGCATTTCCATCTCCCATTATTCGCATTAACCCTTTTTTTTTAGGAATTAAGGCAT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF:F:F
@A00627:690:HKM2VDSX7:2:2150:32000:1141 1:N:0:ATTCAAAA+AGGCTATA
GATACCCTGCATCTTACCAAATTGCTAATCTTACCTTTTGATTCAAAACAGAACAAGAATGT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00627:690:HKM2VDSX7:2:2150:32452:1141 1:N:0:ATTCAGAA+AGGCTATA
CCGAATCTGCAGTAAACACAGAGCGCAATATCACCACGATGTCGTCATCGTCGCTACCTAA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:
@A00627:690:HKM2VDSX7:2:2150:13386:1157 1:N:0:ATTCAGAA+AGGCTATA
CCGAATCTGCATAAATACAACTGATGTCACTATTTATACAAATTCATAACAAGGAAGTGCAC
+
FFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00627:690:HKM2VDSX7:2:2150:16875:1157 1:N:0:ATTCAGAA+AGGCTATA
TGTCTACGTCTGCATGTTACAGGGCACAAAACCGTTCAATGATAATGAACAAGAAACCAAT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00627:690:HKM2VDSX7:2:2150:26259:1157 1:N:0:ATTCAGAA+AGGCTATA
AGCGTTGCTGCATGAATATATAGACAGGTCTGGGGAGTAACTAGTAACATGTAACGGAATTA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF
@A00627:690:HKM2VDSX7:2:2150:27326:1157 1:N:0:ATTCAGAA+AGGCTATA
CCGAATCTGCATTCTGATTATTTGCGGGTGGGTCGTGGATAAAATAAATCACAAATGATGCA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFF:FFFF,FFFF
@A00627:690:HKM2VDSX7:2:2150:27344:1157 1:N:0:ATTCAGAA+AGGCTATA
AGCGTTGCTGCATGGTTTCTCTAAAGGCTTTTAATACAGCTTTAATGTGTGTGTGCTCGTGT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFF
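
The check described in this post can be reproduced outside process_radtags with a few lines of Python. The following is only a rough sketch, not how Stacks works internally: it strips the longest matching barcode from the start of each read and tallies the next five bases. `EXPECTED_SITE = "TGCAG"` is an assumption about what process_radtags looks for with PstI and should be confirmed against the Stacks documentation. The helper names `split_read` and `tally` are hypothetical, and the four reads are truncated copies of reads from the sample above.

```python
from collections import Counter

# Assumption: the bases expected directly after the inline barcode for PstI.
EXPECTED_SITE = "TGCAG"


def split_read(seq, barcodes):
    """Return (barcode, following bases) for the longest barcode prefixing seq."""
    for bc in sorted(barcodes, key=len, reverse=True):
        if seq.startswith(bc):
            return bc, seq[len(bc):len(bc) + len(EXPECTED_SITE)]
    return None, None


def tally(reads, barcodes):
    """Count reads by what follows the barcode."""
    counts = Counter()
    for seq in reads:
        bc, site = split_read(seq, barcodes)
        if bc is None:
            counts["no barcode"] += 1
        elif site == EXPECTED_SITE:
            counts["site found"] += 1
        else:
            counts["site missing: " + site] += 1
    return counts


# Barcodes as published, and the same barcodes extended by one C.
published = ["CCGAAT", "TTAGGCA", "AACTCGTC", "GGTCTACGT"]
extended = ["CCGAATC", "TTAGGCAC", "AACTCGTCC", "GGTCTACGTC"]

# First 25 bases of four reads from the sample above.
reads = [
    "CCGAATCTGCAGCTAATGTGCTTCC",
    "TTAGGCACTGCATTAAAGTAAATCG",
    "TTAGGCACTGCAGTTTTTCTTTGGG",
    "CCGAATCTGCATTTCCATCTCCCAT",
]

print("published:", dict(tally(reads, published)))
print("extended: ", dict(tally(reads, extended)))
```

On these four reads, the published barcodes are always followed by CTGCA, so the assumed site is never found. With the extended barcodes, two reads continue with TGCAG and two with TGCAT.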