Hi, guys!
Let me join this conversation, because I have a similar problem (but maybe a little bit worse, lol).
We used 3RAD, with ClaI and EcoRI for digestion, and Illumina for sequencing. We received the samples already demultiplexed (four files per sample). Since the barcodes appear only in the file headers, I assumed I didn't need to remove them (I don't even know whether it's possible to remove something from the header). Is that right?
This is what the beginning of the files looks like:
zcat files_1/F10n05_R1_001.fastq.gz | head -n 20
@LH00401:259:22HKMYLT4:5:1101:2449:1070 1:N:0:AACCTTGG+GTCAGTAC
TNATGGTCAATCGGCCTCAAGGCGCGAATTATCGTGCACAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACAACCTTGGATCTGGGGCGGCGCCCCCTCCCCTCAAAAGAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
+
I#IIIIIII9IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII9II-9----9-9------------99--99---9-9I99-99-9999999I99I9IIIIII9-99
@LH00401:259:22HKMYLT4:5:1101:46365:1084 1:N:0:AACCTTGG+GTCAGTAC
TNATGGTCAATCGGAACAAATCAGTTTTAAGTGGGACTGTCTGCTCAGTGCTATGACGACCTGGTGCGCAACGGCTGAGGAAGAAAGAATTATCGTGCACAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACAACCTTGGATCTGGG
+
I#IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IIIIII9IIII-II9IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IIIIIIIIIIIIIIII9IIIIIIIIIIIII99I-
@LH00401:259:22HKMYLT4:5:1101:49853:1098 1:N:0:AACCTTGG+GTCAGTAC
TNATGGTCAATCGGGCGAGGGTCCGAGAATTATCGTGCACAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACAACCTTGGATCTGGGGGGGCCCCCCCCCCCCCCCAACAAACGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
+
9#IIIII9999II9IIIIIIIIIIIIIII9IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII9IIII9IIIIIIII99I9----9--9-----9------9--9--------99999-99999I-II-9I9999-99999
@LH00401:259:22HKMYLT4:5:1101:8866:1140 1:N:0:AACCTTGG+GTCAGTAC
TCATGGTCAATCGGAACACCCAGGACGCGATCTCACCATCTCGAGATCTCGGGAGCTGTGTCTTAGCCCACGTGGCCATCCCGCCCCCTTTCAAGAAACGTCCATAGCCGTCCTGTTAGCACGGCTCCTTTATATCAAAGACAACTTTCT
+
IIIIIIII9IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII99IIIIIIIII9IIIIIII--99III9IIII999IIIIIIIIIIIIIIIIIIIIIIIII9IIIIIIIIIIIIII
@LH00401:259:22HKMYLT4:5:1101:26272:1154 1:N:0:AACCTTGG+GTCAGTAC
TCATGGTCACTCGATTCTGTTTCCCACAGTTTTCAAACTGCTGAAAATTTTCTGTACCTTACCAGTGTCCACAAGCACAGCTGAACGTTCATTTCCATCTTTGATAATAGTTAAAACATACTTACGCAGTACAATGTCCGAAAACCGTCT
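Since the question is whether anything sits in the reads themselves (and not only in the header), one quick check is to tally what the reads start with. The snippet below is only a rough sketch: `prefix_counts` is a made-up helper name, the prefix length of 14 is an arbitrary choice, and it assumes plain 4-line FASTQ records in a gzipped file.

```shell
# Hypothetical helper (not part of Stacks): tally the first K bases of every read.
# Usage: prefix_counts FILE.fastq.gz [K] [TOP]
# Assumes 4-line FASTQ records, so the sequence is line 2 of each group of 4.
prefix_counts() {
    local file=$1 k=${2:-14} top=${3:-10}
    gzip -dc "$file" |
        awk -v k="$k" 'NR % 4 == 2 { print substr($0, 1, k) }' |
        sort | uniq -c | sort -k1,1nr | head -n "$top"
}
```

For example, `prefix_counts files_1/F10n05_R1_001.fastq.gz 14 10` would list the ten most common 14-base read starts with their counts. If nearly all reads share the same leading bases before the expected cut site, that would be worth mentioning when asking about the process_radtags settings.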
I'm still unsure about the barcodes because when I run process_radtags, the percentage of good-quality reads is around 0.1%, which is making me desperate. In my case, though, the problem is "RAD cutsite not found" (I ran it on only one sample as an example, but the issue is the same for the others):
1657030 total sequences
0 barcode not found drops (0.0%)
0 low quality read drops (0.0%)
1656811 RAD cutsite not found drops (100.0%)
219 retained reads (0.0%)
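Because the log rounds to one decimal place, the retained fraction shows up as 0.0%. As a small sketch, the exact percentages can be recomputed from the counts above (`pct` is just an illustrative helper name):

```shell
# Hypothetical helper: percentage of N out of TOTAL, to three decimals.
pct() { awk -v n="$1" -v t="$2" 'BEGIN { printf "%.3f\n", 100 * n / t }'; }

pct 219 1657030       # retained reads        -> 0.013
pct 1656811 1657030   # RAD cutsite not found -> 99.987
```

So for this sample roughly 0.013% of the reads were retained, and the two counts add up to the 1657030 total sequences.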
Thank you for any ideas and suggestions!
Best,
Laís