--
Stacks website: http://catchenlab.life.illinois.edu/stacks/
Hi Uyen,
It is hard to follow the different labels of sequences in your message. For example, we would talk about paired-end barcodes, which may be either inline (as part of the sequenced read) or index (encoded in the FASTQ header by Illumina’s software). The inline barcode is typically part of the molecular P1 and/or P2 adaptor, while the index would be in the i5 or i7 region, upstream of this adaptor in the molecular protocol. We would not combine the labels ‘barcode’ and ‘primer’, though both of these occur in the molecular protocol and we would not use the ‘primer’ to demultiplex.
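To check whether an index barcode made it into your files at all, you could look at the last field of the FASTQ header. Below is a minimal Python sketch, assuming the Casava 1.8+ header layout (`@instrument:run:flowcell:lane:tile:x:y read:filtered:control:index`) and assuming that a dual index is written as `i7+i5`. The function name `index_from_header` is hypothetical, not part of Stacks or of Illumina's software.

```python
def index_from_header(header):
    """Return (i7, i5) parsed from a FASTQ header line, or (None, None).

    Assumes a Casava 1.8+ style header whose second, space-separated
    part ends in the index sequence, e.g. '1:N:0:ATCACG+TTAGGC'.
    """
    parts = header.strip().split(" ")
    if len(parts) < 2:
        return (None, None)
    fields = parts[1].split(":")
    if len(fields) < 4 or not fields[3]:
        return (None, None)
    idx = fields[3]
    # A sample number or '0' here means no index sequence was recorded.
    if set(idx) - set("ACGTN+"):
        return (None, None)
    if "+" in idx:
        i7, i5 = idx.split("+", 1)
        return (i7, i5)
    return (idx, None)
```

If this returns `(None, None)` for every read, the index barcodes are not in the FASTQ headers and only an inline barcode, if present, could be used.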
If you did not specify the indexes to the Illumina software, it is possible that the sequencing machine neither demultiplexed the files nor sequenced the i5 and/or i7 region, in which case these barcodes are unknown. (I don’t know enough about setting up the iSeq 100 to say what may have happened.)
Anyway, the simplest thing might be for you to tell us where in the FASTQ file or in the sequence you expect the barcodes to be located.
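One way to find out is to tabulate the first few bases of your reads: inline barcodes should show up as a small set of very common prefixes. This is only a rough Python sketch; the function name `prefix_counts`, the prefix length `k`, and the read cap `max_reads` are assumptions for illustration, not Stacks options.

```python
from collections import Counter
from itertools import islice


def prefix_counts(fastq_lines, k=6, max_reads=100000):
    """Count the first k bases of each read in an uncompressed FASTQ stream.

    fastq_lines is any iterable of lines in 4-line FASTQ records.
    """
    counts = Counter()
    # The sequence is the 2nd line of every 4-line record.
    seqs = islice(fastq_lines, 1, None, 4)
    for seq in islice(seqs, max_reads):
        counts[seq.strip()[:k]] += 1
    return counts


# Hypothetical usage (the file name is a placeholder):
# import gzip
# with gzip.open("reads_R1.fastq.gz", "rt") as fh:
#     for prefix, n in prefix_counts(fh, k=6).most_common(20):
#         print(prefix, n)
```

If the most common prefixes match your barcode list, the barcodes are inline at the start of the read. If they look random or are mostly G, they are not.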
Julian
Hi Uyen,
The primers are not sequenced on the Illumina machine; they set up the molecule to be sequenced by binding to the flow cell. If you look at the Peterson diagram, the only sequence the machine will produce is where you see the “NNNN”, which represents your inserted DNA, along with the upstream restriction enzyme cutsite and the inline barcode (in blue in the diagram).
The GGGGGG… you report in your output sequences represents “no data”. On the Illumina flow cell, one of the nucleotides (‘G’) is detected by a lack of fluorescence, so when you sequence short DNA fragments the machine reports GGGGGGG… for all the sequencing rounds where there was no molecule left to sequence. This suggests that your size selection was incorrect and/or you have a lot of sequencing primer-dimers (nothing but sequencing primers stuck together, without your inserted sequence, the ‘NNNN’ in the diagram, between them).
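To get a feel for how widespread the problem is, you could count how many reads end in a long run of G. The Python sketch below is only illustrative: the function names and the 20-base threshold `min_tail` are my assumptions, and a real read can legitimately end in a few Gs, so treat the result as a rough indicator.

```python
import re

# One or more G at the very end of the read.
POLY_G_TAIL = re.compile(r"G+$")


def poly_g_tail_length(seq):
    """Length of the run of G at the 3' end of seq (0 if none)."""
    match = POLY_G_TAIL.search(seq)
    return len(match.group(0)) if match else 0


def fraction_poly_g(seqs, min_tail=20):
    """Fraction of reads whose trailing G run is at least min_tail long."""
    seqs = list(seqs)
    if not seqs:
        return 0.0
    hits = sum(1 for s in seqs if poly_g_tail_length(s) >= min_tail)
    return hits / len(seqs)
```

A high fraction would be consistent with very short fragments or primer-dimers dominating the library.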
You probably need to revisit the molecular protocol, verify whether usable DNA is making it through the whole protocol, and evaluate whether the size selection was done correctly.
Best,
Julian
From: stacks...@googlegroups.com <stacks...@googlegroups.com> on behalf of Uyen Nguyen <devil...@gmail.com>
Date: Wednesday, March 27, 2024 at 8:18 AM
To: Stacks <stacks...@googlegroups.com>
Subject: Re: [stacks] ddRAD combination barcodes did not yield anything in process_radtags
Hello Julian, thanks so much for your response!
" the simplest thing might be for you to tell us where in the FASTQ file or in the sequence you expect the barcodes to be located". I am not entirely sure. As in this diagram from Peterson et al, 2012 (the doc at the bottom), I expect to see:
1. The first part of a sequence to be the PCR 1, which is the same for all. It is supposed to be AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACG, but I can't find this exact sequence (or the pink chunk in the diagram or even 6 bp) anywhere.
2. The barcode in the middle. However, when I tried searching for a barcode it'd be all over the place.
FYI, I tried the separation-by-barcodes function in Geneious Prime, but apparently it only found the primers in a handful (2%) of my library.
Any help or suggestions would be highly appreciated.