Illumina Sequencing of GeCKO Library


Riley Cook

Aug 12, 2016, 2:49:36 PM
to Genome Engineering using CRISPR/Cas Systems
I wanted to ask for some clarification on the Illumina primers used for sequencing before I actually go ahead and perform the run on the NextSeq machine.


1. I cannot find the indexes used in your Excel guide (Readout primers for Illumina sequencing of sgRNA cassette : MS Excel - http://genome-engineering.org/gecko/?page_id=15) anywhere in Illumina's documentation of the indexes offered with their kits. In addition, the forward primer seems to use inline barcodes that are staggered, whereas the reverse primer appears to use a multiplex barcode that is not staggered until after the barcode is read (perhaps I am mistaken here). Can anyone provide some guidance on how to demultiplex this data once I run it?


2. The tech who amplified the libraries used the following design:

PCR 1

-          PCR will be done to amplify the sgRNA sequences

-          Forward primer AATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCG

-          Reverse primer CTTTAGTTTGTATGTCTGTTGCTATTATGTCTACTATTCTTTCC

-          PCR product is ~312 bp

 

PCR 2

-          a second PCR will be done to add Illumina adapters for sequencing

-          6 different forward/reverse primers will be used for each of the 6 samples

-          Forward Illumina Primers:

1)     AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTtAAGTAGAGtcttgtggaaaggacgaaacaccg

2)     AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTatACACGATCtcttgtggaaaggacgaaacaccg

3)     AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTgatCGCGCGGTtcttgtggaaaggacgaaacaccg

4)     AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTcgatCATGATCGtcttgtggaaaggacgaaacaccg

5)     AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTtcgatCGTTACCAtcttgtggaaaggacgaaacaccg

6)     AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTatcgatTCCTTGGTtcttgtggaaaggacgaaacaccg

-          Reverse Illumina Primers:

1)     CAAGCAGAAGACGGCATACGAGATAAGTAGAGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTtTCTACTATTCTTTCCCCTGCACTGT

2)     CAAGCAGAAGACGGCATACGAGATACACGATCGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTatTCTACTATTCTTTCCCCTGCACTGT

3)     CAAGCAGAAGACGGCATACGAGATCGCGCGGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTgatTCTACTATTCTTTCCCCTGCACTGT

4)     CAAGCAGAAGACGGCATACGAGATCATGATCGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTcgatTCTACTATTCTTTCCCCTGCACTGT

5)     CAAGCAGAAGACGGCATACGAGATCGTTACCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTtcgatTCTACTATTCTTTCCCCTGCACTGT

6)     CAAGCAGAAGACGGCATACGAGATTCCTTGGTGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTatcgatTCTACTATTCTTTCCCCTGCACTGT

-          the size of the PCR product should be ~370 bp



This does not seem to be a two step PCR approach, as I cannot find any overlap between the first set and second set primers... so then what is the point of the first set of primers? Are they nested? What would the point of nesting be?
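One way to check for overlap between the two primer sets is a plain string comparison. The sketch below is only that: it takes the sequences as written in this post, treats the lowercase/uppercase segment after the stagger as the priming site of each PCR 2 primer, and looks for the longest 3' suffix of a PCR 1 primer that is also the 5' start of the matching PCR 2 priming site. The function and variable names are made up for illustration.

```python
def suffix_prefix_overlap(first: str, second: str) -> int:
    """Length of the longest suffix of `first` that is also a prefix of `second`."""
    first, second = first.upper(), second.upper()
    for k in range(min(len(first), len(second)), 0, -1):
        if first[-k:] == second[:k]:
            return k
    return 0

# Sequences copied from the post (PCR 2 priming sites are the part after the stagger/barcode).
PCR1_FWD = "AATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCG"
PCR1_REV = "CTTTAGTTTGTATGTCTGTTGCTATTATGTCTACTATTCTTTCC"
PCR2_FWD_SITE = "tcttgtggaaaggacgaaacaccg"
PCR2_REV_SITE = "TCTACTATTCTTTCCCCTGCACTGT"

fwd_overlap = suffix_prefix_overlap(PCR1_FWD, PCR2_FWD_SITE)
rev_overlap = suffix_prefix_overlap(PCR1_REV, PCR2_REV_SITE)
print(fwd_overlap, rev_overlap)  # prints: 0 15
```

By this comparison the PCR 2 reverse priming site shares its first 15 nt with the 3' end of the PCR 1 reverse primer and then continues 10 nt further, while the PCR 2 forward priming site shares nothing with the PCR 1 forward primer. That pattern would be consistent with a nested second PCR, but the string check alone does not confirm where the forward site sits in the amplicon.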

Thanks for any help provided!!!

Neville Sanjana

Aug 15, 2016, 8:12:16 AM
to Riley Cook, Genome Engineering using CRISPR/Cas Systems
Hi Riley,

Those are great questions and I apologize for the confusion. I think the source of the confusion is that the forward barcodes are both in the forward reads (as you note, after the stagger) and also in the “second index read” (a.k.a. I2) position. Those barcodes are identical, so you can demultiplex using either the Illumina software (using I1 and I2 indexes to demultiplex) or do it on your own using only the forward read (R1). 

The reason we did this is that the “default” sequencing mode of the Illumina machines does not capture the I2 read and we wanted to reduce the possibility of folks forgetting to activate the dual-indexing mode and thus not getting the forward barcode (=wasted run). In the “default” mode, you always get I1, so no danger of missing the reverse barcode.

Finally, our indexes are homebrew, which means that they might differ from those Illumina uses in their kits like Nextera or TruSeq. But the Excel guide very clearly indicates what the sequences are for all of the barcodes on both F and R primers. Unlike MiSeq and HiSeq, remember that the I2 (forward index) sequencing read for NextSeq needs the reverse complement of the typical MiSeq/HiSeq barcode. 

Hope that helps,
Neville

Edward Chuong

Aug 15, 2016, 11:50:27 AM
to Genome Engineering using CRISPR/Cas Systems, riley...@gmail.com
Hi Neville, 
Thanks for this clarification. Unfortunately I'm still a little confused regarding Index 1 and 2. My understanding of the Illumina scheme is that Index 1 ("i7") is located on the reverse primer, and Index 2 ("i5") is read from the forward primer (from the Illumina indexed sequencing overview PDF).

I've designed my primers such that the Fwd primer is constant (with 9 N's), and I've retained the barcode only in the Rev primer. I was planning on telling the sequencing core to read just Index 1/i7, but your post suggests otherwise. Can you please clarify? Also, if the first barcode is AAGTAGAG in the reverse primer but the i7 index primer reads in the opposite direction, would I tell the core to use AAGTAGAG or the reverse complement (CTCTACTT) as the barcode?
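For reference, the reverse complement of a barcode can be computed rather than worked out by hand. This is a minimal sketch; the helper name is made up, and the barcode list is simply the six 8 nt barcodes as they appear in the primers earlier in this thread. Whether a sample sheet wants the barcode as written or its reverse complement depends on the instrument and demultiplexing software, which the code does not decide.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]

# Barcodes as written in the primers quoted in this thread.
BARCODES = ["AAGTAGAG", "ACACGATC", "CGCGCGGT", "CATGATCG", "CGTTACCA", "TCCTTGGT"]

for barcode in BARCODES:
    print(barcode, reverse_complement(barcode))
# The first line printed is: AAGTAGAG CTCTACTT
```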

Thanks again for all your help! This group is an excellent resource.

-Ed

Neville Sanjana

Aug 16, 2016, 8:53:26 AM
to Edward Chuong, Genome Engineering using CRISPR/Cas Systems, riley...@gmail.com
Hi Ed,

Your understanding of the Illumina setup/naming scheme is correct. I’m not sure how your primer design differs from ours but, if you’re using our primers, you need to sequence R1 (forward sequencing read, which contains our forward barcode after the stagger sequence) and I1 (first index read, which is the reverse barcode located on the reverse primer). Hope that clarifies.

Also, I want to correct something I realize was incorrect in my reply to Riley: In some of our primer designs, we have used the I2 (aka i5) index but not in the design on the GeCKO website. So, for demultiplexing using the GeCKO website primer design, use R1 for the forward barcode (right after the stagger) and I1 for the reverse barcode (entire 8bp read). My apologies for the confusion.
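As a rough illustration of demultiplexing on R1 alone, the sketch below matches the start of each read against stagger + barcode + priming site for the six forward primers listed earlier in the thread, then takes the bases that follow as the sgRNA. It is not the GeCKO analysis script. The 20 nt guide length and its position directly after the priming site are assumptions, the names are hypothetical, and only exact matches are accepted, so real data would likely need some tolerance for sequencing errors.

```python
from typing import Optional, Tuple

PRIMING_SITE = "TCTTGTGGAAAGGACGAAACACCG"
GUIDE_LENGTH = 20  # assumption: a 20 nt sgRNA directly follows the priming site

# Sample number -> (stagger, barcode), taken from the forward primers in this thread.
FORWARD_PRIMERS = {
    1: ("T", "AAGTAGAG"),
    2: ("AT", "ACACGATC"),
    3: ("GAT", "CGCGCGGT"),
    4: ("CGAT", "CATGATCG"),
    5: ("TCGAT", "CGTTACCA"),
    6: ("ATCGAT", "TCCTTGGT"),
}

def assign_read(read: str) -> Optional[Tuple[int, str]]:
    """Return (sample, guide) for an exact prefix match, or None if nothing matches."""
    read = read.upper()
    for sample, (stagger, barcode) in FORWARD_PRIMERS.items():
        prefix = stagger + barcode + PRIMING_SITE
        if read.startswith(prefix):
            guide = read[len(prefix):len(prefix) + GUIDE_LENGTH]
            if len(guide) == GUIDE_LENGTH:
                return sample, guide
    return None

# Made-up read for sample 3 carrying a made-up guide sequence.
example = "GAT" + "CGCGCGGT" + PRIMING_SITE + "ACGTACGTACGTACGTACGT" + "GTTTTAGAGC"
print(assign_read(example))
```

The reverse barcode would still come from the I1 index read, which the sequencer software normally handles.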

All the best,
Neville

Riley Cook

Aug 16, 2016, 3:08:54 PM
to Genome Engineering using CRISPR/Cas Systems, ech...@gmail.com, riley...@gmail.com
Thanks for the response Neville. I think I will be able to demultiplex the data successfully.

One question remains: with the design we followed, will a 75 bp SE read on the NextSeq machine be enough to sequence the entire sgRNA within the amplicon? I am not entirely sure where in the amplicon the sgRNA sequence is.

Thanks!

Riley Cook

Aug 19, 2016, 6:25:32 PM
to Genome Engineering using CRISPR/Cas Systems, ech...@gmail.com, riley...@gmail.com
Never mind: I scoured previous questions asked in this group, revisited some of the other GeCKO protocols, and was able to determine that the sgRNA is in fact within 75 bp of the start of sequencing using the design above. However, I have major issues with low CPF (clusters passing filter), probably caused by low complexity; I will ask about that in a new thread.
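The read-length arithmetic can be sketched from the forward primers listed at the top of the thread. This assumes R1 starts at the first stagger base, that the barcode is 8 nt, and that a 20 nt sgRNA begins immediately after the priming site; the guide length and position are assumptions not stated in this thread, and the names are illustrative.

```python
STAGGERS = ["t", "at", "gat", "cgat", "tcgat", "atcgat"]  # forward primers 1-6
BARCODE_LENGTH = 8
PRIMING_SITE = "tcttgtggaaaggacgaaacaccg"
GUIDE_LENGTH = 20   # assumption: 20 nt sgRNA directly after the priming site
READ_LENGTH = 75    # single-end read length asked about

def guide_span(stagger: str) -> tuple:
    """1-based (first, last) read positions covered by the sgRNA."""
    first = len(stagger) + BARCODE_LENGTH + len(PRIMING_SITE) + 1
    last = first + GUIDE_LENGTH - 1
    return first, last

spans = [guide_span(s) for s in STAGGERS]
for number, (first, last) in enumerate(spans, start=1):
    print(f"primer {number}: sgRNA at read positions {first}-{last}")

print("fits in read:", all(last <= READ_LENGTH for _, last in spans))
```

Under these assumptions the longest stagger (6 nt) puts the sgRNA at positions 39-58, which leaves room within a 75-cycle read.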

Pedro Mogollon

May 8, 2019, 1:31:35 PM
to Genome Engineering using CRISPR/Cas Systems
Hey Neville, 

Thank you and Ed for all your clarifications; they helped a lot. We can now understand the reason for designing two identical barcodes:

"The reason we did this is that the “default” sequencing mode of the Illumina machines does not capture the I2 read and we wanted to reduce the possibility of folks forgetting to activate the dual-indexing mode and thus not getting the forward barcode (=wasted run). In the “default” mode, you always get I1, so no danger of missing the reverse barcode."

But still, I cannot fully understand:

(1) whether the run should use dual-indexing mode or the default single-index mode (and whether the choice would make any difference to the results), and
(2) the approach to take when using NextSeq. "Unlike MiSeq and HiSeq, remember that the I2 (forward index) sequencing read for NextSeq needs the reverse complement of the typical MiSeq/HiSeq barcode". Does this mean that for NextSeq we should design other PCR2 Reverse Primers containing the reverse complement of the typical MiSeq/HiSeq barcodes? Or should we use the same primers (Readout primers for Illumina sequencing of sgRNA cassette : MS Excel - http://genome-engineering.org/gecko/?page_id=15) and just take this into account when configuring the sample sheet?

Thanks for your help! All the best,
Pedro.
