how to do a test run on the MiSeq


Joe Foley

Jan 14, 2022, 3:24:36 PM
to Smart-3SEQ
If it's your first batch of libraries and you're nervous about doing an expensive run on a high-throughput sequencer, you can get a sneak peek with a cheap test run on a small sequencer first. Currently Illumina's cheapest sequencing kit appears to be the MiSeq v2 Nano 300-cycle reagents (#MS-103-1001), US list price $325 (1/5 the cost of a NextSeq kit), though if you are using a core facility they will still charge more to cover labor etc.

The MiSeq Nano kit gives you about 1 million post-filter clusters, which is not enough to measure biological signals, but you can get a rough idea of these QC metrics:
  • Library clustering efficiency
  • % clusters passing chastity filter
  • Balance of multiplexed libraries
  • Distribution of sequenceable insert lengths
  • % alignable reads
  • Estimated library size (very rough, and only if you have a substantial number of duplicate reads to count)
Of course these metrics will be somewhat different on another sequencing platform, but they should be close enough that you can tell whether your libraries are worth sequencing deeper.
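For the last item in the list, one common way to turn duplicate counts into a rough library size is the saturation model unique = X * (1 - exp(-total / X)), solved numerically for the library size X. The sketch below is only an illustration of that model, not the method used by 3SEQtools; the function name and the example counts are made up, and with only about 1 million reads the result should be treated as an order-of-magnitude guess.

```python
import math


def estimate_library_size(total_reads: int, unique_reads: int) -> float:
    """Solve unique = X * (1 - exp(-total / X)) for X by bisection.

    total_reads:  aligned reads examined (duplicates included)
    unique_reads: reads left after collapsing duplicates
    """
    if not 0 < unique_reads < total_reads:
        # With no duplicates the model has no finite solution.
        raise ValueError("need at least one duplicate read to estimate library size")

    def expected_unique(size: float) -> float:
        # expm1 keeps precision when total_reads / size is tiny.
        return -size * math.expm1(-total_reads / size)

    lo = float(unique_reads)   # the library cannot be smaller than what was observed
    hi = lo * 2
    while expected_unique(hi) < unique_reads:
        hi *= 2                # grow the bracket until it contains the solution
    for _ in range(200):
        mid = (lo + hi) / 2
        if expected_unique(mid) < unique_reads:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical counts: 500,000 reads of which 393,469 are unique
print(f"{estimate_library_size(500_000, 393_469):,.0f} molecules (rough)")
```

The fewer duplicates you observe, the flatter the curve becomes and the less trustworthy the estimate is, which is why this only works when a substantial fraction of reads are duplicates.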

To configure the MiSeq run: There is no reason to do paired-end sequencing, so in theory you can use this kit for a single read of 301 nt. However, the error rate increases dramatically in long reads, and Illumina officially supports only 251 nt reads with the v2 kits. Even that is much more than Smart-3SEQ needs, so to save time on the sequencer you may choose a much shorter read 1 (perhaps as short as 76 nt, like our preferred NextSeq configuration) and discard the leftover reagents. PhiX is probably not required if your libraries are good, but you might as well do the 10% spike-in just in case. It can also help you estimate your clustering efficiency: if you loaded 10% PhiX but 20% of reads aligned to it, your library clustered less efficiently than expected and you should load more of it next time.
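The PhiX comparison can be put into numbers with a small sketch. It assumes a simple model in which each pool component forms clusters in proportion to its loaded fraction times a per-component efficiency, so the result also absorbs any error in quantifying the library. The function name is made up for illustration.

```python
def relative_clustering_efficiency(phix_loaded: float, phix_observed: float) -> float:
    """Clustering efficiency of the library relative to PhiX (1.0 = equal).

    phix_loaded:   fraction of the loaded pool that was PhiX (e.g. 0.10)
    phix_observed: fraction of reads that aligned to PhiX (e.g. 0.20)
    """
    for name, value in (("phix_loaded", phix_loaded), ("phix_observed", phix_observed)):
        if not 0 < value < 1:
            raise ValueError(f"{name} must be a fraction between 0 and 1")
    loaded_odds = phix_loaded / (1 - phix_loaded)        # PhiX : library as loaded
    observed_odds = phix_observed / (1 - phix_observed)  # PhiX : library as sequenced
    return loaded_odds / observed_odds


# The example from the text: 10% PhiX loaded, 20% of reads aligned to PhiX
print(round(relative_clustering_efficiency(0.10, 0.20), 2))  # 0.44
```

Under this model a value of 0.44 means the library yielded a bit under half as many clusters per loaded molecule as PhiX did, so roughly twice as much library would be needed to reach the intended ratio.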

As for the index read(s):
  • Legacy 48-plex i7 indexing requires 6 nt index read 1 and no index read 2.
  • IDT/Illumina 96-plex unique dual indexing (UDI) requires 8 nt index read 1 and 8 nt index
  • For our special i5-only indexing scheme, Illumina informed me that the MiSeq requires (and I have successfully validated) a useless 3 nt index read 1 just to satisfy the software, then 8 nt index read 2. With the provided design for the universal index-less P7 PCR primer, the useless i7 read will always be "ATC" except for sequencing errors; if you use a different primer it could be something else, but the latest version of the bcl2fastq script in 3SEQtools will mask the i7 read regardless.
All three configurations are provided in the updated configuration files for Illumina Experiment Manager (IEM) posted separately in this discussion thread. You must use the latest version of the IEM files to do i5-only indexing on the MiSeq because the older version doesn't have that configuration. If you are using a version of the MiSeq software that is not compatible with IEM, please let me know so I can help you set it up; I don't have a MiSeq like that to test with.
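As a quick reference, the three index configurations above can be written down as data, along with the total cycle count and a mask string in the style of bcl2fastq's `--use-bases-mask` option (Y = read, I = index, N = ignore). This is only a sketch: the dictionary keys and function names are made up, a 76 nt read 1 is assumed, and the bcl2fastq script in 3SEQtools may handle the masking differently.

```python
from typing import NamedTuple


class IndexConfig(NamedTuple):
    i7_cycles: int       # index read 1 length
    i5_cycles: int       # index read 2 length (0 = not run)
    i7_is_barcode: bool  # False when the i7 read is only a placeholder


# Keys are hypothetical labels for the three schemes described above.
CONFIGS = {
    "legacy_i7": IndexConfig(6, 0, True),
    "udi": IndexConfig(8, 8, True),
    "i5_only": IndexConfig(3, 8, False),
}


def bases_mask(read1_cycles: int, config: IndexConfig) -> str:
    """Build a comma-separated mask, ignoring a placeholder i7 read."""
    parts = [f"Y{read1_cycles}"]
    parts.append(f"{'I' if config.i7_is_barcode else 'N'}{config.i7_cycles}")
    if config.i5_cycles:
        parts.append(f"I{config.i5_cycles}")
    return ",".join(parts)


def total_cycles(read1_cycles: int, config: IndexConfig) -> int:
    """Cycles requested in the run setup (read 1 plus index reads)."""
    return read1_cycles + config.i7_cycles + config.i5_cycles


for name, config in CONFIGS.items():
    print(name, bases_mask(76, config), total_cycles(76, config))
```

For the i5-only scheme this gives the mask `Y76,N3,I8` and 87 requested cycles, far below what the 300-cycle kit provides.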

Joe Foley

Jan 14, 2022, 3:31:44 PM
to Smart-3SEQ
Sorry, that should read: "IDT/Illumina 96-plex unique dual indexing (UDI) requires 8 nt index read 1 and 8 nt index read 2"

jm googlecalendar

May 26, 2022, 12:42:16 PM
to Smart-3SEQ

We have followed your suggestion to run a MiSeq test run prior to NovaSeq and are going through your list of QC metrics…

I am now figuring out what, if anything, might improve performance on the later platform.

Two MiSeq runs:

Run 1

PhiX 1%

Clustered, but PhiX did not work. Illumina suggested overloading of library (displaced PhiX).

Without a clear positive control, we repeated the MiSeq with altered amounts of library and PhiX.

 

Run 2

16 Zeiss LCM caps (one no-template) with non-precious sample tissues (nervous tissue, liver)

Pooled cDNA libraries (AMPure SPRI purification) with unique i5 indexing approach.


Fig1 26May22.jpg

Library type: SMART-3Seq
Sequencing: MiSeq 300 cycle v2 nano
Read lengths: Read 1 = 76, i7 = 3, i5 = 8

Final pool nM: 4
Qubit final pool (ng/uL): 5.6
TapeStation size (bp): 295
Calculated nM: 28.76
Loading pM: 4
% PhiX: 25
Cluster density (K/mm^2): 332
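The "Calculated nM" value follows from the Qubit concentration and the TapeStation size using the usual approximation of about 660 g/mol per base pair of double-stranded DNA. A minimal sketch of that arithmetic (the function name is made up):

```python
def library_molarity_nm(conc_ng_per_ul: float, mean_length_bp: float,
                        da_per_bp: float = 660.0) -> float:
    """Convert a dsDNA mass concentration to molarity in nM.

    Assumes an average mass of ~660 g/mol per base pair.
    """
    if conc_ng_per_ul < 0 or mean_length_bp <= 0:
        raise ValueError("concentration must be >= 0 and length must be > 0")
    return conc_ng_per_ul * 1e6 / (da_per_bp * mean_length_bp)


# Values from the table: 5.6 ng/uL at a mean size of 295 bp
print(round(library_molarity_nm(5.6, 295), 2))  # 28.76
```

Note that this treats everything the Qubit measures as 295 bp library, so short byproducts in the pool make the true molarity higher than calculated.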

 

Loaded 4 pM cDNA library (25% of 1st run)

Added 25% PhiX (but coming back as 46%)

Aligned = 46% (Read 1)

**Suggests PhiX is clustering more effectively than whatever else is in the tube…

 

BaseSpace report

“QC Passed”

%Q30 = 93%

% Cluster PF = 54%

Density = 332 K/mm^2 – seems low but tile image appears higher


Fig2 26May22.jpg

Fig3 26May22.jpg

A bit of a dose-response curve (two failures also plotted—see below).

Fig4 26May22.jpg

I have included both biological (different colours) and technical replicates (similar colours).

To my rookie eyes, it seems that samples Q11 and Q14 have failed, though it is unclear to me whether that was at the RT, PCR, or sequencing step(s).

Based on what I have read in this Group, failure of individual libraries is not unexpected from FFPE samples. 

The remaining question is whether anything else might be done (e.g., size purification, changes in PCR cycles, etc.) to improve the Cluster PF%?

My guess is that the primer dimers in the TapeStation peak at 165 bp are sufficiently abundant to cluster efficiently and take up space on the lane without yielding usable sequence, which would account for the low Cluster PF%.  This would be a problem if it interferes with deep sequencing of the “real” library on the NovaSeq, though I am unsure how much to be concerned as this is our first rodeo.

 

With appropriately amplified gratitude.

J. Matyas

Joe Foley

May 27, 2022, 12:12:45 PM
to smart...@googlegroups.com
The QC metric that will be really informative from this is the proportion of alignable reads after you run them through the pipeline. It looks like you do have a lot of short byproducts, and yes those are responsible for the low proportion of clusters passing Illumina's chastity filter, but the question is how much real data you'll get in addition to that.

I agree that samples Q11 and Q14 failed completely, but the lack of even byproduct reads is a little unusual. Even the no-template control usually gives you something. You might double-check that the index sequences match correctly between the PCR primers and the data demultiplexing, and that the proportion of Undetermined reads (unrecognized index) is less than 10%.
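A quick way to run that check is to tabulate reads per library after demultiplexing. The sketch below assumes you already have a mapping from sample name to read count with the unassigned reads under an "Undetermined" key; the function name, the 5% cutoff for flagging a library as failed, and the example counts are all made up for illustration.

```python
from typing import Dict, List, Tuple


def demultiplexing_summary(read_counts: Dict[str, int],
                           undetermined_key: str = "Undetermined",
                           low_share: float = 0.05) -> Tuple[float, List[str]]:
    """Return (undetermined fraction, samples far below an even share of reads)."""
    total = sum(read_counts.values())
    samples = {k: v for k, v in read_counts.items() if k != undetermined_key}
    if total == 0 or not samples:
        raise ValueError("need at least one sample and one read")
    undetermined_fraction = read_counts.get(undetermined_key, 0) / total
    even_share = sum(samples.values()) / len(samples)
    # Flag libraries with less than low_share of an even split of assigned reads.
    low = sorted(name for name, n in samples.items() if n < low_share * even_share)
    return undetermined_fraction, low


# Made-up counts, not taken from the run discussed here
counts = {"sample_1": 45_000, "sample_2": 10, "sample_3": 40_000,
          "sample_4": 5, "Undetermined": 14_985}
fraction, failed = demultiplexing_summary(counts)
print(f"Undetermined: {fraction:.1%}; near-empty libraries: {failed}")
if fraction > 0.10:
    print("More than 10% Undetermined: check that index sequences match the primers")
```

A high Undetermined fraction together with near-empty libraries points toward an index mismatch rather than failed library preparation, since the missing reads have to have gone somewhere.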