Failed sequencing runs

Adam Passman

Nov 18, 2022, 8:17:04 AM

to Smart-3SEQ

Hi forum,

I'm hoping someone may have an idea of what's going on.

After lots of recent Smart-3SEQ successes, we are starting to have a lot of sequencing run failures and we can't pinpoint the source of the issue.

I've attached a Sequence Quality Histogram from MultiQC showing that we are getting poor Phred scores at the bases we expect to see our GGG signal. We've been using 7xN for our UMI so these are bases 8-10. We sequence on the NextSeq.

Additionally, overall, we are getting underclustering (about 250M reads total) and our genome facility isn't able to demultiplex 95%+ of our reads.

I've also attached a TapeStation plot for a pool where the sequencing failed. The pattern is similar to what successful runs have looked like. So we are amplifying something, and that something has the expected size and distribution, so it would seem our P5+P7 are functioning. But when it comes to sequencing, we undercluster, can't demultiplex (almost no input index sequences detected), and get poor quality around GGG.

Any thoughts would be greatly appreciated!

Many thanks,

Adam

HSD1000 library pool.jpg

fastqc_per_base_sequence_quality_plot.png

Joe Foley

Nov 18, 2022, 12:52:24 PM

to smart...@googlegroups.com

> we are getting poor Phred scores at the bases we expect to see our GGG signal.

This is normal: G is the base that gets no fluorescence in the 2-color encoding on the newer Illumina machines, so there should be no signal in those cycles except from the PhiX spike-in.
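
If you want to confirm that those cycles are still being called as G despite the low quality scores, something like the following might help. It is only a rough sketch: it assumes the 7-nt UMI mentioned in this thread (so GGG at bases 8-10), and the function name and the FASTQ file name in the usage comment are made up for illustration.

```shell
# Report how many reads in a FASTQ stream carry GGG at bases 8-10,
# i.e. right after a 7-nt UMI (positions taken from this thread).
ggg_at_8_10() {
  awk 'NR % 4 == 2 {                 # sequence lines only
         total++
         if (substr($0, 8, 3) == "GGG") hits++
       }
       END { printf "%d/%d\n", hits, total }'
}

# Usage (hypothetical file name):
#   zcat sample_R1_001.fastq.gz | head -n 400000 | ggg_at_8_10
```

The output is "reads with GGG / reads examined"; a high fraction would suggest the library structure is as expected and only the quality scores are depressed in those cycles.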

> Additionally, overall, we are getting underclustering (about 250M reads total)

Our lab routinely loads up to 200% of the recommended molarity (according to TaqMan qPCR) on the NextSeq 500 so this could be normal too.

> and our genome facility isn't able to demultiplex 95%+ of our reads.

This is the worst and most unexpected part but it might be completely recoverable. Which indexing scheme are you using? Are you sure you didn't select the wrong configuration in Illumina Experiment Manager or equivalent, or accidentally reverse-complement the indexes? If that doesn't explain it, you can export the index reads with "bcl2fastq --create-fastq-for-index-reads" and see what's in them. We have seen some poly(A) contamination in i7 reads, which is why our lab uses i5-only indexing, but far less severe than that.

> We've been using 7xN for our UMI

Did you happen to do a no-RNA control? A longer random stretch would probably result in more adapter dimers, though I don't think these would be the symptoms.

JWF

--
You received this message because you are subscribed to the Google Groups "Smart-3SEQ" group.
To unsubscribe from this group and stop receiving emails from it, send an email to smart-3seq+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/smart-3seq/4ccea7fd-5041-4396-9365-1f7f402f8c4fn%40googlegroups.com.

Adam Passman

Nov 21, 2022, 12:44:27 PM

to Smart-3SEQ

Hi Joe,

Thanks again for your rapid responses.

In our successful runs, we usually get a little dip at the GGG (see new attachment compared to the one in the first post), but nothing as dramatic as in our recent failed runs. Our genome centre has been doing a 1% PhiX spike-in, should they be using more?

Thanks for the 200% suggestion, I will run it past our genome centre and see what they think.

We are using the i7 indexes. I found a list of 96 8nt indexes somewhere and went ahead and bought a good 70 or so by the time I found out about your i5-only strategy. So, I thought I’d plod along with the i7 indexing. Up until recently, it’s been going well. I really hope this is all as simple as a reverse-complement issue. I just can’t understand how we get the profile of good libraries/pools on the TapeStation and then the sequencing is a disaster. I will ask our centre for the index reads and see if I can notice anything obvious. Is there anything non-standard about the Illumina Experiment Manager config that I should inform our genome centre of?

I’ve not sequenced a no-RNA control - we do have some difficulty with adapter dimers, though nothing dramatic.

I’ll update if we ever get to the bottom of this!

Adam

Adam Passman

Nov 21, 2022, 12:46:01 PM

to Smart-3SEQ

Forgot to attach one of our successful Qual Score graphs.

Seq Qual Hist.jpg

Joe Foley

Nov 21, 2022, 1:15:29 PM

to smart...@googlegroups.com

> Our genome centre has been doing a 1% PhiX spike-in, should they be using more?

No, the low signal during the all-G cycles doesn't cause any trouble since they're after the cycles used for cluster registration (1-5).

> Is there anything non-standard about the Illumina Experiment Manager config that I should inform our genome centre of?

In the sample sheet, the index sequences should be the reverse complements of their sequences in the PCR primers. This is standard but if you made a custom configuration it's an easy thing to miss.
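
For checking a list by hand, a small helper along these lines should do the reverse-complementing. This is just a sketch: the function name is my own, and the 8-nt sequence in the usage comment is a made-up example, not one of your indexes.

```shell
# Reverse-complement a DNA sequence given as the first argument.
revcomp() {
  printf '%s\n' "$1" |
    awk '{
      out = ""
      # walk the sequence from the last base to the first
      for (i = length($0); i >= 1; i--) out = out substr($0, i, 1)
      print out
    }' |
    tr 'ACGTacgt' 'TGCAtgca'   # complement each base
}

# Usage with a made-up 8-nt index:
#   revcomp ATCACGAT   # prints ATCGTGAT
```

Running each primer index through it and comparing against what was entered in the sample sheet should show quickly whether the orientation was missed.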

> I will ask our centre for the index reads and see if I can notice anything obvious.

That's definitely the best diagnostic at this point. FYI here's a one-liner to tally the reads in the index FASTQ file:

$ zcat Undetermined_S0_I1_001.fastq.gz | awk 'NR % 4 == 2' | sort | uniq -c | sort -rnk 1,1 > index_read_frequency.txt
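
Once you have that tally, a rough way to test the reverse-complement hypothesis is to label each observed index read by whether it matches a sample-sheet index as listed, its reverse complement, or neither. The sketch below assumes a plain text file with one sample-sheet index per line (the file name in the usage comment is hypothetical) plus the tally file written by the one-liner above; the function name and labels are my own.

```shell
# classify_index_reads SAMPLE_SHEET_INDEXES TALLY
#   SAMPLE_SHEET_INDEXES: one index sequence per line, as entered in the sample sheet
#   TALLY: "count sequence" lines, e.g. index_read_frequency.txt
classify_index_reads() {
  awk '
    BEGIN { comp["A"] = "T"; comp["T"] = "A"; comp["C"] = "G"; comp["G"] = "C" }
    function revcomp(s,    i, c, out) {
      out = ""
      for (i = length(s); i >= 1; i--) {
        c = substr(s, i, 1)
        out = out ((c in comp) ? comp[c] : "N")
      }
      return out
    }
    # First file: remember each listed index and its reverse complement.
    NR == FNR { if ($1 != "") { listed[$1] = 1; rc[revcomp($1)] = 1 }; next }
    # Second file: label each observed index read.
    {
      label = "unexpected"
      if ($2 in listed) label = "as_listed"
      else if ($2 in rc) label = "reverse_complement"
      print $1, $2, label
    }
  ' "$1" "$2"
}

# Usage (first file name is hypothetical):
#   classify_index_reads sample_sheet_indexes.txt index_read_frequency.txt | head -n 20
```

If the top entries come out as reverse_complement, the sample sheet orientation is the likely culprit. If they are mostly unexpected, look at what they actually are: poly(A) would fit the contamination mentioned earlier, and all-G would be consistent with no signal in the index cycles on a 2-color instrument.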
