Odd base composition - Illumina metrics


Alejandro Pezzulo

Oct 25, 2019, 4:13:20 PM
to Smart-3SEQ
Hi,

I'm having trouble interpreting base composition and run QC metrics on a SMART-3SEQ run of 24 pooled samples (HiSeq4000).
The Bioanalyzer trace of the sample (starting material was cells, lysed and used directly) looked very promising (attached).
In terms of library quantification, the KAPA qPCR concentration was roughly 70% of the Qubit value (which has been typical for me).

Unfortunately, the Illumina metrics for lane 7 of this run look really bad. Base composition is odd throughout, and only 4% of clusters pass filter.
We ran this with 10% PhiX to minimize issues with the GGG early in read 1.

Does anyone have ideas on how to interpret / what to optimize? I previously ran a successful HiSeq4000 lane with a pool of 12 samples.

Alejandro Pezzulo
Bioanalyzer trace.png
Pezzulo- %Base.png
Pezzulo Lane 7 stats.png
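
[Editor's sketch, not part of the original post] One way to put numbers on "odd base composition" is to tabulate the base fractions at each cycle from the read sequences and flag cycles dominated by a single base, such as the GGG early in read 1. The function names and the 0.9 cutoff below are illustrative assumptions, and the toy reads assume the layout described later in this thread (5 random bases followed by GGG).

```python
from collections import Counter

def per_cycle_composition(reads, n_cycles):
    """Return one {base: fraction} dict per sequencing cycle."""
    counts = [Counter() for _ in range(n_cycles)]
    for seq in reads:
        for i, base in enumerate(seq[:n_cycles]):
            counts[i][base] += 1
    fractions = []
    for c in counts:
        total = sum(c.values())
        fractions.append({b: c[b] / total for b in "ACGTN"} if total else {})
    return fractions

def low_diversity_cycles(fractions, threshold=0.9):
    """1-based cycles where one base exceeds `threshold` (hypothetical cutoff)."""
    return [i + 1 for i, f in enumerate(fractions)
            if f and max(f.values()) > threshold]

# Toy reads (made up): 5 varied bases, then GGG, then 4 bases of insert.
reads = ["ACGTAGGGTTCA", "CGTACGGGACGT", "GTACGGGGCATG", "TACGTGGGGTAC"]
fractions = per_cycle_composition(reads, 12)
low = low_diversity_cycles(fractions)
print(low)
```

On real data the reads would come from the lane's FASTQ file rather than a hard-coded list; the flagged cycles are the ones where a PhiX spike-in is meant to restore diversity.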

Joe Foley

Oct 25, 2019, 4:58:35 PM
to smart...@googlegroups.com
Is it the same pool of libraries in every lane of this run?
--
You received this message because you are subscribed to the Google Groups "Smart-3SEQ" group.
To unsubscribe from this group and stop receiving emails from it, send an email to smart-3seq+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/smart-3seq/8c8c1b06-dd8b-45e3-b108-d2b7d7caa3fe%40googlegroups.com.


Lutz Froenicke

Oct 26, 2019, 8:28:26 PM
to smart...@googlegroups.com
I would suggest studying the thumbnail images (for the first cycles) and the fluorescence intensities.
Do you see significantly fewer clusters than in the other lanes?

Alejandro Pezzulo

Oct 27, 2019, 3:43:49 PM
to Smart-3SEQ
My pooled library is only in lane 7. All other lanes are regular TruSeq libraries for unrelated projects.

Thank you,
Alejandro

Joe Foley

Oct 28, 2019, 1:29:26 PM
to smart...@googlegroups.com
The base composition looks like a low-quality library that's mostly RT primer dimers, i.e. 5N + 3G + 30A with no cDNA insert. That's odd because the size distribution looks so good on the Bioanalyzer. Can you process the few million reads that did pass the chastity filter and see what's in them? If it's possible to rerun bcl2fastq and examine the reads that failed the filter too, that might be informative.

I haven't used the HiSeq 4000 very much; is it normal that it reports exactly the same cluster density in every lane? If those numbers aren't accurate then it could just be underloaded.
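
[Editor's sketch, not part of the original post] The check suggested above could look roughly like this: count how many reads match the insert-less primer-dimer layout (5N + 3G + poly-A), split by whether the read passed the chastity filter. It assumes standard Illumina FASTQ headers, where the second field of the header comment is `Y` for a read that failed the filter and `N` otherwise, and a FASTQ that was generated with failed reads included. The function names and the `min_poly_a` tolerance are hypothetical, not protocol values.

```python
import re
from collections import Counter

def dimer_pattern(min_poly_a=20):
    # Assumed insert-less read: 5 random bases, GGG, then a poly-A run.
    return re.compile(r"^[ACGTN]{5}G{3}A{%d,}" % min_poly_a)

def parse_fastq(lines):
    """Yield (failed_filter, sequence) for each complete 4-line record."""
    for header, seq, _plus, _qual in zip(*[iter(lines)] * 4):
        parts = header.strip().split(" ", 1)
        # Header comment is read:is_filtered:control:index; 'Y' = failed filter.
        flag = parts[1].split(":")[1] if len(parts) == 2 else "N"
        yield flag == "Y", seq.strip()

def dimer_fraction_by_filter(lines, min_poly_a=20):
    """Fraction of dimer-like reads among passed and failed reads."""
    pattern = dimer_pattern(min_poly_a)
    total, dimers = Counter(), Counter()
    for failed, seq in parse_fastq(lines):
        key = "failed" if failed else "passed"
        total[key] += 1
        if pattern.match(seq):
            dimers[key] += 1
    return {key: dimers[key] / total[key] for key in total}

# Toy records (made up): two dimer-like reads and one read with an insert.
dimer_a = "ACGTA" + "GGG" + "A" * 24
dimer_b = "CGTAC" + "GGG" + "A" * 24
insert = "ACGTAGGGTTCAGGCTAGCTAGGATCGATCGA"
example = []
for flag, seq in [("N", dimer_a), ("Y", dimer_b), ("N", insert)]:
    example += ["@M1:1:FC:7:1101:1000:2000 1:%s:0:ACGT" % flag,
                seq, "+", "I" * len(seq)]
result = dimer_fraction_by_filter(example)
print(result)
```

For a real lane, `lines` would be an open (possibly gzip-decompressed) FASTQ file handle. A much higher dimer fraction among the failed reads than among the passed reads would be consistent with the primer-dimer explanation, though it would not prove it.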

Alejandro Pezzulo

Oct 28, 2019, 2:50:25 PM
to Smart-3SEQ

We looked at the thumbnails as Lutz suggested and found similar numbers of clusters, and the intensities were high.
I'll check with our core facility re: possible underloading.
We'll process the reads and try to look at failed reads too.

Thanks so much
Alejandro

Lutz Froenicke

Oct 28, 2019, 6:23:12 PM
to smart...@googlegroups.com
This then indicates "hidden" RT primer dimers as suggested by Joe.

Yes, the cluster density metric is unfortunately useless for the HiSeq 4000 sequencers.


Alejandro Pezzulo

Nov 26, 2019, 4:41:31 PM
to Smart-3SEQ
Hi all,

Just wanted to follow up on this. You may recall that we had a good-looking length distribution on the Bioanalyzer but very low read quality and very few reads passing filter. We reran the same library with 50% PhiX (on a MiSeq Nano chip, just for QC purposes) and got perfect results: good quality throughout and the expected number of reads.

Based on this, we think the three Gs early in read 1 threw the sequencer off and caused the poor results. When we run this library on the NovaSeq, we may need to start with around 30% PhiX and adjust up or down in future runs based on the results.

Alejandro


Joe Foley

Nov 27, 2019, 1:31:46 PM
to smart...@googlegroups.com
That's great that your libraries are fine after all. I'm not sure about the implications though. Smart-3SEQ libraries have always worked with little or no PhiX spike-in on the MiSeq.

One of our guesses was underloading; did you load the library pool at a higher concentration on the MiSeq relative to your previous HiSeq run? And how was the cluster density?

Illumina's basecaller has improved in recent years, so their recommendations for PhiX spike-ins with low-diversity libraries have eased; in the worst case it looks like you should only need 10-20% PhiX on the HiSeq 4000 with updated software: https://support.illumina.com/bulletins/2017/02/how-much-phix-spike-in-is-recommended-when-sequencing-low-divers.html. That would be the cautious thing to do. I'm still skeptical that this was really the problem this time, though: people other than our lab have already been running Smart-3SEQ libraries on the HiSeq 3000/4000 and I haven't heard of this outcome before, and you mentioned that your previous run was successful too. Are other people using that much PhiX on the HiSeq 4000?

If there's no other explanation, then your proposal to start with high PhiX and gradually reduce it in future runs makes sense. That said, if you ever have a batch with so many libraries that you need to load it in more than one lane anyway, it would be especially informative to load the same library pool with different amounts of PhiX.

Luis Barreiro

Mar 18, 2021, 8:36:43 PM
to Smart-3SEQ

Hi Alejandro,

I am curious to know whether you ever got to optimize the percentage of PhiX needed to get good data on the NovaSeq.

Best
Luis

pezz...@gmail.com

Apr 1, 2021, 5:46:52 PM
to Smart-3SEQ
Hi,

For our last set of experiments we used 30% PhiX on the NovaSeq, with the intention of optimizing down as needed. In the end, we got so many reads that we didn't have to repeat the experiment...
We are planning to use it again for a lot of experiments. I'm a bit reluctant to do the PhiX dose response I had planned, since the cheapest option on the NovaSeq (using SP lanes) would be ~$1700/condition, but depending on how the experiments go I may be forced to do so to save money in the long run.

Alejandro
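
[Editor's sketch, not part of the original post] The trade-off described above can be laid out as simple arithmetic: each extra PhiX condition costs one lane, while a lower PhiX fraction leaves more of each lane for the library. The ~$1700 per condition comes from the post; the reads-per-lane figure is a hypothetical placeholder, and the sketch assumes that PhiX takes up the same fraction of reads as its spike-in percentage and that every condition passes filter equally well, which is exactly what a dose response would be testing.

```python
def dose_response_plan(phix_fractions, cost_per_condition, reads_per_lane):
    """Estimate library reads and cost per million library reads per condition."""
    plan = []
    for fraction in phix_fractions:
        library_reads = reads_per_lane * (1 - fraction)
        plan.append({
            "phix": fraction,
            "library_reads": library_reads,
            "cost_per_million_library_reads":
                cost_per_condition / (library_reads / 1e6),
        })
    total_cost = cost_per_condition * len(phix_fractions)
    return plan, total_cost

# reads_per_lane is a made-up placeholder, not an instrument specification.
plan, total_cost = dose_response_plan(
    phix_fractions=[0.30, 0.20, 0.10],
    cost_per_condition=1700,
    reads_per_lane=400e6,
)
for row in plan:
    print(row)
print(total_cost)
```

Comparing the cost per million library reads across conditions gives a rough sense of how many future runs it would take for a lower PhiX fraction to pay back the cost of the test.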