Large libraries

Cary Brandolino

Feb 22, 2023, 6:26:43 PM

to Smart-3SEQ

Hello everyone,

I'm trying out a newer version of this protocol and having a bit of an issue. The fragment sizes of the libraries I've produced are quite large (~1200 bp). In a test I did about a year ago with an older protocol version, I had ~500 bp fragments, which, while still on the larger end, seemed more appropriate. I'm using pre-isolated RNA, 96 samples at a time with pre-pooling, 12 PCR cycles, and the xGen 10nt IDT Indexes. I haven't done any tinkering of my own to try to solve this; admittedly, I'm still new to this and not entirely sure where to start or what would cause it. I'd greatly appreciate any insight.

Thanks,

Cary

b3c1A_library_High Sensitivity DNA Assay_DEDAE00125_2023-02-21_14-30-26.pdf

Joe Foley

Feb 23, 2023, 10:17:25 AM

to smart...@googlegroups.com

Hi Cary,

This might be normal and usable. Keep in mind the raw-data electropherogram displays length distributions in a very misleading way:

1. The x-axis is migration speed, which is roughly logarithmic with molecule length, so the quantity of short molecules is spread out over a wide horizontal space and therefore reaches lower vertically, while the quantity of long molecules is compressed into a narrow horizontal space and is therefore taller.
2. The y-axis is fluorescence, which is proportional to molecule length (a 600 bp molecule attracts twice as many fluorophores as a 300 bp molecule), so longer molecules will show higher fluorescence than shorter molecules at the same copy number. In other words, the graph corresponds with mass concentration (pg/µL), which is misleading if what you care about is molarity (pM).
3. The Bioanalyzer and Fragment Analyzer (not TapeStation) measure fluorescence as mobile molecules pass by a stationary detector, so faster-moving molecules spend less time in front of the detector and give less detectable fluorescence signal, meaning (again) longer molecules will show higher fluorescence than shorter molecules at the same copy number.
4. I'm not sure, but it's possible the "average library size" in your table is an average by mass rather than by copy number (molarity); an average by molarity would estimate a shorter length (see #2).

So basically the raw data will make the library look shifted more to the right (longer molecules) than an intuitive histogram of molarity vs. length.
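
To make the mass vs. molarity distinction concrete, here is a minimal Python sketch (not part of bioanalyzeR, and not how any instrument software computes its table). It assumes the common approximation of about 660 g/mol per base pair of double-stranded DNA, and the three-bin distribution is made up purely for illustration.

```python
# Assumption: ~660 g/mol per base pair of double-stranded DNA (a common approximation).
GRAMS_PER_MOL_PER_BP = 660.0


def mass_to_molar(length_bp, mass_pg_per_ul):
    """Convert mass concentration (pg/uL) at one molecule length to molarity (pM)."""
    # pg/uL equals 1e-6 g/L; dividing by g/mol gives mol/L; 1e12 pM per mol/L.
    return mass_pg_per_ul * 1e6 / (GRAMS_PER_MOL_PER_BP * length_bp)


def weighted_mean(values, weights):
    """Mean of `values` weighted by `weights`."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical distribution: the same mass in each of three length bins.
lengths = [300, 600, 1200]       # bp
mass = [100.0, 100.0, 100.0]     # pg/uL, made-up values

molar = [mass_to_molar(l, m) for l, m in zip(lengths, mass)]

mean_by_mass = weighted_mean(lengths, mass)
mean_by_molarity = weighted_mean(lengths, molar)

print(f"mean length weighted by mass:     {mean_by_mass:.0f} bp")
print(f"mean length weighted by molarity: {mean_by_molarity:.0f} bp")
```

With equal mass in each bin, the mass-weighted mean is 700 bp while the molarity-weighted mean is about 514 bp, because the short bins contain many more molecules per picogram.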

If you send me the original .xad file I can regraph it.

JWF

--
You received this message because you are subscribed to the Google Groups "Smart-3SEQ" group.
To unsubscribe from this group and stop receiving emails from it, send an email to smart-3seq+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/smart-3seq/389c14ee-96cd-4210-bc5e-f81dc296c861n%40googlegroups.com.

Cary Brandolino

Feb 23, 2023, 11:30:21 AM

to smart...@googlegroups.com

Hi Joe,

Thank you for your quick reply! I really appreciate your insight; it is so helpful, and I had no idea how misleading Bioanalyzer results could be. Here is the XAD file; thank you so much for the offer to regraph. For the QC I do on later libraries, is there a better method than using the Bioanalyzer? Would TapeStation be preferable? Thank you again.

Best,

Cary

b3c1A_library_High Sensitivity DNA Assay_DEDAE00125_2023-02-21_14-30-26.xad

Joe Foley

Feb 23, 2023, 3:34:40 PM

to smart...@googlegroups.com

Here's a graph that's rescaled to molarity vs. (linear) molecule length:

The molar mean length, within the range of 100 to 2500 bp, is 896 bp and the molar median is 775 bp. 42% of molecules within that range are within the range of 100-700 bp, roughly the tolerance of Illumina sequencers. Longer molecules don't create a big problem; they simply don't produce amplifiable clusters on the flow cell.
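
For anyone who wants to see what those three summary numbers mean, here is a rough Python sketch on a binned molarity-vs-length table. The bins and values are invented for illustration, and the discrete sums are a simplification; this is not how bioanalyzeR integrates the actual trace.

```python
def restrict(lengths, molar, low, high):
    """Keep only (length, molarity) bins whose length lies within [low, high]."""
    return [(l, m) for l, m in zip(lengths, molar) if low <= l <= high]


def molar_mean(bins):
    """Mean length weighted by molarity."""
    total = sum(m for _, m in bins)
    return sum(l * m for l, m in bins) / total


def molar_median(bins):
    """Length at which the cumulative molarity first reaches half the total."""
    half = sum(m for _, m in bins) / 2
    running = 0.0
    for length, m in sorted(bins):
        running += m
        if running >= half:
            return length


def fraction_between(bins, low, high):
    """Share of total molarity contributed by bins within [low, high]."""
    total = sum(m for _, m in bins)
    return sum(m for l, m in bins if low <= l <= high) / total


# Hypothetical binned data: molecule length (bp) and molarity (pM) per bin.
lengths = [50, 200, 400, 600, 800, 1000, 1500, 2000, 3000]
molar = [5.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 2.0]

window = restrict(lengths, molar, 100, 2500)   # ignore bins outside 100-2500 bp
print("molar mean:  ", molar_mean(window))
print("molar median:", molar_median(window))
print("fraction in 100-700 bp:", fraction_between(window, 100, 700))
```

Restricting to the window first matters: the summary statistics only describe molecules inside the chosen range, so primer-dimer or very long material outside it does not shift them.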

So yes, these libraries are skewed toward somewhat long molecules, but not terribly so. I would go ahead and sequence them and just use a high loading concentration: 150% of the recommended molarity on the NextSeq 500 is a good starting point for Smart-3SEQ libraries.
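
The arithmetic for that adjustment is just a scaling plus a standard C1·V1 = C2·V2 dilution. A small Python sketch follows; the recommended molarity, stock concentration, and final volume are placeholder numbers, not values from the Smart-3SEQ protocol or from Illumina documentation, so substitute the figures for your own kit and instrument.

```python
def loading_concentration(recommended_pm, factor=1.5):
    """Scale the recommended loading molarity (pM) by a factor, e.g. 1.5 for 150%."""
    return recommended_pm * factor


def dilution_volumes(stock_pm, target_pm, final_volume_ul):
    """Return (stock volume, diluent volume) in uL using C1*V1 = C2*V2."""
    if target_pm > stock_pm:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_pm * final_volume_ul / stock_pm
    return stock_ul, final_volume_ul - stock_ul


# Placeholder inputs (assumptions, not protocol values).
recommended_pm = 2.0      # hypothetical recommended loading molarity
stock_pm = 20.0           # hypothetical denatured library stock
final_volume_ul = 1300.0  # hypothetical final loading volume

target_pm = loading_concentration(recommended_pm)   # 150% of recommended
stock_ul, diluent_ul = dilution_volumes(stock_pm, target_pm, final_volume_ul)

print(f"target: {target_pm} pM")
print(f"mix {stock_ul} uL stock with {diluent_ul} uL diluent")
```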

JWF

P.S. No, you don't need a different electrophoresis instrument to see this, just different software, if you're familiar with R: https://stanford.edu/~jwfoley/bioanalyzeR.html

Unfortunately, the missing peaks in your extra wells trigger a bug in the most recent release. The development version has a patch for it, but that isn't merged into a new release yet, so you'll have to install from the development branch:

> library(remotes)
> install_github("jwfoley/bioanalyzeR", ref = "devel")

Then this is how I did the above analysis:

> library(ggplot2)
> library(bioanalyzeR)
> rundata <- subset(read.electrophoresis("b3c1A_library.xml"), well.number == 1)
> qplot.electrophoresis(rundata, xlim = c(100, 2500), show.peaks = "n", region.alpha = NA) + geom_vline(xintercept = 700, linetype = "dashed")
> summarize.custom(rundata, 100, 2500)
> integrate.custom(rundata, 100, 700) / integrate.custom(rundata, 100, 2500)

Cary Brandolino

Feb 23, 2023, 5:16:16 PM

to smart...@googlegroups.com

Thank you so much for the analysis!! That's great to hear it is still sequenceable and I'll be sure to request a high loading concentration when we sequence. And thank you as well for the info on how to do this analysis, that is extremely helpful to have and I will certainly be using it in the future. I do wonder, though, why would it be that the fragments are tending towards the higher end? Before when we tested, they seemed to be a bit closer to the normal range, so I am curious as to why this is. Do you have any insight on this?
