Re: [Smart-3SEQ] Re: gene expression data

Joe Foley

Aug 31, 2020, 1:47:21 PM

to smart...@googlegroups.com

There are several possible reasons why we see read alignments in negative-control samples.

One possibility is library-preparation artifacts: contamination of the sample with some other RNA, or with some other cDNA library that has compatible adapters. We try to avoid this by working carefully in a dedicated pre-PCR workstation, but in this case we used the highest degree of amplification because we were comparing these with single-cell samples, so even a very small amount of contamination could be detectable.

Another possibility is sequencing artifacts: those sequence reads were actually derived from a different sample sequenced in the same pool, but were misassigned to the control samples because of problems reading the index. We've found that low-quality Smart-3SEQ libraries often have messy reads of the i7 index, so our lab has switched to i5 indexing for LCM samples.

The third possibility is bioinformatic artifacts: the byproduct molecules that accumulate in low-input, high-amplification libraries have random sequences, and some of these may happen to match some part of a large, complex genome like the human genome well enough to get a false-positive alignment.

We didn't investigate this enough to distinguish the possibilities because it looked like a very small problem even in the worst-case scenario of the highest amplification and less than the lowest amount of RNA input (e.g. Figure 4 in the paper). The controls are the baseline for false-positive hits that occur by chance, and the results showed that even the single-cell FFPE LCM libraries exceeded that baseline. The single-cell libraries also tended to recapitulate the biological signal of the bulk libraries, with more noise of course, while the controls gave us only noise.
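For anyone who wants to reproduce the COUNTIF(range, ">0") count from the quoted message outside a spreadsheet, here is a minimal sketch in Python. The gene names, sample names, and read counts are invented for illustration and are not taken from the supplementary tables; the helper functions are hypothetical as well. It counts genes with at least one read per sample, and separately counts genes detected in any control, since those two numbers can differ.

```python
# Toy table: gene -> read counts per sample.
# All names and values are invented for illustration (assumption, not real data).
counts = {
    "GENE_A": {"control1": 0, "control2": 1, "single_cell1": 5},
    "GENE_B": {"control1": 0, "control2": 0, "single_cell1": 2},
    "GENE_C": {"control1": 3, "control2": 0, "single_cell1": 0},
    "GENE_D": {"control1": 0, "control2": 0, "single_cell1": 7},
}


def detected_genes(table, sample):
    """Count genes with at least one read in `sample`, like COUNTIF(range, ">0")."""
    return sum(1 for per_sample in table.values() if per_sample.get(sample, 0) > 0)


def detected_in_any(table, samples):
    """Count genes with at least one read in any of `samples` (the union)."""
    return sum(
        1
        for per_sample in table.values()
        if any(per_sample.get(s, 0) > 0 for s in samples)
    )


controls = ["control1", "control2"]

# Per-control counts, and the largest of them as a simple chance baseline.
per_control = {s: detected_genes(counts, s) for s in controls}
baseline = max(per_control.values())

print("detected genes per control:", per_control)
print("detected in any control:", detected_in_any(counts, controls))
print("single_cell1 detected genes:", detected_genes(counts, "single_cell1"))
print("single_cell1 exceeds control baseline:",
      detected_genes(counts, "single_cell1") > baseline)
```

The per-sample count is the one that corresponds to COUNTIF on a single column. Adding up or taking the union across several controls gives a larger number, so it is worth being explicit about which one is being compared with the experimental samples.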

On 8/28/20 11:07 PM, Erin Wu wrote:

The final row of this screenshot of part of the data shows what I calculated using the function COUNTIF(range, ">0"). Is that the right way to calculate it?

On Saturday, August 29, 2020 at 1:47:46 PM UTC+8, Erin Wu wrote:

Hello,
I'm very interested in your method, but I have some questions about your data. I downloaded the gene expression tables from the supplementary material of the article "Gene-expression profiling of single cells from archival tissue with laser-capture microdissection and Smart-3SEQ" and checked the data, and found that the number of detected genes in control1, 2, 3, 4, 5, and 6 can be as high as about 4000. I would like to know what the controls are. Are they NTCs (no-template controls)? If so, why could you still detect mapped genes in the controls?

I'd like to know whether I have understood this correctly. Thank you very much for any explanation.
