There are several possible reasons why we see read alignments in
negative-control samples. One possibility is library preparation
artifacts: contamination of the sample with some other RNA, or with
another cDNA library that has compatible adapters. We try to avoid
this by working carefully in a dedicated pre-PCR workstation, but in
this case we used the highest degree of amplification because we were
comparing the controls with single-cell samples, so even a very small
amount of contamination could be detectable. Another possibility is
sequencing artifacts: those reads were actually derived from a
different sample sequenced in the same pool, but were misassigned to
the control samples because of problems reading the index. We've
found that low-quality Smart-3SEQ libraries often have messy reads of
the i7 index, so our lab has switched to i5 indexing for LCM samples.
The third possibility is bioinformatic artifacts: the byproduct
molecules that accumulate in low-input, high-amplification libraries
have random sequences, and some of these may happen to match some
part of a large, complex genome like human well enough to get a
false-positive alignment.
We didn't investigate this enough to distinguish the possibilities
because it looked like a very small problem even in the worst-case
scenario of the highest amplification and less than the lowest
amount of RNA input (e.g. Figure 4 in the paper). The controls are
the baseline for false-positive hits that occur by chance, and the
results showed that even the single-cell FFPE LCM libraries exceeded
that baseline. In addition, the single-cell libraries tended to
recapitulate the biological signal of the bulk libraries, with more
noise of course, while the controls gave us only noise.
On 8/28/20 11:07 PM, Erin Wu wrote:
The final row of this screenshot of part of the data shows what I
calculated using the function COUNTIF(range, ">0"). Is that
calculation right?
On Saturday, August 29, 2020 at 1:47:46 PM UTC+8, Erin Wu wrote:
Hello,
I'm very interested in your method, but I have some questions
about your data. I downloaded the gene expression tables from the
supplementary material of the article "Gene-expression profiling of
single cells from archival tissue with laser-capture
microdissection and Smart-3SEQ" and checked the data, finding that
the number of detected genes in control1, 2, 3, 4, 5, 6 can be up
to about 4000. I would like to know what the controls are. Are they
NTCs (no-template controls)? If so, why could you still detect
mapped genes in the controls?
I want to know if I got this right. Thanks so much if you can
explain.