Can I use absolute abundance, and how?


yun li

Dec 22, 2015, 11:18:46 AM
to Qiime 1 Forum

Hi QIIME experts,
I have a question about whether I can get absolute abundance from QIIME and, if so, how to do it.
I have some 16S rRNA sequencing data, and normally I just use relative abundance. However, my collaborators are interested in absolute abundance (or abundance adjusted by the total DNA amount for each sample). They provided the DNA concentration for each sample, but they said they did not weigh the samples before extracting the DNA. Is it valid to adjust the total abundance of bacterial DNA in each sample by the DNA concentration? If it is, how can I do it? If it is not, what other information do I need?


Thanks,
Yun

Colin Brislawn

Dec 22, 2015, 1:01:09 PM
to Qiime 1 Forum
Hello Yun,

> absolute abundance (or abundance adjusted by total DNA amount for each sample)
I don't think we have implemented something like that...

> Is it right to adjust the total abundance of bacterial DNA for each sample by the DNA concentration?
I don't think it is, because QIIME never knows the concentration of DNA in a sample; QIIME only knows the number of reads in a sample.

I'm guessing that your collaborator has a background in biochemistry or wet-lab biology, where DNA concentration and sample mass matter for wet-lab techniques. Once the data comes off the sequencer, we enter dry-lab biology or informatics, where the number of reads per sample is the comparable metric and matters more.

Let me know if that answers your question. If your collaborator has a question, they can post here too! If there is a context where this is needed, I would love to hear about it.

Thanks!
Colin

yun li

Dec 22, 2015, 1:29:00 PM
to Qiime 1 Forum
Hi Colin,
Thank you for your reply. So after I get an OTU table from QIIME, can I adjust the number of reads for each OTU (or higher taxonomic level) using the DNA concentration of each sample? If not, would you mind explaining why?

Thanks,
Yun

Colin Brislawn

Dec 22, 2015, 2:00:40 PM
to Qiime 1 Forum
Hello Yun,

> can I adjust the reads number of each OTU (or higher taxa level) with DNA concentration of each sample?
No, not with QIIME.

> If not, do you mind explain why?
We find that it's more important to normalize by the number of reads in a sample. This QIIME script works great:

Colin

yun li

Dec 22, 2015, 2:24:55 PM
to Qiime 1 Forum
Hi Colin,
I still do not understand. If I have DNA concentration data, can I use DNA concentration instead of the total number of reads to normalize the OTU table?

Thanks,
Yun

Colin Brislawn

Dec 22, 2015, 7:09:01 PM
to Qiime 1 Forum
Hello Yun,

> can I use DNA concentration instead of total reads number to normalize the OTU table?
No. (You could do it in another program such as Excel, R, or Python, but not with QIIME.)

It's a bad idea! Don't try it!

Colin

yun li

Dec 23, 2015, 9:38:36 AM
to Qiime 1 Forum
Hi Colin,
Thank you for pointing that out.

Thanks,
Yun

Colin Brislawn

Dec 23, 2015, 5:01:52 PM
to Qiime 1 Forum
Hello Yun,

I was just rereading our conversation, and I realized that I should explain why you should normalize by number of reads in a sample, not DNA concentration.

Example:
Sample1  11,000 reads
Sample2  11,500 reads
Sample3  22,000 reads

Sample:   1,   2,   3
OTU1     45,  43, 102
OTU2     42,  46,  44
OTU3      0,   2,  92

If we look at the OTU counts (and ignore the number of reads), we would draw these conclusions:
OTU2 is equally common in all samples.
OTU1 is most common in sample 3.

But we know these conclusions are not perfect because samples do not have the same number of reads! Let's randomly sample 10,000 reads from each sample (this is called subsampling or rarefying):
Sample1  10,000 reads
Sample2  10,000 reads
Sample3  10,000 reads

Sample:   1,   2,   3
OTU1     43,  41,  47
OTU2     39,  45,  22
OTU3      0,   0,  45

Ah ha! Now we reach these conclusions:
OTU2 is less common in sample 3.
OTU1 is equally common in all samples.

This is why normalizing by number of reads is important. 
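
For anyone who wants to see the idea in code, here is a minimal Python sketch of the rarefying step described above. It is not the QIIME script, just an illustration using the standard library. The "other" bucket is my own assumption: it stands in for all the reads that belong to OTUs not shown in the small table, so each sample adds up to its total. Because the draw is random, the numbers you get will differ a little from the ones in my example.

```python
import random

# Total reads per sample and the three OTU counts from the example above.
totals = {"Sample1": 11_000, "Sample2": 11_500, "Sample3": 22_000}
counts = {
    "Sample1": {"OTU1": 45, "OTU2": 42, "OTU3": 0},
    "Sample2": {"OTU1": 43, "OTU2": 46, "OTU3": 2},
    "Sample3": {"OTU1": 102, "OTU2": 44, "OTU3": 92},
}

def rarefy(otu_counts, total_reads, depth, rng):
    """Draw `depth` reads without replacement and recount each OTU."""
    # One list entry per read, labelled with the OTU it belongs to.
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    # Assumption: every remaining read goes into a hypothetical "other" bucket.
    pool += ["other"] * (total_reads - len(pool))
    drawn = rng.sample(pool, depth)
    return {otu: drawn.count(otu) for otu in otu_counts}

rng = random.Random(0)  # fixed seed so the run is repeatable
rarefied = {s: rarefy(counts[s], totals[s], 10_000, rng) for s in counts}
for sample, row in rarefied.items():
    print(sample, row)
```

Sampling without replacement means a rarefied count can never exceed the original count, and an OTU that was absent (OTU3 in Sample1) stays at zero.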

...
Now let's add DNA concentration
Sample1  11,000 reads 22 ng/ml
Sample2  11,500 reads  8 ng/ml (what!?!)
Sample3  22,000 reads 50 ng/ml
OK, so the sample with the most DNA has the most reads. That makes sense. But Sample2, with very little DNA, has a normal number of reads. This may sound strange, but I see it all the time on the MiSeq: some samples get lots of reads even with very little input DNA. Because there is no consistent link between number of reads and DNA concentration, you have to choose which one to use for normalization. The consensus in the field is to use number of reads, not DNA concentration, because we think number of reads makes more of a difference in downstream analysis.
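
A quick way to check the "no consistent link" point is to divide reads by DNA concentration for each sample. This is only a sketch using the made-up numbers from my example above, not real data; the variable names are just for illustration.

```python
# (total reads, DNA concentration in ng/ml), copied from the example above.
samples = {
    "Sample1": (11_000, 22),
    "Sample2": (11_500, 8),
    "Sample3": (22_000, 50),
}

# If reads scaled with DNA concentration, this ratio would be roughly constant.
reads_per_conc = {name: reads / conc for name, (reads, conc) in samples.items()}
for name, ratio in reads_per_conc.items():
    print(f"{name}: {ratio:.1f} reads per ng/ml")
```

This gives 500.0, 1437.5, and 440.0 reads per ng/ml. Sample2 is almost three times higher than the other two, so in this example concentration does not predict read count.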

I hope that helps!
Happy holidays!
Colin

yun li

Dec 29, 2015, 2:58:45 PM
to Qiime 1 Forum
Hi Colin,
Thank you so much for your last email. That makes it much clearer for me.

Happy holidays!
Yun