Contamination from negative controls

Rune Grønseth

Nov 2, 2015, 4:21:56 PM

to Qiime Forum

Dear all,

We have low-biomass samples from the respiratory tract and also have negative controls (i.e. phosphate-buffered saline (PBS) that we use to suspend sampling swabs etc.) that go through the same DNA extraction, PCR protocol and sequencing (Illumina MiSeq, 2×300 bp, V3-V4).

Is there an accepted method to control for contamination seen in the negative controls? If I remove all OTUs seen in the negative controls (by participant/individual), I'm afraid I will remove important signals from the actual samples.

Best wishes from Rune

Kyle Bittinger

Nov 3, 2015, 11:17:20 AM

to Qiime Forum

Rune,

First of all, kudos for sequencing negative controls and including them in your analysis. I hope that you have some oral samples, too!

There is no automatic method to do this. I agree that removing any OTU seen in a negative control is a poor method. However, the best method depends on your study. For example, group_significance.py can help to show that particular OTUs occur more frequently or have greater abundance in respiratory tract samples relative to negative controls.

See this article for an example of a respiratory tract study that includes negative control samples.

http://www.ncbi.nlm.nih.gov/pubmed/21680950

--Kyle

Rune Grønseth

Nov 3, 2015, 2:29:05 PM

to Qiime Forum

Thank you, Kyle! Yes, we read your very nice paper, and it is part of the reason we chose to include oral samples, negative controls and quite extensive sampling from the airways. Would it be possible to treat OTUs that occur relatively more often in the airway samples (let's say by a factor of 5) as genuinely identified (and not contamination)?

Rune

Kyle Bittinger

Nov 3, 2015, 3:02:48 PM

to Qiime Forum

Yes, you can follow the method behind figure 5 in this paper (written about fungi but the procedure still applies):

http://www.ncbi.nlm.nih.gov/pubmed/25344286

Basically, you're doing a Fisher's exact test for differences in presence/absence between sample types. (You can use the g_test method in group_significance.py.) This quantifies the level of evidence that the frequency of occurrence is different between sample types.
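For readers who want to see what the presence/absence comparison involves, here is a minimal sketch of a one-sided Fisher's exact test for a single OTU, written in plain Python. It is an illustration only, not the implementation used by group_significance.py or by the cited paper. The function names and the read counts are hypothetical, and "present" is assumed to mean at least one read in a sample.

```python
from math import comb


def presence_table(airway_counts, control_counts):
    """Build the 2x2 presence/absence table for one OTU.

    Each argument is a list of read counts for that OTU, one per sample.
    Assumption: an OTU counts as present when it has at least one read.
    """
    a = sum(1 for c in airway_counts if c > 0)   # airway, present
    b = len(airway_counts) - a                   # airway, absent
    c = sum(1 for x in control_counts if x > 0)  # control, present
    d = len(control_counts) - c                  # control, absent
    return a, b, c, d


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on a 2x2 table.

    Returns the probability of seeing the OTU in at least `a` airway
    samples if presence were unrelated to sample type (hypergeometric null).
    """
    n_present = a + c
    n_airway = a + b
    n_total = a + b + c + d
    upper = min(n_present, n_airway)
    tail = sum(
        comb(n_present, k) * comb(n_total - n_present, n_airway - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n_total, n_airway)


# Hypothetical read counts for one OTU: 10 airway samples, 8 PBS controls.
airway = [14, 3, 0, 22, 8, 5, 41, 2, 9, 17]
controls = [0, 0, 1, 0, 0, 0, 0, 0]

table = presence_table(airway, controls)
p_value = fisher_exact_greater(*table)
print(table, p_value)
```

In a real analysis this would be repeated for every OTU, followed by a multiple-testing correction.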

As a final note, this is for population-level testing, across a group of samples. If you are interested in proving that you detected something in a *particular* sample above the contamination level, you have to take a different approach. This means estimating the abundance distribution of a particular taxon in the contamination control data, then quantifying how unlikely it is that your observation came from that distribution.
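One simple way to express the per-sample idea is an empirical tail probability: the share of negative controls in which the taxon is at least as abundant as in the sample of interest. The sketch below is only one possible approach, not a method from QIIME or the cited papers. The function name, the add-one smoothing, and the relative abundances are all assumptions made for illustration.

```python
def empirical_tail_probability(observed, control_values):
    """Estimate how often the controls reach the observed abundance.

    observed: relative abundance of the taxon in the sample of interest.
    control_values: relative abundances of the same taxon in the
        negative controls (one value per control).
    Add-one smoothing keeps the estimate above zero, since a finite set
    of controls cannot rule out contamination completely.
    """
    if not control_values:
        raise ValueError("at least one negative control is required")
    n_at_least = sum(1 for v in control_values if v >= observed)
    return (n_at_least + 1) / (len(control_values) + 1)


# Hypothetical relative abundances of one taxon in nine PBS controls.
control_abundances = [0.0, 0.001, 0.002, 0.0, 0.004, 0.001, 0.0, 0.003, 0.0]

# Hypothetical relative abundance of the same taxon in one airway sample.
p_sample = empirical_tail_probability(0.05, control_abundances)
print(p_sample)
```

With n controls the smallest value this estimate can take is 1/(n + 1), so a handful of controls cannot support a strong claim about any single sample. Fitting a parametric distribution to the control data is an alternative, at the cost of extra modelling assumptions.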

Best,

Kyle

Rune Grønseth

Nov 3, 2015, 3:59:33 PM

to Qiime Forum

Thanks again,

If one were to follow the latter approach, are there options to do this in QIIME? I've seen someone use what they called a neutral community model based on analyses in Mothur, but I think that required fantastic R-Jedi skills that I don't possess (I'm a clinician).

Best wishes from Rune

Kyle Bittinger

Nov 3, 2015, 4:12:48 PM

to Qiime Forum

Rune, there is no automatic way to do this in QIIME. You may have some luck by using summarize_taxa.py and loading the taxon summaries in an external program for custom plots.

I am a bit behind on reading the "neutral model" papers on the respiratory tract. The early papers used the frequency of occurrence in lung compared to mean abundance in the oral cavity to identify candidate OTUs. I think you need to take the abundance into account if you want to prove that the taxon is present in a specific lower respiratory tract (LRT) sample.
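The two quantities mentioned above, frequency of occurrence in lung and mean abundance in the oral cavity, can be computed for one OTU along these lines. This is a rough sketch of the summary statistics only, not a neutral community model. The function names, read counts and sequencing depths are hypothetical.

```python
def occurrence_frequency(counts):
    """Fraction of samples in which the OTU has at least one read."""
    return sum(1 for c in counts if c > 0) / len(counts)


def mean_relative_abundance(counts, depths):
    """Mean of per-sample relative abundances (OTU reads / total reads)."""
    if len(counts) != len(depths):
        raise ValueError("counts and depths must have the same length")
    return sum(c / d for c, d in zip(counts, depths)) / len(counts)


# Hypothetical read counts for one OTU in five lung samples.
lung_counts = [0, 3, 12, 0, 5]

# Hypothetical read counts and total sequencing depths in three oral samples.
oral_counts = [120, 80, 200]
oral_depths = [10000, 8000, 20000]

lung_frequency = occurrence_frequency(lung_counts)
oral_abundance = mean_relative_abundance(oral_counts, oral_depths)
print(lung_frequency, oral_abundance)
```

An OTU that turns up in the lung much more often than its oral abundance would suggest is a candidate for further scrutiny. Deciding how much more often is where the neutral model itself comes in, and that part is not covered by this sketch.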