Filtering negative control OTUs from samples

Paz Aranega

Nov 22, 2016, 1:23:51 PM
to Qiime 1 Forum
Hi,

I did quite a lot of pre-processing on my samples prior to DNA extraction (filtering, sonication, etc.), so I had to include different negative controls that apply to different samples. I have quality filtered my sequences using split-libraries, checked for chimeras with usearch and picked OTUs using pick_closed_reference_otus.py. After that, I thought it would be a good idea to filter out the OTUs found in my controls, so I split the OTU table, filtered the different samples with their corresponding controls and finally merged all the OTU tables back together. I am a bit confused about how to carry on, because now I don't have an OTU mapping file that corresponds to this corrected OTU table to use with pick_rep_set.py. Is there any way I can generate one? Have I just picked the wrong moment to filter my samples with the negative controls? If so, when is the best one?

Thanks very much!


Paz

Colin Brislawn

Nov 22, 2016, 2:30:30 PM
to Qiime 1 Forum
Hello Paz,

Thanks for getting in touch with us. Unfortunately, there is no single, official script to do this using QIIME. Take a look at these two threads where this process is discussed:

Also, kudos for including negative controls in your samples. This is an important step to identify possible levels and sources of contamination. However, once some level of signal is observed in your negative controls, it's not totally clear what to do with that. The MiSeq machines have some level of cross-talk, that is, true reads get assigned to the wrong samples. Some of the signal in your negative controls is probably coming from real signal from your real samples. Because these OTUs are real, it's probably not a good idea to TOTALLY remove them from the full data set... 

> Have I just picked the wrong moment to filter my samples with the negative controls? If so, when is the best one?
I don't know!

This is a subtle and fascinating question. What do you think?
Colin

Paz Aranega

Nov 23, 2016, 4:51:49 AM
to Qiime 1 Forum
Hi Colin,

Thanks for your help. It is indeed a fascinating question. One of the threads raises good points about looking at the overlap of OTUs between sample types and controls and removing only the OTUs that are frequent in the controls. I will definitely check the papers. However, in my case, I am not so interested in knowing the exact sample composition but more so in the spatio-temporal patterns. Considering all the potential for contamination during the processing (especially in the sonication and filtering steps), it might be better to be conservative and remove all OTUs found in the controls. I don't think that should bias my conclusions too much.

I know there is no script, and I have done it by hand following the instructions in this QIIME tutorial: http://qiime.org/tutorials/filtering_contamination_otus.html. It's a bit laborious because I have different controls that apply to different lots of samples, but easy enough. I guess my question is how to generate an OTU map once that is done (if it is possible to do so).
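
As a rough illustration of the per-batch logic described above (not a replacement for the tutorial's QIIME scripts), here is a minimal Python 3 sketch. It assumes the OTU table has already been loaded into a plain dict of counts; the function name, sample IDs and OTU IDs are all hypothetical.

```python
# Sketch only: plain dicts stand in for a BIOM table; all names are hypothetical.

def filter_by_batch_controls(table, batches):
    """Drop, within each batch, every OTU observed in that batch's control(s).

    table   : {otu_id: {sample_id: count}}
    batches : list of (control_ids, sample_ids) pairs
    Returns a new table holding only the real samples.
    """
    filtered = {}
    for controls, samples in batches:
        # OTUs with at least one read in any control of this batch
        contaminated = {
            otu for otu, counts in table.items()
            if any(counts.get(c, 0) > 0 for c in controls)
        }
        for otu, counts in table.items():
            if otu in contaminated:
                continue  # removed from this batch only
            for sample in samples:
                n = counts.get(sample, 0)
                if n > 0:
                    filtered.setdefault(otu, {})[sample] = n
    return filtered


table = {
    "otu1": {"ctrlA": 5, "s1": 10, "s3": 7},
    "otu2": {"s1": 3, "s2": 4, "s3": 1},
    "otu3": {"ctrlB": 2, "s3": 9, "s2": 6},
}
# ctrlA applies to s1 and s2; ctrlB applies to s3
batches = [(["ctrlA"], ["s1", "s2"]), (["ctrlB"], ["s3"])]
filtered = filter_by_batch_controls(table, batches)
```

Note that an OTU removed from one batch (otu1 in s1) can survive in another (otu1 in s3), which is why the merged table no longer lines up with the original OTU map.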

Thanks,

Paz

Colin Brislawn

Nov 23, 2016, 9:54:24 AM
to Qiime 1 Forum
Hello Paz,

Ah, I forgot about that tutorial. Glad you found it!

> I guess my question is how to generate an OTU map once that is done (if it is possible to do so).
I'm not totally sure. Usually, the OTU map is used to build the OTU table the first time, then all downstream work takes place on the OTU table itself. In this tutorial, you already have an OTU table (which we are now subsetting and filtering). What do you plan to do with the OTU map? Are you asking about the map.txt file in this tutorial?

I want to briefly respond to your comment here.
> However, in my case, I am not so interested in knowing the exact sample composition but more so in the spatio-temporal patterns. Considering all the potential for contamination during the processing (especially in the sonication and filtering steps), it might be better to be conservative and remove all OTUs found in the controls. I don't think that should bias my conclusions too much.
When you run the above tutorial, what percentage of your total OTUs are listed in otus_to_remove.txt? In which samples do the OTUs in otus_to_remove.txt appear? If these OTUs appear mostly in the negative controls, removing them is fine. However, in that case they won't have a large impact on the other samples, so removing them won't make much difference anyway. If the OTUs in otus_to_remove.txt also appear in normal samples, then I would be really worried that removing them would introduce more bias. I would argue that the conservative option is to leave them in and discuss which OTUs are in these controls, instead of dropping swaths of your OTU table.
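
One way to answer those two questions is sketched here in plain Python 3. This is only an illustration: it assumes the OTU table is held as a dict of counts and that the flagged OTU IDs have already been read into a set (parsing otus_to_remove.txt is not shown), and the function name is hypothetical.

```python
# Sketch only: dicts stand in for the OTU table; names are hypothetical.

def summarize_flagged(table, flagged):
    """Summarize how much of the data the flagged OTUs account for.

    table   : {otu_id: {sample_id: count}}
    flagged : set of OTU IDs found in the negative controls
    Returns (fraction of OTUs flagged,
             {sample_id: fraction of that sample's reads in flagged OTUs}).
    """
    otu_fraction = sum(1 for otu in table if otu in flagged) / len(table)

    totals, flagged_reads = {}, {}
    for otu, counts in table.items():
        for sample, n in counts.items():
            totals[sample] = totals.get(sample, 0) + n
            if otu in flagged:
                flagged_reads[sample] = flagged_reads.get(sample, 0) + n

    per_sample = {
        s: flagged_reads.get(s, 0) / total
        for s, total in totals.items() if total > 0
    }
    return otu_fraction, per_sample


table = {
    "otu1": {"ctrl": 8, "s1": 2},
    "otu2": {"s1": 6, "s2": 5},
    "otu3": {"ctrl": 2, "s1": 2, "s2": 5},
    "otu4": {"s2": 10},
}
otu_fraction, per_sample = summarize_flagged(table, {"otu1", "otu3"})
for sample in sorted(per_sample):
    print(sample, round(per_sample[sample], 2))
```

A high per-sample fraction in a real sample (s1 in this toy table) is the situation where removing the flagged OTUs would discard a large share of that sample's reads.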

Either way, the 'spatio-temporal patterns' should be based on changes to community structure, which is largely unaffected by some contamination. I think most methods should work well. 

Keep in touch,
Colin

Paz Aranega

Nov 23, 2016, 12:39:27 PM
to Qiime 1 Forum
Hi Colin,

Sorry, I didn't explain myself too well. I meant the OTU mapping file that you get in the output from pick_closed_reference_otus.py and can then use to pick a representative set of sequences and assign taxonomy. Can I just use the one from my original OTU table and then use my filtered OTU table in the make_otu_table.py step?

Thanks for your comment about whether or not to remove OTUs from controls. I don't have a straight answer as to which OTUs are listed in otus_to_remove.txt or in which samples they appear, because I have different controls that apply to different samples and it gets a bit confusing. I am not really sure how to check it. I hadn't really considered that I could be removing from the whole analysis OTUs that are not very abundant in my controls but are abundant in my samples. It worries me a bit now. I still need to check the papers from the other thread, but I think what I am going to do is run parallel analyses with the original OTU table and the filtered one and compare the downstream results (and hope they either look very similar or make a bit more sense when I filter the controls).

Many thanks for this Colin!!

Paz

Colin Brislawn

Nov 23, 2016, 9:41:15 PM
to Qiime 1 Forum
Hello Paz,

Thanks for getting in touch.

> Can I just use the one from my original OTU table and then use my filtered OTU table in the make_otu_table.py step?
I think that should work... The taxonomy annotations should still be in the filtered OTU table, so hopefully you will not have to run assign_taxonomy.py again. Not totally sure though. You could also add taxonomy to this table using the results of the original assign_taxonomy.py step, because the OTU IDs will be the same.
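
If an OTU map matching the filtered table is wanted anyway, one possible approach is to subset the original map rather than regenerate it. The Python 3 sketch below is only an illustration under two assumptions: the map is tab-separated with the OTU ID first and sequence IDs after it, and each sequence ID has the form `<sample_id>_<n>`. The function name and the `kept` structure are hypothetical.

```python
# Sketch only: assumes "otu_id<TAB>seq_id<TAB>seq_id..." lines and
# sequence IDs of the form "<sample_id>_<n>"; names are hypothetical.

def filter_otu_map(map_lines, kept):
    """Keep only the map entries that still exist in the filtered table.

    map_lines : iterable of OTU map lines
    kept      : {otu_id: set of sample_ids where the OTU survived filtering}
    Returns a list of filtered map lines (without trailing newlines).
    """
    out = []
    for line in map_lines:
        fields = line.rstrip("\n").split("\t")
        otu_id, seq_ids = fields[0], fields[1:]
        samples = kept.get(otu_id)
        if not samples:
            continue  # OTU was removed everywhere
        # Sample ID is everything before the last underscore
        retained = [s for s in seq_ids if s.rsplit("_", 1)[0] in samples]
        if retained:
            out.append("\t".join([otu_id] + retained))
    return out


original_map = [
    "otu1\ts1_0\ts3_4\tctrlA_1",
    "otu2\ts1_2\ts2_7",
    "otu9\ts2_8",
]
kept = {"otu1": {"s3"}, "otu2": {"s1", "s2"}}
new_map = filter_otu_map(original_map, kept)
```

The `kept` dict can be derived from the filtered table by listing, for each OTU, the samples with a non-zero count.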

I really like your idea of doing analysis on the tables with and without controls removed. This may not provide an answer on which is 'right,' but it's a great way to track the difference and will let you make an informed decision. Given that you have multiple controls for multiple sample types, this is probably the best option. 
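
To make the "with and without" comparison concrete, here is a minimal Python 3 sketch that computes Bray-Curtis dissimilarity for each pair of samples in both tables so the two values can be put side by side. It is a toy under stated assumptions: it works on raw counts held in dicts (real data would normally be rarefied or normalized first), and the function names are hypothetical.

```python
# Sketch only: raw counts in dicts, no rarefaction; names are hypothetical.
from itertools import combinations


def bray_curtis(table, a, b):
    """Bray-Curtis dissimilarity between samples a and b.

    table : {otu_id: {sample_id: count}}
    """
    num = den = 0
    for counts in table.values():
        x, y = counts.get(a, 0), counts.get(b, 0)
        num += abs(x - y)
        den += x + y
    return num / den if den else 0.0


def compare_tables(original, filtered, samples):
    """Return {(a, b): (distance in original, distance in filtered)}."""
    return {
        (a, b): (bray_curtis(original, a, b), bray_curtis(filtered, a, b))
        for a, b in combinations(samples, 2)
    }


original = {
    "otu1": {"s1": 5, "s2": 5},
    "otu2": {"s1": 5, "s2": 15},
}
# Same table after removing otu1 as a control OTU
filtered = {"otu2": {"s1": 5, "s2": 15}}
result = compare_tables(original, filtered, ["s1", "s2"])
```

Pairs whose distance shifts a lot between the two tables are the ones where the removed OTUs were driving, or masking, the pattern.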

Thank you for the excellent discussion Paz. Let me know what you find,
Colin

Paz Aranega

Nov 28, 2016, 5:14:54 AM
to Qiime 1 Forum
Thanks a lot for all your suggestions Colin! I will let you know how it works out.

Paz