How to process biological replicates in 16S amplicon sequencing with MiSeq?


Valentina Imparato

Aug 8, 2014, 5:14:38 AM
to qiime...@googlegroups.com

Hi all!

I am currently running a 16S amplicon analysis with MiSeq.

I still have a couple of doubts about how to treat replicates in 16S sequencing and how to test the variability among them.
I am working on soil samples with 5 different treatments. Each treatment has 9 biological replicates (kept separate from DNA extraction through sequencing).

I would like to compare diversity between the treatments, so I am looking for the right way to combine (sum, etc.) the reads from all the replicates into one, in order to have a single data set per treatment.

What is the best way to test the variability between my replicates? Is a PCoA plot the right tool to evaluate the reproducibility of the replicates?
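
One simple numeric check that complements a PCoA plot is to compare distances between replicates of the same treatment with distances between samples from different treatments. The sketch below is only an illustration in plain Python, not a QIIME command: the sample IDs and counts are made up, the counts are assumed to be already rarefied to an even depth, and Bray-Curtis is just one possible choice of distance.

```python
from itertools import combinations
from statistics import mean


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors of equal length."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(a) + sum(b)
    return num / den if den else 0.0


# Hypothetical data: sample ID -> (treatment, counts per OTU), 45 reads each.
samples = {
    "T1.rep1": ("T1", [30, 10, 0, 5]),
    "T1.rep2": ("T1", [28, 12, 1, 4]),
    "T1.rep3": ("T1", [32, 8, 0, 5]),
    "T2.rep1": ("T2", [5, 25, 15, 0]),
    "T2.rep2": ("T2", [4, 27, 13, 1]),
    "T2.rep3": ("T2", [6, 24, 15, 0]),
}

within, between = [], []
for (trt_a, cnt_a), (trt_b, cnt_b) in combinations(samples.values(), 2):
    d = bray_curtis(cnt_a, cnt_b)
    # Same treatment -> replicate-to-replicate distance, else cross-treatment.
    (within if trt_a == trt_b else between).append(d)

mean_within = mean(within)
mean_between = mean(between)
print(f"mean within-treatment distance:  {mean_within:.3f}")
print(f"mean between-treatment distance: {mean_between:.3f}")
```

If replicates are reproducible, the within-treatment distances should be clearly smaller than the between-treatment ones; a formal test of that difference still needs a permutation-based method.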

How would you process the replicates? Could you suggest the best method to obtain a single data set per treatment?
Does a tool exist for a 9 vs 9 comparison?
Do you have any ideas?

Thank you very much in advance for your input!

Cheers

 

Prakhar Gaur

Aug 8, 2014, 7:00:34 AM
to qiime...@googlegroups.com
Hello,

Using Qiime 1.8 on AWS.

I have a similar data set: 13 samples with 3 replicates each.
I am currently struggling to create the metadata mapping file for this analysis.

I have a similar query: how do I process the samples to use the biological replicate data efficiently?

A simple strategy would be to pool all the replicates together to create a single meta-sample.
But that could mean that I lose the between-replicate variability information.
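
To make the trade-off concrete, here is a small sketch in plain Python (not a QIIME script) that computes Shannon diversity once per replicate and once on the pooled meta-sample. The counts are invented for illustration, and the Shannon index with natural log is just an example metric.

```python
import math
from statistics import mean, stdev


def shannon(counts):
    """Shannon index (natural log) of one sample's OTU counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical counts for three replicates of one sample (columns = OTUs).
replicates = [
    [40, 30, 20, 10],
    [70, 20, 10, 0],
    [25, 25, 25, 25],
]

# Option 1: keep replicates separate and summarize afterwards.
per_replicate = [shannon(r) for r in replicates]
print(f"per replicate: mean={mean(per_replicate):.3f} sd={stdev(per_replicate):.3f}")

# Option 2: pool (sum) the replicates into one meta-sample first.
pooled = [sum(otu) for otu in zip(*replicates)]
print(f"pooled: {shannon(pooled):.3f} (single value, no spread)")
```

Keeping the replicates separate gives a mean and a spread per sample, which is what statistical tests need; pooling leaves one number and no estimate of replicate-to-replicate variation.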

Comments and pointers are welcome.

Regards,
--
prakhar

Daniel McDonald

Aug 8, 2014, 2:40:34 PM
to qiime...@googlegroups.com
Hello Valentina and Prakhar,

I think you'll both find this thread useful to review.

Best,
Daniel


--

---
You received this message because you are subscribed to the Google Groups "Qiime Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qiime-forum...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Valentina Imparato

Aug 8, 2014, 4:34:15 PM
to qiime...@googlegroups.com
Hi Daniel!

Thank you very much for pointing me to this discussion. I found it very relevant to my purpose. I will have a lot to study now.

Thanks again



--
Valentina Imparato

Daniel McDonald

Aug 8, 2014, 7:01:39 PM
to qiime...@googlegroups.com
No problem! Let us know if you have more questions

Best,
Daniel

Valentina Imparato

Aug 11, 2014, 11:42:15 AM
to qiime...@googlegroups.com
Hi Daniel,
Sadly, most of the links in the suggested thread are no longer available, although users in that thread were not recommending some of them anyway.

I have tried compare_categories.py with adonis and ANOSIM. Both reported a high similarity between the treatment categories. I have a doubt regarding the positive and negative categories, which are controls I ran through the amplicon library preparation: they are the positive and negative controls of the PCR amplifications. I am thinking I should deal with those at an earlier step. Do you have suggestions on how those samples should be processed?
Can they affect the beta diversity results and the significance of the comparisons?

Thank you very much.

vale
  

Daniel McDonald

Aug 11, 2014, 12:43:05 PM
to qiime...@googlegroups.com
Hi Vale,

I'm not sure I fully understand the question, sorry. Are you indicating that the controls and treatment samples are not significantly different, and are wondering if the samples should be processed differently? If so, I do not recommend processing the samples differently as that may introduce an artificial bias. Do you see obvious clustering within PCoA space?

Best,
Daniel

Valentina Imparato

Aug 12, 2014, 4:30:41 AM
to qiime...@googlegroups.com
Hi Daniel,

Sorry, maybe I was not clear...
I have 4 treatments and a control in the field. Then, after the soil extraction, I decided to include PCR controls during the construction of the 16S amplicon library. I decided to sequence those as well: a PCR positive control using a pure strain of Pseudomonas, and a PCR negative control.

The categories I am currently planning to analyze are the field treatments and their control. I am wondering whether I can make use of the PCR controls, which I sequenced anyway, for instance by subtracting their reads. Do you have experience with this?
Considering that the beta diversity analysis compares between categories, do you think that keeping those PCR and sequencing controls could have a negative effect on the results? Should I therefore process them before comparing the field treatments? Would you suggest removing the PCR controls from the mapping file so that they are excluded from the "categories" considered?
Have you ever had experience with this kind of data treatment?

I wish I could explain myself better, and I look forward to hearing some comments and suggestions on this topic.

Thank you in advance.

Best,

vale


 

 

Prakhar Gaur

Aug 12, 2014, 9:56:51 AM
to qiime...@googlegroups.com
Hello Daniel,

Thank you for the link.
I am currently reading the documentation for the scripts mentioned in that thread.

Regards,
--
prakhar

Prakhar Gaur

Aug 12, 2014, 10:01:35 AM
to qiime...@googlegroups.com
Hello Vale,

I found alternate pages for two of the links that were not working in the thread Daniel quoted.


I suppose these scripts have been renamed in Qiime 1.8.0.

Regards,
--
prakhar

Valentina Imparato

Aug 12, 2014, 10:33:38 AM
to qiime...@googlegroups.com
Thanks a lot.
I will have a look at them.

I have run beta_diversity_through_plots.py and compare_categories.py with ANOSIM and adonis.

To answer my question from this morning: I checked the genus identity of my positive control in the taxonomic assignment and, since it corresponds, I decided to remove this category.
Regarding the PCR negative control (whose reads belong to Sphingomonas), I decided to use filter_samples_from_otu_table.py, filtering out the OTUs found in the related category.
I hope it was a good choice.
Waiting for your comments..
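
For readers following along, the kind of subtraction described above can be sketched in plain Python as below. This is an illustration of the idea only, not the QIIME script's behaviour: the table, the sample IDs, and the `min_count` threshold are all hypothetical.

```python
# Hypothetical OTU table: OTU ID -> counts per sample.
otu_table = {
    "OTU_1": {"soil.A": 120, "soil.B": 95, "pcr.neg": 0},
    "OTU_2": {"soil.A": 14, "soil.B": 0, "pcr.neg": 500},
    "OTU_3": {"soil.A": 60, "soil.B": 71, "pcr.neg": 3},
}
control_samples = {"pcr.neg"}


def drop_control_otus(table, controls, min_count=1):
    """Drop OTUs seen at least `min_count` times in any control sample,
    then remove the control samples themselves from the remaining rows."""
    kept = {}
    for otu, counts in table.items():
        if any(counts.get(s, 0) >= min_count for s in controls):
            continue
        kept[otu] = {s: n for s, n in counts.items() if s not in controls}
    return kept


strict = drop_control_otus(otu_table, control_samples)
lenient = drop_control_otus(otu_table, control_samples, min_count=10)
print("strict filter keeps: ", sorted(strict))
print("lenient filter keeps:", sorted(lenient))
```

The strict version removes every OTU with even one read in the control; the lenient version only removes OTUs that are abundant there. Which threshold is appropriate, if any, is a judgment call that depends on the data.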

Cheers

Daniel McDonald

Aug 12, 2014, 11:48:25 AM
to qiime...@googlegroups.com
Hi Vale,

Glad to hear you're making progress! 

One risk with filtering like you did is that you may remove a real component of the non-control samples, which is something reviewers may pick up on. Does the subtraction significantly change the biological conclusions compared with those you draw without it?

Best,
Daniel

Valentina Imparato

Aug 14, 2014, 12:03:21 PM
to qiime...@googlegroups.com
Hi Daniel,

Yes, you are right. I cannot be sure whether the OTUs found in the negative control are genuinely present in my soil samples or are due to contamination during PCR amplicon library preparation.
The overall result does not change whether I keep or remove the OTUs found in the negative control.
There are 6 OTUs, with abundances 1694/2436, 4/2436, 7/2436, 692/2436, 35/2436, and 1/2436. The most abundant ones belong to the genus Sphingomonas. I thought of taking this decision since Sphingomonas is a working strain in our lab. But again, I cannot prove whether the Sphingomonas detected in my samples comes from lab contamination or from a real presence in the soil samples. Am I right?
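
As a quick arithmetic aid, the fractions quoted above can be turned into percentages with a few lines of Python. The numbers are taken directly from the message; nothing else is assumed.

```python
control_total = 2436
otu_counts = [1694, 4, 7, 692, 35, 1]

# Share of the negative-control reads held by each of the 6 OTUs.
for count in otu_counts:
    print(f"{count:>5} / {control_total} = {count / control_total:7.2%}")

# Combined share of the two largest OTUs.
top_two = sum(sorted(otu_counts, reverse=True)[:2])
print(f"two largest OTUs: {top_two / control_total:.1%} of control reads")

# The six listed counts do not quite add up to the stated total.
listed = sum(otu_counts)
print(f"sum of listed counts: {listed} of {control_total}")
```

The two largest OTUs together make up roughly 98% of the negative-control reads, and the six listed counts sum to 2433, which is 3 reads short of the stated total of 2436.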

Now I am processing the whole data set again, removing the PCR control categories from the beginning, considering that the 2 analyses still show the same results. Do you have any suggestions that could help in future analyses to detect OTUs deriving from lab contamination?

I have another question, related to the suggestions of my companion Prakhar. I would now like to have a graph pooling the replicates together. Having 4 treatments with 9 replicates each, I would like to simplify the graph by pooling them, so that the results are shown per treatment only. He suggested pooling_by_metadata.py, but the script is not found in QIIME 1.8.0. It seems to be available in 1.7.0, and I was not able to find any information about its latest version.
Can you give me any information about it?

Thank you very much for your help. I look forward to hearing your comments and new input for further thinking! :)

happy day
Cheers,

Daniel McDonald

Aug 14, 2014, 12:42:23 PM
to qiime...@googlegroups.com
Hi Vale,

Contamination is not an easy thing to deal with, of course. But if you don't see a significant difference in results with or without the Sphingomonas, then is it worth worrying about? One way to look at it is that, if it is contamination, then the signal it provides is weak. However, there is of course the possibility that Sphingomonas is real in some samples and a contaminant in others. I'd be curious to hear what others think about this situation, though; unfortunately, I do not have any specific recommendations.

With regard to pooling by metadata, that functionality has been moved to summarize_otu_by_cat.py. I believe that script should cover what you need.
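
Conceptually, collapsing an OTU table by a metadata category amounts to summing the counts of all samples that share the same value in that column. The sketch below illustrates the idea in plain Python; it is not the script's implementation, and the table, sample IDs, and helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping: sample ID -> treatment (one column of the mapping file).
sample_to_treatment = {
    "T1.rep1": "T1", "T1.rep2": "T1", "T1.rep3": "T1",
    "T2.rep1": "T2", "T2.rep2": "T2", "T2.rep3": "T2",
}

# Hypothetical OTU table: OTU ID -> counts per sample.
otu_table = {
    "OTU_1": {"T1.rep1": 30, "T1.rep2": 28, "T1.rep3": 32,
              "T2.rep1": 5, "T2.rep2": 4, "T2.rep3": 6},
    "OTU_2": {"T1.rep1": 10, "T1.rep2": 12, "T1.rep3": 8,
              "T2.rep1": 25, "T2.rep2": 27, "T2.rep3": 24},
}


def collapse_by_category(table, mapping):
    """Sum each OTU's counts over all samples sharing a category value."""
    collapsed = {}
    for otu, counts in table.items():
        per_category = defaultdict(int)
        for sample, n in counts.items():
            per_category[mapping[sample]] += n
        collapsed[otu] = dict(per_category)
    return collapsed


def to_relative(collapsed):
    """Convert summed counts to relative abundances within each category."""
    totals = defaultdict(int)
    for per_category in collapsed.values():
        for category, n in per_category.items():
            totals[category] += n
    return {
        otu: {cat: n / totals[cat] for cat, n in per_category.items()}
        for otu, per_category in collapsed.items()
    }


collapsed = collapse_by_category(otu_table, sample_to_treatment)
relative = to_relative(collapsed)
print(collapsed)
print(relative)
```

A collapsed table like this is convenient for a simplified per-treatment plot, but statistical comparisons are still better run on the unpooled, per-replicate table.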

Best,
Daniel

Valentina Imparato

Aug 14, 2014, 6:01:19 PM
to qiime...@googlegroups.com
Hi Daniel :)

Thank you very much for your comments on the topic. I am very much looking forward to hearing any other ideas and brainstorming!

I will try the summarize option you suggested next week.
I will post an update on my results.

Have a peaceful night and bye to everyone!

Cheers,
Vale

Prakhar Gaur

Sep 14, 2014, 3:20:46 PM
to qiime...@googlegroups.com
Hello Vale,

If it's not an inconvenience,
could you please update us on your analysis?

Which scripts did you use, and for what purpose?

Since I have a similar kind of data, it would be a great help to compare with my workflow.

Appreciate your time,

Regards,
--
prakhar