Using different set of primers


Maria

Apr 23, 2017, 6:52:58 PM
to Qiime 1 Forum
Hi,

I'm having trouble solving this question on my own, so I decided to ask here.

I've used QIIME to analyze samples from fermentation tanks, and the literature says this is not a very diverse environment. Nonetheless, after running pick_otus.py with default settings (yes, I did this analysis some time ago), it returned more than 30,000 OTUs. At the time it didn't seem strange to me, but now I think this number is way too high.

But the beta-diversity analysis, the taxonomy analysis (I ran assign_taxonomy.py with -m rdp -c 0.5), and the alpha diversity seemed to make sense.

I was wondering whether something may have inflated the analysis, and I don't know if this is the case: I used a primer set with 4 forward primers and 4 reverse primers. Do you think this may be inflating my community?

And what do you suggest I do?

Thank you in advance, and if I wasn't clear, please let me know. Also, if you need more information, I can tell you.

Maria

Greg Caporaso

Apr 24, 2017, 10:46:25 AM
to Qiime 1 Forum
Hi Maria, 
This is a known issue with QIIME 1, and other pipelines from the "previous generation" of microbiome bioinformatics tools: mistaking sequencing error for biological diversity was very common. A consequence of that is that community richness estimates were known to be very inflated, like you describe here. However, taxonomic composition, relative alpha diversities (e.g., comparing richness across different samples in the same analysis), and beta diversities were generally still thought to be reliable, also as you note here. 
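To make the point above concrete, here is a toy Python simulation (not part of the original thread, and not how pick_otus.py works). It assumes a hypothetical community of five true sequences and a uniform 1% substitution error rate, and simply counts distinct reads. Real pipelines cluster reads at a similarity threshold, which absorbs many but not all of these error variants, so treat this only as a rough sketch of why error can look like diversity.

```python
import random


def add_errors(seq, error_rate, rng):
    """Return a copy of seq where each base is substituted with probability error_rate."""
    out = []
    for base in seq:
        if rng.random() < error_rate:
            # Substitute with one of the three other bases.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def count_unique_reads(true_seqs, reads_per_seq, error_rate, seed=0):
    """Simulate sequencing each true sequence and count the distinct reads observed."""
    rng = random.Random(seed)
    observed = set()
    for seq in true_seqs:
        for _ in range(reads_per_seq):
            observed.add(add_errors(seq, error_rate, rng))
    return len(observed)


# Hypothetical community: 5 random 250-base sequences (illustrative values only).
community_rng = random.Random(42)
true_seqs = ["".join(community_rng.choice("ACGT") for _ in range(250)) for _ in range(5)]

clean = count_unique_reads(true_seqs, reads_per_seq=1000, error_rate=0.0)
noisy = count_unique_reads(true_seqs, reads_per_seq=1000, error_rate=0.01)
print("distinct reads without error:", clean)
print("distinct reads with 1% error:", noisy)
```

Without error, the number of distinct reads equals the number of true sequences; with error, it is far larger even though the underlying community has not changed.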

QIIME 2 does much better at filtering out sequencing error. If you're interested in more accurate measures of community richness, you should start thinking about trying out QIIME 2.

Best,
Greg

Maria

Apr 24, 2017, 11:10:14 AM
to Qiime 1 Forum
Hi Greg!

Thank you for your answer! I understand and I will certainly try QIIME 2. But as I understand, QIIME 2 is in its alpha version and, to be honest, I'm a beginner at metagenomic analysis so I don't feel so confident in using this version.

If it is not a problem, could I ask you more things?

So, is there something I could do about the inflated number of OTUs? Do you think using 4 sets of primers to perform the sequencing may have something to do with that?

And also, is this inflated number due to sequencing error? But I don't have a lot of sequences classified as "Unclassified bacteria". Does this make sense? Could sequencing error give me sequences that were still classified in the taxonomy analysis?

Thank you again!
Maria

Greg Caporaso

Apr 25, 2017, 5:17:13 PM
to Qiime 1 Forum
Hi Maria,
Responses below:

> But as I understand, QIIME 2 is in its alpha version and, to be honest, I'm a beginner at metagenomic analysis so I don't feel so confident in using this version.

That makes sense. You might want to read this post, which defines what we mean when we say that QIIME 2 is in alpha release. I think you'd find it as easy as, or easier than, working with QIIME 1.

> So, is there something I could do about the inflated number of OTUs...

You could try increasing some of the quality control parameters. For example, using split_libraries_fastq.py -q 19 would be a good place to start. This won't work as well as the newer methods in QIIME 2 (for example, DADA2).
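As a rough illustration of what a quality threshold like this means (an editorial sketch, not QIIME code): as I understand the script's help text, -q is the maximum unacceptable Phred score, so -q 19 keeps base calls of Q20 and better. The Python below converts Phred scores to error probabilities and truncates a read at its first low-quality base. The function names and the example read are hypothetical, and the real script applies additional rules (for example, how many consecutive low-quality calls it tolerates) that this sketch ignores.

```python
def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong: 10^(-q/10)."""
    return 10 ** (-q / 10)


def truncate_at_low_quality(seq, quals, max_unacceptable_q):
    """Truncate a read just before its first base scoring <= max_unacceptable_q.

    Simplification: a single low-quality base triggers truncation here.
    """
    for i, q in enumerate(quals):
        if q <= max_unacceptable_q:
            return seq[:i]
    return seq


# Hypothetical read whose quality decays toward the 3' end.
seq = "ACGTACGTAC"
quals = [38, 37, 36, 35, 30, 25, 19, 12, 8, 2]

kept = truncate_at_low_quality(seq, quals, max_unacceptable_q=19)
print(kept)
print(phred_to_error_prob(20))  # error probability at Q20
```

Q20 corresponds to a 1-in-100 chance that a base call is wrong, so even reads that pass this filter can carry a few errors over a few hundred bases.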

> do you think using 4 sets of primers to perform the sequencing may have something to do with that?

Are these primers all covering the same region of your marker gene? Or are they for different regions? If they're for different regions, you'd need to use closed-reference OTU picking for this (pick_closed_reference_otus.py). Otherwise, reads from different regions of the same organism would not cluster together, and that would result in higher OTU counts.

If the primers all cover the same region, that sounds like how most people do 16S sequencing with degenerate primers (meaning they are actually pools of a few primers) so I don't think that would be contributing to the inflated OTU counts.
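To illustrate what "pools of a few primers" means, here is a small Python sketch (an editorial addition) that expands a degenerate primer written with IUPAC ambiguity codes into the concrete primers it stands for. The primer sequence is made up for illustration and is not the primer set from this thread.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every concrete primer encoded by a degenerate primer sequence."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer with two degenerate positions (M = A/C, R = A/G).
pool = expand_degenerate("GTGMCAGCRGCC")
for p in pool:
    print(p)
print(len(pool), "primers in the pool")
```

Two two-fold degenerate positions give a pool of four primers that all target the same site, which is consistent with the point above that such pools by themselves would not be expected to inflate OTU counts.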

> And also, is this inflated number due to sequencing error?

I think that's the most likely case. In QIIME 1, sequencing error could often be mistaken for novel biological diversity. 

> Could sequencing error give me sequences that were still classified in the taxonomy analysis?

Yes, it could. If you had a few bad bases in a sequence that was used to define an OTU, it could still match a reference database during taxonomy assignment.
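A minimal Python sketch of that idea (an editorial addition with made-up sequences): a read with three substitution errors is a distinct sequence, yet it is still 97% identical to the sequence it came from. Note that the RDP classifier does not work by simple percent identity, so this only illustrates how close an error-containing read stays to its source.

```python
def percent_identity(a, b):
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical 100-base reference sequence.
reference = "ACGT" * 25

# Introduce three substitution errors at arbitrary positions.
bases = list(reference)
for pos in (10, 50, 90):
    bases[pos] = "A" if bases[pos] != "A" else "C"
read = "".join(bases)

identity = percent_identity(reference, read)
print(read != reference)  # the read is a different sequence
print(identity)           # but it remains very similar to its source
```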

Maria

Apr 25, 2017, 5:33:43 PM
to Qiime 1 Forum
Hi Greg!

Thank you for all the answers, they were clarifying.

It seems that using pick_open_reference_otus.py resolved my issue. It returned a much more reasonable number of OTUs and, as expected, didn't significantly change the other analyses (beta and alpha diversity). I think it only improved these analyses, actually. Some differences showed up in the taxa summary, but only in the low-frequency taxa. So I think that method solved everything for my analysis.

Also, for the record, I used split_libraries_fastq.py -q 19, and the Illumina run generated high-quality sequences.

Anyway, thank you for your answer!
Best,
Maria