Closed- or open-reference OTU picking method


HyoShin Yoon

Jul 28, 2014, 10:46:40 PM
to qiime...@googlegroups.com

Hi, we sequenced the V4 region of the 16S rRNA gene from vaginal samples on an Illumina MiSeq to investigate the vaginal microbiota. For OTU picking, we tried both the open-reference and closed-reference methods with the gg_97_otus_4feb2011 Greengenes database. The main problem is that closed-reference picking did not recover the major species of the vaginal microbiota, although open-reference picking did (for example, L. crispatus did not appear in the closed-reference results).

 

We also found the following information on the QIIME website:

You must use closed-reference OTU picking if:

  • You are comparing non-overlapping amplicons, such as the V2 and the V4 regions of the 16S rRNA. Your reference sequences must span both of the regions being sequenced.

 

So, according to this notice, we would have to use closed-reference OTU picking, since we sequenced the V4 region only. Here are my questions.

1. How strictly must we follow this guideline when we sequence only the V4 region? Is it okay to use the open-reference method?

As I understand it, in closed-reference OTU picking, reads are clustered against the reference, and reads that do not match are excluded from downstream analysis. So if I want to examine sequences at the species level, it seems easy to lose data.

I am worried if the closed method would limit the info we could get regarding the spp. level as it excludes the ones not matched

2. I would like to know how the algorithm behind these methods (closed- or open-reference) works, since it sometimes includes a specific species and sometimes does not.
 
Thank you,
 
Hyo

 

 

Justine Debelius

Jul 29, 2014, 12:29:08 AM
to qiime...@googlegroups.com
Hi Hyo,

I think open reference vs closed reference is a matter of personal opinion. I strongly prefer closed-reference data wherever possible because it's easier to compare closed-reference studies and because the taxonomic assignments are more trustworthy, in my opinion. Even with open reference, there is still the caveat that 16S sequencing is not always the best method for species-level identification, especially depending on your read length. This was discussed in the paper on the RDP classifier.

One solution I might suggest is looking at a different reference set. I'd suggest updating to greengenes 13_8. You could also look for a vagina-specific reference set. What's linked might not be right for your project, but it may be a starting place.

Cheers,
Justine


--

---
You received this message because you are subscribed to the Google Groups "Qiime Forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to qiime-forum...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Daniel McDonald

Jul 29, 2014, 11:28:37 AM
to qiime...@googlegroups.com
Hi Hyo,

You may also be interested in reading "Consistent, comprehensive and computationally efficient OTU definitions" for a more detailed discussion of OTU picking methods, including open-reference and subsampled open-reference picking.

Best,
Daniel

Colin Brislawn

Jul 29, 2014, 1:43:55 PM
to qiime...@googlegroups.com
Hello Hyo,

You mentioned that
"I am worried if the closed method would limit the info we could get regarding the spp. level as it excludes the ones not matched"

You are totally right about closed ref excluding unknown/novel OTUs. This creates the consistency between studies that Justine appreciates, but also reduces sensitivity like you have observed first hand.
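To make the trade-off concrete, here is a minimal toy sketch of closed-reference assignment. It is not what QIIME or uclust actually runs: the `identity` function, the 10-base sequences, the `ref_A`/`ref_B` names, and the 0.9 threshold are all assumptions made up for illustration (0.9 is used only because the toy sequences are so short). It shows how reads from an organism that is missing from the reference simply drop out.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy identity needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def closed_reference(reads, refs, threshold):
    """Assign each read to its best reference hit, or discard it."""
    otus = {}        # reference ID -> reads assigned to it
    discarded = []   # reads with no reference hit at the threshold
    for read in reads:
        best_id = max(refs, key=lambda ref_id: identity(read, refs[ref_id]))
        if identity(read, refs[best_id]) >= threshold:
            otus.setdefault(best_id, []).append(read)
        else:
            discarded.append(read)
    return otus, discarded


# Hypothetical 10-base "reference database" that lacks one organism.
refs = {"ref_A": "ACGTACGTAC", "ref_B": "TTTTGGGGCC"}
reads = [
    "ACGTACGTAC",  # identical to ref_A
    "ACGTACGTAA",  # one mismatch to ref_A (90% identity)
    "GGGGGCCCCC",  # organism absent from the reference
    "GGGGGCCCCA",  # same absent organism, one mismatch
]

otus, discarded = closed_reference(reads, refs, threshold=0.9)
print(otus)       # reads that hit a reference, keyed by reference ID
print(discarded)  # reads excluded from all downstream analysis
```

In this toy run, half of the reads never reach the OTU table, which is the kind of loss you saw with L. crispatus if the older reference set does not represent it well.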

Our lab strongly prefers open ref or de novo methods because they preserve all observed diversity. To reduce false positives, we use strict quality filtering, conservative OTU picking, and a trusted taxonomy assigner.
Quality filtering: http://www.drive5.com/usearch/manual/fastq_filter.html
OTU picking (fewer/better OTUs than uclust): http://www.drive5.com/usearch/manual/cluster_otus.html
Taxonomy assignment: http://qiime.org/scripts/parallel_assign_taxonomy_rdp.html
(We also use and recommend the new greengenes that Justine mentions.)

We find this reduces noise/false OTUs while preserving the unique trends among uncharacterized OTUs.


To address your second question, OTU picking algorithms are separate from the distinction of open/closed/de-novo. So you can use the uclust algorithm to pick open-ref, or to pick closed-ref, or pick de-novo. UCLUST was created by Robert Edgar and is described here: http://drive5.com/usearch/manual/uclust_algo.html
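As a rough illustration of that separation, the sketch below keeps the "matching algorithm" as a pluggable function and switches between closed-ref, open-ref, and de novo purely through arguments. Everything here is hypothetical: `pick_otus`, `exact`, `one_mismatch`, and the `denovo` IDs are names invented for the example, and the greedy first-hit loop is a heavy simplification, not UCLUST itself.

```python
def exact(read: str, centroid: str) -> bool:
    """Strictest possible matcher: sequences must be identical."""
    return read == centroid


def one_mismatch(read: str, centroid: str) -> bool:
    """Looser matcher: same length and at most one differing position."""
    if len(read) != len(centroid):
        return False
    return sum(x != y for x, y in zip(read, centroid)) <= 1


def pick_otus(reads, matcher, refs=None, allow_new=True):
    """Greedy first-hit clustering; the strategy is set by refs/allow_new."""
    centroids = dict(refs or {})   # reference centroids are tried first
    otus = {}
    n_new = 0
    for read in reads:
        for otu_id, centroid in centroids.items():
            if matcher(read, centroid):
                otus.setdefault(otu_id, []).append(read)
                break
        else:
            if allow_new:          # open-ref / de novo: seed a new OTU
                otu_id = f"denovo{n_new}"
                n_new += 1
                centroids[otu_id] = read
                otus[otu_id] = [read]
            # closed-ref: the unmatched read is dropped
    return otus


refs = {"ref_A": "ACGT"}
reads = ["ACGT", "ACGA", "TTTT", "TTTA"]

closed = pick_otus(reads, one_mismatch, refs=refs, allow_new=False)
opened = pick_otus(reads, one_mismatch, refs=refs, allow_new=True)
de_novo = pick_otus(reads, one_mismatch, refs=None, allow_new=True)
closed_exact = pick_otus(reads, exact, refs=refs, allow_new=False)
print(closed, opened, de_novo, closed_exact, sep="\n")
```

Swapping `one_mismatch` for `exact` changes which reads cluster together, while changing `refs` and `allow_new` changes which reads are kept at all. Those are two independent choices.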

We recommend UPARSE, the newest creation of Robert Edgar. This is a new OTU picking algorithm which is always de-novo, and is described here: http://drive5.com/usearch/manual/uparseotu_algo.html


I hope that helps!
Let me know what you find works well for you,
Colin

Michele Williams

Jul 29, 2014, 3:44:45 PM
to qiime...@googlegroups.com
Hi Hyo,

My understanding of the statement "You must use closed-reference OTU picking if: You are comparing non-overlapping amplicons, such as the V2 and the V4 regions of the 16S rRNA. Your reference sequences must span both of the regions being sequenced." is that V2 and V4 have both been sequenced for each sample and will both be used for comparisons. Since you only sequenced V4, closed-reference OTU picking is not a requirement.
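A toy sketch of why that rule exists, under loud assumptions: the 16-base "full-length" references, the `ref_1`/`ref_2` IDs, and the use of exact substring matching in place of real alignment are all invented for illustration. Reads from two non-overlapping regions share no sequence, so they could never cluster together de novo, but both can be mapped to the same reference IDs.

```python
from collections import Counter


def map_to_reference(reads, refs):
    """Count reads per reference ID; substring match stands in for alignment."""
    counts = Counter()
    for read in reads:
        hits = [ref_id for ref_id, seq in refs.items() if read in seq]
        if len(hits) == 1:         # keep only unambiguous hits
            counts[hits[0]] += 1
    return counts


# Hypothetical references that span both sequenced regions.
refs = {
    "ref_1": "AAAACCCCGGGGTTTT",
    "ref_2": "AATTCCGGGCGCTATA",
}

# Study A sequenced positions 0-3 ("V2-like"); study B positions 8-11 ("V4-like").
study_a = ["AAAA", "AAAA", "AATT"]
study_b = ["GGGG", "GCGC", "GCGC"]

table_a = map_to_reference(study_a, refs)
table_b = map_to_reference(study_b, refs)
print(table_a)
print(table_b)
print(sorted(table_a) == sorted(table_b))  # both tables use the same OTU IDs
```

With a single amplicon, as in your case, every read already covers the same region, so nothing forces the reference to act as the common coordinate system.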

Michele



--
Michele L. Williams, DVM, PhD
Research Scientist
The Ohio State University
Wooster, OH  44691
Cell: 662-312-2827
Office: 330-263-3747

Daniel McDonald

Jul 29, 2014, 3:48:07 PM
to qiime...@googlegroups.com
Correct, though you will get bias comparing V2 and V4 even with closed ref.
-Daniel

HyoShin Yoon

Aug 1, 2014, 2:04:54 AM
to qiime...@googlegroups.com
 
Thanks a lot! I really appreciate your comments and help!!! =)
 
-Hyo