Serious error: Qiime says "Cyanobacteria", RDP 16S rRNA gene database says "Proteobacteria"

Monika

Jan 24, 2013, 6:28:52 AM
to qiime...@googlegroups.com
Hi,

In my data set, OTUs with the lineage "k__Bacteria;  p__Cyanobacteria;  c__4C0d-2;  o__YS2;  f__;  g__;  s__" caught my interest, so I ran pick_rep_set.py to obtain representative DNA sequences for these OTUs, both from the QIIME reference sequence collection and from our own data (<130 bp). However, matching these sequences against the RDP database shows up to 100% sequence similarity between the reference sequences (>1000 bp) and a genus named Vampirovibrio (within the Proteobacteria_Deltaproteobacteria_Bdellovibrionales_Bdellovibrionaceae group). NCBI BLAST shows only matches to uncultured bacteria.

How is it possible to obtain such different results (two totally different phylogroups) using QIIME and RDP?

Greg Caporaso

Jan 24, 2013, 8:45:48 AM
to qiime...@googlegroups.com
Hi Monika,
Can you post the representative sequence?

Greg

Daniel McDonald

Jan 24, 2013, 3:17:18 PM
to qiime...@googlegroups.com
Hey Monika,

How did you pick OTUs and what version of the Greengenes OTUs were you using?

When you indicate that the sequences match RDP at 100%, did you use
the RDP website to determine this?

QIIME uses the RDP classifier for classification in the case of
non-reference-based OTUs. Typically, the training set used is
Greengenes. Regarding the different reference databases, it is
possible to get very different taxonomic results, as the reference
databases are constructed differently, each has its pros and cons,
and they differ in their level of annotation. In some cases, there are
misannotation issues as well. A further problem is chimeric sequences
within the references, which can confound classification, among other
problems. In short, no reference is perfect, and differences between
references are not necessarily surprising. However, I agree that it is
concerning to get different phylum-level classifications.

Best,
Daniel
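
To make the phylum-level disagreement concrete, here is a minimal Python sketch that parses a Greengenes-style lineage string and finds the first rank at which it conflicts with another assignment. The Greengenes lineage is the one quoted at the top of the thread. The helper names (`parse_gg_lineage`, `first_disagreement`) are hypothetical, and the RDP lineage is written out by hand from the names given in the thread; neither is part of QIIME or RDP.

```python
# Greengenes rank prefixes, in order from kingdom to species.
GG_RANKS = ["k", "p", "c", "o", "f", "g", "s"]

def parse_gg_lineage(lineage):
    """Split a Greengenes-style lineage into {rank_prefix: name}.

    Empty levels such as 'f__' are stored as None.
    """
    parsed = {}
    for field in lineage.split(";"):
        field = field.strip()
        if not field:
            continue
        prefix, _, name = field.partition("__")
        parsed[prefix] = name or None
    return parsed

def first_disagreement(a, b):
    """Return the first rank at which two parsed lineages name different taxa.

    Ranks that are unassigned (None or missing) in either lineage are skipped.
    """
    for rank in GG_RANKS:
        name_a, name_b = a.get(rank), b.get(rank)
        if name_a and name_b and name_a != name_b:
            return rank, name_a, name_b
    return None

greengenes = parse_gg_lineage(
    "k__Bacteria;  p__Cyanobacteria;  c__4C0d-2;  o__YS2;  f__;  g__;  s__"
)
# RDP website result as described in the thread, re-expressed by hand
# in the same rank-prefix form (an assumption made for comparison only).
rdp = {
    "k": "Bacteria",
    "p": "Proteobacteria",
    "c": "Deltaproteobacteria",
    "o": "Bdellovibrionales",
    "f": "Bdellovibrionaceae",
    "g": "Vampirovibrio",
}

print(first_disagreement(greengenes, rdp))
# ('p', 'Cyanobacteria', 'Proteobacteria')
```

Skipping unassigned ranks is a deliberate choice here: an empty `f__` in Greengenes is missing annotation, not a conflict with the family named by RDP.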

Monika

Jan 28, 2013, 9:38:45 AM
to qiime...@googlegroups.com
Hi Greg,

Here are all 10 sequences, resulting in 10 different OTUs but all matched to the same lineage:
>515333_reference_collection
AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCTTCATACATGCAAGTCGAACGAGAATCTTAACTTCGGTTGAGAGGAAAGTGGCGGACGGGTGAGTAGTGTGTAGAGAATCTGCCCTTGAGTGGGGGACAACAGCTGGAAACGGTTGCTAATACCCCATATGAGATTAGTTGAGATATTAATCTTGAAAACTCCGGTGCTCAAGGATGAGTCTGCATCTGATTAGCTAGTTGGGGGTGTAATGGACCACCAAGGCGACGATCAGTAGCTGGTTTGAGAGGATGATCAGCCACAATGGGACTGAGACACGGCCCATACTCCTACGGGAGACAGCAGTAGGGAATTTTGCGCAATGGGCGAAAGCCTGACGCAGCAACGCCGCGTGATTGATTAAGCCCTTCGGGGTGTAAAGATCTGTCAGTAGGGACGAACAATGACGGTACCTACAGAGGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTGTCCGGAATCATTGGGCGTAAAGAGTTCGTAGGTGGCTTGTTAAGTCTGGTGTTACATGCTGGGGCTCAATCCAGTTCGGCACTGGATACTGGCATGCTTGAATGCGGTAGAGGTAAAGGGAATTCCTGCTGTACCGCTGACATGGCTCNATATCACGAGGAACATCGGTGGCGTAAGCGCTTTACTGGGCCGTAATTGACACTGAGGAACGAAAGCCAGGGTAGCAAATGGGATTAGATACCCCAGTAGTCCTGGCCGTAAACGATGGATACTAGGTGTTGCGGGTATCGACCCCTGCAGTGCCGAAGCTAACGCGATAAGTATCCCGCCTGGGGAGTACGCACGCAAGTGTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAACATGTGGTTTAATTCGAAGCAACGCGAAAAACCTTACCAGGGCTTGACATCTAACAAACACTTGTGAAAGCGAGTGGTGCTCTTCGGAGAATGTTAAGACAGGTGGTGCACGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTCGTTAGTTGGATGTGTCGGTATACTTACACATTCCTCTCTAGCGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGACGACGTCAAATCATCATGCCCCTTATGCTCTGGGCTACACACGTGTTACAATGGCTGGTACAACGAGCGGCCAACTCGCGAGAGTGAGCAAATCTCTAAAAACCAGTCTCAGTTCGGATTGCACTCTGCAACTCGAGTGCATGAAGTCGGAATCGCTAGTAAACGCAGATCAGCACGCTGCGTTGAATACGTTCCCGGGTCTTGT
>320115_reference_collection
GACGAACGCTGGCGGCGTGCTTCATACATGCAAGTCGAACGAGAATCTCTAGCTTGCTAGAGAGGAAAGTGGCGGACGGGTGAGTAATATGTAGAGAATCTGCCCTTGAGAGGGGGACAACAGCTGGAAACGGTTGCTAATACCCCATATGAGCGTACCTGAGATGGTATTCTTGAAAACTCTGGTGCTCAAGGATGAGTCTGCATCTGATTAGCTAGTTGGCGGGGTAATGGCCCACCAAGGCGACGATCAGTAGCTGGTTTGAGAGGATGATCAGCCACAATGGGACTGAGACACGGCCCATACTCCTACGGGAGGCAGCAGTAGGGAATTTTGCGCAATGGGCGAAAGCCTGACGCAGCAACGCCGCGTGAACGAGACGCCCTTCGGGGTGTAAAGTTTTGTCAGTAGGGACGAAAGATGACGGTACCTACAGAGGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTGTCCGGAACCATTGGGCGTAAAGAGTTCGTAGGCGGCATGTAAAGTCAGGTGTTAAAGGCTGAGGCTCAACCTCAGTATGGCACTTGATACTTGCAAGCTAGAATGCGGTAGAGGTAAAGGGAATTCCTGGTGTAGCGGTGAAATGCTTAGATATCAGGAGGAACATCGGTGGCGTAAGCGCTTTACTGGGCCGTAATTGACGCTGAGGAACGAAAGCCAGGGTAGCAAATGGGATTAGATACCCCAGTAGTCCTGGCCGTAAACGATGGATACTAGGTGTTGCGGGTATCGACCCCTGCAGTGCCGCAGCAAACGCGTTAAGTATCCCGCCTGGGGAGTACGCACGCAAGTGTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAACATGTGGTTTAATTCGAAGCAACGCGAAAAACTTTACCAGGGCTTGACATCTGACGAATCTGGATGAAAGTTCGGAGTGCTCTTCGGAGAGCGTCAAGACAGGTGGTGCACGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTCGTTAGTTGGTGTTATCGGTACACTGATAATAACCTCTCTAGCGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGACGACGTCAAATCATCATGCCCCTTATGCTCTGGGCTACACACGTGTTACAATGGTTAGGACAGCGAGCAGCGAACCTGTGAGGGTAAGCAAATCTCTAAAACCTAGCCTCAGTTCGGATTGCACTCTGCAACTCGAGTGCATGAAGTCGGAATCGCTAGTAAACGCAGATCAGCACGCTGCGTTGAATACGTTCCCGGGTCT
>918242_reference_collection
GTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCTTCATACATGCAAGTCGAACGAGAAGCTCACTTCGGTGAGTGGAAAGTGGCGGACGGGTGAGTAATGTGTAGAGAATCTGCCCTTGAGTGGGGGACAACAGCTGGAAACGGTTGCTAATACCCCATATGAGATATTTTGCAATAAATATCTTGAAAACTCCGGTGCTCAAGGATGAGTCTGCATCTGATTAGCTAGTTGGGGGTGTAATGGACCACCAAGGCGACGATCAGTAGCTGGTTTGAGAGGATGATCAGCCACAATGGGACTGAGACACGGCCCATACTCCTACGGGAGGCAGCAGTAGGGAATTTTGCGCAATGGGCGAAAGCCTGACGCAGCAACGCCGCGTGATTGATAAAGCCCTTCGGGGTGTAAAGATCTGTCAGTGGGGACGAACAATGACGGTACCCACAGAGGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCGAGCGTTGTCCGGAATCATTGGGCGTAAAGAGTTCGTAGGTGGCCTGTTAAGTCTGGTGTTAAATGCAGATGCTCAACATCTGTTCGGCACTGGTACTGGCAAGCTTGAATGCGGTAGAGGTAAAGGGAATTCCTGGTGTAGCGGTGAAATGCGTAGATATCAGGAGGAACATCGGTGGCGTAAGCGCTTTACTGGGCCGTAATTGACACTGAGGAACGAAAGCCAGGGTAGCAAATGGGATTAGATACCCCAGTAGTCCTGGCCGTAAACGATGGATACTAGGTGTTGCGGGTATCGACCCCTGCAGTGCCGAAGCAAACGCGATAAGTATCCCGCCTGGGGAGTACGCACGCAAGTGTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAACATGTGGTTTAATTCGAAGCAACGCGAAAAACCTTACCAGGGCTTGACATCTAACAAACACTTGTGAAAGCGAGTGGTGCTCTTCGGAGAATGTTAAGACAGGTGGTGCACGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTCGTTAGTTGGATGTGTCGGTACACTTACACATTCCTCTCTAGCGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGACGACGTCAAATCATCATGCCCCTTATGCTCTGGGCTACACACGTGTTACAATGGCTGGTACAACGAGCCGCCAACTCGTGAGAGTGAGCAAATCTCTAAAAACCAGTCTCAGTTCGGATTGCACTCTGCAACTCGAGTGCATGAAGTCGGAATCGCTAGTAAACGCAGATCAGCACGCTGCGTTGAATACGTTCCCGGGTCTTGTACACACCGCCCGTCACACCATGGAAGTCGACCACGCCCGAAGTACGTGAGCTAACCGTAAGGGAGCAGCGTCCTAAGGCAGGGTTGGTGACTGGGGTGAAGTCGTAACAAGGTA
>192999_reference_collection
AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCTTCATACATGCAAGTCGAACGAGAAATTCACTTCGGTGGATGGAAAGTGGCGGACGGGTGAGTAATGTGTAGAGAATCTGCCCTAGAGCGGGGGACAACAGCTGGAAACGGTTGCTAATACCCCATATGAGCGTATCTGAAATGGTATTCTTGAAAACTCCGGTGCTCTAGGATGAGTCTGCATCTGATTAGCTAGTTGGGGGTGTAATGGACCACCAAGGCGACGATCAGTAGCTGGTTTGAGAGGATGATCAGCCACAATGGGACTGAGACACGGCCCATACTCCTACGGGAGGCAGCAGTAGGGAATTTTGCGCAATGGGCGAAAGCCTGACGCAGCAACGCCGCGTGATTGATAAAGCCCTTCGGGGTGTAAAGATCTGTCAGTGGGGACGAAACTTGACGGTACCCACAGAGGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTGTCCGGAATCATTGGGCGTAAAGAGTTCGTAGGTGGTTTGTTAAGTTTGGTGTTAAATGCAGAGGCTCAACTTCTGTTCGGCATCGGATACTGGCAGACTAGAATGCGGTAGAGGTAAAGGGAATTCCTGGTGTAGCGGTGAAATGCGTAGATATCAGGAGGAACATCGGTGGCGTAAGCGCTTTACTGGGCCGTAATTGACACTGAGGAACGAAAGCCAGGGTAGCAAATGGGATTAGATACCCCAGTAGTCCTGGCCGTAAACGATGGATACTAGGTGTTGCGGGTATCGACCCCTGCAGTGCCGTAGCTAACGCGTTAAGTATCCCGCCTGGGGAGTACGCACGCAAGTGTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAACATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAGGGCTTGACATCTAACAAACCTTTGTGAAAGCAGAGGGTGCTCTTCGGAGAATGTTAAGACAGGTGGTGCACGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTCGTTAGTTGGTACTATCGGTATACTTATAGTAACCTCTCTAGCGAGACTGCCGGTGATAAACCGGAGGAAGGTGAGGACGACGTCAAATCATCATGCCCCTTATGCTCTGGGCTACACACGTGTTACAATGGCCGGTACAATGAGCCGCCAACTCGCGAGAGTGAGCAAATCTCTAAAAGCCGGTCTCAGTTCGGATTGCAGTCTGCAACTCGACTGCATGAAGTCGGAATCGCTAGTAAACGCAGATCAGCACGCTGCGTTGAATACGTTCCCGGGTCTTGCACACACCGCCCGTCA
>517108_reference_collection
AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCTTCATACATGCAAGTCGAACGAGAAATTAGAGCTTGCTCTAATGGAAAGTGGCGGACGGGTGAGTAATATGTAGGAAATCTGCCCTAGAGAGGGGGACAACAGAGGGAAACTTCTGCTAATACCCCATATGAGCGTACTTGAGATAGTATTCTTGAAAACTCCGGTGCTCTGGGATGAGCCTGCATCTGATTAGCTAGTTGGTGGTGTAATGGACTACCAAGGCGACGATCAGTAGCTGGTTTGAGAGGATGATCAGCCACAATGGGACTGAGACACGGCCCATACTCCTACGGGAGGCAGCAGTAGGGAATTTTGCGCAATGGGCGAAAGCCTGACGCAGCAACGCCGCGTGAATGTTAAACCCTTCGGGGTGTAAAGTTCTGTCAGTGGGGACGAACAAATGACGGTACCCACAGAGGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTGTCCGGAATCATTGGGCGTAAAGAGTTCGTAGGCGGTTTGTTAAGTCTGGTGTTAAAGCCCGAAGCTCAACTTCGGTTCGGCACTGGATACTGGCAGACTAGAATGCGGTAGAGGTAAAGGGAATTCCTGGTGTAGCGGTGAAATGCGTAGATATCAGGAGGAACATCGGTGGCGTAAGCGCTTTACTGGGCCGTAATGACGCTGAGGAACGAAAGCCAGGGTAGCGAATGGGATTAGATACCCCAGTAGTCCTGGCCGTAAACGATGGATACTAGGTGTTGCGGGTATCGACCCCTGCAGTGCCGTAGCTAACGCGATAAGTATCCCGCCTGGGGAGTACGCACGCAAGTGTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAACATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAAGACTTGACATCTAGAGAACCTTTATGAAAGTAGAGGGTGCTCTTCGGAGAACTCTAAGACAGGTGGTGCACGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCACAACGAGCGCAACCCTCGTTGTTAGTTGCATATATCGGTATACTGATATATTGCTCTCTAGCAAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGACGACGTCAAATCATCATGCCCCTTATGTCTTGGGCTACACACGTGTTACAATGGCTAGTACAACGAGTCGCCAACTCGCGAGAGTGAGCAAATCTCTTAAAACTAGTCTCAGTTCGGAATGCACTCTGCGACTCCAGTGCATGAAGTCGGAATCGCTAGTAAACGCAGATCAGCACGCTGCGTTGAATACATTACCCGGGGCTGGTACACACCGCCCGTCA
>549658_reference_collection
AGAGTTTGATCATGGCTCAGGACGAACGCTGGCGGCGTGCTTCATACATGCAAGTCGAACGAGAAGCTGACTTCGGTCAGTGGAAAGTGGCGGACGGGTGAGTAATATGTAGGAAATCTGCCCTAGAGAGGGGGACAACAGAGGGAAACTTCTGCTAATACCCCATATGAGCGTACTTGAAATAGTATTCTTGAAAACTCCGGTGCTCTAGGATGAGCCTGCATCTGATTAGCTTGTTGGTGGTGTAATGGACTACCAAGGCGACGATCAGTAGCTGGTTTGAGAGGATGATCAGCCACAATGGGACTGAGACACGGCCCATACTCCTACGGGAGGCAGCAGTAGGGAATTTTGCGCAATGGGCGAAAGCCTGACGCAGCAACGCCGCGTGTGTGATGACGCCCTTCGGGGTGTAAAACACTGTCAGTAGGGACGAAACTTGACGGTACCTACAGAGGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTGTCCGGAATCATTGGGCGTAAAGAGTTCGTAGGTGGTTTGTTAAGTCTGGTGTTAAAGCCCGAAGCTCAACTTCGGTTCGGCATCGGATACTGGCAGGCTAGAATGCGGTAGAGGTAAAGGGAATTCCTGGTGTAGCGGTGAAATGCGTAGATATCAGGAGGAACATCGGTGGCGTAAGCGCTTTACTGGGCCGTAATTGACACTGAGGAACGAAAGCCGGGGTAGCAAATGGGATTAGATACCCCAGTAGTCCCGGCCGTAAACGATGGATACTAGGGTGTTGCGGGTATCGACCCCTGCAGTGCCGTAGTTAACGCGATAAGTATCCCGCCTGGGGAGTACGCACGCAAGTGTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAACATGTGGTTTAATTCGAAGCAACGCGAAAAACCTTACCAGGGCTTGACATCTGAGGAACCTTTGTGAAAGCAGAGGGTGCTCTTCGGAGAACCTCAAGACAGGTGGTGCACGGTTGTCGCCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTCGTTAGTTGCATATATTGGTATACTGATATATTGCTCTCTAGCGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGACGACGTCAAATCATCATGCCCCTTATGTCCTGGGCTACACACGTGTTACAATGGCTAAGACAGCGAGCAGCCAACCCGCGAGGGTGAGCAAATCTCTTAAACTTAGTCTCAGTTCGGATTGCACTCTGCAACTCGAGTGCATGAAGTCGGAATCGCTAGTAACCGTAGATCAGCACGCTGCGGTGAATACGTTCCCGGGTCTTGTACACACCGCCCGTCACACCATGGAAGTCGACCACGCCCGAAGCACGTGAGCTAACCTTTTGGAGGCAGCGTTCTAAGGCAGGGTTGGTGGCTGGGGTGAAGTCGTAACAAGGTAACC
>532474_reference_collection
AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCTTCATACATGCAAGTCGAACGAGAATCTTGACTTCGGTTGAGAGGAAAGTGGCGGACGGGTGAGTAATGTGTAGAGAATCTGCCCTTGAGTGGGGGACAACAGCTGGAAACGGTTGCTAATACCCCATATGAGATTAGTTGAGATATTAATCTTGAAAACTCCGGTGCTCAAGGATGAGTCTGCATCTGATTAGCTAGTTGGGGGTGTAATGGACCACCAAGGCGACGATCAGTAGCTGGTTTGAGAGGATGATCAGCCACAATGGGACTGAGACACGGCCCATACTCCTACGGGAGGCAGCAGTAGGGAATTTTGCGCAATGGGCGAAAGCCTGACGCAGCAACGCCGCGTGATTGATTAAGCCCTTCGGGGTGTAAAGATCTGTCAGTAGGGACGAACAATGACGGTACCTACAGAGGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTGTCCGGAATCATTGGGCGTAAAGAGTTCGTAGGTGGCTTGTTAAGTCTGGTGTTANATGCTGGGGCTCAACTTCAGTTCGGCACTGGATACTGGCAGGCTTGAATGCGGTACAGGTAAAGGGAATTCCTGGTGTACCGGTGAAATGCGTAGATATCAGAGGAACATCGGTGGCGTAAGCGCTTTACTGGCCCGTAATTGACACTGAGGAACGAAAGCCAGGGTAGCAAATGGGATTAGATACCCCAGTAGTCCTGGCCGTAAACGATGGATACTAGGTGTTGCGGGTATCGACCCCTGCAGTGCCGAAGCTAACGCGATAAGTATCCCGCCTGGGGAGTACGCACGCAAGTGTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAACATGTGGTTTAATTCGAAGCAACGCGAAAAACCTTACCAGGGCTTGACATCTAACAAACACTTGTGAAAGCGAGTGGTGCTCTTCGGAGAATGTTAAGACAGGTGGTGCACGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTCGTTAGTTGGATGTGTCGGTATACTTACACATTCCTCTCTAGCGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGACGACGTCAAATCATCATGCCCCTTATGCTCTGGGCTACACACGTGTTACAATGGCTGGTACAACGAGCGGCCAACTCGCGAGAGTGAGCAAATCTCTAAAAACCAGTCTCAGTTCGGATTGCACTCTGCAACTCGAGTGCATGAAGTCGGAATCGCTAGTAAACGCAGATCAGCACGCTGCGTTGAATACGTTCCCGGGTCATTGTACACAGCGCCCGTCA
>3449666_reference_collection
CCTGGCTCAGGACGAACGCTGGCGGCGTGCTTCATACATGCAAGTCGAACGAGAATTCACCTTCGGGTGGAGGACAGTGGCGGACGGGTGAGTAAAGTGTAGAGAATCTGCCCTAGAGTGGGGGACAACATTGGGAAACCGGTGCTAATACCCCATATGAGTCAAGCTGCAATGCTTGCCTTGAAAACTCCGGTGCTCTAGGATGAGTCTGCATCTGATTAGCTAGTTGGGGGTGTAATGGACCACCAAGGCGACGATCAGTAGCTGGTTTGAGAGGATGATCAGCCACAATGGGACTGAGACACGGCCCATACTCCTACGGGAGGCAGCAGTAGGGAATTTTGCGCAATGGGCGAAAGCCTGACGCAGCAACGCCGCGTGATTGATGACGCCCTTCGGGGTGTAAAGATCTGTCAGTGGGGACGAAAAATGACGGTACCCACAGAGGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTGTCCGGAATCATTGGGCGTAAAGAGTTCGTAGGTGGCCTGTTAAGTCAGGTGTTAAAGGTCGAGGCTCAACCTCGGTATGGCACTTGATACTGGCAAGCTTGAATTCGGTAGAGGTAAAGGGAATTCCTGGTGTAGCGGTGAAATGCGTAGATATCAGGAGGAACATCGGTGGCGTAAGCGCTTTACTGGGCCGACATTGACACTGAGGAACGAAAGCCGGGGTAGCAAATGGGATTAGATACCNCAGTAGTCCCGGCCGTAAACGATGGATACTAGGTGTTGCGGGTATCGACCCCTGCAGTGCCGCAGCTAACGCGATAAGTATCCCGCCTGGGGAGTACGCACGCAAGTGTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAACATGTGGTTTAATTTGAAGCAACGCGAAAAACCTTACCAGGGCTTGACATCTAACTAATATTTAAGAAATTAGATAGTGCTCTTCGGAGAAAGTTAAGACAGGTGGTGCACGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTCGTTAGTTGGATATATTGGTACACTGATATATTCCTCTCTAGCGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGANGACGTCAAATCATCATGCCCCTTATGCTNTGGGCTACACACGTGTTACAATGGCCGGTACAACGAGCCGCCAACTCGCGAGAGTGAGCAAATCTCTAAAAACCGGTCTCAGTTCGGATTGCACTCTGCAACTCGAGTGCATGAAGTCGGAATCGCTAGTAAACGCAGATCAGCACGCTGCGTTGAATACGTTCCCGGGTCTTGTACACACCGCCCGTCACACCATGGAAGTCGACCACGCCCGAAGTACGTGAGCTAACCTTAACGGAAGCAGCGTCCTAAGGCAGGGTTGGTGACTGGGGNGAAGT
>181544_reference_collection
AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCTTCATACATGCAAGTCGAACGAGAAGTTACCTTCGGGTAATGGACAGTGGCGGACGGGTGAGTAATGTGTAGAGAATCTGCCCTTGAGCGGGGGACAACAGCTGGAAACGGTTGCTAATACCCCATATGATGCCGGTTGTGATACTGGTATTGAAAACTCCGGTGCTCAAGGATGAGTCTGCATCTGATTAGCTAGTTGGGGGTGTAATGGACCACCAAGGCGACGATCAGTAGCTGGTTTGAGAGGATGATCAGCCACAATGGGACTGAGACACGGCCCATACTCCTACGGGAGGCAGCAGTAGGGAATTTTGCGCAATGGGGGCAACCCTGACGCAGCAACGCCGCGTGATTGATTAAGCCCTTCGGGGTGTAAAGATCTGTCAGTGGGGACGAAATATGACGGTACCCACAGAGGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTGTCCGGAATCATTGGGCGTAAAGAGTTCGTAGGTGGCATGTAAAGTCTGGTGTTAAAGGCAGAAGCTCAACTTCTGTAGGGCACTGGATACTTGCAAGCTGGAATGCGGTAGAGGTAAAGGGAATTCCTGGTGTAGCGGTGAAATGCGTAGATATCAGGAGGAACATCGGTGGCGTAAGCGCTTTACTGGGCCGTAATTGACACTGAGGAACGAAAGCCGGGGTAGCAAATGGGATTAGATACCCCAGTAGTCCCGGCCGTAAACGATGGATAATAGGTGTTGCGGGTATCGACCCCTGCAGTGCCGCAGCTAAACGCGTTAAGTATCCCGCCTGGGGAGTACGCACGCAAGTGTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAACATGTGGTTTAATTCGAAGCAACGCGAAGAACCTTACCAGGGCTTGACATCTAACAAATCTCTATGAAAGTAGAGAGTGCTCTTCGGAGAATGTTAAGACAGGTGGTGCACGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTCGTCGTTAGTTGGATGCATCGGTATACTTATACATTCCTCTCTAGCGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGACGACGTCAAATCATCATGCCCCTTATGCTCTGGGCTACACACGTGTTACAATGGCCAGGACAACGAGCCGCCAACTCGCGAGAGTGAGCAAATCTCTAAAACCTGGTCTCAGTTCGGATTGCACTCTGCAACTCGAGTGCATGAAGTCGGAATCGCTAGTAAACGCAGATCAGCACGCTGCGTTGAATACGTTCCCGGGTCTTGCACACACCGCCCGTCA
>302878_reference_collection
AGAGTTTGATCCTGGCTCAGGACGAACGCTGGCGGCGTGCTTCATACATGCAAGTCGAACGAGAAGTCACCTTCGGGTGATGGAAAGTGGCGGACGGGTGAGTAATATGTAGAGAATCTGCCCTTGAGAGGGGGACAACAGAGGGAAACTTCTGCTAATACCCCATATGAGCGTAGCTGAAATGCTATTCTTGAAAACTCCGGTGCTCAAGGATGAGTCTGCATCTGATTAGCTAGTTGGCGGTGTAATGGACCACCAAGGCGACGATCAGTAGCTGGTTTGAGAGGATGATCAGCCACAATGGGACTGAGACACGGCCCATACTCCTACGGGAGGCAGCAGTAGGGAATTTTGCGCAATGGGCGAAAGCCTGACGCAGCAACGCCGCGTGAACGAGAAGCCCCTCGGGGTGTAAAGTTCTGTCAGTAGGGACGAACGATGACGGTACCTACAGAGGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTGTCCGGAATCATTGGGCGTAAAGAGTTCGTAGGCGGCACGTAAAGTCTGGTGTTAAAGGCGGAGGCTCAACTTCCGTACGGCACTGGATACTTGCGAGCTAGAATGCGGTAGAGGTAAAGGGAATTCCTGGTGTAGCGGTGAAATGCTTAGATATCAGGAGGAACATCGGTGGCGTAAGCGCTTTACTGGGCCGTGATTGACGCTGAGGAACGAAAGCCAGGGTAGCAAATGGGATTAGATACCCCAGTAGTCCTGGCCGTAAACGATGGATACTAGGTGTTGCGGGTATCGACCCCTGCAGTGCCGAAGCTAACGCGATAAGTATCCCGCCTGGGGAGTACGCACGCAAGTGTGAAACTCAAAGGAATTGACGGGGACCCGCACAAGCGGTGGAACATGTGGTTTAATTCGAAGCAATGCGAAGAACCTTACCAGGGCTTGACATCTAACAAATCTAGCGGAAACGTTGGAGTGCTCTTCGGAGAATGTTAAGACAGGTGGTGCACGGTTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCACGTTGTTAGTTGGTCATATTGGTATACTGATATGAATCTCTCTAGCGAGACTGCCGGTGACAAACCGGAGGAAGGTGTGGACGACGTCAAATCATCATGCCCCTTATGCTCTGGGCTACACACGTGTTACAATGGGTAGGACAGCGAGCAGCGAACCTGCGAGGGCAAGCAAATCTCTAAAACCTATCCTCAGTTCGGATTGCACTCTGCAACTCGAGTGCATGAAGTCGGAATCGCTAGTAAACGCAGATCAGCACGCTGCGTTGAATACGTTCCCGGGTCTTGCACTCACCGCCCGTCA

I used /home/ubuntu/qiime_software/gg_otus-12_10-release/rep_set/99_otus.fasta as reference sequence.

- Monika
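
For readers who want to pull specific records out of a reference FASTA file by ID, a plain-Python sketch follows. It assumes that the first whitespace-delimited token of each FASTA header is the OTU ID; the function names are hypothetical, and the demo input is a tiny made-up FASTA with placeholder sequences, not the real 99_otus.fasta.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def select_by_id(handle, wanted_ids):
    """Return {otu_id: sequence} for records whose first header token is wanted."""
    wanted = set(wanted_ids)
    selected = {}
    for header, seq in read_fasta(handle):
        tokens = header.split()
        if tokens and tokens[0] in wanted:
            selected[tokens[0]] = seq
    return selected

# Tiny made-up FASTA standing in for a reference file (placeholder sequences).
demo = io.StringIO(
    ">515333\nAGAGTTTGATCC\nTGGCTCAG\n>999999\nACGT\n>320115\nGACGAACGCTGG\n"
)
hits = select_by_id(demo, ["515333", "320115"])
for otu_id, seq in hits.items():
    print(otu_id, len(seq))
```

To run it on a real file, replace the `io.StringIO` demo with `open(path)` on the FASTA path in question.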

Monika

Jan 28, 2013, 9:41:34 AM
to qiime...@googlegroups.com
Hi Daniel,

I used /home/ubuntu/qiime_software/gg_otus-12_10-release/rep_set/99_otus.fasta as reference sequence set.

Yes, I was using the RDP website to match these sequences.

If chimeras are the cause, where do you think the error lies: in the reference sequence set or in the RDP database?

Daniel McDonald

Jan 28, 2013, 9:46:26 AM
to qiime...@googlegroups.com
Hey Monika,

I'll verify with Phil Hugenholtz (Greengenes) and Jim Cole (RDP) and
see what the discrepancy is.

I have not seen any evidence yet of chimeric artifacts in these
reference sequences.

Best,
Daniel

Daniel McDonald

Jan 28, 2013, 3:06:53 PM
to qiime...@googlegroups.com
Monika,

I just received a response back from Jim Cole at RDP. Here is what he
had to say:

We classified the unknown as a Vampirovibrio. Qiong in our group did
some quick detective work. Looks like there is something suspicious
with either the classification of Vampirovibrio chlorellavorus or with
its 16S sequence. V. chlorellavorus was first published by a Russian
group in 1980 as a Bdellovibrio that grows on Chlorella (a eukaryotic
alga; Bdellovibrio are usually parasites of bacteria). Anyway, the
16S sequence wasn't done until 2010 by ATCC, and the sequence appears
specifically related to the Cyanobacteria and chloroplasts. This leads
to the question of whether V. chlorellavorus is a motile cyanobacterium
with a very unusual lifestyle or whether ATCC accidentally sequenced
its food's chloroplast sequence. To test this, Qiong compared the
sequence to the Chlorella vulgaris chloroplast and it's not a close
match. I'm going to e-mail Tim Lilburn at ATCC to see if he knows
anything more.

Anyway, the unknown sequence is not proteobacterial and is likely
related to the Cyanobacteria.

Thanks for pointing out the discrepancy,
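
A check like the one Qiong ran (comparing the query against a candidate contaminant sequence) would normally be done with an alignment tool. As a rough, alignment-free stand-in, the Python sketch below computes the fraction of a query's 8-mers that also occur in a reference sequence. The RDP Classifier is also built on 8-mer word statistics, but this is not its algorithm, only a crude similarity screen. The function names are hypothetical and the sequences are toy strings, not real 16S data.

```python
def kmers(seq, k=8):
    """Return the set of all length-k substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_kmer_fraction(query, reference, k=8):
    """Fraction of the query's distinct k-mers that also occur in the reference.

    Returns 0.0 when the query is shorter than k.
    """
    query_words = kmers(query, k)
    if not query_words:
        return 0.0
    return len(query_words & kmers(reference, k)) / len(query_words)

# Toy sequences for illustration only; they are not real 16S data.
query = "ACGTACGGTCATGCAAGTCGAACG"
same = query
unrelated = "TTTTTTTTTTTTTTTTTTTTTTTT"

print(shared_kmer_fraction(query, same))       # 1.0
print(shared_kmer_fraction(query, unrelated))  # 0.0
```

A low shared fraction against the suspected contaminant and a high one against cyanobacterial references would point the same way as the conclusion above, though only a proper alignment gives a percent identity.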