I am trying to analyze samples for a vaginal microbiome study. I started with the usual QIIME pipeline, where I run pick_otus, pick_rep_set, and assign_taxonomy and then make an OTU table. I also build a tree and use it downstream. I used the Greengenes database as the reference and usearch61 to assign taxonomy. I did not get classification at the species level, although my application (a study of bacterial vaginosis) requires it. So I dug around and found that the Vaginal Microbiome Consortium provides both a classifier and reference data in STIRRUPS. It successfully classified almost 98% of my OTU sequences to the species level.
My problem is that I am not sure how to incorporate these results into the existing taxonomy assignment text file, which is used to create a BIOM table. I am very familiar with using BIOM tables in downstream applications, so I would rather get this working than try other approaches.
STIRRUPS classifies only at the genus and species levels and nothing above. Where the species level was missing, I backfilled my existing assign_taxonomy output with the values from the STIRRUPS classification. Most OTUs were already classified down to the genus level. However, almost 20% of the OTUs have "unidentified" as the taxonomy assignment, and for these I do have the genus and species classification from STIRRUPS. Can I just fill in the g__ and s__ values without filling in the ranks above them and still build a working OTU table? As I am only interested in species-level classification, I think I can afford to backfill these values. If not, could you please point me to how such studies are usually performed?
Kindly let me know your thoughts. I appreciate your guidance.
PS: Example classification results for the same OTU:
Using QIIME and usearch:
denovo414714 k__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus; s__ 0.67 3
Using STIRRUPS:
denovo414714|Lactobacillus jensenii|BT|1|100|89.8
While I can backfill my file in the example above, I cannot do so in the example below.
Example:
Using QIIME and usearch:
denovo234088 Unassigned 1.00 1
Using STIRRUPS:
denovo234088|Lactobacillus iners|BT|1|100|80.6
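To make the backfilling concrete, here is a minimal Python sketch of what I have in mind. It rests on several assumptions of mine: the QIIME taxonomy file is tab-delimited (OTU ID, taxonomy string, then the remaining columns), the first two pipe-delimited STIRRUPS fields are the OTU ID and the "Genus species" name, the species rank should hold only the species epithet, and an OTU without a full seven-rank lineage gets empty placeholders (k__; p__; ...) for the ranks STIRRUPS does not provide. The function names are my own and not part of QIIME or STIRRUPS, and whether the placeholder lineage is acceptable downstream is exactly what I am asking.

```python
# Hypothetical helper script; not part of QIIME or STIRRUPS.
RANK_PREFIXES = ["k__", "p__", "c__", "o__", "f__", "g__", "s__"]


def parse_stirrups_line(line):
    """Return (otu_id, genus, species) from one pipe-delimited STIRRUPS line.

    Assumption: field 0 is the OTU ID and field 1 is "Genus species".
    The remaining fields are ignored here.
    """
    fields = line.strip().split("|")
    otu_id, name = fields[0], fields[1]
    genus, _, species = name.partition(" ")
    return otu_id, genus, species.replace(" ", "_")


def backfill(taxonomy, genus, species):
    """Fill empty g__/s__ ranks of a Greengenes-style lineage string."""
    ranks = [r.strip() for r in taxonomy.split(";")]
    if len(ranks) != len(RANK_PREFIXES):
        # No full lineage (e.g. "Unassigned"): start from empty placeholders.
        ranks = list(RANK_PREFIXES)
    if ranks[5] == "g__":
        ranks[5] = "g__" + genus
    # Only add the species when the genus agrees with the STIRRUPS call.
    if ranks[6] == "s__" and ranks[5] == "g__" + genus:
        ranks[6] = "s__" + species
    return "; ".join(ranks)


def merge_line(qiime_line, stirrups_calls):
    """Rewrite the taxonomy column of one tab-delimited QIIME line."""
    fields = qiime_line.rstrip("\n").split("\t")
    call = stirrups_calls.get(fields[0])
    if call is not None:
        fields[1] = backfill(fields[1], *call)
    return "\t".join(fields)


if __name__ == "__main__":
    stirrups_lines = [
        "denovo414714|Lactobacillus jensenii|BT|1|100|89.8",
        "denovo234088|Lactobacillus iners|BT|1|100|80.6",
    ]
    calls = {}
    for line in stirrups_lines:
        otu_id, genus, species = parse_stirrups_line(line)
        calls[otu_id] = (genus, species)

    qiime_lines = [
        "denovo414714\tk__Bacteria; p__Firmicutes; c__Bacilli; "
        "o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus; s__\t0.67\t3",
        "denovo234088\tUnassigned\t1.00\t1",
    ]
    for line in qiime_lines:
        print(merge_line(line, calls))
```

In this sketch the confidence and count columns are left untouched, and a species is only written when the genus from QIIME matches the genus from STIRRUPS, so conflicting assignments are kept as they were and can be reviewed by hand.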
Thanks,
RV