Samuel Major

Apr 19, 2016, 2:51:42 PM
to Qiime 1 Forum
Hello,
I'm trying to identify some algal plastid DNA alongside bacterial 16S sequences from environmental samples. Using the Greengenes database, the 16S sequences couldn't classify the majority of the algae past the "Order" level. Does anyone know whether I can download and use an alternate database that might classify these samples better?
Second question: If this isn't possible, we're going to attempt either a 23S or an 18S MiSeq run on the same samples to achieve better results. Has anyone had any experience/success performing such an analysis with QIIME?

Thank you!!!
Sam

TonyWalters

Apr 19, 2016, 3:25:33 PM
to qiime...@googlegroups.com
Hello Sam,

Plastid (and mitochondrial) sequences are quite limited in Greengenes; they are there mainly to help capture the host-related reads that people generally filter out of their data. To identify these reads, you'd probably want to take the most abundant ones and BLAST them at NCBI to see which organism they likely come from. For that approach, I'd first filter the table down to just the chloroplast OTUs with filter_taxa_from_otu_table.py (use the exact taxon string), which makes it easier to handle. Then convert the OTU table to tab-separated form (biom convert -i otu_table.biom -o otu_table.txt --to-tsv), find the abundant OTUs in the resulting table, which you can view in Excel, and pull the corresponding sequences out of the rep_set.fna file. If you have an OTU ID such as 123456, you can use a grep command to query it, e.g.
grep -A 1 "^>123456 " rep_set.fna
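
If you'd rather not pick OTUs out by hand in Excel, the ranking and lookup can be scripted. This is only a rough sketch, not part of QIIME. It assumes the table is the tab-separated output of biom convert (comment lines starting with "#", OTU ID in the first column, per-sample counts after it) and that each rep_set.fna header starts with the OTU ID followed by a space. The function names top_otus, read_fasta and sequences_for are made up for illustration.

```python
def top_otus(table_lines, n=10):
    """Return the n OTU IDs with the largest total count across all samples."""
    totals = {}
    for line in table_lines:
        row = line.rstrip("\n").split("\t")
        if not row[0] or row[0].startswith("#"):
            continue  # skip blank lines and the "#..." comment/header lines
        total = 0.0
        for cell in row[1:]:
            try:
                total += float(cell)
            except ValueError:
                pass  # non-numeric cell, e.g. a trailing taxonomy column
        totals[row[0]] = total
    return sorted(totals, key=totals.get, reverse=True)[:n]


def read_fasta(fasta_lines):
    """Yield (otu_id, sequence); the ID is the first word after '>'."""
    otu_id, chunks = None, []
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if otu_id is not None:
                yield otu_id, "".join(chunks)
            parts = line[1:].split()
            otu_id, chunks = (parts[0] if parts else ""), []
        elif line and otu_id is not None:
            chunks.append(line)  # also copes with sequences wrapped over lines
    if otu_id is not None:
        yield otu_id, "".join(chunks)


def sequences_for(otu_ids, fasta_lines):
    """Return {otu_id: sequence} for the requested OTU IDs only."""
    wanted = set(otu_ids)
    return {oid: seq for oid, seq in read_fasta(fasta_lines) if oid in wanted}


if __name__ == "__main__":
    # File names follow the ones used in this thread.
    with open("otu_table.txt") as table:
        abundant = top_otus(table, n=10)
    with open("rep_set.fna") as fasta:
        seqs = sequences_for(abundant, fasta)
    for otu_id in abundant:
        if otu_id in seqs:
            print(">%s\n%s" % (otu_id, seqs[otu_id]))
```

The printed FASTA records can be pasted straight into the NCBI BLAST web form.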

As for 18S analyses, those are possible, but you would need to use different PCR primers. There are already SILVA releases that include 18S data and taxonomy files: http://www.arb-silva.de/download/archive/qiime/
You would have to be careful to examine whether the fragment amplified by your primers can distinguish the taxa involved.

LSU reads are also possible, but you'd have to either create an LSU QIIME-compatible database from SILVA or some other raw data source, or find one that someone has already created.
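
To give a rough idea of what "QIIME-compatible" means here: QIIME wants a reference FASTA file plus a tab-separated file mapping each sequence ID to its taxonomy string. The sketch below is an illustration only, assuming SILVA-style headers of the form ">ID lineage" (ID first, then a space, then a semicolon-separated lineage) and RNA sequences that may contain alignment gaps ("-" or "."). The function names are hypothetical, and a real reference set would also need steps such as clustering and taxonomy cleanup that are not shown.

```python
def split_silva_fasta(fasta_lines):
    """Return ({seq_id: dna_sequence}, {seq_id: lineage}) from SILVA-style lines."""
    seqs = {}      # seq_id -> list of cleaned sequence chunks
    taxonomy = {}  # seq_id -> lineage string from the header
    current = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Split on the first space only, since lineages can contain spaces.
            seq_id, _, lineage = line[1:].partition(" ")
            current = seq_id
            seqs[current] = []
            taxonomy[current] = lineage.strip()
        elif current is not None:
            # Convert RNA to DNA letters and drop alignment gap characters.
            cleaned = line.upper().replace("U", "T")
            cleaned = cleaned.replace(".", "").replace("-", "")
            seqs[current].append(cleaned)
    return {sid: "".join(parts) for sid, parts in seqs.items()}, taxonomy


def to_qiime_files(fasta_lines):
    """Return (reference FASTA text, ID-to-taxonomy text) as two strings."""
    seqs, taxonomy = split_silva_fasta(fasta_lines)
    fasta_out = "".join(">%s\n%s\n" % (sid, seq) for sid, seq in seqs.items())
    tax_out = "".join("%s\t%s\n" % (sid, lin) for sid, lin in taxonomy.items())
    return fasta_out, tax_out
```

You would write the two returned strings to files and pass them as the reference sequences and the ID-to-taxonomy map when assigning taxonomy.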