Assign Taxonomy step

Alexandra Puértolas

Nov 17, 2016, 5:54:58 AM
to qiime...@googlegroups.com
Dear Qiime community,

I have finally reached the last step of my data analysis! I am performing a metagenomic study using four different primer pairs, and I have a small issue with the taxonomy results.

My question is about assigning taxonomy to the representative OTU sequences. When I run the assign_taxonomy.py script with the blast option, for some of them I get the correct taxonomy assignment, matching the corresponding sequence in my database. For others (sometimes quite a lot of them), I don't get a taxonomy result at all: the output shows only the name of the FASTA database I used in this step, together with a number that is sometimes the same and sometimes different between OTUs. (I created my own database, a FASTA file of my own curated collection of unique sequences. Not all sequences in the reference database have the same length as my Illumina reads; for example, the database contains sequences covering the entire ITS region, but I have only sequenced the ITS1 spacer.)

I am a bit confused about what this means. I don't know whether it happens because the consensus sequence of the OTU matches more than one database sequence at the same time, so that no specific result can be returned.

Here is what I am talking about (see the OTUs denovo189, denovo370 and denovo430). Any idea what this is?

denovo748 Eukaryota Stramenopiles Oomycetes Peronosporales Phytophthora PHYTOPHTHORA_TENTACULATA_CBS_41296_ITS 9E-024 PHYTOPHTHORA_TENTACULATA_CBS_41296_ITS
denovo189 None 4E-011 786_DBsingles.fasta
denovo370 None 0.000000003 786_DBsingles.fasta
denovo430 None 4E-135 438_DBsingles.fasta
denovo438 Eukaryota Stramenopiles Oomycetes Peronosporales Phytophthora PD_00420_ITS_Phytophthora_taxon_Pgchlamydo 4E-051 PD_00420_ITS_Phytophthora_taxon_Pgchlamydo

Thanks a lot in advance!

Alexandra.


Alexandra Puértolas

Nov 17, 2016, 9:11:06 AM
to Qiime 1 Forum
I have also tried the uclust method in assign_taxonomy.py, and there I don't have that problem with the assignment of results to the OTUs. The problem here is that the log file of assign_taxonomy shows the complete taxonomy, but I don't see it in the rep_seq_tax_assigned.txt file, and consequently it doesn't appear when I merge the OTU table with the taxonomy assignment.

Does anyone know why that is?

I have attached the assign_taxonomy file and log file, as well as the otu table.

Thank you!
otu_table.biom
otu_table.txt
rep_set_tax_assignments.log

zech xu

Nov 21, 2016, 1:43:47 AM
to Qiime 1 Forum
Hi Alexandra,

What reference ITS database are you using? Could you grab a representative sequence from the OTUs that get a weird or no taxonomy assignment and BLAST it against NCBI to see if it truly matches ITS?

Alexandra Puértolas

Nov 21, 2016, 6:26:51 AM
to Qiime 1 Forum
Hi Zech Xu

Thank you for your reply. I have checked, and the sequences labelled this way do belong to the ITS region, but they do not look like my target organisms. The database I am using is a compilation of sequences I downloaded from different databases and papers for the four loci I have studied.

I have been checking, and I will use the uclust method to assign the taxonomy. I am losing diversity when I use the blast method instead of uclust. With uclust I can change the parameters and lower the similarity to 0.8 (with blast I cannot do that and it will always be 0.9, which I think is why I get more diversity with uclust).

Do you know how I can fix this step so that I get an OTU table with the taxonomy without losing the name of each reference sequence?

Thanks a lot for your help!

Alexandra

zech xu

Nov 22, 2016, 6:33:55 PM
to Qiime 1 Forum
Hi Alexandra,

To find the ID of the reference sequence that each of your ITS sequences matches, look at the raw-format output from uclust. In the output directory of the assign_taxonomy.py command there is a log file. You have to parse that file to see which representative sequence matches which reference sequence.
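
As a rough illustration of that parsing step, here is a minimal Python sketch. It assumes the log file is in the tab-separated UCLUST .uc format, where hit records start with "H", the 4th column is the percent identity, the 9th column is the query (representative sequence) label and the 10th column is the target (reference sequence) label. Please check a few lines of your own log before relying on it. The function name parse_uc_hits is made up for this example and is not part of QIIME.

```python
from collections import defaultdict


def parse_uc_hits(lines):
    """Map each query label to a list of (reference label, percent identity).

    Assumes tab-separated .uc-style records (an assumption about the log
    format, not something confirmed in this thread).
    """
    hits = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Keep only hit records ("H"); other record types are ignored.
        if len(fields) < 10 or fields[0] != "H":
            continue
        # Labels may carry a description after the ID; keep the first token.
        query_label = fields[8].split()[0]
        target_label = fields[9].split()[0]
        hits[query_label].append((target_label, float(fields[3])))
    return dict(hits)


# Hypothetical usage with the log written by assign_taxonomy.py:
# with open("rep_set_tax_assignments.log") as handle:
#     hits = parse_uc_hits(handle)
# for otu_id, matches in hits.items():
#     print(otu_id, matches)
```

Each OTU can have several hits, so the result keeps all of them rather than picking one.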

Alexandra Puértolas

Nov 23, 2016, 3:54:24 AM
to Qiime 1 Forum

Hi! Thank you for the information!

I saw the log file with the results of the taxonomy. How can I parse this information and merge it into an OTU table, so that the table contains both the OTU counts and the species assignments? Is there a Qiime script, or some other script, to do it?

Thanks a lot for your help

Alexandra
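
One possible way to do the merge described in this post, sketched in Python under some assumptions: the OTU table has been exported as tab-delimited text (like the attached otu_table.txt), its header line starts with "#OTU ID", and a dictionary mapping each OTU ID to a reference sequence name is already available. The function name add_reference_column and the column name reference_seq are made up for this example; this is not a QIIME script.

```python
def add_reference_column(table_lines, otu_to_ref, column_name="reference_seq"):
    """Append one column of reference names to a tab-delimited OTU table.

    table_lines: iterable of lines from the text OTU table.
    otu_to_ref:  dict mapping OTU ID -> reference sequence name.
    OTUs missing from the dict get the string "None".
    """
    out = []
    for line in table_lines:
        line = line.rstrip("\n")
        if line.startswith("#OTU ID"):
            # Header row: add the new column title.
            out.append(line + "\t" + column_name)
        elif not line or line.startswith("#"):
            # Other comment lines and blank lines pass through unchanged.
            out.append(line)
        else:
            # Data row: the OTU ID is the first tab-separated field.
            otu_id = line.split("\t", 1)[0]
            out.append(line + "\t" + otu_to_ref.get(otu_id, "None"))
    return out


# Hypothetical usage:
# with open("otu_table.txt") as handle:
#     merged = add_reference_column(handle, otu_to_ref)
# with open("otu_table_with_refs.txt", "w") as out_handle:
#     out_handle.write("\n".join(merged) + "\n")
```

The counts are left untouched; only one extra column is appended to each row.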
