Analyzing fungal ITS sequencing in QIIME

Newbie

Aug 9, 2012, 5:33:50 PM
to Qiime Forum
Hi everyone,

I have fungal ITS1 sequences that I need to process through QIIME. I
see that there are UNITE database files that can be used as a
reference data set. However, I have never used any reference
database other than the default Greengenes, and I am unclear about
everywhere I need to specify it. I thought that I should go separately
through the steps that make up pick_otus_through_otu_table.py. Is this
correct?

So far, this is what I have done:

pick_otus.py \
  -i '/home/qiime/Desktop/08092012_Rhizobox_Fungal_ITS_Analysis/sl_out_1.5errors/seqs.fna' \
  -o pick_otus

pick_rep_set.py \
  -i '/home/qiime/Desktop/08092012_Rhizobox_Fungal_ITS_Analysis/pick_otus/seqs_otus.txt' \
  -f '/home/qiime/Desktop/08092012_Rhizobox_Fungal_ITS_Analysis/sl_out_1.5errors/seqs.fna' \
  -r '/home/qiime/qiime_software/unite_taxonomy_21nov2011/unite_ref_seqs_21nov2011.fasta' \
  -o rep_set.fna

I am wondering whether what I have entered so far seems correct. Also,
how does align_seqs.py need to be altered to use the UNITE
reference sequences? I tried passing them with -t, but that was
not correct.

Thank you in advance,
Molli

Jai Ram Rideout

Aug 10, 2012, 2:51:17 PM
to qiime...@googlegroups.com
Hi Molli,

You can either run the full pick_otus_through_otu_table.py script and
then backtrack a bit, or run each of the steps separately as you have
already done. The steps you have performed so far seem correct.

The next step is to assign taxonomy to your representative set. We've
tried the UNITE database out on BLAST and RDP, and found that there
were fewer unclassifiable hits when using BLAST vs. RDP (but BLAST
will always classify down to the lowest taxonomic level, which may not
always be accurate). There is a tutorial that shows how to retrain the
RDP classifier, and the steps will be very similar whether you end up
using BLAST or RDP to assign taxonomy.

Briefly, here are the steps you'll take (following the tutorial noted
above) after downloading the UNITE database:

assign_taxonomy.py -i rep_set.fna \
  -t unite_taxonomy_21nov2011/unite_id_to_taxonomy_map_21nov2011.txt \
  -r unite_taxonomy_21nov2011/unite_ref_seqs_21nov2011.fasta \
  -o pick_otus/rdp_assigned_taxonomy_unite/

make_otu_table.py -i pick_otus/seqs_otus.txt \
  -t pick_otus/rdp_assigned_taxonomy_unite/rep_set_tax_assignments.txt \
  -o pick_otus/otu_table_rdp_unite.biom

You can alternatively use BLAST by passing "-m blast" to
assign_taxonomy.py. From there you can use the new OTU table in
downstream analyses.
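
As a minimal sketch of the BLAST alternative, the call below reuses the same UNITE files as the RDP command and only changes the assignment method. The wrapper function name, the UNITE_DIR variable, and the output directory in the example call are illustrative assumptions, not names QIIME requires; the -i, -t, -r, -m, and -o options are the ones already used in this thread.

```shell
# Directory holding the UNITE release (assumed location; adjust to your setup).
UNITE_DIR="${UNITE_DIR:-unite_taxonomy_21nov2011}"

# Hypothetical wrapper: assign taxonomy to a representative set with BLAST
# instead of the default RDP classifier.
#   $1 = representative sequences (FASTA), $2 = output directory
assign_taxonomy_blast() {
    assign_taxonomy.py \
        -i "$1" \
        -m blast \
        -t "$UNITE_DIR/unite_id_to_taxonomy_map_21nov2011.txt" \
        -r "$UNITE_DIR/unite_ref_seqs_21nov2011.fasta" \
        -o "$2"
}

# Example call (requires QIIME on the PATH):
#   assign_taxonomy_blast rep_set.fna pick_otus/blast_assigned_taxonomy_unite/
```

Writing the BLAST results to their own directory keeps them separate from the RDP assignments, so the two can be compared before building the OTU table.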

For the align_seqs.py step, you won't be able to use PyNAST (the
default alignment method) because we do not have a template alignment
for fungal sequences. I recommend using muscle (i.e. by passing "-m
muscle" to align_seqs.py) instead, and then you won't need to provide
a template alignment.
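
A minimal sketch of that alignment step, assuming the representative set is the rep_set.fna file produced earlier in this thread. The wrapper function name and the output directory in the example call are hypothetical.

```shell
# Hypothetical wrapper: de novo alignment with MUSCLE, so no template
# alignment (-t) is passed to align_seqs.py.
#   $1 = representative sequences (FASTA), $2 = output directory
align_rep_set_muscle() {
    align_seqs.py -i "$1" -m muscle -o "$2"
}

# Example call (requires QIIME and muscle to be installed):
#   align_rep_set_muscle rep_set.fna muscle_aligned/
```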

Please let me know if you get stuck anywhere in this process. I also
recommend reading the README file that is included with the UNITE
database for more info about how the database was created, as well as
limitations we have come up against when using this database.

Thanks,
Jai

Molli Newman

Aug 10, 2012, 2:55:22 PM
to qiime...@googlegroups.com
Thank you so much Jai!

I am unable to try those adjustments out at the moment, but I will do it first thing when I return to the office on Monday and contact you if I run into any more hang-ups.

Thanks again!
Molli

Blanca Landa del Castillo

Aug 10, 2012, 4:34:42 PM
to qiime...@googlegroups.com
Hi Tony
It is nice that you refreshed the UNITE database procedure for analyzing ITS.
I would like to know what the difference is between using these files:

unite_id_to_taxonomy_map_21nov2011.txt

and the unite_taxonomy_mapping_fileterd_numbers.txt
Thanks

Blanca

Tony Walters

Aug 12, 2012, 1:04:56 PM
to qiime...@googlegroups.com
Hello Blanca,

I think you meant Jai, but I do not see the unite_taxonomy_mappinged_filtered_numbers.txt file in the UNITE files.  Where did you get this file?  You probably want to use the unite_id_to_taxonomy_map_21nov2011.txt file for taxonomic assignments in any case.
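
If you want to see how two mapping files differ, one option is to check which reference sequence IDs each file covers. The sketch below is only an illustration: the function name is hypothetical, and it assumes the map is tab-delimited with the sequence ID in the first column and that the FASTA ID is the first token of each header line.

```shell
# Hypothetical helper: list reference sequence IDs that have no entry in an
# ID-to-taxonomy map.
#   $1 = reference FASTA, $2 = ID-to-taxonomy map (tab-delimited, ID in column 1)
ids_missing_from_map() {
    awk -F '\t' '
        # First file argument is the map: remember every ID in column 1.
        FILENAME == ARGV[1] { in_map[$1] = 1; next }
        # Second file argument is the FASTA: report header IDs not in the map.
        /^>/ {
            split(substr($0, 2), tok, /[ \t]/)
            if (!(tok[1] in in_map)) print tok[1]
        }
    ' "$2" "$1"
}

# Example call:
#   ids_missing_from_map unite_ref_seqs_21nov2011.fasta unite_id_to_taxonomy_map_21nov2011.txt
```

An empty result means every reference sequence has a taxonomy entry in that map; a long list suggests the map is a reduced version of the database.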

-Tony

Blanca Landa del Castillo

Aug 12, 2012, 1:47:34 PM
to qiime...@googlegroups.com
Hello Tony
Good question. It is inside the folder, but I have no idea where I downloaded it from. I assumed that I downloaded it from the QIIME web pages, but...
In any case, I did use it and it worked; it seems to be a shortened database. I will run it again with the original file.
Thanks for the input
Blanca

Molli Newman

Aug 13, 2012, 9:35:45 AM
to qiime...@googlegroups.com
Hi Jai,

I went back and was trying to run assign_taxonomy.py. I got the following error message:

qiime@qiime-VirtualBox:~/Desktop/08092012_Rhizobox_Fungal_ITS_Analysis$ assign_taxonomy.py -i '/home/qiime/Desktop/08092012_Rhizobox_Fungal_ITS_Analysis/rep_set.fna' -t '/home/qiime/qiime_software/unite_taxonomy_21nov2011/unite_id_to_taxonomy_map_21nov2011.txt' -r '/home/qiime/qiime_software/unite_taxonomy_21nov2011/unite_ref_seqs_21nov2011.fasta' -o pick_otus/rdp_assigned_taxonomy_unite/
Traceback (most recent call last):
  File "/home/qiime/qiime_software/qiime-1.5.0-release/bin/assign_taxonomy.py", line 226, in <module>
    main()
  File "/home/qiime/qiime_software/qiime-1.5.0-release/bin/assign_taxonomy.py", line 222, in main
    result_path=result_path,log_path=log_path)
  File "/home/qiime/qiime_software/qiime-1.5.0-release/lib/qiime/assign_taxonomy.py", line 364, in __call__
    max_memory=max_memory)
  File "/home/qiime/qiime_software/qiime-1.5.0-release/lib/qiime/pycogent_backports/rdp_classifier.py", line 492, in train_rdp_classifier_and_assign_taxonomy
    training_seqs_file, taxonomy_file, training_dir, max_memory=max_memory)
  File "/home/qiime/qiime_software/qiime-1.5.0-release/lib/qiime/pycogent_backports/rdp_classifier.py", line 464, in train_rdp_classifier
    return app(training_seqs_file)
  File "/home/qiime/qiime_software/qiime-1.5.0-release/lib/qiime/pycogent_backports/rdp_classifier.py", line 320, in __call__
    result = super(RdpClassifier, self).__call__(data=data, remove_tmp=remove_tmp)
  File "/home/qiime/qiime_software/pycogent-1.5.1-release/lib/python2.7/site-packages/cogent/app/util.py", line 250, in __call__
    % (str(exit_status),command)
cogent.app.util.ApplicationError: Unacceptable application exit status: 1, command: cd "/home/qiime/Desktop/08092012_Rhizobox_Fungal_ITS_Analysis/"; java -Xmx1000M -cp "/home/qiime/qiime_software/rdpclassifier-2.2-release/rdp_classifier-2.2.jar" edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker "/tmp/RdpTaxonomy_EzpU59.txt" "/tmp/tmpbS0IQDnrbsMMbTrgtETD.txt" 1 version1 cogent "/tmp/RdpTrainer_2QoXDH" > "/tmp/tmpNROAHs5B2VekxLY2gCQv.txt" 2> "/tmp/tmpuG0GcUWlGGDRep5Zx0h3.txt"

I read a solution to this on the QIIME forum here: https://groups.google.com/forum/?fromgroups#!topic/qiime-forum/ZAwQrnZqETA[1-25]

I downloaded and extracted version 0.982 to my qiime_software folder and removed the previous version (0.981), but I am not sure how to check and make sure it is in my path. As it is now, I re-run the command and get the same error. Any idea what I am doing wrong?

Thanks again,
Molli

Post-Doctoral Fellow
Department of Entomology and Plant Pathology
209 Rouse Life Sciences Building
Auburn University
Auburn, AL 36849 USA
www.wix.com/ramsemm/newman


Jai Ram Rideout

Aug 13, 2012, 11:33:21 AM
to qiime...@googlegroups.com
Hi Molli,

It looks like you upgraded the rtax version, but the issue you are
running into is specific to the RDP classifier. It seems Tony resolved
your issue in a different thread, so for anyone else having this
issue, please see the following post:

https://groups.google.com/d/topic/qiime-forum/HhcIAkdgRxo/discussion
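
Separately, on the question of how to check what the shell can actually find: a minimal sketch is below. The two function names are hypothetical, and using RDP_JAR_PATH as the variable that locates the RDP classifier jar is an assumption about the installation, so substitute whatever your setup uses.

```shell
# Report where the shell finds an executable, or say that it is not on PATH.
#   $1 = executable name
report_on_path() {
    if location=$(command -v "$1" 2>/dev/null); then
        echo "$1 found at: $location"
    else
        echo "$1 not found on PATH"
        return 1
    fi
}

# Report whether an environment variable names an existing file.
#   $1 = variable name (for example RDP_JAR_PATH; assumed, see above)
report_env_file() {
    eval "value=\${$1:-}"
    if [ -n "$value" ] && [ -f "$value" ]; then
        echo "$1 points to existing file: $value"
    else
        echo "$1 is unset or does not point to a file"
        return 1
    fi
}

# Example calls:
#   report_on_path assign_taxonomy.py
#   report_on_path java
#   report_env_file RDP_JAR_PATH
```

If a newly extracted tool is not reported, its directory still needs to be added to PATH in the shell startup file and the terminal reopened.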

-Jai

SoManyGenes!

Aug 13, 2012, 3:03:00 PM
to qiime...@googlegroups.com
Hi Molli and Jai,

Apologies for jumping into the middle of your conversation.  Regrettably, among people who use fungal ITS routinely, that locus is generally considered "unalignable" across distantly related taxa due to the size and frequency of indels (which may be why there is no alignment template). You might have a hard time selling a multiple sequence alignment of ITS to reviewers.


manpreet

Aug 21, 2012, 7:30:31 PM
to qiime...@googlegroups.com
You are quite right about ITS being difficult to align across a range of distantly related fungi.  I used MAFFT (for some reason muscle was having issues with it; it would just hang and crash the system with no error message at the end).  Later on, I looked at this alignment because I wanted to use an alternative program to build a simple phylogeny based on it.  There were large gaps in the alignment (as expected), but to such a degree that ordinary UPGMA or neighbour-joining algorithms based on distance matrices were just unable to make sense of it.  I have fiddled and fiddled with this alignment manually, but to no avail, to the point where I have no confidence in any phylogeny-based metrics that are computed later in the process.  Are there any better options for ITS alignment (maybe something that uses the intercalary regions that are slightly more conserved in the case of long reads) that I don't know of yet?

Thanks

Manpreet

richard rodrigues

Jun 5, 2014, 1:37:18 PM
to qiime...@googlegroups.com
I see this reply was from a long time ago. Do you now have a template
alignment for fungal sequences, and if so, where can I find it?

Thanks.

-Rich
 

Tony Walters

Jun 5, 2014, 1:42:04 PM
to qiime...@googlegroups.com
Hello Rich,

There really isn't a way to make an ITS template alignment for fungi (at least a general one). Your best bet is to group closely related taxa and do de novo alignments (e.g. muscle) on those separately if you need to build a tree for target fungi. For overall community analyses, you'd want to stick to non-phylogenetic metrics, such as Bray-Curtis, for beta-diversity analysis and non-PD metrics for alpha diversity.
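
A minimal sketch of what that can look like on the command line. The wrapper function names and the file names in the example calls are hypothetical, and the particular alpha metrics listed are just one reasonable tree-free selection.

```shell
# Hypothetical wrapper: Bray-Curtis beta diversity, which needs no tree.
#   $1 = OTU table (.biom), $2 = output directory
beta_diversity_bray_curtis() {
    beta_diversity.py -i "$1" -m bray_curtis -o "$2"
}

# Hypothetical wrapper: non-phylogenetic alpha diversity metrics
# (PD_whole_tree is deliberately left out because it requires a tree).
#   $1 = OTU table (.biom), $2 = output file
alpha_diversity_non_pd() {
    alpha_diversity.py -i "$1" -m observed_species,chao1,shannon -o "$2"
}

# Example calls (require QIIME on the PATH):
#   beta_diversity_bray_curtis otu_table.biom beta_div/
#   alpha_diversity_non_pd otu_table.biom alpha_div.txt
```

Neither call takes a tree (-t), so no ITS alignment is needed for these analyses.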

