Fungal ITS database


Sam

Nov 14, 2011, 1:00:42 PM
to Qiime Forum
Hello,
I wonder if there is a qiime-compatible database for fungal ITS
sequences or anybody would like to share it, so that it can be used
for taxa assignment by blast?

There is a BLAST database available for fungal ITS, but I do not know
how to convert it into a QIIME-compatible format.
Any suggestion or instruction is greatly appreciated.

Thanks,
Xueju

Chris B

Nov 15, 2011, 8:11:07 AM
to Qiime Forum
Hi Xueju,

I have a python script that goes through the ncbi's nt database,
generates the taxonomy mapping file that qiime requires to use blast,
and generates a list of accession numbers that you can use as an input
to the ncbi's alias tool to make a custom database. I use this to
restrict the database to sequences labelled as fungi. It does not
specifically pull out ITS-labelled sequences, but I find it works OK
to blast ITS against the resulting set anyhow. If you did want to
restrict the database to ITS, I think you could probably do that as a
preliminary step.
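
As a rough illustration of the filtering step described above (not Chris's actual script), the sketch below assumes the lineage of each accession has already been looked up and stored as a semicolon-separated string. The `records` dictionary, the accession names, and both function names are hypothetical.

```python
def select_fungal_accessions(lineages):
    """Return the sorted accessions whose lineage contains the name 'Fungi'."""
    selected = []
    for accession, lineage in lineages.items():
        names = [name.strip() for name in lineage.split(";")]
        if "Fungi" in names:
            selected.append(accession)
    return sorted(selected)


def format_taxonomy_map(lineages, accessions):
    """Build tab-separated 'accession<TAB>lineage' lines for the given accessions."""
    return "".join(f"{acc}\t{lineages[acc]}\n" for acc in accessions)


# Hypothetical accession -> lineage lookups, for illustration only.
records = {
    "ACC0001": "Eukaryota;Fungi;Ascomycota",
    "ACC0002": "Bacteria;Proteobacteria",
    "ACC0003": "Eukaryota;Fungi;Basidiomycota",
}
fungal = select_fungal_accessions(records)
accession_list = "".join(acc + "\n" for acc in fungal)  # one accession per line
taxonomy_map = format_taxonomy_map(records, fungal)
```

The two strings would then be written to disk: one as the accession list for the alias tool, the other as the taxonomy mapping file.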

An alternative would be to use the Unite database, and maybe this is
what you are referring to? Here it's a matter of going through the
fasta file that can be downloaded from their site and generating the
qiime taxonomy mapping file from the sequence identifier lines in that
file, which can also be done with python. Then it's just a matter of
turning the fasta file into a blast-formatted database using the
ncbi's tool.
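
The header-parsing step could be sketched as follows. The header layout used here (`>ID|lineage`) is an assumption made for the example and may not match the actual UNITE download; the function name and the sample sequences are hypothetical as well.

```python
def build_id_to_taxonomy(fasta_lines, sep="|"):
    """Split each FASTA header into an ID and a lineage.

    Returns (id_to_taxonomy, cleaned_fasta_lines); the cleaned FASTA keeps
    only the ID on each header line so labels match the mapping file exactly.
    """
    id_to_taxonomy = {}
    cleaned = []
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            seq_id, _, lineage = line[1:].partition(sep)
            seq_id = seq_id.strip()
            # Headers without a lineage get a placeholder instead of an empty field.
            id_to_taxonomy[seq_id] = lineage.strip() or "Unknown"
            cleaned.append(">" + seq_id)
        elif line:
            cleaned.append(line)
    return id_to_taxonomy, cleaned


example = [
    ">SEQ1|Fungi;Ascomycota;Sordariomycetes",
    "ACGTACGT",
    ">SEQ2|Fungi;Basidiomycota;Agaricomycetes",
    "TTGACCA",
]
mapping, fasta = build_id_to_taxonomy(example)
# One "ID<TAB>lineage" line per sequence, ready to be written to a text file.
mapping_lines = [f"{seq_id}\t{lineage}" for seq_id, lineage in mapping.items()]
```

The cleaned FASTA is what would then be turned into a BLAST-formatted database.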

If you drop me a line at cba...@oeb.harvard.edu I'd be more than happy
to share code.

Cheers,
Chris

Jai Ram Rideout

Nov 15, 2011, 10:55:56 AM
to qiime...@googlegroups.com
Hi Xueju and Chris,

We are currently working on getting the UNITE ITS database into a format
compatible with QIIME. We've been getting a lot of unclassified
sequences when using the RDP classifier, but far fewer when using BLAST,
so we are looking into this issue before we post the files publicly. The
files should be available shortly.

Thanks,
Jai

Greg Caporaso

Nov 15, 2011, 11:02:35 AM
to qiime...@googlegroups.com
Thanks Jai!

Jai's files will be going up on the 'Data Files' section of the QIIME
website (http://qiime.org/home_static/dataFiles.html) when they're
ready.

Greg

garlicscape

Dec 3, 2011, 8:36:27 PM
to Qiime Forum
I just tried using this ITS database, and it is resulting in a very
large number of unknowns, even at phylum level. Any idea why?
Before I was using a database I made on my own. I downloaded AFTOL
sequences from ncbi and formatted them myself. This home-made database
was only 600 sequences, but I was getting far fewer unknowns.
I am using default values for clustering, skipping aligning steps, and
assigning taxonomy using blast method.

Sarah

Greg Caporaso

Dec 3, 2011, 10:09:32 PM
to qiime...@googlegroups.com
Hi Sarah,
There are no great ITS reference collections that we're aware of. We
put this one together for one specific study, but from what I
understand the UNITE database is heavily biased toward ectomycorrhizal
fungi, as that was the original purpose of the database. We mention the issue with
many unclassified sequences in the README.txt associated with the
unite_taxonomy_21nov2011.zip file.

If you have a reference collection that achieves better results we'd
definitely be interested in trying it out. Is this something you're
able to share with us?

Greg

Nigel Percy

Dec 4, 2011, 5:45:46 PM
to Qiime Forum
Hi,

I'm keeping an eye on this thread because I'm just starting to move
into fungal sequences (still have to work out primers that work with
the sequencing), but I noticed talk of trying to find a decent
database. Through one of the papers I have read I found this database
http://www.emerencia.org/fungalitspipeline.html (click the "download"
for the actual fasta file). Its taxonomy is not laid out in the format
the RDP classifier expects, but it has a lot of ITS1 sequences in it (and
the strains I was looking at). I have no idea (but am watching in case I
can learn) how to turn this into a usable database for QIIME, but it
might be of use to those who can.

Nigel

Sam

Dec 5, 2011, 9:51:37 AM
to Qiime Forum
Hi Sarah and Greg,
I had the same problem as described by Sarah.
To reduce the percentage of unknown fungi in the assignments, one strategy
is to remove all unknown fungal sequences from the unite_ref_seqs file.

For example, we have one fungal ITS sequence which is assigned as
Root;Fungi;Unknown;Unknown;Unknown;Unknown7637 by BLAST in QIIME.
If we use blastn on the UNITE website, we get better phylogenetic
resolution:

Sequences producing significant alignments:  Score (bits)  E value
EF031111 fungal sp W303                              1043  0.0
HQ445982 uncultured fungus                           1035  0.0
EU725701 fungal sp Q1                                1021  0.0
EU725700 fungal sp M23                               1021  0.0
EU725698 fungal sp M21                               1021  0.0
EU725696 fungal sp M18                               1021  0.0
EU725675 fungal sp F13                               1021  0.0
EU725674 fungal sp F12                               1021  0.0
EF126341 fungal sp WD34A                             1021  0.0
EU240135 fungal sp WD32A                             1011  0.0
EF434139 uncultured fungus                           1011  0.0
HM589305 Mucoromycotina sp BEA_2010                  1001  0.0
AY969842 uncultured fungus                           1001  0.0
FJ553914 uncultured Mortierella                       999  0.0
EU240133 Mortierella sp WD2G                          997  0.0
EU240132 fungal sp WD2F                               997  0.0
EU240130 Mortierella sp WD25F                         997  0.0

If we remove those "fungal sp." near the top, Blast in qiime will
assign my sequence to Mucoromycotina or Mortierella sp. so that we can
get a phylogenetic resolution to a phylum level or finer.
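
A minimal sketch of that filtering strategy, assuming lineages of the form `Root;Fungi;<phylum>;...` as in the example assignment above. The reference IDs, placeholder lineages, and function name are hypothetical, and dropping references this way also changes which hit BLAST can report as best, so the results are worth spot-checking.

```python
def drop_unknown_references(id_to_taxonomy, sequences,
                            markers=("unknown", "unidentified")):
    """Keep only references whose phylum-level (third) field is informative."""
    kept_tax, kept_seqs = {}, {}
    for seq_id, lineage in id_to_taxonomy.items():
        levels = [level.strip() for level in lineage.split(";")]
        phylum = levels[2].lower() if len(levels) > 2 else ""
        if not phylum or any(phylum.startswith(m) for m in markers):
            continue  # uninformative at phylum level: leave it out
        kept_tax[seq_id] = lineage
        if seq_id in sequences:
            kept_seqs[seq_id] = sequences[seq_id]
    return kept_tax, kept_seqs


# Placeholder data for illustration only.
taxonomy = {
    "REF_A": "Root;Fungi;Unknown;Unknown;Unknown;Unknown7637",
    "REF_B": "Root;Fungi;PhylumX;OrderX;FamilyX;GenusX",
}
seqs = {"REF_A": "ACGTACGT", "REF_B": "TTGACCA"}
kept_tax, kept_seqs = drop_unknown_references(taxonomy, seqs)
```

Both the filtered taxonomy map and the filtered sequences would need to be written out, and the BLAST database rebuilt from the filtered FASTA.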

Xueju

kyle.j.bibby

Dec 5, 2011, 10:34:13 AM
to Qiime Forum
All,
We have had the best luck avoiding the "unnamed" problem by using the
"named" database constructed in "A software pipeline for processing and
identification of fungal ITS sequences" by Nilsson et al.
kyle

garlicscape

Dec 6, 2011, 7:30:42 PM
to Qiime Forum
Ok, I'll try removing all unknowns to see how that works.

I am happy to share my database, but I feel it is not very
professionally made. Basically I went to NCBI and searched for:
fungi[orgn] NOT (unknown OR uncultured) AND internal transcribed
spacer [title] AND AFTOL [word]. I downloaded these as an INSDSeq XML
file. A year ago this resulted in ~600 sequences (which I hope are
representative of the fungal kingdom). Then I managed to open in
excel, but had to do some lengthy editing because I couldn't figure
out how to convert .xml to .xls. It wasn't TOO bad because there was
some consistency between entries, but the most time consuming part was
going through each line one by one to make sure only 6 taxa levels
were listed, and that they were the correct levels (ie no sub-levels).
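
The manual check for six levels could in principle be automated. The sketch below assumes each record's lineage is available as (rank, name) pairs and that the six wanted levels are kingdom through genus; both are assumptions, since the post does not say which six levels were kept. The function name is hypothetical.

```python
MAIN_RANKS = ("kingdom", "phylum", "class", "order", "family", "genus")


def six_level_lineage(ranked_names, ranks=MAIN_RANKS, missing="Unknown"):
    """Keep only the main ranks, in order, skipping sub-levels.

    Any main rank absent from the input is filled with a placeholder so
    every lineage ends up with the same number of levels.
    """
    by_rank = {rank: name for rank, name in ranked_names}
    return ";".join(by_rank.get(rank, missing) for rank in ranks)


full = [
    ("kingdom", "Fungi"), ("subkingdom", "Dikarya"),
    ("phylum", "Ascomycota"), ("subphylum", "Pezizomycotina"),
    ("class", "Sordariomycetes"), ("order", "Hypocreales"),
    ("family", "Nectriaceae"), ("genus", "Fusarium"),
]
partial = [("kingdom", "Fungi"), ("phylum", "Ascomycota"), ("genus", "Fusarium")]

lineage_full = six_level_lineage(full)
lineage_partial = six_level_lineage(partial)
```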

I could not figure out what file formats were required for RDP
analysis, but it seemed even more complicated, so I made the two files
required for the blast method in assign taxonomy. It seems like that's
what should be used anyway, since ITS is hyper-variable and cannot be
aligned and therefore arguably should not be used to develop
phylogenetic trees.

I'm happy to share the files, if anyone still wants them after reading
about these clumsy methods, but I don't know how to attach on this
forum :)

Sarah

Greg Caporaso

Dec 6, 2011, 10:15:07 PM
to qiime...@googlegroups.com
Thanks for offering to share Sarah. Users who are logged in to Google Groups (and members of the QIIME Forum) can 'Reply to Author' to reply to you directly, so that could be a good way to share. We also have a few suggestions for how to share files on this post:


It would be good to compare these different data sets at some point.

Greg

Nicolas Rascovan

Mar 29, 2012, 2:25:51 PM
to qiime...@googlegroups.com
Hi everybody,

I'm starting to analyse some ITS1-ITS2 amplicons with QIIME. I've found another DB to use with fungal sequences on this webpage: http://www.emerencia.org/fungalitspipeline.html
(click on download). This DB might not be really up to date (it seems to be from 2009), and as I am not a mycologist I don't know how reliable it is. I formatted this DB for use with the RDP classifier in QIIME and did a comparative analysis of this DB vs. UNITE and Silva. I found much better results with this one. Silva, of course, doesn't work at all.
In many cases where UNITE could classify only to phylum or class level, this one could go further. In general the results are similar; it's just that this DB goes further. Plus, this DB has 95944 sequences whereas UNITE has only 30379.

Nicolas.

Blanca B. Landa

Mar 29, 2012, 5:30:20 PM
to qiime...@googlegroups.com
Dear Nicolas
I tried the UNITE database before but would like to try this new database. Would you mind sharing it?
Thanks
Blanca
****************************************************************************************
Dra. Blanca B. Landa del Castillo
Instituto de Agricultura Sostenible
Consejo Superior de Investigaciones Científicas (CSIC)
Alameda del Obispo s/n
Apdo. 4084
14080-Córdoba
España

Fax:+34.957.499252
Tfno.:
+34.957.499279
+34.689.576177
e-mail:
blanca...@ias.csic.es
ag2l...@uco.es
*****************************************************************************************

Nicolás Rascovan

Mar 29, 2012, 5:41:00 PM
to qiime...@googlegroups.com
Hi Blanca,

Of course I can share it. I formatted it myself for use with QIIME, so I hope I did it OK. It worked for me at least, but to be honest it has not been thoroughly tested. I recommend that you validate your results from this database against those obtained with UNITE, or by BLAST against NCBI, for at least some sequences.

I couldn't attach the DB (only 6mb compressed) in this message but if you give me your email I can send it to you.

Good Luck,

Nicolas.


PJ_FSU

Nov 8, 2012, 12:59:38 PM
to qiime...@googlegroups.com
Dear Nicolas, 

I am using QIIME to analyse my ITS amplicons. I would like to try this new database. Would you mind sharing it?

Thanks,

Puja
Graduate Student 
Department of Earth, Ocean and Atmospheric Sciences (EOAS)
Florida State University
1060 Atomic Way, Bldg 42, RM 326 NRB
Tallahassee, FL 32306

Nicolás Rascovan

Nov 8, 2012, 7:56:08 PM
to qiime...@googlegroups.com
Yes, sure, no problem. I'll send it to your personal email as qiime forum does not allow the DB size. 

Cheers,

Nico



casa...@ceinge.unina.it

Nov 9, 2012, 12:36:33 AM
to qiime...@googlegroups.com
Dear Nicolas,

I would love to try your ITS db since I'm studying fungal communities in the human gut. Could you be so kind to email it to me too?

Thank you in advance,
Giorgio

casa...@ceinge.unina.it

Nicolás Rascovan

Nov 9, 2012, 9:26:24 AM
to qiime...@googlegroups.com
Sure Giorgio, 

I'll email it to you. 

Cheers,

Branislavik

Nov 22, 2012, 11:17:03 AM
to qiime...@googlegroups.com

Dear Nicolas,

I found out that they probably have a new release of the database at http://www.emerencia.org/fungalitspipeline.html, which is supposed to be from March 2012. Did you make your DB from this version or from the older one?

If you made it from the older one, do you think it would be possible to build an "updated" version the same way as the previous one?

Best regards,

Branislav

Branislavik

Nov 22, 2012, 11:22:34 AM
to qiime...@googlegroups.com

Dear all,

I recently found also this database:

http://itsonedb.ba.itb.cnr.it:8080/ITS1/

Does anyone have any experience with it???

I was able to use it just for usearch_qf chimera removal, but not for other steps... I had to BLAST the representative sequences against NCBI nr database and transform the results and create taxonomy file manually, but it worked...

Do you think it would be possible/reasonable to somehow make this db "qiime compatible"???

Best regards,

Branislav

Tony Walters

Nov 22, 2012, 12:04:24 PM
to qiime...@googlegroups.com
Hello Branislav,

Making the id to taxonomy mapping file work with BLAST is straightforward with QIIME, as described here: http://qiime.org/documentation/file_formats.html#id-to-taxonomy-map

If you want to use RDP for taxonomic assignments, the requirements are more particular, see:

It's also important that the reference fasta file has IDs that are exact matches to the first column values in the id to taxonomy mapping file (nothing else in the label but the id).

Troubleshooting problems (e.g. empty taxonomic levels, ";;") can be tedious as the error messages generally are not informative, so it can get frustrating.
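
Since the error messages are not informative, a small pre-flight check can catch the two problems mentioned above before QIIME is run. This is only a sketch with hypothetical names and sample data; it is not part of QIIME.

```python
def validate_reference(fasta_labels, id_to_taxonomy):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    ids = set(id_to_taxonomy)
    # Every FASTA label must be exactly an ID from the mapping file.
    for label in fasta_labels:
        if label not in ids:
            problems.append(f"FASTA label not in taxonomy map: {label!r}")
    # Every mapped ID should also have a sequence.
    for seq_id in sorted(ids - set(fasta_labels)):
        problems.append(f"taxonomy ID has no sequence: {seq_id!r}")
    # Empty levels such as ';;' are reported explicitly.
    for seq_id, lineage in id_to_taxonomy.items():
        if any(not level.strip() for level in lineage.split(";")):
            problems.append(f"empty taxonomic level in {seq_id!r}: {lineage!r}")
    return problems


labels = ["SEQ1", "SEQ2 extra description"]
taxonomy = {
    "SEQ1": "Fungi;Ascomycota;;Hypocreales",
    "SEQ2": "Fungi;Basidiomycota",
}
problems = validate_reference(labels, taxonomy)
```

Here `labels` stands for the text after `>` on each header line of the reference FASTA file.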

-Tony



Branislavik

Nov 22, 2012, 12:25:25 PM
to qiime...@googlegroups.com

Maybe just a really stupid question but to be sure I understand it correctly:

by the sequence ID you mean what is written after ">" in the fasta file which I will use as a ref seqs database file, right?

Tony Walters

Nov 22, 2012, 12:42:28 PM
to qiime...@googlegroups.com
Yes, that's the fasta label/sequence ID.


Nicolás Rascovan

Nov 22, 2012, 5:49:26 PM
to qiime...@googlegroups.com
Hi Branislav,

I downloaded the new version and formatted it for use in QIIME. I haven't tried it, though. Give it a try and let me know if it works. I'll send it to you in a separate email.

Cheers,

Nico



Greg Caporaso

Nov 24, 2012, 11:25:32 AM
to Qiime Forum
Hi Nico,
I haven't used this one, but we will be putting a very preliminary version of ITS reference OTUs derived from the UNITE database online very shortly (within a week max). We'll post to the QIIME Blog when that is ready.

Greg


Nicolás Rascovan

Nov 26, 2012, 9:04:50 PM
to qiime...@googlegroups.com
Hi Greg,

Thanks a lot for your answer. You told me some months ago that you were planning to release that db with Scott Bates. I look forward to trying it. It will be very useful for the community, as the one I'm sharing is not properly done, just a quick adaptation from another available db.

Cheers,

Nico

Greg Caporaso

Nov 26, 2012, 9:16:25 PM
to Qiime Forum
Yeah, it's been chronically delayed, but really should be out this week. Keep an eye on the forum/blog.

Greg



urooj

Apr 15, 2013, 8:26:17 AM
to qiime...@googlegroups.com
Is it possible for me to get the database for ITS taxonomy?

Urooj



Nicolás Rascovan

Apr 15, 2013, 8:40:27 AM
to qiime...@googlegroups.com
Hi Urooj,


Let me know in case you want the one I was sharing anyway (that is not as good as the one in Qiime). 

Best,

Nico.



--
 
---
You received this message because you are subscribed to a topic in the Google Groups "Qiime Forum" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/qiime-forum/v57AXiNsm9c/unsubscribe?hl=en-US.
To unsubscribe from this group and all its topics, send an email to qiime-forum...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

urooj

Apr 15, 2013, 8:50:37 AM
to qiime...@googlegroups.com
I think this one is good. Thanks

urooj

Apr 15, 2013, 12:14:50 PM
to qiime...@googlegroups.com
Hello
Would somebody please send me the link to a tutorial for fungal ITS analysis in QIIME? I could only find the following, and I think it's not correct:



Many thanks 

Urooj

Tony Walters

Apr 15, 2013, 12:25:55 PM
to qiime...@googlegroups.com
Hello,

Here is the direct link for the 12_11 ITS download: https://github.com/downloads/qiime/its-reference-otus/its_12_11_otus.tar.gz

-Tony



urooj

Apr 15, 2013, 12:36:09 PM
to qiime...@googlegroups.com
Yes, I already downloaded it, but I am asking about the commands needed after splitting libraries.

Regards

Urooj

Tony Walters

Apr 15, 2013, 12:42:30 PM
to qiime...@googlegroups.com
It's going to be very similar to the 16S (standard pipeline), described here: http://qiime.org/tutorials/tutorial.html#picking-operational-taxonomic-units-otus-through-making-otu-table

You will want to supply a qiime_parameters.txt file to point to the reference sequences/taxonomy mapping file (make sure you unzip all of the files) if you call any workflows, like pick_otus_through_otu_table.py.  See http://qiime.org/documentation/qiime_parameters_files.html
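
As a hedged example, the parameters file could be generated like this. The option names are the long option names of assign_taxonomy.py as I understand them and should be checked against the script's help for your QIIME version; the two paths are placeholders, not the actual layout of the ITS download, and the function name is hypothetical.

```python
def make_parameters_text(reference_seqs_fp, id_to_taxonomy_fp, method="blast"):
    """Build the text of a parameters file, one 'script:option value' per line."""
    lines = [
        f"assign_taxonomy:assignment_method {method}",
        f"assign_taxonomy:reference_seqs_fp {reference_seqs_fp}",
        f"assign_taxonomy:id_to_taxonomy_fp {id_to_taxonomy_fp}",
    ]
    return "\n".join(lines) + "\n"


# Placeholder paths: replace with the unzipped reference files on your system.
params_text = make_parameters_text(
    "/path/to/its_reference_seqs.fasta",
    "/path/to/its_id_to_taxonomy.txt",
)
print(params_text)
```

The printed text would be saved as qiime_parameters.txt and passed to the workflow script.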

-Tony

Andrea Campisano

Apr 8, 2015, 12:46:01 PM
to qiime...@googlegroups.com
Dear users.
I have used QIIME 1.8 and a freshly downloaded UNITE database (March-April 2015) to analyse some sequences from plant-associated fungi amplified using ITS1-4 primers.

The analysis runs smoothly, but when I assign taxonomy, 70-80% of my sequences are named "k__Fungi;p__Ascomycota;Other;Other;Other;Other" or "k__Fungi;p__Ascomycota;c__unidentified;o__unidentified;f__unidentified;g__unidentified" (I take it that "unidentified" and "Other" mean the same thing).
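
To put a number on this when comparing databases, one could tabulate how deep each assignment is resolved. The sketch below assumes lineage strings in the format quoted above and treats "Other", "unidentified" and "Unknown" as unresolved; that set of markers and the function names are assumptions.

```python
from collections import Counter

UNRESOLVED = {"other", "unidentified", "unknown", ""}


def resolved_depth(lineage):
    """Count the leading levels that carry a real name."""
    depth = 0
    for level in lineage.split(";"):
        # Strip rank prefixes such as 'p__' before comparing.
        name = level.strip().split("__")[-1].lower()
        if name in UNRESOLVED:
            break
        depth += 1
    return depth


def depth_summary(lineages):
    """Map each resolved depth to the fraction of assignments reaching it."""
    if not lineages:
        return {}
    counts = Counter(resolved_depth(lineage) for lineage in lineages)
    return {depth: counts[depth] / len(lineages) for depth in sorted(counts)}


assignments = [
    "k__Fungi;p__Ascomycota;Other;Other;Other;Other",
    "k__Fungi;p__Ascomycota;c__unidentified;o__unidentified;f__unidentified;g__unidentified",
    "k__Fungi;p__Basidiomycota;c__Agaricomycetes;o__Agaricales;f__Amanitaceae;g__Amanita",
]
summary = depth_summary(assignments)
```

A depth of 2 here means the assignment stops at phylum level.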

I am worried that the UNITE database may not be quite the right one for these data, so I am curious whether you have had better experiences with other ones. The messages in this thread are quite old, so I wonder if newer, more comprehensive databases have come out in the meantime that are as broad as possible.

Thanks in advance for any help you can give me; I am really hitting a concrete wall otherwise.
Cheers
Andrea

Kyle Bittinger

Apr 8, 2015, 1:06:18 PM
to qiime...@googlegroups.com
Andrea,

A few years ago, a graduate student from my lab made a program to assign ITS reads, called Brocc.  I advised her on the initial project, and have now taken over maintenance of the software.  This tool may help with your ITS assignments.  I have used it in a few recent papers, listed below.

The strategy employed in Brocc is to use a large, messy database (nt) instead of a fungal only database.  We find that there seems to be some legitimate non-fungal reads that pop up with our ITS primers (see papers for sequence), and this tool helps to identify those cases.  The Brocc software also adjusts the percent identity threshold as you go up the taxonomic ranks, which helps.  Most people are (rightly) hesitant to use a tool that BLASTs against the nt database, but I'd urge you to try it and see what you think.  You should compare the calls from Brocc to the results from UNITE and also think about how you would classify a few reads manually, just as an example.  Let me know what your thoughts are.
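
The idea of a rank-dependent identity threshold can be illustrated with a toy example. This does not reproduce Brocc's algorithm, and the threshold values are invented purely for illustration; the function name and the example hit are hypothetical.

```python
RANK_ORDER = ["phylum", "class", "order", "family", "genus"]

# Invented thresholds (percent identity); not Brocc's values.
MIN_IDENTITY = {
    "phylum": 75.0,
    "class": 80.0,
    "order": 85.0,
    "family": 90.0,
    "genus": 95.0,
}


def assign_with_rank_thresholds(hit_lineage, percent_identity):
    """Keep each rank only while the hit's identity meets that rank's threshold."""
    assigned = []
    for rank in RANK_ORDER:
        if rank not in hit_lineage or percent_identity < MIN_IDENTITY[rank]:
            break  # finer ranks demand higher identity, so stop here
        assigned.append(hit_lineage[rank])
    return assigned


hit = {
    "phylum": "Basidiomycota",
    "class": "Agaricomycetes",
    "order": "Agaricales",
    "family": "Amanitaceae",
    "genus": "Amanita",
}
family_level = assign_with_rank_thresholds(hit, 91.5)
unassigned = assign_with_rank_thresholds(hit, 70.0)
```

With these made-up numbers, a 91.5% hit is trusted down to family but not genus, and a 70% hit is not trusted at any rank.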

To use this tool with QIIME, run Brocc according to the instructions, then use the "standard taxonomy" output file with the QIIME script make_otu_table.py.  This will create a new OTU table with taxonomic assignments from Brocc.  You can then proceed from there as usual.




Best,
Kyle




Andrea Campisano

Apr 8, 2015, 2:37:19 PM
to qiime...@googlegroups.com
Dear Kyle
thanks a lot for the information.
I will try it and, if I can, I will give feedback here on how it performed.

Indeed, ITS reads are at the moment a bit of a challenge in terms of taxonomic assignment, as no established pipeline exists as it does, for example, for 16S reads.

I am also interested in knowing how many unassigned/"Other" taxa people who deal with fungal ITS reads get when using the standard UNITE-QIIME protocol.

Andrea

Andrea Campisano

Apr 9, 2015, 4:57:44 AM
to qiime...@googlegroups.com
Dear Kyle,
I have an issue and if you still read this you may have encountered it before.
Upon running the brocc.py command I receive this error:

 brocc.py -i //FungalITSPipeline/indata/indata.fasta -b //FungalITSPipeline/BLASTdatabase/fungalITSdatabase.fasta -o //FungalITSPipeline/outdata/out -a ITS
Traceback (most recent call last):
  File "/usr/local/bin/brocc.py", line 5, in <module>
    main()
  File "/usr/local/lib/python2.7/dist-packages/brocclib/command.py", line 92, in main
    taxa_db.load_cache()
  File "/usr/local/lib/python2.7/dist-packages/brocclib/get_xml.py", line 27, in load_cache
    if os.path.exists(self.cache_fp):
  File "/usr/lib/python2.7/genericpath.py", line 18, in exists
    os.stat(path)
TypeError: coercing to Unicode: need string or buffer, NoneType found

This error occurs when using any fasta files as input. the first

Are you familiar with this problem?
Thanks very much

Andrea




Kyle Bittinger

Apr 9, 2015, 8:56:56 AM
to qiime...@googlegroups.com
I answered this in your private email, will provide solution here when we identify the problem.

Jay T

Apr 8, 2016, 10:50:05 AM
to Qiime 1 Forum
Kyle did you ever determine the issue?