None of my ITS sequences are mapping to the new UNITE database

Jay T

Apr 7, 2016, 6:27:24 PM

to qiime...@googlegroups.com

I've run my ITS analyses (open-reference OTU workflow) without a hitch using the following tutorial: http://nbviewer.jupyter.org/github/biocore/qiime/blob/1.9.1/examples/ipynb/Fungal-ITS-analysis.ipynb

However, I was getting a lot of unassigned sequences (20%-50%) and thought the cause might be the database, so I decided to find out whether a newer one was available.

I found this on the UNITE ITS Fungal website: https://unite.ut.ee/repository.php

I used the latest version, keeping all my parameters the same, but when I looked at my new taxa plots, none of the OTUs in the OTU table appeared to have hit any taxa. Essentially, all of my bar plots were red (unassigned).

What is going on here? Can someone help me?

Thanks,

JT

jonsan

Apr 8, 2016, 12:09:27 AM

to Qiime 1 Forum

Hi JT,

Which exact UNITE database file are you using?

You will likely need to modify the params.txt file to point to the new reference sequence file and reference taxonomy annotation file. Is it possible that you passed a new filepath to the pick_open_reference_otus.py script using the -r flag, but didn't modify the params.txt file?
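A minimal sketch of what that edit could look like, written as a small Python helper that generates the parameters file. The `assign_taxonomy:reference_seqs_fp` and `assign_taxonomy:id_to_taxonomy_fp` keys follow QIIME 1's `script_name:option_name value` parameters-file convention; the helper name `write_params` and every file path are hypothetical placeholders, not taken from this thread.

```python
from pathlib import Path


def write_params(params_path, ref_fasta, ref_taxonomy, method=None):
    """Write a QIIME 1 parameters file that points assign_taxonomy.py
    at a new reference FASTA and its matching taxonomy file.

    All paths are placeholders; substitute the files from your own
    UNITE download.
    """
    lines = [
        f"assign_taxonomy:reference_seqs_fp {ref_fasta}",
        f"assign_taxonomy:id_to_taxonomy_fp {ref_taxonomy}",
    ]
    # Optionally pin the assignment method (e.g. "blast" or "uclust").
    if method is not None:
        lines.append(f"assign_taxonomy:assignment_method {method}")
    Path(params_path).write_text("\n".join(lines) + "\n")
    return lines
```

Calling `write_params("params.txt", "unite_refs.fasta", "unite_taxonomy.txt")` would produce a two-line file. The point is that the reference passed with `-r` and the two taxonomy-assignment entries should all come from the same database release.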

Cheers,

-jon

Jay T

Apr 8, 2016, 12:20:08 AM

to Qiime 1 Forum

The latest one, version 7: https://unite.ut.ee/repository.php. I have
modified params.txt to use the reference FASTA (97% similarity) and the
reference taxonomy (97%). I have tried both blast and uclust. I have also
changed the similarity threshold to 90% and still get 100% unassigned reads.
I am currently testing 75%, but QIIME seems to be taking much longer.

I had none of these issues using the its_12_11 reference FASTA and taxonomy
mentioned in the QIIME Illumina ITS tutorial, but those seem to be very
outdated.

Thanks,
JT

jonsan

Apr 8, 2016, 12:38:21 AM

to Qiime 1 Forum

Hmmm, looking back through some old threads, it seems others have had issues with newer versions of UNITE as well. One explanation is that the newer versions have had the SSU/LSU overhangs trimmed off, while those overhangs might still be present in your sequences. Trying one of the proposed solutions in that thread might be helpful. In the meantime, I'll ask around and see if anyone has experience with the most recent version.

Cheers,

-jon

Jay T

Apr 8, 2016, 12:41:36 AM

to Qiime 1 Forum

Can you link the full path to the website? Your hyperlink just directs to the Google group webpage.

Thanks,
JT

TonyWalters

Apr 8, 2016, 3:28:07 AM

to Qiime 1 Forum

Hello Jay,

There should be a /developer/ folder in your UNITE database files. Can you try pointing to the FASTA files in that folder for OTU picking and taxonomic assignment, and see if that improves the results? That should deal with the trimmed-off SSU/LSU overhangs that Jon was referring to.

jonsan

Apr 8, 2016, 11:58:47 AM

to Qiime 1 Forum

Strange... see if this link works.

-jon

Jay T

Apr 8, 2016, 1:29:43 PM

to qiime...@googlegroups.com

Tony -

The developer folder worked fine. However, 30-77% of the sequences in my 30 samples are still showing up as unassigned, compared to around 15-50% when I used the older ITS 12_11 database. Any suggestions?

I used uclust as the default OTU picking strategy at default similarity. In addition, I attempted to use both the developer reference files clustered at 97 and 99%. I

Thanks,

JT

TonyWalters

Apr 8, 2016, 1:35:43 PM

to Qiime 1 Forum

Can you try blasting some of your representative sequences on NCBI and see if you get 100% identity hits? If not, there may be PCR artifacts, which would usually be at the start and/or end of the reads.

You might also try using blast as the assignment method. You could manually run assign_taxonomy.py -m blast on your representative set, specifying a new output folder (e.g. assignments_blast), and then use make_otu_table.py to build an OTU table from the already clustered data (there should be a seqs_otus.txt file in your open reference output folder that is your OTU mapping file) and the new assignments. Then check whether the assignments have improved.
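The two steps above can be sketched as a small Python helper that only assembles the command lines and executes nothing. The flags used (`-i`, `-r`, `-t`, `-m`, `-o`) are the standard short options of `assign_taxonomy.py` and `make_otu_table.py` in QIIME 1. The function names, all file names, and the assumption that the assignments file is written as `<rep set name>_tax_assignments.txt` inside the output folder are illustrative and should be checked against your own run.

```python
import os
import shlex


def blast_reassignment_commands(rep_set, ref_fasta, ref_taxonomy, otu_map,
                                out_dir="assignments_blast",
                                otu_table="otu_table_blast.biom"):
    """Return the two QIIME 1 commands as argument lists (nothing is run)."""
    # Assumption: assign_taxonomy.py names its output after the input file,
    # e.g. rep_set.fna -> <out_dir>/rep_set_tax_assignments.txt
    stem = os.path.splitext(os.path.basename(rep_set))[0]
    assignments = os.path.join(out_dir, stem + "_tax_assignments.txt")

    # Step 1: re-assign taxonomy to the representative set with blast.
    assign_cmd = ["assign_taxonomy.py", "-i", rep_set, "-r", ref_fasta,
                  "-t", ref_taxonomy, "-m", "blast", "-o", out_dir]
    # Step 2: rebuild the OTU table from the existing OTU map plus the
    # new assignments.
    table_cmd = ["make_otu_table.py", "-i", otu_map, "-t", assignments,
                 "-o", otu_table]
    return [assign_cmd, table_cmd]


def as_shell(cmd):
    """Render an argument list as a copy-pasteable shell command."""
    return " ".join(shlex.quote(part) for part in cmd)
```

Printing `as_shell(cmd)` for each returned command gives lines that can be pasted into a terminal where QIIME 1 is active. Because the OTU map is reused, the clustering itself is untouched and only the taxonomy labels change.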

Jay T

Apr 14, 2016, 2:07:59 PM

to Qiime 1 Forum

Hey Tony - I meant to report that the blast assignment works much better than uclust for some of these alternate databases. I followed your instructions and it worked great! I went from 70% unassigned to around 1%. Huge difference!!!
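For anyone wanting to make the same before/after comparison without opening the bar plots, here is a rough Python sketch that counts unassigned entries in a tab-separated taxonomy assignments file. It assumes the first column is the OTU ID and the second the taxonomy string, and that unassigned entries are labelled "Unassigned" or "No blast hit"; both the labels and the function name are assumptions to verify against your own files. It counts OTUs rather than reads, so the number will not match the read-weighted percentages shown in the taxa plots.

```python
import csv

# Assumed labels for "no assignment"; check what your own files contain.
UNASSIGNED_LABELS = {"unassigned", "no blast hit"}


def unassigned_fraction(lines):
    """Fraction of OTUs whose taxonomy column carries an unassigned label.

    `lines` is any iterable of tab-separated rows (e.g. an open file).
    """
    total = 0
    unassigned = 0
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines, malformed rows, and comment/header lines.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        total += 1
        if row[1].strip().lower() in UNASSIGNED_LABELS:
            unassigned += 1
    return unassigned / total if total else 0.0
```

Running it once on the uclust assignments and once on the blast assignments (opening each file with `open(path)`) gives two fractions that can be compared directly.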