Help using BLAST to assign taxonomy - reference database


Rebekah Newstead

Dec 14, 2016, 6:23:38 AM
to Qiime 1 Forum
Hi,

I have been using QIIME to assign taxonomy to my 18S sequences against a Silva database that I imported for this purpose.

I'd like to run a BLAST search instead and understand that the correct script for this is:

assign_taxonomy.py -i otus.fa -r ref_seq_set.fna -t id_to_taxonomy.txt

when I run it I get the error message:

Error in assign_taxonomy.py: option -r: file does not exist: 'ref_seq_set.fna'

I am using a shared Linux cluster where BLAST is available to load. However, the same message appears after loading it. I imagine, then, that I need to import a reference BLAST database into my home directory.

I am also guessing that this should be a reference set from here


However, I am unsure whether this is the correct thing to do, or the correct place to look for the reference database that I need.

Is anyone able to point me towards how to get the correct database for assigning taxonomy to my 18S sequences?


thanks in advance for any help


Rebekah

TonyWalters

Dec 14, 2016, 6:35:43 AM
to Qiime 1 Forum
Hello Rebekah,

The basic command for assigning taxonomy with blast is:
assign_taxonomy.py -m blast -i X -r Y -t Z -o A
where X is your input fasta file (i.e., your sequences from your samples)
Y is the reference fasta file you want to blast against
Z is the taxonomy file for the reference fasta file (needs to be matched to the reference reads)
A is the output folder
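
As a rough sketch of what "matched" means for Y and Z: every sequence ID in the reference fasta should have a line in the taxonomy file. The helper below is made up for illustration (it is not a QIIME script), and it assumes the ID is the first whitespace-delimited token of each fasta label and the first tab-separated column of the taxonomy file.

```shell
# Hypothetical helper (not part of QIIME): list reference FASTA IDs that
# have no entry in the taxonomy mapping file.
# Assumptions: the sequence ID is the first token after '>' in each header,
# and the first tab-separated column of the taxonomy file.
missing_taxonomy_ids() {
    ref_fasta=$1
    tax_map=$2
    fasta_ids=$(mktemp)
    tax_ids=$(mktemp)

    # Collect the IDs from the FASTA headers.
    grep '^>' "$ref_fasta" | sed 's/^>//' | awk '{print $1}' | LC_ALL=C sort -u > "$fasta_ids"
    # Collect the IDs from the taxonomy file.
    cut -f1 "$tax_map" | LC_ALL=C sort -u > "$tax_ids"

    # comm -23 prints lines that appear only in the first file.
    LC_ALL=C comm -23 "$fasta_ids" "$tax_ids"

    rm -f "$fasta_ids" "$tax_ids"
}

# Example call, with Y and Z standing in for your own files:
# missing_taxonomy_ids Y Z
```

If this prints nothing, every reference sequence has a taxonomy line; any IDs it does print are sequences the taxonomy file does not cover.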

It looks like you aren't specifying -m blast, and your filepath to ref_seq_set.fna is incorrect. Are you sure it is in the directory that you are running the command in and is spelled correctly?

You should be able to run these commands if blast is installed:
blastall
and
formatdb
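
If it helps, here is one way to check that both programs are on your PATH once BLAST has been loaded on the cluster. The function name is just something I made up for this sketch; `command -v` is the standard shell builtin for looking up a command.

```shell
# Hypothetical helper: report whether each named program can be found on PATH.
check_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "not found: $tool"
            status=1
        fi
    done
    # Non-zero return if any tool was missing.
    return $status
}

# Example call for the two legacy BLAST programs mentioned above:
# check_tools blastall formatdb
```

Note that loading BLAST makes the programs available on your PATH; it does not copy anything into your working directory, so you won't see them with ls.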

Was otus.fa generated outside of QIIME? If so, I would make sure that its format (specifically the sample IDs in the fasta labels) is QIIME-compatible (http://qiime.org/documentation/file_formats.html#demultiplexed-sequences). This shouldn't interfere with the taxonomic assignment step, but it could create headaches for you later. To do so, I would first suggest checking your mapping file with validate_mapping_file.py (http://qiime.org/scripts/validate_mapping_file.html) and then validating the otus.fa file with validate_demultiplexed_fasta.py (http://qiime.org/scripts/validate_demultiplexed_fasta.html).
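
For a quick look before running the validation scripts, a rough pre-check along these lines may be useful. It is only a sketch: the function name is made up, and the pattern assumes the usual demultiplexed label layout of SampleID_number as the first token, with sample IDs made of letters, digits and periods. It is not a substitute for validate_demultiplexed_fasta.py.

```shell
# Hypothetical pre-check: print FASTA header lines whose first token does not
# look like SampleID_number (SampleID assumed to be letters, digits, periods).
nonconforming_labels() {
    # First grep keeps only header lines; second drops the ones that conform.
    # '|| true' keeps the exit status at 0 when nothing is printed.
    grep '^>' "$1" | grep -Ev '^>[A-Za-z0-9.]+_[0-9]+([[:space:]]|$)' || true
}

# Example call:
# nonconforming_labels otus.fa
```

Any label it prints is worth a closer look. A label such as >OTU_1 would pass this pattern even though "OTU" is not one of your sample IDs, which is one reason the proper validation script is still worth running.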

Are the reference files from the SILVA 128 release?

-Tony

Rebekah Newstead

Dec 14, 2016, 8:21:25 AM
to Qiime 1 Forum
Hi Tony,

thanks for your reply.

I am using a Linux cluster, and on it I load BLAST with the command:

module load blast

As far as I am aware, BLAST is then loaded and available, but it doesn't appear as such within the directory, if that makes sense.

I can run the command blastall

otus.fa was generated using a Usearch pipeline.

The Silva release I was using was 108 with this script:

assign_taxonomy.py -i otus.fa -m blast \
-t ../Silva_108_Qiime/taxa_mapping/Silva_RDP_taxa_mapping_Eukarya_only_genus.txt \
-r ../Silva_108_Qiime/rep_set/Silva_108_rep_set_Eukarya_only.fna \
-o assigned_taxonomy2;

For this I had a Silva file in my home directory. Looking at the QIIME website, it doesn't seem to suggest that SILVA 128 is compatible:


Am I wrong in this? And if release 128 is compatible, I am unsure how to use it, since it is in ARB format.

The script above worked fine, but the taxonomic resolution of the output was poor. When I used the web-based BLAST search for some of the OTUs, the resolution of the taxonomic assignment was much greater, hence my wish to use BLAST to assign taxonomy and compare the output.

thanks again for any help you are able to give


TonyWalters

Dec 14, 2016, 8:26:34 AM
to Qiime 1 Forum
Hello,

There is a QIIME-compatible SILVA 123 release, which is quite a bit newer than the 108 release. I'd suggest trying that one for now to see if it improves your assignments. The QIIME-compatible 128 release hasn't been put together yet, but it's being worked on (I wouldn't wait for it though; there could be delays due to the holidays).

Rebekah Newstead

Dec 14, 2016, 10:06:44 AM
to Qiime 1 Forum
Thanks,

currently having a go doing that

in this release I presume that 

18S_only/taxonomy_all_levels.txt   

would replace the

Silva_RDP_taxa_mapping_Eukarya_only_genus.txt file?

cheers
Rebekah

TonyWalters

Dec 14, 2016, 10:13:21 AM
to Qiime 1 Forum
Hello Rebekah,

You are correct.