assign_taxonomy.py using BLAST


kristin oosthuizen

Jun 13, 2016, 11:14:05 AM
to Qiime 1 Forum
Hi,

I am using assign_taxonomy.py with BLAST, with the Greengenes 13_8 (most recent) 16S OTU files 97_otus.fasta and 97_otu_taxonomy.txt for -r and -t.


This is my script:

umask 0077
export TMP=/scratch/${PBS_JOBID}
mkdir -p ${TMP}

cd $PBS_O_WORKDIR

module load python

TMPDIR=${TMP} assign_taxonomy.py -i V3V4_otu_rename_uparse.fasta -r 97_otus.fasta -t 97_otu_taxonomy.txt -o V3V4_BLAST_taxonomy -m blast

rm -rf ${TMP}

And I am getting the following error:

Traceback (most recent call last):
  File "/apps/python/2.7.11-gcc-4.8.5/bin/assign_taxonomy.py", line 417, in <module>
    main()
  File "/apps/python/2.7.11-gcc-4.8.5/bin/assign_taxonomy.py", line 394, in main
    log_path=log_path)
  File "/apps/python/2.7.11-gcc-4.8.5/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 500, in __call__
    abspath(reference_seqs_path), output_dir=blast_db_dir)
  File "/apps/python/2.7.11-gcc-4.8.5/lib/python2.7/site-packages/bfillings/formatdb.py", line 116, in build_blast_db_from_fasta_path
    fdb = FormatDb(WorkingDir=output_dir, HALT_EXEC=HALT_EXEC)
  File "/apps/python/2.7.11-gcc-4.8.5/lib/python2.7/site-packages/burrito/util.py", line 201, in __init__
    self._error_on_missing_application(params)
  File "/apps/python/2.7.11-gcc-4.8.5/lib/python2.7/site-packages/burrito/util.py", line 468, in _error_on_missing_application
    "Is it in your path?" % command)
burrito.util.ApplicationNotFoundError: Cannot find formatdb. Is it installed? Is it in your path?


I am very new to QIIME, and also not good with Python.

I am not sure what the formatdb is and why it cannot be found.

If anybody has seen this before and can help, it will be greatly appreciated.

Thanks!
Kristin

Colin Brislawn

Jun 13, 2016, 12:52:21 PM
to Qiime 1 Forum
Hello Kristin,

Thanks for getting in touch with us and posting your full command and error message. You are correct, this has to do with formatdb, which is part of the legacy-blast package. Do you have the legacy-blast package installed? You can run these commands to find out:
which blastall
which formatdb

Depending on what you can find, I can help you install legacy blast or fix your deployment. 
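
If you would rather check from Python, here is a minimal sketch of the same lookup. It assumes Python 3.3 or later (shutil.which is not available in the Python 2.7 that QIIME 1 runs on), and the list of tool names and the find_tools helper are only illustrative, so adjust them to whatever your install provides.

```python
import shutil

# Executables expected from a legacy BLAST install.
# This list is an assumption for illustration, not a QIIME requirement list.
LEGACY_BLAST_TOOLS = ("formatdb", "blastall")


def find_tools(names, path=None):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name, path=path) for name in names}


if __name__ == "__main__":
    for name, location in find_tools(LEGACY_BLAST_TOOLS).items():
        print("%-10s %s" % (name, location or "NOT FOUND"))
```

Run it inside the same job environment (after your module load lines), since PATH on the login node may differ from PATH on the compute node.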

Keep in touch! 
Colin

kristin oosthuizen

Jun 14, 2016, 5:47:50 AM
to Qiime 1 Forum
Hi Colin,

Thank you for getting back to me so soon. I am running all of my analyses on our university server, but I checked and I've found a legacy_blast.pl script in apps/NCBI/BLAST/2.3.0+/bin. I guess this is what I need? How would I call this application then?

I tried module load app/NCBI and module load perl, because it's a Perl script, but I get the same error about formatdb.

Thank you for all your help.

kristin oosthuizen

Jun 14, 2016, 8:58:28 AM
to Qiime 1 Forum
Hi Colin,

I am busy trying to figure it out with IT; it seems it was not installed. Thank you for pointing me in the right direction. I will get back to you if the problem persists.

Thanks again!

Regards,
Kristin

Colin Brislawn

Jun 14, 2016, 12:18:32 PM
to Qiime 1 Forum
Hello Kristin,

Yeah, I'm guessing that Perl script calls blast and is not the program itself. Once you and your team get legacy blast installed (not blast plus!), let me know if it works.

Colin

kristin oosthuizen

Jun 28, 2016, 3:31:54 AM
to Qiime 1 Forum
Hi Colin,

Thank you again for helping me with my previous error. We figured it out: we installed legacy blast and it's working. Now I have another question, about using QIIME assign_taxonomy.py with the RDP classifier. I am using the UNITE QIIME release: sh_refs_qiime_ver7_97_31.01.2016.fasta and sh_taxonomy_qiime_ver7_97_31.01.2016.txt (https://unite.ut.ee/repository.php).


This is my script:


umask 0077
export TMP=/scratch/${PBS_JOBID}
mkdir -p ${TMP}

cd $PBS_O_WORKDIR

module load python
module load app/RDP/2.2

TMPDIR=${TMP} assign_taxonomy.py -i ITS_otu_rename_uparse.fasta -r sh_refs_qiime_ver7_97_31.01.2016.fasta -t sh_taxonomy_qiime_ver7_97_31.01.2016.txt -o ITS_RDP_taxonomy -c 0.80 -m rdp --rdp_max_memory 8000


rm -rf ${TMP}


And I am getting the following error:


Traceback (most recent call last):
  File "/apps/python/2.7.11-gcc-4.8.5/bin/assign_taxonomy.py", line 417, in <module>
    main()
  File "/apps/python/2.7.11-gcc-4.8.5/bin/assign_taxonomy.py", line 394, in main
    log_path=log_path)
  File "/apps/python/2.7.11-gcc-4.8.5/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 860, in __call__
    max_memory=max_memory, tmp_dir=tmp_dir)
  File "/apps/python/2.7.11-gcc-4.8.5/lib/python2.7/site-packages/bfillings/rdp_classifier.py", line 515, in train_rdp_classifier_and_assign_taxonomy
    tmp_dir=tmp_dir)
  File "/apps/python/2.7.11-gcc-4.8.5/lib/python2.7/site-packages/bfillings/rdp_classifier.py", line 485, in train_rdp_classifier
    return app(training_seqs_file)
  File "/apps/python/2.7.11-gcc-4.8.5/lib/python2.7/site-packages/bfillings/rdp_classifier.py", line 327, in __call__
    remove_tmp=remove_tmp)
  File "/apps/python/2.7.11-gcc-4.8.5/lib/python2.7/site-packages/burrito/util.py", line 285, in __call__
    'StdErr:\n%s\n' % open(errfile).read())
burrito.util.ApplicationError: Unacceptable application exit status: 1
Command:
cd "/home/16102142/Amplicon_NGS_raw_data/ITS_ID/ITS_unique_ID/ITS_linefixed/ITS_header_change/ITS_dereplicate/ITS_abundance_sort/ITS_OTU_cluster/ITS_ref_chimera_filter/ITS_fasta_formatter/ITS_rename/ITS_map_reads_to_otus/ITS_map_reads_to_otus/ITS_assign_taxonomy/"; java -Xmx8000M -cp "/apps/RDP/2.2/rdp_classifier-2.2.jar" edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker "/scratch/1950.launch.hpc/RdpTaxonomy_Oh8KtB.txt" "/scratch/1950.launch.hpc/tmpPR9JMcdMtcz96xhXzZQO.txt" 1 version1 cogent "/scratch/1950.launch.hpc/RdpTrainer_nroglT" > "/scratch/1950.launch.hpc/tmpfc6MIBMmfBu0aqdwafHf.txt" 2> "/scratch/1950.launch.hpc/tmpwp7CLVLWv2aSwl9oqeES.txt"
StdOut:

StdErr:
Copyright 2006 Michigan State University Board of Trustees.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Authors's mailng address:
Center for Microbial Ecology
2225A Biomedical Physical Science
Michigan State University
East Lansing, Michigan USA 48824-4320
E-mail: James R. Cole at co...@msu.edu
    Qiong Wang at wang...@msu.edu
    James M. Tiedje at tie...@msu.edu


Exception in thread "main" java.lang.IllegalArgumentException:
Illegal taxonomy format at 9050**9049*7*genus
    at edu.msu.cme.rdp.classifier.train.TreeFactory.creatTaxidMap(TreeFactory.java:73)
    at edu.msu.cme.rdp.classifier.train.TreeFactory.<init>(TreeFactory.java:51)
    at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.<init>(ClassifierTraineeMaker.java:41)
    at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.main(ClassifierTraineeMaker.java:133)


If you know what this error means, could you please help me solve the problem?
Thank you very much, you have been very helpful!

Kind regards,
Kristin


Colin Brislawn

Jun 28, 2016, 12:17:43 PM
to Qiime 1 Forum, William Walters
Hello Kristin,

I'm glad you got blast up and running. Let's tackle this next one.

Did you skim your error message for clues? I think I found this one at the bottom of the error:
Exception in thread "main" java.lang.IllegalArgumentException: 
Illegal taxonomy format at 9050**9049*7*genus
    at edu.msu.cme.rdp.classifier.train.TreeFactory.creatTaxidMap(TreeFactory.java:73)
    at edu.msu.cme.rdp.classifier.train.TreeFactory.<init>(TreeFactory.java:51)
    at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.<init>(ClassifierTraineeMaker.java:41)
    at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.main(ClassifierTraineeMaker.java:133)

Looks like the script is not happy with the taxonomy file. I'm not super familiar with the UNITE database, so I've cc'ed a qiime dev who should know more.
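
One way to read the offending line, as a rough sketch: it assumes the five star-separated fields are taxid, name, parent taxid, depth and rank, which is how I understand RDP's training taxonomy layout. The parse_taxon_line and empty_fields helpers are made up for this example and are not part of QIIME or RDP.

```python
from collections import namedtuple

# Field order is an assumption based on RDP's star-separated training taxonomy.
TaxonLine = namedtuple("TaxonLine", "taxid name parent_taxid depth rank")


def parse_taxon_line(line):
    """Split one training-taxonomy line into its five fields."""
    fields = line.rstrip("\n").split("*")
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d: %r" % (len(fields), line))
    return TaxonLine(*fields)


def empty_fields(record):
    """Return the names of fields that are empty or whitespace only."""
    return [name for name, value in zip(record._fields, record) if not value.strip()]


# The line quoted in the Java exception above.
record = parse_taxon_line("9050**9049*7*genus")
print(empty_fields(record))  # prints ['name']
```

Under that assumption, taxon 9050 reached the trainer with an empty name, which would fit a problem in the taxonomy strings rather than in the sequences.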

Colin


TonyWalters

Jun 28, 2016, 12:24:09 PM
to Qiime 1 Forum, william....@gmail.com
Hi,

See this thread: https://groups.google.com/forum/#!searchin/qiime-forum/ascii$20unite/qiime-forum/9btGkJlCsU8/5I3Cq9AACQAJ

I've encountered non-ascii characters in the UNITE database, and it can cause errors like the one you are seeing.
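
As a rough illustration of what such a cleanup can look like (this is not the script from the linked thread, just a sketch), the helpers below list the lines that contain non-ASCII characters and write an ASCII-only copy of a file. It assumes Python 3 and that the UNITE files are UTF-8 encoded; the function names are hypothetical.

```python
import unicodedata


def non_ascii_lines(lines):
    """Return (line_number, line) pairs for lines containing non-ASCII characters."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if any(ord(ch) > 127 for ch in line)
    ]


def to_ascii(text):
    """Decompose accented characters, then drop anything still outside ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def clean_file(src_path, dst_path, encoding="utf-8"):
    """Write an ASCII-only copy of src_path to dst_path (encoding is an assumption)."""
    with open(src_path, encoding=encoding) as src, \
            open(dst_path, "w", encoding="ascii") as dst:
        for line in src:
            dst.write(to_ascii(line))
```

It is worth running non_ascii_lines on the taxonomy file first to see what would change, because dropping characters could in principle leave a name empty or make two names identical. The same check can be applied to the reference FASTA headers.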

kristin oosthuizen

Jul 1, 2016, 10:16:36 AM
to Qiime 1 Forum, william....@gmail.com
Hi,

Thank you. I will give the script for removing the non-standard characters a go. I was also wondering which version of the RDP classifier QIIME uses for assigning taxonomy. I want to classify my ITS2 data with both BLAST and RDP and compare the results, but it seems I will need RDP 2.10 or above to classify ITS sequences, and it seems QIIME only uses 2.2. Is this true? I have tried using 2.11 and it did not work.

Thank you very much for all your help.

Kristin
 
