Assign taxonomy using fungene alignments

Jen Underwood

Nov 13, 2012, 12:39:16 PM
to qiime...@googlegroups.com
Hi, 
  I am trying to assign taxonomy for nirS sequences using an aligned seqs file that I downloaded from the fungene website (http://fungene.cme.msu.edu//hmm_details.spr?hmm_id=21). I used muscle to align my sequences, and now I'm trying to assign taxonomy. I edited the aligned seqs file from fungene and manually created a taxonomy file, hoping that it would work. Attached are the aligned seqs and the taxonomy file. Below is the error I received. Any advice you have would be great! Thanks!

assign_taxonomy.py -i otus/align/seqs.fna_rep_set_aligned.fasta -t fungene_taxonomy.txt -r fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt -c 0.6 -o otus/assign_tax
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/assign_taxonomy.py", line 226, in <module>
    main()
  File "/macqiime/QIIME/bin/assign_taxonomy.py", line 222, in main
    result_path=result_path,log_path=log_path)
  File "/macqiime/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 359, in __call__
    taxonomy_file, training_seqs_file = self._generate_training_files()
  File "/macqiime/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 394, in _generate_training_files
    seq_id, lineage_str = map(strip, line.split('\t'))
ValueError: too many values to unpack



fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt
fungene_taxonomy.txt

Daniel McDonald

Nov 13, 2012, 12:51:40 PM
to qiime...@googlegroups.com
Hey Jen,

Every entry in the taxonomy file needs to contain the same number of levels.
For instance:

74 "Root;k_Bacteria;p_Proteobacteria;c_Betaproteobacteria;o_Burkholderiales;f_unclassified Burkholderiales;s_Lepothrix cholodnii SP-6,Hydroxylamine reductase"
75 "Root;k_Bacteria;p_uncultured bacterium 2304,putative cytochrome cd1 nitrite reductase NirS"

ID 74 has Root, k, p, c, o, f, s (no genus??) while ID 75 has Root, k,
unknown. The RDP Classifier expects taxon names (even if empty) at all
positions, with each input taxonomy string having the same number of
levels. An example is:

74 "Root;k_Bacteria;p_Proteobacteria;c_Betaproteobacteria;o_Burkholderiales;f_unclassified Burkholderiales;g_;s_Lepothrix cholodnii SP-6,Hydroxylamine reductase"
75 "Root;k_Bacteria;p_uncultured bacterium 2304;c_putative cytochrome cd1 nitrite reductase NirS;o_;f_;g_;s_"

Hope that helps!
-Daniel
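
One way to check a taxonomy file for the problem described above is to count the levels on every line. The following is a minimal sketch, not part of QIIME. It assumes one entry per line, with the sequence ID and a semicolon-separated lineage separated by a single tab. The function name `check_taxonomy_lines` is made up for illustration, and treating the deepest lineage as the expected depth is an arbitrary choice.

```python
def check_taxonomy_lines(lines):
    """Return a list of human-readable problems found in a taxonomy mapping.

    Assumes each line is '<seq_id><TAB><lineage>' with levels separated
    by ';'. Illustrative helper only; this is not QIIME code.
    """
    problems = []
    level_counts = {}
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            problems.append(
                "line %d: expected 2 tab-separated fields, found %d"
                % (number, len(fields))
            )
            continue
        seq_id = fields[0].strip()
        lineage = fields[1].strip().strip('"')
        level_counts[seq_id] = len(lineage.split(";"))

    # Assumption: the deepest lineage defines the expected number of levels.
    expected = max(level_counts.values()) if level_counts else 0
    for seq_id, count in sorted(level_counts.items()):
        if count != expected:
            problems.append(
                "ID %s: %d levels, expected %d" % (seq_id, count, expected)
            )
    return problems


# The two entries quoted in the message above, with a tab after the ID.
example = [
    "74\tRoot;k_Bacteria;p_Proteobacteria;c_Betaproteobacteria;"
    "o_Burkholderiales;f_unclassified Burkholderiales;g_;"
    "s_Lepothrix cholodnii SP-6,Hydroxylamine reductase",
    "75\tRoot;k_Bacteria;p_uncultured bacterium 2304,"
    "putative cytochrome cd1 nitrite reductase NirS",
]
for problem in check_taxonomy_lines(example):
    print(problem)
```

To check a real file, pass an open file handle instead of the `example` list.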

Jen Underwood

Nov 13, 2012, 1:29:22 PM
to qiime...@googlegroups.com
Great, thanks!  I'll try that!

Jen Underwood

Nov 13, 2012, 1:49:01 PM
to qiime...@googlegroups.com
Hi Daniel, 
  I just tested this fix with the attached file, using only 3 assignments, and still got the same error. Any other suggestions?
fungene_taxonomy_test.txt

Daniel McDonald

Nov 13, 2012, 1:56:15 PM
to qiime...@googlegroups.com
I am not seeing any obvious issues with the file, and a similar
command worked for me:

11:55:11 rl1-1-221-221-dhcp:Downloads$ assign_taxonomy.py -i
fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt -t
fungene_taxonomy_test.txt -r
fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt -c 0.6 -o asdasd
11:55:48 rl1-1-221-221-dhcp:Downloads$

Can you please send the output of "print_qiime_config.py -t"?

Thanks,
Daniel


Jen Underwood

Nov 13, 2012, 2:26:06 PM
to qiime...@googlegroups.com
Hi Daniel, 

 Here is the error. I also tried to attach the file of my muscle-aligned sequences.

assign_taxonomy.py -i otus/align/seqs.fna_rep_set_aligned.fasta -t fungene_taxonomy_test.txt -r fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt -c 0.6 -o otus/assign_tax  
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/assign_taxonomy.py", line 226, in <module>
    main()
  File "/macqiime/QIIME/bin/assign_taxonomy.py", line 222, in main
    result_path=result_path,log_path=log_path)
  File "/macqiime/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 359, in __call__
    taxonomy_file, training_seqs_file = self._generate_training_files()
  File "/macqiime/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 394, in _generate_training_files
    seq_id, lineage_str = map(strip, line.split('\t'))
ValueError: too many values to unpack
MacQIIME visitorb235:YRPnirS $ 

Jen Underwood

Nov 13, 2012, 2:30:55 PM
to qiime...@googlegroups.com
Actually, here is an example with just 3 muscle aligned sequences.  Other one was too big to attach.  
seqs.fna_rep_set_aligned_example.txt

Jen Underwood

Nov 13, 2012, 4:19:04 PM
to qiime...@googlegroups.com
Oh, didn't read your response completely.  Here is the output from that command

MacQIIME visitorb235:YRPnirS $ print_qiime_config.py -t

System information
==================
         Platform: darwin
   Python version: 2.7.1 (r271:86832, Dec 15 2011, 08:41:37)  [GCC 4.0.1 (Apple Inc. build 5493)]
Python executable: /macqiime/bin/python

Dependency versions
===================
                     PyCogent version: 1.5.1
                        NumPy version: 1.5.1
                   matplotlib version: 1.1.0
                  biom-format version: 0.9.3
                QIIME library version: 1.5.0
                 QIIME script version: 1.5.0
        PyNAST version (if installed): 1.1
RDP Classifier version (if installed): rdp_classifier-2.2.jar

QIIME config values
===================
                     blastmat_dir: None
                         sc_queue: all.q
      topiaryexplorer_project_dir: None
     pynast_template_alignment_fp: /macqiime/greengenes/core_set_aligned.fasta.imputed
                  cluster_jobs_fp: /macqiime/QIIME/bin/start_parallel_jobs.py
pynast_template_alignment_blastdb: None
assign_taxonomy_reference_seqs_fp: None
                     torque_queue: friendlyq
              qiime_test_data_dir: None
   template_alignment_lanemask_fp: /macqiime/greengenes/lanemask_in_1s_and_0s
                    jobs_to_start: 1
                cloud_environment: False
                qiime_scripts_dir: /macqiime/QIIME/bin/
            denoiser_min_per_core: 50
                      working_dir: None
                    python_exe_fp: /macqiime/bin/python
                         temp_dir: /tmp/
                      blastall_fp: blastall
                 seconds_to_sleep: 60
assign_taxonomy_id_to_taxonomy_fp: None


running checks:

test_FastTree_supported_version (__main__.Qiime_config)
FastTree is in path and version is supported ... ok
test_INFERNAL_supported_version (__main__.Qiime_config)
INFERNAL is in path and version is supported ... ok
test_ParsInsert_supported_version (__main__.Qiime_config)
ParsInsert is in path and version is supported ... ok
test_R_supported_version (__main__.Qiime_config)
R is in path and version is supported ... FAIL
test_ampliconnoise_install (__main__.Qiime_config)
AmpliconNoise install looks sane. ... FAIL
test_blast_supported_version (__main__.Qiime_config)
blast is in path and version is supported ... FAIL
test_blastall_fp (__main__.Qiime_config)
blastall_fp is set to a valid path ... ERROR
test_blastmat_dir (__main__.Qiime_config)
blastmat_dir is set to a valid path. ... ok
test_cdbtools_supported_version (__main__.Qiime_config)
cdbtools is in path and version is supported ... ok
test_cdhit_supported_version (__main__.Qiime_config)
cd-hit is in path and version is supported ... ok
test_chimeraSlayer_install (__main__.Qiime_config)
no obvious problems with ChimeraSlayer install ... ok
test_clearcut_supported_version (__main__.Qiime_config)
clearcut is in path and version is supported ... ok
test_cluster_jobs_fp (__main__.Qiime_config)
cluster_jobs_fp is set to a valid path and is executable ... ok
test_denoiser_supported_version (__main__.Qiime_config)
denoiser aligner is ready to use ... ok
test_for_obsolete_values (__main__.Qiime_config)
local qiime_config has no extra params ... ok
test_matplotlib_suported_version (__main__.Qiime_config)
maptplotlib version is supported ... ok
test_mothur_supported_version (__main__.Qiime_config)
mothur is in path and version is supported ... ok
test_muscle_supported_version (__main__.Qiime_config)
muscle is in path and version is supported ... ok
test_numpy_suported_version (__main__.Qiime_config)
numpy version is supported ... ok
test_pplacer_supported_version (__main__.Qiime_config)
pplacer is in path and version is supported ... ok
test_pynast_suported_version (__main__.Qiime_config)
pynast version is supported ... ok
test_pynast_template_alignment_blastdb_fp (__main__.Qiime_config)
pynast_template_alignment_blastdb, if set, is set to a valid path ... ok
test_pynast_template_alignment_fp (__main__.Qiime_config)
pynast_template_alignment, if set, is set to a valid path ... ok
test_python_exe_fp (__main__.Qiime_config)
python_exe_fp is set to a working python env ... ok
test_python_supported_version (__main__.Qiime_config)
python is in path and version is supported ... ok
test_qiime_scripts_dir (__main__.Qiime_config)
qiime_scripts_dir, if set, is set to a valid path ... ok
test_qiime_test_data_dir (__main__.Qiime_config)
qiime_test_data_dir, if set, is set to a valid path ... ok
test_raxmlHPC_supported_version (__main__.Qiime_config)
raxmlHPC is in path and version is supported ... ok
test_rtax_supported_version (__main__.Qiime_config)
rtax is in path and version is supported ... ok
test_temp_dir (__main__.Qiime_config)
temp_dir, if set, is set to a valid path ... ok
test_template_alignment_lanemask_fp (__main__.Qiime_config)
template_alignment_lanemask, if set, is set to a valid path ... ok
test_uclust_supported_version (__main__.Qiime_config)
uclust is in path and version is supported ... ok
test_usearch_supported_version (__main__.Qiime_config)
usearch is in path and version is supported ... FAIL
test_working_dir (__main__.Qiime_config)
working_dir, if set, is set to a valid path ... ok

======================================================================
ERROR: test_blastall_fp (__main__.Qiime_config)
blastall_fp is set to a valid path
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/print_qiime_config.py", line 132, in test_blastall_fp
    raise ApplicationNotFoundError("blastall_fp set to %s, but is not in your PATH. Either use an absolute path to or put it in your PATH." % blastall)
ApplicationNotFoundError: blastall_fp set to blastall, but is not in your PATH. Either use an absolute path to or put it in your PATH.

======================================================================
FAIL: test_R_supported_version (__main__.Qiime_config)
R is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/print_qiime_config.py", line 703, in test_R_supported_version
    % ('.'.join(map(str,acceptable_version)), version_string))
AssertionError: Unsupported R version. (2, 12, 0).(2, 12, 0) is required, but running 2.15.2.

======================================================================
FAIL: test_ampliconnoise_install (__main__.Qiime_config)
AmpliconNoise install looks sane.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/print_qiime_config.py", line 143, in test_ampliconnoise_install
    "$PYRO_LOOKUP_FILE variable is not set. See %s for help." % url)
AssertionError: $PYRO_LOOKUP_FILE variable is not set. See http://www.qiime.org/install/install.html#ampliconnoise-install for help.

======================================================================
FAIL: test_blast_supported_version (__main__.Qiime_config)
blast is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/print_qiime_config.py", line 403, in test_blast_supported_version
    "which components of QIIME you plan to use.")
AssertionError: blast not found. This may or may not be a problem depending on which components of QIIME you plan to use.

======================================================================
FAIL: test_usearch_supported_version (__main__.Qiime_config)
usearch is in path and version is supported
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/print_qiime_config.py", line 668, in test_usearch_supported_version
    "which components of QIIME you plan to use.")
AssertionError: usearch not found. This may or may not be a problem depending on which components of QIIME you plan to use.

----------------------------------------------------------------------
Ran 34 tests in 0.552s

FAILED (failures=4, errors=1)
MacQIIME visitorb235:YRPnirS $ 

Daniel McDonald

Nov 14, 2012, 2:15:38 PM
to qiime...@googlegroups.com
Hey Jen,

I just reran using your files (reattached for completeness) with the
QIIME v1.5.0 virtual machine and was not able to reproduce the error.
I had to modify fungene_taxonomy_test.txt to reduce it to genus-level
specificity, but that's all (modified file attached). Is there anything
custom about your QIIME install?

Here was the command I used:

assign_taxonomy.py -i seqs.fna_rep_set_aligned_example.txt -t
fungene_taxonomy_test.txt -r
fungene_7.1_nirS_77_aligned_nucleotide_seqs.txt -c 0.6 -o foo
-Daniel
fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt
fungene_taxonomy_test.txt
seqs.fna_rep_set_aligned_example.txt

Jen Underwood

Nov 14, 2012, 4:39:03 PM
to qiime...@googlegroups.com
Hi Daniel, 

    Your files seemed to work for me so maybe reducing to genus level is all I need to do?  I have nothing custom that I know of for my Qiime install.  I used the MacQiime install (not virtual).   I'll give this a shot now with the whole dataset and see what happens!  Keeping my fingers crossed!  Thanks!

Jen Underwood

Nov 14, 2012, 5:28:13 PM
to qiime...@googlegroups.com
OK, it doesn't like the attached taxonomy file for some reason. It works when I use your simple one with 3 entries, but not with this one. Can you spot anything that is wrong? I've narrowed the issue down to this file.

MacQIIME visitorb235:YRPnirS $ assign_taxonomy.py -i otus/align/seqs.fna_rep_set_aligned_example.txt -t fungene_taxonomy_family.txt -r fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt -o otus/taxa_example -c 0.6
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/assign_taxonomy.py", line 226, in <module>
    main()
  File "/macqiime/QIIME/bin/assign_taxonomy.py", line 222, in main
    result_path=result_path,log_path=log_path)
  File "/macqiime/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 359, in __call__
    taxonomy_file, training_seqs_file = self._generate_training_files()
  File "/macqiime/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 394, in _generate_training_files
    seq_id, lineage_str = map(strip, line.split('\t'))
ValueError: too many values to unpack
MacQIIME visitorb235:YRPnirS $ assign_taxonomy.py -i otus/align/seqs.fna_rep_set_aligned_example.txt -t Daniel/fungene_taxonomy_Daniel.txt -r fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt -o otus/taxa_example -c 0.6
MacQIIME visitorb235:YRPnirS $ assign_taxonomy.py -i otus/align/seqs.fna_rep_set_aligned_example.txt -t fungene_taxonomy_familyv2.txt -r fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt -o otus/taxa -c 0.6
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/assign_taxonomy.py", line 226, in <module>
    main()
  File "/macqiime/QIIME/bin/assign_taxonomy.py", line 222, in main
    result_path=result_path,log_path=log_path)
  File "/macqiime/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 359, in __call__
    taxonomy_file, training_seqs_file = self._generate_training_files()
  File "/macqiime/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 394, in _generate_training_files
    seq_id, lineage_str = map(strip, line.split('\t'))
ValueError: too many values to unpack
MacQIIME visitorb235:YRPnirS $ 


fungene_taxonomy_family.txt

Jen Underwood

Nov 14, 2012, 5:40:24 PM
to qiime...@googlegroups.com
Found a semicolon I needed for #75 but that still didn't work

Daniel McDonald

Nov 15, 2012, 6:47:36 PM
to qiime...@googlegroups.com, Greg Caporaso, William Walters
I am confused on this one as well: the taxonomy file seems sane once I
fixed the semicolon for ID 75. It "works" on my computer but bails
with RDP. Greg or Tony, do either of you have insight here?
-Daniel

16:46:26 biot19-240-dhcp:Downloads$ assign_taxonomy.py -i
seqs.fna_rep_set_aligned_example.txt -t fungene_taxonomy_family.txt -r
fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt -o fooooo -c 0.6
Traceback (most recent call last):
File "/Users/mcdonald/ResearchWork/software/qiime/scripts/assign_taxonomy.py",
line 247, in <module>
main()
File "/Users/mcdonald/ResearchWork/software/qiime/scripts/assign_taxonomy.py",
line 243, in main
result_path=result_path,log_path=log_path)
File "/Users/mcdonald/ResearchWork/software/qiime/qiime/assign_taxonomy.py",
line 408, in __call__
max_memory=max_memory, tmp_dir=tmp_dir)
File "/Users/mcdonald/ResearchWork/software/qiime/qiime/pycogent_backports/rdp_classifier.py",
line 508, in train_rdp_classifier_and_assign_taxonomy
tmp_dir=tmp_dir)
File "/Users/mcdonald/ResearchWork/software/qiime/qiime/pycogent_backports/rdp_classifier.py",
line 478, in train_rdp_classifier
return app(training_seqs_file)
File "/Users/mcdonald/ResearchWork/software/qiime/qiime/pycogent_backports/rdp_classifier.py",
line 320, in __call__
result = super(RdpClassifier, self).__call__(data=data,
remove_tmp=remove_tmp)
File "/Users/mcdonald/ResearchWork/software/pycogent/trunk/cogent/app/util.py",
line 251, in __call__
open(errfile).read())
cogent.app.util.ApplicationError: Unacceptable application exit status: 1
Command:
cd "/Users/mcdonald/Downloads/"; java -Xmx1500M -cp
"/Users/mcdonald/rdp_classifier_2.2/rdp_classifier-2.2.jar"
edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker
"/tmp/RdpTaxonomy_0VReQv.txt" "/tmp/tmpsUAc7nOzDpxfNSqqGcRA.txt" 1
version1 cogent "/tmp/RdpTrainer_tBXcxx" >
"/tmp/tmpx7WwAuWfQt5AHNpnlmhq.txt" 2>
"/tmp/tmpacba3XsXuIGSe3tP51zg.txt"
StdOut:

StdErr:
Copyright 2006 Michigan State University Board of Trustees.

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or (at
your option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
USA

Authors's mailng address:
Center for Microbial Ecology
2225A Biomedical Physical Science
Michigan State University
East Lansing, Michigan USA 48824-4320
E-mail: James R. Cole at co...@msu.edu
Qiong Wang at wang...@msu.edu
James M. Tiedje at tie...@msu.edu


Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 8230
at edu.msu.cme.rdp.classifier.train.GoodWordIterator.createWordIndex(GoodWordIterator.java:86)
at edu.msu.cme.rdp.classifier.train.GoodWordIterator.<init>(GoodWordIterator.java:50)
at edu.msu.cme.rdp.classifier.train.RawHierarchyTree.initWordOccurrence(RawHierarchyTree.java:106)
at edu.msu.cme.rdp.classifier.train.TreeFactory.addSequencewithLineage(TreeFactory.java:211)
at edu.msu.cme.rdp.classifier.train.TreeFactory.addSequence(TreeFactory.java:139)
at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.<init>(ClassifierTraineeMaker.java:47)
at edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker.main(ClassifierTraineeMaker.java:133)

Tony Walters

Nov 15, 2012, 7:32:25 PM
to Daniel McDonald, qiime...@googlegroups.com, Greg Caporaso
It can be quite a pain to track down the exact cause of these errors. You might get rid of the whitespace in the taxonomy strings. You might also grab a small subset (say the first 10 lines) and the associated 10 sequences (hopefully in the same order in the fasta file) and see if the smaller set throws the same error. Finally, the /tmp/ folder should contain some of the intermediate files created for RDP training, which might give a hint as to which lines are causing the error.

-Tony
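
The subsetting step suggested above could be sketched roughly as follows. This is only an illustration under assumptions: the taxonomy lines are taken to be tab-separated `<seq_id>` and `<lineage>` pairs, the reference sequences are assumed to be already parsed into `(seq_id, sequence)` tuples, and the name `subset_training_data` is hypothetical.

```python
def subset_training_data(taxonomy_lines, fasta_records, n=10):
    """Return the first n taxonomy lines, their sequences, and missing IDs.

    taxonomy_lines: iterable of '<seq_id><TAB><lineage>' strings.
    fasta_records: list of (seq_id, sequence) tuples, parsed elsewhere.
    """
    kept_lines = [line for line in taxonomy_lines if line.strip()][:n]
    kept_ids = [line.split("\t")[0].strip() for line in kept_lines]
    by_id = dict(fasta_records)
    # IDs that appear in the taxonomy subset but have no sequence.
    missing = [seq_id for seq_id in kept_ids if seq_id not in by_id]
    kept_seqs = [(seq_id, by_id[seq_id]) for seq_id in kept_ids if seq_id in by_id]
    return kept_lines, kept_seqs, missing


# Hypothetical toy data, not taken from the attached files.
taxonomy = ["1\tRoot;k_A", "2\tRoot;k_B", "3\tRoot;k_C"]
sequences = [("1", "ACGT"), ("3", "GGCC")]
lines, seqs, missing = subset_training_data(taxonomy, sequences, n=2)
print(lines, seqs, missing)
```

Any ID reported in `missing` points to a mismatch between the two files, which is worth resolving before training on the full set.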

Jen Underwood

Nov 16, 2012, 10:48:53 AM
to qiime...@googlegroups.com, Daniel McDonald, Greg Caporaso
I tried all those suggestions (ordered 1-76, reduced to 10, no white space) but still no luck.   

Jen Underwood

Nov 16, 2012, 10:53:51 AM
to qiime...@googlegroups.com, Daniel McDonald, Greg Caporaso
There must be some very basic difference between the two attached taxonomy assignment files that I'm not seeing. One is Daniel's (only 3 entries), which Qiime processes fine with all of my sequences, and the other is mine (all but 3 entries removed), which it doesn't like. I just don't get it.
fungene_taxonomy_Daniel.txt
fungene_taxonomy_family_3.txt

Jen Underwood

Nov 16, 2012, 11:00:17 AM
to qiime...@googlegroups.com
Lastly, I tried to copy the taxonomies from mine to add to Daniel's but that didn't work either.  Grrrr   :)



Greg Caporaso

Nov 16, 2012, 12:11:30 PM
to Qiime Forum
Hi Jen,
I think it might be different line breaks in the two files. Try the following:

tr '\r' '\n' < fungene_taxonomy_family_3.txt > fungene_taxonomy_family_3.txt_
mv fungene_taxonomy_family_3.txt_ fungene_taxonomy_family_3.txt

And then run with the new fungene_taxonomy_family_3.txt. 
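
For anyone who prefers to inspect the file before rewriting it, the same idea can be sketched in Python. This is an illustrative sketch, and the function names are made up. If the file was saved with CR-only line endings, a reader that splits only on LF sees the whole file as one line. That would be consistent with the "too many values to unpack" error, although the thread does not confirm it.

```python
def normalize_newlines(data):
    """Convert Windows (CRLF) and classic Mac (CR) line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def count_line_endings(data):
    """Count each line-ending style found in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "CR": data.count(b"\r") - crlf,
        "LF": data.count(b"\n") - crlf,
    }


# Hypothetical two-entry taxonomy file saved with CR-only line endings.
raw = b"74\tRoot;k_Bacteria\r75\tRoot;k_Bacteria\r"
print(count_line_endings(raw))

# Splitting on LF alone yields one long "line" with three tab-separated
# fields, which cannot be unpacked into (seq_id, lineage).
print(len(raw.split(b"\n")[0].split(b"\t")))

fixed = normalize_newlines(raw)
print([line.split(b"\t") for line in fixed.splitlines()])
```

Unlike `tr '\r' '\n'`, this version also handles CRLF files without introducing blank lines. To use it on a real file, read and write in binary mode (`open(path, "rb")` and `open(path, "wb")`).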



Jen Underwood

Nov 19, 2012, 12:59:19 PM
to qiime...@googlegroups.com
Hi Greg, 

  That worked for me when using the family_3 file, but when I tried it on the whole file "fungene_taxonomy_family.txt" I got an error. Attached is that file, and below are the errors.

MacQIIME visitorb235:YRPnirS $ tr '\r' '\n' < fungene_taxonomy_family.txt > fungene_taxonomy_family.tx_mv fungene_taxonomy_family.txt_fungene_taxonomy_family.txt
usage: tr [-Ccsu] string1 string2
       tr [-Ccu] -d string1
       tr [-Ccu] -s string1
       tr [-Ccu] -ds string1 string2
MacQIIME visitorb235:YRPnirS $ assign_taxonomy.py -i otus/align/seqs.fna_rep_set_aligned_example.txt -t fungene_taxonomy_family.txt -r fungene_7.1_nirS_77_aligned_nucleotide_seqs_edit.txt -o otus/taxa -c 0.6
Traceback (most recent call last):
  File "/macqiime/QIIME/bin/assign_taxonomy.py", line 226, in <module>
    main()
  File "/macqiime/QIIME/bin/assign_taxonomy.py", line 222, in main
    result_path=result_path,log_path=log_path)
  File "/macqiime/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 364, in __call__
    max_memory=max_memory)
  File "/macqiime/lib/python2.7/site-packages/qiime/pycogent_backports/rdp_classifier.py", line 492, in train_rdp_classifier_and_assign_taxonomy
    training_seqs_file, taxonomy_file, training_dir, max_memory=max_memory)
  File "/macqiime/lib/python2.7/site-packages/qiime/pycogent_backports/rdp_classifier.py", line 464, in train_rdp_classifier
    return app(training_seqs_file)
  File "/macqiime/lib/python2.7/site-packages/qiime/pycogent_backports/rdp_classifier.py", line 320, in __call__
    result = super(RdpClassifier, self).__call__(data=data, remove_tmp=remove_tmp)
  File "/macqiime/lib/python2.7/site-packages/cogent/app/util.py", line 250, in __call__
    % (str(exit_status),command)
cogent.app.util.ApplicationError: Unacceptable application exit status: 1, command: cd "/Users/qiime/Documents/Qiime/YRPnirS/"; java -Xmx1000M -cp "/macqiime/rdp_classifier_2.2/rdp_classifier-2.2.jar" edu.msu.cme.rdp.classifier.train.ClassifierTraineeMaker "/var/folders/ZN/ZNsHCF51FeiWRkA8IeVmW++++TM/-Tmp-/RdpTaxonomy_Gzq5kj.txt" "/tmp/tmpt4ZGBGOmh6ISDYmPLOVo.txt" 1 version1 cogent "/var/folders/ZN/ZNsHCF51FeiWRkA8IeVmW++++TM/-Tmp-/RdpTrainer_WWA8hm" > "/tmp/tmpPcw3sekUHGZ0XMxut3Qy.txt" 2> "/tmp/tmpl7mgnmE1YNRVxWzZ01x0.txt"
MacQIIME visitorb235:YRPnirS $ 

fungene_taxonomy_family.txt

Tony Walters

Nov 19, 2012, 2:18:36 PM
to qiime...@googlegroups.com
Hello Jen,

I think in this case the problem was with the fasta file rather than the taxonomy mapping file. I filtered the file: I got rid of the gap and ellipsis characters, the labels apart from the number that matched the taxonomy file, and the last sequence, which did not correspond to anything in the taxonomy mapping file. I've attached the file. Let me know if this works for you as the reference (-r) sequences.

-Tony
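
The filtering described above could be approximated with a short Python sketch. The assumptions are not confirmed by the thread: each fasta label is taken to start with the ID used in the taxonomy file, `-` and `.` are taken to be the only alignment characters to strip, and the name `filter_reference_fasta` is hypothetical.

```python
def filter_reference_fasta(fasta_lines, taxonomy_ids):
    """Yield (seq_id, sequence) pairs cleaned up for use as reference input.

    Keeps only records whose ID is in taxonomy_ids, shortens each label
    to its first whitespace-delimited token, and removes '-' and '.'.
    """
    seq_id, chunks = None, []
    for line in list(fasta_lines) + [">"]:  # sentinel flushes the last record
        line = line.strip()
        if line.startswith(">"):
            if seq_id in taxonomy_ids:
                yield seq_id, "".join(chunks)
            label = line[1:].split()
            seq_id = label[0] if label else None
            chunks = []
        elif line:
            chunks.append(line.replace("-", "").replace(".", ""))


# Hypothetical toy alignment, not taken from the attached files.
records = [
    ">74 example label",
    "..ATG--C",
    "GT..A-",
    ">75 another label",
    "A-C.G",
    ">99 no taxonomy entry",
    "ACGT",
]
for seq_id, seq in filter_reference_fasta(records, {"74", "75"}):
    print(">%s\n%s" % (seq_id, seq))
```

Here the record labeled 99 is dropped because it has no entry in the taxonomy ID set, mirroring the removal of the last sequence mentioned above.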


fungene_7.1_filtered.fasta

Jen Underwood

Nov 19, 2012, 2:26:40 PM
to qiime...@googlegroups.com
Wohoo!  That worked!  Thanks!

Daniel McDonald

Nov 19, 2012, 7:53:28 PM
to qiime...@googlegroups.com
Nice! Glad to hear!
-Daniel

Jen Underwood

Nov 20, 2012, 1:21:54 PM
to qiime...@googlegroups.com
Except that 99.7% of the sequences aligned with nothing.  :(  Back to the drawing board...