assigning OTUs for fungus


Roody_UF

Sep 9, 2016, 9:57:54 AM
to Qiime 1 Forum
I have a data set that I analyzed using the pick_de_novo_otus.py command, and all of my sequences came back as "unassigned." This data set is fungal ITS1, not bacterial. I am now changing course and trying to use blast to pick OTUs and assign taxonomy.

I used the command below to generate the required reference file:
subsample_fasta.py -i seqs.fna -p .10 -o refseqs.fasta

Then I ran:
pick_otus.py -i seqs.fna -o blast_picked_otus -m blast -r refseqs.fasta

The command is still running under nohup on my server. In the meantime, I did what I perhaps should have done first and subset the seqs.fna and refseqs.fasta files down to 244 and 101 sequences, respectively.

I then ran pick_rep_set.py on these subset files:
pick_rep_set.py -i subsample_101of244_otus/sub_of_seq_otus.txt -f sub_of_seq.fna -o rep_set1.fna

I want to assign taxonomy using blast; however, I keep getting the same error:
 File "/usr/local/bin/assign_taxonomy.py", line 417, in <module>
    main()
  File "/usr/local/bin/assign_taxonomy.py", line 222, in main
    option_parser, opts, args = parse_command_line_parameters(**script_info)
  File "/usr/lib/python2.7/dist-packages/qcli/option_parsing.py", line 313, in parse_command_line_parameters
    opts,args = parser.parse_args(command_line_args)
  File "/usr/lib/python2.7/optparse.py", line 1384, in parse_args
    values = self.get_default_values()
  File "/usr/lib/python2.7/optparse.py", line 1329, in get_default_values
    defaults[option.dest] = option.check_value(opt_str, default)
  File "/usr/lib/python2.7/optparse.py", line 770, in check_value
    return checker(self, opt, value)
  File "/usr/lib/python2.7/dist-packages/qcli/option_parsing.py", line 41, in check_existing_filepath
    "option %s: file does not exist: %r" % (opt, value))
optparse.OptionValueError: option --id_to_taxonomy_fp: file does not exist: '# /usr/share/qiime/data/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt'

I tried downloading the Greengenes file again and passing its path with -t, but that didn't seem to work. Is my initial subsetting the problem? Should I kill the job that is running under nohup on the server?

Thanks!

Embriette

Sep 9, 2016, 11:02:07 AM
to Qiime 1 Forum
Good morning,

Your error indicates that the default taxonomy file, which is the Greengenes taxonomy, is not in the expected folder. However, you should not be using this database anyway, as it covers only bacterial and archaeal 16S sequences (which would explain why everything came back unassigned after your first de novo attempt). I recommend using the UNITE database. When you run your commands, you'll need to specify which reference database and taxonomy files you're using; otherwise QIIME will attempt to use the default (Greengenes). Blast will also take quite a long time to run, so I would recommend using SUMAclust to pick your fungal OTUs de novo, or you can use the default (uclust). You can certainly use both approaches (SUMAclust or uclust, and blast) and compare the outputs that you get.
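As a rough illustration of "specifying the reference and taxonomy files", the sketch below only assembles an assign_taxonomy.py command line with explicit -r and -t arguments and absolute paths; it does not run QIIME. The file names (rep_set1.fna aside, which appears earlier in the thread) and the helper name build_assign_taxonomy_argv are placeholders chosen for this example, not names from the UNITE release.

```python
import os
import shlex


def build_assign_taxonomy_argv(rep_set_fp, reference_seqs_fp,
                               id_to_taxonomy_fp, output_dir, method="blast"):
    """Return an argv list that names the reference files explicitly.

    -i is the representative set to classify, -r the reference sequences,
    -t the ID-to-taxonomy map, -m the assignment method, -o the output dir.
    """
    return [
        "assign_taxonomy.py",
        "-i", os.path.abspath(rep_set_fp),
        "-r", os.path.abspath(reference_seqs_fp),
        "-t", os.path.abspath(id_to_taxonomy_fp),
        "-m", method,
        "-o", os.path.abspath(output_dir),
    ]


# Hypothetical file names, used only to show the shape of the command.
argv = build_assign_taxonomy_argv(
    "rep_set1.fna", "unite_refs.fasta", "unite_taxonomy.txt",
    "blast_assigned_taxonomy")
print(" ".join(shlex.quote(arg) for arg in argv))
```

Printing the command before running it makes it easy to confirm that -i points at your own representative sequences and -r at the downloaded reference set, since the two are easy to swap.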

To further troubleshoot why you are getting the error when you try to assign taxonomy with the default, run print_qiime_config.py and send us your output.

Thanks!

Embriette

Roody_UF

Sep 9, 2016, 12:50:26 PM
to Qiime 1 Forum
Thank you Embriette! 
This is my output from print_qiime_config.py:
System information
==================
         Platform:      linux2
   Python version:      2.7.6 (default, Jun 22 2015, 17:58:13)  [GCC 4.8.2]
Python executable:      /usr/bin/python

QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:

Dependency versions
===================
          QIIME library version:        1.9.1
           QIIME script version:        1.9.1
qiime-default-reference version:        0.1.3
                  NumPy version:        1.8.2
                  SciPy version:        0.13.3
                 pandas version:        0.17.1
             matplotlib version:        1.3.1
            biom-format version:        2.1.4
                   h5py version:        2.5.0 (HDF5 version: 1.8.11)
                   qcli version:        0.1.0
                   pyqi version:        0.3.2
             scikit-bio version:        0.2.3
                 PyNAST version:        1.2.2
                Emperor version:        0.9.51
                burrito version:        0.9.1
       burrito-fillings version:        0.1.1
              sortmerna version:        SortMeRNA version 2.0, 29/11/2014
              sumaclust version:        SUMACLUST Version 1.0.00
                  swarm version:        Swarm 1.2.19 [May 25 2016 14:36:46]
                          gdata:        Installed.

QIIME config values
===================
For definitions of these settings and to learn how to configure QIIME, see here:

                     blastmat_dir:      /opt/qiime_deps/blast-2.2.22-release/data
                  cluster_jobs_fp:      None
      pick_otus_reference_seqs_fp:      /usr/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                    jobs_to_start:      1
pynast_template_alignment_blastdb:      None
                qiime_scripts_dir:      /usr/lib/qiime/bin/
                      working_dir:      .
     pynast_template_alignment_fp:      /usr/share/qiime/data/core_set_aligned.fasta.imputed
                    python_exe_fp:      python
                         temp_dir:      /tmp/
assign_taxonomy_reference_seqs_fp:      # /usr/share/qiime/data/gg_13_8_otus/rep_set/97_otus.fasta
                      blastall_fp:      /opt/qiime_deps/blast-2.2.22-release/bin/blastall
                 seconds_to_sleep:      60
assign_taxonomy_id_to_taxonomy_fp:      # /usr/share/qiime/data/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt


Here is the output for print_qiime_config.py -tf:

dyrdahlyoung@if-plp-haustorium2:~/Itu$ print_qiime_config.py -tf

System information
==================
         Platform:      linux2
   Python version:      2.7.6 (default, Jun 22 2015, 17:58:13)  [GCC 4.8.2]
Python executable:      /usr/bin/python

QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:

Dependency versions
===================
                QIIME library version:  1.9.1
                 QIIME script version:  1.9.1
      qiime-default-reference version:  0.1.3
                        NumPy version:  1.8.2
                        SciPy version:  0.13.3
                       pandas version:  0.17.1
                   matplotlib version:  1.3.1
                  biom-format version:  2.1.4
                         h5py version:  2.5.0 (HDF5 version: 1.8.11)
                         qcli version:  0.1.0
                         pyqi version:  0.3.2
                   scikit-bio version:  0.2.3
                       PyNAST version:  1.2.2
                      Emperor version:  0.9.51
                      burrito version:  0.9.1
             burrito-fillings version:  0.1.1
                    sortmerna version:  SortMeRNA version 2.0, 29/11/2014
                    sumaclust version:  SUMACLUST Version 1.0.00
                        swarm version:  Swarm 1.2.19 [May 25 2016 14:36:46]
                                gdata:  Installed.
RDP Classifier version (if installed):  rdp_classifier-2.2.jar
          Java version (if installed):  1.8.0_66

QIIME config values
===================
For definitions of these settings and to learn how to configure QIIME, see here:

                     blastmat_dir:      /opt/qiime_deps/blast-2.2.22-release/data
                  cluster_jobs_fp:      None
      pick_otus_reference_seqs_fp:      /usr/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                    jobs_to_start:      1
pynast_template_alignment_blastdb:      None
                qiime_scripts_dir:      /usr/lib/qiime/bin/
                      working_dir:      .
     pynast_template_alignment_fp:      /usr/share/qiime/data/core_set_aligned.fasta.imputed
                    python_exe_fp:      python
                         temp_dir:      /tmp/
assign_taxonomy_reference_seqs_fp:      # /usr/share/qiime/data/gg_13_8_otus/rep_set/97_otus.fasta
                      blastall_fp:      /opt/qiime_deps/blast-2.2.22-release/bin/blastall
                 seconds_to_sleep:      60
assign_taxonomy_id_to_taxonomy_fp:      # /usr/share/qiime/data/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt

QIIME full install test results
===============================
...........................
----------------------------------------------------------------------
Ran 27 tests in 0.064s

OK

Embriette

Sep 9, 2016, 1:05:36 PM
to Qiime 1 Forum
Hi there,

I see the problem! There is a '#' character in front of the filepath for both assign_taxonomy_reference_seqs_fp and assign_taxonomy_id_to_taxonomy_fp. You'll need to open your QIIME config file and remove those characters. In that position they do not comment out the filepath; the '#' becomes part of the value, so the system looks for a filepath that does not exist. When you remove them, the default behavior will work appropriately again.
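To make the point about the '#' concrete, here is a minimal sketch of a key/value config reader in which only a '#' at the start of a line marks a comment. It is a simplified stand-in written for this thread, not QIIME's own parser, and the example lines are assumed config contents (the actual file was not posted).

```python
def parse_config_lines(lines):
    """Parse 'key value' lines; only lines that START with '#' are comments."""
    settings = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Everything after the key is kept as the value, stray '#' included.
        settings[fields[0]] = " ".join(fields[1:]) or None
    return settings


def suspicious_values(settings):
    """Return keys whose value begins with '#' (a marker placed after the key)."""
    return sorted(key for key, value in settings.items()
                  if value and value.startswith("#"))


# Assumed example lines, modeled on the values printed earlier in the thread.
example = [
    "# a real comment line",
    "temp_dir /tmp/",
    "assign_taxonomy_id_to_taxonomy_fp # /usr/share/qiime/data/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt",
    "cluster_jobs_fp",
]
settings = parse_config_lines(example)
print(repr(settings["assign_taxonomy_id_to_taxonomy_fp"]))
print(suspicious_values(settings))
```

Under these assumptions the value keeps its leading "# ", which matches the path quoted in the error message.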

Thanks!

Embriette


Roody_UF

Sep 12, 2016, 4:40:13 PM
to Qiime 1 Forum
Hello again!

I removed the "#" from the QIIME config file, but I am still seeing the same "#" in the output of print_qiime_config.py. In any case, I don't want to use the Greengenes file.

I downloaded the UNITE rep/refseqs and am trying to assign taxonomy with them using the -r and -t options. The whole command I am using is:

assign_taxonomy.py -i sh_refs_qiime_ver7_99_22.08.2016.fasta -r UNITE_rep_set1.fna -t sh_taxonomy_qiime_ver7_99_22.08.2016.txt

I am still getting the same error: it looks for the Greengenes taxonomy file, which it cannot find. What am I missing from my command such that it does not accept the --id_to_taxonomy_fp file I pass with -t?

I am not so much worried about the "#" as I am about getting it to read from the UNITE database rather than the gg/default behavior.

Thanks so much!
-Roody



Embriette

Sep 13, 2016, 1:21:33 PM
to Qiime 1 Forum
Hi Roody,

Do you have more than one copy of your config file? If the '#' character is still there, that indicates that the config file you edited is not the one that QIIME is actually using.
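One way to check for multiple copies is to scan every candidate config file for the setting in question. The sketch below assumes the per-user locations are the file named by the QIIME_CONFIG_FP environment variable and ~/.qiime_config; a packaged install may also ship a default config somewhere else, which you would add to the candidate list yourself. The function names are made up for this example, and the demonstration runs on throwaway files rather than a real install.

```python
import os
import tempfile

KEY = "assign_taxonomy_id_to_taxonomy_fp"


def find_config_copies(key, candidates):
    """Return (path, value) for every existing candidate file that sets key."""
    found = []
    for path in candidates:
        if not path or not os.path.isfile(path):
            continue
        with open(path) as handle:
            for line in handle:
                fields = line.strip().split()
                if fields and fields[0] == key:
                    found.append((path, " ".join(fields[1:])))
    return found


def default_candidates(environ=None):
    """Candidate locations assumed here: $QIIME_CONFIG_FP and ~/.qiime_config."""
    environ = os.environ if environ is None else environ
    home = environ.get("HOME")
    return [environ.get("QIIME_CONFIG_FP"),
            os.path.join(home, ".qiime_config") if home else None]


# Demonstration on temporary files: one copy was edited, the other was not.
with tempfile.TemporaryDirectory() as tmp:
    edited = os.path.join(tmp, "edited_config")
    in_use = os.path.join(tmp, "in_use_config")
    with open(edited, "w") as handle:
        handle.write(KEY + " /data/taxonomy.txt\n")
    with open(in_use, "w") as handle:
        handle.write(KEY + " # /data/taxonomy.txt\n")
    demo = find_config_copies(KEY, [edited, in_use, None])

for path, value in demo:
    print(os.path.basename(path), "->", repr(value))
```

On a real system you would call find_config_copies(KEY, default_candidates() + [extra paths]) and look for the copy whose value still starts with '#'.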

Regarding your assign taxonomy error, try typing in the full paths for your files.
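A possible reason that passing -t did not change the error: the traceback in the first post goes through optparse's get_default_values, which validates string defaults before any command-line arguments are processed. The sketch below reproduces that behavior with the standard library only. FileOption and parse are hypothetical names, and check_existing_filepath is a stand-in written for this example rather than qcli's actual code.

```python
import os
from copy import copy
from optparse import Option, OptionParser, OptionValueError


def check_existing_filepath(option, opt, value):
    """Type checker: reject values that do not name an existing path."""
    if not os.path.exists(value):
        raise OptionValueError(
            "option %s: file does not exist: %r" % (opt, value))
    return value


class FileOption(Option):
    """Option subclass that registers the custom 'existing_filepath' type."""
    TYPES = Option.TYPES + ("existing_filepath",)
    TYPE_CHECKER = copy(Option.TYPE_CHECKER)
    TYPE_CHECKER["existing_filepath"] = check_existing_filepath


def parse(argv, default):
    """Parse argv with a -t option whose default comes from a config value."""
    parser = OptionParser(option_class=FileOption)
    parser.add_option("-t", "--id_to_taxonomy_fp",
                      type="existing_filepath", default=default)
    return parser.parse_args(argv)


# A valid -t is supplied, but the default (as if read from a config file
# with a stray '#') is still checked first and fails.
bad_default = "# /no/such/taxonomy.txt"
try:
    parse(["-t", os.devnull], bad_default)
    outcome = "parsed"
except OptionValueError as err:
    outcome = str(err)
print(outcome)

# With no broken default, the same -t argument is accepted.
opts, _ = parse(["-t", os.devnull], None)
print(opts.id_to_taxonomy_fp)
```

If QIIME behaves the same way here, the command cannot succeed until the config value itself is fixed (or removed), regardless of what is passed with -t.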

Thanks!

Embriette