Problem using biom convert with RDP taxonomy


LeAnna Cates

Jun 15, 2016, 3:07:59 PM
to Qiime 1 Forum
Hello qiime users,

I am running into a problem converting my .biom file into a tab-separated .txt file for downstream analysis in R. After some troubleshooting, we believe the problem is with how the taxonomy output of the RDP classifier is formatted; it may not be compatible with make_otu_table.py.

First off, I am running RDP classification using the RDP command line program, rather than using Qiime. I am doing this because my system administrator is running a lite install of qiime, so running RDP through qiime is challenging. Here is a sample line of what my taxonomy.txt file looks like after running the RDP classifier command line version using the rep_set.fna file generated by USEARCH:


SH000162.07FU_KF359564_reps Root rootrank 1.0 Fungi domain 1.0 Ascomycota phylum 1.0 Leotiomycetes class 0.99 Helotiales order 0.99 Helotiales_unidentified family 0.99 Helotiales_unidentified_1 genus 0.99 Helotiales_sp|SH204753.06FU species 0.94



I then combine this file with my final_otu_map.txt output from usearch to generate my biom file using the make_otu_table.py command. Here is a sample line of my final_otu_map.txt file and the command I use to generate the .biom file:

first two lines of final_otu_map.txt

SH031751.07FU_HQ846986_refs s101_76732

SH005194.07FU_AY487091_refs s101_48921 s101_36213 s101_62634


command to generate .biom file
make_otu_table.py -i final_otu_map.txt -t rdp_taxonomy.txt -o output.biom

I then convert this .biom file to a tsv so I can run downstream analysis in R. To do this I use the biom convert command as follows:
biom convert -i output.biom -o output_tsv.txt --to-tsv --table-type="OTU table" --header-key taxonomy --output-metadata-id "ConsensusLineage"

Every time I run this I get an error message that ends in "TypeError" (full error message and output of print_qiime_config.py pasted below). I assume this is a problem with how the RDP taxonomy output is formatted, because the conversion works fine if I assign taxonomy using assign_taxonomy.py and use that taxonomy output file instead.

Question:
How can I go about converting the RDP command line taxonomy output so that it will play nice with my usearch output and generate a usable tab separated .txt file for downstream analysis?


Full error message:

Traceback (most recent call last):
  File "/share/pkg/qiime/1.9.0/install/bin/pyqi", line 184, in <module>
    optparse_main(cmd_obj, argv[1:])
  File "/share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/pyqi/core/interfaces/optparse/__init__.py", line 275, in optparse_main
    result = optparse_cmd(local_argv[1:])
  File "/share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/pyqi/core/interface.py", line 39, in __call__
    cmd_result = self.CmdInstance(**cmd_input)
  File "/share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/pyqi/core/command.py", line 137, in __call__
    result = self.run(**kwargs)
  File "/share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/biom/commands/table_converter.py", line 198, in run
    metadata_formatter=obs_md_fmt_f)
  File "/share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/biom/table.py", line 4027, in to_tsv
    observation_column_name)
  File "/share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/biom/table.py", line 1298, in delimited_self
    md_out = metadata_formatter(md.get(header_key, None))
  File "/share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/biom/commands/table_converter.py", line 44, in <lambda>
    'sc_separated': lambda x: '; '.join(x),
TypeError
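
(Editorial sketch, not part of the original post.) The last frame of the traceback is a formatter that calls '; '.join(x) on whatever md.get(header_key, None) returned. The short Python snippet below only illustrates general str.join behaviour: it raises TypeError when given None or a list containing a non-string. The function name sc_separated is borrowed from the dictionary key in the traceback, and the sample values are made up; which case (if either) applies to this table cannot be determined from the traceback alone.

```python
# Stand-in for the lambda shown in the last traceback frame.
def sc_separated(x):
    return "; ".join(x)


# A list of strings joins without trouble.
print(sc_separated(["Fungi", "Ascomycota"]))

# Hypothetical problem values: missing metadata (None), or a lineage
# list that contains a non-string entry.
for bad_value in (None, ["Fungi", None]):
    try:
        sc_separated(bad_value)
    except TypeError as err:
        print("TypeError for %r: %s" % (bad_value, err))
```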



output of print_qiime_config.py

System information
==================
         Platform: linux2
   Python version: 2.7.7 (default, Jun  9 2014, 08:40:25)  [GCC 4.4.7 20120313 (Red Hat 4.4.7-3)]
Python executable: /share/pkg/python/2.7.7/install/bin/python

QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:
 https://github.com/biocore/qiime-default-reference/releases/tag/0.1.2

Dependency versions
===================
          QIIME library version: 1.9.0
           QIIME script version: 1.9.0
qiime-default-reference version: 0.1.2
                  NumPy version: 1.9.2
                  SciPy version: 0.14.0
                 pandas version: 0.13.1
             matplotlib version: 1.3.1
            biom-format version: 2.1.4
                   h5py version: 2.3.0 (HDF5 version: 1.8.5)
                   qcli version: 0.1.1
                   pyqi version: 0.3.2
             scikit-bio version: 0.2.3
                 PyNAST version: 1.2.2
                Emperor version: 0.9.51
                burrito version: 0.9.0
       burrito-fillings version: Installed.
              sortmerna version: SortMeRNA version 2.0, 29/11/2014
              sumaclust version: SUMACLUST Version 1.0.00
                  swarm version: Swarm 1.2.19 [May  8 2015 13:33:21]
                          gdata: Installed.

QIIME config values
===================
For definitions of these settings and to learn how to configure QIIME, see here:
 http://qiime.org/install/qiime_config.html
 http://qiime.org/tutorials/parallel_qiime.html

                     blastmat_dir: /share/pkg/blast/2.2.26/install/data/
      pick_otus_reference_seqs_fp: /share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                         sc_queue: all.q
      topiaryexplorer_project_dir: None
     pynast_template_alignment_fp: /share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta
                  cluster_jobs_fp: /share/pkg/qiime/1.9.0/install/bin/start_parallel_jobs_sc.py
pynast_template_alignment_blastdb: None
assign_taxonomy_reference_seqs_fp: /share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                     torque_queue: friendlyq
                    jobs_to_start: 1
            denoiser_min_per_core: 50
assign_taxonomy_id_to_taxonomy_fp: /share/pkg/qiime/1.9.0/install/lib/python2.7/site-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
                         temp_dir: /scratch
                     slurm_memory: None
                      slurm_queue: None
                      blastall_fp: /share/pkg/blast/2.2.26/install/bin/blastall
                 seconds_to_sleep: 1


Daniel McDonald

Jun 16, 2016, 12:34:25 PM
to Qiime 1 Forum
Hi LeAnna,

Would you be willing to share a snippet (or the full file) of the generated RDP taxonomy either directly with me via my email or with the forum? As you noted, the issue is that make_otu_table.py is not interpreting the format properly. I can play around with it and provide a means to transform the output into something that can be readily interpreted. 

Best,
Daniel
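
(Editorial sketch, not part of the original thread and not Daniel's solution.) One possible way to transform the RDP command-line output is to rewrite each line into the three-column, tab-separated layout that assign_taxonomy.py produces: OTU ID, semicolon-separated lineage, confidence. The Python function below is a minimal sketch under these assumptions: each RDP line is the ID followed by (name, rank, confidence) triples as in the sample line above, taxon names contain no whitespace, and an optional "-" orientation column may follow the ID. The function name rdp_fixrank_to_qiime and the choice to report the confidence of the deepest rank are illustrative, not taken from QIIME or RDP.

```python
def rdp_fixrank_to_qiime(line):
    """Rewrite one RDP line as 'otu_id<TAB>lineage<TAB>confidence'."""
    fields = line.split()  # assumes taxon names contain no whitespace
    otu_id, rest = fields[0], fields[1:]

    # Assumed: RDP may emit "-" after the ID for reverse-oriented reads.
    if rest and rest[0] == "-":
        rest = rest[1:]
    if len(rest) % 3 != 0:
        raise ValueError("unexpected number of fields for %s" % otu_id)

    names = []
    confidence = 1.0
    for i in range(0, len(rest), 3):
        name, rank, conf = rest[i], rest[i + 1], float(rest[i + 2])
        if rank == "rootrank":
            continue  # drop the artificial "Root" level
        names.append(name)
        confidence = conf  # keep the confidence of the deepest rank

    lineage = "; ".join(names) if names else "Unassigned"
    return "%s\t%s\t%.2f" % (otu_id, lineage, confidence)


# Sample line copied from the first post in this thread.
sample = ("SH000162.07FU_KF359564_reps Root rootrank 1.0 Fungi domain 1.0 "
          "Ascomycota phylum 1.0 Leotiomycetes class 0.99 Helotiales order 0.99 "
          "Helotiales_unidentified family 0.99 Helotiales_unidentified_1 genus 0.99 "
          "Helotiales_sp|SH204753.06FU species 0.94")
print(rdp_fixrank_to_qiime(sample))
```

Applied to every line of the RDP file, this would give a file that could be passed to make_otu_table.py with -t. Because taxonomy is matched to OTUs by ID, it may also be worth checking that the IDs in the taxonomy file and in final_otu_map.txt match exactly; the sample lines above end in _reps and _refs respectively.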

