biom.exception.TableException: Duplicate sample IDs!

James Doonan

Nov 11, 2015, 8:17:46 PM
to Qiime 1 Forum
Hi,

I'm trying to build a heatmap with the make_otu_heatmap.py script. I'm using a BIOM file from MG-RAST and a tree file I generated with neighbor_joining.py. When I run this command:

make_otu_heatmap.py -i table_from_mgrast.biom -o heatmap_tree.pdf -m metadata.txt -t beta_div_cluster.tre/nj_euclidean_table.tre

I get the following error:

Traceback (most recent call last):
  File "/usr/local/bin/make_otu_heatmap.py", line 254, in <module>
    main()
  File "/usr/local/bin/make_otu_heatmap.py", line 242, in main
    otu_table = otu_table.sort_order(otu_id_order, axis='observation')
  File "/usr/lib/python2.7/dist-packages/biom/table.py", line 1755, in sort_order
    self.type)
  File "/usr/lib/python2.7/dist-packages/biom/table.py", line 328, in __init__
    errcheck(self)
  File "/usr/lib/python2.7/dist-packages/biom/err.py", line 472, in errcheck
    raise ret
biom.exception.TableException: Duplicate sample IDs!

I'm not sure how to fix this. I've attached the input files. Any suggestions would be greatly appreciated.

Thanks!

James
table_from_mgrast.biom
metadata.txt
nj_euclidean_table.tre

Kyle Bittinger

Nov 13, 2015, 12:20:57 PM
to Qiime 1 Forum
James, would you send the output from print_qiime_config.py?

Also, are you able to execute this command without an error? (It works on my machine.)
biom convert -i table_from_mgrast.biom -o table_from_mgrast.txt --to-tsv

James Doonan

Nov 13, 2015, 4:57:10 PM
to Qiime 1 Forum
Hi Kyle,

System information
==================
         Platform:    linux2
   Python version:    2.7.6 (default, Jun 22 2015, 17:58:13)  [GCC 4.8.2]
Python executable:    /usr/bin/python

QIIME default reference information
===================================
For details on what files are used as QIIME's default references, see here:
 https://github.com/biocore/qiime-default-reference/releases/tag/0.1.3

Dependency versions
===================
          QIIME library version:    1.9.1
           QIIME script version:    1.9.1
qiime-default-reference version:    0.1.3
                  NumPy version:    1.10.1
                  SciPy version:    0.16.1
                 pandas version:    0.13.1
             matplotlib version:    1.3.1
            biom-format version:    2.1.4
                   h5py version:    2.2.1 (HDF5 version: 1.8.11)
                   qcli version:    0.1.1
                   pyqi version:    0.3.2
             scikit-bio version:    0.2.3
                 PyNAST version:    1.2.2
                Emperor version:    0.9.51
                burrito version:    0.9.1
       burrito-fillings version:    0.1.1
              sortmerna version:    SortMeRNA version 2.0, 29/11/2014
              sumaclust version:    SUMACLUST Version 1.0.00
                  swarm version:    Swarm 1.2.19 [Nov 10 2015 22:36:35]
                          gdata:    Installed.

QIIME config values
===================
For definitions of these settings and to learn how to configure QIIME, see here:
 http://qiime.org/install/qiime_config.html
 http://qiime.org/tutorials/parallel_qiime.html

                     blastmat_dir:    None
      pick_otus_reference_seqs_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                         sc_queue:    all.q
      topiaryexplorer_project_dir:    None
     pynast_template_alignment_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta
                  cluster_jobs_fp:    start_parallel_jobs.py
pynast_template_alignment_blastdb:    None
assign_taxonomy_reference_seqs_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta
                     torque_queue:    friendlyq
                    jobs_to_start:    1
                       slurm_time:    None
            denoiser_min_per_core:    50
assign_taxonomy_id_to_taxonomy_fp:    /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt
                         temp_dir:    /tmp/
                     slurm_memory:    None
                      slurm_queue:    None
                      blastall_fp:    blastall
                 seconds_to_sleep:    1

The biom convert command works fine on my machine. Here are the first few lines of the output:

# Constructed from biom file
#OTU ID 4623128.3       4623127.3       4623125.3       4623123.3       4623121.3       4623119.3       4623117.3       4623115.3
387432  210.0   7644.0  1539.0  308.0   355.0   4879.0  221.0   8001.0
387433  211.0   7657.0  1549.0  311.0   359.0   4900.0  225.0   8017.0
326366  3936.0  639.0   1369.0  457.0   1523.0  5006.0  14.0    138.0
326367  3019.0  501.0   1082.0  329.0   1165.0  3819.0  9.0     94.0

Thanks,

James

Kyle Bittinger

Nov 16, 2015, 12:24:38 PM
to Qiime 1 Forum
That's not the text output I get when I process the file you posted.  My output looks like this:

# Constructed from biom file
#OTU ID RMS     RLS     RES     AMS     AMA     ALS     AH      AES
387432  210.0   7644.0  1539.0  308.0   355.0   4879.0  221.0   8001.0
387433  211.0   7657.0  1549.0  311.0   359.0   4900.0  225.0   8017.0
326366  3936.0  639.0   1369.0  457.0   1523.0  5006.0  14.0    138.0
326367  3019.0  501.0   1082.0  329.0   1165.0  3819.0  9.0     94.0

The sample IDs in my output match those in your sample metadata file. Did you paste the wrong output?
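
Since the error message complains about duplicate sample IDs, one quick sanity check is to pull the sample IDs out of the `biom convert --to-tsv` output and confirm that none repeat and that they agree with the metadata file. The following is only a minimal sketch in plain Python, not part of QIIME or biom-format. The helper names `sample_ids_from_tsv` and `duplicated` are hypothetical, the header row is copied from the output above, and the metadata ID list is an assumption (the contents of metadata.txt are not shown in this thread).

```python
from collections import Counter


def sample_ids_from_tsv(lines):
    """Return the sample IDs from the '#OTU ID' header row of a
    tab-separated table written by `biom convert --to-tsv`."""
    for line in lines:
        if line.startswith("#OTU ID"):
            return line.rstrip("\n").split("\t")[1:]
    raise ValueError("no '#OTU ID' header row found")


def duplicated(ids):
    """Return a sorted list of the IDs that occur more than once."""
    return sorted(i for i, n in Counter(ids).items() if n > 1)


# Header and first data row as quoted above (tab-separated).
tsv_lines = [
    "# Constructed from biom file\n",
    "#OTU ID\tRMS\tRLS\tRES\tAMS\tAMA\tALS\tAH\tAES\n",
    "387432\t210.0\t7644.0\t1539.0\t308.0\t355.0\t4879.0\t221.0\t8001.0\n",
]

# Assumption: the metadata file lists these same eight sample IDs.
metadata_ids = ["RMS", "RLS", "RES", "AMS", "AMA", "ALS", "AH", "AES"]

table_ids = sample_ids_from_tsv(tsv_lines)
print(duplicated(table_ids))                  # IDs repeated in the table
print(set(table_ids) ^ set(metadata_ids))     # IDs present in only one file
```

If both results come back empty, the sample IDs themselves are probably not the cause of the exception.
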
Best,
Kyle

Abigail Armstrong

Nov 17, 2015, 1:39:50 PM
to Qiime 1 Forum
I am also having the same problem. I have tried multiple OTU tables with edited sample names to ensure that they were unique, and I get the same error. I can send the files I am working with as well, or we can focus on James's data to try to fix the problem. I've looked into a few things but have yet to find any promising leads as to the core of the problem.
Thanks,
Abigail

James Doonan

Nov 17, 2015, 2:25:58 PM
to Qiime 1 Forum
Hi Kyle/Abigail,

My output is the same as what Kyle posted. Sorry for confusing matters: I changed the column names a few times to try to get past the duplicate sample ID error. I still get the same result even if I change the sample IDs.

Thanks,
James

Jenya Kopylov

Nov 17, 2015, 3:52:54 PM
to Qiime 1 Forum
Hi James,

It looks like your OTU tree is actually a sample tree (the tips are sample names).
This file is causing the error you're seeing because none of the OTU table observation IDs match the tree tips.
If you intended to pass the sample tree, you should use the parameter --sample_tree instead of --otu_tree (the -t option in your command).
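
One way to confirm this kind of mismatch is to compare the tip names in the tree with the IDs on each axis of the table. The following is a minimal sketch in plain Python, not part of QIIME. The helper names `newick_tips` and `classify_tree` are hypothetical, the Newick parsing is a simple regular expression that assumes unquoted tip names, and the tree string is invented for illustration (it is not the attached nj_euclidean_table.tre). Only the sample and OTU IDs are taken from the output quoted earlier in the thread.

```python
import re


def newick_tips(newick):
    """Return the set of tip names in a Newick string.

    Assumes unquoted names. A tip name is a label that directly follows
    '(' or ','; internal node labels follow ')' and are ignored.
    """
    return set(re.findall(r"[(,]\s*([^\s(),:;]+)", newick))


def classify_tree(tips, otu_ids, sample_ids):
    """Report which table axis the tree tips overlap with."""
    otu_hits = len(tips & set(otu_ids))
    sample_hits = len(tips & set(sample_ids))
    if otu_hits and not sample_hits:
        return "otu tree"
    if sample_hits and not otu_hits:
        return "sample tree"
    if not otu_hits and not sample_hits:
        return "no match"
    return "ambiguous"


# IDs copied from the biom convert output quoted earlier in the thread.
sample_ids = ["RMS", "RLS", "RES", "AMS", "AMA", "ALS", "AH", "AES"]
otu_ids = ["387432", "387433", "326366", "326367"]

# Invented tree, for illustration only.
example_tree = "((RMS:0.1,RLS:0.2):0.05,(RES:0.3,AMS:0.1):0.2,AH:0.4);"

tips = newick_tips(example_tree)
print(classify_tree(tips, otu_ids, sample_ids))  # sample tree
```

A result of "sample tree" suggests the file belongs with --sample_tree (the short form appears to be -s; check `make_otu_heatmap.py -h` to confirm), while "no match" suggests the tree and table come from different analyses.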

Hopefully this resolves your issue,
Jenya


Abigail Armstrong

Nov 23, 2015, 12:57:52 PM
to Qiime 1 Forum
I think I may have been having a similar problem -- mismatched tree and OTU table. I had been using a summarized OTU table (resulting output from summarize_taxa.py) and the rep_set.tre from picking OTUs. I was able to remedy the problem by either dropping the tree when running on a summarized OTU table or using the original OTU table with the rep_set tree.

On a similar note, is there a way to get a representative sequence set for OTUs that have been summarized at various taxonomic levels so that I can generate new trees?

Thanks!
Abigail

Jenya Kopylov

Nov 23, 2015, 4:08:55 PM
to Qiime 1 Forum
Hi Abigail,

You can try to generate a FASTA file with representative OTUs for your summarized taxa using filter_taxa_from_otu_table.py (use summarized taxonomies) and then filter_fasta.py (input the representative OTU file).
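
To make the second step concrete: conceptually, it keeps only the representative sequences whose OTU IDs remain in the filtered table. The following is a minimal sketch of that idea in plain Python, not QIIME's implementation of filter_fasta.py. The helpers `read_fasta`, `seq_id`, and `filter_fasta_by_ids` are hypothetical, the sequences and header suffixes are made up, and it assumes the OTU ID is the first whitespace-separated token of each FASTA header.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def seq_id(header):
    """Return the first whitespace-separated token of a header.

    Assumption: in a representative set this token is the OTU ID.
    """
    parts = header.split()
    return parts[0] if parts else ""


def filter_fasta_by_ids(lines, keep_ids):
    """Return the (header, sequence) records whose ID is in keep_ids."""
    keep = set(keep_ids)
    return [(h, s) for h, s in read_fasta(lines) if seq_id(h) in keep]


# Made-up representative set; only the OTU IDs come from this thread.
rep_set = [
    ">387432 sampleA_1", "ACGT", "ACGT",
    ">387433 sampleA_2", "GGCC",
    ">326366 sampleB_7", "TTAA",
]

# OTU IDs that would remain after filtering the table by taxonomy.
kept = filter_fasta_by_ids(rep_set, ["387432", "326366"])
for header, seq in kept:
    print(">" + header)
    print(seq)
```

The filtered FASTA could then be aligned and used to build a new tree whose tips match the filtered OTU table.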

Jenya

