Creating a new Qiime database & troubleshooting

Ang Angelova

Apr 29, 2016, 5:17:50 AM
to qiime...@googlegroups.com
Hi,

I am trying to create my own reference database for MacQIIME's "assign_taxonomy.py".

I have downloaded the fasta sequences I want from NCBI. My .fasta file looks like this (I am using the organism name as the ID):
>Actinopolyspora_biskrensis
CCATGGGTCTCAGGACGGAACGCTGACGGGCGCGCTTCACACATGCAAGTCGAACGCTCGCACCCCGTGT...
>Streptomyces_panaciradicis
GGCTCAGGACGAACGCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGAACCACTTCGGTGGGGATT...

Then I created a taxonomy file that looks like this (I am using the same organism name as the ID; I hope that's OK):
Actinopolyspora_biskrensis celullar organisms; Bacteria; Terrabacteria group; Actinobacteria; Actinobacteria; Actinopolysporales; Actinopolysporaceae; Actinopolyspora; Actinopolyspora_biskrensis
Streptomyces_panaciradicis celullar organisms; Bacteria; Terrabacteria group; Actinobacteria; Actinobacteria; Streptomycetales; Streptomycetaceae; Streptomyces; Streptomyces_panaciradicis

I am trying to run taxonomy now but I get an error:
 

Traceback (most recent call last):
  File "/macqiime/anaconda/bin/assign_taxonomy.py", line 417, in <module>
    main()
  File "/macqiime/anaconda/bin/assign_taxonomy.py", line 386, in main
    taxon_assigner = taxon_assigner_constructor(params)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 1234, in __init__
    self.id_to_taxonomy = self._parse_id_to_taxonomy_file(id_to_taxonomy_f)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 117, in _parse_id_to_taxonomy_file
    identifier, taxonomy = map(strip, line.split('\t'))
ValueError: need more than 1 value to unpack


Frankly, some of my sequences may appear twice in my fasta and taxonomy files. Also, my IDs are organism names. Is either of those a problem?


What does this error mean? What do I need to change? 


Angelina

TonyWalters

Apr 29, 2016, 6:22:57 AM
to Qiime 1 Forum
Hello Angelina,

The taxonomy mapping file must have a tab between the ID and the taxonomy string (but not between the individual taxonomy levels). With <TAB> standing in for a literal tab character:
Actinopolyspora_biskrensis<TAB>celullar organisms; Bacteria; ...

See: http://qiime.org/documentation/file_formats.html#id-to-taxonomy-map
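
For context, the traceback shows the parser calling line.split('\t') and unpacking the result into two names; a line with no tab yields a single item, hence "need more than 1 value to unpack". Below is a minimal sketch of one way to repair a space-separated file. It is not part of QIIME, the function names are hypothetical, and it assumes the IDs themselves contain no whitespace (true for underscore-joined names like the ones above).

```python
def split_id_and_taxonomy(line):
    """Split one line at its first run of whitespace into (id, taxonomy).

    Assumes the ID contains no spaces or tabs.
    """
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        raise ValueError("no taxonomy found on line: %r" % line)
    return parts[0], parts[1]


def to_tab_separated(lines):
    """Return the lines rewritten as 'ID<tab>taxonomy', skipping blank lines."""
    fixed = []
    for line in lines:
        if not line.strip():
            continue  # ignore empty lines
        seq_id, taxonomy = split_id_and_taxonomy(line)
        fixed.append(seq_id + "\t" + taxonomy)
    return fixed
```

To use it, read the existing taxonomy file into a list of lines, pass that list to to_tab_separated, and write the result to a new file with one entry per line. Lines that already contain a tab pass through unchanged, since the tab is the first whitespace after the ID.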

Also, this page has a guide for building alternative training databases for taxonomic assignment (strictest for RDP-based assignments, less strict for uclust- or blast-based assignments): 