I am trying to build my own reference database for MacQIIME's "assign_taxonomy.py".
I have downloaded the FASTA sequences I want from NCBI. My .fasta file looks like this (I am using the organism name as the ID):
>Actinopolyspora_biskrensis
CCATGGGTCTCAGGACGGAACGCTGACGGGCGCGCTTCACACATGCAAGTCGAACGCTCGCACCCCGTGT...
>Streptomyces_panaciradicis
GGCTCAGGACGAACGCTGGCGGCGTGCTTAACACATGCAAGTCGAACGATGAACCACTTCGGTGGGGATT...
Then I created a taxonomy file that looks like this (I am using the same organism names as IDs; I hope that's OK):
Actinopolyspora_biskrensis celullar organisms; Bacteria; Terrabacteria group; Actinobacteria; Actinobacteria; Actinopolysporales; Actinopolysporaceae; Actinopolyspora; Actinopolyspora_biskrensis
Streptomyces_panaciradicis celullar organisms; Bacteria; Terrabacteria group; Actinobacteria; Actinobacteria; Streptomycetales; Streptomycetaceae; Streptomyces; Streptomyces_panaciradicis
When I run assign_taxonomy.py with these files, I get the following error:
Traceback (most recent call last):
  File "/macqiime/anaconda/bin/assign_taxonomy.py", line 417, in <module>
    main()
  File "/macqiime/anaconda/bin/assign_taxonomy.py", line 386, in main
    taxon_assigner = taxon_assigner_constructor(params)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 1234, in __init__
    self.id_to_taxonomy = self._parse_id_to_taxonomy_file(id_to_taxonomy_f)
  File "/macqiime/anaconda/lib/python2.7/site-packages/qiime/assign_taxonomy.py", line 117, in _parse_id_to_taxonomy_file
    identifier, taxonomy = map(strip, line.split('\t'))
ValueError: need more than 1 value to unpack
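Judging from the last frame of the traceback, each line of the taxonomy file is split on a tab character and unpacked into exactly two values, so a line that contains no tab (for example, one where the ID and the taxonomy are separated by a space) would fail in this way. Below is a minimal sketch of a check for that, assuming Python 3. The function name check_taxonomy_lines and the shortened sample lines are made up for illustration and are not part of QIIME. It also reports IDs that occur more than once, without assuming whether QIIME accepts them.

```python
from collections import Counter


def check_taxonomy_lines(lines):
    """Return (bad_line_numbers, duplicated_ids) for id-to-taxonomy lines.

    A line is "bad" when splitting it on a tab does not give exactly two
    fields, which is what the unpacking in the traceback requires.
    Hypothetical helper for illustration only; not part of QIIME.
    """
    bad_line_numbers = []
    id_counts = Counter()
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            # No tab (including a blank line) or more than one tab.
            bad_line_numbers.append(number)
            continue
        id_counts[fields[0].strip()] += 1
    duplicated_ids = sorted(name for name, count in id_counts.items() if count > 1)
    return bad_line_numbers, duplicated_ids


# Made-up sample: line 2 uses a space instead of a tab, and the first ID repeats.
sample = [
    "Actinopolyspora_biskrensis\tBacteria; Actinobacteria\n",
    "Streptomyces_panaciradicis Bacteria; Actinobacteria\n",
    "Actinopolyspora_biskrensis\tBacteria; Actinobacteria\n",
]
bad_lines, duplicates = check_taxonomy_lines(sample)
print(bad_lines)   # [2]
print(duplicates)  # ['Actinopolyspora_biskrensis']
```

To run it against a real file, the lines can be read with open(path) and passed to the function; any line number it reports is a candidate for the line that triggers the ValueError.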
Some of my sequences may appear twice in my FASTA and taxonomy files, and my IDs are organism names. Is either of those a problem?
What does this error mean? What do I need to change?
Angelina