
Cameron

Sep 28, 2016, 6:35:19 AM
to Qiime 1 Forum
Hi Qiime users,

I recently ran de novo chimera detection with identify_chimeric_seqs.py, using vsearch (renamed as usearch61). It took 48 hours to complete, and in the end I got an error message along with three files:
XXX_consensus_with_abundance.fasta
XXX_consensus_with_abundance.uc
XXX_smallmem_clustered.log

The command I used was identify_chimeric_seqs.py -m usearch61 -i SASA_TMRU_FR_check_seqs.fna -o chimeric_seqs_97_USearch61 --suppress_usearch61_ref

I chose to suppress the use of a reference database because I was concerned that using, for example, my GreenGenes 16S reference database would effectively put me back in a closed-reference OTU picking strategy (which I have already tried, and it wipes out a huge proportion of my sequences). So firstly, I am finding it hard to find information on the pros and cons of using a reference database during chimera checking. Does anyone have experience comparing the two approaches, have any advice, or know a good place to look for this information? I'd love to hear.

The error I got was:

Traceback (most recent call last):
  File "/macqiime/bin/identify_chimeric_seqs.py", line 4, in <module>
    __import__('pkg_resources').run_script('qiime==1.9.0', 'identify_chimeric_seqs.py')
  File "/macqiime/lib/python2.7/site-packages/setuptools-12.2-py2.7.egg/pkg_resources/__init__.py", line 698, in run_script
   
  File "/macqiime/lib/python2.7/site-packages/setuptools-12.2-py2.7.egg/pkg_resources/__init__.py", line 1616, in run_script
   
  File "/macqiime/lib/python2.7/site-packages/qiime-1.9.0-py2.7.egg/EGG-INFO/scripts/identify_chimeric_seqs.py", line 354, in <module>
    main()
  File "/macqiime/lib/python2.7/site-packages/qiime-1.9.0-py2.7.egg/EGG-INFO/scripts/identify_chimeric_seqs.py", line 350, in main
    threads=threads)
  File "/macqiime/lib/python2.7/site-packages/qiime-1.9.0-py2.7.egg/qiime/identify_chimeric_seqs.py", line 774, in usearch61_chimera_check
    log_lines, verbose, threads)
  File "/macqiime/lib/python2.7/site-packages/qiime-1.9.0-py2.7.egg/qiime/identify_chimeric_seqs.py", line 894, in identify_chimeras_usearch61
    parse_usearch61_clusters(open(output_consensus_uc, "U"))
  File "/macqiime/lib/python2.7/site-packages/burrito_fillings-0.1.0-py2.7.egg/bfillings/usearch.py", line 2483, in parse_usearch61_clusters
KeyError: 'denovo45'

Secondly, can anyone help with how to fix this error?
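
One way to narrow this down is to inspect the XXX_consensus_with_abundance.uc file directly. The sketch below assumes that the parser only knows about a cluster once it has seen that cluster's seed ("S") record, so a hit ("H") record for cluster 45 with no earlier "S" record would produce KeyError: 'denovo45'. That is a guess based on the traceback, not a confirmed description of bfillings. It relies only on the standard tab-separated UC layout (record type in column 1, cluster number in column 2). The function name find_orphan_hits and the example records are hypothetical.

```python
def find_orphan_hits(uc_lines):
    """Return cluster numbers that have an H record before any S record.

    Assumes the standard tab-separated UC layout: column 1 is the record
    type (S, H, C, N) and column 2 is the cluster number.
    """
    seeded = set()   # cluster numbers for which an S record has been seen
    orphans = []     # cluster numbers hit before being seeded, in file order
    for line in uc_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        record_type, cluster = fields[0], fields[1]
        if record_type == "S":
            seeded.add(cluster)
        elif record_type == "H" and cluster not in seeded and cluster not in orphans:
            orphans.append(cluster)
    return orphans


# Made-up records for illustration only; real input would be the lines of
# the .uc file, e.g. find_orphan_hits(open("XXX_consensus_with_abundance.uc")).
example = [
    "S\t0\t250\t*\t*\t*\t*\t*\tseqA;size=3;\t*",
    "H\t0\t250\t99.0\t+\t0\t0\t250M\tseqB;size=1;\tseqA;size=3;",
    "H\t45\t250\t98.0\t+\t0\t0\t250M\tseqC;size=1;\tseqD;size=2;",
]
print(find_orphan_hits(example))
```

If this reports cluster numbers on the real file, the .uc output written by vsearch may not be laid out the way the usearch61 parser expects, which would point to the renaming rather than to the input sequences.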

I've been experimenting with the XXX_consensus_with_abundance.fasta file: printing all the singletons from it and using that list to remove the corresponding sequences from an OTU-picked, rep_set, aligned file. It later occurred to me that this still would not remove the chimeras, so I wanted to ask what I might have done wrong in the chimera checking step, especially given how long it takes to run on my data (6,187,414 sequences, 3.61 GB). I've also been trying to use ChimeraSlayer, but that has been running for almost a week now.
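
For reference, the singleton listing described above can be sketched roughly as follows. This assumes the FASTA headers carry a usearch-style abundance annotation of the form ">seqid;size=N;"; the exact header layout in the consensus file may differ, so the regular expression and the way the ID is extracted are assumptions, and singleton_ids is a hypothetical name.

```python
import re

# Matches a usearch-style abundance annotation such as ";size=12;".
SIZE_RE = re.compile(r"(?:^|;)size=(\d+)")


def singleton_ids(fasta_lines):
    """Return IDs of records whose header is annotated with size=1.

    Assumes headers look like ">seqid;size=N;" and takes the text before
    the first semicolon as the sequence ID.
    """
    ids = []
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        header = line[1:].strip()
        match = SIZE_RE.search(header)
        if match and int(match.group(1)) == 1:
            ids.append(header.split(";")[0])
    return ids


# Made-up records for illustration only.
records = [
    ">seqA;size=3;", "ACGTACGT",
    ">seqB;size=1;", "ACGTTCGT",
    ">seqC;size=12;", "ACGAACGT",
]
print(singleton_ids(records))
```

As noted, a list like this only identifies low-abundance sequences; it says nothing about which sequences are chimeric.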

Thanks for any help or information!

Cameron

Sep 28, 2016, 12:57:32 PM
to Qiime 1 Forum
In case it is of help, attached is a file with a sample of my sequences (10,000 seqs; SASA_TMRU_FR_check_seqs_10000.fna), as well as the three output files.

I have just run the command again on this sample, this time using a GG reference database, and I get the same error.

identify_chimeric_seqs.py -m usearch61 -i SASA_TMRU_FR_check_seqs_10000.fna -r gg_13_8_otus/rep_set/97_otus.fasta -o uclust_otus_open/chimeric_seqs_97_USearch61_ref_test

Thanks for any ideas!
SASA_TMRU_FR_check_seqs_10000.fna
SASA_TMRU_FR_check_seqs_10000.fna_consensus_with_abundance.fasta
SASA_TMRU_FR_check_seqs_10000.fna_consensus_with_abundance.uc
SASA_TMRU_FR_check_seqs_10000.fna_smallmem_clustered.log