Chimera Checking issue


Kirk Bergstrom

Jul 21, 2015, 11:55:24 AM
to qiime-forum
Hi there - I have two basic questions:

I'm using the Qiime virtual box. I'm trying to filter out chimeras (post OTU picking and alignment) with the identify_chimeric_seqs.py script, using either the ChimeraSlayer or the Blast Fragments approach. Both have given me errors, shown below:

Command line (for ChimeraSlayer)

!identify_chimeric_seqs.py -m ChimeraSlayer -i /home/qiime/Desktop/XL154\ QIIME\ Test/fastq-join_joined/pickdenovoOTU/pynast_aligned_seqs/XL154_S2S4_WT_DKO_combined_seqs_rep_set_aligned.fasta -a /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set_aligned/85_otus.pynast.fasta -o chimeric_seqs_cs.txt

Error from ChimeraSlayer approach:

Traceback (most recent call last):
  File "/usr/local/bin/identify_chimeric_seqs.py", line 354, in <module>
    main()
  File "/usr/local/bin/identify_chimeric_seqs.py", line 328, in main
    keep_intermediates=keep_intermediates)
  File "/usr/local/lib/python2.7/dist-packages/qiime/identify_chimeric_seqs.py", line 159, in chimeraSlayer_identify_chimeras
    keep_intermediates=keep_intermediates):
  File "/usr/local/lib/python2.7/dist-packages/qiime/identify_chimeric_seqs.py", line 143, in __call__
    keep_intermediates=keep_intermediates)
  File "/usr/local/lib/python2.7/dist-packages/qiime/identify_chimeric_seqs.py", line 637, in get_chimeras_from_Nast_aligned
    app_results = app()
  File "/usr/local/lib/python2.7/dist-packages/burrito/util.py", line 295, in __call__
    result_paths = self._get_result_paths(data)
  File "/usr/local/lib/python2.7/dist-packages/qiime/identify_chimeric_seqs.py", line 419, in _get_result_paths
    raise ApplicationError("Calling ChimeraSlayer failed.")
burrito.util.ApplicationError: Calling ChimeraSlayer failed.


Command Line for Blast fragments approach:

!identify_chimeric_seqs.py -i XL154_S2S4_WT_DKO_combined_seqs.fna -t /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/taxonomy/97_otu_taxonomy.txt -r /usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/97_otus.fasta -m blast_fragments -o chimeric_seqs_blast.txt

Error for Blast fragments approach: 
Traceback (most recent call last):
  File "/usr/local/bin/identify_chimeric_seqs.py", line 354, in <module>
    main()
  File "/usr/local/bin/identify_chimeric_seqs.py", line 321, in main
    taxonomy_depth=taxonomy_depth)
  File "/usr/local/lib/python2.7/dist-packages/qiime/identify_chimeric_seqs.py", line 337, in blast_fragments_identify_chimeras
    bcc = BlastFragmentsChimeraChecker(params)
  File "/usr/local/lib/python2.7/dist-packages/qiime/identify_chimeric_seqs.py", line 206, in __init__
    build_blast_db_from_fasta_path(reference_seqs_fp)
  File "/usr/local/lib/python2.7/dist-packages/bfillings/formatdb.py", line 121, in build_blast_db_from_fasta_path
    app_result = fdb(fasta_path)
  File "/usr/local/lib/python2.7/dist-packages/burrito/util.py", line 285, in __call__
    'StdErr:\n%s\n' % open(errfile).read())
burrito.util.ApplicationError: Unacceptable application exit status: 2
Command:
cd "/usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/"; formatdb -l "97_otus.fasta.log" -o T -n "97_otus.fasta" -i "97_otus.fasta" -p F > "/dev/null" 2> "/dev/null"
StdOut:

StdErr:



Are there issues with the filepaths to the reference sequences? (I got these directly from the print_qiime_config.py script output.) Thank you for your help in this matter,

 - Kirk


Colin Brislawn

Jul 21, 2015, 1:39:32 PM
to qiime...@googlegroups.com
Hello Kirk,

I think your reference paths are fine, or at least the current errors don't mention a problem with your references.

Did you read the last few lines of each error?


For ChimeraSlayer
    raise ApplicationError("Calling ChimeraSlayer failed.")
burrito.util.ApplicationError: Calling ChimeraSlayer failed.
It can't call ChimeraSlayer, probably because it is not installed.
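A quick way to check this is to see whether the executable is on your PATH at all. The snippet below is only a minimal sketch in Python 3: `find_tools` is a made-up helper name, and "ChimeraSlayer.pl" is my assumption for the executable name, so adjust it to whatever your install provides.

```python
import shutil


def find_tools(names):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# "ChimeraSlayer.pl" is an assumed executable name; formatdb and vsearch
# are the other command-line tools mentioned in this thread.
for name, path in find_tools(["ChimeraSlayer.pl", "formatdb", "vsearch"]).items():
    print(name, "->", path or "not found on PATH")
```

Anything reported as "not found on PATH" would need to be installed (or added to PATH) before qiime can call it.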


For Blast
  File "/usr/local/lib/python2.7/dist-packages/burrito/util.py", line 285, in __call__
    'StdErr:\n%s\n' % open(errfile).read())
burrito.util.ApplicationError: Unacceptable application exit status: 2
Command:
cd "/usr/local/lib/python2.7/dist-packages/qiime_default_reference/gg_13_8_otus/rep_set/"; formatdb -l "97_otus.fasta.log" -o T -n "97_otus.fasta" -i "97_otus.fasta" -p F > "/dev/null" 2> "/dev/null"
It can't open up the error file. I'm not sure if this is a problem with blast, with qiime, or with the way the output is being piped to /dev/null.
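Because the command sends formatdb's stderr to /dev/null, the traceback does not say why it exited with status 2. One thing that can be checked is whether the reference directory is writable, since the command changes into that directory and writes its log file there. This is an assumption on my part, not something the traceback confirms. A minimal Python sketch, with `can_write_next_to` as a made-up helper name:

```python
import os


def can_write_next_to(fasta_path):
    """Return True if the directory holding fasta_path exists and is writable."""
    directory = os.path.dirname(os.path.abspath(fasta_path))
    return os.path.isdir(directory) and os.access(directory, os.W_OK)


# Reference path copied from the command in the traceback above.
ref = ("/usr/local/lib/python2.7/dist-packages/qiime_default_reference/"
       "gg_13_8_otus/rep_set/97_otus.fasta")
print("writable:", can_write_next_to(ref))
```

If this prints False, copying the reference fasta to a directory you own and pointing -r at the copy would be one thing to try.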


Going forward, you could try installing ChimeraSlayer:

I currently use the UCHIME algorithm for chimera detection, and it works great. 
I use the version implemented in VSEARCH.

Here is how I do chimera checking using vsearch.
vsearch -uchime_denovo rep_set.fna \
-strand plus -nonchimeras rep_set.checked_denovo.fna 


Good luck Kirk!
Colin

Kirk B

Jul 21, 2015, 2:00:31 PM
to qiime-forum
Thanks again Colin - I think you nailed it, I wrongly assumed ChimeraSlayer was already installed in the Vbox.  I will try each of those suggestions!
Cheers,
Kirk


Colin Brislawn

Jul 21, 2015, 4:00:06 PM
to qiime-forum
You should totally check out vsearch -uchime_denovo.

It's fast, it's good, and it's open source.

Colin



Kirk B

Jul 21, 2015, 4:58:10 PM
to qiime-forum
Thanks Colin - looking forward to trying vsearch out!
 - Kirk

Kirk B

Aug 5, 2015, 2:42:08 PM
to qiime-forum
Hi Colin - I downloaded and installed vsearch to use for chimera checking (with UCHIME) as recommended. It looks like it installed okay, but I got some results that were a bit confusing.

First I tried using the rep_set.fna file produced by pick_open_reference_otus.py

Command line: 
!vsearch -uchime_denovo rep_set.fna -strand plus -nonchimeras rep_set.checked_denovo.fna

Results:
vsearch v1.1.3_linux_x86_64, 7.6GB RAM, 1 cores

Reading file rep_set.fna 100%
13598132 nt in 53567 seqs, min 200, max 494, avg 254
Indexing sequences 100%
Sorting by abundance 100%
Counting unique k-mers 100%
Detecting chimeras 100%
Found 0 (0.0%) chimeras, 53567 (100.0%) non-chimeras,
and 0 (0.0%) suspicious candidates in 53567 sequences.

I did not expect there to be no chimeras....


Next I tried doing the same on my fna file immediately prior to otu picking:

Command line: 
!vsearch -uchime_denovo XL154_fullset_qf20.fna -strand plus -nonchimeras XL154_fullset_qf20_nonchimeras.fna

Result:

vsearch v1.1.3_linux_x86_64, 7.6GB RAM, 1 cores

Reading file XL154_fullset_qf20.fna 100%
2482249766 nt in 9806070 seqs, min 190, max 494, avg 253
Indexing sequences 100%
Sorting by abundance 100%
Counting unique k-mers 100%
Detecting chimeras 0%Segmentation fault (core dumped)

So I'm not sure 1) why no chimeric sequences were detected in my rep_set.fna file, unless there really are none after this script (although the workflow does not mention any chimera removal), and 2) why the original input file did not work with vsearch.

Your insights are much appreciated!

Best,
Kirk



Colin Brislawn

Aug 5, 2015, 6:03:43 PM
to qiime-forum
Hello Kirk,

De novo chimera checking algorithms make use of the abundance of each read to infer which reads are the chimeras and which reads are the parents. Basically, the less abundant read is identified as the chimera, but only if the reads have size annotations appended to them.

The rep_set.fna produced by qiime does not include size annotations.
Your input file does not have size annotations either, but it can! :-)
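To see whether a file already carries size annotations, a small check like the one below can help. It is only a sketch in Python 3: it assumes the ";size=N" header convention that vsearch writes with -sizeout, and `header_sizes` is a made-up helper name, not part of qiime or vsearch.

```python
import re

# Matches a ";size=N" annotation in a FASTA header, with or without a trailing ";".
SIZE_RE = re.compile(r";size=(\d+);?")


def header_sizes(fasta_lines):
    """Yield (header, size) per FASTA record; size is None when no annotation is found."""
    for line in fasta_lines:
        if line.startswith(">"):
            match = SIZE_RE.search(line)
            yield line[1:].strip(), (int(match.group(1)) if match else None)


# Tiny in-memory example; for a real file, pass an open file handle instead.
example = [">otu1;size=120;", "ACGT", ">otu2", "ACGA"]
for header, size in header_sizes(example):
    print(header, "->", size)
```

Headers that come back with None have no abundance information for the de novo check to work with.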

First, dereplicate your original file and append size annotations during this step.
vsearch -derep_fulllength $reads -sizeout -output seqs.derep.fna -log seqs.derep.log
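For intuition about what this dereplication step produces, here is a rough Python 3 illustration. It is not how vsearch implements it, just the idea: identical full-length sequences are collapsed into one record whose header gets a size annotation. `dereplicate` is a made-up name, and keeping the first header seen and ignoring case are simplifying assumptions of this sketch.

```python
from collections import Counter


def dereplicate(records):
    """Collapse identical sequences from (header, sequence) pairs.

    Returns (header_with_size, sequence) pairs, most abundant first.
    """
    counts = Counter()
    first_header = {}
    for header, seq in records:
        key = seq.upper()                     # treat upper and lower case as the same read
        counts[key] += 1
        first_header.setdefault(key, header)  # keep the first header seen for this sequence
    return [(f"{first_header[seq]};size={n};", seq)
            for seq, n in counts.most_common()]


reads = [("read1", "ACGT"), ("read2", "acgt"), ("read3", "GGGG")]
for header, seq in dereplicate(reads):
    print(">" + header)
    print(seq)
```

The size values attached here are the abundances that the de novo chimera check relies on in the next step.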

Then, feed the resulting file into uchime.
vsearch -uchime_denovo  seqs.derep.fna -strand plus \
-nonchimeras chimeras/seqs.derep.checked_denovo.fna \
-chimeras chimeras/seqs.derep.denovo_chimeras.fna \
--log chimeras/seqs.derep.checked_denovo.log


These days, I do my OTU picking through vsearch instead of qiime. My command looks like...
vsearch --cluster_smallmem seqs.derep.mc2.fna --sizein --sizeout
...which preserves the size annotations of my centroids and adds up their abundance as other reads get added to their cluster. I can use the resulting rep_set.fna file with uchime because it has these size annotations.

Colin



Kirk B

Aug 5, 2015, 7:08:10 PM
to qiime-forum
Thanks Colin - I completely overlooked the size issue - I will definitely try the derep steps as suggested, and am looking forward to seeing what vsearch can do!
Best,
Kirk