Chimera checking with 300bp fragments

AnnaC

Nov 6, 2016, 11:43:55 AM

to Qiime 1 Forum

Hello everyone,

I am trying to understand chimera checking. I have several questions and would really appreciate it if someone could help me understand these points a little better.

1) Which algorithm works best for read lengths of around 270-320 bp? We are using Ion Torrent, so we cannot reach a 400 bp minimum length…

2) I have been running some tests, and I was surprised and confused to see a sequence like this one considered a chimera by ChimeraSlayer (BLAST screenshot attached). BLAST results: https://blast.ncbi.nlm.nih.gov/Blast.cgi#alnHdr_1083262127

3) A specific chimeric OTU would be present at really low abundance, wouldn’t it? If so, isn’t it better to remove only the very low-abundance OTUs rather than remove putative chimeras that do not look like chimeras?

4) I am working on several skin sites, and I have seen some “putative chimeras” that were exclusive to one individual but present at different skin sites… So I would think they are not actual chimeras, but rather some strain or species.

Could anyone help me to better understand these points?

Thank you so much!!!

Anna

BLAST_161106.png

Daniel McDonald

Nov 7, 2016, 11:47:58 PM

to Qiime 1 Forum

Hi Anna,

Where are you finding a minimum read length? I'm not familiar with one.

The author of ChimeraSlayer no longer recommends using it. If you choose to chimera check, I recommend using USEARCH or VSEARCH instead. As you note, chimera checking is not perfect, and that is supported time and time again in the literature. It is a difficult, unsolved problem with a fair chance of both false positives and false negatives. One way to reduce chimeras is, as you also note, to remove low-abundance OTUs.
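To make the low-abundance idea concrete, here is a minimal Python sketch of the filtering logic. It is an illustration only, not a QIIME script: the function name, the toy counts, and the `min_count` threshold of 2 are all assumptions made for this example, and a sensible threshold depends on your sequencing depth.

```python
def drop_low_abundance_otus(otu_counts, min_count=2):
    """Keep OTUs whose total read count across all samples is >= min_count.

    otu_counts maps an OTU id to a list of per-sample read counts.
    min_count is a hypothetical threshold chosen for illustration.
    """
    kept = {}
    for otu_id, per_sample in otu_counts.items():
        if sum(per_sample) >= min_count:
            kept[otu_id] = per_sample
    return kept


# Toy table with made-up numbers: OTU id -> read counts in three samples.
toy_table = {
    "OTU_1": [120, 98, 143],
    "OTU_2": [0, 1, 0],  # a singleton, the kind of OTU most likely to be noise
    "OTU_3": [3, 0, 4],
}

filtered = drop_low_abundance_otus(toy_table, min_count=2)
print(sorted(filtered))
```

In QIIME 1 this kind of filter is normally applied to the BIOM table itself (for example with filter_otus_from_otu_table.py) rather than by hand.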

Best,

Daniel

AnnaC

Nov 8, 2016, 4:46:50 AM

to qiime...@googlegroups.com

Hi Daniel,

thanks again! :)

There is no minimum length, but the ChimeraSlayer paper says "CS retained near maximal chimera detection accuracy for sequences with length at least 400 bases"; they optimised the algorithm using longer reads than the ones I am using... Since you also advise against using it, I will try to find a better option. The free version of USEARCH is not working for us; it fails with an "Out of memory" error. That could be solved by purchasing the full version, but I don't know if it is worth it in my case... I have also read about Perseus; do you recommend using it?

Maybe I will just continue working with the removal of low abundant OTUs...

Best,

Anna

Colin Brislawn

Nov 8, 2016, 12:36:14 PM

to qiime...@googlegroups.com

Hello Anna,

That ChimeraSlayer paper is pretty out of date, and newer algorithms are not as dependent on sequence length. Check out the UCHIME paper for a newer view, or the UCHIME2 paper for some 'bleeding edge' thinking.

http://bioinformatics.oxfordjournals.org/content/27/16/2194.short

http://www.biorxiv.org/content/early/2016/09/09/074252.abstract

The free version of usearch is pretty limiting. Try vsearch instead! It's open source, uses exact alignments, and is always free. You can install it using conda like this:

conda install vsearch

or get it from github: https://github.com/torognes/vsearch
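Once vsearch is installed, de novo chimera checking might look roughly like the sketch below. It is written in Python only so that the two command lines can be built and inspected before anything is run. The file names (seqs.fna, the chimera_check prefix) and the helper function are hypothetical. The flags themselves are vsearch options: --derep_fulllength with --sizeout collapses identical reads and records their abundance, and --uchime_denovo uses those abundance annotations when flagging likely chimeras.

```python
import os
import shutil
import subprocess


def vsearch_denovo_commands(reads_fasta, prefix="chimera_check"):
    """Build the two vsearch command lines as argument lists.

    reads_fasta and prefix are placeholder names for this example.
    """
    derep = prefix + ".derep.fasta"
    dereplicate = [
        "vsearch", "--derep_fulllength", reads_fasta,
        "--sizeout",        # add ;size=N abundance annotations to the headers
        "--output", derep,
    ]
    detect = [
        "vsearch", "--uchime_denovo", derep,
        "--nonchimeras", prefix + ".nonchimeras.fasta",
        "--chimeras", prefix + ".chimeras.fasta",
    ]
    return [dereplicate, detect]


commands = vsearch_denovo_commands("seqs.fna")
for cmd in commands:
    print(" ".join(cmd))

# Run the commands only if vsearch is installed and the (hypothetical)
# input file is present; otherwise the command lines are just printed.
if shutil.which("vsearch") and os.path.exists("seqs.fna"):
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

If you would rather check against a reference set, vsearch also has --uchime_ref, which takes the reference database through --db.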

Let me know if that helps,

Colin

AnnaC

Nov 8, 2016, 2:14:47 PM

to Qiime 1 Forum

Hello Colin,

Thanks for all the advice, it helps a lot! I will try using vsearch instead; it definitely looks much more suitable for what I need...

I will also have a look at the new UCHIME2.

Best,

Anna

Colin Brislawn

Nov 8, 2016, 5:39:39 PM

to Qiime 1 Forum

Great! I've had a really good experience using vsearch, and I hope you do too.

The uchime2 algorithm is pretty new, and I'm not sure how much community testing it has undergone. Because it's part of usearch, uchime2 will have the same size limitations as anything else run through usearch.

Colin

PS The vsearch forums are excellent, and the devs are super nice. https://groups.google.com/forum/#!forum/vsearch-forum
