Thank you very much Nicola, I appreciate your help. Your answer makes great sense.
In a test run comparing bt2 and blastn with default settings on one dataset, it seems the bt2 result (82 species) contains more OTUs than the blastn result (60 species), though the 22 additional species are all at very low abundance.
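For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how the species reported by only one of the two runs could be listed with their abundances. It assumes each profile has already been loaded into a dictionary mapping species name to relative abundance; the species names and numbers below are made up for illustration and are not my actual results.

```python
def species_only_in_first(first, second):
    """Return (species, abundance) pairs present in `first` but not in `second`,
    sorted from highest to lowest abundance."""
    only_first = {sp: ab for sp, ab in first.items() if sp not in second}
    return sorted(only_first.items(), key=lambda kv: kv[1], reverse=True)


# Toy profiles (species -> relative abundance in percent); hypothetical values.
bt2_profile = {"s__A": 60.0, "s__B": 39.5, "s__C": 0.4, "s__D": 0.1}
blastn_profile = {"s__A": 61.0, "s__B": 39.0}

extra = species_only_in_first(bt2_profile, blastn_profile)
extra_total = sum(ab for _, ab in extra)

for species, abundance in extra:
    print(f"{species}\t{abundance}")
print(f"total abundance of bt2-only species: {extra_total:.2f}")
```

If the summed abundance of the extra species is tiny, as in my run, the two methods mostly disagree on the low-abundance tail.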
Per your answer, I think I will stick with blastn for now and bump up the --stat_q value. I plan to test values from 0.1 to 0.3 (step size 0.05) and see how much difference that makes. Let me know if there is a better way to choose the --stat_q value.
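A small sketch of the sweep I have in mind, in case it is useful to others. It only builds the list of --stat_q values and prints the option string for each run; the rest of the command line is left out on purpose, and the function name `stat_q_grid` is just something I made up for this example.

```python
def stat_q_grid(start=0.10, stop=0.30, step=0.05):
    """Return evenly spaced values from start to stop (inclusive).

    Values are rounded to two decimals to avoid floating-point drift
    such as 0.30000000000000004.
    """
    n_steps = int(round((stop - start) / step))
    return [round(start + i * step, 2) for i in range(n_steps + 1)]


values = stat_q_grid()

for q in values:
    # Only the --stat_q option comes from this thread; append it to
    # whatever command line is already being used for the blastn run.
    print(f"--stat_q {q:.2f}")
```

Counting the species reported at each value should then show where the low-abundance calls start to drop out.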
Thanks again!
On Tuesday, April 24, 2012 9:01:20 AM UTC-4, Nicola Segata wrote:
> Hi Niki,
> thanks a lot for your question.
>
> Short answer: if computational performance is the bottleneck of your analysis you should use BowTie2; if you want to maximize accuracy it doesn't really matter, because Blastn and BowTie2 provide very similar predictions.
> Here is a more verbose answer in case computational performance is not the main factor to consider in your analysis. Based on our ten synthetic datasets, bowtie2 and blastn perform almost equally in terms of accuracy. Sometimes blastn is slightly better than bowtie2, but other times the opposite is true. Bowtie2 with '--bt2_ps very-sensitive-local' or '--bt2_ps sensitive-local' seems to be on average a little more accurate than blastn, while producing somewhat more false positives. On the other hand, Bowtie2 with '--bt2_ps very-sensitive' (the default) or '--bt2_ps sensitive' is a bit less accurate than blastn but with fewer false positives. The false positive / false negative trade-off can also be tuned using the '--stat_q' option (which works for both blastn and BowTie2); we suggest setting it higher than 0.1 (but smaller than 0.33) if one wants to avoid false positives as much as possible, at the price of some false negatives. So far we have profiled 1,000 real metagenomes using blastn (because the BowTie2 option had not yet been added) and just a few metagenomes with BowTie2; until a more comprehensive analysis is performed, it may be a bit safer to use blastn instead of BowTie2 if accuracy is much more important than computational efficiency.
>
> I added a FAQ section with this question/answer (hopefully other Q/A will be added soon):
> https://bitbucket.org/nsegata/metaphlan/wiki/FAQ
>
> Let me know if you have any comments or questions!
> thanks
> Nicola
> On Tuesday, April 24, 2012 12:43:10 AM UTC-4, Niki wrote:
> > Thanks for this nice piece of software. I have a quick question here. If I prefer accuracy over time-efficiency, should I use blast or bowtie2? Thanks!