Hi all,
Thank you for reading this post!
>I am analyzing 16S MiSeq data from more than 100 samples.
>When I look at the OTU table (reads per OTU), I see a huge number of OTUs across all samples (>70,000).
>Some of these OTUs may be spurious artifacts of the MiSeq chemistry, or, for whatever reason, confidence in rare OTUs is low. I want to remove OTUs based on a cutoff, but I am confused about which method to prefer. I have thought of the following options:
1) Remove OTUs that have fewer than X reads. I don't know how to choose that number, since I couldn't find a clear reference stating what the standard is.
2) Remove OTUs that contribute less than a relative abundance of "X", e.g. remove all OTUs that have <0.01 abundance.
For both of these options, should I apply the cutoff on a per-sample basis or across the entire sample set?
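To make the difference concrete, the two options and the per-sample vs. whole-dataset choice could be sketched roughly as below. This is only an illustration in plain Python: the toy table, the function names, and the cutoff values are all made up and are not recommendations. Note that the fraction here is a proportion (0.005 means 0.5%), so a 0.005% cutoff would be 0.00005.

```python
# Toy OTU table: keys are OTUs, values are read counts per sample.
# All names and numbers are hypothetical, for illustration only.
otu_table = {
    "OTU_1": [500, 300, 200],
    "OTU_2": [0, 1, 0],
    "OTU_3": [10, 0, 40],
    "OTU_4": [2, 2, 1],
}


def filter_by_total_reads(table, min_reads):
    """Option 1, dataset-wide: keep OTUs with at least min_reads summed over all samples."""
    return {otu: counts for otu, counts in table.items() if sum(counts) >= min_reads}


def filter_by_total_fraction(table, min_fraction):
    """Option 2, dataset-wide: keep OTUs whose share of all reads in the table is >= min_fraction."""
    grand_total = sum(sum(counts) for counts in table.values())
    return {
        otu: counts
        for otu, counts in table.items()
        if sum(counts) / grand_total >= min_fraction
    }


def filter_by_sample_fraction(table, min_fraction):
    """Option 2, per sample: keep OTUs that reach min_fraction of the reads in at least one sample."""
    n_samples = len(next(iter(table.values())))
    sample_totals = [sum(counts[j] for counts in table.values()) for j in range(n_samples)]
    kept = {}
    for otu, counts in table.items():
        # Skip empty samples to avoid dividing by zero.
        if any(t > 0 and c / t >= min_fraction for c, t in zip(counts, sample_totals)):
            kept[otu] = counts
    return kept


by_reads = filter_by_total_reads(otu_table, min_reads=10)
by_total_fraction = filter_by_total_fraction(otu_table, min_fraction=0.005)
by_sample_fraction = filter_by_sample_fraction(otu_table, min_fraction=0.005)

print(sorted(by_reads))            # OTUs kept by the read-count cutoff
print(sorted(by_total_fraction))   # OTUs kept by the dataset-wide fraction
print(sorted(by_sample_fraction))  # OTUs kept by the per-sample fraction
```

With this toy table, OTU_4 is dropped by the dataset-wide fraction but kept by the per-sample one, because it passes the cutoff in a single sample. That is exactly the ambiguity behind the question.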
I came across a reference that says to apply a cutoff of 0.005% across the entire OTU set for all samples being compared (if I interpret it correctly). This is the paper:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3531572/pdf/nihms420659.pdf
Can someone please tell me if there is a standard way to filter OTUs from an OTU table? Should the minimum OTU size be 2, 10, 20, etc., or should the relative abundance cutoff be 0.1, 0.01, 0.005, etc.?
Thank you,
Kruttika