Minimum count fraction filtering


Neus

20 Aug 2016, 5:55:31
to Qiime 1 Forum
Dear all,

I have some doubts about how the option --min_count_fraction (from filter_otus_from_otu_table.py) works. 

Just as an example: Imagine that you have one OTU which is the only one present in a sample. Would it be removed if you apply a filtering of 1% and its fraction regarding the whole dataset is 0.6%?

Is there a script to filter OTUs taking into account the minimum count fraction of each sample but using the .biom table of the whole dataset? I mean, without needing to split the samples out of the OTU table and filter each one separately.

Thank you!!


Jai Ram Rideout

21 Aug 2016, 19:34:45
to Qiime 1 Forum
Hi Neus,

> Just as an example: Imagine that you have one OTU which is the only one present in a sample. Would it be removed if you apply a filtering of 1% and its fraction regarding the whole dataset is 0.6%?

Yes, that OTU would be removed: it accounts for 0.6% of the total sequences in the .biom file, and the minimum for an OTU to be retained is 1%. As another example, if you supplied --min_count_fraction 0.5, any OTU whose sequence count makes up less than 50% of the total sequence count in the .biom file would be filtered from the table. Does that make sense?
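To make the arithmetic concrete, here is a minimal sketch in plain Python, not QIIME's own code. The toy table, the OTU names, and the helper `filter_by_global_fraction` are all hypothetical. It assumes the threshold is given as a fraction (so 1% is 0.01) and that an OTU sitting exactly at the threshold is kept.

```python
# Toy OTU table (hypothetical counts): one row per OTU, one column per sample.
otu_table = {
    "OTU_A": [0, 0, 6],      # the only OTU present in the third sample
    "OTU_B": [500, 300, 0],
    "OTU_C": [120, 74, 0],
}


def filter_by_global_fraction(table, min_count_fraction):
    """Keep OTUs whose total count is at least the given fraction
    of the total count across the whole table."""
    grand_total = sum(sum(counts) for counts in table.values())
    min_count = min_count_fraction * grand_total
    return {
        otu: counts
        for otu, counts in table.items()
        if sum(counts) >= min_count
    }


# The table holds 1000 sequences in total, so OTU_A is 6 / 1000 = 0.6%.
kept = filter_by_global_fraction(otu_table, 0.01)
print(sorted(kept))
```

OTU_A is dropped even though it is 100% of its own sample, because only its share of the whole table is considered.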

> Is there a script to filter OTUs taking into account the minimum count fraction of each sample but using the .biom table of the whole dataset? I mean, without needing to split the samples out of the OTU table and filter each one separately.

Unfortunately I don't think there's an easy way to do this. Performing separate filtering like you described should work, and it may be possible to automate with some shell scripting.
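For illustration only, a per-sample version could look roughly like the sketch below. This is one possible reading of "minimum count fraction of each sample": within each sample, counts below that fraction of the sample's own total are set to zero, and OTUs left with no counts anywhere are dropped. The toy table and the function `filter_by_per_sample_fraction` are hypothetical and not part of QIIME.

```python
# Toy OTU table (hypothetical counts): one row per OTU, one column per sample.
otu_table = {
    "OTU_A": [0, 0, 6],      # the only OTU present in the third sample
    "OTU_B": [500, 300, 0],
    "OTU_C": [120, 74, 0],
    "OTU_D": [2, 30, 0],     # rare in the first sample, common in the second
}


def filter_by_per_sample_fraction(table, min_count_fraction):
    """Zero out counts below the given fraction of each sample's total,
    then drop OTUs that have no counts left in any sample."""
    n_samples = len(next(iter(table.values())))
    sample_totals = [
        sum(counts[j] for counts in table.values()) for j in range(n_samples)
    ]
    filtered = {}
    for otu, counts in table.items():
        kept_counts = [
            c if total > 0 and c >= min_count_fraction * total else 0
            for c, total in zip(counts, sample_totals)
        ]
        if any(kept_counts):
            filtered[otu] = kept_counts
    return filtered


per_sample = filter_by_per_sample_fraction(otu_table, 0.01)
for otu in sorted(per_sample):
    print(otu, per_sample[otu])
```

Here OTU_A survives because it makes up all of the third sample, while OTU_D loses only its count in the first sample, where it falls below 1% of that sample's total.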

Best,
Jai