Hi Jesse,
Sorry, but I'm still a little confused.
Regarding QIIME's OTU category significance, I thought the statistical tests in that script were based on relative abundances and therefore did not need rarefying? categorize_by_function.py produces files with counts, but after processing them with otu_category_significance.py the result file contains percentages.
So far I have only rarefied metagenome_predictions.biom on the fly when producing beta diversity plots (rarefied to the lowest number of genes in any sample).
Here is an overview of what I have been doing so far:
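To make the distinction I am asking about concrete, here is a small Python sketch. It is only my own illustration of the two ideas, not what otu_category_significance.py or any other QIIME script does internally; the function names and the toy counts are made up.

```python
import random

def relative_abundance(counts):
    """Convert raw counts for one sample into fractions that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]

def rarefy(counts, depth, seed=0):
    """Subsample one sample without replacement down to `depth` total counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the total count of the sample")
    rng = random.Random(seed)
    # Expand counts into one entry per observation, labelled by feature index.
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    picked = rng.sample(pool, depth)
    out = [0] * len(counts)
    for i in picked:
        out[i] += 1
    return out

# Hypothetical toy data: two samples with the same composition, different depth.
samples = {"A": [50, 30, 20], "B": [500, 300, 200]}

# Rarefy every sample to the lowest total count found in any sample.
depth = min(sum(c) for c in samples.values())
rarefied = {name: rarefy(c, depth) for name, c in samples.items()}

# Relative abundances are computed per sample and ignore depth entirely.
fractions = {name: relative_abundance(c) for name, c in samples.items()}
```

In this toy case the relative abundances of A and B are identical even though B has ten times as many counts, which is why I assumed a test based on relative abundances would not need rarefied input.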
fasta -> pick_closed_reference_otus.py -> normalize_by_copy_number.py -> predict_metagenome.py -> betadiversity_through_plots.py (using rarefaction)
fasta -> pick_closed_reference_otus.py -> normalize_by_copy_number.py -> predict_metagenome.py -> categorize_by_function.py -> otu_category_significance.py
fasta -> pick_closed_reference_otus.py -> normalize_by_copy_number.py -> predict_metagenome.py -> categorize_by_function.py -> summarize_taxa_through_plots.py
Does this look OK?
If rarefaction is in fact needed, I suppose it should be done either on the initial otu_table or at a later stage (pathway level), but not both? That is, if I rarefy the initial otu_table, I should not rarefy again at the pathway level (i.e. when running betadiversity_through_plots.py)?
Best,
Kristian Holm