Hi Morgan,
It's my understanding that PICRUSt uses the KEGG BRITE hierarchy system to collapse genes into pathways. Is this correct?
I've been able to run predict_metagenomes.py and categorize_by_function.py, but I'm confused about the results. I set the level to 3 when running categorize_by_function.py to collapse genes to the 3rd level of the hierarchy. However, several of the output pathways are actually listed at the 2nd level. For example, I'm getting a pathway called "Unclassified; Metabolism; Metabolism of cofactors and vitamins".
"Metabolism of cofactors and vitamins" is a 2nd-level category in the KEGG BRITE hierarchy, one level above the 3rd. Do the values in this row include all of the pathways within this category, or only the "unclassified" genes that could not be collapsed to the 3rd level? Either way, is there a way I can see which genes PICRUSt included when collapsing them into this category? I have the output file from predict_metagenomes.py, but I don't know how to map each gene back to its parent category.
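To make the question concrete, here is a minimal Python sketch of the kind of mapping I have in mind. It is not PICRUSt code: the gene names, counts, hierarchy paths, and the `collapse` helper are all made-up placeholders, and I'm only assuming that each gene can belong to more than one hierarchy path.

```python
from collections import defaultdict

# Hypothetical gene -> hierarchy paths (placeholders, not real PICRUSt output).
# A gene may appear under more than one path.
gene_to_paths = {
    "gene_A": [["Metabolism", "Metabolism of cofactors and vitamins", "Pathway X"]],
    "gene_B": [["Unclassified", "Metabolism", "Metabolism of cofactors and vitamins"]],
    "gene_C": [
        ["Metabolism", "Metabolism of cofactors and vitamins", "Pathway X"],
        ["Unclassified", "Metabolism", "Metabolism of cofactors and vitamins"],
    ],
}

# Hypothetical predicted gene counts for a single sample.
gene_counts = {"gene_A": 10, "gene_B": 4, "gene_C": 1}


def collapse(counts, paths_by_gene, level):
    """Sum gene counts per category at the given hierarchy level.

    Returns (totals, members): totals maps each category label to its summed
    count, and members maps each label to the genes that contributed to it.
    """
    totals = defaultdict(int)
    members = defaultdict(list)
    for gene, count in counts.items():
        seen = set()  # avoid counting a gene twice in the same category
        for path in paths_by_gene.get(gene, []):
            label = "; ".join(path[:level])
            if label in seen:
                continue
            seen.add(label)
            totals[label] += count
            members[label].append(gene)
    return dict(totals), dict(members)


totals, members = collapse(gene_counts, gene_to_paths, level=3)
for label in sorted(totals):
    print(label, totals[label], members[label])
```

In this toy version, the `members` dictionary is what I'm after: for the "Unclassified; Metabolism; Metabolism of cofactors and vitamins" row, it would list exactly which genes were summed into it.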
I'd appreciate any insight you might have on this!
Thanks,
David