Picrust2 Output clarification and ggpicrust2 Input

Ivanova Stasja

Mar 18, 2025, 9:57:22 AM
to picrust-users

Dear Picrust2 team,

I am trying to process my data with Picrust2 for the first time and then visualize it with ggpicrust2. Initially, I ran the old version of Picrust2, which used the IMG database. Then, I ran the new version with the added GTDB database.

As output, I received familiar folders from the previous version of the program: EC_metagenome_out, KO_metagenome_out, pathways_out, and intermediate. Additionally, I noticed new files related to bacteria (bac_predicted), archaea (arc_predicted), and a combined dataset (combined_predicted).

How does the output in metagenome_out differ from these individual predicted.tsv files? Which database was used for generating the combined_predicted output? The metagenome_out files should include metagenomes from GTDB, correct?

For further analysis and visualization, ggpicrust2 uses the files from metagenome_out as input. I am unsure if I am missing anything by not analyzing the additional combined_predicted files.

Apologies if this has already been explained somewhere—I couldn’t find the information.
Thank you for your help!

Best regards,

Stasy

Robyn Wright

Mar 18, 2025, 12:06:11 PM
to picrust-users
Hi there,

Essentially, with the new version, some of the first steps are run separately for bacteria and archaea. We are now using the GTDB phylogenetic trees, and these are not rooted, so we can't join the bacteria and archaea together. This means that sequences are placed in both the bacterial and archaeal trees, and the best fit is found for each sequence. After this, the EC and KO predictions at the sequence/ASV level are carried out separately for the bacterial and archaeal sequences, and then these ASV-level (genome-level) predictions are combined before the metagenome prediction. You should therefore find that combined_predicted is the sum of the bacteria and archaea predicted files. For the metagenome prediction, the combined predicted table is essentially multiplied by the abundance of each ASV/sequence, normalized by its predicted 16S copy number, to get a prediction for the metagenome overall.
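
To make the arithmetic concrete, here is a minimal Python sketch of the two steps described above: combining the per-domain ASV-level predictions, then turning them into a per-sample metagenome prediction. It is a toy illustration and not PICRUSt2's own code. The function names, ASV/KO IDs, and numbers are all made up. It assumes each ASV ends up in exactly one of the bacterial or archaeal tables, and that the 16S copy number is used to normalize each ASV's abundance (abundance divided by copy number) before multiplying by the predicted function counts.

```python
# Toy ASV-level predictions: {asv_id: {function_id: predicted copies per genome}}.
# All IDs and numbers are hypothetical, chosen only to keep the arithmetic easy.
bac_predicted = {
    "ASV1": {"K00001": 2.0, "K00002": 0.0},
    "ASV2": {"K00001": 1.0, "K00002": 3.0},
}
arc_predicted = {
    "ASV3": {"K00001": 0.0, "K00002": 1.0},
}

# Predicted 16S copy number per ASV and observed abundance per sample (hypothetical).
copy_number = {"ASV1": 2.0, "ASV2": 1.0, "ASV3": 1.0}
abundance = {"sampleA": {"ASV1": 10, "ASV2": 5, "ASV3": 2}}


def combine_predictions(bac, arc):
    """Merge the bacterial and archaeal tables into one table covering every ASV.

    Assumption: an ASV is assigned to only one domain, so the tables never
    share a row. If they do, we stop rather than guess how to merge them.
    """
    overlap = set(bac) & set(arc)
    if overlap:
        raise ValueError(f"ASVs present in both tables: {sorted(overlap)}")
    combined = dict(bac)
    combined.update(arc)
    return combined


def predict_metagenome(combined, copy_number, abundance):
    """Return {sample: {function_id: predicted abundance}}.

    For each ASV in a sample: normalize its abundance by its 16S copy number,
    multiply by its predicted function counts, and sum over all ASVs.
    """
    metagenome = {}
    for sample, counts in abundance.items():
        totals = {}
        for asv, count in counts.items():
            normalized = count / copy_number[asv]
            for func, copies in combined[asv].items():
                totals[func] = totals.get(func, 0.0) + normalized * copies
        metagenome[sample] = totals
    return metagenome


combined_predicted = combine_predictions(bac_predicted, arc_predicted)
metagenome = predict_metagenome(combined_predicted, copy_number, abundance)
print(metagenome)
```

In this toy case ASV1 contributes (10 / 2) x 2 = 10 copies of K00001, ASV2 contributes (5 / 1) x 1 = 5, and ASV3 contributes none, so sampleA gets 15 for K00001. The same bookkeeping gives 17 for K00002.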

I have not used ggpicrust2 myself, but the combined_EC_predicted.tsv and combined_KO_predicted.tsv files are equivalent to the previous EC_predicted.tsv and KO_predicted.tsv files, and the metagenome_out folders are equivalent to those produced by the previous PICRUSt2 version, so I would imagine that they could still be used for ggpicrust2. 

Let me know if there's anything else that I can clarify! 

Robyn 

Ivanova Stasja

Mar 18, 2025, 2:45:50 PM
to picrus...@googlegroups.com
Thank you so much for the clarification, I got it! All the best!
