Questions about using PICRUSt2 output


James Harder

Jun 4, 2018, 7:09:18 PM
to picrust-users

Hi all,

Our lab has a large amount of 16S sequencing data from samples of mouse gut microbiota, and I'm currently looking into the possibility of using PICRUSt2 to identify functional differences and correlate them with disease severity. As a test, I've run a couple of our samples through the DADA2 denoiser and used the resulting output files in the PICRUSt2 workflow, which successfully generated the pathway abundance files. I think the program is very promising, but before I start processing all 300+ of our samples, I have a couple of questions about how to analyze the pathway abundance files.

First, I haven't had much luck finding a way to batch-convert the MetaCyc pathway IDs in the pathway abundance file (e.g. 1CMET2-PWY) into the actual pathway names (e.g. N10-formyl-tetrahydrofolate biosynthesis). Do you have any suggestions on how to do that?

Second, the PICRUSt v1.0 tutorial uses STAMP for visualization and statistics. Since PICRUSt2 doesn't generate a BIOM file, there isn't an obvious way to use STAMP to analyze the data. So I have two questions for you: would you recommend STAMP or a different program for visualization and statistics? And if STAMP is still a good choice, could you give me some pointers on importing the PICRUSt2 output into it?

Thanks,
Jim Harder

Gavin Douglas

Jun 4, 2018, 8:21:06 PM
to picrus...@googlegroups.com
Hey Jim,

Thanks for your interest in the PICRUSt2 beta! I’m away for the next week, but I’d be happy to help you get your table into STAMP format when I get back. However, STAMP just requires a tab-delimited input file and I believe it's a minor formatting change to get the required input format. I personally use RStudio for my data analysis, but it’s up to you what tool you’d like to use.

For linking pathway ids to names you can download the mapping file provided by HUMAnN2 here for now: https://www.dropbox.com/s/fq39skkbff3f1pr/map_metacyc-pwy_name.txt.gz?dl=1. I’m planning to add in mapping files to the next version to make this easier!
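As a rough sketch of how such a mapping file could be applied, the snippet below adds a description column to a pathway abundance table. It assumes the mapping file is a two-column, tab-delimited table (pathway ID, then name) with no header row, and that the abundance table has a header row with pathway IDs in its first column. The function name `add_pathway_names` and the abundance values are made up for illustration.

```python
import csv
import io


def add_pathway_names(abundance_tsv, mapping_tsv):
    """Insert a 'description' column after the pathway ID column.

    Both arguments are open text files (or any iterable of lines).
    Assumed layout: the mapping has two tab-separated columns, ID then
    name, with no header; the abundance table has a header row and
    pathway IDs in its first column.
    """
    names = {}
    for row in csv.reader(mapping_tsv, delimiter="\t"):
        if len(row) >= 2:
            names[row[0]] = row[1]

    reader = csv.reader(abundance_tsv, delimiter="\t")
    header = next(reader)
    out = [[header[0], "description"] + header[1:]]
    for row in reader:
        if not row:
            continue  # skip blank lines
        # IDs missing from the mapping get an empty description.
        out.append([row[0], names.get(row[0], "")] + row[1:])
    return out


# Toy data for illustration only; the abundances are invented.
mapping = io.StringIO("1CMET2-PWY\tN10-formyl-tetrahydrofolate biosynthesis\n")
abundance = io.StringIO(
    "pathway\tsample1\tsample2\n"
    "1CMET2-PWY\t10.5\t3.2\n"
    "UNKNOWN-PWY\t1.0\t0.0\n"
)
named = add_pathway_names(abundance, mapping)
for row in named:
    print("\t".join(row))
```

For a gzipped mapping file, opening it with `gzip.open(path, "rt")` from the standard library should give a text stream that can be passed in the same way.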


Thanks,

Gavin





Chan Nguyen

Oct 31, 2018, 1:52:52 AM
to picrust-users
Hi Gavin, 
Are you still working on adding the mapping file for the pathways? I generated pathway_abundance.qza, ec_metagenome.qza, and ko_metagenome.qza, but I have no idea what to do with them next. How do I find out which pathways are linked to the phenotype of interest? And how should I do the statistics?
Cheers, 
Chan

Gavin Douglas

Oct 31, 2018, 8:44:38 AM
to picrus...@googlegroups.com
Hi Chan,

Yes, the description mapfiles are in "picrust2/default_files/description_mapfiles" (i.e. in the PICRUSt2 GitHub repository). You can take a look at "metacyc_pathways_info_prokaryotes.txt.gz" for the description of each MetaCyc pathway found in prokaryotes.

As for how to analyze the output files - you can run statistics on them as you would for any FeatureTable[Frequency] type in QIIME2. One simple thing to do would be to create a PCoA as shown in the tutorial. You could also try ANCOM on the output tables, which is available through QIIME2. Otherwise you could export the tables out of QIIME2 and run analyses in your favourite statistics program.
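For anyone taking the export route, a distance-based analysis such as PCoA starts from pairwise dissimilarities between samples. The following is only a minimal sketch of Bray-Curtis dissimilarity on a toy table, not QIIME2's implementation; the function name `bray_curtis` and all values are invented for illustration.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Toy table: one list of pathway abundances per sample (invented values).
samples = {
    "sample1": [10.0, 0.0, 5.0],
    "sample2": [10.0, 0.0, 5.0],
    "sample3": [0.0, 15.0, 0.0],
}

ids = sorted(samples)
distances = {
    (a, b): bray_curtis(samples[a], samples[b])
    for a in ids
    for b in ids
}

for (a, b), d in distances.items():
    if a < b:
        print(a, b, d)
```

The resulting matrix is what an ordination method would take as input; a dedicated statistics package is the better choice for real analyses.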


Best,

Gavin 

Chan Nguyen

Oct 31, 2018, 6:22:28 PM
to picrust-users
Hi Gavin, 
Thank you for your reply. Yes, I got those files. In PICRUSt1 there is a step called categorize_by_function; is there a similar one in PICRUSt2? Also, I want to focus on just a few pathways. Can I use the filter command in QIIME2 to select them?
Cheers, 
Chan

Gavin Douglas

Nov 1, 2018, 8:41:59 AM
to picrus...@googlegroups.com
Hi Chan,

The categorize_by_function.py program collapsed gene families to higher-level categories. This has been improved in PICRUSt2 to now infer pathway abundances from gene family abundances in a more conservative way using MinPath (and to restrict to prokaryotic pathways only for 16S data). MetaCyc pathways are output by default since KEGG is closed-source. You can input custom pathway databases too (say if you have a KEGG subscription), but currently this is only available with the stand-alone script run_minpath.py (i.e. not in the QIIME2 plugin).

Yes, you can filter the table to a set of feature IDs (the IDs are written the same way as in the description mapfile I pointed you to).
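Outside QIIME2, the same kind of filtering can be sketched in a few lines on an exported table. This assumes the table has been loaded as a dictionary from feature ID to per-sample abundances; the function name `filter_features`, the placeholder IDs, and the values are hypothetical.

```python
def filter_features(table, keep_ids):
    """Keep only rows whose feature ID is in keep_ids.

    table maps feature ID -> list of per-sample abundances.
    Returns the filtered table plus any requested IDs that were not
    found, since IDs must match the mapfile spelling exactly.
    """
    keep = set(keep_ids)
    kept = {fid: row for fid, row in table.items() if fid in keep}
    missing = sorted(keep - set(table))
    return kept, missing


# Toy table; only 1CMET2-PWY comes from this thread, the rest are placeholders.
table = {
    "1CMET2-PWY": [10.5, 3.2],
    "EXAMPLE-PWY-A": [0.0, 7.1],
    "EXAMPLE-PWY-B": [2.4, 2.4],
}
kept, missing = filter_features(table, ["1CMET2-PWY", "NOT-IN-TABLE"])
print(kept)
print(missing)
```

Reporting the IDs that were not found makes it easier to catch spelling mismatches between a list of pathways of interest and the table.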


Best,

Gavin

Chan Nguyen

Nov 8, 2018, 1:41:21 AM
to picrust-users
Hi Gavin, 
I was trying to focus on several pathways and KOs, and I was able to filter them based on feature ID. But what I got was the names of the pathways/KOs and the samples they belonged to, not the names of the taxa contributing to each pathway, which is what I got in PICRUSt1. How can I do that in PICRUSt2?
Cheers, 
Chan

Chan Nguyen

Nov 8, 2018, 1:50:44 AM
to picrust-users
Something like running metagenome_contributions.py: I want to know which taxa contribute to a particular pathway or KO.

Gavin Douglas

Nov 8, 2018, 8:32:36 AM
to picrus...@googlegroups.com
Hi Chan,

Stratified tables (i.e. function abundances stratified by each contributing predicted genome) are output if the "--strat_out" option is used with metagenome_pipeline.py (see here: https://github.com/picrust/picrust2/wiki/Metagenome-prediction). This isn’t implemented in the QIIME2 plugin yet since I haven’t determined whether an existing artifact type would work with stratified output or whether I need to make a new one. Note that it increases the memory usage and running time by quite a bit - you can help alleviate this by using the "--min_reads" and "--min_samples" options to collapse rare ASVs into a single category.

If you feed this stratified table into run_minpath.py you will also get stratified pathway abundances.
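To illustrate what can be done with stratified output, here is a minimal sketch that computes each taxon's share of one function's total abundance. It assumes the stratified table has already been reshaped into rows of (sample, function, taxon, abundance); the actual column layout depends on the PICRUSt2 version, so check the file header first. The function name `contributions` and the toy rows are hypothetical.

```python
from collections import defaultdict


def contributions(rows, function_id):
    """Fraction of a function's total abundance contributed by each taxon.

    rows: iterable of (sample, function, taxon, abundance) tuples.
    Abundances are summed across all samples before taking fractions.
    """
    per_taxon = defaultdict(float)
    for sample, function, taxon, abundance in rows:
        if function == function_id:
            per_taxon[taxon] += abundance
    total = sum(per_taxon.values())
    if total == 0:
        return {}
    return {taxon: value / total for taxon, value in per_taxon.items()}


# Toy rows for illustration only; the abundances are invented.
rows = [
    ("sample1", "K00001", "ASV_1", 6.0),
    ("sample1", "K00001", "ASV_2", 2.0),
    ("sample2", "K00001", "ASV_1", 0.0),
    ("sample2", "K00002", "ASV_3", 4.0),
]
shares = contributions(rows, "K00001")
print(shares)
```

Summing across samples is one design choice among several; grouping by sample first would show how the contributing taxa differ between samples.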


Best,

Gavin

danielava...@gmail.com

Aug 30, 2020, 5:01:35 AM
to picrust-users
Hello Gavin,

Thank you for the explanation. Would you have a current link where I can access the pathway IDs? I am also working with PICRUSt2 output. I tried the link you posted, but it no longer works.
Thank you very much in advance.
Best,
Daniela

Gavin Douglas

Aug 31, 2020, 8:33:11 AM
to picrus...@googlegroups.com
Hey Daniela,

You can find that mapfile as part of the PICRUSt2 repo now (https://github.com/picrust/picrust2/) in this subdirectory: picrust2/default_files/pathway_mapfiles.


Best,

Gavin


