Downstream processing of q2-picrust2 outputs

Matthew Slattery

Feb 14, 2019, 5:09:17 PM

to picrust-users

I successfully ran the PICRUSt2 QIIME2 plugin for my 16S data, but now I am not sure what to do with the EC metagenome, KO metagenome, and pathway abundance artifacts.

Is there another program that most users turn to when they begin comparing these predicted metagenomes? For example, I can see differences in the total number of unique pathways between my samples using the pathway_abundance.qza artifact, but I don't know how to identify which pathways are changing, which samples they are associated with, or what their biological function is.

Is running the full PICRUSt2 pipeline necessary for these comparisons? If so, and I obtain the full PICRUSt2 output files, what's the best way to begin analyzing that data?

This is my first time working with 16S and predicted metagenome data, so any advice or guidance is greatly appreciated.

Thank you,

Matt

Gavin Douglas

Feb 15, 2019, 8:30:13 AM

to picrus...@googlegroups.com

Hi Matt,

Essentially, you can analyze these tables using any approach you would use for 16S abundance data. There are a lot of options out there, and which approaches are best is highly debated in the field! The typical question is whether the relative abundance of a gene family or pathway differs between sample groupings (e.g. case and control samples). Note that pre-processing can also be an important step: for example, getting rid of rare features before testing, and making sure the abundance data is in the format the tool expects (such as relative abundances rather than the raw abundances output by PICRUSt2) before running analyses.
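
As a minimal sketch of that pre-processing, here is one way to convert raw abundances to relative abundances and drop rare features in plain Python. The table, the function names, and the thresholds (0.001 in at least 2 samples) are made up for illustration; they are not PICRUSt2 defaults or recommendations.

```python
# Hypothetical toy table: rows are pathways, columns are samples
# (raw, unnormalised abundances like those PICRUSt2 outputs).
table = {
    "PWY-A": {"S1": 120.0, "S2": 80.0, "S3": 100.0},
    "PWY-B": {"S1": 880.0, "S2": 919.5, "S3": 900.0},
    "PWY-C": {"S1": 0.0, "S2": 0.5, "S3": 0.0},
}


def to_relative_abundance(table):
    """Divide each value by its sample (column) total so each sample sums to 1."""
    samples = sorted({s for row in table.values() for s in row})
    totals = {s: sum(row.get(s, 0.0) for row in table.values()) for s in samples}
    return {
        feature: {
            s: (row.get(s, 0.0) / totals[s] if totals[s] > 0 else 0.0)
            for s in samples
        }
        for feature, row in table.items()
    }


def drop_rare_features(rel_table, min_rel_abundance, min_samples):
    """Keep features that reach min_rel_abundance in at least min_samples samples."""
    kept = {}
    for feature, row in rel_table.items():
        n_pass = sum(1 for v in row.values() if v >= min_rel_abundance)
        if n_pass >= min_samples:
            kept[feature] = row
    return kept


rel = to_relative_abundance(table)
# Arbitrary illustrative thresholds, not a recommended cut-off.
filtered = drop_rare_features(rel, min_rel_abundance=0.001, min_samples=2)
print(sorted(filtered))
```

Note that once features are dropped, the remaining relative abundances in a sample no longer sum to exactly 1; whether to renormalise depends on what the downstream tool expects.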

I like running my analyses in R, so I personally would export those files to tab-delimited format and read them in. There are numerous R packages you could use, like ALDEx2 (https://bioconductor.org/packages/release/bioc/html/ALDEx2.html), if you want to try that approach. However, R has a bit of a learning curve, so you’ll have to decide whether it would be worth your time.
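
The same export-and-read-in step can be done outside R. The sketch below is in Python and assumes the artifact has already been exported and converted to a tab-delimited table (typically with `qiime tools export` followed by `biom convert --to-tsv`), so that the file starts with an optional comment line and a header row beginning with `#OTU ID`. The function name and the inline example data are hypothetical.

```python
import csv
import io


def read_feature_table(handle):
    """Parse a tab-delimited feature table into {feature: {sample: value}}.

    Assumed layout: optional single-field comment lines starting with '#',
    then a header row whose first field is the feature-ID column name.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = None
    table = {}
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if header is None:
            # A lone '#...' field is a comment line, not the header row.
            if fields[0].startswith("#") and len(fields) == 1:
                continue
            header = fields[1:]  # sample IDs
            continue
        table[fields[0]] = {s: float(v) for s, v in zip(header, fields[1:])}
    return table


# Inline stand-in for an exported file; replace with open("your_table.tsv").
example = io.StringIO(
    "# Constructed from biom file\n"
    "#OTU ID\tS1\tS2\n"
    "PWY-A\t120.0\t80.0\n"
    "PWY-B\t880.0\t920.0\n"
)
table = read_feature_table(example)
print(table["PWY-A"]["S2"])
```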

You could also run ANCOM, which is part of QIIME2. This should be straightforward to run, although I have found it to have very low statistical power on test datasets (which is certainly better than calling many false positives!).

Best,

Gavin

--
You received this message because you are subscribed to the Google Groups "picrust-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to picrust-user...@googlegroups.com.
To post to this group, send email to picrus...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Matthew Slattery

Feb 19, 2019, 3:45:35 PM

to picrust-users

Gavin,

Thank you for the prompt and thorough reply! Transforming the data into relative abundance files was one step that I had missed, and I will look into trimming rare features.

I'm not overly familiar with R, but I will look into ALDEx2. Is there a community hub where I might find more discussion and debate surrounding those types of analytical tools?

I had limited success working with ANCOM in QIIME2, but it sent me in the right direction. Do you know if it's possible to use gneiss for the PICRUSt2 outputs? I didn't see an option for generating balances without a taxonomic tree (i.e. the "gneiss balance-taxonomy" command in QIIME2 requires a taxonomic tree input, preventing me from calculating the balances for pathway abundances).

Thanks again,

Matt

Gavin Douglas

Feb 19, 2019, 8:31:22 PM

to picrus...@googlegroups.com

Hey Matt,

I don't know of any discussion forum on these tools in general, sorry. An easy-to-use tool that I should have mentioned is STAMP (http://kiwi.cs.dal.ca/Software/STAMP). You would be able to run basic tests for differential abundance with this tool after converting your data to relative abundance. The caveat is that the statistical approaches implemented in this tool aren't the best for compositional data (see: https://www.frontiersin.org/articles/10.3389/fmicb.2017.02224/full).

I haven't tried GNEISS with PICRUSt2 output. I think it could be possible if one made a hierarchical tree based on some distance measure between features (i.e. something analogous to phylogenetic distance between ASVs). It's not obvious to me how this would be done though!

Best,

Gavin

Matthew Slattery

Feb 20, 2019, 6:00:24 PM

to picrust-users

STAMP looks perfect.

You rock,

Matt

Jordan Stanford

Apr 21, 2020, 6:03:28 AM

to picrust-users

Hi Gavin,

Sorry, I hope you don't mind me jumping into this topic. I'd like your advice on what would be considered "rare" features, and whether there is a standard cut-off commonly suggested for PICRUSt data. I have converted the counts into relative abundances, and for many of the pathways the relative abundance is as low as 2.89E-05.

Any assistance is greatly appreciated.

Many thanks,

Jordan

Gavin Douglas

Apr 21, 2020, 10:52:05 AM

to picrus...@googlegroups.com

Hi Jordan,

There’s no clear cut-off, unfortunately. Researchers often use somewhat arbitrary cut-offs when excluding features from data like this. The motivation is often to reduce the number of features tested (to reduce the burden of multiple testing), so one way to figure out a useful cut-off is to see how many features remain at each candidate value.
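
A minimal sketch of that "see how many features remain" check, in Python, using made-up relative abundances. The rule used here (a feature is kept if its maximum relative abundance across samples reaches the cut-off) and the candidate values are assumptions for illustration only.

```python
# Hypothetical relative abundances: {pathway: [value per sample]}.
rel_abund = {
    "PWY-A": [0.12, 0.08, 0.10],
    "PWY-B": [0.0005, 0.0002, 0.0],
    "PWY-C": [0.00003, 0.0, 0.00002],
    "PWY-D": [0.004, 0.006, 0.005],
}


def features_remaining(rel_abund, cutoff):
    """Count features whose maximum relative abundance reaches the cutoff."""
    return sum(1 for values in rel_abund.values() if max(values) >= cutoff)


# Arbitrary candidate cut-offs to compare; none of these is a standard.
candidate_cutoffs = [1e-5, 1e-4, 1e-3, 1e-2]
remaining = {c: features_remaining(rel_abund, c) for c in candidate_cutoffs}

for cutoff, n in remaining.items():
    print(f"cutoff {cutoff:g}: {n} of {len(rel_abund)} features remain")
```

Scanning the counts this way shows how quickly the number of tests shrinks as the cut-off rises, which is the trade-off against multiple-testing burden described above.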

For ASVs it’s easier because there actually is an expected rate of bleed-through of DNA from past runs (for the MiSeq at least), which you can use to calculate a rough cut-off. Unfortunately, it’s much less straightforward for other sequencing technologies and datatypes.
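
As a rough illustration of that arithmetic, the Python sketch below multiplies an assumed bleed-through rate by the mean sample depth to get a minimum read count for ASVs. Both the rate (0.001) and the choice to scale by mean per-sample depth are placeholder assumptions, not values given in this thread; substitute the figure appropriate to your own sequencing setup.

```python
def bleed_through_cutoff(sample_depths, bleed_through_rate):
    """Rough minimum read count: assumed bleed-through rate x mean sample depth."""
    mean_depth = sum(sample_depths) / len(sample_depths)
    return bleed_through_rate * mean_depth


# Placeholder values for illustration only.
depths = [40000, 50000, 60000]   # hypothetical reads per sample
ASSUMED_RATE = 0.001             # hypothetical bleed-through rate

cutoff = bleed_through_cutoff(depths, ASSUMED_RATE)

# Hypothetical total read counts per ASV across the run.
asv_totals = {"ASV1": 5200, "ASV2": 48, "ASV3": 75}
kept = {asv: n for asv, n in asv_totals.items() if n >= cutoff}
print(sorted(kept))
```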

Best,

Gavin

Jordan Stanford

Apr 21, 2020, 6:31:23 PM

to picrus...@googlegroups.com

Thanks so much Gavin for the comprehensive and prompt reply. I really appreciate it.

Can I also ask: I noticed that for downstream analysis you recommended ALDEx2 as a potentially useful tool in R. I’ve just looked up its functions and wanted to double-check: do you input relative abundance data instead of counts and then do the centred log-ratio transformation step, or do you skip this step for relative abundance data?

Also, could MaAsLin be another option for analysing these PICRUSt results?

Many thanks,

Jordan

Gavin Douglas

Apr 22, 2020, 9:13:39 AM

to picrus...@googlegroups.com

Hi Jordan,

Previously I’ve rounded the PICRUSt2 output and used ALDEx2. However, it’s important to realize that interpreting the differential abundance tests for metagenome predictions (and for microbiome data in general) isn’t trivial. Unfortunately, different tools can give starkly different results, which means that you have to be really cautious in interpreting them. I can’t recommend any particular differential abundance tool because I think it’s still an open question which method works best. You can get some insight into this issue in the latest version of the PICRUSt2 preprint (https://www.biorxiv.org/content/10.1101/672295v2), especially by looking at supplementary figure 13.
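
To make the rounding step and the centred log-ratio (CLR) idea concrete, here is a small Python sketch on made-up values for a single sample. It only illustrates the two operations; it is not ALDEx2's implementation. The fixed pseudocount of 0.5 used to avoid log(0) is an assumption of this sketch, standing in for however a given tool handles zeros.

```python
import math


def round_to_counts(values):
    """Round non-negative abundances to the nearest integer (halves round up)."""
    return [int(math.floor(v + 0.5)) for v in values]


def clr(counts, pseudocount=0.5):
    """Centred log-ratio: log of each value minus the mean log across features.

    The pseudocount is an illustrative assumption to avoid taking log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


# Hypothetical predicted abundances for one sample (one value per feature).
sample = [120.4, 879.6, 0.49, 12.5]

counts = round_to_counts(sample)   # integer counts, as count-based tools expect
transformed = clr(counts)          # CLR values are centred on zero

print(counts)
print([round(x, 3) for x in transformed])
```

Because the CLR is computed within each sample, the transformed values always sum to zero, which is why it is applied to counts (or count-like data) rather than stacked on top of an earlier relative-abundance normalisation.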

Hopefully that’s helpful,

Gavin

Wolfgang Rumpf

May 19, 2020, 12:00:02 PM

to picrust-users

STAMP does look great, but I am having hellacious problems getting it installed on Mac or Linux. Any pointers?

Gavin Douglas

May 20, 2020, 1:42:23 PM

to picrus...@googlegroups.com

Sorry, I haven’t installed it recently, but in the past installing it with conda on Linux has worked for me. There is a STAMP Google group you could check: https://groups.google.com/forum/?hl=en#!forum/stamp_help

Best,

Gavin
