Picrust 2 vs Tax4fun

Felipe Melis

Sep 9, 2021, 12:34:28 PM
to picrust-users
Hi there, my name is Felipe Melis and I'm a PhD student from Chile.

I'm working with PICRUSt2 and comparing it with Tax4Fun2. I have samples from different soils, and I noticed a huge difference in the results for two methanogenesis-related KOs. Tax4Fun2 predicts the presence of K00400 and K00401 in most of the samples, while in PICRUSt2 those functions appear in only a few samples. Looking in detail at the default reference databases of the two tools, which are used to match the 16S OTUs to references for functional annotation, I found that Tax4Fun2 has a large number of sequences and organisms (thousands of taxa) associated with those methanogenic functions, so it has a high chance of predicting them. By comparison, PICRUSt2 has only around 140 such sequences, so the functions are absent most of the time. Do you have any clue or hint about what causes these big differences?
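To make the comparison above concrete, here is a minimal Python sketch of how the per-KO prevalence (the fraction of samples with a non-zero prediction) could be tabulated for both tools. The two tables, the sample names, and all numbers are made up for illustration; in practice they would be loaded from each tool's KO-by-sample output, and the helper names `prevalence` and `compare_prevalence` are hypothetical.

```python
# Toy KO-by-sample tables. Every value here is invented for illustration;
# real tables would come from each tool's KO prediction output.
picrust2 = {
    "K00400": {"soil_A": 0.0, "soil_B": 0.0, "soil_C": 3.2, "soil_D": 0.0},
    "K00401": {"soil_A": 0.0, "soil_B": 0.0, "soil_C": 2.9, "soil_D": 0.0},
}
tax4fun2 = {
    "K00400": {"soil_A": 0.001, "soil_B": 0.002, "soil_C": 0.004, "soil_D": 0.0},
    "K00401": {"soil_A": 0.001, "soil_B": 0.003, "soil_C": 0.005, "soil_D": 0.001},
}


def prevalence(table, ko):
    """Fraction of samples in which `ko` has a non-zero predicted abundance."""
    values = table.get(ko, {})
    if not values:
        return 0.0
    present = sum(1 for v in values.values() if v > 0)
    return present / len(values)


def compare_prevalence(table_a, table_b, kos):
    """Return {ko: (prevalence in table_a, prevalence in table_b)}."""
    return {ko: (prevalence(table_a, ko), prevalence(table_b, ko)) for ko in kos}


result = compare_prevalence(picrust2, tax4fun2, ["K00400", "K00401"])
for ko, (p2, t4f) in result.items():
    print(f"{ko}: PICRUSt2 prevalence {p2:.2f}, Tax4Fun2 prevalence {t4f:.2f}")
```

Prevalence is used here rather than abundance because the two tools report values on different scales, so only presence or absence per sample is directly comparable in this sketch.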

Gavin Douglas

Sep 9, 2021, 1:34:25 PM
to picrus...@googlegroups.com
Hey Felipe,

This is definitely an interesting observation - I’m not sure why the numbers of reference genomes annotated with those KOs would differ so much. It’s possible that the latest Tax4Fun2 database has more archaeal taxa in the reference database.

Have you dug into the taxonomic classification of the OTUs predicted to encode these functions in your dataset? If these KOs are expected to be restricted to specific archaeal lineages, then it may be a red flag if any bacterial 16S OTUs in your dataset are predicted to encode them by either tool.
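As a rough illustration of that check, the Python sketch below flags OTUs that are predicted to carry one of the KOs but are not classified as Archaea. The taxonomy strings, OTU names, and copy numbers are invented, the semicolon-separated taxonomy format is an assumption, and `flag_unexpected_otus` is a hypothetical helper rather than part of either tool.

```python
# Toy inputs: per-OTU taxonomy and per-OTU predicted KO copy numbers.
# All identifiers and values are made up for illustration.
taxonomy = {
    "OTU_1": "Archaea;Euryarchaeota;Methanobacteria",
    "OTU_2": "Bacteria;Proteobacteria;Gammaproteobacteria",
    "OTU_3": "Bacteria;Firmicutes;Clostridia",
}
ko_copies = {
    "OTU_1": {"K00400": 1, "K00401": 1},
    "OTU_2": {"K00400": 0, "K00401": 0},
    "OTU_3": {"K00400": 1, "K00401": 0},
}


def flag_unexpected_otus(taxonomy, ko_copies, kos, expected_domain="Archaea"):
    """List OTUs predicted to encode any of `kos` outside `expected_domain`.

    Assumes the first semicolon-separated field of the taxonomy string is
    the domain; OTUs without a taxonomy entry are treated as 'Unclassified'.
    """
    flagged = []
    for otu, copies in sorted(ko_copies.items()):
        domain = taxonomy.get(otu, "Unclassified").split(";")[0].strip()
        hits = sorted(ko for ko in kos if copies.get(ko, 0) > 0)
        if hits and domain != expected_domain:
            flagged.append((otu, domain, hits))
    return flagged


flagged = flag_unexpected_otus(taxonomy, ko_copies, ["K00400", "K00401"])
for otu, domain, hits in flagged:
    print(f"{otu} ({domain}) is predicted to encode {', '.join(hits)}")
```

Running the same function on the per-OTU predictions from each tool would show whether either one assigns these KOs to bacterial OTUs.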


Cheers,

Gavin

Chao-Jui Chang

Nov 1, 2024, 7:11:12 AM
to picrust-users
Hi all,

I wonder whether this is caused by the reference KEGG databases used by the two programs.

The original Tax4Fun2 paper (published 18 May 2020) states: "Functional profiles were calculated based on obtained protein sequences with UProC version 1.2.0 [43] using the KEGG Orthology (KO) database for prokaryotes (July 2018 release; [44]) as reference." I also checked the reference database provided by the authors (Tax4Fun2_ReferenceData_v2.tar), and it contains 21,620 KOs.

The original PICRUSt2 paper (published 01 June 2020) states that the total number of KOs in PICRUSt2 is 10,543, compared with 6,909 in PICRUSt1, a 1.5-fold increase. Unfortunately, I could not find any information about the release date of the KO reference used by PICRUSt2.

Leaving the annotation algorithm aside, I think the reason is the database used by each program: the Tax4Fun2 reference contains about twice as many KOs as the PICRUSt2 one.
I think the KEGG database version is currently a disadvantage of PICRUSt2, and it seems the developers are still working on it.
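For reference, the ratios behind these figures can be checked with a few lines of Python. The KO counts are the ones quoted in this message; the function name `fold_difference` is only illustrative.

```python
# KO counts as quoted in this message (from the two papers and the
# Tax4Fun2 reference archive).
ko_counts = {
    "Tax4Fun2": 21620,
    "PICRUSt2": 10543,
    "PICRUSt1": 6909,
}


def fold_difference(counts, numerator, denominator):
    """Ratio of the number of KOs in one reference to another."""
    return counts[numerator] / counts[denominator]


tax4fun2_vs_picrust2 = fold_difference(ko_counts, "Tax4Fun2", "PICRUSt2")
picrust2_vs_picrust1 = fold_difference(ko_counts, "PICRUSt2", "PICRUSt1")

print(f"Tax4Fun2 / PICRUSt2: {tax4fun2_vs_picrust2:.2f}")
print(f"PICRUSt2 / PICRUSt1: {picrust2_vs_picrust1:.2f}")
```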

Sincerely,
Chao-Jui

Robyn Wright

Nov 8, 2024, 7:55:52 AM
to picrust-users
Hi there,

I'm sure the different database versions will have created differences, but I'm not sure exactly how much. The genomes currently used by PICRUSt2 weren't annotated by us; the annotations were provided by IMG/JGI and were acquired in 2017 (some more details are in the supplementary methods of the PICRUSt2 paper).

I am currently working on an update to the PICRUSt2 database for 16S analyses that uses the GTDB genomes and their tree, along with eggNOG annotations of the genomes. So far I have prepared the annotations, the tree files, etc., and am just starting to test them. Once I have (hopefully) verified that these are working, I'll work on a release for more general use.

Robyn
