The values of 'Family-wide P-value' and 'Viterbi P-values'


Zhipeng Jia

May 25, 2021, 11:58:44 AM
to hahnlab-cafe
Dear CAFE community,
I plan to use CAFE to detect the expansion and contraction of gene families.
There has been a lot of confusion about two of the P-values it reports, the 'Family-wide P-value' and the 'Viterbi P-values'.
When the results (*.cafe file) produced by CAFE are analyzed directly with catetutorial.report.analysis.py, I have found that the script counts contractions and expansions for all gene families, regardless of whether the Family-wide P-value is above or below 0.05. For example, families whose Family-wide P-value is greater than 0.05 are still present in the result statistics file, *.fams.txt.
I know that when both the 'Family-wide P-value' and the 'Viterbi P-values' of a family are less than 0.01, the family is marked as rapidly evolving.
I think we should require both the 'Family-wide P-value' and the 'Viterbi P-values' to be less than 0.05, and then select the expanded or contracted families from the *.cafe file according to that threshold. That is, a gene family is considered expanded or contracted only if both the 'Family-wide P-value' and the 'Viterbi P-values' are less than 0.05. If we do not apply this threshold and instead use catetutorial.report.analysis.py directly, the number of families in the final result is too high.
We can then use these selected families for GO or KEGG enrichment and other analyses.

Am I understanding this correctly?
Thank you in advance.

Best regards,
Zhipeng


Hahn, Matthew

May 27, 2021, 9:49:01 PM
to Zhipeng Jia, hahnlab-cafe
Hi Zhipeng,

I’m not sure I exactly understand the issue, but I’ll try to clarify some CAFE details: 

Any family that changes size at all should be reported in the main output file, whether or not it is rapidly evolving. A subset of families may be evolving rapidly—these are identified via the “family-wide” p-values. For families whose p-value is less than the chosen family-wide p-value threshold, CAFE can also try to figure out which branch or branches are responsible. The method used to do this is the “Viterbi” p-value, which gives p-values for different branches. 
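As a rough illustration of that two-step logic (a sketch only, not CAFE's own code and not the tutorial script): first keep the families whose family-wide p-value is below the chosen threshold, then look at the per-branch Viterbi p-values only for those families. The family IDs, branch names, p-values, and function names below are all made up for the example, and the input is a plain Python dictionary rather than a parsed *.cafe file.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical records: family ID -> (family-wide p-value, {branch: Viterbi p-value}).
# None stands for a branch with no reported Viterbi p-value.
Family = Tuple[float, Dict[str, Optional[float]]]


def rapid_families(families: Dict[str, Family],
                   family_alpha: float = 0.05) -> Dict[str, Family]:
    """Step 1: keep families whose family-wide p-value is below the threshold."""
    return {fid: rec for fid, rec in families.items() if rec[0] < family_alpha}


def responsible_branches(branch_p: Dict[str, Optional[float]],
                         branch_alpha: float = 0.05) -> List[str]:
    """Step 2: list the branches whose Viterbi p-value is below the threshold."""
    return sorted(name for name, p in branch_p.items()
                  if p is not None and p < branch_alpha)


# Made-up example data, for illustration only.
families: Dict[str, Family] = {
    "fam1": (0.001, {"human": 0.002, "chimp": 0.6, "mouse": None}),
    "fam2": (0.30, {"human": 0.01, "chimp": 0.8, "mouse": 0.9}),
    "fam3": (0.02, {"human": 0.4, "chimp": 0.5, "mouse": 0.03}),
    "fam4": (0.04, {"human": 0.2, "chimp": 0.5, "mouse": 0.7}),
}

rapid = rapid_families(families)
branches = {fid: responsible_branches(rec[1]) for fid, rec in rapid.items()}
print(branches)
```

In this toy data, fam2 is dropped at step 1 even though one of its branch p-values is small, because its family-wide p-value is above the threshold. fam4 passes the family-wide test but has no single branch below the branch threshold, so its branch list is empty.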

Does this help?


thanks,
Matt
