Interpretation of p value of 0 in branch model b_neut vs b

Kitty Murphy

unread,

Nov 24, 2021, 9:54:13 AM11/24/21

to The ETE toolkit

Hi ETE3 team,

Could you clarify how to interpret a p value of 0 using the branch model test b_neut vs b_free? Initially I thought these would be interpreted as significant, but I realised for the same gene I would have an insignificant p value after running the LRT M0 vs b_free and then a significant value when running the second LRT b_neut vs b_free, which didn't make sense to me.

After looking into it, it seems that in PAML outputs a p-value of 0 when the LRT is negative: https://www.biostars.org/p/166718/, however, as ETE3 doesn't output the LRT value it's not possible to decipher this.

Many thanks,

Kitty

Kitty Murphy

unread,

Jan 20, 2022, 7:36:22 AM1/20/22

to The ETE toolkit

I also found this to be the case when comparing genes analysed using the branch and branch-site model, such that genes with a p-value of 0 using the branch model had a non-significant p value using the branch-site model.

fransua

unread,

Jan 20, 2022, 9:42:09 AM1/20/22

to The ETE toolkit

Hi,

ETE3 returns p-value computed internally either using chi2 from scipy if installed or using built-in function (I recommend to install scipy).

p-value of 0 should mean significant.

btw, ETE3 is not removing files generated by PAML, you should be able to get the likelihoods from the main outfiles and reproduce the LRT.

finally I do not see inconsistencies in the first case you explain, seems that b_free (the most complicated model) is better than M0 and than b_neut.

In the second case I don't understand what you did...

let me know if I can help further (would need more details :) )

cheers

Kitty Murphy

unread,

Jan 21, 2022, 9:42:18 AM1/21/22

to The ETE toolkit

Hi Fransua,

Thank you for the explanation regarding how the p-value is computed, and good to know about the PAML files, I rarely look past the log file with the main results!

I just assumed if the b_neut b_free LRT is significant (the branch of interest has omega significantly greater than 1), then the M0 b_free LRT should also be significant (the branch of interest is not evolving neutrally). I should probably look back at exactly how these models are working..

I'm still confused about the second case (significant in branch model (bneut,bfree) but not branch-site (bSA1,bsA)), perhaps I'm calling the models incorrectly? Below is my code for running the branch and branch-site models.

BRANCH

ete3 evol -t species_tree.nw --alg $gene-clean.fasta -o $gene-branch --models M0 b_neut b_free --tests b_free,M0 b_free,b_neut --cpu 16 --mark ${marks[*]} >> $gene.branch.log

BRANCH-SITE

ete3 evol -t species_tree.nw --alg $gene-clean.fasta -o $gene-branch-site --models M0 M1 bsA bsA1 --tests M1,bsA bsA1,bsA --cpu 16 --mark ${marks[*]} >> $gene.branch-site.log

Let me know if there is anything else I can share.

Thanks

fransua

unread,

Jan 21, 2022, 9:50:23 AM1/21/22

to The ETE toolkit

Hi Kitty,

it's a bit complicated without the outputs, but b_free vs b_neut does not mean positive selection, it just means that b_neut (omega = 1) is not working well, but b_free can be pointing at omega <1 :)

also (even if your omega is > 1 ) testing on branches (b_free b_neut) is very hard to extrapolate to tests on branches and sites, the number of degrees of freedom is very different. Meaning that even if overall a given branch has a high likelihood to be evolving at a omega rate higher than 1, it can be that at the level of sites their is not enough statistical power.

Kitty Murphy

unread,

Jan 21, 2022, 9:56:51 AM1/21/22

to The ETE toolkit

Thanks for the quick response!

Yes, sorry, I am only interpreting b_free vs b_neut as positive selection if omega on the branch of interest is greater than 1.

I see, this make sense! Thank you so much!

Reply all

Reply to author

Forward

Interpretation of p value of 0 in branch model b_neut vs b_free

Kitty Murphy

Kitty Murphy

fransua

Kitty Murphy

fransua

Kitty Murphy