19 views

Skip to first unread message

Jul 16, 2021, 8:07:51 PMJul 16

to bali-phy-users

Hi,

Sorry for the possibly silly question. I am a little confused. Below is the relevant output from bali-phy (version 3.6.0) help commands:

bali-phy help m2: posP: The fraction of positively selected sites.

bali-phy help m2a: posP: The fraction of positively selected sites.

bali-phy help m8: posP: The fraction of neutral sites.

bali-phy help m8a: posP: The fraction of neutral sites.

Is this a mistake in the documentation or does posP value have different meanings between m2 and m8 models?

Thanks,

Karolis

Jul 16, 2021, 8:24:05 PMJul 16

to bali-ph...@googlegroups.com

Hi Karolis,

Yeah, that is definitely a mistake in the documentation! I have
fixed it on the master branch. If I make a 3.7.0 release before I
do a 4.0, then I will include it.

All the site models have `posP` as the fraction of sites that are
positively selected with dN/dS = `posW`.

-BenRI

--

You received this message because you are subscribed to the Google Groups "bali-phy-users" group.

To unsubscribe from this group and stop receiving emails from it, send an email to bali-phy-user...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/bali-phy-users/0dc8756b-71e9-4472-8c8b-2f20a7fcb0e2n%40googlegroups.com.

Jul 17, 2021, 5:56:49 PMJul 17

to bali-ph...@googlegroups.com

OK, so actually my original response here wasn't quite right. I've updated the command-line documentation so that we have:

m2a posP: The fraction of positively selected sites. m2a_test posP: The fraction of positively selected sites. m8a posP: The fraction of *neutral* sites. m8 posP: The fraction of positively selected sites. m8a_test posP: The fraction of positively-selected or neutral sites.

On the master branch, I've updated the command-line documentation for the m8a and m8 models, and added new documentation for the m8a_test model

It is still the case that all the site models have `posP` as the fraction of sites with dN/dS=`posW`. But m8a (which is the same as m8[posW=1]) does not have an explicit posW parameter. Since posW=1 for m8a, posP is the fraction of NEUTRAL sites.

The relationship between all the m<number> models from PAML
is a bit weird and complicated. You may have to read the
references in the m7, m8a, and m2a models to get the full picture.

-BenRI

P.S. The command line documentation comes from JSON files, like
this one:

https://github.com/bredelings/BAli-Phy/blob/master/bindings/models/m8.json

P.P.S. Here is the new description for m8a_test -- it is somewhat long:

Description:

This is a Bayesian version of the M8a test for positive selection. The

original M8a test is a likelihood-ratio test. The Bayesian version of the

test chooses a prior with 50% mass on the null hypothesis H0 and 50% on

the alternative hypothesis H1:

H0: no positive selection

H1: some positive selection.

When posSelection = 0, the value of posW is ignored and dN/dS=1. When

posSelection = 1, the value of posW is used and dN/dS=posW. The posterior

probability of posSelection gives the posterior probability of H1. The

Bayes Factor is given by Pr(posSelection=1)/Pr(posSelection=0).

The M8a test (Swanson et al, 2003) has null and alternative hypotheses

given by:

H0: m8[posW=1] = m8a

H1: m8

This is an improvement over the test originally proposed by Yang (2000):

H0: m8[posP=1] = m7

H1: m8

In the originally proposed likelihood-ratio test, a mixture of conserved

and neutral sites would not fit the null hypothesis (m7), leading to the

erroneous inference of positive selection. In the M8a test, the null

hypothesis (m8a) has been improved to handle both neutral and conserved

sites, so that positive selection is not incorrectly inferred.

Jul 19, 2021, 11:57:16 AMJul 19

to bali-phy-users

Thank you. I was wondering about M8a, this clarifies a lot. For some reason, I am not getting the emails on replies (even though I am subscribed). I actually tracked this down yesterday in the Haskell file SModel.hs:

m8a_test_omega_dist mu gamma n_bins posP posW 0 = m8_omega_dist mu gamma n_bins posP 1.0

m8a_test_omega_dist mu gamma n_bins posP posW _ = m8_omega_dist mu gamma n_bins posP posW

m8a_test_omega_dist mu gamma n_bins posP posW _ = m8_omega_dist mu gamma n_bins posP posW

It is nice to have this documented, however.

Thanks again.

Jul 19, 2021, 12:37:25 PMJul 19

to bali-phy-users

Just to confirm,

In a hypothetical scenario where m8a_test is used. Assuming the resulting estimate of the parameter posSelection is zero (or very close to zero). The estimate of posW should always be one (or close to one). Is that correct?

Thanks

Jul 20, 2021, 4:10:25 PMJul 20

to bali-ph...@googlegroups.com

-------- Forwarded Message
--------

Subject: | Re: Different definitions of the posP parameter between the m2 and m8 models |
---|---|

Date: | Mon, 19 Jul 2021 10:32:40 -0700 |

From: | Benjamin Redelings <benjamin....@gmail.com> |

To: | Karolis Ramanauskas <kra...@uic.edu> |

On 7/19/21 9:33 AM, Karolis Ramanauskas wrote:

Just to confirm,

In a hypothetical scenario where m8a_test is used. Assuming the resulting estimate of the parameter posSelection is zero (or very close to zero). The estimate of posW should always be one (or close to one). Is that correct?

Not the way it is logged right now.

1. When posSelection=0, the value of posW is never used. We just use 1.0 instead. So when posSelection=0, posW is actually drawn from the prior of posW.

When I have used these models, I reported Pr(posSelection=1) and E(posW|posSelection=1) instead of E(posW), so this did not come up.

2. To get expectations conditional on posSeletion=1, I used a command like this:

stats-select -sm8a_test:posSelection=1 < C1.log | statreport --mean -Note that Pr(posSelection=1) = E(posSelection).

3. It is certainly possible IN THEORY to log `if posSelection == 1 then posW else 1.0` instead of `posW`. That is a Haskell expression that would do what you want. However, until version 4 is ready, this might be difficult.

Hmm... do you think E(if posSelection == 1 then posW else 1.0) makes sense mathematically? I guess it would correspond to estimating posW under a prior with a point mass on 1.0....

-BenRI

To view this discussion on the web visit https://groups.google.com/d/msgid/bali-phy-users/9036d201-af1d-48c0-9d80-53151a4806ebn%40googlegroups.com.

Jul 20, 2021, 4:31:25 PMJul 20

to bali-phy-users

Hi,

Points #1 and #2 make sense and clarify the behavior of BAli-Phy. Regarding your question in #3 about the correctness of conditionally logging the sampled posW value (if posSelection=1) or 1.0 (if posSelection=0). I am afraid this question is a little over my head, I have a very rudimentary (and fragmented) understanding of the related theory (although I am slowly learning) and it would be dishonest of me to weigh in. I am sorry.

Thank you for taking the time to help me understand all of this better,

Karolis

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu