Result with branch-site model

Camille

unread,

Mar 8, 2013, 1:25:12 PM3/8/13

to pamlso...@googlegroups.com

Hello,

We are working on the positive selection detection on our gene of interest by using the branch-site models of PAML. In the results for the branch of hominids, we have an amino acid that concerns us particularly with a posterior probability of 0.93 for positive selection. However, when we’re looking at the multiple sequence alignment for this position, all species have the same amino acid (Y) (codon TAT or TAC) except the chimpanzee that have a C (TGT). Is it possible that just this substitution can lead to a positive selection in the ancestral branch of the hominids or is it just a false positive? Moreover, this tyrosine is not under positive selection in the chimpanzee branch.

For more information, here is the alignment, the tree we used, as well as the statistics for the alternative model (BEB) for the specific branch of hominids and the .ctl file:

seqfile = ../../codonsAlignment.paml * sequence data file name

treefile = ../subTree.nh * tree structure file name

outfile = mlc * main result file name

noisy = 0 * 0,1,2,3,9: how much rubbish on the screen

verbose = 0 * 1: detailed output, 0: concise output

runmode = 0 * 0: user tree; 1: semi-automatic; 2: automatic

* 3: StepwiseAddition; (4,5):PerturbationNNI; -2: pairwise

seqtype = 1 * 1:codons; 2:AAs; 3:codons-->AAs

CodonFreq = 2 * 0:1/61 each, 1:F1X4, 2:F3X4, 3:codon table

clock = 0 * 0: no clock, unrooted tree, 1: clock, rooted tree

aaDist = 0 * 0:equal, +:geometric; -:linear, {1-5:G1974,Miyata,c,p,v}

model = 2 * models for codons:

* 0:one, 1:b, 2:2 or more dN/dS ratios for branches

NSsites = 2

* 0:one w; 1:NearlyNeutral; 2:PositiveSelection; 3:discrete;

* 4:freqs; 5:gamma;6:2gamma;7:beta;8:beta&w;9:betaγ10:3normal

icode = 0 * 0:standard genetic code; 1:mammalian mt; 2-10:see below

Mgene = 0 * 0:rates, 1:separate; 2:pi, 3:kappa, 4:all

fix_kappa = 0 * 1: kappa fixed, 0: kappa to be estimated

kappa = 2 * initial or fixed kappa

fix_omega = 0 * 1: omega or omega_1 fixed, 0: estimate

omega = 1 * initial or fixed omega, for codons or codon-based AAs

getSE = 0 * 0: don't want them, 1: want S.E.s of estimates

RateAncestor = 0 * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)

Small_Diff = .45e-6

cleandata = 1 * remove sites with ambiguity data (1:yes, 0:no)?

fix_blength = 0 * 0: ignore, -1: random, 1: initial, 2: fixed

(platypus:0.643624,opossum:0.483678,(Lesserhedgehogtenrec:0.140102,((armadillo
:0.075402,sloth:0.04919):0.039775,(((rabbit:0.094968,(kangaroorat:0.083936,(mouse:0.048012,rat:0.038255):0.111631):0.068775):0.003507,(tarsier:0.080627,((Orangutan:0.010399,((human:0.002584,gorilla:0.00086):0,chimpanzee:0.00517
7):0.004382)$1:0.008795,macaque:0.023443):0.080006):0.030935):0.010282,((microbat:0.076961,((dog:0.049111,panda:0.028677):0.028138,Hedgehog:0.128905):0.003903):0.001154,((((sheep:0.003576,cow:0.008284):0.035785,dolphin:0.032995):0.010924,pig:
0.051511):0.020419,horse:0.065234):0.00406):0.007708):0.017615):0):0.241892:0.14191);

dN/dS (w) for site classes (K=4)

site class 0 1 2a 2b

proportion 0.48983 0.45139 0.03059 0.02819

background w 0.16713 1.00000 0.16713 1.00000

foreground w 0.16713 1.00000 41.57506 41.57506

203 Y 0.06941 0.00066 0.92858 0.00135 ( 3)

Thank you very much for your help!

Camille

WebRep

Évaluation globale

alignment.PNG

Ziheng

unread,

Mar 17, 2013, 6:24:49 AM3/17/13

to pamlso...@googlegroups.com

one single amino acid change can generate a good signal in the branch-site test, but we have not come across a case where the signal is statistically significant.

anyway the result does sound strange from your description. i understand that the single amino acid change occurs on the chimp branch, the test did not detect evidence for the chimp branch but it does so for the branch ancestral to the human and the chimp. if that is the case and if you like, you can send me the data files (control file, tree file and alignment) for me to have a look.

ziheng yang

Camille

unread,

Mar 25, 2013, 2:46:11 PM3/25/13

to pamlso...@googlegroups.com

After a discussion by email, Dr Yang wrote me this answer:

it is not so surprising that site 203 Y is selected if you label all the 7 branches as foreground. The model assumes that all foreground branches are under the same selective pressure with the same sites under the same selective regime. To explain the amino acid change on the chimp branch, which is rather short, a large w may be necessary.
What is strange is that the site is not picked up when you label the chimp branch only as the single foreground branch, as you said

"Moreover, this tyrosine is not under positive selection in the chimpanzee branch."

I just duplicated this analysis, and got the following for the BEB analysis, with site 203Y picked up.

203 Y 0.985*

dN/dS (w) for site classes (K=4)

site class 0 1 2a 2b

proportion       0.00000 0.00000 0.50548 0.49452
background w     0.16209 1.00000 0.16209 1.00000
foreground w     0.16209 1.00000 999.00000 999.00000

For this analysis, you label the single chimp branch with #1.

Thanks to him for his responses.

Reply all

Reply to author

Forward