Hello,
We are working on the positive selection detection on our gene of interest by using the branch-site models of PAML. In the results for the branch of hominids, we have an amino acid that concerns us particularly with a posterior probability of 0.93 for positive selection. However, when we’re looking at the multiple sequence alignment for this position, all species have the same amino acid (Y) (codon TAT or TAC) except the chimpanzee that have a C (TGT). Is it possible that just this substitution can lead to a positive selection in the ancestral branch of the hominids or is it just a false positive? Moreover, this tyrosine is not under positive selection in the chimpanzee branch.
For more information, here is the alignment, the tree we used, as well as the statistics for the alternative model (BEB) for the specific branch of hominids and the .ctl file:
seqfile = ../../codonsAlignment.paml * sequence data file name
treefile = ../subTree.nh * tree structure file name
outfile = mlc * main result file name
noisy = 0 * 0,1,2,3,9: how much rubbish on the screen
verbose = 0 * 1: detailed output, 0: concise output
runmode = 0 * 0: user tree; 1: semi-automatic; 2: automatic
* 3: StepwiseAddition; (4,5):PerturbationNNI; -2: pairwise
seqtype = 1 * 1:codons; 2:AAs; 3:codons-->AAs
CodonFreq = 2 * 0:1/61 each, 1:F1X4, 2:F3X4, 3:codon table
clock = 0 * 0: no clock, unrooted tree, 1: clock, rooted tree
aaDist = 0 * 0:equal, +:geometric; -:linear, {1-5:G1974,Miyata,c,p,v}
model = 2 * models for codons:
* 0:one, 1:b, 2:2 or more dN/dS ratios for branches
NSsites = 2
* 0:one w; 1:NearlyNeutral; 2:PositiveSelection; 3:discrete;
* 4:freqs; 5:gamma;6:2gamma;7:beta;8:beta&w;9:betaγ10:3normal
icode = 0 * 0:standard genetic code; 1:mammalian mt; 2-10:see below
Mgene = 0 * 0:rates, 1:separate; 2:pi, 3:kappa, 4:all
fix_kappa = 0 * 1: kappa fixed, 0: kappa to be estimated
kappa = 2 * initial or fixed kappa
fix_omega = 0 * 1: omega or omega_1 fixed, 0: estimate
omega = 1 * initial or fixed omega, for codons or codon-based AAs
getSE = 0 * 0: don't want them, 1: want S.E.s of estimates
RateAncestor = 0 * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)
Small_Diff = .45e-6
cleandata = 1 * remove sites with ambiguity data (1:yes, 0:no)?
fix_blength = 0 * 0: ignore, -1: random, 1: initial, 2: fixed
(platypus:0.643624,opossum:0.483678,(Lesserhedgehogtenrec:0.140102,((armadillo
:0.075402,sloth:0.04919):0.039775,(((rabbit:0.094968,(kangaroorat:0.083936,(mouse:0.048012,rat:0.038255):0.111631):0.068775):0.003507,(tarsier:0.080627,((Orangutan:0.010399,((human:0.002584,gorilla:0.00086):0,chimpanzee:0.00517
7):0.004382)$1:0.008795,macaque:0.023443):0.080006):0.030935):0.010282,((microbat:0.076961,((dog:0.049111,panda:0.028677):0.028138,Hedgehog:0.128905):0.003903):0.001154,((((sheep:0.003576,cow:0.008284):0.035785,dolphin:0.032995):0.010924,pig:
0.051511):0.020419,horse:0.065234):0.00406):0.007708):0.017615):0):0.241892:0.14191);
dN/dS (w) for site classes (K=4)
site class 0 1 2a 2b
proportion 0.48983 0.45139 0.03059 0.02819
background w 0.16713 1.00000 0.16713 1.00000
foreground w 0.16713 1.00000 41.57506 41.57506
203 Y 0.06941 0.00066 0.92858 0.00135 ( 3)
Thank you very much for your help!
Camille