Help with Clade Model C in PAML: No BEB Sites Detected

58 views
Skip to first unread message

Ao Qo

unread,
May 19, 2025, 3:26:39 AMMay 19
to PAML discussion group

Dear Professor Yang,

I hope this message finds you well.

I am currently using PAML (codeml) to detect positive selection with Clade Model C, following the specification of model = 3 and NSsites = 2. My goal is to compare different clades in a phylogeny of 13 species of African mole-rats (family Bathyergidae), grouped into three clades of interest.

For the Clade Model C analysis, I used the following control file settings (excerpt below):

seqfile = /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/09.onlyB.13.fa2paml/evm.model.ptg000001l.10.paml
treefile = /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/12.1.paml/onlyB/00.tree/unroot.tree.tri
      outfile = clade.tri.mlc             * main result file name

        noisy = 9  * 0,1,2,3,9: how much rubbish on the screen
      verbose = 0  * 0: concise; 1: detailed, 2: too much
      runmode = 0  * 0: user tree

      seqtype = 1  * 1:codons; 2:AAs; 3:codons-->AAs
    CodonFreq = 2  * 0:1/61 each, 1:F1X4, 2:F3X4, 3:codon table
        clock = 0  * 0:no clock, 1:clock; 2:local clock; 3:CombinedAnalysis

                         * Model C: model = 3  NSsites = 2
                         * Model D: model = 3  NSsites = 3

        model = 3  * 3 = clade models
      NSsites = 2  * choose "2" or "3"
                         
        icode = 0  * 0:universal code; 1:mammalian mt; 2-10:see below

    fix_kappa = 0  * 1: kappa fixed, 0: kappa to be estimated
        kappa = 2.5  * initial or fixed kappa

    fix_omega = 0  * 1: omega or omega_1 fixed, 0: estimate
        omega = 0.13579 * initial or fixed omega, for codons or codon-based AAs

    fix_alpha = 1  * 0: estimate gamma shape parameter; 1: fix it at alpha
        alpha = 0. * initial or fixed alpha, 0:infinity (constant rate)

        ncatG = 3  * # of categories in dG of NSsites models

        getSE = 0  * 0: don't want them, 1: want S.E.s of estimates
 RateAncestor = 0  * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)

   Small_Diff = .2e-6
    cleandata = 0  * remove sites with ambiguity data (1:yes, 0:no)?

The tree is unrooted, and I manually specified the three clades as follows in the tree file:

(Bsu #1, ((Cho #2, ((((Fan, Fmi), Fdm), Fda), Fme)), Hgl), Gca #1);

For the null model (M2a_rel), I used the same sequence file and an unlabeled unrooted tree, with the following config file:

seqfile = /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/09.onlyB.13.fa2paml/evm.model.ptg000001l.10.paml
treefile = /data/01/p1/user157/Bathyergidae-sociaty/2024812.newcactus/03.proteincoding-selected-analysis/01.getgene/12.1.paml/onlyB/00.tree/unroot.tree.raw
      outfile = M2a_rel.mlc            * Path to the output file
   
        noisy = 3              * How much rubbish on the screen
      verbose = 1              * More or less detailed report

      seqtype = 1              * Data type
        ndata = 1           * Number of data sets or loci
        icode = 0              * Genetic code
    cleandata = 0              * Remove sites with ambiguity data?
               
        model = 0         * Models for ω varying across lineages
          NSsites = 22          * Models for ω varying across sites
    CodonFreq = 7        * Codon frequencies
          estFreq = 0        * Use observed freqs or estimate freqs by ML
        clock = 0          * Clock model
    fix_omega = 0         * Estimate or fix omega
        omega = 0.5        * Initial or fixed omega

The the two model seems to run normally. However, for all genes analyzed under Clade Model C, the output file clade.tri.mlc does not show any positively selected sites detected by BEB (Bayes Empirical Bayes), even though I expected some sites to be identified in the foreground clades.

I have attached an example output file (clade.tri.mlc and clade.raw.mlc(M2a_rel)) from one of the orthologs for several species.

Thank you very much for your time and for developing such a powerful tool.

Best regards,
Na Wan

clade.tri.mlc.txt
clade.raw.mlc.txt
Reply all
Reply to author
Forward
0 new messages