CodeML for likelihood ratio test, segmentation fault

Elaine G

Nov 20, 2013, 1:21:04 PM
to pamlso...@googlegroups.com
Dear PAML discussion group,

I have a dataset of 762 bp (254 aa) of exon sequence from a single gene for 7 species. I would like to use CodeML to compare the fit of a model with one omega across the entire tree against models with varying omegas on different branches (M0 v. M1 or 2). I was hoping to obtain likelihood values from runs under the alternative models in CodeML so that I could perform a likelihood ratio test. However, I can't seem to find the likelihood in the output file. I am still stuck on the first step of running CodeML with the M0 model. I am also unable to find a tree-wide omega average in the output files.
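
Once both runs report a log-likelihood, the likelihood ratio test described above reduces to a few lines of arithmetic. The following is a minimal Python sketch, not part of PAML: the function names `chi2_sf` and `lrt` are hypothetical, and it assumes the two models are nested, so that twice the lnL difference can be compared against a chi-square distribution whose degrees of freedom equal the difference in the number of free parameters.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    df = int(df)
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) times a finite sum of y^k / k!.
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: start from the df = 1 case and add one term per extra pair of df.
    total = math.erfc(math.sqrt(y))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (k - 0.5) / math.gamma(k + 0.5)
    return total


def lrt(lnl_null, lnl_alt, df):
    """Return (test statistic, p-value) for two nested models.

    lnl_null: log-likelihood of the simpler model (e.g. one omega)
    lnl_alt:  log-likelihood of the richer model (e.g. branch-specific omegas)
    df:       difference in the number of free parameters (assumed known)
    """
    # A slightly negative difference can arise from optimisation noise; clip at 0.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)
```

As an illustration with made-up values, `lrt(-1000.0, -997.0, 2)` gives a statistic of 6.0 and a p-value of about 0.0498.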

I thought the problem might be that I had verbose = 0 and thus was not getting all the information in the output. However, when I change verbose to either 1 or 2, I get a segmentation fault. Output files are still produced, and they contain the same information as those obtained by running CodeML with verbose = 0.

So, I am not sure whether the problem lies in my input data, my control file, or memory allocation.

Any suggestions? I have tried changing many things, but the pattern is the same. I will paste in my control file below.

Many thanks!
Elaine

      seqfile = reduced_taxa_rat.phy
     treefile = reduced_treefile

      outfile = reduced_rat_outfile           * main result file name
        noisy = 9  * 0,1,2,3,9: how much rubbish on the screen
      verbose = 1  * 0: concise; 1: detailed, 2: too much
      runmode = 0  * 0: user tree;  1: semi-automatic;  2: automatic
                   * 3: StepwiseAddition; (4,5):PerturbationNNI; -2: pairwise

      seqtype = 1  * 1:codons; 2:AAs; 3:codons-->AAs
    CodonFreq = 0


        model = 0
      NSsites = 0
        icode = 0  * 0:universal code; 1:mammalian mt; 2-10:see below
        clock = 0  * 0:no clock, 1:global clock; 2:local clock

    fix_kappa = 0  * 1: kappa fixed, 0: kappa to be estimated
        kappa = 10  * initial or fixed kappa
    fix_omega = 0  * 1: fix omega at omega (below), 0: estimate omega
        omega = 0.1  * initial or fixed omega, for codons or codon-based AAs

    fix_alpha = 1  * 0: estimate gamma shape parameter; 1: fix it at alpha
        alpha = 0. * initial or fixed alpha, 0:infinity (constant rate)
       Malpha = 0  * different alphas for genes
        ncatG = 0  * # of categories in dG of NSsites models

        getSE = 1  * 0: don't want them, 1: want S.E.s of estimates
 RateAncestor = 0  * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)

   Small_Diff = 1e-6
*    cleandata = 0  * remove sites with ambiguity data (1:yes, 0:no)?
        method = 0   * 0: simultaneous; 1: one branch at a time
   fix_blength = 0  * 0: ignore, -1: random, 1: initial, 2: fixed
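
For locating the likelihood in the main result file, a small script can scan for the line that begins with `lnL(`. The Python sketch below is an illustration rather than a PAML tool: the function name `find_lnl` is hypothetical, and the regular expression assumes a line of the form `lnL(ntime: 11  np: 13):  -1957.330648      +0.000000` (a made-up sample, not taken from this run).

```python
import re

# Assumed line format: lnL(ntime: <branches>  np: <parameters>):  <lnL>  <...>
LNL_RE = re.compile(r"lnL\(ntime:\s*(\d+)\s+np:\s*(\d+)\):\s*(-?\d+(?:\.\d+)?)")


def find_lnl(text):
    """Return a list of (ntime, np, lnL) tuples, one per matching line in text."""
    return [
        (int(ntime), int(n_params), float(lnl))
        for ntime, n_params, lnl in LNL_RE.findall(text)
    ]
```

Calling `find_lnl(open("reduced_rat_outfile").read())` on the outfile named in the control file would list one tuple per fitted model, and the `np` values from two nested runs give the degrees of freedom for the test. An empty list would suggest that the run stopped before the optimization finished.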

Ziheng

Dec 3, 2013, 4:03:49 PM
to pamlso...@googlegroups.com
Do you see error messages on the monitor?
Ziheng

Elaine G

Dec 10, 2013, 9:52:02 PM
to pamlso...@googlegroups.com
Hi Ziheng,

This is the output to my terminal screen:


ns = 7   ls = 762
Reading sequences, sequential format..
Reading seq # 7: hmn     
Sequences read..
Counting site patterns..  0:00
         172 patterns at      254 /      254 sites (100.0%),  0:00
Counting codons..

  2   1:Sites   188.0 +  574.0 =  762.0 Diffs    86.5 +   78.5 =  165.0
  3   1:Sites   190.1 +  571.9 =  762.0 Diffs    80.5 +   71.5 =  152.0
  3   2:Sites   190.5 +  571.5 =  762.0 Diffs    21.5 +   37.5 =   59.0
  4   1:Sites   190.2 +  571.8 =  762.0 Diffs    81.0 +   66.0 =  147.0
  4   2:Sites   190.6 +  571.4 =  762.0 Diffs    22.0 +   38.0 =   60.0
  4   3:Sites   192.7 +  569.3 =  762.0 Diffs     8.0 +   12.0 =   20.0
  5   1:Sites   189.8 +  572.2 =  762.0 Diffs    80.5 +   73.5 =  154.0
  5   2:Sites   190.2 +  571.8 =  762.0 Diffs    27.0 +   42.0 =   69.0
  5   3:Sites   192.3 +  569.7 =  762.0 Diffs    15.0 +   15.0 =   30.0
  5   4:Sites   192.4 +  569.6 =  762.0 Diffs    12.0 +   10.0 =   22.0
  6   1:Sites   189.9 +  572.1 =  762.0 Diffs    78.0 +   71.0 =  149.0
  6   2:Sites   190.4 +  571.6 =  762.0 Diffs    23.5 +   37.5 =   61.0
  6   3:Sites   192.4 +  569.6 =  762.0 Diffs    11.0 +   12.0 =   23.0
  6   4:Sites   192.5 +  569.5 =  762.0 Diffs     9.0 +    6.0 =   15.0
  6   5:Sites   192.1 +  569.9 =  762.0 Diffs    12.0 +    9.0 =   21.0
  7   1:Sites   190.4 +  571.6 =  762.0 Diffs    79.0 +   69.0 =  148.0
  7   2:Sites   190.8 +  571.2 =  762.0 Diffs    23.0 +   36.0 =   59.0
  7   3:Sites   192.9 +  569.1 =  762.0 Diffs    10.0 +   12.0 =   22.0
  7   4:Sites   193.0 +  569.0 =  762.0 Diffs     8.0 +    6.0 =   14.0
  7   5:Sites   192.6 +  569.4 =  762.0 Diffs    11.0 +    9.0 =   20.0
  7   6:Sites   192.7 +  569.3 =  762.0 Diffs     1.0 +    4.0 =    5.0

      168 bytes for distance
   167872 bytes for conP
        0 bytes for fhK
  5000000 bytes for space
Segmentation fault: 11

Thanks!

Ziheng

Feb 1, 2014, 6:08:33 AM
to pamlso...@googlegroups.com
I don't see any obvious problem.
To determine the reason for the crash, I need to know which version you are using, which platform, and how you compiled the program.
Does the program work on the example datasets in the package?
If you are using a Mac, I seem to remember there were issues with how the program is compiled, which were discussed here before. Perhaps you can search this forum for compiling PAML on the Mac, or something like that.
Best,
Ziheng