Error: some gene labels are missing


kim

Dec 14, 2022, 2:19:44 PM
to PAML discussion group
Hello,
I am running codeml and getting the following error message "Error: some gene labels are missing." I'd appreciate help in identifying what is causing the error. Below I have copied my control file, the first two lines of my seqfile, and the last lines of standard output. 
Thank you,
Kim

#control file:

      seqfile = core_gene_alignment.phy       * sequence data filename
      outfile = out.txt   * main result file name

        noisy = 9      * 0,1,2,3,9: how much rubbish on the screen
      verbose = 1      * 1:detailed output
      runmode = -2     * -2:pairwise
        Mgene = 1
      seqtype = 1      * 1:codons
    CodonFreq = 1      * 0:equal, 1:F1X4, 2:F3X4, 3:F61
        model = 0      *
      NSsites = 0      *
        icode = 3      * 3:mold mt.

#first two lines of seqfile:

 95 237111 G
G 255 401 213 171 321 534 449 570 507 242 276 181 102 194 201 270 300 469 728 131 398 221 232 439 463 467 599 378 145 272 301 320 204 185 451 296 287 333 326 694 336 439 466 664 666 511 476 418 455 197 160 316 176 349 205 78 245 97 178 171 569 384 137 309 86 401 311 229 231 105 127 119 126 335 871 227 527 279 92 412 321 201 215 119 153 96 242 271 465 87 591 156 186 81 100 1064 702 288 303 286 274 64 220 564 263 168 191 575 291 248 111 180 335 412 90 888 75 599 217 231 494 326 685 238 935 332 250 383 371 442 619 146 166 246 275 327 324 191 385 834 374 334 311 453 427 460 429 404 195 225 724 246 317 184 89 578 534 508 234 194 281 152 240 288 304 227 246 675 218 144 231 282 231 181 192 145 180 147 122 122 148 136 120 121 120 118 99 107 85 50 83 64 332 1224 332 227 205 156 131 131 150 134 139 139 87 94 91 90 62 270 312 186 227 296 243 228 245 195 864 473 150 632 381 143 184 579 654 237 316 219 636 163 201 229 174 327 280 305 402 414 336 96 298 948 669 823 284 151 270 538 367 256 616 272 146 356


#Last lines of standard output (there's more before this, but the last lines seem most relevant):
282 sites in gene 246 go to file Gene246.seq
151 sites in gene 247 go to file Gene247.seq
267 sites in gene 248 go to file Gene248.seq
250 sites in gene 249 go to file Gene249.seq
163 sites in gene 250 go to file Gene250.seq
253 sites in gene 251 go to file Gene251.seq
208 sites in gene 252 go to file Gene252.seq
261 sites in gene 253 go to file Gene253.seq
144 sites in gene 254 go to file Gene254.seq
355 sites in gene 255 go to file Gene255.seq

codon site   5412: TCT TCT TCT TCT TCT TCT TCC TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT AGT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT TCT
codon site   5587: AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT TCC TCC TCC TCC AGC AGC AGC AGC TCC AGC AGC TCC AGC TCC TCC AGC AGC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC TCC AGC TCC AGC TCC TCC AGC AGC AGC TCC TCC TCC TCC TCT TCC TCC TCC TCC AGC AGC TCC TCC TCC TCC TCC TCC AGT AGT AGT AGT AGT AGT TCC TCC TCC TCC TCC TCC AGC AGC AGC
codon site  25230: TCT TCC TCC AGC AGC AGC TCC TCC TCC TCC AGC TCT TCC TCC TCC AGC AGT AGT AGC AGT AGC AGT AGT AGT AGT AGT AGT AGT AGT AGT AGC AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGC AGT AGC AGC AGT AGC AGC AGC AGT AGC AGT AGT AGT AGT AGT AGC AGC AGT AGT AGC AGC AGC AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGC AGC AGC AGC TCC TCC TCC TCC TCC TCC AGC AGT AGC AGC AGT AGT AGT AGT AGT
codon site  28012: AGC TCC TCC TCC TCC TCC TCC TCC TCC AGT AGT AGT AGT TCC AGT AGT AGT AGT AGT AGC AGC AGC AGC AGT AGC AGC AGC AGC AGC AGC AGT AGC AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGT AGC AGT AGT AGT AGT AGT AGT AGT AGT AGC AGT AGC AGT AGT AGT AGT AGT AGT AGT AGC AGC AGT AGT AGT AGT AGC AGC AGC
codon site  67772: AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGT AGC AGC AGT AGC AGC AGC AGC AGC TCT AGC TCT TCT AGT AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC AGT AGT AGC AGT AGT AGT AGT AGT AGT AGC AGT AGT AGC AGC AGC AGT TCT AGC AGC TCT AGC AGC AGC AGC AGC AGC AGC AGC AGT AGC AGC AGT AGC AGT AGT AGT AGC AGC AGC AGC AGC AGC AGC AGT AGC TCT TCT AGC AGC AGC AGC AGC
Above are 'synonymous' sites with 2 types of serine codons: TC? and TCY.

Sequences read..
Counting site patterns..  0:05
       37930 patterns at    68856 /    68856 sites (100.0%),  0:05

Error: some gene labels are missing.

Sandra AC

Dec 16, 2022, 10:51:59 AM
to PAML discussion group
Hi Kim, 

Are you running the latest version of PAML, PAML v4.10.6? If not, please download the latest version from the PAML GitHub repository here. You can also find installation guidelines here.

With regard to your analysis, you may want to check the format of your input sequence file first. You can follow the guidelines on pages 12 and 13 of the PAML documentation here, and see page 35 to read more about the option "Mgene". Make sure that you have correctly formatted the sites to enable option G (you have "Mgene = 1") in your sequence file and that you have accounted for having codon data (you have "seqtype = 1").

Hope this helps!

All the best,
Sandra
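
The option G check suggested above can be sketched in a few lines of Python. This is only an illustration, not part of PAML: the function name `check_gene_lengths` is hypothetical, and it assumes that the whole "G" line (gene count followed by all gene lengths) sits on a single line, that the header gives the alignment length in nucleotides, and that for codon data the gene lengths are counted in codons and should sum to the number of codons in the alignment.

```python
def check_gene_lengths(header_line, g_line, codon_data=True):
    """Compare the gene lengths on the 'G' line with the alignment length.

    header_line: first line of the sequence file, e.g. " 95 237111 G"
    g_line:      second line, e.g. "G 255 401 213 ..."
    Assumption: all gene lengths are on this one line.
    """
    header = header_line.split()
    n_nucleotides = int(header[1])  # header[0] is the number of sequences

    fields = g_line.split()
    if not fields or fields[0] != "G":
        raise ValueError("expected a line starting with 'G'")
    n_genes = int(fields[1])
    lengths = [int(x) for x in fields[2:]]

    # Assumption: for codon data the lengths are in codons, not nucleotides.
    expected_total = n_nucleotides // 3 if codon_data else n_nucleotides

    return {
        "n_genes_declared": n_genes,
        "n_lengths_listed": len(lengths),
        "sum_of_lengths": sum(lengths),
        "expected_total": expected_total,
        "ok": n_genes == len(lengths) and sum(lengths) == expected_total,
    }
```

Calling it with the first two lines of the sequence file returns a small dictionary; if `ok` is false, the declared gene count, the number of lengths listed, or their sum does not match the header.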

kim

Jan 6, 2023, 1:35:27 PM
to PAML discussion group
Hi Sandra,

Thank you for your response. I installed PAML v4.10.6 and re-ran the analysis, and got the following error:

Error: some genes do not have any sites?.

This error helped me see that PAML had no sequence data for one of my genes. This happened because there was a fair amount of missing sequence data for a couple of my samples for that particular gene. Altogether there was no site within that gene for which sequence data was present for 100% of my samples in the alignment. So I removed that gene from my alignment and was able to successfully run the analysis.

Kim
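
For anyone hitting the same error, the situation described above (a gene with no codon site that is present in every sample) can be checked before running codeml. The following is a rough sketch under stated assumptions, not PAML's own logic: the function name `genes_without_complete_sites` is hypothetical, the alignment is assumed to be already loaded into a dict of name to concatenated sequence, gene lengths are assumed to be in codons, genes are numbered from 1, and only "-", "?" and "N" are treated as missing data.

```python
def genes_without_complete_sites(sequences, gene_codon_lengths, missing="-?N"):
    """Return 1-based indices of genes with no fully resolved codon site.

    sequences:          dict mapping sample name -> concatenated sequence
    gene_codon_lengths: list of gene lengths in codons, in alignment order
    missing:            characters treated as missing data (an assumption)
    """
    seqs = [s.upper() for s in sequences.values()]
    empty_genes = []
    start = 0  # nucleotide offset of the current gene

    for gene_index, n_codons in enumerate(gene_codon_lengths, start=1):
        complete = 0
        for c in range(n_codons):
            pos = start + 3 * c
            codons = [s[pos:pos + 3] for s in seqs]
            # A site counts only if every sample has a full, unambiguous codon.
            if all(len(cd) == 3 and not any(ch in missing for ch in cd)
                   for cd in codons):
                complete += 1
        if complete == 0:
            empty_genes.append(gene_index)
        start += 3 * n_codons

    return empty_genes
```

Any gene index returned is a candidate for removal from the alignment, or for dropping the samples with the most missing data in that gene.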