giving up... species occurs twice in locus 1


Zach Cohen

May 29, 2025, 2:28:51 PM
to PAML discussion group

Hello, I'm attempting to run mcmctree on a partitioned file of nucleotides generated by fasta-phylip-partitions, but I keep getting a cryptic error. The control file works on the example mtDNApril123.txt file, but not on my custom input (see attached). 

TEST_all_loci.phy
mcmctree_error.log.rtf
mcmc_17_seqs_hessian.ctl

Sandra AC

May 29, 2025, 3:19:40 PM
to PAML discussion group
Hi Zach,

Thanks for your message! I am not sure what your original FASTA files looked like, but there are some issues with the sequences that you have aligned in your sequence alignment file:
  • Many of your sequences have exclamation marks (!), which are not valid characters. Did you mean to use question marks or another character instead?
  • Once I fixed this issue, all the alignment blocks were correctly parsed by MCMCtree, but then I encountered the following error: `error: Only bounds for the root age are implemented.` You need to incorporate the root age constraint in the tree file as suggested in the PAML Wiki (see description for variable `RootAge`).
  • Once I added the root age constraint in the Newick tree file and removed the variable from the control file, everything ran without problems.
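As a rough illustration of the root age step above: MCMCtree reads node calibrations as quoted labels in the Newick tree, and a soft-bound calibration is written `B(lower,upper)`. The sketch below appends such a label to the root. The helper name `add_root_calibration`, the four taxa, and the bounds are all hypothetical; please check the PAML Wiki for the exact tree file layout and the time unit that fits your analysis.

```python
def add_root_calibration(newick: str, lower: float, upper: float) -> str:
    """Append a soft-bound label 'B(lower,upper)' to the root of a Newick tree."""
    tree = newick.strip()
    if not tree.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    body = tree[:-1].rstrip()
    # The root is the outermost closing parenthesis; refuse to overwrite a label.
    if not body.endswith(")"):
        raise ValueError("root already carries a label, or the tree is malformed")
    return f"{body}'B({lower},{upper})';"


# Hypothetical four-taxon tree and bounds, for illustration only.
labelled = add_root_calibration("((sp1,sp2),(sp3,sp4));", 0.8, 1.2)
print(labelled)
```

With the constraint carried by the tree, the `RootAge` line would then be dropped from the control file, as described in the last bullet.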
I suggest that you review the process you followed to generate your alignment: any errors present in your inferred alignment will pass on to timetree inference, and thus the interpretations you make from the estimated evolutionary timeline will be incorrect. I have also seen that you had various errors in your control file (e.g., repeated variables that would keep overwriting themselves, outdated variables such as `finetune` that are no longer required, etc.); you may want to get familiar with the PAML Wiki to understand the format of your input files as well as the control files.
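A quick way to spot repeated variables in a control file is sketched below. It assumes the usual PAML control file layout of `name = value` lines with `*` starting a comment; the function name `find_repeated_variables` and the example values are hypothetical.

```python
from collections import Counter


def find_repeated_variables(ctl_text: str) -> dict[str, int]:
    """Return the variables that are assigned more than once, with their counts."""
    counts = Counter()
    for line in ctl_text.splitlines():
        line = line.split("*", 1)[0]  # drop everything after a '*' comment marker
        if "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name:
            counts[name] += 1
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical control file fragment, for illustration only.
example = """
seqfile = TEST_all_loci.phy
ndata = 3      * hypothetical value
ndata = 5      * second assignment of the same variable
usedata = 2
"""
print(find_repeated_variables(example))
```

Any variable reported here is worth reducing to a single line so that it is clear which value MCMCtree actually uses.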

I suggest that you follow this short tutorial that shows how to run MCMCtree using the approximate likelihood calculation. Once you get familiar with the format that your input files should follow, the variables you should include in the control files, and the workflow of a timetree inference, you can then try again with your dataset -- note that you will be running BASEML instead of CODEML as you have nucleotide data, but the workflow is the same :)

Hope this helps!

All the best,
Sandy

Zach Cohen

May 29, 2025, 3:39:51 PM
to PAML discussion group
Hello Sandy,

Fantastic, replacing the ! with ? did work. The alignment was generated using macse v2, which I think is the source for those characters. Thanks also for the suggestions, I'll revisit the documentation and rerun accordingly! 
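For reference, the replacement could be scripted along these lines. This is only a sketch: it assumes that each alignment block starts with a header line of two integers, that no taxon name contains `!`, and that `?` is an appropriate substitute for whatever the `!` represents in the alignment. The function name `replace_invalid_marks` and the toy alignment are hypothetical.

```python
import re

# A block header in a PHYLIP file: number of sequences, then number of sites.
HEADER = re.compile(r"^\s*\d+\s+\d+\s*$")


def replace_invalid_marks(phylip_text: str, bad: str = "!", good: str = "?"):
    """Replace `bad` with `good` on every non-header line; return (text, count)."""
    fixed = []
    replaced = 0
    for line in phylip_text.splitlines():
        if not HEADER.match(line):
            replaced += line.count(bad)
            line = line.replace(bad, good)
        fixed.append(line)
    return "\n".join(fixed) + "\n", replaced


# Toy two-sequence block, for illustration only.
toy = "2 8\nsp1  ACG!ACGT\nsp2  AC!!ACGT\n"
fixed, n = replace_invalid_marks(toy)
print(n)
print(fixed, end="")
```

Reporting the count makes it easy to see how many sites were affected before rerunning MCMCtree.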

Best,
Zach
