Inquiry Regarding Amino Acid Sequence Support in BEAST


Shoaib Khan

Jun 19, 2025, 2:59:39 PM
to beast-users

Dear BEAST Development Team,

I hope this message finds you well.

I am currently using BEAST for phylogenetic analysis and would like to clarify whether BEAST supports amino acid sequences as input data. Most of the examples and tutorials I have encountered focus on nucleotide sequences, so I wanted to confirm:

Does BEAST support phylogenetic analysis using amino acid sequence alignments, or is it limited to nucleotide sequences only?

I would greatly appreciate your guidance on this matter, especially regarding the appropriate data formats and models if amino acid sequences are supported.

Thank you for your time and support.

Best regards,
Shoaib Khan

Jordan Douglas

Jun 19, 2025, 4:02:42 PM
to beast-users

Dear Shoaib,

Yes, BEAST 2 and BEAST X both support amino acids, using substitution models such as BLOSUM62 and WAG. The OBAMA package for BEAST 2 is the most advanced option. It performs model averaging, meaning that it finds the best substitution models and also decides whether to use site heterogeneity, invariant sites, and frequency estimation, all during MCMC. This is the Bayesian analog of the model selection step performed before building a maximum likelihood tree.

In terms of the clock model, I would suggest starting with the punctuated relaxed clock, or just a standard relaxed clock. Proteins go through some pretty non-clock-like changes over long evolutionary timescales, so a strict clock is usually not appropriate.


Hope this helps,
Jordan

Shoaib Khan

Jun 26, 2025, 2:45:37 AM
to beast...@googlegroups.com

Dear Jordan,

Thank you once again for your earlier guidance.

I’ve installed BEAST 2 along with the OBAMA package, and I’m trying to load my amino acid alignment file into BEAUti. However, I’m encountering an error that says:

"Unsupported sequence file."

I've ensured that my sequence data is in amino acid format, but it seems BEAUti is not recognizing it correctly. I'm currently using a file exported in NEXUS format from MEGA, which might be part of the issue. I suspect the file may be missing some required headers or formatting expected by BEAST 2 (e.g., datatype=protein in the NEXUS block).
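
For reference, here is a minimal sketch of the kind of NEXUS data block I have in mind, generated with Python. The function name `format_protein_nexus` and the taxon names and sequences are hypothetical placeholders, and I am assuming (not confirming) that BEAUti expects a standard `begin data;` block with `datatype=protein` like this one:

```python
def format_protein_nexus(alignment):
    """Return a NEXUS DATA block for an aligned protein dataset.

    `alignment` maps taxon name -> aligned amino acid sequence.
    Taxon names are assumed to contain no spaces or punctuation
    that would require quoting in NEXUS.
    """
    if not alignment:
        raise ValueError("alignment is empty")

    # Every row of an alignment must have the same number of columns.
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must all have the same aligned length")
    nchar = lengths.pop()

    # Pad taxon names so the sequences line up in the matrix.
    width = max(len(name) for name in alignment)

    lines = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={len(alignment)} nchar={nchar};",
        "  format datatype=protein missing=? gap=-;",
        "  matrix",
    ]
    for name, seq in alignment.items():
        lines.append(f"    {name.ljust(width)}  {seq.upper()}")
    lines += ["  ;", "end;"]
    return "\n".join(lines) + "\n"


# Hypothetical toy alignment, for illustration only.
example = {
    "taxon_A": "MKVLAAGIVG",
    "taxon_B": "MKVLSAGIVG",
    "taxon_C": "MKILAAG-VG",
}
nexus_text = format_protein_nexus(example)
print(nexus_text)
```

Comparing the header of the MEGA export against the `dimensions` and `format` lines produced here may show which declaration is missing or different.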

I’d really appreciate it if you could confirm the best practices or requirements for properly formatting amino acid sequence files for BEAUti (especially with the OBAMA model in mind). Also, if there are any known issues when importing protein alignments from MEGA, that would be helpful to know.

Looking forward to your advice.

Best regards,
Shoaib


Screenshot 2025-06-26 113917.png