Multiple Substitution Matrices for Transmembrane Proteins

Vanantwerp, James

Jul 23, 2020, 12:15:32 PM

to bali-ph...@googlegroups.com

I would like to use two substitution matrices on two different regions of a protein I am evaluating: one for the transmembrane spans, and one for the solvent-exposed areas of the protein. This paper describes the methodology. I'm not sure how to use two separate matrices in BAli-Phy, and I don't know how to use custom matrices such as the one described in the paper. Any help you could give would be appreciated!

James

Benjamin Redelings

Jul 23, 2020, 11:10:53 PM

to bali-ph...@googlegroups.com

Hi James,

Good question. I think you can do this, but I have a few questions about exactly what you want to do:

(1) Do you think the main difference between transmembrane spans and solvent-exposed regions is just which amino acids are common? In that case, you could use two instances of (for example) the LG matrix with different amino acid frequencies.

bali-phy amino-acids.fasta -S mixture[models=[lg08+f,lg08+f],rates=[1,1]]

I don't know if it's clear from the command line above, but each +f model gets its own set of frequencies.

This would effectively divide the sites into two frequency classes, which I presume would correspond to the transmembrane and solvent-exposed regions, but I am not sure.

(2) Alternatively, you could divide each sequence into a series of partitions a priori, where every other partition is a transmembrane partition. This prohibits residues in one partition from aligning to residues in another partition. Do you want to do that instead?
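To make option (2) concrete, here is a minimal Python sketch of cutting each unaligned sequence into alternating solvent-exposed and transmembrane segments. It is only an illustration under several assumptions: the span coordinates would have to come from somewhere else (for example a topology predictor), every sequence is assumed to have the same number of spans, and the names split_by_spans, build_partitions, and to_fasta are hypothetical helpers, not part of BAli-Phy.

```python
def split_by_spans(seq, spans):
    """Cut seq into alternating segments: exposed, TM, exposed, ..., exposed.

    spans is a sorted list of non-overlapping (start, end) pairs, 0-based
    and half-open, marking the transmembrane spans. The result always has
    2 * len(spans) + 1 segments; exposed segments may be empty strings.
    """
    segments = []
    pos = 0
    for start, end in spans:
        if start < pos or end <= start or end > len(seq):
            raise ValueError(f"bad span {(start, end)}")
        segments.append(seq[pos:start])  # exposed stretch before this span
        segments.append(seq[start:end])  # the transmembrane span itself
        pos = end
    segments.append(seq[pos:])           # trailing exposed stretch
    return segments


def build_partitions(records, spans_by_name):
    """Group the i-th segment of every sequence into partition i.

    records maps sequence name -> unaligned sequence (assumed input shape).
    spans_by_name maps sequence name -> list of TM spans for that sequence.
    Returns a list of dicts; odd-numbered entries (index 1, 3, ...) hold
    the transmembrane segments.
    """
    counts = {len(spans_by_name[name]) for name in records}
    if len(counts) != 1:
        raise ValueError("every sequence needs the same number of TM spans")
    partitions = [{} for _ in range(2 * counts.pop() + 1)]
    for name, seq in records.items():
        for i, segment in enumerate(split_by_spans(seq, spans_by_name[name])):
            partitions[i][name] = segment
    return partitions


def to_fasta(partition):
    """Render one partition (name -> segment) as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in partition.items())
```

Each partition could then be written to its own FASTA file and given to bali-phy as a separate partition. How to assign a different model to each partition, and whether empty segments are acceptable, is best checked against the BAli-Phy manual.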

(3) To load your own exchangeability matrix, you can do empirical[file]+f. You need a file that describes the matrix, though. It should look like this: https://www.ebi.ac.uk/goldman-srv/WAG/wag.dat
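For anyone preparing such a file from a matrix in a paper, the following Python sketch writes the layout used by wag.dat: the lower triangle of a symmetric 20x20 exchangeability matrix (19 rows, row i holding i values), a blank line, then the 20 equilibrium frequencies, with amino acids in the order A R N D C Q E G H I L K M F P S T W Y V. The function name format_paml_dat is hypothetical, and whether BAli-Phy needs any further details beyond matching wag.dat is an assumption to verify.

```python
AA_ORDER = "ARNDCQEGHILKMFPSTWYV"  # amino-acid order used by wag.dat


def format_paml_dat(exch, freqs):
    """Return the text of a wag.dat-style file.

    exch  : 20x20 symmetric exchangeabilities, indexed in AA_ORDER
            (the diagonal is ignored).
    freqs : 20 equilibrium frequencies in AA_ORDER, summing to 1.
    """
    n = len(AA_ORDER)
    if len(exch) != n or any(len(row) != n for row in exch):
        raise ValueError("exch must be a 20x20 matrix")
    if len(freqs) != n:
        raise ValueError("freqs must have 20 entries")
    if abs(sum(freqs) - 1.0) > 1e-6:
        raise ValueError("freqs must sum to 1")

    lines = []
    for i in range(1, n):
        for j in range(i):
            if exch[i][j] != exch[j][i]:
                raise ValueError(f"exch is not symmetric at ({i}, {j})")
        # Row i of the lower triangle: entries for columns 0 .. i-1.
        lines.append(" ".join(f"{exch[i][j]:.6f}" for j in range(i)))
    lines.append("")  # blank line between the triangle and the frequencies
    lines.append(" ".join(f"{f:.6f}" for f in freqs))
    return "\n".join(lines) + "\n"
```

A matrix for the transmembrane spans and one for the solvent-exposed regions would each be written to a separate file this way, using the values from the paper rather than any placeholder numbers.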

Does that make sense?

-BenRI

--
You received this message because you are subscribed to the Google Groups "bali-phy-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bali-phy-user...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bali-phy-users/CH2PR12MB39254BC8067CCF1B0AFB77F5CE760%40CH2PR12MB3925.namprd12.prod.outlook.com.

James Van Antwerp

Jul 26, 2020, 11:00:49 PM

to bali-phy-users

That looks very helpful! I am new to using BAli-Phy, and will need to talk with my advisor about how to implement what you describe. It's enough that I think he can help me figure it out, though. Thank you!
