Doubts about the use of unrooted trees when comparing sister species in codeML

Elisa Josefa González Ugalde

Feb 10, 2025, 8:16:41 AM
to PAML discussion group

Dear PAML community,

First of all, I want to thank everyone for taking the time to respond to messages on this forum. Thanks to this community, I have been able to resolve many questions about using PAML, as others have often had the same concerns I have.

Today, I would like to ask about the use of rooted versus unrooted trees when identifying signals of selection using codeml. I understand this topic has been discussed extensively, but I haven’t found an answer to my specific question.

I am analyzing positive selection signals separately in two sister species that diverged approximately 3 million years ago, as I hypothesize that they have been subjected to different selective pressures. I have also analyzed the "ancestral branch" of these two species to identify genes under selection inherited from their common ancestor. For the ancestral branch analysis, I used a rooted tree, assuming that the entire branch was evolving differently from the rest of the tree.

When analyzing the two sister species separately, I used an unrooted tree, as they belong to the same branch in my topology, which then diverges into the two species. Based on my understanding of the manual and documentation, an unrooted tree should be used in this scenario.

Is this approach correct? Specifically:

  1. Is it correct to use an unrooted tree when analyzing each of the sister species separately?

For further clarification, in relation to Figure S1C from Álvarez-Carretero et al. (2023) Supplementary Material, if I wanted to identify signals of selection only in species B (branch b2), should I use an unrooted tree?

Thank you in advance for your help!

Sandra AC

Feb 12, 2025, 8:56:25 AM
to PAML discussion group
Dear Elisa Josefa,

Thanks for your message and your kind words! This is what this PAML discussion group is all about, glad that you find it useful :)

When you say that you have analysed positive selection signals in both taxa and the ancestral branch, I assume that you may have tested the following hypotheses:

```
    mrca_A-B  |------- A
|-------------|
|             |------- B
-
|
|----------------- O

((A,B #1) #1,O);

    mrca_A-B  |------- A
|-------------|
|             |------- B
-
|
|----------------- O

((A #1,B) #1,O);
```

In the two cases above, you are assuming that the branch that goes from "O" to the root (coloured in blue; please note that I have placed a "virtual" root in the middle for visual purposes, but remember that the root could be anywhere from "O" to the MRCA of "A" and "B") has a different evolutionary process than the one over the branch that goes from the MRCA of "A" and "B" to the root. This hypothesis would somewhat resemble what Fig S1D in the Supp Mat published as part of Álvarez-Carretero et al., 2023 is trying to illustrate.

When you say that you want to analyse the branches leading to your two species separately, I believe these are the hypotheses that you have in mind:

```
    mrca_A-B  |------- A
|-------------|
|             |------- B
-
|
|----------------- O

(A #1,B,O);

    mrca_A-B  |------- A
|-------------|
|             |------- B
-
|
|----------------- O

(A,B #1,O);
```

If that's the case, then I believe you made the correct assumption :) You can use an unrooted tree because you are not constraining the evolutionary pressure on the "root branch" (see above in blue); you are only constraining the lineage from "A" (or "B") to the MRCA of "A" and "B" to be the foreground branch.
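As a quick way to check which form a tree file is in, the number of groups at the outermost level of the Newick string tells the two apart: two children means a rooted (bifurcating) tree, three means the unrooted (trifurcating) form. The snippet below is only a minimal sketch in plain Python. The helper names `top_level_children` and `root_shape` are hypothetical and not part of PAML, and the code assumes a single tree per string with no parentheses inside taxon names.

```python
def top_level_children(newick: str) -> list[str]:
    """Return the direct children of the outermost group of a Newick string."""
    s = newick.strip().rstrip(";").strip()
    if not s.startswith("("):
        raise ValueError("expected a parenthesised Newick tree")
    if s.count("(") != s.count(")"):
        raise ValueError("unbalanced parentheses")

    depth = 0
    children, current = [], []
    for ch in s:
        if ch == "(":
            depth += 1
            if depth == 1:
                continue  # skip the outermost opening parenthesis
        elif ch == ")":
            depth -= 1
            if depth == 0:
                children.append("".join(current).strip())
                break  # anything after this would be a label on the root
        elif ch == "," and depth == 1:
            # a comma at depth 1 separates two children of the outermost group
            children.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    return children


def root_shape(newick: str) -> str:
    """Classify a tree by the number of children at its outermost level."""
    n = len(top_level_children(newick))
    if n == 2:
        return "rooted"
    if n >= 3:
        return "unrooted"
    raise ValueError("outermost group has fewer than two children")


if __name__ == "__main__":
    for tree in ["((A,B #1) #1,O);", "(A #1,B,O);"]:
        print(tree, "->", root_shape(tree))
```

The balance check also catches a stray opening parenthesis, which is an easy mistake to make when converting a rooted string to the unrooted form by hand.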

I hope that the illustrations above and the relevant explanations make sense, but please let us know if something remains unclear!

All the best,
Sandy

P.S.: The lineages highlighted in bold in my illustrations above (those carrying the `#1` label in the Newick strings) would be those labelled as "foreground" branches by CODEML, while the rest of the branches would be treated as "background" branches.
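To double-check which branches a tree file actually marks as foreground before running CODEML, the `#1` labels can be listed programmatically. The following is a rough sketch rather than a full Newick parser. The function name `foreground_branches` and the `clade(A+B)` output format are made up for illustration, and the code assumes a well-formed tree without branch lengths, with each label separated from the taxon name by a space.

```python
import re


def foreground_branches(newick: str, mark: str = "#1") -> list[str]:
    """List branches carrying `mark`: tips by name, internal branches as clade(...)."""
    s = newick.strip().rstrip(";")
    # Split into structural characters and the text between them.
    tokens = re.findall(r"[(),]|[^(),]+", s)

    stack = []         # one list of tip names per currently open clade
    found = []
    last_clade = None  # tips of the clade that was closed by the previous token
    for tok in tokens:
        if tok == "(":
            stack.append([])
            last_clade = None
        elif tok == ")":
            last_clade = stack.pop()
            if stack:
                stack[-1].extend(last_clade)  # tips also belong to the parent clade
        elif tok == ",":
            last_clade = None
        else:
            parts = tok.split()
            if last_clade is not None:
                # Text right after ")" labels the branch leading to that clade.
                if mark in parts:
                    found.append("clade(" + "+".join(last_clade) + ")")
                last_clade = None
            elif parts:
                name = parts[0]
                if stack:
                    stack[-1].append(name)
                if mark in parts[1:]:
                    found.append(name)
    return found


if __name__ == "__main__":
    print(foreground_branches("((A,B #1) #1,O);"))
    print(foreground_branches("(A #1,B,O);"))
```

Branches that do not appear in the returned list are the ones CODEML would treat as background.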
