sweet spot for species or evolutionary time


P M

Feb 21, 2025, 5:21:45 PM

to hahnlab-cafe

I am new to CAFE and trying to learn how best I can use it.

I ran CAFE with different values of k on three phylogenies built from the same pool of species. My species-of-interest has roughly half the average number of genes found in the other species in the phylogeny.

Results
Run_1: Started with a 20-species phylogeny. All ortholog families had failures at every value of k tested (1 to 5). My species-of-interest showed contractions in at least 2,000 gene families.

Run_2: Switched to a 9-species phylogeny. No failures at k = 1 to 6, but my species-of-interest showed contractions in only about 200 gene families.

Run_3: Redid the phylogeny using 14 species. No failures at k = 1 to 4; all families failed at k = 5 and 6. My species-of-interest again shows contractions in at least 2,000 gene families.

In all three runs, lambda is highest at k=2. For Run_2 and Run_3, the likelihood decreased as k increased. For Run_1, the likelihood first decreased and then increased.

Please see attached files.

Queries
1. The family expansion/contraction output for different k values within a given 'Run' does not vary by much. If I had started with Run_2 and left it at that, the CAFE results would have had little utility in terms of gene family contractions. Is there a sweet spot for the number of species or the evolutionary time scale to use?

2. The family expansion/contraction output changes with the phylogeny. For Run_1 and Run_3, I get similar results in terms of gene family contractions, except I get failures for Run_1. What do these failures really mean if the contraction numbers are the same?

3. I am using the divergence time estimates between the outgroup and its closest branch for generating the ultrametric tree. Would the choice of outgroup change the outcome?

4. Even though the family expansion/contraction output for different k values within a given 'Run' does not vary by much, for Run_3, should I use k=2 (maximum lambda) or k=4 (minimum likelihood, still without any failures)?

Hope this makes sense.

Thanks.

Run_1.txt

Run_2.txt

Run_3_k02.png

Run_3.txt

Run_2_k02.png
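
On query 3: CAFE expects an ultrametric tree, so whichever outgroup calibration is used, it can help to confirm that every tip sits at the same distance from the root before running. The following is a minimal sketch, assuming a plain Newick string with a length on every branch and no quoted labels or comments. `tip_depths` and `is_ultrametric` are illustrative helpers, not part of CAFE.

```python
import math
import re
from typing import Dict, Tuple

# Split a Newick string into punctuation and label/number tokens.
_TOKEN = re.compile(r"\s*([(),;:]|[^(),;:\s]+)")
_PUNCT = set("(),;:")


def tip_depths(newick: str) -> Dict[str, float]:
    """Return the root-to-tip distance for every leaf in a Newick tree."""
    tokens = _TOKEN.findall(newick)
    pos = 0

    def parse_node() -> Tuple[Dict[str, float], float]:
        # Returns (leaf -> distance below this node, this node's branch length).
        nonlocal pos
        depths: Dict[str, float] = {}
        if tokens[pos] == "(":
            pos += 1  # consume "("
            while True:
                child_depths, child_len = parse_node()
                for name, dist in child_depths.items():
                    depths[name] = dist + child_len
                if tokens[pos] == ",":
                    pos += 1  # another child follows
                    continue
                pos += 1  # consume ")"
                break
            # Skip an optional internal node label.
            if pos < len(tokens) and tokens[pos] not in _PUNCT:
                pos += 1
        else:
            depths[tokens[pos]] = 0.0  # leaf name
            pos += 1
        length = 0.0
        if pos < len(tokens) and tokens[pos] == ":":
            length = float(tokens[pos + 1])
            pos += 2
        return depths, length

    depths, _ = parse_node()
    return depths


def is_ultrametric(newick: str, rel_tol: float = 1e-6) -> bool:
    """True if all tips are equally far from the root, within rel_tol."""
    depths = tip_depths(newick)
    deepest = max(depths.values())
    return all(math.isclose(d, deepest, rel_tol=rel_tol) for d in depths.values())


if __name__ == "__main__":
    # Hypothetical three-species tree, used only to show the call.
    example = "((A:1,B:1):2,C:3);"
    print(tip_depths(example))
    print(is_ultrametric(example))
```

If the check fails, printing `tip_depths` shows which tips deviate, which can point to a calibration or rounding problem in the tree file.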

Hahn, Matthew

Feb 22, 2025, 9:23:40 AM

to P M, hahnlab-cafe

Hi,

It’s a bit hard to follow everything here, but in general there is no optimal way to find k. I would stick to getting sensible results without using the gamma model, and go from there.

Matt
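
As a rough illustration of the suggestion above, the sketch below builds command lines for a base-model run (no gamma rate categories) and, separately, for runs with k categories, so the base model can be checked first. It assumes CAFE5 with an executable named `cafe5` and its `-i` (family counts), `-t` (tree), `-o` (output prefix), and `-k` (number of gamma categories) options. The file names and the `cafe5_command` helper are hypothetical and are not taken from this thread.

```python
from typing import List, Optional


def cafe5_command(
    families: str, tree: str, out_prefix: str, k: Optional[int] = None
) -> List[str]:
    """Build a CAFE5 argument list; omit -k to run the base (non-gamma) model."""
    cmd = ["cafe5", "-i", families, "-t", tree, "-o", out_prefix]
    if k is not None:
        if k < 1:
            raise ValueError("k must be a positive integer")
        cmd += ["-k", str(k)]  # assumed flag for the number of gamma categories
    return cmd


if __name__ == "__main__":
    # Hypothetical input files; replace with the real count table and tree.
    families, tree = "gene_families.txt", "species_tree.nwk"

    # Base model first, as suggested in the reply.
    print(" ".join(cafe5_command(families, tree, "results_base")))

    # Gamma runs only afterwards, one output directory per k.
    for k in (2, 3, 4):
        print(" ".join(cafe5_command(families, tree, f"results_k{k}", k=k)))
```

The commands are only printed here; they could be passed to `subprocess.run` once the base-model output looks sensible.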

