"Failed to initialize any reasonable values" on small dataset

Alexandre Fernandes

Sep 5, 2023, 5:13:59 PM
to hahnlab-cafe
Hello!

I'm having trouble getting CAFE5 to work on my machine and keep getting the same error message:
 
```
Families with largest size differentials:
TRIM5: 7
TRIM34: 2
TRIM22: 2
TRIM6: 0

You may want to try removing the top few families with the largest difference
between the max and min counts and then re-run the analysis.

Failed to initialize any reasonable values
```

My dataset is rather small (only 53 species and 4 genes), so I find it odd that the analysis fails to converge on a lambda value.

I've already tried the following to solve this problem:
  • Remove TRIM6 from the analysis because all species have the same number of copies (file: gene_counts_no_trim6.txt)
  • Run the analysis on a smaller tree with convergence time = 78.8 mya (file: laurasiatheria.nwk and laurasiatheria_genes.txt)
  • Consider all genes as being part of the same family (file: alltrims_count.txt)
But nothing has worked and I get the same error.
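
For anyone who wants to reproduce the "largest size differentials" ranking outside of CAFE, here is a minimal Python sketch. It assumes the counts file is tab-delimited with a header row, a description column, a family ID column, and then one column per species. The function name `size_differentials` and the example counts are made up for illustration and are not taken from the attached files.

```python
import csv
import io


def size_differentials(table_text):
    """Return (family_id, max - min) pairs, largest differential first.

    Assumes a tab-delimited table: description, family ID, then one
    integer count column per species (header row is skipped).
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    next(reader)  # skip the header row
    results = []
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        family_id = row[1]
        counts = [int(value) for value in row[2:]]
        results.append((family_id, max(counts) - min(counts)))
    # Sort so the families with the widest spread come first
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical counts, for illustration only
example = (
    "Desc\tFamily ID\tspA\tspB\tspC\n"
    "(null)\tFAM1\t1\t8\t2\n"
    "(null)\tFAM2\t1\t1\t1\n"
)

for family, diff in size_differentials(example):
    print(f"{family}: {diff}")
```

A family with a differential of 0, like the second row here, has identical counts in every species and so carries no signal about gain or loss.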

The command I've been running is:
cafe5 -i gene_counts.txt -t phylogeny.nwk

Any help would be much appreciated!

Using CAFE was a suggestion from a reviewer and I haven't found any alternative programs that do something similar so I kind of have to get this working 😅.


Cheers!
Alex


gene_counts_no_trim6.txt
laurasiatheria_genes.txt
gene_counts.txt
alltrims_count.txt
phylogeny.nwk
laurasiatheria.nwk

Hahn, Matthew

Sep 5, 2023, 5:45:30 PM
to Alexandre Fernandes, hahnlab-cafe
Hi Alex,

Thanks for trying CAFE. Unfortunately, I think the problem is that your data are quite sparse in terms of number of gene families and quite large in terms of numbers of species and depth of the tree. There’s just not a lot of information for CAFE to estimate lambda from. (Plus, one family may not actually exist in the root.)

If you are after ancestral states, you could always plug in a lambda estimated from larger mammalian datasets. Otherwise, I don’t think there’s an easy way forward.



cheers,
Matt


Alexandre Fernandes

Sep 5, 2023, 6:32:58 PM
to hahnlab-cafe
Hey Matt,

Thanks so much for answering!

Unfortunately, even with a fixed lambda, the analysis still fails with the same error:

> cafe5 -i gene_counts.txt -t phylogeny.nwk -l 0.0024443005606287

------------------------------------------------
Filtering families not present at the root from: 4 to 4

No root family size distribution specified, using uniform distribution
Inferring processes for Base model
Score (-lnL):             inf


Families with largest size differentials:
TRIM5: 7
TRIM34: 2
TRIM22: 2
TRIM6: 0

You may want to try removing the top few families with the largest difference
between the max and min counts and then re-run the analysis.

Failed to initialize any reasonable values
--------------------------------------------------

The lambda I used comes from CAFE5's tutorial, but I couldn't find any values that work...

Any ideas as to why that could be? We expect the root of the tree to have at least one ortholog of each gene, since both child clades of the root contain a species that has all genes (fig. attached for reference).


Cheers,
Alex
Figure 2.pdf

Hahn, Matthew

Sep 5, 2023, 6:51:43 PM
to Alexandre Fernandes, hahnlab-cafe
I think the tree is too long. Sorry!
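
For readers who want to gauge how "long" their own tree is, the sketch below computes the largest root-to-tip distance from a Newick string in plain Python. The function name `max_root_to_tip` and the toy tree are hypothetical, and the parser is deliberately minimal: it assumes unquoted labels and ignores them. Multiplying the depth by lambda is only a rough sense of scale, not something CAFE reports.

```python
import re


def max_root_to_tip(newick):
    """Return the largest summed branch length from the root to any tip.

    Minimal parser: assumes unquoted labels; node labels are ignored.
    A branch length on the root itself, if present, is included.
    """
    tokens = re.findall(r"[(),;]|:[^(),;:]+|[^(),;:]+", newick.strip())
    stack = []    # one list of child heights per open clade
    height = 0.0  # height of the subtree just read, plus its branch
    for token in tokens:
        if token == "(":
            stack.append([])
            height = 0.0
        elif token == ",":
            stack[-1].append(height)
            height = 0.0
        elif token == ")":
            stack[-1].append(height)
            height = max(stack.pop())  # tallest child defines the clade
        elif token.startswith(":"):
            height += float(token[1:])
        elif token == ";":
            break
        # any other token is a node label and is skipped
    return height


# Toy tree for illustration only; lengths are in arbitrary time units
toy_tree = "((A:1,B:1):2,C:3);"
depth = max_root_to_tip(toy_tree)

# Lambda value quoted earlier in this thread
lam = 0.0024443005606287
print(f"root-to-tip depth: {depth}")
print(f"lambda x depth:    {lam * depth:.4f}")
```

Running this on the real `phylogeny.nwk` would show the depth in whatever units the branch lengths use, which is worth checking against the units the lambda value was estimated in.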
