--
You received this message because you are subscribed to the Google Groups "hahnlab-cafe" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hahnlabcafe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/54cb6eb1-6b2b-4da6-bb12-12cc072aec90o%40googlegroups.com.
<Screenshot 2020-05-29 at 19.09.21.png>
Hello Paschalis,you need to remove the most variable families from your data set as for these there is an infinitely small probability of the observed change under the lambda you are using.Typically a reasonable filter is to open the file in excel and add a column off to the right that is the difference between the min and max for each family. Sort the entire sheet such that the largest differences are at the top. Remove the top ~10 most variable families and save this as a new file (remember to check that the line breaks are unix and that you have deleted the new column you added). Try running CAFE with the reduced dataset (just see if it will initialize before running a full error model search). If it works great, if not try removing more families.
dv
On Jun 8, 2020, at 5:10 AM, Paschalis Natsidis <natsidis...@gmail.com> wrote:
Hello,I am using CAFE to try to infer gene family evolution among metazoa.However, when I use the caferror.py module, I get a warning that it fails to converge or initialise.Please check the attached file.Also, I am having a hard time to understand what 'inf' score means, and how can the different lambdas be interpreted.Many thanksPaschalis--
You received this message because you are subscribed to the Google Groups "hahnlab-cafe" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hahnl...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/54cb6eb1-6b2b-4da6-bb12-12cc072aec90o%40googlegroups.com.
<Screenshot 2020-05-29 at 19.09.21.png>
__________________________
Dan Vanderpool
Postdoctoral Scholar
Department of Biology
Indiana University
Hahn Lab, Jordan Hall 249B
1001 East Third Street
Bloomington, IN 47405
To unsubscribe from this group and stop receiving emails from it, send an email to hahnlabcafe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/bafa299c-bf94-4b78-9e33-75e664e13574o%40googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/CAAuutjeFdjQgdROv-4YAa-RCr2O3UYdJNiix%3D64No9_A6kV%2BfA%40mail.gmail.com.
Hello Philip,Also, try putting your tree command first#!/home/philipp/CAFE/release/cafe#version#datetree (((((((((((((((Nematoplanacoelogynoporoides:1.0,Microstomumlineare:1.0):1.0,Stenostomumsthenum:2.0):1.0,Symbionpandora:3.0):3.0,((Malacobdellagrossa:1.0,Cerebratulussp:1.0):1.0,Cephalothrixlinearis:2.0):4.0):3.0,((Helobdellasp:1.0,Capitellateleta:1.0):1.0,Alvinellapompejana:2.0):7.0):5.0,(((Lymnaeastagnalis:1.0,Crassostreagigas:1.0):1.0,Leptochitonrugatus:2.0):2.0,(Phoronisvancouverensis:1.0,Lingulaanatina:1.0):3.0):10.0):4.0,((Spadellasp:1.0,Eukrohniasp:1.0):2.0,(Brachionusplicatilis:1.0,Adinetaricciae:1.0):2.0):15.0):10.0,((((((Romanomermisculicivorax:1.0,Plectussambesii:1.0):1.0,Enoplusbrevis:2.0):2.0,(Ramazzottiusvarieornatus:1.0,Hypsibiusdujardini:1.0):3.0):1.0,Paragordiusvarius:5.0):3.0,((Triboliumcastaneum:1.0,Daphniapulex:1.0):1.0,Strigamiamaritima:2.0):6.0):1.0,Priapuluscaudatus:9.0):19.0):24.0,((((((((Symsagittiferaroscoffensis:1.0,Praesagittiferanaikaiensis:1.0):2.0,(Pseudaphanostomavariabilis:1.0,Isodiametrapulchra:1.0):2.0):1.0,Hofsteniasp:4.0):1.0,Paratomellarubra:5.0):2.0,(Nemertodermawestbladi:1.0,Mearastichopi:1.0):6.0):2.0,(Xenoturbellaprofunda:1.0,Xenoturbellabocki:1.0):8.0):8.0,((((Strongylocentrotuspurpuratus:1.0,Parastichopussp:1.0):2.0,(Patiriaminiata:1.0,Amphiurafiliformis:1.0):2.0):1.0,Florometraserratisima:4.0):3.0,((Saccoglossuskowalevskii:1.0,Ptychoderaflava:1.0):1.0,Rhabdopleurasp:2.0):5.0):10.0):6.0,((((Homosapiens:1.0,Callorhinchusmilii:1.0):1.0,Petromyzonmarinus:2.0):2.0,(Salpathompsoni:1.0,Cionaintestinalis:1.0):3.0):1.0,Branchiostomafloridae:5.0):18.0):29.0):4.0,(((Liriopetetraphylla:1.0,Hydravulgaris:1.0):1.0,Alatinaalata:2.0):1.0,Nematostellavectensis:3.0):53.0):1.0,Trichoplaxadhaerens:57.0):3.0,((Lampeapancerina:1.0,Coeloplanameteoris:1.0):1.0,Mnemiopsisleidyi:2.0):58.0):3.0,((Syconcoactum:1.0,Plakinajani:1.0):1.0,Amphimedonqueenslandica:2.0):61.0):1.0,Capsasporaowczarzaki:64.0):3.0,((Salpingoecarosetta:1.0,Monosigabrevicollis:1.0):1.0,Acanthoecasp:2.0):65.0);load -i 68matrix_68_50_reduced.tsv -t 5 -l cafelog.txt -p 0.05lambda -sreport cafereportYou also don’t need a lambda tree if you are only using one rate.
On Jul 9, 2020, at 12:47 PM, Gregg Thomas <greg...@gmail.com> wrote:
Hi all,You could also try adding the -filter option to the load command. This makes sure your families are all present at the most recent common ancestor.Also, could you try running your input script directly with CAFE?source script.cafe or cafe script.cafeThe resulting CAFE log file might report more information. Let us know what that says and we can try and figure this out!-Gregg
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/bafa299c-bf94-4b78-9e33-75e664e13574o%40googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "hahnlab-cafe" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hahnl...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/CAAuutjeFdjQgdROv-4YAa-RCr2O3UYdJNiix%3D64No9_A6kV%2BfA%40mail.gmail.com.
__________________________
Dan Vanderpool
Postdoctoral Scholar
Department of Biology
Indiana University
Hahn Lab, Jordan Hall 249B
1001 East Third Street
Bloomington, IN 47405
To unsubscribe from this group and stop receiving emails from it, send an email to hahnlabcafe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/a0339651-6e3c-4d37-8d9d-1d2c05da1ea2o%40googlegroups.com.
Heh, wonder why I did that. Thanks for pointing it out!-Gregg
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/a0339651-6e3c-4d37-8d9d-1d2c05da1ea2o%40googlegroups.com.
To unsubscribe from this group and stop receiving emails from it, send an email to hahnlabcafe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/c3a69424-76aa-45a7-ac5f-8bc8fefd6a4ao%40googlegroups.com.
Hey again,Sorry for the confusion. We're trying to figure out why the CAFE runs with caferror are coming back with scores of 'inf'. A score of inf means that CAFE was unable to accurately converge on a rate estimate. This usually happens because of an issue with the input data having gene families that contain large amounts of variance, which is why Dan first suggested to remove families with a large range. I suggested running your script through CAFE alone to see if there was any more information in the CAFE log file that could help us determine why CAFE isn't converging. Looking at this output I can see that CAFE is still returning a score of inf, so now we at least know the problem isn't with caferror.If it's possible, do you think you could send your input data matrix so we can try and replicate this issue?Thanks!-Gregg
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/c3a69424-76aa-45a7-ac5f-8bc8fefd6a4ao%40googlegroups.com.
To unsubscribe from this group and stop receiving emails from it, send an email to hahnlabcafe...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/2aa07bb5-0dbe-44f3-b4b4-3e3519ea4a88o%40googlegroups.com.
<matrices.tgz>
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/5F8DAE4D-8B3A-42B7-8229-787B43FA2038%40gmail.com.
Yeah, I guess maybe if these units are in hundreds of millions of years. I didn’t see Human in there until Gregg pointed it out. In this case any estimation of gene family turnover will likely be nonsense. Too much time has passed to reliably count changes and discern family membership/orthologs (whole genome duplications likely exist in many/all(?) of these groups over this time span as well). I would definitely do as Gregg did and break the tree up into smaller, more closely related clades.
Dan
On Jul 13, 2020, at 12:22 PM, Gregg Thomas <greg...@gmail.com> wrote:
I think Dan is correct that the disparity between branch lengths in your tree is causing some problems (though it's not something I've run into before). I was unable to filter your data in such a way to get it to converge with the full tree, however limiting to a subset of species (Romanomermisculicivorax, Plectussambesii, Enoplusbrevis, Ramazzottiusvarieornatus, Hypsibiusdujardini, Paragordiusvarius, Triboliumcastaneum, Daphniapulex, Strigamiamaritima, Priapuluscaudatus) CAFE was able to finish (though still with some families failing to converge).I'm wondering how the branch lengths on your tree were estimated? CAFE works best with branches in terms of millions of years. However, for some of these species (e.g. Daphnia and Human) I know the divergence time on the order of hundreds of millions of years. This sampling may be at too deep a timescale for CAFE to accurately estimate gain/loss rates.-Gregg
On Mon, Jul 13, 2020 at 9:47 AM Dan Vanderpool <ddvand...@gmail.com> wrote:
Hello Philipp,Looking at your tree, I think you will have a very hard time estimating a meaningful lambda given all of the tips within a clade are closely related with clades separated by long branches between clades. You may want to try analyzing some of the clades separately, or using multiple lambdas. Also, CAFE5 incorporates among family rate variation which may help as well but convergence is harder to achieve while using both multiple lambdas and the discrete gamma correction.
Dan
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/2aa07bb5-0dbe-44f3-b4b4-3e3519ea4a88o%40googlegroups.com.
<matrices.tgz>
__________________________
Dan Vanderpool
Postdoctoral Scholar
Department of Biology
Indiana University
Hahn Lab, Jordan Hall 249B
1001 East Third Street
Bloomington, IN 47405
--
You received this message because you are subscribed to the Google Groups "hahnlab-cafe" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hahnl...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/hahnlabcafe/5F8DAE4D-8B3A-42B7-8229-787B43FA2038%40gmail.com.
__________________________
Dan Vanderpool
Postdoctoral Scholar
Department of Biology
Indiana University
Hahn Lab, Jordan Hall 249B
1001 East Third Street
Bloomington, IN 47405