Hi !
First, thanks a lot for the SimPhy program that is very user friendly.
Here is my question :
I am PhD student (CNRS - France) working on olfactory receptor genes and I observed that 2 sub-families of these genes were co-evolving (correlation of the number of genes in a species tree). I would like to test the probabilities to observe this pattern only by chance. To do so, I want to simulate the evolution of gene families with a given or simulated species tree.
I ran some simulations with SimPhy with this command, that (If I understand correctly) will generate 3 different gene families evolution scenario, for 100 different species trees :
./SimPhy_1.0.2/bin/simphy_mac64 -sb f:0.000001 -ld f:0.0000005 -lb f:0.0000005 -lt f:0 -rs 100 -rl U:3,3 -rg 1 -o SimPhy_test -sp f:10000 -su f:0.00001 -sg f:1 -sl U:185,185 -st f:1000000 -om 1 -v 2 -od 1 -op 1 -oc 1 -on 1 -cs 50 -ll 1000
Than I extracted the number of genes present in each of the species using R (let's say that "g_trees1.trees" is the first simulation for the first species tree) :
>library("ape")
>mytree <- read.tree("g_trees1.trees")
>all_labels <- mytree$tip.label
>sp_labels <- gsub("_.*","",all_labels)
> table(sp_labels)
With this command, I know that the species 89 have a 2 genes while the species 96 have 4 genes.
I am however facing some issues:
First, which parameter should I change for all my species to be represented by at least 1 genes in the gene tree ? I tried to modify -ll and -ls without any success.
The gene families I want to generate should also be way bigger (In the real dataset, I am working on olfactory receptor genes with hundreds of genes in each species). How can I achieve this ?
And finally, is that a good way to answer my initial goal ?
Thanks a lot for your help and advices ,
Maxime Policarpo