Hi everyone!
First of all, thank you for accepting my request to join the group!
I would like to use SimPhy to evaluate new approaches in phylogeny reconstruction, but first I am running several small tests to make sure I understand how SimPhy handles the different parameters and what the consequences of tweaking each of them are. At the moment I'm testing how the final number of species on a species tree changes as a function of the speciation rate. According to the manual, the speciation rate (-SB parameter) is the number of events per time unit. My direct interpretation was that, since I keep the extinction rate at zero, the final number of species on my tree should be, on average, the tree height (ST) times the speciation rate (SB), i.e., S=ST*SB, where S is the final number of species. In other words, I expected that in each time unit SB new species would appear (as random splits among the existing species). An alternative expectation was that in each time unit every existing species would generate SB new species, which is equivalent to saying I would observe S=2^(SB*ST) species at the end of the simulation.
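To make the two expectations concrete, here is a minimal Python sketch that tabulates both formulas. It only evaluates S=ST*SB and S=2^(SB*ST); it does not call SimPhy. The ST grid in steps of 1e+06 and the function names are assumptions chosen for illustration.

```python
# Two candidate expectations for the final number of species S,
# with the extinction rate fixed at zero.
SB = 1e-06  # speciation rate (events per time unit)

# Assumed grid: ST values from 1e+06 to 7e+06 in steps of 1e+06.
ST_VALUES = [k * 1e06 for k in range(1, 8)]


def linear_expectation(sb, st):
    """First expectation: SB new species per time unit, so S = ST*SB."""
    return sb * st


def doubling_expectation(sb, st):
    """Second expectation: every species yields SB new ones, so S = 2^(SB*ST)."""
    return 2 ** (sb * st)


def expectation_table(sb, st_values):
    """Return (ST, linear S, doubling S) for each tree height."""
    return [
        (st, linear_expectation(sb, st), doubling_expectation(sb, st))
        for st in st_values
    ]


for st, lin, dbl in expectation_table(SB, ST_VALUES):
    print(f"ST={st:.0e}  S=ST*SB: {lin:.1f}  S=2^(SB*ST): {dbl:.1f}")
```

With SB fixed at 1e-06, the first formula grows from about 1 to about 7 species over this range, while the second grows from about 2 to about 128, so the two expectations are easy to tell apart.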
So I ran a few simulations to clarify this. I fixed SB at 1e-06 and tested ST values between 1e+06 and 7e+06. For each ST value I ran 100 simulations and recorded the average final number of species S across them. Attached is a table summarizing the results, including my two alternative expectations and the actual result. Doing an exponential regression on the observed values of S, I got a function that is approximately S=1.8*exp(ST*SB).
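For anyone who wants to repeat the fit, below is a minimal sketch of one common way to do an exponential regression: ordinary least squares on log(S). This is not necessarily the exact procedure behind the numbers above, and the input values are placeholders generated from the fitted function itself, not the simulation results from the attached table.

```python
import math


def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by ordinary least squares on log(y)."""
    n = len(x)
    log_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_log_y = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_log_y) for xi, li in zip(x, log_y))
    b = sxy / sxx                          # slope on the log scale
    a = math.exp(mean_log_y - b * mean_x)  # intercept, back-transformed
    return a, b


# Placeholder input, NOT real simulation output: x stands for ST*SB and
# y is generated from S = 1.8 * exp(x) only to check that the fit
# recovers the two coefficients.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [1.8 * math.exp(xi) for xi in x]

a, b = fit_exponential(x, y)
print(f"S = {a:.3f} * exp({b:.3f} * ST*SB)")
```

Fitting on the log scale weights relative rather than absolute errors, so the coefficients can differ slightly from those of a direct nonlinear fit on noisy data.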
Well, put very simply, why is that? I don’t think other simulation parameters are interfering (I checked that for at least some of them), but at the same time the function I found seems rather arbitrary. Could anyone help me understand this? Thank you!