additional models and biological interpretation

Marcial Escudero

Apr 12, 2016, 11:14:58 AM
to BioGeoBEARS, Enrique Maguilla
Dear Dr Nicholas J. Matzke,

I am using your software to study a group of plants in the genus Carex (Cyperaceae). I have found your R package and your tutorial very useful. Specifically, I have followed your example

SCRIPT: EXAMPLE THAT SHOULD RUN OUT OF THE BOX: Hawaiian Psychotria (revised and improved, 2015-04-15) on your wiki website, http://phylo.wikidot.com/biogeobears#toc22. It runs very well. Thank you.

I have attached the reconstructions.


These are the AIC values:

                    LnL numparams          d            e           j     AICc       AIC_wt
DEC           -83.90746         2 0.09395133 9.194844e-09 0.000000000 172.2949 1.675029e-05
DEC+J         -83.34384         3 0.09061252 1.000000e-12 0.017269437 173.6877 8.348146e-06
DIVALIKE      -86.29972         2 0.10557918 1.000000e-12 0.000000000 177.0794 1.531359e-06
DIVALIKE+J    -86.23066         3 0.10346035 1.000000e-12 0.005546532 179.4613 4.654361e-07
BAYAREALIKE   -76.63176         2 0.02006117 1.746594e-01 0.000000000 157.7435 2.420019e-02
BAYAREALIKE+J -71.67489         3 0.01398802 1.480142e-01 0.009295747 150.3498 9.757727e-01

As you can see the best model is BAYAREALIKE+J.
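
For readers who want to check how the AICc and AIC_wt columns follow from LnL and numparams, here is a minimal Python sketch of the standard formulas. The sample size is not stated in the post; n = 28 is an assumption, chosen because it is the value that reproduces the posted AICc column (it would presumably be the number of tips in the tree). The function and variable names are made up for illustration and are not part of BioGeoBEARS.

```python
import math

# LnL and number of free parameters, copied from the table above.
models = {
    "DEC":           (-83.90746, 2),
    "DEC+J":         (-83.34384, 3),
    "DIVALIKE":      (-86.29972, 2),
    "DIVALIKE+J":    (-86.23066, 3),
    "BAYAREALIKE":   (-76.63176, 2),
    "BAYAREALIKE+J": (-71.67489, 3),
}

# ASSUMPTION: the sample size is not given in the post; 28 is the value
# that reproduces the posted AICc column.
N_SAMPLES = 28


def aicc(lnl, k, n):
    """AIC plus the small-sample correction term."""
    return -2.0 * lnl + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(scores):
    """Relative model weights from a dict of AICc scores."""
    best = min(scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}


aicc_scores = {m: aicc(lnl, k, N_SAMPLES) for m, (lnl, k) in models.items()}
weights = akaike_weights(aicc_scores)

for m in models:
    print(f"{m:14s} AICc = {aicc_scores[m]:8.4f}  weight = {weights[m]:.4g}")
```

The weights depend only on differences in AICc, so BAYAREALIKE+J dominates because it is about 7.4 AICc units better than the next-best model.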


I have two questions:


1. Is it possible to try additional models with other combinations of parameters? And, if so, do you think it is worth trying additional models?


2. The selected model supports widespread sympatry and founder events. Taking into account that the areas in our analyses are Europe, Asia, North America, South America and Australia, would you consider this result biologically meaningful? Carex species are well known for frequent long-distance dispersal and a high capacity for colonization. I am trying to interpret the result of widespread sympatry at this coarse scale (continents). I think that, in this group, allopatric speciation (or narrow sympatric speciation) followed by rapid colonization to reach a widespread distribution is more biologically plausible than widespread sympatry. Would you consider widespread sympatry possible in this continent-scale analysis?


Thank you very much for your help and feedback.


Sincerely,


Marcial.





Glareosae_DEC_vs_DEC+J_M0_unconstrained_v1.pdf
Glareosae_DIVALIKE_vs_DIVALIKE+J_M0_unconstrained_v1.pdf
Glareosae_BAYAREALIKE_vs_BAYAREALIKE+J_M0_unconstrained_v1.pdf

Nick Matzke

Apr 12, 2016, 10:10:49 PM
to bioge...@googlegroups.com, Enrique Maguilla
Hi! 

I think you are getting the results you get because your dataset has many cases of sister species, or even larger clades, that all occupy (say) three continents.  If a clade of 4 species has ranges like this:

sp1  ABC
sp2  ABC
sp3  ABC
sp4  ABC

...then, phylogenetically speaking, and treating biogeographic range as a multistate character (which is what these methods do), yes, the analysis will naturally infer that the ancestral nodes of these species also had range ABC, and will infer that the speciation events were ABC->(ABC,ABC).
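
To make "range as a multistate character" concrete: each possible combination of areas counts as one discrete state, so with three areas A, B and C there are seven non-empty ranges, and all four species in the example share the single state ABC. A minimal Python sketch follows; the function name is hypothetical and is not part of BioGeoBEARS.

```python
from itertools import combinations


def enumerate_ranges(areas, include_null=False):
    """List every geographic range (subset of areas) as a string state."""
    ranges = [""] if include_null else []
    for size in range(1, len(areas) + 1):
        for combo in combinations(areas, size):
            ranges.append("".join(combo))
    return ranges


states = enumerate_ranges("ABC")
print(states)  # ['A', 'B', 'C', 'AB', 'AC', 'BC', 'ABC']

# Tip data from the example: every species is coded with the same state.
tips = {"sp1": "ABC", "sp2": "ABC", "sp3": "ABC", "sp4": "ABC"}
print({sp: states.index(rng) for sp, rng in tips.items()})
```

When every tip carries the same state, the simplest explanation under any character-evolution model is that the ancestors carried it too, which is the inference described above.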

So, that makes perfect sense in terms of statistical inference.  Whether or not you believe it as a scientific matter is a totally different question, and depends on your prior beliefs about what is plausible. I think the verbal argument you made about how speciation "should" work is reasonable, but I would encourage you to think of alternative hypotheses as well.  I don't know much about sedges, but perhaps the three-continent species are sympatric at the coarse scale of continents, but they have different niches (warm-cold, wet-dry, or some such).  Or perhaps flowering times or some such could cause widespread sympatry.

On the other hand, if you had evidence of (say) polyploids founding new species in this group, this would be pretty good evidence of point-origin-followed-by-rapid-range expansion (although sometimes polyploids can originate independently and still join the same new species). 

It is important to consider time-scale and spatial scale. All biogeographical analyses and results are conditional on these scales.  Phylogeny-level analyses can only "see" processes that occur at the time scale of the phylogeny (typically millions of years).  Any processes faster than that are "effectively instantaneous" as far as the analysis is concerned.  So, a scenario like local origin followed by rapid multiple continent range expansion might be biologically plausible, but you would never see it on the phylogeny except if you had that event preserved with, say, hundreds of ancient DNA sequences or something.

The data you do have, though, are still telling you something.  The fact that many species have the range Europe-Asia-North America, and exclude Australia, tells you there is *something* conserved across these species that conserves that geographic range.  Is it preference for a temperate environmental niche?  Geographical and habitat connectivity across Beringia?  The movements of birds / large mammals etc.?  Sedges hate edges (of continents) and thus have a hard time getting to Australia?  I think these issues would be good starting points for the discussion section of a paper.

(Another possibility is that those widespread species are actually multiple cryptic species; this would take population-level sequencing to resolve.)

Cheers!
Nick














--
You received this message because you are subscribed to the Google Groups "BioGeoBEARS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to biogeobears...@googlegroups.com.
To post to this group, send email to bioge...@googlegroups.com.
Visit this group at https://groups.google.com/group/biogeobears.
To view this discussion on the web visit https://groups.google.com/d/msgid/biogeobears/c2cf98f4-500b-4cff-b5d3-3081cf5cf747%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nick Matzke

Apr 12, 2016, 10:44:44 PM
to BioGeoBEARS, ema...@gmail.com


On Wednesday, April 13, 2016 at 1:14:58 AM UTC+10, Marcial Escudero wrote:
1. Is it possible to try additional models with other combinations of parameters? And, if so, do you think it is worth trying additional models?



I forgot to answer this. Briefly:

My philosophy is that trying models should be driven by hypothesis testing, since otherwise you get into the morass of trying every possible model combination you can think of, which will quickly get ridiculous as more and more parameters are added.

In your case, the most obvious model variants to look at are:

+x -- dispersal rates/weights multiplied by distance^x
+n -- dispersal rates/weights multiplied by environmental distance^n
+w -- dispersal rates/weights multiplied by manual dispersal multipliers^w

Your definitions of "distance", dispersal multipliers, etc., should again be driven by hypothesis testing.  Perhaps you think that minimum-overwater-distance is a relevant predictor of dispersal; if so, construct a distance matrix using that criterion, and then test it to see if it improves the LnL.
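
One common way to check whether an added parameter improves the LnL, when the richer model contains the simpler one as a special case and differs by a single free parameter, is a likelihood-ratio test. The Python sketch below is an illustration under that assumption, not a BioGeoBEARS function. It uses the DEC and DEC+J LnL values posted in this thread purely as a worked example; the LnL of a base model and its +x variant would be substituted in the same way. The chi-square approximation with 1 degree of freedom is itself an assumption and can be conservative when the simpler model sits at a boundary of the parameter space.

```python
import math


def lrt_one_extra_param(lnl_simple, lnl_complex):
    """Likelihood-ratio test for nested models differing by one free parameter.

    Returns (statistic, p_value). The p-value uses the chi-square
    distribution with 1 degree of freedom, whose upper tail is
    erfc(sqrt(statistic / 2)).
    """
    stat = 2.0 * (lnl_complex - lnl_simple)
    stat = max(stat, 0.0)  # guard against tiny negative values from optimizer noise
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Worked example with the DEC and DEC+J LnL values from the table in this thread.
stat, p = lrt_one_extra_param(-83.90746, -83.34384)
print(f"LRT statistic = {stat:.4f}, p = {p:.3f}")
```

A large p-value here means the extra parameter does not improve the fit by more than would be expected by chance.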

It is perfectly valid to try multiple distance matrices / weights, and even combine them. We did some of that in:

Van Dam, Matthew; Matzke, Nicholas J. (2016). Evaluating the influence of connectivity and distance on biogeographic patterns in the south-western deserts of North America. Journal of Biogeography. Special paper, published online 3 March 2016. 

In your case, given the size of your dataset, and only 4 areas, I wouldn't recommend going too crazy.  But one obvious model to try is one where Europe-Asia-North America are close/connected, and Australia is distant from them (but perhaps Australia is closer to Asia than the others).
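
As a rough illustration of the +x idea applied to this geometry, the Python sketch below scales a base dispersal rate by distance^x. Everything numeric in it is hypothetical: the relative distances are arbitrary units chosen only to encode "Europe, Asia and North America close; Australia far, but nearer to Asia", and the exponent x = -1 is a placeholder for a value that a +x model would estimate. It does not reproduce the BioGeoBEARS input format or its internals.

```python
# HYPOTHETICAL relative distances (arbitrary units), not measured values.
DISTANCE = {
    frozenset(["Europe", "Asia"]): 1.0,
    frozenset(["Europe", "North America"]): 1.0,
    frozenset(["Asia", "North America"]): 1.0,
    frozenset(["Asia", "Australia"]): 3.0,
    frozenset(["Europe", "Australia"]): 5.0,
    frozenset(["North America", "Australia"]): 5.0,
}


def scaled_dispersal(d, area_a, area_b, x):
    """Base dispersal rate d multiplied by distance**x for one pair of areas."""
    return d * DISTANCE[frozenset((area_a, area_b))] ** x


d = 0.014  # illustrative base rate, close to the d fitted for BAYAREALIKE+J
x = -1.0   # HYPOTHETICAL exponent; x = 0 recovers the distance-free model

for pair in DISTANCE:
    a, b = sorted(pair)
    print(f"{a:>13s} <-> {b:<13s} rate = {scaled_dispersal(d, a, b, x):.5f}")
```

With a negative x, dispersal to Australia is down-weighted relative to dispersal among the northern continents, and more so from Europe or North America than from Asia.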

The BioGeoBEARS google group and the wiki have lots of discussion/advice/examples. Example files:


Cheers,
Nick

A. Marcial Escudero

Apr 14, 2016, 5:34:38 AM
to bioge...@googlegroups.com, Enrique Maguilla
Dear Dr. Nick Matzke,

Thank you very much for your email.

I agree that it does not make sense to keep trying more models and combinations of parameters if there is no biological hypothesis behind them.
I will read your paper and consider the possible connections between the different areas.

Regarding the biological interpretation of the results:
I was a little worried about the inferred widespread sympatry because it seems hard to believe (even if each species exploits a different niche, etc.) that speciation occurs in sympatry across Europe, Asia and North America at the same time. Taking into account the known high dispersal capacity of Carex species, it seemed more plausible that speciation occurs in one place and the new species then colonizes the rest of the Northern Hemisphere.

Polyploidy is rare in Carex, which has holocentric chromosomes. Its chromosomes evolve very fast by fission and fusion. But a model similar to the one you describe for polyploids could happen in Carex.

I agree it is important to consider time-scale and spatial scale. Carex has a very high capacity to colonize remote areas, and those colonization events at the population level could be "effectively instantaneous" for this analysis. That is why I think that a scenario like local origin followed by rapid multiple-continent range expansion might be biologically more plausible.

I think the Europe-Asia-North America connection is easier (maybe via Beringia, or maybe through the high capacity for long-distance dispersal in Carex) than reaching Australia, which is harder because Carex species have to cross the tropical and subtropical areas at low latitudes before they reach the temperate areas of the Southern Hemisphere in Australia and South America (there is one species there too).

We have a lot of data on these species, and the hypothesis of multiple cryptic species is not plausible.

Thank you very much for your comments and hypotheses.

Cheers,

Marcial.

 




--
A. Marcial Escudero
Postdoctoral Researcher
Department of Plant Biology and Ecology
Avda. Reina Mercedes s/n
41012 Sevilla
SPAIN

A. Marcial Escudero

Oct 17, 2016, 7:52:21 AM
to bioge...@googlegroups.com, Enrique Maguilla
Dear Nick,

I have an additional question about the number of parameters in the DEC, DIVA and BayArea models.
This is the summary output:

                    LnL numparams          d            e           j     AICc       AIC_wt
DEC           -83.90746         2 0.09395133 9.194844e-09 0.000000000 172.2949 1.675029e-05
DEC+J         -83.34384         3 0.09061252 1.000000e-12 0.017269437 173.6877 8.348146e-06
DIVALIKE      -86.29972         2 0.10557918 1.000000e-12 0.000000000 177.0794 1.531359e-06
DIVALIKE+J    -86.23066         3 0.10346035 1.000000e-12 0.005546532 179.4613 4.654361e-07
BAYAREALIKE   -76.63176         2 0.02006117 1.746594e-01 0.000000000 157.7435 2.420019e-02
BAYAREALIKE+J -71.67489         3 0.01398802 1.480142e-01 0.009295747 150.3498 9.757727e-01

In this summary we have the estimated dispersal (d) and extinction (e) parameters, and also the founder-event parameter (j) in some models.
But as far as I understand, each model has additional parameters, such as sympatry (y) in the BayArea model, sympatry (y) and vicariance (v) in the DIVA model, or sympatry (y) and vicariance (v and s) in the DEC model.
Why are they not taken into account when calculating AIC? I am probably missing something here.

Thank you for your help.

Cheers,

Marcial.


torsten...@gmail.com

Oct 18, 2016, 4:08:57 AM
to BioGeoBEARS, ema...@gmail.com
Hi Marcial,

I think the number of parameters is correct, because what matters are the free parameters that are estimated from your data. Have a look at the parameter table:

YourBioGeoBEARSObject$BioGeoBEARS_model_object@params_table

For the DEC model only two parameters are set free and all the remaining ones are fixed to a specific coefficient.
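
The counting rule described here can be sketched in a few lines of Python. The dictionary below is only a simplified stand-in for a parameter table: the row names follow the parameters mentioned in this thread, but the "free" / "fixed" / "derived" labels are illustrative assumptions and not the contents of a real BioGeoBEARS object.

```python
# ASSUMPTION: simplified stand-in for a parameter table, for illustration only.
dec_params = {
    "d": "free",      # dispersal rate, estimated from the data
    "e": "free",      # extinction rate, estimated from the data
    "j": "fixed",     # founder-event weight, held at 0 in plain DEC
    "y": "derived",   # the remaining weights are set by a rule, not estimated
    "s": "derived",
    "v": "derived",
}


def count_free(params):
    """Number of parameters that are actually estimated from the data."""
    return sum(1 for kind in params.values() if kind == "free")


# In the +J variant, j is freed; everything else keeps its status.
dec_j_params = dict(dec_params, j="free")

print(count_free(dec_params))    # 2
print(count_free(dec_j_params))  # 3
```

Only the count of free rows enters the AIC penalty, which matches the numparams column of 2 and 3 in the summary table.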

HTH,
Torsten

A. Marcial Escudero

Oct 18, 2016, 4:35:14 AM
to bioge...@googlegroups.com, Enrique Maguilla
Dear Torsten,

Thank you very much for your email. It helps a lot.
I suspected that the other parameters were not free parameters.
Thank you very much for confirming that.

Cheers,

Marcial.


Nick Matzke

Oct 18, 2016, 6:44:01 AM
to bioge...@googlegroups.com, Enrique Maguilla
Hi -- apologies for the slow reply, I've been traveling. It will be awhile before I can get to more complex matters.  Yes, when j is a free parameter, then y/s/v change when j changes, but because they change as a deterministic function of j, they are not free parameters. 
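
A small Python sketch of what "deterministic function of j" means in practice. The specific rule used here (y, s and v share equally whatever is left of a fixed total of 3 after j is subtracted) is an assumption made for illustration of a DEC+J-style setup; it is not stated in this thread, and the parameter table of your own model object is the authority on the actual formulas.

```python
def derived_weights(j, total=3.0):
    """Per-event weights y, s, v computed from j under an ASSUMED rule.

    ASSUMPTION: the weights share a fixed total, so y = s = v = (total - j) / 3.
    Nothing here is estimated; once j is known, y, s and v follow.
    """
    if not 0.0 <= j <= total:
        raise ValueError("j must lie between 0 and the total weight")
    share = (total - j) / 3.0
    return {"j": j, "y": share, "s": share, "v": share}


# j = 0 corresponds to plain DEC; the second value is the fitted DEC+J j
# from the summary table in this thread.
for j in (0.0, 0.017269437):
    w = derived_weights(j)
    print({name: round(value, 6) for name, value in w.items()})
```

Because y, s and v carry no information beyond j, counting them as parameters in AIC would penalize the model twice for the same degree of freedom.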

If one works through this page very carefully, the way that the models function becomes much clearer:

Also study carefully table 1 of Matzke 2014:

Cheers!
Nick



A. Marcial Escudero

Oct 18, 2016, 6:48:37 AM
to bioge...@googlegroups.com, Enrique Maguilla
Dear Nick,

Thank you very much for your help.

Thanks!

Marcial.
