Percent contribution and Pearson's correlation index

serkan gul

Apr 18, 2016, 3:23:18 AM
to max...@googlegroups.com
Hi everyone,

I sometimes have doubts about variable selection. For example, I ran Maxent with 10 replicates using all 19 bioclimatic variables. The results showed that bio15, bio11 and bio4 had the highest percent contribution. However, when I analysed the correlations with ENMTools, the result was different: according to the correlation analysis, bio2, bio8, bio9 and bio15 were not correlated with the other variables. How should I decide which variables to select? That is, which analysis should I rely on: percent contribution or Pearson's correlation coefficient?

all the best,
serkan

--
Asst. Prof. Dr. Serkan GÜL
Recep Tayyip Erdoğan University
Faculty of Arts & Sciences
Department of Biology 
53100 Rize, TURKEY
Phone: +90 464 2236126/1837

Ayşe S. Turak

Apr 19, 2016, 6:02:23 AM
to max...@googlegroups.com
Hi Serkan,

Your question:
How should I decide variables selection?  percent contribution or Pearson's correlation index.

I would use both, and interpret the information they provide together.

For example, from your results, my first thoughts would be:
- Among the bioclimatic variables available to you, bio15 clearly provides information that the others do not. So I would include bio15 in your variable set.
- Bio2, bio8 and bio9 may have no explanatory value for the distribution of your species, or they might be correlated with bio4 and/or bio11 and therefore not appear to contribute. Sometimes one variable carries the information of two others. For example, bio4 alone might capture what bio8 and bio9 would otherwise contribute together.

One approach is to first compute all pairwise correlations, then, within each group of very highly correlated variables, keep only one. I usually retain the variable that is more intuitive (i.e. whose relationship with the species distribution is more easily understood). For example, are bio8 and bio11 highly correlated in your study area? In that situation, if I think a plant needs a combination of high temperature and rainfall for germination and growth, I would go for bio8. But if I think my plant cannot tolerate very cold conditions, I would go for bio11.
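
A minimal sketch of this kind of correlation screening, assuming the value of each variable has already been sampled at a set of points into a NumPy array. The 0.8 threshold, the function names and the synthetic data are illustrative assumptions, not values from this thread. The `priority` list stands in for the ecological judgement about which variable of a correlated pair is more meaningful for the species.

```python
import numpy as np

def correlated_pairs(values, names, threshold=0.8):
    """List (name_a, name_b, r) for every pair with |Pearson r| >= threshold."""
    r = np.corrcoef(values, rowvar=False)  # columns are variables
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs

def drop_correlated(values, names, priority, threshold=0.8):
    """Walk variables in priority order; keep one only if it is not
    highly correlated with a variable that was already kept."""
    r = np.abs(np.corrcoef(values, rowvar=False))
    index = {name: i for i, name in enumerate(names)}
    kept = []
    for name in priority:
        if all(r[index[name], index[k]] < threshold for k in kept):
            kept.append(name)
    return kept

# Synthetic stand-in data (NOT real bioclimatic values).
rng = np.random.default_rng(0)
n = 500
bio11 = rng.normal(size=n)
bio8 = bio11 + 0.1 * rng.normal(size=n)  # built to be strongly correlated with bio11
bio15 = rng.normal(size=n)               # built to be independent of both

names = ["bio8", "bio11", "bio15"]
values = np.column_stack([bio8, bio11, bio15])

pairs = correlated_pairs(values, names)
# Here bio11 is ranked above bio8, e.g. for a cold-intolerant plant.
kept = drop_correlated(values, names, priority=["bio11", "bio15", "bio8"])
print(pairs)
print(kept)
```

Changing the order of `priority` changes which member of a correlated pair survives, which is where knowledge of the species comes in.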

But trial runs and a look at the contributions are still necessary. Also, one should always stay in contact with a specialist on the species.

Best wishes,
Ayşe
--
You received this message because you are subscribed to the Google Groups "Maxent" group.
To unsubscribe from this group and stop receiving emails from it, send an email to maxent+un...@googlegroups.com.
To post to this group, send email to max...@googlegroups.com.
Visit this group at https://groups.google.com/group/maxent.
For more options, visit https://groups.google.com/d/optout.

--

Ayşe S. Turak, Ph.D.
Koruma CBS ve Modelleme Uzmanı

Doğa Koruma Merkezi www.dkm.org.tr

1293. Sok. No:9/2, Aşağı Öveçler, 06460, Ankara, Türkiye
fax:+90 312 287 8144 fax:+90 312 287 4067
e-posta:
ayse....@dkm.org.tr

serkan gul

Apr 19, 2016, 7:35:05 AM
to max...@googlegroups.com
Hi Ayşe,

Thank you so much for your revealing answer.

all the best,
Serkan

Peter Galante

Apr 23, 2016, 9:47:44 AM
to Maxent
Hi Serkan,

As Ayşe mentioned, it's a good idea to try both ways to really get a feel for your study region and data.
Maxent uses a type of regularization called "L1" regularization, which "shrinks" the coefficient of each variable toward 0; a variable whose coefficient reaches 0 is not used in the model. In this way Maxent can end up using only a subset of the variables you supply. If you want to know which variables were actually incorporated into your final model, check the lambdas file. Someone in this group posted a .pdf explaining the lambdas file that I found very helpful. Also, Maxent iterates through the variables, building the model and testing the gain from each one, so highly correlated variables can both be included if they are important for your species and the model was able to gain enough information from the small part in which they are not correlated.
In general, Maxent is pretty robust to problems with variable correlations. My worry is that if you remove variables based on correlation, you run the risk of removing potentially important information that is not captured by the remaining variables.
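
To illustrate how an L1 penalty can push coefficients to exactly 0, here is a small sketch. It is NOT Maxent's fitting procedure: it is a generic L1-penalised least-squares fit (lasso) solved by iterative soft-thresholding, on synthetic data, and all names and the penalty value `lam` are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(w, t):
    """Shrink each entry of w toward 0 by t; entries within t of 0 become 0."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=2000):
    """Minimise (1/2n)*||Xw - y||^2 + lam*||w||_1 by proximal gradient steps."""
    n, p = X.shape
    w = np.zeros(p)
    # Step size 1/L, where L = (largest singular value of X)^2 / n
    # is the Lipschitz constant of the gradient of the smooth part.
    step = n / (np.linalg.norm(X, 2) ** 2)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Synthetic data: the first variable matters, the second barely, the third not at all.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = 2.0 * X[:, 0] + 0.05 * X[:, 1] + 0.5 * rng.normal(size=500)

w = lasso_ista(X, y, lam=0.3)
print(w)  # the weak and the irrelevant variable end up with coefficient 0
```

A variable with a zero coefficient plays no role in the fitted model, which is the sense in which an L1-regularised model can use only a subset of the variables it was given.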

Hope this helps,
Pete

serkan gul

May 4, 2016, 3:26:47 PM
to max...@googlegroups.com
Hi Peter,

Thank you so much for your suggestions. I will follow your advice.

all the best,
Serkan