interpreting the jackknife tests


Hira Fatima

Jun 24, 2014, 9:37:34 AM
to max...@googlegroups.com
Hi everybody,
Can anyone please help me interpret the gain in the jackknife test? For example, my species shows seasonal variation and is supposed to have high mortality, or apparently dies out, in cold temperatures, yet the model shows high gain for exactly those variables, such as bio6 and bio11. How can I interpret that?
Hira

Martin Damus

Jun 24, 2014, 11:01:09 AM
to max...@googlegroups.com
That is a good result! The high gain means that these variables are good predictors for where the species can survive, and by your description (it is susceptible to cold) the high gain seems to be biologically meaningful.

The 'gain' is, in general, an indication of how much better than random the model fit is. A high gain for a particular variable in the jackknife test therefore means that the variable has greater predictive value.
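To make "better than random" concrete, here is a minimal Python sketch. It is not Maxent's own code: it assumes the usual description of gain as the mean log of the model's raw output at the presence sites minus the same quantity for a uniform distribution, and it ignores the regularisation penalty. The function name `maxent_gain` and the toy arrays are made up for illustration.

```python
import numpy as np

def maxent_gain(raw_background, presence_idx):
    """Unregularised gain: mean log raw output at presences, relative to a uniform model."""
    raw = np.asarray(raw_background, dtype=float)
    raw = raw / raw.sum()                      # raw output sums to 1 over the background cells
    n_cells = raw.size
    mean_log_presence = np.log(raw[presence_idx]).mean()
    return mean_log_presence - np.log(1.0 / n_cells)

presences = [1, 5, 9]                          # hypothetical presence cells

# A uniform "model" is no better than random, so its gain is zero.
uniform = np.ones(1000)
print(maxent_gain(uniform, presences))

# A model that puts 10x more weight on the presence cells has a positive gain.
skewed = np.ones(1000)
skewed[presences] = 10.0
gain = maxent_gain(skewed, presences)
print(gain, np.exp(gain))                      # exp(gain): presence likelihood relative to a random cell
```

Under these assumptions, exp(gain) reads as how many times more likely the presence sites are under the model than under a random draw from the background, which is why a variable with high gain on its own is a useful predictor.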

Cheers,
Martin


--
You received this message because you are subscribed to the Google Groups "Maxent" group.
To unsubscribe from this group and stop receiving emails from it, send an email to maxent+un...@googlegroups.com.
To post to this group, send email to max...@googlegroups.com.
Visit this group at http://groups.google.com/group/maxent.
For more options, visit https://groups.google.com/d/optout.


Husam El Alqamy

Jun 24, 2014, 3:16:11 PM
to max...@googlegroups.com

It means that the distribution of the species is strongly affected by this variable, as you described. The response curve for this variable is expected to show the negative relationship.

Hira Fatima

Jun 25, 2014, 12:17:57 AM
to max...@googlegroups.com
Thank you for your kind response. OK, it is reassuring that the variables to which the species is susceptible would contribute to model gain. But the response curve is not negative. I used principal components directly, and PCA 4 shows the highest gain. It combines bio6, bio7 and bio11, which have the highest loadings (> 0.32): 0.69, -0.342 and 0.371 respectively. Here is what my response curve looks like:

Aedes_aegypti_pca4_only.png

Husam El Alqamy

Jun 25, 2014, 1:58:23 AM
to max...@googlegroups.com
Yes, in this case the response curve is hard to interpret, since it represents a composite of the responses to the variables contributing to this principal component. I would suggest checking for collinearity and then modelling with the plain variables. That approach makes it easier to interpret the outcomes in terms of the variables used, and to check the model's dependency on the physical factors expected to shape the species' distribution in a simpler ecological way, rather than going into the complexities of the accompanying statistical dilemmas.
 




--
Husam El Alqamy, B.Sc., M.Phil.
Sr. Biodiversity GIS Analyst ,
Environmental Information Sector, EIS
Environmental Agency Abu Dhabi,UAE
Antelope Specialist Group, ASG - IUCN

Martin Damus

Jun 25, 2014, 6:13:25 AM
to max...@googlegroups.com
Judging by the shape of your response curve, I suspect you included too many 'feature classes'. De-select "Auto features" and select only a few, perhaps linear, quadratic and threshold. You want your response curve to make biological sense, and a species would not respond to climate the way your response curve looks. It should be smoother. Maxent doesn't know it's modelling a living thing; it just wants to find the best fit to the data. But the presence data are not smoothly and evenly spread across the continuum of climate variables you are using, so Maxent fits them with this choppy sort of response curve: it is overfitting to the data points you supplied. You have to select fewer feature classes and/or increase the regularisation parameter in order to smooth things out.

Martin



Hira Fatima

Jun 25, 2014, 2:57:35 PM
to max...@googlegroups.com
OK, I tried using only the linear, quadratic and threshold features as you told me, and yes, the results are smoother now. However, the increase in the response curve is still there, as shown in the picture. According to my understanding, since PCA 4 is a composite of temperature variables (min temperature of coldest month "bio6", annual temperature range "bio7" and mean temperature of coldest quarter "bio11" have the highest loadings), the component can be interpreted in terms of temperature variation. The response curve suggests some kind of threshold: at values of around 130 or 140 (around 13.0 C or 14.0 C) the model starts showing a rapid increase in the logistic probability, which is pretty much understandable and biologically meaningful. Please tell me if I am right.
Although I totally agree that plain variables are better for ecological interpretation.
P.S. Increasing the regularization parameter slightly decreases the AUC, however.
Aedes_aegypti_pca4_only (1).png

Martin Damus

Jun 26, 2014, 10:01:51 AM
to max...@googlegroups.com
Hmmm... I can understand the desire to remove collinearity by running Maxent on PCA components rather than the original variables, but I would be loath to then try to assign a biological response to the output. The mosquito is responding to temperature, moisture, etc. with its own combination of correlations, not to those forced by the translation to orthogonal PCA components. Maxent is robust to correlation among the variables, and I cannot say whether it is as useful to run it on PCA components as it is to just run it on a subset of the original variables, even if they are somewhat or even highly correlated. I would like to leave comment on that to others who know the system better than I do.

I also do not know if the PCA components are any longer scaled in the same units as the original variables. They may all have been in degrees Celsius, but I do not think they can any longer be interpreted that way once they have been rotated by the PCA.
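A small Python sketch may help show why component scores lose the original units. It assumes a PCA on standardised variables (the thread does not say how the components were actually computed) and uses synthetic stand-ins for bio6, bio7 and bio11, not real climate data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites = 500

# Synthetic stand-ins (hypothetical values, loosely in tenths of a degree C)
bio6 = rng.normal(50, 60, n_sites)                # min temperature of coldest month
bio11 = bio6 + rng.normal(40, 15, n_sites)        # mean temperature of coldest quarter
bio7 = rng.normal(250, 40, n_sites) - 0.3 * bio6  # annual temperature range
X = np.column_stack([bio6, bio7, bio11])

# Standardise each variable, then get the principal axes from an SVD
Z = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
loadings = Vt.T          # one column of weights per component
scores = Z @ loadings    # component scores for every site

# Each score is a weighted sum of ALL three standardised variables,
# so it is centred on zero and carries no temperature unit.
print(np.round(loadings[:, 0], 2))
print(np.round(scores.mean(axis=0), 6))
```

In this setup a component value is a mixture of several standardised variables, so reading a threshold on the component axis directly as degrees is not justified without mapping it back through the loadings.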

The decrease in AUC is expected because the increase in regularisation 'spreads' the distribution out further. Going just for highest AUC, as reported many times in this forum, is not the way to go if you want to identify the best model.

Cheers,
Martin



Hira Fatima

Jun 26, 2014, 5:23:59 PM
to max...@googlegroups.com, dam...@yahoo.com
Thank you, Martin, for your kind response. I am convinced now that using principal components won't be an option; I'll stick to plain variables. I have one more question, please: if a variable is highly correlated with another but is ecologically important, should it still be included?

Martin Damus

Jun 27, 2014, 5:55:38 AM
to max...@googlegroups.com
Hi Hira,

If two variables are highly correlated (greater than 75%) I would keep the one that has the higher gain in the jackknife test. It is always interesting and useful to switch the two variables (drop the higher gain one and keep the other) and see if your outcome is similar, however. If it is not, then you may wonder about the robustness of the predicted distribution.
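As a rough illustration of that rule, here is a minimal Python sketch. It reads "greater than 75%" as an absolute Pearson correlation above 0.75, which is an assumption, and the function name `drop_correlated`, the synthetic data and the gain values are all hypothetical; real gains would come from the jackknife output.

```python
import numpy as np

def drop_correlated(X, names, gains, threshold=0.75):
    """From each highly correlated pair, keep the variable with the higher jackknife gain."""
    corr = np.corrcoef(X, rowvar=False)
    # Visit variables from highest to lowest gain, so the better one is kept first
    order = sorted(range(len(names)), key=lambda i: gains[names[i]], reverse=True)
    kept = []
    for i in order:
        if all(abs(corr[i, j]) <= threshold for j in kept):
            kept.append(i)
    return [names[i] for i in sorted(kept)]

rng = np.random.default_rng(1)
bio6 = rng.normal(50, 60, 500)
bio11 = bio6 + rng.normal(0, 10, 500)    # strongly correlated with bio6
bio12 = rng.normal(800, 200, 500)        # independent of both
X = np.column_stack([bio6, bio11, bio12])
names = ["bio6", "bio11", "bio12"]
gains = {"bio6": 0.9, "bio11": 1.1, "bio12": 0.4}   # made-up jackknife gains

selected = drop_correlated(X, names, gains)
print(selected)
```

Swapping the two gain values for bio6 and bio11 and re-running the model on the other selection is one simple way to do the robustness check described above.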

Cheers,
Martin



Hira Fatima

Jun 29, 2014, 2:39:34 AM
to max...@googlegroups.com, dam...@yahoo.com
OK, things are pretty clear now. Thank you so much :)

