Understanding maxent


degoe

Feb 7, 2008, 7:55:11 AM
to Maxent
Hello everybody,

I want to use maxent to build a habitat model for my diploma thesis.

To get a better understanding, I wanted to figure out whether I can
reconstruct the maxent distribution function q(x) = 1/Z *
exp(lambda_1 * f_1(x) + ... + lambda_m * f_m(x)) from the .lambdas
file.

I used only one environmental layer (prec), limited to a small island.
The map had a size of 14x10 pixels (including water), and I
used six arbitrary pixels for training.

So the model should look like this if I use only linear features:
q(x) = 1/Z * exp(lambda*prec(x))

The parameters from the lambdas-file:

-------------------
prec, 5.26784679491924, 59.0, 76.0
linearPredictorNormalizer, 5.26784679491924
densityNormalizer, 817.0795072192808
numBackgroundPoints, 10000
entropy, 8.124905977703612
------------------
The densityNormalizer should be Z and the first line is "layername,
lambda, min, max", is that right?
I couldn't find any information about the linearPredictorNormalizer.
How is it related to the function?

The reconstructed function, fed with the data from the environmental
layer and the lambda from the maxent calculation, doesn't give the
same results as the output of the software.
If I calculate everything manually (OpenOffice Calc), Z
(= Sum_x exp(lambda*prec(x))) comes out to 7E+173. Using this to
calculate the raw probabilities produces values between 1E-35 and
0.999, but the predicted probabilities in the software output are all
around 1E-5 (raw values).
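In code terms, what I'm computing is roughly the following sketch (the prec pixel values here are invented for illustration; my real layer has 14x10 pixels):

```python
import math

lam = 5.26784679491924                         # lambda from the .lambdas file
prec = [59.0, 62.5, 66.0, 70.0, 73.0, 76.0]    # hypothetical pixel values

# naive reconstruction: no feature scaling, no linearPredictorNormalizer
weights = [math.exp(lam * p) for p in prec]
Z = sum(weights)                               # blows up to ~1e173
raw = [w / Z for w in weights]                 # spans many orders of magnitude
```

With the lambda applied directly to the unscaled prec values, the largest pixel alone contributes exp(5.27 * 76), which is on the order of my 7E+173, and the smallest raw values come out around 1E-39.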

Perhaps somebody knows (or can guess) what I'm doing wrong.


Thanks for any help
Dennis

Miroslav Dudik

Feb 7, 2008, 1:58:54 PM
to Max...@googlegroups.com
Hi Dennis and others,

This is a technical answer to a technical question, but hopefully it
will be useful to some people.

The entries in the lambda file are indeed
feature, lambda, min, max
The exponent is then calculated as
lambda_1 * (f_1(x) - min_1)/(max_1 - min_1) + ... + lambda_n *
(f_n(x) - min_n)/(max_n - min_n) - linearPredictorNormalizer
i.e., features are scaled so that their values lie between 0 and
1 on the training data. The linearPredictorNormalizer is a constant
chosen so that the exponent is always non-positive (for numerical
stability).
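As a sketch (this is not Maxent's actual code; the entry values are taken from the lambdas file quoted earlier in the thread), the exponent for linear features can be evaluated like this:

```python
def raw_exponent(entries, linear_predictor_normalizer, x):
    """entries: list of (feature, lambda, min, max) from the .lambdas file;
    x: dict mapping feature name to its (unscaled) value at a pixel."""
    s = 0.0
    for name, lam, fmin, fmax in entries:
        f = min(max(x[name], fmin), fmax)       # clamp to the training range
        s += lam * (f - fmin) / (fmax - fmin)   # feature scaled to [0, 1]
    return s - linear_predictor_normalizer      # keeps the exponent <= 0

entries = [("prec", 5.26784679491924, 59.0, 76.0)]
normalizer = 5.26784679491924

print(raw_exponent(entries, normalizer, {"prec": 76.0}))  # 0.0 at the training max
print(raw_exponent(entries, normalizer, {"prec": 59.0}))  # -5.2678... at the min
```

Note that in this single-feature example the linearPredictorNormalizer equals lambda, exactly what makes the exponent hit 0 at the training maximum.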

Terms corresponding to hinge features are evaluated slightly
differently. For example, the hinge feature prec', derived from the
layer prec and described by the line
prec', lambda, min, max
evaluates to the term
lambda * clamp_at_0(prec-min)/(max-min)
i.e., if prec<min then the value is 0 otherwise it is (prec-min)/(max-min).

For the reverse hinge feature
prec`, lambda, min, max
the term is
lambda * clamp_at_0(max-prec)/(max-min)
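In code, the two hinge terms above can be sketched as follows (the function names and example numbers are illustrative, not Maxent internals):

```python
def hinge_term(value, lam, fmin, fmax):
    """Forward hinge: lam * clamp_at_0(value - fmin) / (fmax - fmin)."""
    return lam * max(0.0, value - fmin) / (fmax - fmin)

def reverse_hinge_term(value, lam, fmin, fmax):
    """Reverse hinge: lam * clamp_at_0(fmax - value) / (fmax - fmin)."""
    return lam * max(0.0, fmax - value) / (fmax - fmin)

print(hinge_term(50.0, 2.0, 59.0, 76.0))          # 0.0: below min, no contribution
print(hinge_term(76.0, 2.0, 59.0, 76.0))          # 2.0: full weight at the max
print(reverse_hinge_term(76.0, 2.0, 59.0, 76.0))  # 0.0: reverse hinge vanishes at max
```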

The densityNormalizer is the normalization term Z calculated over the
background. The simplest way to check whether you understand how
Maxent is calculating the probabilities is to use the following
"hack". Call
java density.NceasApply species.lambdas test_samples.csv test_
where test_samples.csv is similar to the files in SWD format (except
for the initial columns):

dataset,siteid,x,y,layer1,layer2,...,layern
testdata,site1,200.0,30.0,113,-3.3,...,88
testdata,site2,220.0,23.2,234,-33,...,12
....

The resulting file test_species.csv will contain predictions using the
logistic model specified in species.lambdas, i.e., instead of
outputting the raw probabilities q(x), it calculates
q(x)*e^entropy / (1 + q(x)*e^entropy)
with layers, features and the exponent appropriately clamped (the
exponent is clamped to be smaller than 0 after subtracting the
linearPredictorNormalizer; layers and features are clamped to lie
between their training min and max). NceasApply works for now
(version 3.2), but it is not particularly user friendly and may not be
maintained in the future.
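The logistic transform above can be checked in a few lines (the entropy value is taken from the lambdas file quoted at the top of the thread; the raw probability used here is just an example):

```python
import math

def logistic_from_raw(q_raw, entropy):
    """Maxent logistic output: q*e^H / (1 + q*e^H), with H = entropy."""
    s = q_raw * math.exp(entropy)
    return s / (1.0 + s)

entropy = 8.124905977703612   # entropy line from the lambdas file above

# a raw probability of 1/numBackgroundPoints = 1e-4 maps to about 0.25
print(logistic_from_raw(1e-4, entropy))
```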

Miro

degoe

Feb 11, 2008, 6:22:11 AM
to Maxent
Thank you for your answer; it helped me a lot. I just haven't had the
time to try NceasApply yet.

Dennis

B. Miller

Feb 12, 2008, 8:53:10 PM
to Max...@googlegroups.com
Where can I find the raw data in the MaxEnt output that indicates which actual categorical value is the habitat with the most influence on the model? It appears that one could interpolate and come close, but there must be a more exact way.

I am using a habitat grid with 80 variables that is coded and used as a categorical variable.
This works well alongside all of the layers of numerically valued grids.

For further mapping I would like to be able to use the GIS polygon layer and evaluate the extent of each of these habitat variables.

Thanks for any insight.

Bruce
 

Francisco Barreto

Mar 6, 2008, 2:06:30 PM
to Max...@googlegroups.com
Hi all,

I am still trying to understand the jackknife process in Maxent (and
the whole theoretical approach to distribution modeling / variable importance).
As far as I know, the jackknife is a non-parametric estimator: it
resamples without replacement and gives an estimate of the importance
of each sample to the model. OK.

Do the variables return to the "total pool" in each iteration, or not?
Is this a modification of the jackknife?

I don't understand how it has been implemented in Maxent (and in other
software, like openModeller). What are "m", "S" and "Q" (as presented in
Heltshe, J. & Forrester, N.E. 1983. Estimating species richness using
the jackknife procedure. Biometrics 39, 1-11)?

Can someone help me?

Thanks in advance

Francisco Barreto

--
-------------------------------------------------------------------
Francisco Candido Cardoso Barreto
Professor of Zoology
Universidade Federal de Ouro Preto
Departamento de Ciências Biológicas - ICEB
Ouro Preto - MG
CEP 35400-000
-------------------------------------------------------------------
Graduate Program in Entomology
Universidade Federal de Viçosa
tel. (31) 8420-8499

CV Lattes: http://lattes.cnpq.br/4288236123208847
