This is a technical answer to a technical question, but hopefully it
will be useful to some people.
The entries in the lambdas file are indeed
feature, lambda, min, max
The exponent is then calculated as
lambda_1 * (f_1(x) - min_1)/(max_1 - min_1) + ... + lambda_n * (f_n(x) - min_n)/(max_n - min_n) - linearPredictorNormalizer
i.e., each feature is scaled so that its values lie between 0 and
1 on the training data. The linearPredictorNormalizer is a constant
chosen so that the exponent is always negative (for numerical
stability).
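To make this concrete, here is a minimal Python sketch of the raw exponent calculation for linear features. The feature names, lambda/min/max values, and the linearPredictorNormalizer below are made-up example numbers, not values from a real lambdas file:

```python
# Sketch of the raw Maxent exponent for linear features.
# Each entry mimics one line of a .lambdas file: feature, lambda, min, max.
entries = [
    ("prec",  1.2,  0.0, 400.0),   # hypothetical values
    ("tmean", -0.7, -5.0, 30.0),   # hypothetical values
]
linearPredictorNormalizer = 2.5    # hypothetical constant from the file

def raw_exponent(x):
    """x maps feature name -> value; returns the exponent of the raw model."""
    s = 0.0
    for name, lam, mn, mx in entries:
        # clamp the feature to its training range, then scale to [0, 1]
        v = min(max(x[name], mn), mx)
        s += lam * (v - mn) / (mx - mn)
    # subtracting the normalizer keeps the exponent negative on training data
    return s - linearPredictorNormalizer

print(raw_exponent({"prec": 200.0, "tmean": 10.0}))
```

The raw probability at x is then proportional to exp of this value, with the densityNormalizer (discussed below) providing the normalization.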
Terms corresponding to hinge features are evaluated slightly
differently. For example, the hinge feature prec' derived from the
layer prec and described by the line
prec', lambda, min, max
evaluates to the term
lambda * clamp_at_0(prec - min)/(max - min)
i.e., if prec < min the term is 0; otherwise it is lambda * (prec-min)/(max-min).
For the reverse hinge feature
prec`, lambda, min, max
the term is
lambda * clamp_at_0(max-prec)/(max-min)
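The two hinge terms above can be sketched in Python as follows (clamp_at_0 is just max(., 0); the lambda/min/max values in the example call are hypothetical):

```python
def clamp_at_0(v):
    """max(v, 0), as used in the hinge terms."""
    return v if v > 0 else 0.0

def hinge_term(lam, mn, mx, prec):
    # forward hinge (prec'): 0 below min, rising linearly to lambda at max
    return lam * clamp_at_0(prec - mn) / (mx - mn)

def reverse_hinge_term(lam, mn, mx, prec):
    # reverse hinge (prec`): lambda at min, falling linearly to 0 at max
    return lam * clamp_at_0(mx - prec) / (mx - mn)

# hypothetical hinge on a precipitation layer: lambda=2.0, min=100, max=300
print(hinge_term(2.0, 100.0, 300.0, 50.0))           # below min -> 0.0
print(hinge_term(2.0, 100.0, 300.0, 200.0))          # halfway   -> 1.0
print(reverse_hinge_term(2.0, 100.0, 300.0, 200.0))  # halfway   -> 1.0
```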
The densityNormalizer is the normalization term Z calculated over the
background. The simplest way to check whether you understand how
Maxent is calculating the probabilities is to use the following
"hack". Call
java density.NceasApply species.lambdas test_samples.csv test_
where test_samples.csv is similar to the files in SWD format (except
for the initial columns):
dataset,siteid,x,y,layer1,layer2,...,layern
testdata,site1,200.0,30.0,113,-3.3,...,88
testdata,site2,220.0,23.2,234,-33,...,12
....
The resulting file test_species.csv will contain predictions using the
logistic model specified in species.lambdas, i.e., instead of
outputting the raw probabilities q(x), it calculates
q(x)*e^entropy / (1 + q(x)*e^entropy)
with layers, features and the exponent appropriately clamped (the
exponent is clamped to be smaller than 0 after subtracting the
linearPredictorNormalizer; layers and features are clamped to lie
between their training min and max). NceasApply works as of version
3.2, but it is not particularly user friendly and may not be
maintained in the future.
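The logistic transformation above can be sketched in Python. The q_x and entropy values passed in below are hypothetical; in practice q(x) comes from the raw model and the entropy is computed over the training distribution:

```python
import math

def logistic_output(q_x, entropy):
    """Convert a raw Maxent probability q(x) to logistic output,
    following q(x)*e^entropy / (1 + q(x)*e^entropy)."""
    r = q_x * math.exp(entropy)
    return r / (1.0 + r)

# hypothetical values; when q(x)*e^entropy = 1 the output is exactly 0.5
print(logistic_output(0.001, math.log(1000)))  # ~0.5
```

Note from the formula that the logistic output is 0.5 exactly when q(x)*e^entropy = 1, and always lies strictly between 0 and 1.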
Miro
I am still trying to understand the jackknife process in Maxent (and
the theoretical approach to distribution modeling and variable
importance in general).
As far as I know, the jackknife is a non-parametric estimator: it
resamples without replacement and gives an estimate of the importance
of each sample to the model. OK.
Do the variables return to the "total pool" at each iteration, or not?
Is this a modification of the jackknife?
I don't understand how it has been implemented in Maxent (and in other
software, such as openModeller). What are "m", "S" and "Q" (as presented in
Heltshe, J. & Forrester, N.E. 1983. Estimating species richness using
the jackknife procedure. Biometrics 39, 1-11)?
Can someone help me?
Thanks in advance
Francisco Barreto
--
-------------------------------------------------------------------
Francisco Candido Cardoso Barreto
Professor de Zoologia
Universidade Federal de Ouro Preto
Departamento de Ciências Biológicas - ICEB
Ouro Preto - MG
CEP 35400-000
-------------------------------------------------------------------
Programa de Pós Graduação em Entomologia
Universidade Federal de Viçosa
tel. (31) 8420-8499
CV Lattes: http://lattes.cnpq.br/4288236123208847