Dear Maxent group members,
I am new to SDM modeling and the related statistics, and there is no one
nearby I can turn to for help. My problem may well be very simple
for you.
I need to implement two statistics used in a publication (Nakazato et
al. (2010), American Journal of Botany 97(4): 680-693) in my own
analysis. The two statistics, described on p. 682 of that
paper, are the threshold of equal sensitivity and specificity and the
binomial probability for a species at this threshold. Also, on
p. 689, the authors refer to "The spatial distribution of
potentially suitable habitat, based on SDMs at the threshold of equal
sensitivity and specificity".
I do not know how to obtain these two statistics, although I have
checked the Maxent tutorial carefully; they do not seem to be
described there directly. Could you please give me any directions or
advice? Your help would be very valuable to me.
What I have obtained by analyzing the Maxent output in R is listed below.
Thanks a lot.
Sincerely,
Mao J-F
the Institute of Botany, Chinese Academy of Sciences
#############################################################################################################################################
# Part 1 AUC and boot
#############################################################################################################################################
> # read in the Maxent predictions at the presence and background points, and extract the columns we need
> library(ROCR) # provides prediction() and performance()
>
> presence <- read.csv("py_samplePredictions.csv",header=TRUE)
> background <- read.csv("py_backgroundPredictions.csv", header=TRUE)
> pp <- presence$Logistic.prediction # get the column of predictions
> testpp <- pp[presence$Test.or.train=="test"] # select only test points
> trainpp <- pp[presence$Test.or.train=="train"] # select only train points
> bb <- background$logistic # logistic predictions at the background points
>
> # put the prediction values into the format required by ROCR, the package we will use to do some ROC analysis, and generate the ROC curve:
>
> combined <- c(testpp, bb) # combine into a single vector
> label <- c(rep(1,length(testpp)),rep(0,length(bb))) # labels: 1=present, 0=random
> pred <- prediction(combined, label) # labeled predictions
> perf <- performance(pred, "tpr", "fpr") # True / false positives, for ROC curve
> plot(perf, colorize=TRUE) # Show the ROC curve
> performance(pred, "auc")@y.values[[1]] # Calculate the AUC
[1] 0.9975658
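For reference, an AUC like the one above can be cross-checked in base R: the ROC AUC equals the Mann-Whitney probability that a randomly chosen presence point outscores a randomly chosen background point (ties counting one half). A minimal sketch, using simulated scores as stand-ins for the real `testpp` and `bb`:

```r
# Cross-check: AUC as the Mann-Whitney probability (base R only).
# The simulated vectors below are stand-ins for the real testpp and bb.
set.seed(1)
testpp <- runif(38, 0.5, 1)    # pretend test presence predictions
bb     <- runif(1000, 0, 0.6)  # pretend background predictions

# P(presence score > background score) + 0.5 * P(tie)
auc <- mean(outer(testpp, bb, ">")) + 0.5 * mean(outer(testpp, bb, "=="))

# Equivalent rank-based (Wilcoxon) formulation
r <- rank(c(testpp, bb))
W <- sum(r[seq_along(testpp)]) - length(testpp) * (length(testpp) + 1) / 2
```

`auc` and `W / (length(testpp) * length(bb))` agree to machine precision, which is a handy sanity check on any ROC computation.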
>
> # Next, as an example of a test available in R but not in Maxent, we will make a bootstrap estimate of the standard deviation of the AUC.
> # (This analysis can be done in R; Maxent itself cannot do it.)
> library(boot) # provides boot() and boot.ci()
>
> AUC <- function(p,ind) {
+ pres <- p[ind]
+ combined <- c(pres, bb)
+ label <- c(rep(1,length(pres)),rep(0,length(bb)))
+ predic <- prediction(combined, label)
+ return(performance(predic, "auc")@y.values[[1]])
+ }
>
> b1 <- boot(testpp, AUC, 100) # do 100 bootstrap AUC calculations
> b1 # gives estimates of standard error and bias
ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = testpp, statistic = AUC, R = 100)
Bootstrap Statistics :
original bias std. error
t1* 0.9975658 -1.618421e-05 0.0004633955
>
> # The bootstrap results can also be used to determine confidence intervals for the AUC:
>
>
> boot.ci(b1)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 100 bootstrap replicates
CALL :
boot.ci(boot.out = b1)
Intervals :
Level Normal Basic
95% ( 0.9967, 0.9985 ) ( 0.9967, 0.9987 )
Level Percentile BCa
95% ( 0.9964, 0.9984 ) ( 0.9964, 0.9983 )
Calculations and Intervals on Original Scale
Some basic intervals may be unstable
Some percentile intervals may be unstable
Some BCa intervals may be unstable
Warning message:
In boot.ci(b1) : bootstrap variances needed for studentized intervals
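For reference, what `boot()` does here can be written out by hand in a few lines of base R. This sketch (with simulated stand-ins for `testpp` and `bb`, and a simplified tie-free AUC) resamples the presence predictions with replacement and reports the spread of the resulting AUCs:

```r
# Hand-rolled nonparametric bootstrap of the AUC (base R only).
# Simulated stand-ins for the real testpp and bb vectors.
set.seed(1)
testpp <- runif(38, 0.5, 1)    # pretend test presence predictions
bb     <- runif(1000, 0, 0.6)  # pretend background predictions

aucfun <- function(p) mean(outer(p, bb, ">"))  # AUC, ignoring ties

# Resample the presence predictions with replacement, 100 times
reps <- replicate(100, aucfun(sample(testpp, replace = TRUE)))
sd(reps)  # bootstrap standard error of the AUC
```

This mirrors what `boot(testpp, AUC, 100)` computes; the `boot` package additionally gives bias estimates and the confidence intervals shown above.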
#############################################################################################################################################
# Part 2 threshold and binomial probabilities
##############################################################################################################################################
> # the calculation of binomial and Cohen's Kappa statistics for some example threshold rules
> library(vcd) # provides Kappa()
> # First, the following R code calculates Kappa for the threshold given by the minimum training presence prediction:
>
> confusion <- function(thresh) {
+   return(cbind(c(length(testpp[testpp >= thresh]), length(testpp[testpp < thresh])),
+                c(length(bb[bb >= thresh]), length(bb[bb < thresh]))))
+ }
> mykappa <- function(thresh) {
+ return(Kappa(confusion(thresh)))
+ }
> mykappa(min(trainpp))
value ASE
Unweighted 0.2821367 0.051722953
Weighted 0.2821367 0.001648735
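For reference, the unweighted value that `Kappa()` reports can be reproduced by hand from any 2x2 confusion matrix. A minimal base-R sketch (the `kappa2x2` name and the example counts are made up for illustration):

```r
# Cohen's kappa from a 2x2 confusion matrix by hand (base R only).
kappa2x2 <- function(m) {
  n  <- sum(m)
  po <- sum(diag(m)) / n                    # observed agreement
  pe <- sum(rowSums(m) * colSums(m)) / n^2  # agreement expected by chance
  (po - pe) / (1 - pe)
}

m <- matrix(c(30, 8, 19, 981), 2, 2)  # hypothetical confusion counts
kappa2x2(m)
```

Running `kappa2x2(confusion(thresh))` on a real confusion matrix should match the "Unweighted" value printed above.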
>
> # To use the threshold that maximizes the sum of sensitivity and specificity on the test data,
> # we can do the following, using the true positive rate and false positive rate values
> # from the "performance" object used above to plot the ROC curve:
>
> fpr <- perf@x.values[[1]]
> tpr <- perf@y.values[[1]]
> ss <- tpr + (1 - fpr)
> index <- which.max(ss)
> cutoff <- perf@alpha.values[[1]][[index]]
> mykappa(cutoff) #This gives a kappa value
value ASE
Unweighted 0.3839954 0.055896135
Weighted 0.3839954 0.001412725
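Incidentally, the cutoff computed above maximizes sensitivity + specificity, which is not quite the Nakazato et al. rule; their threshold of *equal* sensitivity and specificity is the cutoff where the true positive rate equals the true negative rate. A minimal base-R sketch of one way to find it, with simulated stand-ins for `testpp` and `bb`:

```r
# Threshold of equal sensitivity and specificity (base R only).
# Simulated stand-ins for the real testpp and bb vectors.
set.seed(1)
testpp <- runif(38, 0.5, 1)    # pretend test presence predictions
bb     <- runif(1000, 0, 0.6)  # pretend background predictions

cand <- sort(unique(c(testpp, bb)))                  # candidate cutoffs
sens <- sapply(cand, function(t) mean(testpp >= t))  # true positive rate
spec <- sapply(cand, function(t) mean(bb < t))       # true negative rate
eq   <- cand[which.min(abs(sens - spec))]            # sens == spec here
```

With the real data, the same idea works directly on the `tpr` and `fpr` vectors from the `performance` object: pick the index minimizing `abs(tpr - (1 - fpr))` instead of maximizing their sum.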
>
> # To determine binomial probabilities for these two threshold values
>
> mybinomial <- function(thresh) {
+   conf <- confusion(thresh)
+   trials <- length(testpp)
+   return(binom.test(conf[[1]][[1]], trials, conf[[1,2]] / length(bb), "greater"))
+ }
> mybinomial(min(trainpp)) #This gives p-values
Exact binomial test
data: conf[[1]][[1]] and trials
number of successes = 38, number of trials = 38, p-value < 2.2e-16
alternative hypothesis: true probability of success is greater than 0.0189
95 percent confidence interval:
0.9241923 1.0000000
sample estimates:
probability of success
1
> mybinomial(cutoff) #This gives p-values
Exact binomial test
data: conf[[1]][[1]] and trials
number of successes = 38, number of trials = 38, p-value < 2.2e-16
alternative hypothesis: true probability of success is greater than 0.012
95 percent confidence interval:
0.9241923 1.0000000
sample estimates:
probability of success
1
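Finally, the binomial probability at any chosen cutoff reduces to a single `binom.test()` call, which is all `mybinomial()` does: successes are the test presences predicted present, and the null success probability is the fraction of background points above the cutoff. A self-contained sketch with simulated stand-ins for `testpp` and `bb` and an arbitrary hypothetical cutoff of 0.55 (e.g. the equal sensitivity-specificity threshold would be plugged in here):

```r
# One-sided exact binomial test at a given threshold (base R only).
# Simulated stand-ins for the real testpp and bb vectors.
set.seed(1)
testpp <- runif(38, 0.5, 1)    # pretend test presence predictions
bb     <- runif(1000, 0, 0.6)  # pretend background predictions

thresh    <- 0.55                 # hypothetical cutoff
successes <- sum(testpp >= thresh)  # test presences predicted present
p0        <- mean(bb >= thresh)     # null success rate from background

# Are more test presences above the cutoff than expected by chance?
res <- binom.test(successes, length(testpp), p0, alternative = "greater")
res$p.value
```

A small p-value, as in the two tests above, indicates the model predicts the test presences far better than random with respect to that threshold.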