With the help of both your great minds I think I (partly) figured it out--though there are still some important differences between the versions.
maxnet() automatically removes presences from the data before calculating raw output (thanks, Pascal!), whereas the default in maxent() is to keep them in. So if we rewrite the code with addsamplestobackground set to "false" for maxent(), the results are *very* similar. I would put the remaining difference down to rounding error, or maybe to the approximation maxnet() uses to solve the inhomogeneous Poisson process, where the value of 100 is used to approximate infinity (the next time someone owes me a dollar I'm going to ask for infinity cents in return since they're approximately the same). So here's the revised code. You can see the results vary between the two models by some very small amount (but see further down for the remaining differences).
# setup
library(maxnet)
library(dismo)
data(bradypus)
data <- bradypus[ , 2:3] # just using these two predictors to make it simple
p <- bradypus$presence
# train models
# using just linear, product, and quadratic features, as hinge and threshold features are calculated differently between the two versions
# note: using addsamplestobackground=false in maxent() because this is default in maxnet()
netModel <- maxnet(p=as.vector(p), data=data, f=maxnet.formula(p=p, data=data, classes='lpq'))
mxModel <- maxent(p=as.vector(p), x=data, args=c('linear=true', 'quadratic=true', 'product=true', 'threshold=false', 'hinge=false', 'addsamplestobackground=false'))
# predict using raw output
netPredRaw <- c(predict(netModel, data, type='exponential'))
mxPredRaw <- predict(mxModel, data, args='outputformat=raw')
# they're now slightly different!
head(netPredRaw, 20)
head(mxPredRaw, 20)
par(mfrow=c(1, 2))
plot(mxPredRaw, netPredRaw, xlim=c(0, max(netPredRaw, mxPredRaw)), ylim=c(0, max(netPredRaw, mxPredRaw)), main='ver 3.4.0 vs ver 3.3.3k')
abline(a=0, b=1)
plot(netPredRaw - mxPredRaw, main='ver 3.4.0 minus ver 3.3.3k')
# does sum of "raw" output equal 1? (calculating across background sites only)
sum(netPredRaw[p == 0])
sum(mxPredRaw[p == 0])
# entropies are slightly different
netModel$entropy
mxModel@lambdas[length(mxModel@lambdas)]
# predict using logistic output
netPredLog <- c(predict(netModel, data, type='logistic'))
mxPredLog <- predict(mxModel, data, args='outputformat=logistic')
# they're slightly different!
head(netPredLog, 20)
head(mxPredLog, 20)
par(mfrow=c(1, 2))
plot(mxPredLog, netPredLog, xlim=c(0, max(netPredLog, mxPredLog)), ylim=c(0, max(netPredLog, mxPredLog)), main='ver 3.4.0 vs ver 3.3.3k') # same axis order as the raw plot: 3.3.3k on x, 3.4.0 on y
abline(a=0, b=1)
plot(netPredLog - mxPredLog, main='ver 3.4.0 minus ver 3.3.3k')
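To put a number on "slightly different" rather than eyeballing the plots, something like the following might help. It is only a sketch: compareVectors() is a made-up helper, not part of maxnet or dismo, and the toy vectors stand in for netPredLog and mxPredLog (or netPredRaw and mxPredRaw).

```r
# hypothetical helper (not from maxnet or dismo):
# summarize how far apart two prediction vectors are
compareVectors <- function(a, b) {
  stopifnot(length(a) == length(b))
  d <- a - b
  c(maxAbsDiff = max(abs(d)),
    meanAbsDiff = mean(abs(d)),
    correlation = cor(a, b))
}

# toy stand-ins for the two sets of predictions
a <- c(0.10, 0.25, 0.40, 0.80)
b <- c(0.11, 0.24, 0.41, 0.79)
compareVectors(a, b)
```

With the real models the call would be compareVectors(netPredLog, c(mxPredLog)).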
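# compare "manual" logistic output of maxnet() (see p. 7 of new tutorial for maxnet()) to logistic output from predict.maxnet()
# logistic = exp(entropy) * raw / (1 + exp(entropy) * raw); the denominator multiplies exp(entropy) by raw, it does not add them
# if predict.maxnet() uses this same formula, the difference should be at the level of rounding error
netPredLogManual <- (exp(netModel$entropy) * netPredRaw) / (1 + exp(netModel$entropy) * netPredRaw)
dev.new() # portable replacement for windows()
plot(netPredLogManual - netPredLog)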
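A minimal sketch of the logistic transform itself, assuming the logistic output is 1 / (1 + exp(-entropy - link)) with raw = exp(link). That assumption is worth checking against the predict.maxnet() source. The entropy and link values below are invented and do not come from the bradypus model; the point is only that the two ways of writing the transform agree.

```r
# made-up values, not taken from the bradypus model
entropy <- 6.5
link <- c(-9, -7.5, -6, -4)   # linear predictor at four hypothetical sites
raw <- exp(link)              # "exponential" (raw) output

# two algebraically equivalent ways of writing the logistic transform
logisticA <- 1 / (1 + exp(-entropy - link))
logisticB <- (exp(entropy) * raw) / (1 + exp(entropy) * raw)

all.equal(logisticA, logisticB)
range(logisticA)
```

If the manual version computed from a fitted model differs from predict(..., type='logistic') by more than floating-point noise, that would point to a different formula rather than to rounding.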
## Note, though, that the coefficients are still different between the models (using Jamie's code, thanks!):
# Jamie: a little function I wrote that makes the lambdas vector in dismo into a data frame for easier querying
lambdasDF <- function(mx) {
  # drop the last four entries of the lambdas vector, which are model constants rather than features
  lambdas <- mx@lambdas[1:(length(mx@lambdas)-4)]
  data.frame(var=sapply(lambdas, FUN=function(x) strsplit(x, ',')[[1]][1]),
             coef=sapply(lambdas, FUN=function(x) as.numeric(strsplit(x, ',')[[1]][2])),
             min=sapply(lambdas, FUN=function(x) as.numeric(strsplit(x, ',')[[1]][3])),
             max=sapply(lambdas, FUN=function(x) as.numeric(strsplit(x, ',')[[1]][4])),
             row.names=1:length(lambdas))
}
library(dplyr)
lambdasDF(mxModel) %>% filter(coef != 0)
netModel$betas
length(netModel$betas)
## I'm guessing the difference is because the maxnet() coefficients are on the scale of the predictors (like using predict() on a glm object but not specifying predict(~~~~, type='response')). So maybe this wouldn't be bothersome, except that the response functions are more than slightly different:
plot(netModel, vars=names(data), type='logistic')
dev.new() # portable replacement for windows()
response(mxModel)
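One possible reason the coefficients don't match, stated as an assumption rather than something checked against either code base: the lambdas vector stores a min and max for each feature, and maxent() may apply each lambda to the feature rescaled to [0, 1] by that range, whereas the maxnet() betas would apply to the unscaled feature. If that is right, a lambda divided by (max - min) should be comparable to the matching beta. A toy illustration with invented numbers for a single linear feature:

```r
# invented numbers for one linear feature; not from either model
x <- c(12, 18, 25, 31, 40)   # predictor values
xMin <- 10
xMax <- 50                   # range stored alongside the lambda
lambda <- 2.4                # coefficient on the rescaled feature

# contribution to the linear predictor using the rescaled feature
scaled <- lambda * (x - xMin) / (xMax - xMin)

# same slope expressed per unit of the raw predictor
beta <- lambda / (xMax - xMin)
unscaled <- beta * x

# the two differ only by a constant offset, which a normalizing term would absorb
unique(round(scaled - unscaled, 10))
```

This would explain differing coefficients but not differing response curves, so it only addresses part of the puzzle.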
Which is the "true" Maxent?
Adam