Raw or Logistic output?


Andros ST

Sep 29, 2020, 1:15:33 AM
to Maxent
Hi, 
I'm trying to justify why I should use the Raw output instead of the Logistic output in Maxent, or vice versa. If I remember correctly, in some papers authors such as Merow et al. and Elith et al. recommend using Raw output rather than Logistic, because the latter makes assumptions about prevalence (which we don't know a priori), and that could distort the results and pull them away from reality. However, Raw output usually gives less dispersed results than Logistic (I think that's also because in Logistic you can modify the regularization parameter). So I understand that Raw seems to produce stricter and more accurate results than Logistic. Nevertheless, I see that most published papers use the Logistic output, and I don't understand why. Choosing the correct (or better) output has become a headache; if someone can help me with this dilemma I would be very grateful.
If you need information about the objective of my work: it consists of building a model for a rare rodent with a restricted distribution determined by tidal marsh habitats.

Andros.

Stênio Foerster

Sep 29, 2020, 1:29:32 AM
to max...@googlegroups.com
That's a good question. Maybe you can get some insights about it in Phillips et al. (2017).



Adam B. Smith

Sep 29, 2020, 5:29:11 PM
to Maxent
I think the reason logistic (or complementary log-log, the default in newer versions) is more common is that it puts predictions on a scale that is commensurate with just about any other SDM out there. I can't recall a single paper I have read that showed maps where predicted values were outside [0, 1]. Also, with raw output the limits are not readily apparent (e.g., is 10^-6 high or low?), whereas they are with logistic/cloglog. Merow says you should use raw when mapping ranks of locations, but the logistic and cloglog transforms preserve the ranking. The raw output tends to visually exaggerate the rank differences between sites compared to logistic/cloglog, so that's why Merow recommended it for these kinds of maps. Logistic and cloglog compress values near the top and bottom of the scale, so it's hard to differentiate, say, very good versus very, very good sites.
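To make the ranking point concrete, here is a minimal sketch in Python. It assumes the commonly described forms of the two transforms, logistic = c*raw / (1 + c*raw) and cloglog = 1 - exp(-c*raw), where c is a positive constant that Maxent derives from the fitted model. The value of `SCALE`, the function names, and the raw values below are all made up for illustration and do not come from any real model.

```python
import math

# Hypothetical scaling constant. In Maxent it comes from the fitted model;
# any positive value is enough to illustrate the behaviour.
SCALE = 1e5


def logistic_from_raw(raw, c=SCALE):
    """Assumed logistic-style rescaling; strictly increasing in raw for c > 0."""
    return c * raw / (1.0 + c * raw)


def cloglog_from_raw(raw, c=SCALE):
    """Assumed cloglog-style rescaling; strictly increasing in raw for c > 0."""
    return 1.0 - math.exp(-c * raw)


def rank_order(values):
    """Return site indices sorted from lowest to highest value."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Made-up raw outputs for seven sites, deliberately unsorted.
raw = [1e-4, 1e-7, 5e-5, 1e-6, 5e-6, 5e-7, 1e-5]
logistic = [logistic_from_raw(r) for r in raw]
cloglog = [cloglog_from_raw(r) for r in raw]

# Both transforms are monotone, so the site ranking is identical.
same_ranking = rank_order(raw) == rank_order(logistic) == rank_order(cloglog)
print("same ranking:", same_ranking)

# Compare the two best sites: a large relative difference in raw output
# becomes a small absolute difference after either transform.
best, second = sorted(raw, reverse=True)[:2]
raw_ratio = best / second
logistic_gap = logistic_from_raw(best) - logistic_from_raw(second)
cloglog_gap = cloglog_from_raw(best) - cloglog_from_raw(second)
print("raw ratio of top two sites:", raw_ratio)
print("logistic gap:", logistic_gap)
print("cloglog gap:", cloglog_gap)
```

Under these assumptions the ordering of sites never changes, while the gap between the two best sites shrinks on the transformed scales, which is the compression described above.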

Best,
Adam
~~~~~~~~~~~~~~~~~
Adam B. Smith, Associate Scientist in Global Change

Andros ST

Oct 14, 2020, 8:19:32 PM
to max...@googlegroups.com
Thank you all, I'm starting to understand why logistic output is more common, but still, isn't it risky to make the assumptions that the logistic output makes?


Adam B. Smith

Oct 15, 2020, 12:33:10 PM
to Maxent
Hi Andros,

The main assumption behind logistic output is that occurrences represent detections with effort that guarantees ~50% detection when the species is actually present (it's a little different for complementary log-log). You can change this value (the 50%) when you run the model and make predictions (at least in the jar; I haven't tried in R). Regardless, it's not really recommended to interpret logistic or cloglog output as a real probability. You need real presence/absence data for that. Rather, if the model is good, its output is presumably correlated with the real probability of presence. Again, though, the rank order of sites (best to worst) will be the same regardless of whether you use raw or logistic/cloglog output.
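As a rough illustration of what changing the 50% value does, here is a small Python sketch. It assumes the logistic output has the form tau*c*raw / (1 - tau + tau*c*raw), where `tau` is the adjustable value (0.5 by default) and `c` is a positive model-derived constant. The name `tau`, the value of `c`, and the raw values are hypothetical and chosen only for illustration; I believe `tau` corresponds to the "default prevalence" setting in Maxent, but treat that as an assumption.

```python
def logistic_with_tau(raw, c, tau=0.5):
    """Assumed logistic output with an adjustable prevalence-like value tau."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    return tau * c * raw / (1.0 - tau + tau * c * raw)


c = 1e5  # hypothetical scaling constant, not from a real model
raw_sites = [1e-6, 1e-5, 1e-4]  # made-up raw outputs, worst to best

results = {}
for tau in (0.2, 0.5, 0.8):
    results[tau] = [logistic_with_tau(r, c, tau) for r in raw_sites]
    print(tau, [round(v, 3) for v in results[tau]])
```

In this sketch the middle site has c*raw = 1, so its logistic value equals whatever `tau` is chosen. The absolute values shift with `tau`, but the order of the sites stays the same, which is why the numbers are better read as a relative index than as a true probability of presence.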

Adam
