Maxent does a neat trick: it first estimates the probability of the environment given presence, by comparing the frequency of environmental values at presence sites to their frequency across the landscape (the background sites). It then uses Bayes' theorem to invert this to get the probability of occurrence given the environment (or an index thereof, since you can't actually estimate the probability of occurrence without absence data). You can see from the first step that you need an estimate of the frequency of each variable's values in the background. Absence data would not give you an unbiased estimate of those frequencies across the entire landscape, because absences sample only the places where the species was not found.
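Here is a rough sketch of that two-step logic in Python, in case it helps. It is not Maxent's actual algorithm (Maxent fits a regularized maximum-entropy distribution over features of continuous variables); it only shows the ratio P(environment | presence) / P(environment) for a made-up landscape with three climate classes. The function name and all the counts are hypothetical.

```python
from collections import Counter

def relative_occurrence_index(presence_env, background_env):
    """Return P(env | presence) / P(env) for every class seen in the background.

    By Bayes' theorem this ratio is proportional to P(presence | env);
    the unknown constant is the species' overall prevalence.
    """
    presence_counts = Counter(presence_env)
    background_counts = Counter(background_env)
    n_presence = len(presence_env)
    n_background = len(background_env)

    index = {}
    for env, count in background_counts.items():
        p_env_given_presence = presence_counts.get(env, 0) / n_presence
        p_env = count / n_background  # frequency across the landscape
        index[env] = p_env_given_presence / p_env
    return index

# Hypothetical data: background sampled from the whole landscape,
# presences concentrated in the cold class.
background = ["cold"] * 20 + ["mild"] * 50 + ["hot"] * 30
presences = ["cold"] * 9 + ["mild"] * 1

index = relative_occurrence_index(presences, background)
for env in ("cold", "mild", "hot"):
    print(env, round(index[env], 2))
```

Values above 1 mean the class is used more often than its availability in the landscape would suggest; values below 1 mean it is used less.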
In some cases we use biased background sites to cancel out bias in the occurrences, but otherwise the background data should still be "background" (not absences). To see why, consider the case where a species occupies all habitats of a special type and none outside it. If we used absences as the background, they would come only from environments the species cannot live in. The environments found at the presences would then never appear in the "background" (absence) data, so Maxent would have no background frequency against which to compare them when calculating the probability of the environment given occurrence. In practice, Maxent will still run if you use absences, but the output is not mathematically robust.
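The special-habitat case can be sketched the same way. Again, this is only an illustration with invented numbers and hypothetical helper names, not what Maxent does internally: it shows that the habitat the species actually uses has a well-defined frequency in a true background but no frequency at all in an absence-only "background".

```python
from collections import Counter

def env_frequencies(sites):
    """Relative frequency of each environment class among the given sites."""
    counts = Counter(sites)
    total = len(sites)
    return {env: n / total for env, n in counts.items()}

def use_vs_availability(env, presence_freq, background_freq):
    """P(env | presence) / P(env), or None if env never occurs in the background."""
    if env not in background_freq:
        return None  # nothing to compare against
    return presence_freq.get(env, 0.0) / background_freq[env]

# Hypothetical landscape: 10 "special" sites (all occupied), 90 "other" sites (all empty).
landscape = ["special"] * 10 + ["other"] * 90
presences = [env for env in landscape if env == "special"]
absences = [env for env in landscape if env != "special"]

presence_freq = env_frequencies(presences)
true_background = env_frequencies(landscape)
absence_background = env_frequencies(absences)

ratio_true = use_vs_availability("special", presence_freq, true_background)
ratio_absence = use_vs_availability("special", presence_freq, absence_background)

print("true background:", ratio_true)
print("absences as background:", ratio_absence)
```

With the true background the ratio for the special habitat is finite; with absences standing in for the background it is undefined, which is the problem described above.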
A more intuitive way to think about what Maxent does is to consider the question, "I just saw a polar bear. What kind of climate am I in?" Unless I'm in a zoo, I'm almost certainly in a very cold climate. You figured this out by knowing 1) that polar bears prefer cold; and 2) that there are cold places on Earth (the background available to polar bears). So now you know the "probability" that I'm in a cold environment given that I saw a polar bear. But then you invert the question (the Bayes part): "I'm in a very cold place--could I see a polar bear?" Obviously, there are caveats (you could be at the top of a temperate/tropical mountain or in the Antarctic), but that's where the study design comes in (e.g., selecting only accessible areas from which to draw background points).
I hope that helps,
Adam
~~~~~~~~~~~~~~~~~~
Adam B. Smith, Ph.D.
enmSdm: An R package for better, faster modeling of niches and distributions