Make predictions on the logit scale in unmarked


Joshua Jones

Jul 19, 2021, 3:55:02 PM
to unmarked
I have occupancy predictions produced with predict() in the unmarked package. However, for the next stage of my analysis I need these predictions on the logit scale, whereas predict() returns them on the probability scale. Does anyone know how to make predictions from an occupancy model on the logit scale?

Ken Kellner

Jul 19, 2021, 3:56:39 PM
to unmarked
Set the backTransform argument to FALSE.

predict(..., backTransform=FALSE)
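Expanding that into a fuller sketch (here `fm` and `nd` are hypothetical names for a fitted occu() model and the newdata data frame; substitute your own objects):

```r
library(unmarked)

# `fm` is a fitted occupancy model from occu(); `nd` is the newdata
# data.frame used for the original predictions (both placeholder names)
pr_logit <- predict(fm, type = "state", newdata = nd, backTransform = FALSE)

# sanity check: applying the inverse-logit to these values should
# recover the probability-scale predictions from the default call
head(plogis(pr_logit$Predicted))
```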

Joshua Jones

Jul 19, 2021, 4:09:49 PM
to unmarked
Worked perfectly, thank you! One follow-up: is there any reason why the predictions unmarked produces on the logit scale are three orders of magnitude larger than those produced by a binary GLM fit to the naive occupancy values?

Ken Kellner

Jul 19, 2021, 4:15:49 PM
to unmarked
I'm guessing you have very low estimated detection probabilities, resulting in estimated occupancy being much higher than naive occupancy (on both scales). It could also be a result of the model-fit problems you mentioned in a previous post (indicated by NaNs), which looked like they were related to the occupancy covariate. It's hard to say without seeing the model results and/or the input data.

Ken

Joshua Jones

Jul 19, 2021, 5:14:00 PM
to unmarked
Hi Ken,

Just to see if my models are usable, here is the output for the unmarked model:

Head
   Predicted       SE     lower    upper acd
1   5.323980 2.494605 0.4346445 10.21332   4
2   6.757680 3.034477 0.8102155 12.70515   5
3   8.191380 3.579660 1.1753754 15.20739   6
4   9.625081 4.128052 1.5342480 17.71591   7
5  11.058781 4.678523 1.8890441 20.22852   8
6  12.492481 5.230418 2.2410505 22.74391   9

Tail
    Predicted       SE    lower    upper acd
296  428.2655 166.7728 101.3968 755.1342 299
297  429.6992 167.3300 101.7384 757.6600 300
298  431.1329 167.8872 102.0801 760.1857 301
299  432.5666 168.4444 102.4217 762.7115 302
300  434.0003 169.0016 102.7633 765.2373 303
301  435.4340 169.5587 103.1050 767.7630 304

And here they are for the binary GLM:

Head
  acd      pred        se   upperCI   lowerCI      family
1   4 -3.327853 0.6340353 -2.085144 -4.570562 Buprestidae
2   5 -3.321068 0.6306797 -2.084935 -4.557200 Buprestidae
3   6 -3.314282 0.6273319 -2.084711 -4.543853 Buprestidae
4   7 -3.307496 0.6239923 -2.084472 -4.530521 Buprestidae
5   8 -3.300711 0.6206609 -2.084216 -4.517206 Buprestidae
6   9 -3.293925 0.6173377 -2.083943 -4.503907 Buprestidae

Tail
    acd      pred        se   upperCI   lowerCI      family
296 299 -1.326107 0.7475176 0.1390279 -2.791241 Buprestidae
297 300 -1.319321 0.7510805 0.1527968 -2.791439 Buprestidae
298 301 -1.312535 0.7546481 0.1665749 -2.791646 Buprestidae
299 302 -1.305750 0.7582204 0.1803621 -2.791862 Buprestidae
300 303 -1.298964 0.7617972 0.1941582 -2.792087 Buprestidae
301 304 -1.292179 0.7653785 0.2079632 -2.792321 Buprestidae

Looking at the models, the occupancy isn't higher overall, but it is more spread out. I'd assume this could be due to low detection probabilities? And if so, would I be correct in saying the unmarked models are a truer reflection of the actual data than the naive binary GLMs?

Ken Kellner

Jul 21, 2021, 10:29:12 AM
to unmarked
To make a more informed assessment I'd also need to see the summary output (parameter estimates) from your final fitted model; I'm not sure if you are using the one from your previous post or a new one. But the short answer is no: based solely on this output, I wouldn't use the model.

Even the smallest linear predictor you've generated with predict (~5) corresponds to an occupancy probability of ~1. When that's the case I would not be very confident in the parameter estimates associated with the occupancy model: even though the linear predictor ranges from about 5 to 435, the range in actual occupancy is nonexistent, because every site is effectively 1. The model is going to really struggle to estimate the effect of acd when it thinks every site is occupied.
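You can see this by applying R's inverse-logit function, plogis(), to the endpoints of your predicted range:

```r
# inverse-logit of the smallest and largest predicted values
plogis(5.32)   # already about 0.995
plogis(435.43) # indistinguishable from 1 at machine precision
```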

I'm guessing the SEs of the parameter estimates are very large as well, which would explain the huge errors around the predictions here. That suggests to me that there is some kind of issue with the input data, either the response or the covariate. I would take three steps here. First, fit the model without a covariate on occupancy (i.e. without acd) and see if the occupancy estimates you get back from predict() are more reasonable. Then fit the model with acd again, but scale it to a Z-score first: acd takes on some very large absolute values, which might be why occu() struggles to get a good parameter estimate.
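A sketch of those first two steps, assuming `umf` is the (hypothetical) unmarkedFrameOccu object used to fit the original model:

```r
library(unmarked)

# step 1: intercept-only model (detection formula first, then occupancy)
fm0 <- occu(~1 ~1, data = umf)
backTransform(fm0, type = "state")   # overall occupancy estimate

# step 2: refit with acd standardized to a Z-score
siteCovs(umf)$acd_z <- as.numeric(scale(siteCovs(umf)$acd))
fm1 <- occu(~1 ~acd_z, data = umf)
summary(fm1)
```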

Finally, I would take another look at the response (y) data. Is there at least one detection at every site? Based on your naive model I'm guessing the answer is no, but I wanted to double-check.
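One quick way to check, again assuming your data are in the hypothetical `umf` object:

```r
# detection histories: one row per site, one column per visit
y <- getY(umf)

# number of sites with zero detections across all visits
sum(rowSums(y, na.rm = TRUE) == 0)
```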

Ken