Choice of response variable in NPMR


Daniel Palacios

Aug 25, 2016, 9:06:17 PM
to HyperNiche and NPMR
I'm trying to decide between two possible response variables in my NPMR model and I could use some feedback. The attached scatterplot matrices show the response variable (bState or pARS) along the bottom row with respect to a selection of terrain variables. I have a good handle on the predictors (2nd row and above in the scatterplot matrices attached), including dealing with collinear variables, so I'd like to focus the discussion on the distributional shape of the response variable (lower-right density panel) and how it would impact the model.

The two options for response variable are:
- bState: the mean behavioral state of all animals in a SU, a continuous value between 0 and 1, where 0 is "pure transiting" and 1 is "pure residence", and
- pARS: the mean proportion of "residents" to the total ("transients+residents") in the SU. I should note that the input values for pARS are essentially a discretized version of bState (as you can see along the bottom row of panels of the pARS plots attached).

My main question is: given the distributional shapes of these variables, can the group comment on which one would be the better choice? I have run preliminary models, and bState gives a somewhat higher xR2 than pARS, but the difference is not substantial. The selection of important predictors in the two models is also similar, but not identical. My interest in using pARS over bState is that it is conceptually easier to explain as a proportion, while bState is a little harder to justify biologically, since in principle an animal should be either transiting (0) or resident (1), not somewhere along that gradient.

A related question is whether I should transform pARS. Since this is a proportion, my understanding is that the arcsine square-root transformation is appropriate. The density distribution of the transformed pARS does not change much, but it is shifted to the right a little. A preliminary model run with the transformed pARS actually results in an xR2 that is a tiny bit lower than the run with the untransformed pARS. Is there a compelling reason why I should transform, or is it okay to leave it untransformed?
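For anyone following along, the transformation in question can be sketched in a few lines of Python (a minimal illustration, not HyperNiche's internal code; the function name `arcsine_sqrt` is mine):

```python
import numpy as np

def arcsine_sqrt(p):
    """Arcsine square-root transform for proportions in [0, 1].

    Spreads out values near 0 and 1 and compresses the middle,
    as described in McCune & Grace (2002). The result lies in
    [0, pi/2]; multiply by 2/pi to rescale back to [0, 1].
    """
    p = np.asarray(p, dtype=float)
    if np.any((p < 0) | (p > 1)):
        raise ValueError("proportions must be in [0, 1]")
    return np.arcsin(np.sqrt(p))

pARS = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
pARSasr = arcsine_sqrt(pARS)  # endpoints map to 0 and pi/2
```

Note that the transform is monotonic, so it changes the spacing of the values, not their ordering.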

Your thoughts are much appreciated.

Daniel

dmtrx_grid_scatter_bstate_bathy.png
dmtrx_grid_scatter_pARS_bathy.png

Daniel Palacios

Aug 26, 2016, 4:17:52 PM
to HyperNiche and NPMR
To provide a little more detail, I'm attaching here the scatterplot matrix containing the arcsine-squareroot transformed version of pARS (pARSasr). As per McCune and Grace (2002), the effect is that it spreads the ends and compresses the middle of the variable (relative to the density of pARS in the 2nd plot I sent yesterday).

For simple illustration, I have run NPMRs for a purely spatial-coordinates model (longitude x latitude), with the following xR2 results:
bState ~ lon x lat; xR2 = 0.15
pARS ~ lon x lat; xR2 = 0.12
pARSasr ~ lon x lat; xR2 = 0.11

The predicted 2-D surfaces for pARS and pARSasr look fairly similar, with interesting structure/detail, while the predicted 2-D surface for bState looks more uniform and smooth. See the 3 graphs attached.

Based on these results, and presuming that the shape of the pARSasr response variable is reasonable as input for NPMR, I think I'm going to go with this response variable in my modeling. I am willing to forgo the tiny reduction in xR2.

Thoughts?

Thanks,

Daniel
dmtrx_grid_scatter_pARSasr_bathy.png
SU538_grph_bstate_lonlat.jpg
SU538_grph_pars_lonlat.jpg
SU538_grph_parsasr_lonlat.jpg

Bruce McCune

Aug 27, 2016, 10:59:31 AM
to hyper...@googlegroups.com
Daniel, To my eye, the biggest difference between the response variables is that pARS has large piles of points on 0 and 1, while bState has a more uniform distribution. That clumping of response values apparently results in HyperNiche choosing a narrower tolerance for pARS, which in turn results in bumpier response surfaces.

You make a reasonable case for pARS, all else being equal. But it does appear that pARS isn't quite as strongly related to your predictors as bState. What I would try is forcing a more conservative model for pARS, setting the tolerance to give a relatively non-bumpy surface, similar to bState, THEN evaluate the difference in fit. I would expect this to drop the xR2 for pARS even further, but it would be interesting to know how much.

You can make the model more conservative by either hand tweaking the tolerances (smoothing parameters) to higher values or by setting more conservative model fitting parameters (in particular, increasing the minimum average neighborhood size).
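The role of the tolerance can be illustrated with a toy version of NPMR's Gaussian local-mean estimator (a sketch under stated assumptions, not HyperNiche's implementation; the names `npmr_gaussian_estimate`, `tol`, and `n_star` are mine, with `tol` playing the role of the smoothing parameter and `n_star` the neighborhood size N*):

```python
import numpy as np

def npmr_gaussian_estimate(x0, X, y, tol):
    """Toy NPMR estimate at point x0: a local mean of y with Gaussian
    weights w_i = prod_j exp(-0.5 * ((X[i, j] - x0[j]) / tol[j])**2).

    Returns (estimate, n_star), where n_star = sum(w) is the
    neighborhood size. Larger tolerances -> wider neighborhoods ->
    smoother fitted surfaces.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    z = (X - np.asarray(x0, dtype=float)) / np.asarray(tol, dtype=float)
    w = np.exp(-0.5 * np.sum(z**2, axis=1))
    n_star = w.sum()
    return float(np.dot(w, y) / n_star), float(n_star)

# one predictor, five sample units
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0.0, 0.0, 1.0, 1.0, 1.0]
est_narrow, n_narrow = npmr_gaussian_estimate([0.0], X, y, tol=[0.5])
est_wide, n_wide = npmr_gaussian_estimate([0.0], X, y, tol=[10.0])
```

With the narrow tolerance the estimate at x = 0 tracks the nearby responses (close to 0, a bumpier surface); with the wide tolerance it shrinks toward the overall mean of y (a smoother, more conservative surface).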

As for arcsine square-root transforming pARS -- I don't see much difference, so using the principle of minimal transformation unless necessary, I'd probably stick with straight pARS. Besides, in this case it seems to have pushed the data slightly towards "trinary" (if there is such a word), with 3 values, zero, middle (0.5), and 1.

Bruce McCune



--
You received this message because you are subscribed to the Google Groups "HyperNiche and NPMR" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hyperniche+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Daniel Palacios

Aug 27, 2016, 4:33:44 PM
to HyperNiche and NPMR
Bruce: Thank you for your helpful response.

Interestingly, from my experience with these data my sense is that the predicted 2-D surface for bState is a bit too smooth while the bumps and dimples in pARS seem to better capture the observed spatial pattern. I also note that the spatial domain in the bState prediction extends beyond the boundaries of the data (as can be seen in the extreme north, where there are few SUs), while the predicted surface for pARS stays closer to the domain of the data -- likely a result of HyperNiche choosing a wider tolerance for bState.

So I have a couple of questions:

Is HyperNiche's choice for a narrower tolerance for pARS undesirable? And,

Would it make sense to do the reverse of your suggestion, i.e., to choose a narrower tolerance for bState such that the predicted surface is bumpier and then compare with the fit for pARS?

So far I have chosen the "aggressive" setting in all models I have tried and have not otherwise tweaked any of the parameters.

Thanks!

Daniel

Bruce McCune

Aug 27, 2016, 7:35:20 PM
to hyper...@googlegroups.com
Ah, ok, if the bumps and dimples make sense then yes, it's fine, and the narrower tolerances are not undesirable.

Likewise, you can adjust the settings to make bState fit the spatial data more closely.

Note that the degree of extrapolation in the fitted response is controlled by the minimum neighborhood size for making an estimate, so you can set that in the step of generating a response surface, independent of the minimum AVERAGE neighborhood size for the model overall.
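That control can be sketched by extending the toy local-mean idea: simply withhold the prediction at any grid point whose neighborhood size N* falls below the threshold (an illustrative sketch with hypothetical names, not HyperNiche's code):

```python
import numpy as np

def guarded_estimate(x0, X, y, tol, min_n_star):
    """Gaussian-weighted local mean at x0, or None when the
    neighborhood size N* = sum of weights is below min_n_star.

    Raising min_n_star trims the predicted surface back toward the
    data, limiting extrapolation into data-poor regions.
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    z = (X - np.asarray(x0, dtype=float)) / np.asarray(tol, dtype=float)
    w = np.exp(-0.5 * np.sum(z**2, axis=1))
    n_star = w.sum()
    if n_star < min_n_star:
        return None  # withhold estimate: too little data nearby
    return float(np.dot(w, y) / n_star)

X = [[0.0], [1.0], [2.0]]
y = [0.0, 1.0, 0.0]
near = guarded_estimate([1.0], X, y, tol=[1.0], min_n_star=1.0)   # inside the data
far = guarded_estimate([10.0], X, y, tol=[1.0], min_n_star=1.0)   # far from the data
```

Here `near` gets an estimate while `far` is withheld, which is the behavior Daniel observed in the pARS surface staying closer to the data domain.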

Bruce


Daniel Palacios

Aug 30, 2016, 2:18:54 PM
to HyperNiche and NPMR
Bruce: thank you for your helpful practical advice. I will continue experimenting with the fitting controls, and if I have further questions I'll get back to the group.

For now I had a couple of "feature requests" for future versions of HyperNiche:

a) The possibility of inserting a transformed variable as a new column in the current data matrix. I find the ability to have more than one response variable in the same response matrix a great time and space saver in terms of generating models and having their results together, as this facilitates comparisons especially in the early phases of modeling. Right now having the most common transformations (in my case the arcsine-squareroot) built into HyperNiche is fantastic, but unless I'm missing something the results need to be saved as a new response matrix.

b) For Screen Predictors, a way of selecting/deleting all but the best models for N predictors, analogous to what is available for the Free Search output. The results from Screen Predictors (i.e., univariate fits) can be very illuminating in the early phases, but I didn't see a way of deleting all but the best model for each predictor, so I had to export the full results to a spreadsheet, sort them by predictor and by xR2 in descending order, and then delete all but the best model for each predictor to obtain a clean table for ranking the predictors.

I'm not a very experienced user of HyperNiche (although I'd like to become one), so please ignore this if these features are available and I just didn't find them.

Thanks,

Daniel

Bruce McCune

Aug 31, 2016, 5:11:34 PM
to hyper...@googlegroups.com
Good suggestions, Daniel. I will add those to the list.
Bruce


Daniel Palacios

Aug 31, 2016, 8:14:07 PM
to HyperNiche and NPMR
Bruce:

Where you say "hand tweaking the tolerances (smoothing parameters) to higher values" are you talking about manually adjusting the "Minimum Neighborhood Size For Estimate" setting under the "Output Options" tab? I just couldn't find other settings that would control tolerances/smoothing parameters by these names.

Thanks,

Daniel



Bruce McCune

Sep 2, 2016, 10:38:49 PM
to hyper...@googlegroups.com
Daniel, there are two ways to adjust the model. One is that you can directly edit the model tolerances (Edit | Add model, then specify the tolerances you want for the variables that you want).

The other involves rerunning the free search but changing the overfitting controls to "manual", setting the minimum average neighborhood size to an appropriate value. For example, if you want to try a more aggressive model than the "aggressive" setting, you could set the tolerance somewhat smaller than that used in the aggressive setting.

Bruce



Daniel Palacios

Sep 6, 2016, 7:33:16 PM
to HyperNiche and NPMR
Thanks, Bruce. This is very helpful.

Daniel