Daniel,
To my eye, the biggest difference between the response variables is that pARS has large piles of points on 0 and 1, while bState has a more uniform distribution. That clumping of response values apparently leads HyperNiche to choose a narrower tolerance for pARS, which in turn produces bumpier response surfaces.
You make a reasonable case for pARS, all else being equal. But it does appear that pARS isn't quite as strongly related to your predictors as bState. What I would try is forcing a more conservative model for pARS, setting the tolerance to give a relatively non-bumpy surface, similar to bState, THEN evaluate the difference in fit. I would expect this to drop the xR2 for pARS even further, but it would be interesting to know how much.
You can make the model more conservative either by hand-tweaking the tolerances (smoothing parameters) to higher values or by setting more conservative model-fitting parameters (in particular, increasing the minimum average neighborhood size).
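To make the link between tolerance, neighborhood size, and bumpiness concrete, here is a minimal sketch in Python. It is not HyperNiche's code: it assumes a single predictor, a Gaussian weight whose standard deviation plays the role of the tolerance, a local-mean estimate, and leave-one-out estimates for a cross-validated R2. The data and all function names are made up for illustration.

```python
import math

def gaussian_weight(distance, tolerance):
    """Weight of a neighbor; tolerance is the SD of the Gaussian (assumed form)."""
    return math.exp(-0.5 * (distance / tolerance) ** 2)

def loo_local_mean(x, y, tolerance):
    """Leave-one-out local-mean estimates plus the average neighborhood size."""
    estimates, sizes = [], []
    for i, xi in enumerate(x):
        # The target point gets zero weight, so it never predicts itself.
        weights = [0.0 if j == i else gaussian_weight(xi - xj, tolerance)
                   for j, xj in enumerate(x)]
        total = sum(weights)  # neighborhood size for this target point
        sizes.append(total)
        estimates.append(sum(w * yj for w, yj in zip(weights, y)) / total)
    return estimates, sum(sizes) / len(sizes)

def cross_validated_r2(y, estimates):
    """1 - RSS/TSS, using the leave-one-out estimates."""
    mean_y = sum(y) / len(y)
    rss = sum((yi - ei) ** 2 for yi, ei in zip(y, estimates))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - rss / tss

# Hypothetical data: a response clumped on 0 and 1 along one gradient.
x = [i / 19 for i in range(20)]
y = [0, 0, 0, 0, 0, 1, 0, 0, 0.5, 0, 1, 0.5, 1, 1, 0, 1, 1, 1, 1, 1]

for tolerance in (0.05, 0.30):
    estimates, avg_size = loo_local_mean(x, y, tolerance)
    xr2 = cross_validated_r2(y, estimates)
    print(f"tolerance={tolerance:.2f}  avg neighborhood={avg_size:.2f}  xR2={xr2:.3f}")
```

In this sketch, raising the tolerance always raises the average neighborhood size, because every weight grows with the tolerance. That is why either route (wider tolerances, or a higher minimum average neighborhood size) should give a smoother surface.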
As for arcsine square-root transforming pARS -- I don't see much difference, so following the principle of not transforming unless necessary, I'd probably stick with straight pARS. Besides, in this case it seems to have pushed the data slightly towards "trinary" (if there is such a word), with three values: 0, a middle value (0.5), and 1.
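For reference, a small Python sketch of the arcsine square-root transform. The 2/pi rescaling, which keeps the result on the 0 to 1 range, is an assumption about the variant used here; the function name is made up for illustration.

```python
import math

def arcsine_sqrt(p, rescale=True):
    """Arcsine square-root transform of a proportion p in [0, 1].

    With rescale=True the result is multiplied by 2/pi so that it also
    lies in [0, 1] (assumed variant); otherwise it is in radians.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    value = math.asin(math.sqrt(p))
    return value * 2.0 / math.pi if rescale else value

# Hypothetical proportions, including the clumped values 0 and 1.
for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p, round(arcsine_sqrt(p), 4))
```

Under this scaling the values 0, 0.5, and 1 map to themselves, so the piles on 0 and 1 stay exactly where they are. Only the values in between move, which fits with the transform making little visible difference.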
Bruce McCune