Stephen:
In your described scenario, I would suggest attributing to each business point the characteristics of the neighborhood it falls within (a point-in-polygon spatial join does that nicely). It is certainly true that businesses near a neighborhood's borders might be influenced by neighboring neighborhoods, but we also cannot rule out that even businesses at the dead center of a neighborhood are influenced by other neighborhoods. The argument here is not perfect, but practical: if a business falls within a neighborhood, we take the chance of assigning that neighborhood's characteristics to the business, attribute the other neighborhoods' influence to the stochastic process, and let the residuals capture that effect. Spatial autoregressive models (SAR/SEM) can then be applied to see whether such a design makes sense. After all, models are tools to facilitate the interpretation of the underlying research question; it is up to the scholar (us) to interpret the modeled result.
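For concreteness, here is a minimal R sketch of that workflow with sf/spdep/spatialreg. The object names (businesses, neighborhoods), the covariates, and the continuous outcome are placeholders rather than anything from your data, so treat it as an illustration only:

    library(sf)
    library(spdep)
    library(spatialreg)

    ## point-in-polygon spatial join: each business inherits the attributes
    ## of the neighborhood polygon it falls within
    businesses <- st_join(businesses, neighborhoods, join = st_within)

    ## a simple neighbour definition among the business points
    ## (k nearest neighbours); any reasonable choice will do for a first pass
    nb <- knn2nb(knearneigh(st_coordinates(businesses), k = 6))
    lw <- nb2listw(nb, style = "W")

    ## SAR (spatial lag) and SEM (spatial error) fits to check whether the
    ## "assign-the-containing-neighborhood" design leaves spatial structure
    ## behind; note these assume a continuous outcome variable
    sar_fit <- lagsarlm(outcome ~ nbhd_income + nbhd_density,
                        data = businesses, listw = lw)
    sem_fit <- errorsarlm(outcome ~ nbhd_income + nbhd_density,
                          data = businesses, listw = lw)
    summary(sar_fit)
    summary(sem_fit)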
Beyond SAR/SEM, trying SpatialFiltering in R (originally in spdep, now in the spatialreg package) is another way to look at things.
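A similarly hedged sketch of that eigenvector-filtering route, reusing the neighbour graph from the sketch above (SpatialFiltering expects a symmetric neighbours object, and the variable names are again placeholders):

    ## symmetrise the k-nearest-neighbour graph before filtering
    nb_sym <- make.sym.nb(nb)

    ## select Moran eigenvectors that absorb the residual spatial pattern
    ev_sel <- SpatialFiltering(outcome ~ nbhd_income + nbhd_density,
                               data = businesses, nb = nb_sym, style = "W")

    ## add the selected eigenvectors as extra covariates in an ordinary lm
    filtered_fit <- lm(outcome ~ nbhd_income + nbhd_density + fitted(ev_sel),
                       data = businesses)
    summary(filtered_fit)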
Hope this helps.
Best,
Danlin
--
Danlin Yu, Ph.D.
Professor of GIS and Urban Geography
Department of Earth & Environmental Studies
Montclair State University
Montclair, NJ 07043
Tel: 973-655-4313
Fax: 973-655-4072
Office: CELS 314
Email: y...@mail.montclair.edu
Webpage: csam.montclair.edu/~yu

--
Dear Stephen,
I don't know who your audience is or what your purposes are for making these estimates.
With those caveats aside, I'd advise starting with the simplest estimates you can. I suspect in your case this would be assigning polygon values to each business and then running a logit/probit estimate. Even then you'd have to account for spatial autocorrelation, but how much that matters would depend on the sizes of your polygons and the spatial pattern of your businesses.
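A bare-bones sketch of that first pass in R, assuming a binary survival indicator and placeholder covariate names carried over from the polygon join:

    ## simplest version: polygon attributes already joined onto each business,
    ## plain logit on the pooled data, spatial structure ignored for now
    simple_logit <- glm(survived ~ nbhd_income + nbhd_density + firm_age,
                        family = binomial(link = "logit"),
                        data = businesses)
    summary(simple_logit)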
You can also get carried away with resampling methods and the like, largely because there may be no analytical solutions for your estimating statistics. But ask yourself what difference this would make.
Even commonly used OLS estimators are only "BLUE" (Best Linear Unbiased Estimators), that is, best within the class of linear unbiased estimators, and the preference for linearity comes from a combination of analytical tractability and the limitations of 20th-century computing power. In other words, they rest on mathematically arbitrary, pragmatic concessions.
In my experience, seeking the best possible estimator is often a case of the juice not being worth the squeeze: the substantive interpretation of the more complicated results does not differ from that of the simple, but mathematically flawed, results. This is especially true if your intended audience is unfamiliar with the math and you have no understandable way to explain your methods. With logit models one can display logistic curves, but other methods may have nothing similar.
Additionally, with logit models I have most often found little difference between them and simpler methods. I always run OLS or GLS models first; typically the substantive interpretation of the OLS results is identical to that of the GLS and MLE-logit results. If not, then that in itself is a methodological finding about the circumstances under which the simpler-but-wrong method leads to substantively wrong results.
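If it helps, that side-by-side check need be nothing fancier than a linear probability model on the same right-hand side as the logit (placeholder names again, just an illustration):

    ## OLS linear probability model on the same specification as the logit
    lpm_fit <- lm(survived ~ nbhd_income + nbhd_density + firm_age,
                  data = businesses)
    ## compare signs and rough significance with the logit fit above
    summary(lpm_fit)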
I'm not saying don't attempt something more complicated: just start with baby steps. If you go further, I think Miguel's suggestion about point processes is a good one.
But if you go to such trouble, be cautious with R. Years ago I estimated a GLM in S but was suspicious of the results: they were too close to the OLS estimates. So I took a textbook GLM and estimated it, and the results differed from the textbook's. Thankfully, S (and R) allows access to the source code, and by inspecting it I found a bug. After fixing it, both the textbook model and my original were correct. I'm currently working to modify an existing R package to analyze time-series data. You may have to do something similar.
I'd add that, IIRC, David Birch found the half-life of small businesses to be about five years. There are so many considerations like this (e.g., spatio-temporal status of the macro economy, firm niches in industrial sectors, etc.) that you may run out of degrees of freedom by the time you take all relevant considerations into account. Starting with simpler models would reveal this before you waste time with more complex modeling strategies.
Good luck!
Marshall Feldman
Emeritus Professor of Urban Studies and Labor Research
The University of Rhode Island