Here's another Q/A -- your thoughts are welcome.
------------
Q. I recently received comments on a manuscript regarding how I
transformed my data and I wanted to see if you might know of potential
references I could use in my revision. Specifically, the reviewer and
editor took issue with my applying a square root transformation to the
response variable (abundance) and then using this variable in linear
regressions (landscape variables as explanatory variables). Their
concern is that the square root transform doesn't get around the "main
problem" which is that the data are constrained to be positive. They
go on to state that "this is a problem when you're trying to fit
linear models because it can flatten responses substantially and
interfere with your inference." Unfortunately, I have not had any
success tracking down further information on this issue in either
ecological statistics books or the ecological literature. The editor
suggests that I use log-transformations but this would require about 3
weeks of redoing my analyses. Before I embark on this task I'd like
to be fully convinced that log-transformations are indeed the best way
to go.
------------
My answer: To some extent I agree with the reviewer and editor. Species
responses to environment are often (usually?) nonlinear and even
nonmonotonic (e.g. a hump-shaped response). See:
http://home.centurytel.net/~mjm/whynpmr.htm
http://home.centurytel.net/~mjm/NPMRintro.pdf
With your kind of data you can expect abundance to level off
asymptotically toward zero rather than cross the predictor axis, so a
log transformation might be better. But this
isn't the only problem. Depending on the fit of your straight line, it
could predict negative abundances. You can see if this is a problem
with your regressions by superimposing the regression line on
scatterplots and seeing if the fitted line is ever in the negative
range. If it is, then one can argue that you’ve mis-specified the
model. If it is not, then you might be able to argue that your fits
are ok.
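
If you would rather check this numerically than by eye, something like
the following Python sketch would do it. It is only an illustration: the
data are made up, the predictor name forest_cover and the helper
functions are hypothetical, and it assumes a single predictor with an
ordinary least-squares line fitted to square-root-transformed abundance.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def min_fitted_value(x, intercept, slope):
    """Smallest fitted value over the observed predictor range.

    A straight line reaches its minimum at one end of the range,
    so only the two endpoints need to be checked.
    """
    return min(intercept + slope * min(x), intercept + slope * max(x))


# Made-up data for illustration only (not from the manuscript).
forest_cover = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
abundance = [49, 36, 16, 9, 4, 1, 0, 0, 0, 0]

sqrt_abundance = [sqrt(a) for a in abundance]
intercept, slope = fit_line(forest_cover, sqrt_abundance)
lowest = min_fitted_value(forest_cover, intercept, slope)
goes_negative = lowest < 0

print("lowest fitted value on the square-root scale:", round(lowest, 3))
print("fitted line goes negative within the data range:", goes_negative)
```

With several landscape variables you would run the same check on the
fitted values from the full model rather than on a single line.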
While you are looking at your scatterplots, look for nonmonotonic
behavior, which would not be captured by straight-line fits to either
log-transformed or square-root-transformed data.
You could fix all of the above problems by using nonparametric
regression with a local mean. Estimates from this never go outside
the range of the observed values, and the method should work well
whether the responses are log- or square-root-transformed. Plus it
allows nonlinear and nonmonotonic responses.
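
To make the local-mean idea concrete, here is a minimal Python sketch
for one predictor. It assumes a Gaussian kernel and a hand-picked
bandwidth, which is only one way to form a local mean; the function
name local_mean, the predictor name elevation, and the data are all
hypothetical.

```python
from math import exp


def local_mean(x_obs, y_obs, x_new, bandwidth):
    """Kernel-weighted mean of y_obs, weighted by closeness to x_new.

    Every weight is positive, so the estimate is a weighted average
    of the observed responses and stays within their range.
    """
    weights = [exp(-0.5 * ((xi - x_new) / bandwidth) ** 2) for xi in x_obs]
    total = sum(weights)
    return sum(w * yi for w, yi in zip(weights, y_obs)) / total


# Made-up hump-shaped response for illustration only.
elevation = [0, 10, 20, 30, 40, 50, 60, 70, 80]
abundance = [0, 2, 6, 11, 14, 10, 5, 1, 0]

# Evaluate the local mean at each observed predictor value.
estimates = [local_mean(elevation, abundance, x, bandwidth=10.0)
             for x in elevation]

for x, est in zip(elevation, estimates):
    print(x, round(est, 2))
```

The bandwidth controls how local the mean is: a small value follows
the data closely, a large one flattens the curve toward the overall
mean. In practice it would be chosen by cross-validation or a similar
criterion rather than by hand.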
-------------
Other thoughts?