Chris Pullig
I know that when I write reviews,
even if I am sure that I don't like what is there,
I try to keep in mind that I may be overlooking some
precedent or justification or clever explanation,
so I hope that they are not *insisting* on any one solution.
If your scores are all between 20% and 80%, then maybe the reviewers are
being foolish. If your outcomes are percentages PER SUBJECT, then you
do not have a ready candidate for the usual maximum-likelihood logistic
regression, which models a 0/1 outcome, and that is a second way in
which your reviewers could be foolish.
On the other hand, if you do have extreme percents, then it is
reasonable to transform. The arcsine of the square root of P is one
option, if you don't want to apply a pragmatic "probit" or logit
transform to each score.
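As a rough illustration of the arcsine square-root transform mentioned above, here is a minimal Python sketch. The function name and the sample percentages are made up for the example; they do not come from the original question.

```python
import math

def arcsine_sqrt(p):
    """Return asin(sqrt(p)) for a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))

# Hypothetical per-subject percentages (illustrative only).
percents = [0, 2, 10, 50, 90, 98, 100]

# Convert each percentage to a proportion, then transform it.
transformed = [arcsine_sqrt(pct / 100.0) for pct in percents]

for pct, t in zip(percents, transformed):
    print(f"{pct:>3}% -> {t:.4f}")
```

The transformed scores run from 0 to pi/2 and can then be used as the dependent variable in an ordinary regression.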
Hope this helps.
--
Rich Ulrich, biostatistician wpi...@pitt.edu
http://www.pitt.edu/~wpilib/index.html Univ. of Pittsburgh
> -----Original Message-----
> From: Cpullig [SMTP:cpu...@AOL.COM]
> Sent: Monday, June 29, 1998 10:08 AM
> To: SPS...@UGA.CC.UGA.EDU
> Subject: Multiple Regression with Dependent Variable as a
> Percentage
>
> Reviewers are telling me that I cannot use multiple regression with a
> single dependent variable that is a percentage (0-100%). I am treating
> this variable as a continuous variable and they are suggesting logistic
> regression. I am sure that I can split the dependent variable into
> groups based on the percentages, but this seems to be giving up a great
> deal of information. Any suggestions or good arguments for my position?
>
> Chris Pullig
Chris,
The problem with using a % as the dependent variable in OLS regression (the
REGRESSION procedure) is that the predictions aren't bounded within the
range of 0 to 1. You can wind up with predictions that are nonsensical
(e.g. -24% or 112%). Proportions tend to be non-normal at the extremes, that
is, outside roughly 20% to 80% for "large samples", so this can happen with
OLS multiple regression. By proportion, I mean percentage/100 (e.g. 20% is
0.20).
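To make the out-of-range problem concrete, here is a small Python sketch with one predictor and invented data. The helper fit_line and all the numbers are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical predictor values and observed proportions (illustrative only).
xs = [1, 2, 3, 4, 5]
props = [0.05, 0.20, 0.50, 0.80, 0.95]

intercept, slope = fit_line(xs, props)

# Predict just outside the observed range of the predictor.
pred_low = intercept + slope * 0
pred_high = intercept + slope * 6

print(f"prediction at x=0: {pred_low:.2f}")
print(f"prediction at x=6: {pred_high:.2f}")
```

With these made-up numbers the fitted line gives a negative proportion at x=0 and a proportion above 1 at x=6, which is the kind of nonsensical prediction described above.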
There are a couple of ways you can approach the problem. If your variable is
coded as 0 and 1 (i.e. yes/no at a respondent level), logistic regression is
technically the correct approach to take to handle the non-normality of the
distribution. I'd look at Hosmer and Lemeshow's Applied Logistic Regression
to learn more about LR.
On the other hand, if your DV is collected as 0 to 100 at a respondent
level, you can do a non-linear transformation on the DV of the form
ln(p/(1-p)), where p is the answer/100. Adjust the extreme answers of 0 and
100 to proportions of (say) 0.005 and 0.995 so that the transformation makes
sense. Then, you can run the regular OLS procedure.
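The transform-then-regress recipe above could be sketched in Python as follows. The 0.005/0.995 adjustment is the one suggested in the text; the data, the single predictor, and the helper names are hypothetical.

```python
import math

def logit(p, eps=0.005):
    """ln(p / (1 - p)), after pulling p into [eps, 1 - eps]."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def fit_line(xs, ys):
    """Ordinary least squares for a single predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical respondent-level answers on a 0-100 scale, with one predictor.
xs = [1, 2, 3, 4, 5, 6]
answers = [0, 10, 35, 50, 80, 100]

# Transform the DV, then run plain OLS on the transformed values.
logits = [logit(a / 100.0) for a in answers]
intercept, slope = fit_line(xs, logits)

print(f"intercept = {intercept:.3f}, slope = {slope:.3f} (on the logit scale)")
```

The coefficients are on the logit scale, so they describe changes in ln(p/(1-p)) rather than in the percentage itself.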
However, keep in mind that the predictions are now of a different form than
the straight linear model. They take the form of exp(bx)/(1+exp(bx)), where
"x" is the respondent's answers on the independent variables, and "b" is the
vector of beta weights.
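For illustration, the back-transformation exp(bx)/(1+exp(bx)) can be sketched in Python as below. The coefficients b0 and b1 are invented for the example, not estimates from any real data.

```python
import math

def inv_logit(z):
    """exp(z) / (1 + exp(z)), written so that large |z| does not overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Hypothetical beta weights on the logit scale (intercept and one slope).
b0, b1 = -2.0, 0.8

# Back-transform the linear predictor to a proportion for several x values.
predictions = [inv_logit(b0 + b1 * x) for x in [-10, 0, 2.5, 5, 20]]

for p in predictions:
    print(f"predicted proportion: {p:.4f}")
```

Whatever value the linear predictor takes, the back-transformed prediction stays within 0 and 1, unlike the prediction from the straight linear model.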
Hope this helps--
Of course, the event's probability may behave in other ways, such as
being a linear or exponential function of your set of independent
variables, but your life as a data analyst consists precisely of such
decisions: which model is most appropriate for your data?
Hector Maletta
Universidad del Salvador
Buenos Aires, Argentina
Stuart Drucker commented on Chris Pullig's question.
Chris Pullig had written:
> >> Reviewers are telling me that I cannot use multiple regression with a
> >> single dependent variable that is a percentage (0-100%). I am treating
> >> this variable as a continuous variable and they are suggesting logistic
> >> regression. I am sure that I can split the dependent variable into
> >> groups based on the percentages, but this seems to be giving up a great
> >> deal of information. Any suggestions or good arguments for my position?
> >>
> >> Chris Pullig
>