> theta <- beta[pc + 1L:q]
> gamm <- c(-100, cumsum(c(theta[1L], exp(theta[-1L]))), 100)
That's bad. There's no reason to suppose beta[pc+1L] is larger than
-100 or that the cumulative sum is smaller than 100. For practical
datasets those assumptions are frequently violated, causing the
optimization to fail. A work-around is to center the explanatory
variables. This helps keep the zetas small.
The correct approach is to use the values -Inf and Inf as the first
and last cut points. The functions plogis, dnorm, etc all behave
correctly when the input is one of these values. The dgumbel function
does not, returning NaN for -Inf. Correct this as follows
dgumbel <- function (x, loc = 0, scale = 1, log = FALSE)
{
x <- (x - loc)/scale
d <- log(1/scale) - x - exp(-x)
d[is.nan(d)] <- -Inf # -tjb
if (!log) exp(d) else d
}
The documentation states
> start: initial values for the parameters. This is in the format
> 'c(coefficients, zeta)': see the Values section.
The relevant code is
> s0 <- if(pc > 0) c(start[seq_len(pc+1)], diff(start[-seq_len(pc)]))
> else c(start[1L], diff(start))
This doesn't take the logs of the differences as required to repose
the zetas into the form used in the optimization. The fix is
obvious. polr.fit has the same problem which is responsible for
summary() frequently failing when the Hessian is not provided.
I'm not convinced the t values reported by summary() are
reliable. I've noticed that a one dimensional linear transformation
the independent variables can cause the reported t values to change by
a factor of more than 100, which doesn't seem right.
--tjb
______________________________________________
R-h...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
> polr(Species~Sepal_Length+Sepal_Width+Petal_Length+Petal_Width,data=iris)
Error in polr(Species ~ Sepal_Length + Sepal_Width + Petal_Length +
Petal_Width, :
attempt to find suitable starting values failed
In addition: Warning messages:
1: In glm.fit(X, y1, wt, family = binomial(), offset = offset) :
algorithm did not converge
2: In glm.fit(X, y1, wt, family = binomial(), offset = offset) :
fitted probabilities numerically 0 or 1 occurred
I suggest that simply setting the coefficients beta to zero and the
cutpoints zeta to sensible values will always produce a feasible starting
point for non-pathological data. Here is my code that does this:
if(missing(start)) {
# try something that should always work -tjb
u <- as.integer(table(y))
u <- (cumsum(u)/sum(u))[1:q]
zetas <-
switch(method,
"logistic"= qlogis(u),
"probit"= qnorm(u),
"cauchit"= qcauchy(u),
"cloglog"= -log(-log(u)) )
s0 <- c(rep(0,pc),zetas[1],log(diff(zetas)))
Using this start code the problem is not manifested.
> source('fixed-polr.R')
> polr(Species~Sepal_Length+Sepal_Width+Petal_Length+Petal_Width,data=iris)
Call:
polr(formula = Species ~ Sepal_Length + Sepal_Width + Petal_Length +
Petal_Width, data = iris)
Coefficients:
Sepal_Length Sepal_Width Petal_Length Petal_Width
-2.466724 -6.671515 9.431689 18.270058
Intercepts:
setosa|versicolor versicolor|virginica
4.080189 42.639320
Residual Deviance: 11.89857
AIC: 23.89857
My change would also likely fix the problem reported by Kevin Coombes on May
6, 2010.
--
View this message in context: http://r.789695.n4.nabble.com/bugs-and-misfeatures-in-polr-MASS-fixed-tp3024677p3030405.html
Sent from the R help mailing list archive at Nabble.com.
y,x,lx
0,3.2e-02,-1.49485
0,3.2e-02,-1.49485
0,1.0e-01,-1.00000
0,1.0e-01,-1.00000
0,3.2e-01,-0.49485
0,3.2e-01,-0.49485
1,1.0e+00,0.00000
0,1.0e+00,0.00000
1,3.2e+00,0.50515
1,3.2e+00,0.50515
0,1.0e+01,1.00000
1,1.0e+01,1.00000
1,3.2e+01,1.50515
2,3.2e+01,1.50515
2,1.0e+02,2.00000
1,1.0e+02,2.00000
2,3.2e+02,2.50515
1,3.2e+02,2.50515
2,1.0e+03,3.00000
2,1.0e+03,3.00000
Using the MASS version we get
> ahn$y<-as.factor(ahn$y)
> summary(polr(y~lx,data=ahn))
Re-fitting to get Hessian
Error in optim(s0, fmin, gmin, method = "BFGS", hessian = Hess, ...) :
initial value in 'vmmin' is not finite
Whereas,
> source('fixed-polr.R')
> summary(polr(y~lx,data=ahn))
Re-fitting to get Hessian
Call:
polr(formula = y ~ lx, data = ahn)
Coefficients:
Value Std. Error t value
lx 2.421 0.8146 2.971
Intercepts:
Value Std. Error t value
0|1 0.5865 0.8118 0.7224
1|2 4.8966 1.7422 2.8106
Residual Deviance: 20.43286
AIC: 26.43286
--
View this message in context: http://r.789695.n4.nabble.com/bugs-and-misfeatures-in-polr-MASS-fixed-tp3024677p3030411.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________