>>>>> J C Nash <
profj...@gmail.com>
>>>>> on Tue, 18 Apr 2017 13:32:52 -0400 writes:
> Recently Marie Boehnstedt reported a bug in the nlm()
> function for function minimization when both gradient and
> hessian are provided.
Indeed, on R's Bugzilla here :
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17249
> She has provided a work-around for some cases and it seems
> this will get incorporated into the R function eventually.
indeed.... the first part -- fixing the wrong choldc() -- in the C code has
been in my version of R-devel for a while now.
See my follow up
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17249#c4 and #c5
including my attachment which is an extended version of Marie's original :
https://bugs.r-project.org/bugzilla/attachment.cgi?id=2246
As that mentions _and_ shows: Fixing choldc() solves the problem
for the 2D Rosenbrook example, but _not_ the 4D Wood example.
> However, despite the great number of packages on CRAN,
> there does not appear to be a straightforward Newton
> approach to function minimization. This may be because
> providing the code for a hessian (the matrix of second
> derivatives) is a lot of work and error-prone.
> (R could also use some good tools for Automatic Differentiation).
The last part of what you say above is not at all true:
R -- and S (& S+ | S-PLUS) before it! -- always had deriv() and deriv3()
and my attachment above _shows_ you how to use deriv3() to get
both the gradient and the hessian via a version of automatic
differentiation completely effortlessly !!
For ease of readers, that part here, with an example:
##' Wood function (4 arguments 'x1' ... 'x4')
fwood <- function(x1,x2,x3,x4) {
100*(x1^2-x2)^2 + (1-x1)^2 + 90*(x3^2-x4)^2 + (1-x3)^2 +
10.1*((1-x2)^2 + (1-x4)^2) + 19.8*(1-x2)*(1-x4)
}
## automatically construct correct gradient and hessian:
woodf.gh <- function(x) {
stopifnot(is.numeric(x))
woodGH <- deriv3(body(fwood)[[2]],
c("x1","x2","x3","x4"), function.arg=TRUE)
if(length(x) == 4)
woodGH(x[1],x[2],x[3],x[4])
else if(is.matrix(x) && ncol(x) == 4)
woodGH(x[,1], x[,2], x[,3], x[,4])
else stop("'x' must have length 4 or be a matrix with 4 columns")
}
and now get both the function f(x), gradient g(x) and Hessian H(x)
for
x = (0 0 0 0),
x = (1 1 1 1), and
x = (1 2 3 4)
with such a simple calle :
>
woodf.gh(rbind(0, 1, 1:4))
[1] 42.0 0.0 2514.4
attr(,"gradient")
x1 x2 x3 x4
[1,] -2 -40.0 -2 -40.0
[2,] 0 0.0 0 0.0
[3,] -400 279.6 5404 -819.6
attr(,"hessian")
, , x1
x1 x2 x3 x4
[1,] 2 0 0 0
[2,] 802 -400 0 0
[3,] 402 -400 0 0
, , x2
x1 x2 x3 x4
[1,] 0 220.2 0 19.8
[2,] -400 220.2 0 19.8
[3,] -400 220.2 0 19.8
, , x3
x1 x2 x3 x4
[1,] 0 0 2 0
[2,] 0 0 722 -360
[3,] 0 0 8282 -1080
, , x4
x1 x2 x3 x4
[1,] 0 19.8 0 200.2
[2,] 0 19.8 -360 200.2
[3,] 0 19.8 -1080 200.2