How to use LBFGS.ApproximateHessian()

xuepeng sun

unread,

Jan 8, 2015, 9:11:10 AM1/8/15

to scala-...@googlegroups.com

Hi,

I'd like to use LBFGS.ApproximateHessian to compute the Hessian of an objective function with large sparse samples.

Does anyone have a use case of this class? It seems that no test case are available in the testSuite. Thanks a lot.

Best,

Xuepeng

David Hall

unread,

Jan 13, 2015, 12:27:31 PM1/13/15

to scala-...@googlegroups.com

Sorry for the delay.

It's really not meant to be used outside of the LBFGS framework. You need a sequence of gradient steps, recording \delta x and \delta \nabla f(x) by calling "updated."

--
You received this message because you are subscribed to the Google Groups "Scala Breeze" group.
To unsubscribe from this group and stop receiving emails from it, send an email to scala-breeze...@googlegroups.com.
To post to this group, send email to scala-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/scala-breeze/475f55e1-2e71-4c70-ab55-91a7ac2f0b96%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Debasish Das

unread,

Feb 11, 2015, 3:09:46 AM2/11/15

to scala-...@googlegroups.com

Hi,

Please look in the following PR:

https://github.com/scalanlp/breeze/pull/364

LBFGS.minEigen will generate the minimum eigen value using power iterations

LBFGS.maxEigen is tricky since I am building CompactHessian from ApproximateInverseHessian. The code fails right now but I asked David's help in fixing it...If CompactHessian generation works, power iterations will extract maxEigen value...

Thanks.

Deb

Soila Pertet Kavulya

unread,

Jun 13, 2015, 4:05:24 AM6/13/15

to scala-...@googlegroups.com

Hi David,

Would SecondOrderFunction.empirical() yield the correct approximation of the Hessian for L-BFGS?

Thanks,

Soila

On Tuesday, January 13, 2015 at 9:27:31 AM UTC-8, David Hall wrote:

David Hall

unread,

Jun 13, 2015, 4:08:47 AM6/13/15

to scala-...@googlegroups.com

No, LBFGS builds its own hessian that is a relatively low rank (2 * "memory", I think) approximation to the inverse hessian by tracking the history of steps it takes. The empirical hessian is a numeric approximation to the hessian at the current point using standard multivariate calculus results.

To view this discussion on the web visit https://groups.google.com/d/msgid/scala-breeze/ca6fc7e2-05c6-4c15-8016-60e9958f847b%40googlegroups.com.

Soila Pertet Kavulya

unread,

Jun 15, 2015, 2:54:06 PM6/15/15

to scala-...@googlegroups.com

Thanks David,

Sorry, my question was a bit unclear. I would like to approximate the Hessian after the L-BFGS algorithm completes so that I can compute the standard error of the coefficients. The approximate Hessian generated by L-BFGS would be out-of-date because the most recent approximation of the Hessian is for time (t-1). I was looking at R source code, and it looks like they are using numeric approximation to generate the Hessian after the optimizer converges if the flag "hessian=TRUE" is set in the optim function. I was wondering if I could use SecondOrderFunction.empirical() in Breeze to generate the numeric approximation.

Soila

David Hall

unread,

Jun 15, 2015, 6:11:17 PM6/15/15

to scala-...@googlegroups.com

Ah. SecondOrderFunction only computes Hessian vector products H * x. .empirical() computes H * x \approx (f'(theta + x * epsilon) - f'(theta))/epsilon.

I guess it's possible to do the work to extract the relevant bits of the inverse hessian given that and some kind of conjugate gradient algorithm?

What does R do, exactly?

-- David

To view this discussion on the web visit https://groups.google.com/d/msgid/scala-breeze/218fb619-1237-4bfd-b59d-26d18a0ab813%40googlegroups.com.

Soila Pertet Kavulya

unread,

Jun 16, 2015, 12:37:17 PM6/16/15

to scala-...@googlegroups.com

Hi David,

Thank you so much for looking into this. I didn't realize that empirical hessian was computing the hessian vector product. optim.c in the R stats package is using central difference to approximate the hessian. They calculate the first order difference as (f(theta + epsilon) - f(theta - epsilon))/ 2*epsilon. They also scale the parameters if the user input a scaling function. Next, they compute the second order derivatives based on the first order derivatives. It looks like the best option might be for me to implement central difference.

Soila

David Hall

unread,

Jun 16, 2015, 12:44:33 PM6/16/15

to scala-...@googlegroups.com

great, let me know if you want to contribute it back when you do!

To view this discussion on the web visit https://groups.google.com/d/msgid/scala-breeze/13987bd9-0f32-4584-9f98-befcce74b989%40googlegroups.com.

Soila Pertet Kavulya

unread,

Jun 16, 2015, 1:37:57 PM6/16/15

to scala-...@googlegroups.com

Thanks! Will do.

Soila Pertet Kavulya

unread,

Jun 18, 2015, 6:32:45 PM6/18/15

to scala-...@googlegroups.com

Hi David,

I created a pull request for the approximate hessian. https://github.com/scalanlp/breeze/pull/413

This is a bare bones implementation. I'm new to breeze and I was not sure what conventions and data structures to use. Could you take a look at it and let me know what I should change?

I noticed that breeze.optimize.ApproximateGradientFunction is using forward differences instead of central differences. Would it be ok if I changed it to use central differences?