As I understand it, the Hessian returned by fmincon is computed differently from the one returned by fminunc, which makes the standard errors calculated from the fmincon Hessian a poor approximation to the estimated standard errors.
My questions are: Is it valid to use the Hessian from fmincon? Is this just a numerical issue that might happen, or a general problem? How should we calculate the Hessian in this case? Thanks.
Best,
The context isn't completely clear, so let me start by seeing if I
understand your problem.
You're attempting to estimate some parameters by solving a nonlinear
least squares problem. Right?
In the past, you've used fminunc to minimize the nonlinear least
squares objective, and then used the Hessian of that objective
function at the optimal parameter values to get standard errors for
the fitted parameters. Right?
Now, you'd like to repeat the process but with constraints on the
parameters and using fmincon instead of fminunc. Right?
Assuming that I've made all of the right guesses above, then
1. The Hessian returned by fmincon() is the Hessian of the Lagrangian
(incorporating both the objective function and the constraints). This
isn't the same thing as the Hessian of the objective function. In
any case, when constraints are active you can't construct meaningful
confidence intervals for the fitted parameters from the Hessian of
the objective function.
2. Depending on the options to fmincon and fminunc, the Hessian
returned by the optimization routine could be a quasi-Newton
approximation to the actual Hessian of the objective function
(fminunc) or of the Lagrangian (fmincon). In either case, this
approximation can be extremely inaccurate as an approximation to the
Hessian: it doesn't have to be very accurate to be useful in finding
search directions that take the algorithm to an optimal solution.
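To illustrate this point, here is a small sketch in Python/SciPy
rather than MATLAB (the same issue arises with any quasi-Newton
solver; the objective and numbers are purely illustrative). BFGS
converges to the minimum, yet the Hessian implied by its internal
quasi-Newton approximation can differ noticeably from the true one:

```python
# Compare the quasi-Newton inverse Hessian that BFGS carries internally
# with the analytic Hessian of the objective at the optimum.
import numpy as np
from scipy.optimize import minimize

# Rosenbrock function: minimum at (1, 1), Hessian known in closed form.
def f(x):
    return 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2

res = minimize(f, x0=[-1.2, 1.0], method="BFGS")

# Analytic Hessian at (1, 1):
#   d2f/dx1^2 = 1200*x1^2 - 400*x2 + 2 = 802,
#   d2f/dx1dx2 = -400*x1 = -400,  d2f/dx2^2 = 200.
H_true = np.array([[802.0, -400.0],
                   [-400.0, 200.0]])

H_bfgs = np.linalg.inv(res.hess_inv)   # invert the BFGS approximation

# The optimizer found the minimum, but its Hessian approximation still
# carries an error that would contaminate standard errors.
rel_err = np.linalg.norm(H_bfgs - H_true) / np.linalg.norm(H_true)
print(res.x, rel_err)
```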
Thank you, Brian. I am talking about an MLE optimization problem. I know that whenever constraints bind, the Hessian is not a good basis for the standard errors. Let me be more precise about my question: Assume that I want to use fmincon to ensure that a parameter is positive. I know that this parameter is not zero! It might be a standard-deviation parameter, whose estimate could come out negative or positive, and I want to ensure that the final estimate picks the positive one. In this case the constraints do not bind. Do we still have issues?
I know that it is possible to ensure non-negativity by transforming the parameters. But I would still like to see whether other techniques work.
About your point 2): let's assume that we are at the optimal point somehow and the constraints do not bind. In that case I presume that the Hessians are correct (for calculating standard errors)?
Thanks
> Thank you, Brian. I am talking about an MLE optimization problem. I know that whenever constraints bind, the Hessian is not a good basis for the standard errors. Let me be more precise about my question: Assume that I want to use fmincon to ensure that a parameter is positive. I know that this parameter is not zero! It might be a standard-deviation parameter, whose estimate could come out negative or positive, and I want to ensure that the final estimate picks the positive one. In this case the constraints do not bind. Do we still have issues?
Yes: if the constraints are all non-binding at optimality, then the
Hessian of the objective function can be used in this way.
Furthermore, if your constraints are all non-binding at optimality,
then the Lagrange multipliers should all be zero, and the Hessian of
the Lagrangian equals the Hessian of the objective function.
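As a concrete check of this, here is a Python/SciPy sketch of the
situation the original poster describes (the model, data, and solver
choices are illustrative stand-ins for the MATLAB setup): a positivity
bound on a scale parameter that does not bind at the optimum. The
constrained and unconstrained fits coincide, so the Hessian of the
plain negative log-likelihood is the right matrix for standard errors:

```python
# MLE of the scale sigma of N(0, sigma^2) data, fitted once with a
# positivity bound (analogous to fmincon's lb) and once without.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(0.0, 2.0, size=500)           # data with true sigma = 2

def nll(p):                                  # negative log-likelihood
    sigma = p[0]
    if sigma <= 0.0:
        return np.inf                        # guard the invalid region
    return len(x) * np.log(sigma) + np.sum(x**2) / (2.0 * sigma**2)

# Constrained fit with a (non-binding) lower bound on sigma ...
con = minimize(nll, [1.0], method="L-BFGS-B", bounds=[(1e-8, None)])
# ... and an unconstrained fit from the same starting point.
unc = minimize(nll, [1.0], method="BFGS")

sigma_hat = con.x[0]
sigma_mle = np.sqrt(np.mean(x**2))           # closed-form MLE

# Analytic Hessian of nll at the optimum is 2n / sigma^2, so the
# standard error is sigma_hat / sqrt(2n).
se = 1.0 / np.sqrt(2.0 * len(x) / sigma_hat**2)
print(sigma_hat, unc.x[0], se)
```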
> About your point 2): let's assume that we are at the optimal point somehow and the constraints do not bind. In that case I presume that the Hessians are correct (for calculating standard errors)?
Here's the problem: if you use a quasi-Newton method to approximate
the Hessian of the Lagrangian for purposes of optimization, that
quasi-Newton approximate Hessian is unlikely to be a good
approximation to the actual Hessian (for purposes of standard
errors), and you should not use it for calculating standard errors.
Are you supplying the Hessian to fmincon, or are you letting fmincon
compute it? If fmincon is approximating the Hessian, then you're
using a quasi-Newton method, and you can't safely use the Hessian
output by fmincon for the purpose of computing standard errors.
> About your point 2): let's assume that we are at the optimal point somehow and the constraints do not bind. In that case I presume that the Hessians are correct (for calculating standard errors)?
No. Again, depending on various fmincon parameters, the Hessians it computes may be very crude estimates.
Does this imply that we cannot approximate the relevant Hessian matrix numerically? I am pretty sure that it is not always possible to get the analytical Hessian matrix, so we need some kind of approximation. Does calculating the numerical Hessian manually by numerical differentiation help? If it does, why wouldn't MATLAB do it? Thanks.
> Does this imply that we cannot approximate the relevant Hessian matrix numerically? I am pretty sure that it is not always possible to get the analytical Hessian matrix, so we need some kind of approximation. Does calculating the numerical Hessian manually by numerical differentiation help? If it does, why wouldn't MATLAB do it? Thanks.
It's not that a numerical approximation to the Hessian cannot be
computed, but rather that it isn't worth computing one for purposes
of optimization. The very crude approximation used in the
quasi-Newton methods is sufficient to get the optimization algorithm
to the optimum. The job of fmincon is optimization, not statistical
parameter estimation; it should be no surprise that the special needs
of statisticians aren't necessarily met by a general-purpose
optimization routine.
Depending on the form of your likelihood function (not just its
mathematical form, but also how you compute it), it may be relatively
easy to compute a good approximate Hessian that could reasonably be
used to compute your standard errors. For example, you might be able
to use an automatic differentiation (AD) tool to translate a program
for computing the likelihood into a program for computing the
likelihood and its Hessian. Even if this computation is slow, you'd
only need to run it at the optimal solution, not at all of the
intermediate solutions that the quasi-Newton method considered. Or,
if your likelihood is Gaussian and you are just solving a nonlinear
least squares problem, then it is straightforward to compute the
Jacobian (matrix of first derivatives) by finite differencing and
then use the approximation H = J'*J. I would recommend against trying
to compute the second derivatives directly by second-order finite
differencing; in my experience it just doesn't work well in practice.
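The H = J'*J idea can be sketched as follows (again in Python/SciPy
rather than MATLAB; the model, data, and helper names are
illustrative): build the Jacobian of the residuals by forward finite
differences at the fitted parameters, form the Gauss-Newton Hessian,
and read off standard errors:

```python
# Nonlinear least squares fit of y = a * exp(b * t), then H ~ J'J.
import numpy as np
from scipy.optimize import least_squares

t = np.linspace(0.0, 1.0, 50)
rng = np.random.default_rng(1)
y = 2.0 * np.exp(-1.5 * t) + 0.01 * rng.normal(size=t.size)

def resid(p):                        # residuals of the model fit
    a, b = p
    return a * np.exp(b * t) - y

fit = least_squares(resid, x0=[1.0, -1.0])

def fd_jacobian(fun, p, h=1e-6):     # forward-difference Jacobian
    r0 = fun(p)
    J = np.empty((r0.size, p.size))
    for j in range(p.size):
        dp = np.zeros_like(p)
        dp[j] = h
        J[:, j] = (fun(p + dp) - r0) / h
    return J

J = fd_jacobian(resid, fit.x)
H = J.T @ J                          # Gauss-Newton Hessian approximation
s2 = np.sum(fit.fun**2) / (t.size - fit.x.size)   # residual variance
se = np.sqrt(s2 * np.diag(np.linalg.inv(H)))      # standard errors
print(fit.x, se)
```

Note that only first derivatives are differenced here, which is far
better conditioned than second-order finite differencing.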
Another approach that might be more practical would be to sample from
your likelihood (or more generally posterior) distribution using
Markov Chain Monte Carlo methods. If the likelihood can be computed
relatively quickly, then this can be a very effective
technique.
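A minimal sketch of the MCMC suggestion (Python, with an illustrative
toy model: flat prior on the mean of Gaussian data with known unit
variance, so the posterior spread is known to be 1/sqrt(n)):

```python
# Random-walk Metropolis sampling of a simple posterior; the standard
# error is read off the spread of the samples, no Hessian needed.
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(5.0, 1.0, size=200)        # unknown mean, unit variance

def loglik(mu):
    return -0.5 * np.sum((data - mu)**2)

mu, ll = data.mean(), loglik(data.mean())    # start at the MLE
samples = []
for _ in range(20000):
    prop = mu + 0.2 * rng.normal()           # random-walk proposal
    ll_prop = loglik(prop)
    if np.log(rng.random()) < ll_prop - ll:  # Metropolis accept/reject
        mu, ll = prop, ll_prop
    samples.append(mu)

post = np.array(samples[2000:])              # drop burn-in
# Theoretical posterior std here is 1/sqrt(200), about 0.0707.
print(post.mean(), post.std())
```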
A third approach that you might consider is building a quadratic
metamodel of the likelihood near the optimal parameter values.
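The metamodel idea can be sketched like this (Python; the objective
is an illustrative stand-in for a negative log-likelihood): evaluate
the objective at a cloud of points near the optimum, fit a quadratic
by least squares, and read the Hessian off the second-order
coefficients:

```python
# Fit c0 + g'd + 0.5*d'Hd to objective values near the optimum and
# recover H from the fitted coefficients.
import numpy as np

def f(x):                 # stand-in objective with minimum at (1, 2);
                          # its true Hessian is [[3, 0.5], [0.5, 2]]
    return (1.5 * (x[0] - 1.0)**2
            + 0.5 * (x[0] - 1.0) * (x[1] - 2.0)
            + (x[1] - 2.0)**2)

x_opt = np.array([1.0, 2.0])
rng = np.random.default_rng(3)
pts = x_opt + 0.1 * rng.normal(size=(60, 2))  # cloud around the optimum

d = pts - x_opt
# Columns: constant, gradient terms, then 0.5*d1^2, d1*d2, 0.5*d2^2.
A = np.column_stack([
    np.ones(len(d)), d[:, 0], d[:, 1],
    0.5 * d[:, 0]**2, d[:, 0] * d[:, 1], 0.5 * d[:, 1]**2,
])
vals = np.array([f(p) for p in pts])
coef, *_ = np.linalg.lstsq(A, vals, rcond=None)

H = np.array([[coef[3], coef[4]],
              [coef[4], coef[5]]])
print(H)   # for this exactly quadratic objective, H is recovered exactly
```

With a real likelihood the fit is only approximate, but near a smooth
optimum a quadratic is usually an adequate local model.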
Thank you. Now I see why, in much of the code I've come across, people recalculate the Hessian from scratch rather than using the one from fminunc or fmincon. I thought there had to be a reason, but I didn't know exactly what it was.
Best,
> Thank you. Now I see why, in much of the code I've come across, people
> recalculate the Hessian from scratch rather than using the one from
> fminunc or fmincon. I thought there had to be a reason, but I didn't
> know exactly what it was.
> Best,
There is at least one function on the File Exchange that will calculate
the Hessian. See
http://www.mathworks.com/matlabcentral/newsreader/view_thread/271183#711741
For more information on the quality of Hessians from various solvers and
algorithms, see
http://www.mathworks.com/access/helpdesk/help/toolbox/optim/ug/bsapd3i.html
Alan Weiss
MATLAB mathematical toolbox documentation