In more detail, here is how I set up my problem:
LSQNONLIN is based on the Trust-Region-Reflective algorithm (probably this is why it took so long, but I would like confirmation from an optimization expert), while I recently switched to FMINCON with the Interior-Point algorithm so that I could impose some nonlinear constraints.
The first time I ran it, I got a solution, but I thought I should change the bounds because the solution violated economic sense and lacked economic meaning. So I did, and, not too surprisingly, the solver said it could find no solution. Having tried many different ways to make it land on a reasonable solution, I did not succeed. So I thought I should start again from the original solution from the first run, with the bounds set even looser. Surprisingly, this time the solver failed to reach even that same solution. How is this possible when the new bounds enclose the old bounds? In general, would relaxing the bounds (or constraints) help?
Actually, I have a number of data sets to which I apply the same function with the same solver. For some sets it works all right, but for others it does not. The only difference is a slight variation in the values contained in the data sets. Is FMINCON usually this sensitive?
If this is what you'd suggest ("No feasible point - Check nonlinear constraints", http://www.mathworks.com/help/toolbox/optim/ug/br44iv5-1.html#br44i73), could you please set up a simple example? I do not understand how that works or how to carry it out.
Lastly, how different are the results from LSQNONLIN and FMINCON? If FMINCON with the Interior-Point algorithm works from the KKT conditions, then, mathematically speaking, it should find a (local) minimum satisfying the constraints/bounds. Also, if LSQNONLIN truly searches the region for a (local) minimum, it should not yield something too different from what FMINCON finds, should it?
It seems that I might have to ask more questions even after you answer those above. :'(
Thanks in advance,
Jason
Since you find fmincon interior-point to converge faster than lsqnonlin,
I suspect that you formulated your problem incorrectly for lsqnonlin.
Did you pass lsqnonlin a vector of function values, or did you sum the
squares of the function values and use that as your objective function?
I suspect you passed the sum of squares. Don't do that; it slows the
solution quite a bit.
http://www.mathworks.com/help/toolbox/optim/ug/lsqnonlin.html#f662559
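For example (a sketch with made-up data and a hypothetical two-parameter
exponential model): pass lsqnonlin the residual vector and let it form
the sum of squares internally, so it can exploit the least-squares
structure of the problem.

    % Hypothetical data and model, for illustration only
    t = (0:9)';
    y = 2*exp(-0.5*t) + 0.01*randn(10,1);
    model = @(p) p(1)*exp(-p(2)*t);

    % Right: return the residual VECTOR; lsqnonlin squares and sums it
    res = @(p) model(p) - y;
    p = lsqnonlin(res, [1; 1]);

    % Slow: summing the squares yourself hides the least-squares
    % structure from the solver
    ssq = @(p) sum((model(p) - y).^2);
    % p = lsqnonlin(ssq, [1; 1]);  % finds the same minimizer, inefficiently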
For an oversimplified discussion of how solvers work, see
http://www.mathworks.com/help/toolbox/optim/ug/br44i40.html#brhkghv-65
Solvers try to "walk downhill," which can lead to a local minimum. You
might want a global minimum. And if you start from an infeasible point,
solvers might not be able to find any solution.
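For instance (a sketch with a toy objective, not your function), the
same bounded problem started from two different points lands in two
different local minima:

    % Toy 1-D objective with several local minima
    f = @(x) sin(3*x) + 0.1*x.^2;
    opts = optimset('Algorithm','interior-point','Display','off');

    % Identical problem, different starting points, different answers
    xa = fmincon(f, -2, [],[],[],[], -4, 4, [], opts)  % one local minimum
    xb = fmincon(f,  2, [],[],[],[], -4, 4, [], opts)  % a different one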
Good luck,
Alan Weiss
MATLAB mathematical toolbox documentation
Changing the bounds, whether loosening or tightening them, can certainly
change the solution. For example, consider the trivial 1-D function
f(x) = x minimized with lb = 0, ub = 1. The minimum will obviously be at
x = 0 in this case. Now loosen the lower bound to lb = -1. The new
minimum will be at x = -1. These are the facts of life with constrained
optimization.
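In MATLAB terms, a quick check of that 1-D example (a sketch):

    f = @(x) x;
    opts = optimset('Algorithm','interior-point','Display','off');

    x1 = fmincon(f, 0.5, [],[],[],[],  0, 1, [], opts)  % lb =  0 -> x1 = 0
    x2 = fmincon(f, 0.5, [],[],[],[], -1, 1, [], opts)  % lb = -1 -> x2 = -1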
> Since you find fmincon interior-point to converge faster than lsqnonlin,
> I suspect that you formulated your problem incorrectly for lsqnonlin.
> Did you pass lsqnonlin a vector of function values, or did you sum the
> squares of the function values and use that as your objective function?
> I suspect you passed the sum of squares. Don't do that; it slows the
> solution quite a bit.
Regards,
Jason
> Surprisingly, the solver (FMINCON) seems to be quite sensitive to the
> user-supplied initial points. Is this common?
Yes, especially for complicated surfaces.
A simple quadratic, a*x^2 + b*x + c, has exactly one extremum: one
minimum (if a > 0) or else one maximum (if a < 0), never both except in
degenerate cases.
Extend the quadratic into two variables, a*x^2 + b*y^2 + c*x*y + d*x +
e*y + f. For any one fixed y, the equation reduces to the situation
above, so there is a minimum or maximum in x for _each_ y. But there are
an infinite number of different y, so there are now an infinite number
of such minima or maxima. How can any program sample them all in a
finite number of steps?
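To make that concrete (a small sketch with made-up coefficients): for a
fixed y, setting the x-derivative 2*a*x + c*y + d to zero gives
x*(y) = -(c*y + d)/(2*a), a different directional extremum for every y.

    % Made-up coefficients, for illustration only
    a = 1; b = 1; c = 3; d = -1; e = 0; f = 0;
    g = @(x,y) a*x.^2 + b*y.^2 + c*x.*y + d*x + e*y + f;

    xstar = @(y) -(c*y + d)/(2*a);   % solves 2*a*x + c*y + d = 0

    % One minimizing x for each fixed y: infinitely many of them
    for y = -1:0.5:1
        fprintf('y = %4.1f : x*(y) = %6.3f, g = %7.3f\n', ...
                y, xstar(y), g(xstar(y), y));
    end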
Now extend again to something with larger exponents or more variables
or that uses non-polynomial functions. How many minima or maxima do you
have now? In the quadratic case you might be able to use gradient
information effectively, but as you increase the complexity, the
gradient gives you more and more information about local conditions
rather than about global trends -- partly because the further away (more
global) you get in your calculations of the function differences, the
more wrong the polynomial approximations get (and a numeric gradient is
effectively a polynomial approximation).
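A tiny illustration of that last point (a sketch, not tied to any
particular problem): a central finite difference matches the true
derivative well for small steps, but the error grows as the sampling
distance grows.

    f  = @(x) sin(3*x) + 0.1*x.^2;
    df = @(x) 3*cos(3*x) + 0.2*x;    % true derivative

    x0 = 0.7;
    for h = [1e-4 1e-2 1e-1 1]
        fd = (f(x0 + h) - f(x0 - h)) / (2*h);   % central difference
        fprintf('h = %6.0e : fd = %8.4f, true = %8.4f\n', h, fd, df(x0));
    end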
Global searches are _hard_ to do efficiently while maintaining accuracy
on anything but simple functions.
A local minimum is not the same thing as a directional minimum (a
minimum in x for each fixed y, etc.), as you seem to imply, Walter.
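A classic illustration (g(x,y) = x^2 - y^2): for every fixed y, x = 0
minimizes over x, yet (0,0) is a saddle point, not a local minimum.

    g = @(x,y) x.^2 - y.^2;
    g(0, 0)      % 0 at the directional minimum
    g(0, 0.1)    % -0.01 < 0 nearby, so (0,0) is no local minimum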
Bruno