In more detail, here is how I set up my problem:
LSQNONLIN is based on the Trust-Region-Reflective algorithm (probably this is why it took so long, but I would like confirmation from an optimization expert), while I recently switched to FMINCON with the Interior-Point algorithm so that I could impose some nonlinear constraints.
The first time I ran it, I got a solution, but I thought I should change the bounds because the solution violated economic sense and lacked economic meaning. So I did, and, not too surprisingly, the solver said it could find no solution. Having tried many different ways to make it land on a reasonable solution, I did not succeed. So I thought I should start again from the original solution from the first run, with the bounds set even looser. Surprisingly, this time the solver failed to reach even that same solution. How is this possible when the new bounds enclose the old bounds? In general, would relaxing the bounds (or constraints) help?
Actually, I have a number of data sets to which I apply the same function with the same solver. For some sets it works all right, but for others it does not. The only difference is a slight variation in the values contained in the data sets. Is FMINCON usually this sensitive?
If this is what you'd suggest ("No feasible point - Check nonlinear constraints", http://www.mathworks.com/help/toolbox/optim/ug/br44iv5-1.html#br44i73), could you please set up a simple example? I do not understand how that works or how to carry it out.
Lastly, how different are the results from LSQNONLIN and FMINCON? If FMINCON with the Interior-Point algorithm works from the KKT conditions, then, mathematically speaking, it should find a (local) minimum satisfying the constraints/bounds. Also, if LSQNONLIN truly searches the region for a (local) minimum, it should not yield something too different from what FMINCON finds, should it?
It seems that I might have to ask more questions even after you answer those above. :'(
Thanks in advance,
Jason
Since you find fmincon interior-point to converge faster than lsqnonlin,
I suspect that you formulated your problem incorrectly for lsqnonlin.
Did you pass lsqnonlin a vector of function values, or did you sum the
squares of the function values and use that as your objective function?
I suspect you passed the sum of squares. Don't do that; it slows the
solution quite a bit.
http://www.mathworks.com/help/toolbox/optim/ug/lsqnonlin.html#f662559
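For example (a sketch with made-up data and a hypothetical two-parameter
exponential model): pass lsqnonlin the residual vector and let it form
the sum of squares internally, so it can exploit the least-squares
structure of the problem.

    % Hypothetical data and model, for illustration only
    t = (0:9)';
    y = 2*exp(-0.5*t) + 0.01*randn(10,1);
    model = @(p) p(1)*exp(-p(2)*t);

    % Right: return the residual VECTOR; lsqnonlin squares and sums it
    res = @(p) model(p) - y;
    p = lsqnonlin(res, [1; 1]);

    % Slow: summing the squares yourself hides the least-squares
    % structure from the solver
    ssq = @(p) sum((model(p) - y).^2);
    % p = lsqnonlin(ssq, [1; 1]);  % finds the same minimizer, inefficiently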
For an oversimplified discussion of how solvers work, see
http://www.mathworks.com/help/toolbox/optim/ug/br44i40.html#brhkghv-65
Solvers try to "walk downhill," which can lead to a local minimum. You
might want a global minimum. And if you start from an infeasible point,
solvers might not be able to find any solution.
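For instance (a sketch with a toy objective, not your function), the
same bounded problem started from two different points lands in two
different local minima:

    % Toy 1-D objective with several local minima
    f = @(x) sin(3*x) + 0.1*x.^2;
    opts = optimset('Algorithm','interior-point','Display','off');

    % Identical problem, different starting points, different answers
    xa = fmincon(f, -2, [],[],[],[], -4, 4, [], opts)  % one local minimum
    xb = fmincon(f,  2, [],[],[],[], -4, 4, [], opts)  % a different one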
Good luck,
Alan Weiss
MATLAB mathematical toolbox documentation
Changing the bounds, whether loosening or tightening them, can certainly
change the solution. For example, consider the trivial 1-D function
f(x) = x minimized with lb = 0, ub = 1. The minimum will obviously be at
x = 0 in this case. Now loosen the lower bound to lb = -1. The new
minimum will be at x = -1. These are the facts of life with constrained
optimization.
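In MATLAB terms, a quick check of that 1-D example (a sketch):

    f = @(x) x;
    opts = optimset('Algorithm','interior-point','Display','off');

    x1 = fmincon(f, 0.5, [],[],[],[],  0, 1, [], opts)  % lb =  0 -> x1 = 0
    x2 = fmincon(f, 0.5, [],[],[],[], -1, 1, [], opts)  % lb = -1 -> x2 = -1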
> Since you find fmincon interior-point to converge faster than lsqnonlin,
> I suspect that you formulated your problem incorrectly for lsqnonlin.
> Did you pass lsqnonlin a vector of function values, or did you sum the
> squares of the function values and use that as your objective function?
> I suspect you passed the sum of squares. Don't do that; it slows the
> solution quite a bit.
Regards,
Jason
> Surprisingly, the solver (FMINCON) seems to be quite sensitive to the
> user-supplied initial points. Is this common?
Yes, especially for complicated surfaces.
A simple quadratic, a*x^2 + b*x + c, has exactly one extremum: one
minimum (if a > 0) or else one maximum (if a < 0), never both except in
degenerate cases.
Extend the quadratic into two variables, a*x^2 + b*y^2 + c*x*y + d*x +
e*y + f. For any one fixed y, the equation reduces to the situation
above, so there is a minimum or maximum in x for _each_ y. But there are
an infinite number of different y, so there are now an infinite number
of such minima or maxima. How can any program sample them all in a
finite number of steps?
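To make that concrete (a small sketch with made-up coefficients): for a
fixed y, setting the x-derivative 2*a*x + c*y + d to zero gives
x*(y) = -(c*y + d)/(2*a), a different directional extremum for every y.

    % Made-up coefficients, for illustration only
    a = 1; b = 1; c = 3; d = -1; e = 0; f = 0;
    g = @(x,y) a*x.^2 + b*y.^2 + c*x.*y + d*x + e*y + f;

    xstar = @(y) -(c*y + d)/(2*a);   % solves 2*a*x + c*y + d = 0

    % One minimizing x for each fixed y: infinitely many of them
    for y = -1:0.5:1
        fprintf('y = %4.1f : x*(y) = %6.3f, g = %7.3f\n', ...
                y, xstar(y), g(xstar(y), y));
    end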
Now extend again to something with larger exponents or more variables
or that uses non-polynomial functions. How many minima or maxima do you
have now? In the quadratic case you might be able to use gradient
information effectively, but as you increase the complexity, the
gradient gives you more and more information about local conditions
rather than about global trends -- partly because the further away (more
global) you get in your calculations of the function differences, the
more wrong the polynomial approximations get (and a numeric gradient is
effectively a polynomial approximation).
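A tiny illustration of that last point (a sketch, not tied to any
particular problem): a central finite difference matches the true
derivative well for small steps, but the error grows as the sampling
distance grows.

    f  = @(x) sin(3*x) + 0.1*x.^2;
    df = @(x) 3*cos(3*x) + 0.2*x;    % true derivative

    x0 = 0.7;
    for h = [1e-4 1e-2 1e-1 1]
        fd = (f(x0 + h) - f(x0 - h)) / (2*h);   % central difference
        fprintf('h = %6.0e : fd = %8.4f, true = %8.4f\n', h, fd, df(x0));
    end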
Global searches are _hard_ to do efficiently while maintaining accuracy
on anything but simple functions.
A local minimum is not the same thing as a directional minimum (a
minimum in x for each fixed y, etc.), as you seem to imply, Walter.
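A classic illustration (g(x,y) = x^2 - y^2): for every fixed y, x = 0
minimizes over x, yet (0,0) is a saddle point, not a local minimum.

    g = @(x,y) x.^2 - y.^2;
    g(0, 0)      % 0 at the directional minimum
    g(0, 0.1)    % -0.01 < 0 nearby, so (0,0) is no local minimum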
Bruno