iter_cb abort with method = "least_squares" not producing updated parameters


Patrick Greenfield

Sep 9, 2017, 4:26:10 PM
to lmfit-py
Greetings all,

I have defined a non-linear least-squares residual function that I am trying to minimize with lmfit's "least_squares" method (I am not using "leastsq" because its unconstrained nature produces NaN/inf function values for my problem), and I am having trouble figuring out how to stop the minimization using "iter_cb" -- for example, with an if statement based on the number of iterations (shown below in the function "per_iteration"). While I can define an "iter_cb" function that stops the minimizer after a certain number of iterations, I cannot seem to access the parameters as updated by the minimization before the stopping criterion was hit. That is, I am unclear how to access the last updated parameter values after the "iter_cb" function returns True.

For my exercise, there are 130 parameters to minimize, and from what I understand "iter_cb" can currently only stop the fit once the number of iterations exceeds the number of parameters in the minimization.

Below is the "iter_cb" function that I have defined (the iteration stopping criterion of 300 is just for testing; with the initial 130 parameters the sum of squared residuals is roughly 100, and after 300 function calls it reduces to roughly 2):

def per_iteration(pars, iter, resid, *args, **kws):
    print(" ITER ", iter, "RESID", sum(resid**2))
    if iter > 300:
        param_calib = pars
        return True
    else:
        return False


And also the minimizer and fit:

fitter = lmfit.Minimizer(swaption_atm_otm_ois_resid_lmm_dd_rho, params, iter_cb=per_iteration,
                         fcn_args=(mats_swaptions, mats_swaps, P, tau, fwds_ois, beta_ois,
                                   us_atm_swaption_blk_ois_prices, n, N, T,
                                   mats_swaption_otm_ois, k_otm,
                                   us_otm_swaption_blk_ois_prices))
fitter.minimize(method="least_squares")

With the above "iter_cb" function, the fit stops after 300 iterations with the following error:

(' ITER ', 301, 'RESID', 2.1894712267310341)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-202-ddfdd173b25b> in <module>()
      1 fitter = lmfit.Minimizer(swaption_atm_otm_ois_resid_lmm_dd_rho, params,iter_cb=per_iteration,fcn_args=(mats_swaptions,mats_swaps,P,tau,fwds_ois,beta_ois,us_atm_swaption_blk_ois_prices,n,N,T,mats_swaption_otm_ois,k_otm,us_otm_swaption_blk_ois_prices))
----> 2 fitter.minimize(method="least_squares")

/usr/local/lib/python2.7/site-packages/lmfit/minimizer.pyc in minimize(self, method, params, **kws)
   1647                         val.lower().startswith(user_method)):
   1648                     kwargs['method'] = val
-> 1649         return function(**kwargs)
   1650 
   1651 

/usr/local/lib/python2.7/site-packages/lmfit/minimizer.pyc in least_squares(self, params, **kws)
   1216                             bounds=(lower_bounds, upper_bounds),
   1217                             kwargs=dict(apply_bounds_transformation=False),
-> 1218                             **kws)
   1219 
   1220         for attr in ret:

/usr/local/lib/python2.7/site-packages/scipy/optimize/_lsq/least_squares.pyc in least_squares(fun, x0, jac, bounds, method, ftol, xtol, gtol, x_scale, loss, f_scale, diff_step, tr_solver, tr_options, jac_sparsity, max_nfev, verbose, args, kwargs)
    906         result = trf(fun_wrapped, jac_wrapped, x0, f0, J0, lb, ub, ftol, xtol,
    907                      gtol, max_nfev, x_scale, loss_function, tr_solver,
--> 908                      tr_options.copy(), verbose)
    909 
    910     elif method == 'dogbox':

/usr/local/lib/python2.7/site-packages/scipy/optimize/_lsq/trf.pyc in trf(fun, jac, x0, f0, J0, lb, ub, ftol, xtol, gtol, max_nfev, x_scale, loss_function, tr_solver, tr_options, verbose)
    126         return trf_bounds(
    127             fun, jac, x0, f0, J0, lb, ub, ftol, xtol, gtol, max_nfev, x_scale,
--> 128             loss_function, tr_solver, tr_options, verbose)
    129 
    130 

/usr/local/lib/python2.7/site-packages/scipy/optimize/_lsq/trf.pyc in trf_bounds(fun, jac, x0, f0, J0, lb, ub, ftol, xtol, gtol, max_nfev, x_scale, loss_function, tr_solver, tr_options, verbose)
    380             cost = cost_new
    381 
--> 382             J = jac(x, f)
    383             njev += 1
    384 

/usr/local/lib/python2.7/site-packages/scipy/optimize/_lsq/least_squares.pyc in jac_wrapped(x, f)
    864                 J = approx_derivative(fun, x, rel_step=diff_step, method=jac,
    865                                       f0=f, bounds=bounds, args=args,
--> 866                                       kwargs=kwargs, sparsity=jac_sparsity)
    867                 if J.ndim != 2:  # J is guaranteed not sparse.
    868                     J = np.atleast_2d(J)

/usr/local/lib/python2.7/site-packages/scipy/optimize/_numdiff.pyc in approx_derivative(fun, x0, method, rel_step, f0, bounds, sparsity, args, kwargs)
    357 
    358     if sparsity is None:
--> 359         return _dense_difference(fun_wrapped, x0, f0, h, use_one_sided, method)
    360     else:
    361         if not issparse(sparsity) and len(sparsity) == 2:

/usr/local/lib/python2.7/site-packages/scipy/optimize/_numdiff.pyc in _dense_difference(fun, x0, f0, h, use_one_sided, method)
    385             x = x0 + h_vecs[i]
    386             dx = x[i] - x0[i]  # Recompute dx as exactly representable number.
--> 387             df = fun(x) - f0
    388         elif method == '3-point' and use_one_sided[i]:
    389             x1 = x0 + h_vecs[i]

TypeError: unsupported operand type(s) for -: 'NoneType' and 'float'

When I inspect the fitter's parameters via

fitter.params.values()

the results are just the initial parameters.

Perhaps the fitter is returning the last updated "calibrated" parameters and I just don't know how to access them; in any case, I am unclear about either a) how to extract the last updated parameters, or b) whether I am doing something more globally incorrect. Any help would be appreciated.

Best regards, and thank you in advance for your help and thoughts,

Patrick

Matt Newville

Sep 10, 2017, 6:46:40 PM
to lmfit-py
Hi Patrick,


The name "iter_cb" is potentially confusing, since "iteration" appears to have different usages for different solvers.  With lmfit, the "iter_cb" is called just after each evaluation of the objective function, just before the results are passed back to the solver. 

Certainly for `leastsq`, and almost certainly for the `least_squares` 'trf' and 'dogbox' methods, the objective function will generally be called Nvarys (maybe +1 or so) times for each of the early "iterations" of the fit -- this is done to build the Jacobian (derivatives) matrix used to decide how to update the candidate values for the parameters. For this reason, the fit needs to do at least Nvarys evaluations. It might actually need to do more, and from your report it looks like `least_squares` might require more initial evaluations while it constructs its Jacobian.

Aborting a fit by having your callback return `True` does render the fit "unsuccessful", and the fit results should really not be taken as meaningful. The `results.params` should be updated, but may not have changed significantly from their initial values if you abort too early.

That is, if your idea is to abort after some number of "iterations" of the fit procedure, you may want to multiply that number by Nvarys. I suggest letting the fit run at least 3*Nvarys steps before aborting, unless you are using "abort" to mean "cancel the fit due to an impossible condition, user input, etc." And be prepared for nearly useless results.
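
For what it's worth, one way around "the aborted fit does not give me updated parameters" is to snapshot the parameters inside the callback itself, since the callback sees them on every function evaluation. A minimal sketch (the `last_params` container and the 300-call threshold are just illustrative):

import copy

last_params = {"snapshot": None}   # holds the most recently seen parameter values

def per_iteration(pars, iteration, resid, *args, **kws):
    # the callback sees the current parameters on every objective-function
    # call, so save a copy here rather than relying on the aborted result
    last_params["snapshot"] = copy.deepcopy(pars)
    print(" ITER ", iteration, "RESID", sum(resid**2))
    return iteration > 300   # returning True asks lmfit to abort

After the (aborted) fit, `last_params["snapshot"]` holds the values from the final function evaluation, whatever the solver leaves in `fitter.params`.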

FWIW, it looks like we are more careful with aborted fits for `leastsq` than for `least_squares`, clearly setting a flag indicating that the fit was aborted. We should probably do that for `least_squares` too.

A couple other things to consider:

a) If you are using `iter_cb` to avoid wasting time polishing an already pretty good fit, you might also consider relaxing the fit tolerances, which default to rather high precision (in the 1.e-7 to 1.e-8 range). If you have a lot of variables, or just want to explore the trends without being perfect, increasing these to 1.e-3 or so might be useful.
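
For example (a sketch -- the 1.e-3 values are illustrative; as your traceback shows, extra keyword arguments are passed through to `scipy.optimize.least_squares`):

# loosen the convergence tolerances so the fit stops sooner;
# tighten them again once the fit looks reasonable
result = fitter.minimize(method="least_squares", ftol=1.e-3, xtol=1.e-3, gtol=1.e-3)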

b) Parameter bounds work with `leastsq` and every other fitting method in lmfit, not only with `least_squares`. It is true that if a value hits a boundary, the fit has a very hard time figuring out the derivatives (and so how to update the value for the next step) and probably won't be able to estimate uncertainties. But if bounded parameters are actually hitting their bounds, then the concepts used to think about uncertainties become unclear (to me, anyway). I believe `least_squares` is meant to handle this a bit better in theory, but I don't know how that holds up in practice.

That is not to say you should not use `least_squares`. We try to support it, but that support may be a bit rough around the edges. It is a rather complicated beast with interacting options -- not so much because of the bounds, but the choice of methods, cost functions, etc.: we handle these concepts differently in lmfit, which makes it somewhat challenging to wrap well.

Hope that helps.

--Matt

Patrick Greenfield

Sep 12, 2017, 10:59:44 AM
to lmfi...@googlegroups.com
Hello Matt,

Thank you very much for your reply. In my particular minimization Nvarys=130, so I've tried extending the "iter_cb" iteration stop to 2000+, to no avail: the resulting 'results.params' (in my case 'fitter.params') still hold only the initial values. My only workaround is to print the parameters to the screen with a sufficiently high number of decimal places so that I can examine the fit.

I will try modifying the tolerances as you suggest.

Patrick


Matt Newville

Sep 12, 2017, 5:34:13 PM
to lmfit-py
Hi Patrick,


I would also suggest thinking about trying 'leastsq' -- that's a simple change to make, and it might be interesting if it shows different behavior. I don't know why `least_squares` with the 'trf' method would behave very differently, but it might.
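
That is a one-line change (a sketch; treating `result.success` as False for an aborted fit is my reading of the current behavior):

# same objective function and callback, different backend
result = fitter.minimize(method="leastsq")
print(result.success)              # False when the callback aborted the fit
lmfit.report_fit(result.params)    # inspect how far the values moved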

But also, I think that using 'iter_cb' to intentionally stop a fit at an arbitrary number of function evaluations is sometimes going to give results where "the values did not change very much", especially if you set that number of function evaluations too low. I would guess that 7*Nvarys would not be too low to see real changes, but maybe your problem is especially challenging?

Another thing that might be simple to try would be to freeze half (or more) of your variables and see whether the remaining variables actually get refined in a reasonable number of iterations; see the sketch below. That is, while I have every confidence that you could write an objective function with 130 variables that all really affect the fit and are robust variables, I doubt that I could do that without carefully building it up from components and testing along the way.
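
Something like this (the name filter is hypothetical -- adapt it to however your 130 parameters are actually named):

# freeze a (hypothetical) subset of the parameters ...
for name, par in params.items():
    if name.startswith("rho_"):    # hypothetical naming pattern
        par.vary = False

# ... and re-run the fit with only the remaining variables free
result = fitter.minimize(method="least_squares")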

--Matt
