float64 coercion in 25d0a85 breaks our fitting code


John Parejko

unread,
Feb 15, 2023, 4:43:30 PM2/15/23
to lmfit-py
Hello lmfit,

We have some code that uses lmfit on a fairly complicated 2D model to fit images. We found that upgrading to 1.0.3 causes the fit to fail (no errors, but the parameters barely move from their initial values). I've `git bisect`ed it to commit 25d0a85, line 1005, which coerces `data` to float64 [1]. The input data from our images is float32. I'm quite surprised that increasing the data precision prevents the fitter from fitting the data.
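As a guess at the mechanism, here is a toy numpy sketch (not our actual model) of how float32 interacts with finite-difference steps: with a relative step of about sqrt(machine epsilon), a float32 model can return bit-identical values at p and p + step, so the estimated derivative is exactly zero and the parameters never move.

```python
import numpy as np

# Hypothetical toy model (not our code): y = p * x, evaluated once in
# float32 and once in float64.
def model32(p, x):
    return np.float32(p) * x.astype(np.float32)

def model64(p, x):
    return float(p) * x.astype(np.float64)

x = np.linspace(0.0, 10.0, 5)
p = 3.0
step = np.sqrt(2.2e-16) * p   # leastsq's default relative step, ~4.5e-8 here

# Forward-difference slope estimates; the exact derivative is x itself.
fd32 = (model32(p + step, x) - model32(p, x)) / step
fd64 = (model64(p + step, x) - model64(p, x)) / step

print(fd32)  # all zeros: p + step rounds back to p in float32
print(fd64)  # close to x
```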

I didn't see any other problem reports in github or the mailing list related to this upcasting change. Does anyone have suggestions about approaches for mitigating this? I can provide more details, but creating a minimum working example from our code may be tricky.

Thank you in advance,
John

Matt Newville

unread,
Feb 15, 2023, 5:13:57 PM2/15/23
to lmfi...@googlegroups.com
Hi John, 

Hopefully, you have looked at some of the discussions around https://github.com/lmfit/lmfit-py/pull/723 and https://github.com/lmfit/lmfit-py/issues/727?   

The parameter values really must be float64 scalars ("float").  The array to be minimized really must be float64 too.

When using the Model class, the expectation is that the data is float64.  The original intention was that you can pass in any independent data you want, but that the result of the model calculation would also be float64.   We saw enough problems from placing the burden of coercing to float64 on the user that we now force the coercion ourselves for NumPy arrays and pandas Series.   I fear that going back would cause more problems than it solves ;).

It is certainly possible that we could do better, but I'm slightly surprised to hear that something that had been working stopped working at that change.  Maybe you can give us a bit more detail about what you are doing?




--
--Matt Newville <newville at cars.uchicago.edu> 630-327-7411

John Parejko

unread,
Feb 15, 2023, 7:04:36 PM2/15/23
to lmfit-py
Thank you for those links and the quick response.

Reading those discussions led me to dig into our internal model, which generates the model images as float32; that's all the precision our data has, and it's a performance and memory win over float64. The discussion on PR 723 pointed me at the `epsfcn` kwarg to leastsq. Passing an `epsfcn` of around 1e-13 or larger results in a successful fit, without any other changes to our model or to lmfit. I see you had proposed on that PR defaulting `epsfcn` to 1e-10, which would be sufficient for our use case. I don't know whether it would cause problems elsewhere, though, or result in too-large steps for some models.
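Here is a standalone sketch of the same workaround against scipy directly (toy exponential model and parameter values are illustrative, not our pipeline):

```python
import numpy as np
from scipy.optimize import leastsq

# Toy float32 "image" data: a noiseless exponential decay.
x = np.linspace(0, 10, 200).astype(np.float32)
y = (2.5 * np.exp(-x / 4.0)).astype(np.float32)

def residual(p):
    # Model evaluated in float32, mirroring a float32 pipeline.
    model = np.float32(p[0]) * np.exp(-x / np.float32(p[1]))
    return (model - y).astype(np.float32)

# epsfcn=1e-10 gives relative steps of ~1e-5, large enough to register
# in float32; the scipy default (machine epsilon) can leave the fit stuck.
popt, ier = leastsq(residual, [1.0, 1.0], epsfcn=1e-10)
print(popt)  # close to the true values (2.5, 4.0)
```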

Do you foresee us having other problems keeping our internal model in float32, and letting lmfit/scipy cast as necessary when computing residuals? It's been serving us well so far, except for not being able to upgrade from 1.0.2 (which I now have a fix for, with epsfcn).

Thank you,
John

Matt Newville

unread,
Feb 16, 2023, 9:04:05 AM2/16/23
to lmfi...@googlegroups.com
Hi John, 

On Wed, Feb 15, 2023 at 6:05 PM John Parejko <pare...@uw.edu> wrote:
> Thank you for those links and the quick response.
>
> Reading those discussions led me to dig into our internal model, which is generating the model images as float32; that's all the precision our data has, and is a performance and memory boost over float64. The discussion on PR 723 pointed me at the `epsfcn` kwarg to leastsq. Passing epsfcn of around 1e-13 or larger results in a successful fit, without any other changes to our model or to lmfit. I see you had proposed on that PR defaulting epsfcn to 1e-10, which would be sufficient for our use case. I don't know if it would cause other problems elsewhere, though, or result in too-large of steps for some models.
>
> Do you foresee us having other problems keeping our internal model in float32, and letting lmfit/scipy cast as necessary when computing residuals? It's been serving us well so far, except for not being able to upgrade from 1.0.2 (which I now have a fix for, with epsfcn).


Yeah, as I look at this, it does seem like maybe we should not be coercing the input independent variables at all.  It should be totally fine to send uint16 images, if that is what you have.   Even as it is, if the independent data is in more complex structures (say, a dict!), we don't look inside to coerce those.

OTOH, we probably need to coerce the output to float64, before it gets sent off to the solver.  I would lean toward doing this.
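For concreteness, a sketch of what coercing the output could look like (hypothetical helper name, not actual lmfit code):

```python
import numpy as np

def coerce_to_float64(model_func):
    """Hypothetical decorator (not lmfit API): leave the inputs alone,
    but make sure whatever the model returns reaches the solver as a
    float64 ndarray."""
    def wrapped(*args, **kws):
        return np.asarray(model_func(*args, **kws), dtype=np.float64)
    return wrapped

@coerce_to_float64
def gaussian32(x, amp, cen, sig):
    # Model evaluated entirely in float32.
    x32 = np.asarray(x, dtype=np.float32)
    return np.float32(amp) * np.exp(-((x32 - np.float32(cen)) ** 2)
                                    / (2 * np.float32(sig) ** 2))

out = gaussian32(np.linspace(-1, 1, 5), 1.0, 0.0, 0.5)
print(out.dtype)  # float64
```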

In fairness to how we are doing it currently, coercing "late" does risk losing some sensitivity: small changes in the parameters might produce changes in the result that are too small to alter a float32 value. It seems likely we would see such issues at some point.

As you suggest, turning up the default value of `epsfcn` might help mitigate those problems.  It currently takes scipy's default of machine epsilon, around 2.2e-16.  Initial relative step sizes are given by the square root of that (so about 1.5e-8).   I agree that `epsfcn` should be more like 1e-10 or even 1e-8, so the initial relative steps are 1e-5 or 1e-4, which are far more likely to cause noticeable changes in float32 values.
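A quick numpy check of that arithmetic (illustrative, using a float32 parameter value of 1.0):

```python
import numpy as np

p = np.float32(1.0)

default_step = np.sqrt(2.2e-16)  # ~1.5e-8 relative step (epsfcn = machine eps)
proposed_step = np.sqrt(1e-10)   # 1e-5 relative step (epsfcn = 1e-10)

# float32 spacing near 1.0 is ~1.2e-7, so the default step rounds away:
print(np.float32(p + default_step * p) == p)   # True: the step vanishes
# ...while the proposed step is ~84 ulps and survives:
print(np.float32(p + proposed_step * p) == p)  # False
```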
  
Doing this currently causes some tests to fail because values are not "close enough" to expected values,  but I think we can relax those tests.

I think these topics ("don't coerce input independent variables", "do coerce the output", "increase epsfcn"; they all sort of go together) probably need more discussion and review as Issues/Pull Requests on GitHub.

Thanks, 
--Matt



