Yeah, as I look at this, it does seem like maybe we should not be coercing the input independent variables at all. It should be totally fine to send uint16 images, if that is what you have. Even as it is, if the independent data is in more complex structures (say, a dict!), then we don't look inside to coerce those.
OTOH, we probably need to coerce the output to float64, before it gets sent off to the solver. I would lean toward doing this.
In fairness to how we are doing it currently, coercing "late" does risk losing some sensitivity: small changes in the result might be too small to alter a float32 value. It seems like we're probably going to see such issues at some point.
As you suggest, turning up the default value of `epsfcn` might help mitigate those problems. It currently takes the default value of around 2e-16. Initial relative step sizes are given by the square root of that (so 1.5e-7 or so). I agree that `epsfcn` should be more like 1.e-10 or even 1.e-8, so the initial relative steps are 1.e-5 or 1.e-4, which are more assured to cause noticeable changes for float32 values.
Doing this currently causes some tests to fail because values are not "close enough" to expected values, but I think we can relax those tests.
I think this topic ("don't coerce input independent variables, do coerce output, increase epsfcn" all sort of go together) probably needs more discussion and review as Issues/Pull Requests on Github.
Thanks,
--Matt