stderr=0.0 for expression of a single expression

21 views
Skip to first unread message

Shane Caldwell

unread,
Mar 25, 2023, 11:51:22 AM3/25/23
to lmfit-py
I'm wondering if this is a bug and wanted to ask here first.

Summary

If I add an expression parameter to a Model, then a second expression parameter that refers only to the first (and no varied parameters) the stderr of the second after fitting is zero. I checked this page of the docs and have the understanding that the second expression should show a correct nonzero stderr.

Example

Case 1: Add expression parameter "period" to SineModel

Here we have correct behavior.

import numpy as np
from lmfit.models import SineModel

model = SineModel()
params = model.make_params()
params.add("period", expr="1/frequency")

xs = np.linspace(0, 1, 21)
ys = np.sin(2 * np.pi * 1.23 * xs) + 0.01 * np.random.normal(size=ys.shape)

params["amplitude"].value = 1.0
params["frequency"].value = 2 * np.pi * 1.23
params["shift"].value = 0.0
fit = model.fit(ys, x=xs, params=params)
print(fit.fit_report())

[[Model]] Model(sine) [[Fit Statistics]] # fitting method = leastsq # function evals = 13 # data points = 21 # variables = 3 chi-square = 0.00205900 reduced chi-square = 1.1439e-04 Akaike info crit = -187.831248 Bayesian info crit = -184.697681 R-squared = 0.99979523 [[Variables]] amplitude: 1.00287241 +/- 0.00335121 (0.33%) (init = 1) frequency: 7.72239860 +/- 0.01068709 (0.14%) (init = 7.728318) shift: 0.00406073 +/- 0.00605504 (149.11%) (init = 0) period: 0.12949345 +/- 1.7921e-04 (0.14%) == '1/frequency' [[Correlations]] (unreported correlations are < 0.100) C(frequency, shift) = -0.839

Case 2: Like 1, but add an expression "period2" = "2*period"

Here is the problem.

import numpy as np
from lmfit.models import SineModel

model = SineModel()
params = model.make_params()
params.add("period", expr="1/frequency")
params.add("period2", expr="2*period")

xs = np.linspace(0, 1, 21)
ys = np.sin(2 * np.pi * 1.23 * xs) + 0.01 * np.random.normal(size=ys.shape)

params["amplitude"].value = 1.0
params["frequency"].value = 2 * np.pi * 1.23
params["shift"].value = 0.0
fit = model.fit(ys, x=xs, params=params)
print(fit.fit_report())

[[Model]] Model(sine) [[Fit Statistics]] # fitting method = leastsq # function evals = 13 # data points = 21 # variables = 3 chi-square = 0.00156582 reduced chi-square = 8.6990e-05 Akaike info crit = -193.581181 Bayesian info crit = -190.447614 R-squared = 0.99984395 [[Variables]] amplitude: 1.00258986 +/- 0.00292230 (0.29%) (init = 1) frequency: 7.72949993 +/- 0.00932408 (0.12%) (init = 7.728318) shift: 4.3461e-04 +/- 0.00527820 (1214.47%) (init = 0) period: 0.12937448 +/- 1.5606e-04 (0.12%) == '1/frequency' period2: 0.25874895 +/- 0.00000000 (0.00%) == '2*period' [[Correlations]] (unreported correlations are < 0.100) C(frequency, shift) = -0.839

Case 3: Like 1, but add an expression "period3" = "period*amplitude".

Here we recover correct behavior again.

import numpy as np
from lmfit.models import SineModel

model = SineModel()
params = model.make_params()
params.add("period", expr="1/frequency")
params.add("period3", expr="period * amplitude")

xs = np.linspace(0, 1, 21)
ys = np.sin(2 * np.pi * 1.23 * xs) + 0.01 * np.random.normal(size=ys.shape)

params["amplitude"].value = 1.0
params["frequency"].value = 2 * np.pi * 1.23
params["shift"].value = 0.0
fit = model.fit(ys, x=xs, params=params)
print(fit.fit_report())

[[Model]] Model(sine) [[Fit Statistics]] # fitting method = leastsq # function evals = 13 # data points = 21 # variables = 3 chi-square = 0.00164905 reduced chi-square = 9.1614e-05 Akaike info crit = -192.493671 Bayesian info crit = -189.360104 R-squared = 0.99983537 [[Variables]] amplitude: 1.00148023 +/- 0.00299836 (0.30%) (init = 1) frequency: 7.74910426 +/- 0.00958574 (0.12%) (init = 7.728318) shift: -0.00910251 +/- 0.00541644 (59.50%) (init = 0) period: 0.12904717 +/- 1.5963e-04 (0.12%) == '1/frequency' period3: 0.12923819 +/- 3.8693e-04 (0.30%) == 'period * amplitude' [[Correlations]] (unreported correlations are < 0.100) C(frequency, shift) = -0.839
Version Information

Python: 3.9.11 (main, Apr 28 2022, 23:52:57)
[Clang 13.1.6 (clang-1316.0.21.2.3)]

lmfit: 1.1.0, scipy: 1.10.1, numpy: 1.23.5,asteval: 0.9.29, uncertainties: 3.1.7

Shane Caldwell

unread,
Mar 25, 2023, 11:56:17 AM3/25/23
to lmfit-py
Actually, looking at Case 3 more closely, it looks like there is no error contribution from "period" there either, ie. it is entirely from the "amplitude" dependence.

>>> (3.8693e-04 / 0.12923819) - (0.00299836 / 1.00148023)
9.737000264235796e-10

Renee Otten

unread,
Mar 25, 2023, 11:57:01 AM3/25/23
to lmfi...@googlegroups.com
I suspect that’s because you are adding a parameter that is not actually used in the model. I suppose one could propagate the uncertainty from fitted “period” parameter, but to be honest I see no real reason to do so. 

Best,
Renee

On Mar 25, 2023, at 11:51 AM, Shane Caldwell <caldwel...@gmail.com> wrote:

I'm wondering if this is a bug and wanted to ask here first.
--
You received this message because you are subscribed to the Google Groups "lmfit-py" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lmfit-py+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lmfit-py/8c6a8bde-6840-4ccb-8ae8-a992016d1c1en%40googlegroups.com.

Shane Caldwell

unread,
Mar 25, 2023, 12:10:06 PM3/25/23
to lmfit-py
Thanks for your reply, Renee.

Is your position that the user must always write expressions only in terms of varied parameters if they want valid stderrs? Ie. in Case 3 I would write:

params.add("period3", expr="amplitude / frequency").

I agree that's possible but have two points.
1. As a long-time consumer of Lmfit in a production codebase, I found this behavior surprising. I think it should at least be documented.
2. The reason I was doing this is that it allows building up more complex expressions of interest more succinctly. I think it's fair not to support this, but I don't think it's fair to say "there's no reason" to do so based on the simplicity of the example I provided.

Shane

Renee Otten

unread,
Mar 25, 2023, 1:17:15 PM3/25/23
to lmfi...@googlegroups.com
hi Shane,

I’d need to check the code again in how/when exactly we calculate the uncertainties and whether they get propagated. 

But, my view is that if you’re going to define additional parameters in a model they should be used in the actual fitting. If you want to do error propagation to calculate the uncertainties in another parameter that is based on your fitted parameters, you can (and in my opinion should) do so yourself after the fit; for example, using the “uncertainties” package we use in lmfit as well. 

Having said that it’s possible that I am not understanding your use-case based on the “toy” example you show here. It that’s the case, please do elaborate on what you exactly want to accomplish. 

Best,
Renee


On Mar 25, 2023, at 12:10 PM, Shane Caldwell <caldwel...@gmail.com> wrote:

Thanks for your reply, Renee.

Matt Newville

unread,
Mar 25, 2023, 2:28:13 PM3/25/23
to lmfi...@googlegroups.com
Hi Shane, Renee, 

Sorry for the trouble.  I think that Renee's diagnosis is correct.  Currently, error propagation from variables to constrained parameters happens by making `uncertainties.ufloat` values (from the `uncertainties` package) for each "real" variable.   Then it uses those to calculate the uncertainties in the expressions.  But, as with your example, it does not use those "derived uncertainties" for non-variable parameters.

I would consider this an oversight that we should try to fix, and I think it should be fixable.  




--
--Matt Newville <newville at cars.uchicago.edu630-327-7411
Reply all
Reply to author
Forward
0 new messages