# Calculating confidence interval AND prediction interval with lmfit

### Ham

Aug 20, 2022, 11:20:53 AM
to lmfit-py
Hi,

I am new here - I am trying to use lmfit to do a simple biexponential fit, and also calculate both the confidence interval (CI) and prediction interval (PI).

I am not sure exactly what is calculated by ModelResult.eval_uncertainty(). It appears to me to calculate the confidence interval. How do I get the prediction interval (i.e., the range of values likely to contain a new observation)? Basically, I have many, many observations, and I think the prediction interval bands should contain about 95% of the observations on my plot - but the bands are really narrow, which suggests they convey uncertainty in the best fit (i.e., the confidence interval).

Thanks!!

### Matt Newville

Aug 20, 2022, 3:28:08 PM
to lmfit-py
Hi,

On Sat, Aug 20, 2022 at 10:20 AM Ham <lmc...@gmail.com> wrote:

> Hi,
>
> I am new here - I am trying to use lmfit to do a simple biexponential fit, and also calculate both the confidence interval (CI) and prediction interval (PI).
>
> I am not sure exactly what is calculated by ModelResult.eval_uncertainty(). It appears to me to calculate the confidence interval.

Yes, given the best-fit values and uncertainties in the parameters and the correlations between them, it estimates the uncertainty in the predicted model at each point (set of independent variables).
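For illustration only - this is not lmfit's actual code, and the single-exponential model and helper name here are hypothetical - the propagation described above can be sketched in plain NumPy: the 1-sigma model uncertainty at each x is dy(x) = sqrt(g(x)ᵀ C g(x)), where g is the gradient of the model with respect to the parameters and C is the parameter covariance matrix.

```python
import numpy as np

def model(x, a, tau):
    """Hypothetical single-exponential model, standing in for a biexponential."""
    return a * np.exp(-x / tau)

def eval_uncertainty_sketch(x, params, cov, h=1e-6):
    """1-sigma model uncertainty via gradient/covariance propagation:
    dy(x)^2 = g(x)^T C g(x), with g the parameter gradient of the model."""
    params = np.asarray(params, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    g = np.empty((params.size, x.size))
    for i in range(params.size):
        hi = params.copy(); hi[i] += h
        lo = params.copy(); lo[i] -= h
        # Central-difference gradient of the model w.r.t. parameter i
        g[i] = (model(x, *hi) - model(x, *lo)) / (2 * h)
    return np.sqrt(np.einsum("ix,ij,jx->x", g, cov, g))
```

Because the covariance matrix includes the off-diagonal correlation terms, correlated parameters can give a narrower band than naive independent error bars would suggest.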

> How do I get the prediction interval (i.e., range of values likely to contain a new observation)?

The calculation doesn't distinguish between values of the independent variable(s) for which data exist and values without existing data: it is all prediction based on the parameters determined from the data. Of course, the estimate will be better at values of the independent variable(s) actually used in the fit than at values far from them. In general, interpolation is more reliable than extrapolation.

> Basically, I have many, many observations, and I think the prediction interval bands should contain about 95% of the observations on my plot - but the bands are really narrow, which suggests they convey uncertainty in the best fit (i.e., the confidence interval).

By default, `ModelResult.eval_uncertainty()` predicts the 1-sigma uncertainties (~68% confidence). You can pass in a different value of `sigma` to change the confidence level - you probably recall that 2-sigma gives ~95% and 3-sigma ~99.7%. Using `sigma > 3` probably indicates that a) you are unjustifiably confident in your data and your model, or b) you know your data collection and reduction processes and your models in such detail that you don't need help from a mailing list ;).
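As a quick check of those coverage numbers (these are standard normal-distribution facts, computed here with only the Python standard library):

```python
from math import erf, sqrt

def coverage(sigma):
    """Fraction of a normal distribution lying within +/- sigma standard
    deviations of the mean, via the normal CDF: erf(sigma / sqrt(2))."""
    return erf(sigma / sqrt(2))

for s in (1, 2, 3):
    print(f"{s}-sigma: {coverage(s):.4f}")
# 1-sigma: 0.6827
# 2-sigma: 0.9545
# 3-sigma: 0.9973
```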

--Matt

### Ham

Aug 21, 2022, 3:00:50 PM
to lmfit-py

I looked at the code for lmfit's eval_uncertainty() function, and there is a URL there describing the approach lmfit's calculation is based on: https://www.astro.rug.nl/software/kapteyn/kmpfittutorial.html#confidence-and-prediction-intervals

The link describes exactly what I want to do - Eqn. 103 and the plot below it (prediction band vs. confidence band). It even includes the Python code to achieve it. Is there a way to achieve that with lmfit now, or if not, do you think this feature could be added to lmfit in the future?

Thanks!
-Ham

### Matt Newville

Aug 21, 2022, 3:33:43 PM
to lmfit-py
On Sun, Aug 21, 2022 at 2:00 PM Ham <lmc...@gmail.com> wrote:

> I looked at the code for lmfit's eval_uncertainty() function, and there is a URL there describing the approach lmfit's calculation is based on: https://www.astro.rug.nl/software/kapteyn/kmpfittutorial.html#confidence-and-prediction-intervals
>
> The link describes exactly what I want to do - Eqn. 103 and the plot below it (prediction band vs. confidence band). It even includes the Python code to achieve it. Is there a way to achieve that with lmfit now, or if not, do you think this feature could be added to lmfit in the future?

Yes, as I said (or tried to say) earlier, lmfit makes no distinction between confidence intervals and prediction intervals: uncertainties in the model can be calculated at any value of the independent variable(s). If you've done a fit of y(x) to a model, then something like

```python
result = my_model.fit(y, params, x=x)
y_best = result.best_fit
dy = result.eval_uncertainty(x=x)
```

will give the uncertainties at the `x` values used to fit the data. But you can also do

```python
y_pred = result.eval(x=x_new)
dy_pred = result.eval_uncertainty(x=x_new)
```

to evaluate predicted values and uncertainties at any `x_new` values you want. How well that prediction works will depend on how far the values of `x_new` are from `x`.
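Note that `eval_uncertainty()` gives the confidence band (uncertainty of the fitted curve). If what you want is the prediction band from the linked tutorial - the band expected to contain ~95% of new observations - a standard least-squares result (this is the textbook relation, not code taken from lmfit or the linked page) is to add the residual variance in quadrature. A sketch with a plain NumPy straight-line fit standing in for an lmfit model; all names here are illustrative:

```python
import numpy as np

# Synthetic data: a straight line plus Gaussian noise (stand-in for real data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.size)

# Ordinary least-squares fit; with lmfit, `dy_conf` would instead come from
# result.eval_uncertainty().
X = np.column_stack([x, np.ones_like(x)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_fit = X @ beta

n, p = x.size, X.shape[1]
s2 = np.sum((y - y_fit) ** 2) / (n - p)   # residual variance
cov = s2 * np.linalg.inv(X.T @ X)         # parameter covariance matrix

# 1-sigma confidence band: uncertainty of the fitted curve itself.
dy_conf = np.sqrt(np.einsum("ij,jk,ik->i", X, cov, X))
# 1-sigma prediction band: adds the scatter of a single new observation.
dy_pred = np.sqrt(dy_conf**2 + s2)
```

With many observations the confidence band shrinks like 1/sqrt(n), but the prediction band stays roughly as wide as the noise - which matches the narrow bands described at the start of this thread.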

As you will see in most other messages here, if you have further questions we very strongly encourage you to post minimal example code that shows what you are trying to do, and any results or error messages you get.

--Matt