I am getting obviously incorrect results from eval_uncertainty() in my application of lmfit with an expression model.
The best fit of the model for the predicted y compared with observations appears to be a good result, but the prediction or confidence intervals of the predicted y from eval_uncertainty() appear to be very wrong.
I am not sure if I am doing something wrong, or if it is a bug in lmfit. Any advice is appreciated.
How do I specify whether the output is prediction intervals or confidence intervals (I want 95% prediction intervals for the uncertainty of predicted y at new observations)?
Does eval_uncertainty() use the delta-method to estimate the confidence intervals or prediction intervals for the predicted y?
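To clarify what I mean by the delta method: my understanding is that the uncertainty of the fitted curve is propagated from the parameter covariance matrix via the model's Jacobian. Here is a minimal numpy-only sketch, with made-up parameter values and a made-up covariance matrix standing in for what lmfit would report in out.params and out.covar:

```python
import numpy as np

# Hypothetical best-fit parameters and covariance for y = b1 + b2*exp(b3*x)
# (illustrative values only, NOT from a real fit)
b1, b2, b3 = 1.0, 0.5, -2.0
covar = np.array([[1e-4, 0.0,  0.0 ],
                  [0.0,  1e-3, 2e-4],
                  [0.0,  2e-4, 1e-3]])

xpred = np.linspace(0.0, 1.0, 5)

# Jacobian of the model with respect to (b1, b2, b3) at each xpred
J = np.column_stack([np.ones_like(xpred),                # d/db1
                     np.exp(b3 * xpred),                 # d/db2
                     b2 * xpred * np.exp(b3 * xpred)])   # d/db3

# Delta method: var(ypred) ~= diag(J @ C @ J.T); take sqrt for 1-sigma
dy = np.sqrt(np.einsum('ij,jk,ik->i', J, covar, J))
```

Is this (or something equivalent) what eval_uncertainty() is doing internally?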
P.S. Below is my Python code, which is also attached along with the example data file and output plots.
# - - -
# Example of nonlinear regression for Nina's project using hard clam (Mercenaria mercenaria) data from Ries et al 2009
# using the lmfit Python package
# Greg Pelletier (gjpel...@gmail.com)
# - - -
# Non-Linear Least-Squares Minimization and Curve-Fitting for Python
# https://lmfit.github.io//lmfit-py/index.html
import matplotlib.pyplot as plt
# the following line is needed for making plots in ipython:
%matplotlib auto
import numpy as np
from numpy import loadtxt
from numpy import exp, linspace, sin
import lmfit as fit
from lmfit.models import ExpressionModel
# import data
data = loadtxt('/mnt/c/nina/hanna/ries_hardclam.dat')
x = data[:, 0]
y = data[:, 1]
# find 25%tile y to use for initial b1
y25th = np.percentile(y,25)
# build the expression model and set the initial parameter values
mod = ExpressionModel('b1 + b2 * exp(b3 * x)')
pars = fit.Parameters()
pars['b1'] = fit.Parameter(name='b1', value=y25th, min=-np.inf, max=np.inf)
pars['b2'] = fit.Parameter(name='b2', value=0, min=-np.inf, max=np.inf)
pars['b3'] = fit.Parameter(name='b3', value=0, min=-np.inf, max=np.inf)
print(pars)
# Parameters([('b1', <Parameter 'b1', value=0.326622, bounds=[-inf:inf]>), ('b2', <Parameter 'b2', value=0, bounds=[-inf:inf]>), ('b3', <Parameter 'b3', value=0, bounds=[-inf:inf]>)])
# find the best-fit for the model
out = mod.fit(y, pars, x=x)
# - - -
# Using initial b1 = y25th (good result):
# These results are nearly identical to what I found with matlab using this data set
print(out.fit_report())
# [[Model]]
# Model(_eval)
# [[Fit Statistics]]
# # fitting method = leastsq
# # function evals = 1451
# # data points = 25
# # variables = 3
# chi-square = 2.0705e-08
# reduced chi-square = 9.4115e-10
# Akaike info crit = -516.793759
# Bayesian info crit = -513.137132
# R-squared = 0.82814534
# [[Variables]]
# b1: 6.9536e-05 +/- 1.6571e-05 (23.83%) (init = -1.665168e-05)
# b2: -675692.875 +/- 5434609.15 (804.30%) (init = 0)
# b3: -22.2179803 +/- 8.14731384 (36.67%) (init = 0)
# [[Correlations]] (unreported correlations are < 0.100)
# C(b2, b3) = +0.9999
# C(b1, b3) = +0.8711
# C(b1, b2) = +0.8671
# - - -
# Plot best fit ypred prediction line at interpolated/extrapolated xpred values
# This result does a good job of matching the results of matlab for the best fitting parameters b1, b2, and b3
# x values to use for the smooth lines of the prediction interval plot
xpred = linspace(.95, 1.3, 100)
# extract parameter values from the best fit output and make user-defined function for the regression
d = out.params
b1 = d['b1'].value
b2 = d['b2'].value
b3 = d['b3'].value
param = [b1, b2, b3]
f = lambda param,xval : param[0] + param[1] * exp(param[2] * xval)
# predicted y values at xpred
ypred = f(param,xpred)
plt.figure()
plt.plot(x, y, '.', label='observations')
plt.plot(x, out.init_fit, '--', label='initial fit')
plt.plot(xpred, ypred, '-', label='interpolated/extrapolated best fit')
plt.legend()
plt.savefig('/mnt/c/nina/hanna/python_ries_hardclam_bestfit_goodresult_v06.png',dpi=300)
# - - -
# Plot prediction intervals
# This is my first attempt to plot prediction intervals using lmfit
# Obviously the result is not correct.
# I am not sure if I am doing something wrong, or if there is a bug in lmfit that gives incorrect results
plt.figure()
plt.plot(x, y, '.', label='observations')
plt.plot(xpred, ypred, '-', label='interpolated/extrapolated best fit')
dy = out.eval_uncertainty(x=xpred,sigma=2)
plt.fill_between(xpred, ypred-dy, ypred+dy,
color="#ABABAB", label=r'2-$\sigma$ uncertainty band')
plt.legend()
plt.savefig('/mnt/c/nina/hanna/python_ries_hardclam_predictioninterval_badresult_v06.png',dpi=300)
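In case it helps pinpoint my confusion: if eval_uncertainty() returns only a confidence band for the fitted curve, then my understanding is that a prediction interval for a new observation would also include the residual variance. A sketch of the conversion I have in mind, with made-up numbers standing in for the dy returned by out.eval_uncertainty(x=xpred, sigma=2) and for out.redchi (I am not asserting this is how lmfit defines its band):

```python
import numpy as np

# Illustrative stand-ins (NOT real fit output):
dy_conf = np.array([0.02, 0.03, 0.05])  # 2-sigma confidence half-widths
redchi = 1.0e-3                         # reduced chi-square (residual variance)

# 2-sigma prediction half-widths: add residual variance to the
# 1-sigma confidence variance, then rescale back to 2 sigma
dy_pred = 2.0 * np.sqrt((dy_conf / 2.0)**2 + redchi)
```

Is this the right way to go from the eval_uncertainty() band to a 95% prediction interval, or does lmfit provide this directly?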