confidence interval of fitted parameters

23 views
Skip to first unread message

Sebastien Binet

unread,
Mar 17, 2025, 11:30:14 AMMar 17
to gonum-dev
hi there,

I am trying to replicate a piece of code originally written in MATLAB, in Go, with Gonum.

There's a call to 'fit(...)' to extract some parameters with a non-linear fit.
so far so good, I could leverage gonum/optimize to get results that looked similar to the MATLAB ones.

but the orignal piece of code also computes a 95% confidence interval w/ 'confint' on the resulting fit.

I believe I know how to compute a 95% confidence interval on a given weighted sample:

```go
func ConfidenceInterval(lvl float64, xs, weights []float64) (lo, hi float64) {
var (
mean, std = stat.MeanStdDev(xs, weights)

n = float64(len(xs))
df = n - 1

t = distuv.StudentsT{
Mu: 0,
Sigma: 1,
Nu: df,
}.Quantile(0.5 * (1 + lvl))

merr = t * std / math.Sqrt(n)
)

lo = mean - merr
hi = mean + merr
return lo, hi
}

func foo() {
lo, hi := ConfidenceInterval(0.95, xs, ws)
}
```

(this could be added to gonum as an example, BTW)

but, to compute a confidence interval on the best parameters from the fit, I would need to get my hands on the variance-covariance matrix from the fit (what scipy.optimize.curve_fit calls 'pcov').

IIRC, this can be approximated by inversing the Hessian at the provided solution and rescaling by the goodness-of-fit at the provided solution:

```
func getPcov(xs, ys, popt []float64, f func(x float64, ps []float64) float64) *mat.SymDense {
gof := newGoF(f1d.X, f1d.Y, popt, func(x float64) float64 {
return f(x, popt)
})
inv := mat.NewSymDense(len(popt), nil)
f1d.Hessian(inv, popt)
pcov := mat.NewDense(2, 2, nil)
{
var chol mat.Cholesky
if ok := chol.Factorize(inv); !ok {
panic("cov-matrix not positive semi-definite")
}

err := chol.InverseTo(inv)
if err != nil {
panic(err)
}
pcov.Copy(inv)
}
pcov.Scale(gof.SSE/float64(len(f1d.X)-len(popt)), pcov)
return pcov
}

type GoF struct {
SSE float64 // Sum of squares due to error
Rsquare float64 // R-Square is the square of the correlation between the response values and the predicted response values
NdF int // Number of degrees of freedom
AdjRsquare float64 // Degrees of freedom adjusted R-Square
RMSE float64 // Root mean squared error
}

func newGoF(xs, ys, ps []float64, f func(float64) float64) GoF {
switch {
case len(xs) != len(ys):
panic("invalid lengths")
}

var gof GoF

var (
ye = make([]float64, len(ys))
nn = float64(len(xs) - 1)
vv = float64(len(xs) - len(ps))
)

for i, x := range xs {
ye[i] = f(x)
dy := ys[i] - ye[i]
gof.SSE += dy * dy
}

gof.Rsquare = stat.RSquaredFrom(ye, ys, nil)
gof.AdjRsquare = 1 - ((1 - gof.Rsquare) * nn / vv)
gof.RMSE = math.Sqrt(gof.SSE / float64(len(ys)-len(ps)))
gof.NdF = len(ys) - len(ps)

return gof
}
```

am I on a non completely crazy track ?
(my 'getPcov' and what scipy returns do match for a little toy model with a 1000-large sample, but...)

cheers,
-s

PS: if the math above is sound, this could perhaps also be provided as an example)
Reply all
Reply to author
Forward
0 new messages