Fits using the differential_evolution method give wrong parameters for the minimum


Frederic Noel

Jul 3, 2020, 11:02:23 AM
to lmfi...@googlegroups.com
Hi all,

I hope this is the right place to ask this.
I have been using the lmfit tool for my fits for quite a while now and have been enjoying it. I often use the differential evolution method to find better minima.
However, I discovered that in some of my fits the reported chi^2 does not match an externally calculated chi^2 using the reported final parameters.
Investigating this further, I tried to build a minimal (non-)working example: fitting just a linear model to some linearly distributed data points.
This is my code:
-----------------------------------------------
#!/usr/bin/env python
# coding: utf-8

import numpy as np
from numpy.random import random
from lmfit import minimize, Parameters, Parameter, fit_report

#define residual calculation
def residuals(params, x, modelfct, data=None, weight=None, scalar=False):
   
    paramdict = params.valuesdict()
   
    model = modelfct(x, paramdict)
   
    if data is None:
        return model
    elif weight is None:
        return model - data
    else:
        r = (model - data)*weight
        if scalar:
            return (r*r).sum()
        return r

#define model
def model0(x, paramdict=None):
   
    if paramdict is None:
        m=1.34
        b=-4.5
    else:
        m=paramdict['par_m']
        b=paramdict['par_b']
   
    return m*x+b

# generate some "data" points
n=10
x_data=np.arange(n,dtype=float)
y_data=model0(x_data)+0.5*random(n)
y_err=0.5*random(n)
data=np.stack([x_data,y_data,y_err],axis=-1)

#build parameters
params0 = Parameters()
par_m = Parameter('par_m', value=1., vary=True, min=-1e2, max=1e2, expr=None, brute_step=None)
par_b = Parameter('par_b', value=0., vary=True, min=-1e2, max=1e2, expr=None, brute_step=None)
params0.add_many(par_m,par_b)

# callback to print xi^2 and parameters at each iteration step
def info(params, iter, resid, x, modelfct, data=None, weight=None, scalar=False):
    print("iteration:",iter)
    for param in params:
        print(param,": ",params[param].value)
    print("xi^2",np.sum(resid**2))


result_diff = minimize(residuals, params0, iter_cb=info,
                       method='differential_evolution',
                       args=(data[:, 0], model0),
                       kws={'data': data[:, 1], 'weight': 1./data[:, 2], 'scalar': False},
                       seed=666)
print(fit_report(result_diff))

result_leastsq = minimize(residuals, params0, iter_cb=info,
                          method='leastsq',
                          args=(data[:, 0], model0),
                          kws={'data': data[:, 1], 'weight': 1./data[:, 2]})
print(fit_report(result_leastsq))

----------------------------------------------------------------------

The prints for the differential evolution fit look like this (I shortened them a bit):

.
.
.
iteration: 1051
par_m :  1.371219604552536
par_b :  -4.293066874884573
xi^2 1.7191296062209818
iteration: 1051
par_m :  1.371219604552536
par_b :  -4.293066874884573
xi^2 1.7191296062209818
iteration: 1052
par_m :  -76.56942327989591
par_b :  -4.293066874884573
xi^2 38790133.57808929
iteration: 1053
par_m :  74.7768453852581
par_b :  -4.293066874884573
xi^2 34407410.5376032
iteration: 1054
par_m :  90.35056507851729
par_b :  92.63245655527132
xi^2 74277469.70569123
iteration: 1055
par_m :  90.35056507851729
par_b :  -89.05936415713427
xi^2 34978585.3295675
.
.
.
iteration: 1169
par_m :  1.0936850178317172
par_b :  -4.5703498472581
xi^2 702.9040023042832
iteration: 1170
par_m :  1.371219604552536
par_b :  -3.738403848812368
xi^2 104.9619027026542
iteration: 1171
par_m :  1.371219604552536
par_b :  -4.847597610959724
xi^2 104.91227600870884

The fit_report is this:
[[Fit Statistics]]
    # fitting method   = differential_evolution
    # function evals   = 1050
    # data points      = 10
    # variables        = 2
    chi-square         = 1.71912961
    reduced chi-square = 0.21489120
    Akaike info crit   = -13.6076697
    Bayesian info crit = -13.0024995
[[Variables]]
    par_m:  1.37121960 +/- 0.01000214 (0.73%) (init = 1)
    par_b: -4.84759761 +/- 0.04363044 (0.90%) (init = 0)
[[Correlations]] (unreported correlations are < 0.100)
    C(par_m, par_b) = -0.815

As one can see, the reported chi^2 is the best value over all iterations, but the reported parameters are the ones from the last iteration.

In the least-squares case this does not seem to happen (although there the last iteration also has the best chi^2):

One has for the prints:

iteration: -1
par_m :  1.0
par_b :  0.0
xi^2 3265.741717022397
iteration: 0
par_m :  1.0
par_b :  0.0
xi^2 3265.741717022397
iteration: 1
par_m :  1.0
par_b :  0.0
xi^2 3265.741717022397
iteration: 2
par_m :  1.0000000149006638
par_b :  0.0
xi^2 3265.7417989467594
iteration: 3
par_m :  1.0
par_b :  1.4901161193847656e-06
xi^2 3265.7446913150293
iteration: 4
par_m :  1.3712136626022726
par_b :  -4.291755132717512
xi^2 1.7196891631086437
iteration: 5
par_m :  1.3712136830336732
par_b :  -4.291755132717512
xi^2 1.7196892253485263
iteration: 6
par_m :  1.3712136626022726
par_b :  -4.291755068804676
xi^2 1.719689218495164
iteration: 7
par_m :  1.3712213896957053
par_b :  -4.293073734974456
xi^2 1.7191295992862299
iteration: 8
par_m :  1.3712214101272053
par_b :  -4.293073734974456
xi^2 1.7191295992889075
iteration: 9
par_m :  1.3712213896957053
par_b :  -4.293073671041995
xi^2 1.7191295992876123
iteration: 10
par_m :  1.3712213896957053
par_b :  -4.293073734974456
xi^2 1.7191295992862299

And this is the fit result:
 
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 9
    # data points      = 10
    # variables        = 2
    chi-square         = 1.71912960
    reduced chi-square = 0.21489120
    Akaike info crit   = -13.6076698
    Bayesian info crit = -13.0024996
[[Variables]]
    par_m:  1.37122139 +/- 0.01000215 (0.73%) (init = 1)
    par_b: -4.29307373 +/- 0.04363043 (1.02%) (init = 0)
[[Correlations]] (unreported correlations are < 0.100)
    C(par_m, par_b) = -0.815


Comparing the two results, one can also see that the differential evolution fit really does not end at its minimum.

Have I done something wrong in applying the methods? Or is there some other reason why this does not work?

Thanks a lot for the help already!!!!!

Kind regards,

Frederic


Mark Dean

Jul 3, 2020, 11:55:07 AM
to lmfit-py
Dear Frederic,

The random function is producing values distributed between 0 and 1. So when you calculate
y_err=0.5*random(n)
weights = 1/y_err
you get weights that can become arbitrarily large when y_err happens to be close to zero. It's possible that you are then effectively fitting only one point, and if so there are many different pairs of m and b that can do this.
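Just to illustrate (hypothetical numbers, not your actual data): a single tiny y_err dominates the weighted chi-square, so the fit is effectively pinned to that one point.

-----------------------------------------------
import numpy as np

# Hypothetical draw, just to illustrate the effect of 1/y_err weights.
y_err = np.array([0.31, 0.47, 0.002, 0.25])   # one error happens to be tiny
weight = 1.0 / y_err                          # -> one weight is huge (500)
resid = np.array([0.10, -0.10, 0.01, 0.05]) * weight

# Fraction of chi-square contributed by each point: the third point
# dominates almost completely.
print((resid**2) / (resid**2).sum())
-----------------------------------------------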

Best,
Mark

Frederic Noel

Jul 3, 2020, 12:37:07 PM
to lmfi...@googlegroups.com
Dear Mark,

Thanks for the quick response!

You may be right that this could happen if one gets "unlucky", but I do not think that is the problem here.
I just checked with y=0.25+0.25*random(n) and the same problem occurs.
Also, if one actually plots the two results, one finds that the differential evolution curve lies below the data points and is clearly not at the minimum (see plots below).

Thanks anyway, and I am happy to hear other suggestions!

Best regards,

Frederic


least squares fit: fit_test_leastsq.png

differential evolution fit: fit_test_diff.png


Matt Newville

Jul 3, 2020, 1:20:41 PM
to lmfit-py
Hi Frederic,

Yeah, like Mark says, using 'random' as a simulation of noise can get you in trouble. Maybe try `np.random.normal`, as sketched below.
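A minimal sketch of what I mean, reusing the names from your first script (this is just one way you could wire it in, not tested against your exact setup):

-----------------------------------------------
import numpy as np

# Sketch: Gaussian noise with a known, constant sigma instead of
# uniform random() "errors" that can get arbitrarily close to zero.
n = 10
sigma = 0.25
x_data = np.arange(n, dtype=float)
y_data = 1.34*x_data - 4.5 + np.random.normal(scale=sigma, size=n)
y_err = np.full(n, sigma)          # well-behaved weights: 1/y_err = 4 everywhere
data = np.stack([x_data, y_data, y_err], axis=-1)
-----------------------------------------------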

But, I think your main question was:

> As one can see the chi^2 seems to be best value within all iterations, but the parameters are the last ones.
> In the least square case this seems not to happen (although here the last iteration also has the best chi^2):

I know we've had similar problems (that is, the parameters are "last evaluated" not strictly "best"). I thought that we fixed that, but it is true that `differential_evolution` is handled a bit differently from the other scalar minimizers so maybe its last evaluation is not with the best-fit values. 

I think this should be raised as an Issue and be investigated more thoroughly.





Mark Dean

Jul 3, 2020, 2:43:25 PM
to lmfi...@googlegroups.com
Hi,

Please see
https://nbviewer.jupyter.org/gist/BNL-XRG/3d2f75bed6a030e49a17c124ff3c0ca1
I did not reproduce the mismatch in fits.

What I did realize, however, is that the differential evolution algorithm runs a polishing fit by default.

I think this might be the source of the inconsistency? If true, I would say that lmfit and scipy.optimize are doing what they tell us they do.

Best,
Mark




--
--------------------------------------------
Mark P. M. Dean
Brookhaven National Laboratory
Condensed Matter Physics Bldg 734
Upton NY 11973
USA
Phone: 001 631 344 7847

Frederic Noel

Jul 3, 2020, 4:35:21 PM
to lmfit-py
Hello Matt,

Thanks! Yes, exactly, this is my main problem. I just chose an easy way to generate simple data to test with; the situation where this first happened involved real data anyway.
But thanks nevertheless; of course you are right that it is probably better to generate random data points according to some distribution.
If you are right and this is an issue in lmfit, I am at least happy to know that I am not just unable to read the documentation or something like that, although that would probably have solved my problem faster.
Do I need to do anything to officially "report" this as an Issue, or are you already on it?

Thanks a lot!!!

Best regards,

Frederic

Frederic Noel

Jul 3, 2020, 4:45:13 PM
to lmfi...@googlegroups.com
Hello Mark,

I am using the lmfit minimizer directly, while you are doing a model fit. I do not know if this changes anything, but maybe there is something implemented at that higher level that prevents the problem I have?
I will check tomorrow whether I can reproduce your result that the problem does not appear with a model fit. Perhaps we can then find out what the difference is.
Thanks a lot for the work of coding it up!

I also saw that the scipy differential evolution algorithm runs a polisher by default. When I was trying to find out why my fits were not working, I already tried turning that off (by setting polish=False in the minimization), which unfortunately did not make any difference.
I just checked it again: you can simply take my minimal working example and add that parameter, and the same problem still occurs (a sketch of the call is below).
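For completeness, this is a sketch of the call with polishing disabled (it assumes, as I did, that lmfit forwards extra keyword arguments such as polish to scipy.optimize.differential_evolution):

-----------------------------------------------
# Sketch: same minimize() call as in my example above, but with the
# scipy polishing step turned off. Assumes lmfit passes polish=False
# through to scipy.optimize.differential_evolution.
result_diff = minimize(residuals, params0, iter_cb=info,
                       method='differential_evolution',
                       args=(data[:, 0], model0),
                       kws={'data': data[:, 1], 'weight': 1./data[:, 2]},
                       seed=666,
                       polish=False)
print(fit_report(result_diff))
-----------------------------------------------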

Thanks anyway,

Frederic

Frederic Noel

Jul 4, 2020, 4:54:19 AM
to lmfit-py
Hello Mark,

I just rebuilt your approach with my much smaller data set and much smaller uncertainty spread, and I still get the problem with your program as well:

I just did:

-------------------------------------------

import lmfit
import numpy as np
import matplotlib.pyplot as plt

def linear(x, intercept=0., slope=1.):
    prediction = x*slope + intercept
    chisqr = ((y-prediction)**2).sum()
    chisqr_list.append(chisqr)
    #print(f"chisqr={chisqr}\tintercept={intercept}\tslope={slope}")
    return prediction

model = lmfit.Model(linear)

params = model.make_params()
params['intercept'].set(min=-10, max=10)
params['slope'].set(min=-10, max=10)

x = np.linspace(0, 9, 10)
y = x* 1.34 - 4.5 + np.random.normal(loc=0.0, scale=0.25, size=x.size)

chisqr_list = []

result_diff = model.fit(y, x=x, method='differential_evolution', params=params)
print(result_diff.fit_report())

fig, ax = plt.subplots()

print(f"chissq_returned {result_diff.chisqr}\n chisqr mmin-returned {min(chisqr_list)-result_diff.chisqr}\nchisqr final - returned {chisqr_list[-1]-result_diff.chisqr}")

ax.plot(chisqr_list)

---------------------------------------------

This gave me:

[[Model]]
    Model(linear)
[[Fit Statistics]]
    # fitting method   = differential_evolution
    # function evals   = 750
    # data points      = 10
    # variables        = 2
    chi-square         = 0.15486430
    reduced chi-square = 0.01935804
    Akaike info crit   = -37.6779109
    Bayesian info crit = -37.0727407
[[Variables]]
    intercept: -4.48384150 +/- 0.08177605 (1.82%) (init = 0)
    slope:      1.28546989 +/- 0.01531806 (1.19%) (init = 1)
[[Correlations]] (unreported correlations are < 0.100)
    C(intercept, slope) = -0.843
chissq_returned 0.15486430450850974
 chisqr mmin-returned -1.5265566588595902e-15
chisqr final - returned 0.8630829382245354

FittestChi.png

which now shows that (big) difference, so maybe this only happens with smaller data sets, or you were just "luckier".
In any case, I also get the difference with your program.

Best regards,

Frederic

Matt Newville

Jul 4, 2020, 1:10:29 PM
to lmfit-py
Hi Frederic, 

Yes, I can reproduce this.  I played around with this a little bit and I see something that matches what you see, but I think I have some more details.
First, it does also happen when using scalar minimizers other than `differential_evolution`.   

What I see is that chi-square and the best-fit parameter values are out of sync because the final value in the list of variable parameters is wrong. This seems to be because the calculation of the covariance matrix (done in our code for the scalar minimizers to allow estimates of the error bars) does not properly restore the values to the best-fit values, but only for the final variable in the list, and only when bounds are supplied.

I think this is probably a quirk (well, you can read that as "mistake", though I'm willing to go easy on ourselves: the change is generally very small, it happens only under certain conditions, and it comes from trying to do a calculation that is of great help) of how we're using `numdifftools.Hessian` with our kind-of-patchwork implementation of bounds. I have not explored *that* in detail, but explicitly restoring the best-fit values after this calculation of the covariance does fix the problem.
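Conceptually (this is just a sketch of the idea, not the actual lmfit code; `result` is a placeholder for the minimizer result), the fix amounts to saving the best-fit values before the covariance step and restoring them afterwards:

-----------------------------------------------
# Conceptual sketch only -- not the actual lmfit implementation.
# Save the best-fit values of the varied parameters before the
# numdifftools.Hessian / covariance calculation, which may leave the
# last variable at a probed value, and restore them afterwards.
best_values = {name: par.value for name, par in result.params.items()
               if par.vary}

# ... covariance / error-bar estimation happens here ...

for name, value in best_values.items():
    result.params[name].value = value
-----------------------------------------------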

I'll raise this as an Issue on GitHub and propose a solution there. Hopefully, we'll get a fix for this out soon.




--Matt 

Frederic Noel

Jul 4, 2020, 2:09:56 PM
to lmfit-py
Dear all,

Thank you very very much!
When this gets patched, it will really help me a lot! I am looking forward to it!
Thanks for answering so quickly and in such a nice and understanding way!

Best regards,

Frederic


Mark Dean

Jul 4, 2020, 2:14:09 PM
to lmfi...@googlegroups.com
That's very convincing. 
Best,
Mark


Renee Otten

Jul 4, 2020, 5:26:04 PM
to lmfi...@googlegroups.com
hi Frederic, Matt, 


It’s totally possible that I made a mistake somewhere when adding the numdifftools code (I don’t immediately see where, though). However, setting ``numpy.random.seed(2020)`` to get reproducible fits in your example below, the chisqr we report as ``result_diff.chisqr`` is the identical value that is returned from ``scipy.optimize.differential_evolution``. If you add a print statement in `lmfit.minimizer.scalar_minimize`, you’ll see the output below:

     fun: 0.19539282170879718
     jac: array([4.81864549e-05, 1.57959756e-04])
 message: 'Optimization terminated successfully.'
    nfev: 636
     nit: 19
 success: True
       x: array([-0.49890285,  0.13753485])

[[Model]]
    Model(linear)
[[Fit Statistics]]
    # fitting method   = differential_evolution
    # function evals   = 636
    # data points      = 10
    # variables        = 2
    chi-square         = 0.19539282
    reduced chi-square = 0.02442410
    Akaike info crit   = -35.3532837
    Bayesian info crit = -34.7481135
[[Variables]]
    intercept: -4.78462411 +/- 0.09185541 (1.92%) (init = 0)
    slope:      1.36903551 +/- 0.01720610 (1.26%) (init = 1)
[[Correlations]] (unreported correlations are < 0.100)
    C(intercept, slope) = -0.843

chissq_returned 0.19539282170879718
chisqr mmin-returned -4.818645482629336e-13
chisqr final - returned 0.001118626857495808

As you can see, the value from lmfit (“chi-square”) and the one from SciPy (“fun”) are essentially identical: 0.19539282170879718 - 0.19539282 = 1.7087971870832774e-09, so I am pretty convinced that the actual output you get is correct.

I don’t fully understand why there appears to be a lower value in your “chisqr_list”; perhaps that has something to do with the calculations of numdifftools, which does call “self.penalty” and therefore indirectly “__residual”. But unless I am not completely understanding the points you’re making, it seems to me that we might be chasing ghosts here.

Best, 
Renee





Renee Otten

Jul 4, 2020, 5:58:17 PM
to lmfi...@googlegroups.com
Okay, I spoke too quickly; I should have waited for Matt to open the Issue (https://github.com/lmfit/lmfit-py/issues/655) and/or read the messages a bit more carefully. It seemed to me that your problem was with the reported chi-square, but I now realize that I misunderstood: everything in the output coming directly from the fit is correct (i.e., result_diff.chisqr, result_diff.residual), but for some reason there is a small discrepancy in one of the parameter values, which results in a different chi-square if you *recalculate* it using those parameters. It clearly has to do with the numdifftools code, but I am not sure yet where it originates from…
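For anyone who wants to check this themselves, a minimal sketch using the names from Frederic's first example (recompute chi-square from the reported parameters and compare it with the value stored on the result):

-----------------------------------------------
# Sketch: recompute chi-square from the parameters stored on the result
# and compare with the chi-square reported by the fit. Uses residuals(),
# model0, data and result_diff from Frederic's first example above.
recomputed = residuals(result_diff.params, data[:, 0], model0,
                       data=data[:, 1], weight=1./data[:, 2], scalar=True)
print("reported   chi-square:", result_diff.chisqr)
print("recomputed chi-square:", recomputed)
# If these differ, the final Parameters no longer hold the best-fit values.
-----------------------------------------------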

Matt Newville

Jul 4, 2020, 6:15:59 PM
to lmfit-py
On Sat, Jul 4, 2020 at 4:58 PM Renee Otten <otten...@gmail.com> wrote:
okay, I spoke too quickly and I should have waited for the Matt to open the Issue (https://github.com/lmfit/lmfit-py/issues/655) and/or read the messages a bit better. It seemed to me your problem was with the reported chi-square, but now realize that I misunderstood: everything in the output directly coming from the fit is correct (i.e., result_diff.chisqr, result_diff.residual), but for some reason there is a small discrepancy in one of the parameter values, which results in a difference in the chi-square if you *recalculate* the value using these parameters. It indeed clearly has to do with the numdifftools code, but not sure yet though where it originates from…

Yes, your first take was my first guess too, but then I followed it "down the rabbit hole" a little bit more.  

I think the Pull Request now on GitHub should fix this. It's sort of a "belt and suspenders" approach (sorry for all the American-isms!) to resetting the best-fit values after any potential mess from the numdifftools calls plus the sort-of-mess that is `_penalty` vs `residual` and the various wrinkles in the bounds transformations.
 
--Matt 
