Problems to calculate uncertainties

701 views
Skip to first unread message

Ivan Lopez

unread,
Jul 30, 2019, 11:21:58 AM7/30/19
to lmfi...@googlegroups.com
Hi everybody!

I am using lmfit for fitting spectra since a while and now I realized that's I have issues to calculate the uncertainties in my code. 
I leave here an example with the same error of what its happened, because my whole code use another package to parallel the fitting and can complicate the reading.

import time
import numpy as np
from lmfit import Model, Minimizer, Parameters
from numpy import exp, loadtxt, pi, sqrt, arange, interp

#Useful Functions

def gaussian(x, amplitude, xcen, std):
    """1-d gaussian"""
    return amplitude * exp(-(x-xcen)**2 / (2*std**2))

def powerlaw(x, c1, c2):
    """power law function"""
    return c1*(x**c2)

def Modelcero(x):
    """Simple model of 0 to initialize some models"""
    return 0*x

def gaussian_defpar(name,amplitude,xcen,fwhm,pars):
    """Set the parameters for a gaussian"""
    std=fwhm/2.355
    name=str(name)
    xcen=xcen*OneZ
    pars[name+'_std'].set(std,vary=False,min=None,max=None)
    pars[name+'_xcen'].set(xcen,vary=False,min=None,max=None)
    pars[name+'_amplitude'].set(amplitude,vary=True,min=0,max=None)
    return

def set_generic_mpf(OneZ):
    """Make a generic model with generic parameters"""
    #Powerlaw
    p_law = Model(powerlaw,prefix='pwl_')
    #Definition of the different gaussians
    h2lines=['55','61','69','80','96','122']
    gaussH2=Model(Modelcero)
    for line in h2lines:
        prefix='h'+line+'_'
        gaussH2+=Model(gaussian,prefix=prefix)
    #Composite Model
    generic_mod=gaussH2+p_law
    #Definition of initial parameters
    generic_pars = generic_mod.make_params()
    generic_pars['pwl_c1'].set(0.1,vary=True,min=None,max=None)
    generic_pars['pwl_c2'].set(0.5,vary=True,min=None,max=None)
    gaussian_defpar('h55',0.05, 5.5, 0.053,generic_pars)
    gaussian_defpar('h61',0.05, 6.109, 0.053,generic_pars)
    gaussian_defpar('h69',0.05, 6.909, 0.053,generic_pars)
    gaussian_defpar('h80',0.05, 8, 0.100,generic_pars)
    gaussian_defpar('h96',0.05, 10, 0.100,generic_pars)
    gaussian_defpar('h122',0.05, 12, 0.100,generic_pars)
    return generic_mod, generic_pars

def mpf(spec,spec_err):
    """Run MPF with the model and parameters over the spectra"""
    starter_time = time.time()
    data_point = np.array(spec)
    error_point= np.array(spec_err)
    print('Starting')
    result=mod.fit(spec, pars, weights=1./spec_err, x=x)
    print('Finished','- Redchi',result.redchi, '- took', time.time()-starter_time,'sec')
    return result


#Read the data (some invented data for test)
z=0.000784
OneZ=z+1
x=np.arange(1,21)

addelse=np.arange(20)
addelse_er=np.zeros(20)+0.01
addelse=addelse+gaussian(x, 0.8, 6, 0.05)+gaussian(x, 0.8, 8.1, 0.05)+gaussian(x, 0.8, 10.1, 0.05)+gaussian(x, 0.8, 12.1, 0.05)

#Make models
mod, pars = set_generic_mpf(OneZ)

#Run MPF
result_else=mpf(addelse,addelse_er)

So, for this example I obtained this report:
[[Model]]
    (((((((Model(Modelcero) + Model(gaussian, prefix='h55_')) + Model(gaussian, prefix='h61_')) + Model(gaussian, prefix='h69_')) + Model(gaussian, prefix='h80_')) + Model(gaussian, prefix='h96_')) + Model(gaussian, prefix='h122_')) + Model(powerlaw, prefix='pwl_'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 84
    # data points      = 20
    # variables        = 8
    chi-square         = 9194.54877
    reduced chi-square = 766.212398
    Akaike info crit   = 138.612676
    Bayesian info crit = 146.578534
##  Warning: uncertainties could not be estimated:
    h55_amplitude:   at initial value
    h55_xcen:        at initial value
    h55_std:         at initial value
    h61_xcen:        at initial value
    h61_std:         at initial value
    h69_xcen:        at initial value
    h69_std:         at initial value
    h80_xcen:        at initial value
    h80_std:         at initial value
    h96_xcen:        at initial value
    h96_std:         at initial value
    h122_xcen:       at initial value
    h122_std:        at initial value
[[Variables]]
    h55_amplitude:   0.05000000 (init = 0.05)
    h55_xcen:        5.504312 (fixed)
    h55_std:         0.02250531 (fixed)
    h61_amplitude:   307743.582 (init = 0.05)
    h61_xcen:        6.113789 (fixed)
    h61_std:         0.02250531 (fixed)
    h69_amplitude:   174.530546 (init = 0.05)
    h69_xcen:        6.914417 (fixed)
    h69_std:         0.02250531 (fixed)
    h80_amplitude:   0.28157829 (init = 0.05)
    h80_xcen:        8.006272 (fixed)
    h80_std:         0.04246285 (fixed)
    h96_amplitude:   0.32645055 (init = 0.05)
    h96_xcen:        10.00784 (fixed)
    h96_std:         0.04246285 (fixed)
    h122_amplitude:  0.31942478 (init = 0.05)
    h122_xcen:       12.00941 (fixed)
    h122_std:        0.04246285 (fixed)
    pwl_c1:          0.65227562 (init = 0.1)
    pwl_c2:          1.12942602 (init = 0.5)

You can see there are some parameters with nan values as error. Also I used to received this warning ##  Warning: uncertainties could not be estimated: with parameters that say at boundary and in the Variables section there are the final parameters but without errors.

My data doesn't have nan values, so I don't know where I am failing. Somebody knows why this happened?

Matt Newville

unread,
Jul 30, 2019, 9:37:46 PM7/30/19
to lmfit-py
On Tue, Jul 30, 2019 at 10:21 AM Ivan Lopez <ilo...@stsci.edu> wrote:
Hi everybody!

I am using lmfit for fitting spectra since a while and now I realized that's I have issues to calculate the uncertainties in my code. 
I leave here an example with the same error of what its happened, because my whole code use another package to parallel the fitting and can complicate the reading.

Your example does not run for me.  Please read the instructions on how to ask for help, especially the part about providing a minimal, complete example that demonstrates the problem.
Some things to ponder while you are putting together your minimal example:

   why (or even "why the heck") are you wrapping so many operations in one-line functions?  That appears to be intentionally aimed at maximizing a lack of clarity.  Don't do it.
   why (or even "why the heck") are you automatically setting very tight bounds on your "center" parameters.  That is asking for trouble.  Don't do it. 

--Matt


import time
import numpy as np
from lmfit import Model, Minimizer, Parameters
from numpy import exp, loadtxt, pi, sqrt, arange, interp

#Useful Functions

def gaussian(x, amplitude, xcen, std):
    """1-d gaussian"""
    return amplitude * exp(-(x-xcen)**2 / (2*std**2))

def powerlaw(x, c1, c2):
    """power law function"""
    return c1*(x**c2)

def Modelcero(x):
    """Simple model of 0 to initialize some models"""
    return 0*x

def fwhm2std(fwhm):
    """from fwhm to standard deviation of a gaussian"""
    return fwhm/2.355

def setpar(pars,name,value,vary,minus,maxim):
    """Set any parameter"""
    pars[name].set(value,vary=vary,min=minus,max=maxim)

def gaussian_defpar(name,amplitude,xcen,fwhm,pars):
    """Set the parameters for a gaussian"""
    std=fwhm2std(fwhm)
    name=str(name)
    xcen=xcen*OneZ
    setpar(pars,name+'_std',std,False,None,None)
    setpar(pars,name+'_xcen',xcen,True,xcen-0.08,xcen+0.08)
    setpar(pars,name+'_amplitude',amplitude,True,0,None)
    return

def set_generic_mpf(OneZ):
    """Make a generic model with generic parameters"""
    #Powerlaw
    p_law = Model(powerlaw,prefix='pwl_')
    #Definition of the different gaussians
    h2lines=['69','80','96','122']
    gaussH2=Model(Modelcero)
    for line in h2lines:
        prefix='h'+line+'_'
        gaussH2+=Model(gaussian,prefix=prefix)
    #Composite Model
    generic_mod=p_law+gaussH2
    #Definition of initial parameters
    generic_pars = generic_mod.make_params()
    setpar(generic_pars,'pwl_c1',0.001,True,None,None)
    setpar(generic_pars,'pwl_c2',0.002,True,None,None)
    gaussian_defpar('h69',0.05, 6.909, 0.053,generic_pars)
    gaussian_defpar('h80',0.05, 8, 0.100,generic_pars)
    gaussian_defpar('h96',0.05, 10, 0.100,generic_pars)
    gaussian_defpar('h122',0.05, 12, 0.100,generic_pars)
    return generic_mod, generic_pars

def mpf(spec,spec_err):
    """Run MPF with the model and parameters over the spectra"""
    starter_time = time.time()
    data_point = np.array(spec)
    error_point= np.array(spec_err)
    print('Starting')
    result=mod.fit(spec, pars, weights=1./spec_err, x=x)
    print('Finished','- Redchi',result.redchi, '- took', time.time()-starter_time,'sec')
    return result


#Read the data (some invented data for test)
z=0.000784
OneZ=z+1
x=np.arange(0,20)
addelse=np.arange(20)
#addelse_er=np.random.rand(20)
addelse=addelse+gaussian(x, 0.8, 8.1, 0.05)+gaussian(x, 0.8, 10.1, 0.05)+gaussian(x, 0.8, 12.1, 0.05)

print(addelse,addelse_er)

#Make models
generic_mod, generic_pars = set_generic_mpf(OneZ)
mod=generic_mod
pars=generic_pars

#Run MPF
result_else=mpf(addelse,addelse_er)

So, for this example I obtained this report:

[[Model]]
    (Model(powerlaw, prefix='pwl_') + ((((Model(Modelcero) + Model(gaussian, prefix='h69_')) + Model(gaussian, prefix='h80_')) + Model(gaussian, prefix='h96_')) + Model(gaussian, prefix='h122_')))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 1015
    # data points      = 20
    # variables        = 10
    chi-square         = 1.7844e-09
    reduced chi-square = 1.7844e-10
    Akaike info crit   = -442.798549
    Bayesian info crit = -432.841226
##  Warning: uncertainties could not be estimated:
    h69_std:         at initial value
    h80_std:         at initial value
    h96_std:         at initial value
    h122_std:        at initial value
[[Variables]]
    pwl_c1:          0.99999661 +/- 1.0755e-06 (0.00%) (init = 0.001)
    pwl_c2:          1.00000125 +/- 3.9717e-07 (0.00%) (init = 0.002)
    h69_amplitude:   0.00102681 +/- 110971.794 (10807483239.18%) (init = 0.05)
    h69_xcen:        6.93048309 +/- 894906.946 (12912620.00%) (init = 6.914417)
    h69_std:         0.02250531 (fixed)
    h80_amplitude:   0.32687742 +/- 511.226535 (156397.02%) (init = 0.05)
    h80_xcen:        7.93687686 +/- 56.1356566 (707.28%) (init = 8.006272)
    h80_std:         0.04246285 (fixed)
    h96_amplitude:   0.39600960 +/- 1902.95071 (480531.47%) (init = 0.05)
    h96_xcen:        9.93161557 +/- 126.701495 (1275.74%) (init = 10.00784)
    h96_std:         0.04246285 (fixed)
    h122_amplitude:  0.39205515 +/-        nan (nan%) (init = 0.05)
    h122_xcen:       11.9318803 +/-        nan (nan%) (init = 12.00941)
    h122_std:        0.04246285 (fixed)
[[Correlations]] (unreported correlations are < 0.100)
    C(h69_xcen, h80_amplitude)      = -15.904
    C(h69_xcen, h80_xcen)           =  12.657
    C(h69_amplitude, h80_amplitude) =  12.586
    C(h69_amplitude, h80_xcen)      = -10.016
    C(h96_amplitude, h96_xcen)      = -1.000
    C(pwl_c1, pwl_c2)               = -1.000
    C(h69_amplitude, h69_xcen)      = -0.880
    C(h80_amplitude, h80_xcen)      = -0.796
    C(h69_xcen, h96_amplitude)      = -0.714
    C(h69_xcen, h96_xcen)           =  0.714
    C(h69_amplitude, h96_xcen)      =  0.171
    C(h69_amplitude, h96_amplitude) = -0.171
    C(h80_xcen, h96_xcen)           =  0.136
    C(h80_xcen, h96_amplitude)      = -0.136

You can see there are some parameters with nan values as error. Also I used to received this warning ##  Warning: uncertainties could not be estimated: with parameters that say at boundary and in the Variables section there are the final parameters but without errors.

My data doesn't have nan values, so I don't know where I am failing. Somebody knows why this happened?

--
You received this message because you are subscribed to the Google Groups "lmfit-py" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lmfit-py+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lmfit-py/e125af4d-67ca-400c-a6ee-64767b670b68%40googlegroups.com.


--
--Matt Newville <newville at cars.uchicago.edu> 630-252-0431

Renee Otten

unread,
Jul 30, 2019, 10:08:24 PM7/30/19
to lmfi...@googlegroups.com
hi Ivan, 


First off, I agree with all Matt his observations. Secondly, you can ignore the messages about the fixed parameters at their initial values, that’s of course how it should be for fixed parameters and this was a mistake in the code; fixed in commit (https://github.com/lmfit/lmfit-py/commit/2950ad582071a9cc245fea987203bfdb87769272) and this will be available in the next release. 

You should be worried though about the highly correlated parameters and uncertainties that are ridiculously large; it probably means that these fitting parameters have no effect on the overall fit and are likely the reason why the uncertainties cannot be calculated correctly.

Good luck,
Renee


Ivan Lopez

unread,
Aug 1, 2019, 11:41:41 AM8/1/19
to lmfi...@googlegroups.com
Thanks for the reply Matt and Rene!

Sorry, I edited the code in the original post so now it must work. If you see the report, the variable h55_amplitude don't change after the fit, maybe is that the problem? If I remove the h55 gaussian, I can calculate the uncertains correctly. The problems is that I don't know if my data has or hasn't that gaussian, so when I ran this for a lot of spectra there are some that fails in the calculation of the uncertains.

For your other comments, sorry for the one-line fuctions. I removed two of these, so you can read more easy the code. I am working with a 1000 of spectra for 40 different objects, so I did some of thats functions to reduced the size of the code. And about the very tight bounds on the center of the gaussians, that's is because I don't know if the central parameters are 100% correctly for the observations but I know that are near.
To unsubscribe from this group and stop receiving emails from it, send an email to lmfi...@googlegroups.com.

Matt Newville

unread,
Aug 1, 2019, 8:58:35 PM8/1/19
to lmfit-py
On Thu, Aug 1, 2019 at 10:41 AM Ivan Lopez <ilo...@stsci.edu> wrote:
Thanks for the reply Matt!

Sorry, I edited the code in the original post so now it must work. If you see the report, the variable h55_amplitude don't change after the fit, maybe is that the problem?

I don't see the edits.  I don't see the report.  I have no idea what `h55` is. Your example above doesn't have that.   For some definition of "the problem", yes, that is exactly the problem.  Re-read the instructions on how to ask for help, especially the part about providing a minimal, complete example that demonstrates the problem.
If I remove the h55 gaussian, I can calculate the uncertains correctly. The problems is that I don't know if my data has or hasn't that gaussian, so when I ran this for a lot of spectra there are some that fails in the calculation of the uncertains.

Wouldn't that by itself be a clue?

For your other comments, sorry for the one-line fuctions. I removed two of these, so you can read more easy the code.

I cannot read it at all.
I am working with a 1000 of spectra for 40 different objects, so I did some of thats functions to reduced the size of the code. And about the very tight bounds on the center of the gaussians, that's is because I don't know if the central parameters are 100% correctly for the observations but I know that are near.


But, why do you need such tight bounds?  Are you sure you need bounds at all?


Ivan Lopez

unread,
Aug 2, 2019, 11:54:15 AM8/2/19
to lmfit-py
Thanks Renee also for the answer. I will ignore the messages about the fixed parameters at their initial values.

Matt, I am using the web interface for the Google groups so I can edit the original code but maybe if you are using some mail interface, you can't see the modifications. I will sent the modifications here, so you can read it.

Code:
import time
import numpy as np
from lmfit import Model, Minimizer, Parameters
from numpy import exp, loadtxt, pi, sqrt, arange, interp

#Useful Functions

def gaussian(x, amplitude, xcen, std):
   """1-d gaussian"""
   return amplitude * exp(-(x-xcen)**2 / (2*std**2))

def powerlaw(x, c1, c2):
   """power law function"""
   return c1*(x**c2)

def Modelcero(x):
   """Simple model of 0 to initialize some models"""
   return 0*x

def gaussian_defpar(name,amplitude,xcen,fwhm,pars):
   """Set the parameters for a gaussian"""
   std=fwhm/2.355
   name=str(name)
   xcen=xcen*OneZ
   pars[name+'_std'].set(std,vary=False,min=None,max=None)
   pars[name+'_xcen'].set(xcen,vary=False,min=None,max=None)
   pars[name+'_amplitude'].set(amplitude,vary=True,min=0,max=None)
   return

def set_generic_mpf(OneZ):
   """Make a generic model with generic parameters"""
   #Powerlaw
   p_law = Model(powerlaw,prefix='pwl_')
   #Definition of the different gaussians
   h2lines=['55','61','69','80','96','122']
   gaussH2=Model(Modelcero)
   for line in h2lines:
       prefix='h'+line+'_'
       gaussH2+=Model(gaussian,prefix=prefix)
   #Composite Model
   generic_mod=gaussH2+p_law
   #Definition of initial parameters
   generic_pars = generic_mod.make_params()
   generic_pars['pwl_c1'].set(0.1,vary=True,min=None,max=None)
   generic_pars['pwl_c2'].set(0.5,vary=True,min=None,max=None)
   gaussian_defpar('h55',0.05, 5.5, 0.053,generic_pars)
   gaussian_defpar('h61',0.05, 6.109, 0.053,generic_pars)
   gaussian_defpar('h69',0.05, 6.909, 0.053,generic_pars)
   gaussian_defpar('h80',0.05, 8, 0.100,generic_pars)
   gaussian_defpar('h96',0.05, 10, 0.100,generic_pars)
   gaussian_defpar('h122',0.05, 12, 0.100,generic_pars)
   return generic_mod, generic_pars

def mpf(spec,spec_err):
   """Run MPF with the model and parameters over the spectra"""
   starter_time = time.time()
   data_point = np.array(spec)
   error_point= np.array(spec_err)
   print('Starting')
   result=mod.fit(spec, pars, weights=1./spec_err, x=x)
   print('Finished','- Redchi',result.redchi, '- took', time.time()-starter_time,'sec')
   return result


#Read the data (some invented data for test)
z=0.000784
OneZ=z+1
x=np.arange(1,21)

addelse=np.arange(20)
addelse_er=np.zeros(20)+0.01
addelse=addelse+gaussian(x, 0.8, 6, 0.05)+gaussian(x, 0.8, 8.1, 0.05)+gaussian(x, 0.8, 10.1, 0.05)+gaussian(x, 0.8, 12.1, 0.05)

#Make models
mod, pars = set_generic_mpf(OneZ)

#Run MPF
result_else=mpf(addelse,addelse_er)

 And the report:
[[Model]]
    (((((((Model(Modelcero) + Model(gaussian, prefix='h55_')) + Model(gaussian, prefix='h61_')) + Model(gaussian, prefix='h69_')) + Model(gaussian, prefix='h80_')) + Model(gaussian, prefix='h96_')) + Model(gaussian, prefix='h122_')) + Model(powerlaw, prefix='pwl_'))
[[Fit Statistics]]
   
# fitting method   = leastsq

   
# function evals   = 84

   
# data points      = 20

   
# variables        = 8
    chi
-square         = 9194.54877
    reduced chi
-square = 766.212398
   
Akaike info crit   = 138.612676
   
Bayesian info crit = 146.578534
##  Warning: uncertainties could not be estimated:

    h55_amplitude
:   at initial value
    h55_xcen
:        at initial value
    h55_std
:         at initial value
    h61_xcen
:        at initial value
    h61_std
:         at initial value
    h69_xcen
:        at initial value
    h69_std
:
         at initial value
    h80_xcen
:        at initial value
    h80_std
:         at initial value
    h96_xcen
:        at initial value
    h96_std
:         at initial value
    h122_xcen
:       at initial value
    h122_std
:        at initial value
[[Variables]]

    h55_amplitude
:   0.05000000 (init = 0.05)
    h55_xcen
:        5.504312 (fixed)
    h55_std
:         0.02250531 (fixed)
    h61_amplitude
:   307743.582 (init = 0.05)
    h61_xcen
:        6.113789 (fixed)
    h61_std
:         0.02250531 (fixed)
    h69_amplitude
:   174.530546 (init = 0.05)
    h69_xcen
:        6.914417 (fixed)
    h69_std
:         0.02250531 (fixed)
    h80_amplitude
:   0.28157829 (init = 0.05)
    h80_xcen
:        8.006272 (fixed)
    h80_std
:         0.04246285 (fixed)
    h96_amplitude
:   0.32645055 (init = 0.05)
    h96_xcen
:        10.00784 (fixed)
    h96_std
:         0.04246285 (fixed)
    h122_amplitude
:  0.31942478 (init = 0.05)
    h122_xcen
:       12.00941 (fixed)
    h122_std
:        0.04246285 (fixed)
    pwl_c1
:          0.65227562 (init = 0.1)
    pwl_c2
:          1.12942602 (init = 0.5)

Sorry for the inconvenience and thanks for the help.

Ivan.

Ivan Lopez

unread,
Aug 2, 2019, 11:58:24 AM8/2/19
to lmfit-py
Sorry. Google groups limited my characters in the previous answer and the report was incomplete. This is the full report:

Thanks for the help and sorry for the inconvenience.
Ivan

Matt Newville

unread,
Aug 2, 2019, 9:59:19 PM8/2/19
to lmfit-py
On Fri, Aug 2, 2019 at 10:54 AM Ivan Lopez <ilo...@stsci.edu> wrote:
Thanks Renee also for the answer. I will ignore the messages about the fixed parameters at their initial values.

Matt, I am using the web interface for the Google groups so I can edit the original code but maybe if you are using some mail interface, you can't see the modifications. I will sent the modifications here, so you can read it.

Yeah, don't expect me or anyone else to re-read a posting on Google Groups.



You have 21 x values, from `x = np.arange(1, 21)`.  Then you fixed H55_xcen to 5.5 and H55_std = 0.00225... why these vales??? I have no clue.

At the sampled values x=5 and x=6, that component will be essentially 0.  So, the fit tries H55_amplitude = 0.05 and then H55_amplitude = 0.050000001 and sees no difference in the result and concludes that changes in H55_amplitude do not change the result. 

Try adding
      print("H55 component:  ", gaussian(x, 1, 5.5, 0.00225))

to your code.

--Matt

Reply all
Reply to author
Forward
0 new messages