Good fit but very low chi-squared and red.chi-squared. Huge errors of fitted parameters

671 views
Skip to first unread message

Filippo Monteverdi

unread,
Jun 6, 2019, 11:29:02 AM6/6/19
to lmfit-py
Good morning.
I wrote a script that should fit a multi-component absorption line profile. The program works very well, the obtained fit is very good (by eye). Actually the problems reside in the fact that the chi-squared is too low, kind of 0.003. Secondly the errors on the extracted parameters are too huge, sometimes 19000%. I'm trying to figure out what I can do to fix it, but i'm really out of ideas. can someone help me?
I attach an image of the fit to the function. My output is this:

[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 184
    # data points      = 27
    # variables        = 8
    chi-square         = 0.03801124
    reduced chi-square = 0.00200059
    Akaike info crit   = -161.274174
    Bayesian info crit = -150.907479
[[Variables]]
    A1:       3.72428209 +/- 308.225500 (8276.11%) (init = 3.293586)
    alphaD1:  0.15554118 +/- 0.25919609 (166.64%) (init = 0.01075787)
    nu_01:    8687.45 (fixed)
    alphaL1:  0.53080056 +/- 0.05437951 (10.24%) (init = 1.056746)
    tau1:     1.48688918 +/- 123.072724 (8277.20%) (init = 1.313183)
    A2:       1.53369709 +/- 235.624010 (15363.14%) (init = 2.570257)
    alphaD2:  0.48769148 +/- 0.03348135 (6.87%) (init = 0.4047748)
    nu_02:    8688.754 (fixed)
    alphaL2:  4.7612e-05 +/- 0.00904746 (19002.47%) (init = 0.6137988)
    tau2:     0.76845591 +/- 118.046420 (15361.51%) (init = 1.285023)
[[Correlations]] (unreported correlations are < 0.100)
    C(A2, tau2)         = -1.000
    C(A1, tau1)         = -1.000
    C(alphaD1, alphaL1) = -0.512
    C(alphaD2, alphaL2) = -0.477
    C(alphaD1, alphaD2) =  0.319
    C(alphaL2, tau2)    =  0.275
    C(A2, alphaL2)      = -0.275
    C(alphaD1, alphaL2) = -0.265
    C(alphaD1, A2)      =  0.265
    C(alphaD1, tau2)    = -0.264
    C(A1, alphaL2)      =  0.196
    C(tau1, alphaL2)    = -0.196
    C(alphaL1, alphaL2) =  0.156
    C(alphaL1, A2)      = -0.156
    C(alphaL1, tau2)    =  0.156
    C(alphaD2, tau2)    = -0.147
    C(A2, alphaD2)      =  0.147

i

Matt Newville

unread,
Jun 6, 2019, 12:22:01 PM6/6/19
to lmfit-py
On Thu, Jun 6, 2019 at 10:29 AM Filippo Monteverdi <filomon...@gmail.com> wrote:
Good morning.
I wrote a script that should fit a multi-component absorption line profile. The program works very well, the obtained fit is very good (by eye). Actually the problems reside in the fact that the chi-squared is too low, kind of 0.003. Secondly the errors on the extracted parameters are too huge, sometimes 19000%. I'm trying to figure out what I can do to fix it, but i'm really out of ideas. can someone help me?


As the instruction clearly state: include a script showing what you are doing.   

Among other things, such a script would show whether you are providing uncertainties or weights in the fit.  That would likely change the scale of chi-square.  

For sure, if pairs of variables have correlations > 0.999, the uncertainties for those variables will be huge.  If you had included an example script, we might know why `A1` and `tau1` are completely correlated.

--Matt

Filippo Monteverdi

unread,
Jun 7, 2019, 4:18:45 AM6/7/19
to lmfit-py
Actually it's not so short....Basically I extract data from a.fits file. I locate the peacks by ginput, so that the script can understand the intervals in which a gaussan, a lorentzian a voigt and a line profile functios can be fitted...After that I merge the final ones in a unique function, here called final_2profile...in all the fit the chi squared was recovered yesterday by weighting the data...the problems on huge errors still remain:

import numpy as np
import scipy.stats as stats
import matplotlib
import matplotlib.pyplot as plt
import os
import math
import random
import pylab
import scipy
import peakutils
import itertools
from scipy import spatial
from scipy.constants import c
from astropy.io import fits
from numpy import exp, sqrt
from lmfit import Model
from scipy import optimize, signal
from scipy.interpolate import interp1d
from scipy import integrate
from lmfit.models import LorentzianModel, VoigtModel, GaussianModel
from astropy.modeling.models import Voigt1D
from scipy.special import wofz
import matplotlib.mlab as mlab
import matplotlib.cbook as cbook
from matplotlib.widgets import Slider

matplotlib.use('TkAgg')
np.seterr(divide='ignore')

def fmt(x, y):
    return 'x: {x:0.2f}\ny: {y:0.2f}'.format(x = x, y = y)

class DataCursor(object):    #Create the data cursor and connect it to the relevant figure
    def __init__(self, artists, x = [], y = [], tolerance = 1, offsets = (-20, 20),
                 formatter = fmt, display_all = False):
        self._points = np.column_stack((x,y))
        self.formatter = formatter
        self.offsets = offsets
        self.display_all = display_all
        if not cbook.iterable(artists):
            artists = [artists]
        self.artists = artists
        self.axes = tuple(set(art.axes for art in self.artists))
        self.figures = tuple(set(ax.figure for ax in self.axes))

        self.annotations = {}
        for ax in self.axes:
            self.annotations[ax] = self.annotate(ax)

        for artist in self.artists:
            artist.set_picker(tolerance)
        for fig in self.figures:
            fig.canvas.mpl_connect('pick_event', self)

    def annotate(self, ax):    #Draws and hides the annotation box
        annotation = ax.annotate(self.formatter, xy = (0, 0), ha = 'right',
                xytext = self.offsets, textcoords = 'offset points', va = 'bottom',
                bbox = dict(boxstyle = 'round,pad=0.5', fc = 'yellow', alpha = 0.5),
                arrowprops = dict(arrowstyle = '->', connectionstyle = 'arc3,rad=0')
                )
        annotation.set_visible(False)
        return annotation

    def snap(self, x, y):    #Return the value in self._points closest to (x, y) selected
        idx = np.nanargmin(((self._points - (x,y))**2).sum(axis = -1))
        return self._points[idx]

    def __call__(self, event):
        # Rather than trying to interpolate, just display the clicked coords
        # This will only be called if it's within "tolerance", anyway.
        x, y = event.mouseevent.xdata, event.mouseevent.ydata
        annotation = self.annotations[event.artist.axes]
        if x is not None:
            if not self.display_all:
                # Hide any other annotation boxes...
                for ann in self.annotations.values():
                    ann.set_visible(False)
            # Update the annotation in the current axis..
            x, y = self.snap(x, y)
            annotation.xy = x, y
            annotation.set_text(self.formatter(x, y))
            annotation.set_visible(True)
            event.canvas.draw()

def voigt(x, y):    #The following 3 function are used to fit a voigt to data
   z = x + 1j*y
   I = wofz(z).real
   return I

def Voigt(nu,A, alphaD, nu_0, alphaL):
   f = np.sqrt(np.log(2))
   x = (nu-nu_0)/np.exp(np.log(alphaD)) * f
   y = np.exp(np.log(alphaL))/np.exp(np.log(alphaD)) * f
   V = np.exp(np.log(A))*f/(np.exp(np.log(alphaD))*np.sqrt(np.pi)) * voigt(x, y)
   return V

def funcV(p, x):
    A, alphaD, nu_0, alphaL= p
    return Voigt(x, A, alphaD, nu_0, alphaL)

def line_profile(x, A, alphaD, nu_0, alphaL, tau):    #From voigt to line profile
    return np.exp(-tau*(Voigt(x, A, alphaD, nu_0, alphaL)))

def final_profile(x, A1, alphaD1, nu_01, alphaL1, tau1):#the following 5 function are the final fit for 1,2,3,4,5 components, respectively
    return np.exp(np.log(line_profile(x, A1, alphaD1, nu_01, alphaL1, tau1)))

def final_2profile(x, A1, alphaD1, nu_01, alphaL1, tau1, A2, alphaD2, nu_02, alphaL2, tau2):
       
    a=np.array(line_profile(x, A1, alphaD1, nu_01, alphaL1, tau1))
    b=np.array(line_profile(x, A2, alphaD2, nu_02, alphaL2, tau2))
    dim=len(x)
    for i in range(0, dim-1): #This cicle is to avoid negative value, not passable to log
        if a[i]+b[i]-1<=0:
            a[i]=0.50000000001
            b[i]=0.5
    return     np.exp(np.log(a+b-1))

def final_3profile(x, A1, alphaD1, nu_01, alphaL1, tau1, A2, alphaD2, nu_02, alphaL2, tau2, A3, alphaD3, nu_03, alphaL3, tau3):
       
    a=np.array(line_profile(x, A1, alphaD1, nu_01, alphaL1, tau1))
    b=np.array(line_profile(x, A2, alphaD2, nu_02, alphaL2, tau2))
    c=np.array(line_profile(x, A3, alphaD3, nu_03, alphaL3, tau3))
    dim=len(x)
    for i in range(0, dim-1):
        if a[i]+b[i]+c[i]-2<=0:    #This cicle is to avoid negative value, not passable to log
            a[i]=0.666666667
            b[i]=0.6666666666
            c[i]=0.6666666666
    return     np.exp(np.log(a+b+c-2))

def final_4profile(x, A1, alphaD1, nu_01, alphaL1, tau1, A2, alphaD2, nu_02, alphaL2, tau2, A3, alphaD3, nu_03, alphaL3, tau3, A4, alphaD4, nu_04, alphaL4, tau4):
       
    a=np.array(line_profile(x, A1, alphaD1, nu_01, alphaL1, tau1))
    b=np.array(line_profile(x, A2, alphaD2, nu_02, alphaL2, tau2))
    c=np.array(line_profile(x, A3, alphaD3, nu_03, alphaL3, tau3))
    d=np.array(line_profile(x, A4, alphaD4, nu_04, alphaL4, tau4))
    dim=len(x)
    for i in range(0, dim-1):
        if a[i]+b[i]+c[i]+d[i]-3<=0:    #This cicle is to avoid negative value, not passable to log
            a[i]=0.7500000001
            b[i]=0.75
            c[i]=0.75
            d[i]=0.75
    return     np.exp(np.log(a+b+c+d-3))

def final_5profile(x, A1, alphaD1, nu_01, alphaL1, tau1, A2, alphaD2, nu_02, alphaL2, tau2, A3, alphaD3, nu_03, alphaL3, tau3, A4, alphaD4, nu_04, alphaL4, tau4, A5, alphaD5, nu_05, alphaL5, tau5):
       
    a=np.array(line_profile(x, A1, alphaD1, nu_01, alphaL1, tau1))
    b=np.array(line_profile(x, A2, alphaD2, nu_02, alphaL2, tau2))
    c=np.array(line_profile(x, A3, alphaD3, nu_03, alphaL3, tau3))
    d=np.array(line_profile(x, A4, alphaD4, nu_04, alphaL4, tau4))
    e=np.array(line_profile(x, A5, alphaD5, nu_05, alphaL5, tau5))
    dim=len(x)
    for i in range(0, dim-1):
        if a[i]+b[i]+c[i]+d[i]+e[i]-4<=0:    #This cicle is to avoid negative value, not passable to log
            a[i]=0.800000001
            b[i]=0.8
            c[i]=0.8
            d[i]=0.8
            e[i]=0.8
    return     np.exp(np.log(a+b+c+d+e-4))

def check(val,cen, a, z):    #check if an absortion line in ism.dat is close to a found centroid
    a=a*(1+z) 
    tres=abs(cen-a)
    if (tres<val).any():
        return True
    return False

def more_lines(x, a, z):    #check if 2 ore more absorption line are present inside a found centroid
    conta=0
    a=a*(1+z)
    low=x[0]
    high=x[-1]
    dim=len(a)
    for i in range(0,dim):
        if a[i]>low and a[i]<high:
            conta=conta+1
    if conta>1:
        return True
    else:
        return False
   
def which_line(x, a, z):    #find which line in ism.dat stay inside a selected interval
    a=a*(1+z)
    dim=len(a)
    for i in range(0, dim-1):
        if a[i]>x[0] and a[i]<x[-1]:
            return i

def which_point(x, n, z):    #no more used
    n=n*(1+z)
    dim=len(x)
    for i in range(0, dim-1) :
        diff1=abs(n-x[i])
        diff2=abs(n-x[i+1])
        if diff2>diff1:
            return i

plot=fits.open('/home/filippo/spettri1D/GRB/090926A/VIS/ADP.2018-10-09T11_00_06.335.fits')[1].data


ism=open('/home/filippo/spettri1D/GRB/090926A/VIS/ism.txt').readlines()

ismw=[]
ismn=[]
for lines in ism:
    line=lines.strip('\n')
    lin=line.split()
    ismw.append(float(lin[1]))
    ismn.append(lin[0])

ism=np.array(ismw)

red=2.1069

lam= plot[0][0]
flux= plot[0][1]
err= plot[0][2]
pol= plot[0][4]
tel= plot[0][6]
norm=flux/pol
norm_err=err/pol

#interval to be analized
lim1=8682
lim2=8692


tab1=np.column_stack((lam,norm))
siII=tab1[tab1[:,0]>lim1]  
siII=siII[siII[:,0]<lim2]

tab2=np.column_stack((lam,norm,err/pol))
siII_err=tab2[tab2[:,0]>lim1] 
siII_err=siII_err[siII_err[:,0]<lim2]

x=siII[:,0]
y=-siII[:,1]+1

#if y<1, force it to be 0
dim=len(y)
for i in range(0,dim-1):
    if y[i]>=1:
        y[i]=1
        siII_err[i,1]=0

#locate the lines in ism.at which stay inside the chosen interval
line1=[]
dim=len(ism)
for i in range(0,dim):
    if (ism[i]*(1+red))>x[0] and (ism[i]*(1+red))<x[-1]:
        line1.append(ism[i])

line=np.array(line1)

#plot both the first graph and the located lines on canvas
figC=plt.figure('Best Fit', figsize=(8,5))
figC1=figC.add_subplot(111)

figC1.set_xlim(x[0],x[-1])
figC1.scatter(x, -y+1, color='b')
figC1.errorbar(siII_err[:,0],siII_err[:,1],yerr=siII_err[:,2], ls='none')
figC1.step(x, -y+1, color='b')

i=0
for lam1 in ismw:
    figC1.axvline(float(lam1)*(1+red), linestyle='--', color='k')
    figC1.text(float(lam1)*(1+red)+0.05,0.95,ismn[i],fontsize=10, rotation=90)
    i+=1

#peacks and valleys determination->find the indices of peacks and holes
cb = np.array(y)
plt.gcf().canvas.draw()

#Interactive selection of the intervals: First select the limits, than the central points
print "Selezionare i massimi del fit (linea blu): al termine cliccare il tasto centrale"
valley=plt.ginput(-1, show_clicks=True, timeout=0)
print "Selezionare i minimi del fit (linea blu): al termine cliccare il tasto centrale"
peak=plt.ginput(-1, show_clicks=True, timeout=0)

Xu=[i[0] for i in valley]
Yu=[1-i[1] for i in valley]

Xd=[i[0] for i in peak]
Yd=[1-i[1] for i in peak]

Xd=np.array(Xd)
Yd=np.array(Yd)
Xu=np.array(Xu)
Yu=np.array(Yu)

dim=len(Xu)
dime=len(x)

coord=[]

    #gives coordinates to data points
for i in range(0,dime):
    coords = tuple([x[i],cb[i]])
    coord.append(coords)

posu=[] #find peacks indices
for i in range(0,dim-1):
    tree = spatial.KDTree(coord)
    distu, indiceu = tree.query([(Xd[i], Yd[i])])
    posu.append(int(indiceu))

posd=[] #find valley indices
for i in range(0,dim):
    tree = spatial.KDTree(coord)
    distd, indiced = tree.query([(Xu[i], Yu[i])])
    posd.append(int(indiced))


valli= np.array(posu)
picchi= np.array(posd)

dim=len(picchi)
k=picchi[0]

#From here the fid routine starts
FWHM1=[]
I=0
neg1=[]
cen1=[]
amp1=[]
wid1=[]

vgprofile=[]
sig=[]
cen=[]
amp=[]
gam=[]
t=[]

for i in range(0, dim-1):
    #control procedure to avoid any y<0 point
    X=x[k:picchi[i+1]:1]
    Y=cb[k:picchi[i+1]:1]
    ERR=err[k:picchi[i+1]:1]
    POL=pol[k:picchi[i+1]:1]
    ERR=ERR/POL
    #print k, picchi[i]+1
    neg1.append(np.flatnonzero(Y < 0))
    neg=np.asarray(np.where(Y<0))
    dneg=len(neg)
    Y=Y[Y>=0]
    X=np.delete(X, neg[0])
   
    if len(X)<3:    #Fit cannt be performed on less than 3 points intervals...exclude them from fit. This procedure can be used to neglet                  some points
        print "l'intervallo contiene troppi pochi elementi!"
        del neg1[:]
        k=picchi[i+1]
    else:
        #fit a gaussian and save his parameters
        ga= GaussianModel()
        pars=ga.guess(Y, x=X)
        resga = ga.fit(Y, pars, x=X, weights=1/(ERR**2))
        print(resga.fit_report(min_correl=0.25))
        ampga=resga.best_values["amplitude"]
        sigga=resga.best_values["sigma"]
        cenga=resga.best_values["center"]
        componentsga = resga.eval_components(x=x)
        gaprofile=componentsga['gaussian']
       
        #fit a lorentian and save his parameters
        lo= LorentzianModel()
        pars=lo.guess(Y, x=X)
        reslo = lo.fit(Y, pars, x=X, weights=1/(ERR**2))
        print(reslo.fit_report(min_correl=0.25))
        amplo=reslo.best_values["amplitude"]
        siglo=reslo.best_values["sigma"]
        cenlo=reslo.best_values["center"]
        componentslo = reslo.eval_components(x=x)
        loprofile=componentslo['lorentzian']
       
        #Use the found centers, gammas and sigmas to fit a voigt           
        mod = VoigtModel()
        pars = mod.guess(Y,x=X)
        pars['gamma'].set(value=siglo, vary=True, min=0, expr='')
        pars['sigma'].set(value=sigga, vary=True, min=0, expr='')
        pars['center'].set(value=cenga, vary=True, min=0, expr='')
        out = mod.fit(Y, pars, x=X, weights=1/(ERR**2))
        print(out.fit_report(min_correl=0.25))
        ampvg=out.best_values["amplitude"]
        sigvg=out.best_values["sigma"]
        gamvg=out.best_values["gamma"]
        cenvg=out.best_values["center"]
        ampvg=out.best_values["amplitude"]
        amp.append(ampvg)
        sigvg=out.best_values["sigma"]
        sig.append(sigvg)
        gamvg=out.best_values["gamma"]
        gam.append(gamvg)
        cenvg=out.best_values["center"]
        cen.append(cenvg)
        "uncomment if you want to plot the voigts"
        #p_=ampvg, sigvg, cenvg, gamvg
        #fwhmvg=1.0692*gamvg+sqrt(0.8664*gamvg**2+5.545083*sigvg**2)
        #hvg=(ampvg/(sigvg*np.sqrt(2*np.pi)))*wofz((1j*gamvg)/(sigvg*np.sqrt(2))).real
        componentsvg = out.eval_components(x=x)
        #figC1.plot(x, np.exp(-10*funcV(p_,x)))
        #figC1.plot(x, 1-componentsvg['voigt'])
       
        #use the voigt center (fixed) and the others parameters to fit the best line profile
        line_model=Model(line_profile, nan_policy='propagate')
        params = line_model.make_params(A=ampvg, alphaD=sigvg, nu_0=cenvg, alphaL=gamvg, tau=1)
        params['nu_0'].set(value=cenvg, vary=True, expr='')
        result= line_model.fit(1-Y, params, x=X, weights=1/(ERR**2))
        print(result.fit_report())
        ampvg=result.best_values["A"]
        sigvg=result.best_values["alphaD"]
        gamvg=result.best_values["alphaL"]
        cenvg=result.best_values["nu_0"]
        #p_=ampvg, sigvg, cenvg, gamvg
        sed = result.eval_components(x=x)
        #figC1.plot(x, sed['line_profile'])
        vgprofile.append(sed['line_profile'])
        t.append(result.best_values["tau"])
        #print sed#sedprofile=csed['voigt']
       
        I=I+1
    del neg1[:]
    k=picchi[i+1]

xx=x[picchi[0]:picchi[-1]:1]
yy=cb[picchi[0]:picchi[-1]:1]
ERR=err[picchi[0]:picchi[-1]:1]
POL=pol[picchi[0]:picchi[-1]:1]
ERR=ERR/POL

vgprofile=np.array(vgprofile) #How many voigt have you detected? Bring me to the correct final procedure
amp=np.array(amp)
sig=np.array(sig)
gam=np.array(gam)
cen=np.array(cen)
t=np.array(t)


    final_model=Model(final_2profile, nan_policy='propagate')
    params=final_model.make_params(A1=amp[0], alphaD1=sig[0], nu_01=cen[0], alphaL1=gam[0], tau1=t[0], A2=amp[1], alphaD2=sig[1], nu_02=cen[1], alphaL2=gam[1], tau2=t[1])
    params['nu_01'].set(value=cen[0], vary=False, expr='')#-((48/299792.458)*(line[1]*(1+red)))
    params['nu_02'].set(value=cen[1], vary=False, expr='')
    params['A1'].set(value=amp[0], vary=True, min=0, expr='')
    params['A2'].set(value=amp[1], vary=True, min=0, expr='')
    params['alphaD1'].set(value=sig[0], vary=True, min=0, expr='')
    params['alphaD2'].set(value=sig[1], vary=True, min=0, expr='')
    params['alphaL1'].set(value=gam[0], vary=True, min=0, expr='')
    params['alphaL2'].set(value=gam[1], vary=True, min=0, expr='')
    params['tau1'].set(value=t[0], vary=True, min=0, expr='')
    params['tau2'].set(value=t[1], vary=True, min=0, expr='')
    result= final_model.fit(1-yy, params, x=xx, weights=1/(ERR**2))
    sed = result.eval_components(x=x)
    print(result.fit_report())
    result.plot_residuals(figC1)

Matt Newville

unread,
Jun 7, 2019, 7:23:07 AM6/7/19
to lmfit-py
Hi,

On Fri, Jun 7, 2019 at 3:18 AM Filippo Monteverdi <filomon...@gmail.com> wrote:
Actually it's not so short....Basically I extract data from a.fits file. I locate the peacks by ginput, so that the script can understand the intervals in which a gaussan, a lorentzian a voigt and a line profile functios can be fitted...After that I merge the final ones in a unique function, here called final_2profile...in all the fit the chi squared was recovered yesterday by weighting the data...the problems on huge errors still remain:

Please re-read and follow the instructions.  The fit report and plot you showed earlier suggests you are modeling your data as the sum of two lineshapes.  That should take fewer that 50 lines of code.  There are several examples of this basic scenario in the docs and `examples` folder.  

Yes, it will be some work to simplifiy your code down to the essentials of the fit.  In that process of cleaning up and simplify your line profile, you will no doubt see that `A` and `tau` really are completely correlated.

--Matt

Jaime Villaseñor

unread,
Jun 26, 2019, 7:23:31 AM6/26/19
to lmfi...@googlegroups.com
(I'm sorry to intervene here)

Dear Filippo,

Most of my work is related to fitting Gaussians to spectral lines.
The problem I see here is that your resolution is too low, i.e.,
you can't resolve the secondary component properly. This is why
your errors are so big, you could recover the same result (the total
fit) with very different combinations of Gaussians, even more than
two for all we know. Unfortunately there is not a way around this
unless you give the code more information regarding the amplitude
and width of your two components, which I assume you don't know.

Now, I assumed this is a binary, or is it two different lines of the same
component? I saw the Mg line label, so, is there a reason for fitting a
Lorentzian or Voigt profile to a metallic line?

Cheers,
Jaime

--
You received this message because you are subscribed to the Google Groups "lmfit-py" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lmfit-py+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lmfit-py/CA%2B7ESbpw6BiJ_1cb4DcMqSUDZqdDf3xo6SodtyLCVbkSmKZELg%40mail.gmail.com.


--
Jaime Villaseñor
PhD Student
Institute for Astronomy
The University of Edinburgh
Royal Observatory
Blackford Hill
Edinburgh EH9 3HJ
U.K.

Matt Newville

unread,
Jun 26, 2019, 9:20:24 PM6/26/19
to lmfit-py
On Wed, Jun 26, 2019 at 6:23 AM Jaime Villaseñor <jaime....@gmail.com> wrote:
(I'm sorry to intervene here)

Dear Filippo,

Most of my work is related to fitting Gaussians to spectral lines.
The problem I see here is that your resolution is too low, i.e.,
you can't resolve the secondary component properly. This is why
your errors are so big, you could recover the same result (the total
fit) with very different combinations of Gaussians, even more than
two for all we know. Unfortunately there is not a way around this
unless you give the code more information regarding the amplitude
and width of your two components, which I assume you don't know.

Now, I assumed this is a binary, or is it two different lines of the same
component? I saw the Mg line label, so, is there a reason for fitting a
Lorentzian or Voigt profile to a metallic line?


I would not necessarily agree that the data shown could not be fitted with two resolvable Gaussian or Voigt peaks.  In the spectroscopic methods I use, this spectra would look to be pretty obviously the overlap of two resolvable lines, and does not look too noisy to me.

I think the problem is that the OP has a poor definition of Voigt (there is one built in to lmfit),  has code littered with useless junk like `np.exp(np.log(Thing))` - an expensive but effective way to say "please generate as many NaNs as possible",  and has code that is too messy for them to debug.   They were unable to reduce their problem to a simple example. It is a disturbing but common theme here.  

I believe a simple script like this:

    from lmfit.models import VoigtModel, ConstantModel

    model = ConstantModel('offset_') +  VoigtModel('peak1_') + VoigtModel('peak2_')
    params = model.make_params(offset_c=1, 
                                             peak1_center=8687, peak1_sigma=1, peak1_amplitude=-1,
                                             peak2_center=8689, peak2_sigma=1, peak2_amplitude=-0.5)

    params['peak1_center'].max = 8688   # prevent two peaks from crossing
    params['peak2_center'].min = 8688
    params['peak1_amplitude'].max = 0     # make sure peaks amplitudes are negative 
    params['peak2_amplitude'].max = 0   
    # params['offset_c'].vary = False       # may want to assert that offset=1?

    result = model.fit(ydata, params, x=xdata)

would probably fit this data pretty well, or at least be a good start.   It would be easy to replace the VoigtModels with GaussianModels, or explore varying the `gamma` parameter separately from `sigma` (as one peak does sort of look Lorentzian-like).

Starting with a simple example like this or coming back to such an example when something is not working is really key to doing science.   No, I don't mean just scientific programming -- that of course requires debugging.  I really do mean doing science:   such scripts are how you test your methods and if you can not test your methods, you are not doing science. 

--Matt

Reply all
Reply to author
Forward
0 new messages