How to prevent NaNs with log in model function?

489 views
Skip to first unread message

Felix Schlegel

unread,
Aug 24, 2018, 5:30:49 AM8/24/18
to lmfit-py
hi, 

I wan to fit multiple peaks at once (because they should share the shape parameters and only differ in amplitude), using a gamma variate function for each peak. 
In order to shift individual onsets the correct position t0, the formula contains several instances of log(t-t0) giving NaNs until t becomes larger than t0. 
To work around the errors, the script for the model loops thorugh every single timepoint and replacing values of t<=t0 with 0. Unfortunately, it turns out to be extremely slow, I cannot even run a simple case to completion.

What would be a better solution to prevent NaNs from occuring, or why can't they just be ignored? 


script =  """
def gammavar(ymax, tmax, a, x, t0):
    output=zeros(size(x))
    for i in range(size(x)):
        if i <= t0:
            output[i]=0
        else:
            output[i]=exp(log(ymax)+a*(1+log((i-t0)/tmax)-(i-t0)/tmax))
    return output
"""



def CreateModel(stimulus_times):
    model
= ExpressionModel(f'gammavar(ymax0, tmax0, a0, x, 0)',
                            init_script
=script,
                            intependent_vars
=['x'],
                            nan_policy
='propagate')
    k
=1
   
for i in stimulus_times[1:]:
        model
+= ExpressionModel(f'gammavar(ymax{k}, tmax{k}, a{k}, x, {i})',
                            init_script
=script,
                            intependent_vars
=['x'],
                            nan_policy
='propagate')
        k
+= 1


   
params = model.make_params()
   
params['ymax0'].set(2, min=0.1, max=10)
   
params['tmax0'].set(2, min=0.1, max=10)
   
params['a0'].set(1, min=0.1, max=10)
   
for i in range(1,len(stimulus_times)):
       
params[f'tmax{i}'].set(expr='tmax0')  
       
params[f'a{i}'].set(expr='a0')    
       
params[f'ymax{i}'].set(1, min=0.1, max=10)  
       
   
return model, params

    
model
, params = CreateModel(n_stimuli)
   
   
signal
= blocks_to_fit.iloc[:,0].values*100
time
= blocks_to_fit.index.values

result
= model.fit(signal, params, x=time)



Felix Schlegel

unread,
Aug 24, 2018, 6:39:58 AM8/24/18
to lmfit-py
Actually the fit runs with nan_policy='propagate' and the script below without any loops:

script =  """

        def gammavar(ymax, tmax, a, x, t0):
        return exp(log(ymax)+a*(1+log((x-t0)/tmax)-(x-t0)/tmax))
        """

But all variables become NaN. I am not sure if its a mistake with how the composite model is created. 

Fit report:

[[Model]] (((((((((((((((((((((((Model(_eval, intependent_vars='['x']') + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) + Model(_eval, intependent_vars='['x']')) [[Fit Statistics]] # fitting method = leastsq # function evals = 54001 # data points = 3800 # variables = 26 chi-square = nan reduced chi-square = nan Akaike info crit = nan Bayesian info crit = nan [[Variables]] ymax0: nan (init = 2) tmax0: nan (init = 2) a0: nan (init = 1) ymax1: nan (init = 1) tmax1: nan == 'tmax0' a1: nan == 'a0' ymax2: nan (init = 1) tmax2: nan == 'tmax0' a2: nan == 'a0' ymax3: nan (init = 1) tmax3: nan == 'tmax0' a3: nan == 'a0' ymax4: nan (init = 1) tmax4: nan == 'tmax0' a4: nan == 'a0' ymax5: nan (init = 1) tmax5: nan == 'tmax0' a5: nan == 'a0' ymax6: nan (init = 1) tmax6: nan == 'tmax0' a6: nan == 'a0' ymax7: nan (init = 1) tmax7: nan == 'tmax0' a7: nan == 'a0' ymax8: nan (init = 1) tmax8: nan == 'tmax0' a8: nan == 'a0' ymax9: nan (init = 1) tmax9: nan == 'tmax0' a9: nan == 'a0' ymax10: nan (init = 1) tmax10: nan == 'tmax0' a10: nan == 'a0' ymax11: nan (init = 1) tmax11: nan == 'tmax0' a11: nan == 'a0' ymax12: nan (init = 1) tmax12: nan == 'tmax0' a12: nan == 'a0' ymax13: nan (init = 1) tmax13: nan == 'tmax0' a13: nan == 'a0' ymax14: nan (init = 1) tmax14: nan == 'tmax0' a14: nan == 'a0' ymax15: nan (init = 1) tmax15: nan == 'tmax0' a15: nan == 'a0' ymax16: nan (init = 1) tmax16: nan == 'tmax0' a16: nan == 'a0' ymax17: nan (init = 1) tmax17: nan == 'tmax0' a17: nan == 'a0' ymax18: nan (init = 1) tmax18: nan == 'tmax0' a18: nan == 'a0' ymax19: nan (init = 1) tmax19: nan == 'tmax0' a19: nan == 'a0' ymax20: nan (init = 1) tmax20: nan == 'tmax0' a20: nan == 'a0' ymax21: nan (init = 1) tmax21: nan == 'tmax0' a21: nan == 'a0' ymax22: nan (init = 1) tmax22: nan == 'tmax0' a22: nan == 'a0' ymax23: nan (init = 1) tmax23: nan == 'tmax0' a23: nan == 'a0'

Matt Newville

unread,
Aug 24, 2018, 8:08:17 AM8/24/18
to lmfit-py
Hi Felix,

On Fri, Aug 24, 2018 at 5:40 AM Felix Schlegel <schlege...@gmail.com> wrote:
Actually the fit runs with nan_policy='propagate' and the script below without any loops:

script =  """
        def gammavar(ymax, tmax, a, x, t0):
        return exp(log(ymax)+a*(1+log((x-t0)/tmax)-(x-t0)/tmax))
        """

But all variables become NaN. I am not sure if its a mistake with how the composite model is created. 


Why not use a python model instead of an ExpressionModel?  I think you're making this more complicated than it needs to be.

As you have noticed, using log() can definitely cause NaNs, and if the model generates NaNs the fit will not succeed.   So you almost certainly want to prevent the NaNs from happening.  I would suggest trying something like this:

    import numpy as np
    from lmfit import Model

    def gammavar(t, t0, a, ymax, tmax):
       g = np.exp(np.log(ymax)+a*(1+np.log((t-t0)/tmax)-(t-t0)/tmax))
       g[np.where(~np.isfinite(g))] = 0.0
       return g
 
    model = Model(gammavar)
   
params = model.makeparams(t0=0, a=1, ymax=2, tmax=2)
    params['t0'].vary = False


You probably have (or I hope you have) tried building one component, then addding more before jumping to a fit with more than 20 components.

--Matt


Pengcheng Zhang

unread,
Aug 27, 2018, 5:52:30 PM8/27/18
to lmfit-py
Hi Felix,

One way to avoid NaNs and while at the same time improving performance is to take advantage of the Boolean arrays.

You can rewrite your gammavar function as follows:


def gammavar(x, ymax, tmax, a, t0):
    x_input = x * (x > t0) + (t0 + tmax) * (x <= t0)
    return (exp(log(ymax)+a*(1+log((x_input-t0)/tmax)-(x_input-t0)/tmax))) * (x > t0)


Then you can construct your composite model wrapped around gammavar.


A modified input variable (x_input) is constructed. Taking advantage of the Boolean values (True = 1, False = 0), x_input is constructed so that when x > t0, x_input = x, whereas when x <= t0, x_input = t0 + tmax. This ensures that no value less than t0 will be plugged in to log(x_input - t0), thus avoiding NaNs.

Multiplying the function value by (x > t0) ensures that the proper function values are returned when x > t0, otherwise when x <= t0, the function returns 0.

This syntax works for both scalar and array inputs.

Hope this helps!
Pengcheng

Anusha Kodimela

unread,
Aug 28, 2018, 1:28:53 PM8/28/18
to lmfit-py

Hello Pengcheng!

could you please help me deal with Nans in my case?My code works fine if the B value obtained is a positive value but throws Nan values error if its negative.the data in the excel sheet is attached.
the plot obtained when b value is positive is also attached.

Kindly help

code:
import matplotlib.pyplot as plt
import pandas as pd
from lmfit import Model
import numpy as np


def ssexp_if(x, a, b, c):
arg = (b * x) ** c
g = a * np.exp(-2 * arg)
return g

datafile = pd.read_excel(r'C:\Users\Desktop\VIP_SSexp_curvefitting\Trial10.xls')
x = datafile.iloc[ : , 0]
y = datafile.iloc[ : , 1]
gmodel = Model(ssexp_if)
first = 5 #change the index after inserting a heading to the excel
last = 89
index = 10
xdatam = x[first:last]
ydatam = y[first:last]
listfirst =6
a = np.mean(ydatam[first+listfirst:first+10+listfirst])
b = (np.log(ydatam[first+listfirst]/ydatam[index+first+listfirst]))/(2*(xdatam[index+first+listfirst]-xdatam[first+listfirst]))
c = 1
params = gmodel.make_params()

params = gmodel.make_params(a=a, b=b, c=c)


yfit = gmodel.eval(params, x=x)
#print(pd.isnull(x))
result = gmodel.fit(yfit, params, x=x, nan_policy='raise')
plt.semilogx(x, y, 'bo')
#plt.semilogx(x, result.init_fit, 'k--',)
plt.semilogx(x, result.best_fit, 'r-')
plt.show()

error:
C:\UsersycharmProjects\CurveFitting_Python\venv\Scripts\python.exe C:/Users/ifosys/PycharmProjects/CurveFitting_Python/CurveFitting_Lmfit.py
C:/Users/PycharmProjects/CurveFitting_Python/CurveFitting_Lmfit.py:9: RuntimeWarning: overflow encountered in exp
  g = a * np.exp(-2 * arg)
Traceback (most recent call last):
  File "C:/Users/PycharmProjects/CurveFitting_Python/CurveFitting_Lmfit.py", line 32, in <module>
    result = gmodel.fit(yfit, params, x=x, nan_policy='raise')
  File "C:\Users\PycharmProjects\CurveFitting_Python\venv\lib\site-packages\lmfit\model.py", line 873, in fit
    output.fit(data=data, weights=weights)
  File "C:\Users\PycharmProjects\CurveFitting_Python\venv\lib\site-packages\lmfit\model.py", line 1217, in fit
    _ret = self.minimize(method=self.method)
  File "C:\Users\PycharmProjects\CurveFitting_Python\venv\lib\site-packages\lmfit\minimizer.py", line 1811, in minimize
    return function(**kwargs)
  File "C:\Users\PycharmProjects\CurveFitting_Python\venv\lib\site-packages\lmfit\minimizer.py", line 1364, in leastsq
    lsout = scipy_leastsq(self.__residual, variables, **lskws)
  File "C:\Users\PycharmProjects\CurveFitting_Python\venv\lib\site-packages\scipy\optimize\minpack.py", line 383, in leastsq
    shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
  File "C:\Users\PycharmProjects\CurveFitting_Python\venv\lib\site-packages\scipy\optimize\minpack.py", line 27, in _check_func
    res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
  File "C:\Users\PycharmProjects\CurveFitting_Python\venv\lib\site-packages\lmfit\minimizer.py", line 517, in __residual
    nan_policy=self.nan_policy)
  File "C:\Users\PycharmProjects\CurveFitting_Python\venv\lib\site-packages\lmfit\minimizer.py", line 2021, in _nan_policy
    raise ValueError("The input contains nan values")
ValueError: The input contains nan values

Process finished with exit code 1
Trial10_new.png
Trial10.xls
Message has been deleted

Felix Schlegel

unread,
Sep 10, 2018, 10:14:14 AM9/10/18
to lmfit-py
Hi Anusha,
Wouldn't it give a good fit if you just not give any initial values for the parameters? And set the boundaries to 0 or larger
as shown in

Matt Newville

unread,
Sep 10, 2018, 10:57:17 AM9/10/18
to lmfit-py
On Mon, Sep 10, 2018 at 9:14 AM Felix Schlegel <schlege...@gmail.com> wrote:
Hi Anusha,
Wouldn't it give a good fit if you just not give any initial values for the parameters? And set the boundaries to 0 or larger
as shown in


I must not be understanding you.   It is never a good idea to "not give any initial values" and it is always a good idea to give good initial values.  The link you point to shows one of a few ways to set initial values for parameters. 

Reply all
Reply to author
Forward
0 new messages