24 views

Skip to first unread message

Apr 3, 2024, 4:07:51 PMApr 3

to lmfit-py

Dear all,

I hope I did not forget to mention anything important, and please let me know if you need more information

I've been struggling with some model fits, and would like some advice, please

Summary: I am fitting a custom model to my dataset and I have some weird fit results, which I believe are due to NaN values on the data. I don't know whether I missed anything in my codes (or in the codes of the library that was developed in the lab to do these fits to the data) or I'm doing anything wrong...

Detailed information:

I have a dataset of eye velocity over time, which has NaN values due to saccades (= very high peaks on the eye velocity that I'm not interested in, so I removed them)

My model is a concatenation of three pieces of functions: a flat regression - an exponential - a sigmoid

Here is an example (data in blue, fitted model in red):

It works very well when the data does not have NaN values, but otherwise (with the nan_policy set to 'omit' -- it does not work at all with 'propagate') I get weird fits like this one:

So I started doing some tests. If I use the eval() function from the lmfit library without specifying the independent variable, I get an array of the same length as the data without NaNs, as you can see in the figure below (same data as above)

If I now use the function plot_fit(), which takes into consideration the independent variable used in the fit process by default, I get this:

i.e., a "broken" exponential, when I would actually expect something like what is shown here: https://stackoverflow.com/questions/73199517/omit-values-when-fitting-lmfit-nan-policy

Do you have any idea of why this happens and what I could do to have a proper fit? (besides interpolations in the data -- this would really be my last option, as interpolation introduces biases in the fitted models...)

Here is the model function that I use (in case it helps):

def velocity_ExponentialAnti_SigmoidalPursuit(x, target_dir, baseline, anti_onset, anti_slope, anti_offset, pursuit_slope, horizontal_shift, steady_state, fit_baseline:bool=False) :

'''

Model of eye-velocity over time, in a smooth pursuit task.

Anticipatory phase is modeled as an exponential, and initial visually guided pursuit as a sigmoid

Parameters

----------

x : ndarray

time in milliseconds

target_dir : int

direction of the target

-1 (corresponding usually to left/down) or 1 (corresponding usually to right/up)

anti_onset : int

anticipation onset in milliseconds

anti_slope : float

defines the slope of the sigmoid corresponding to the anticipatory phase

anti_offset : int

offset of the anticipatory phase in milliseconds

pursuit_slope : float

defines the slope of the sigmoid corresponding to the initial phase of the visually-guided pursuit

horizontal_shift : int

shift in the sigmoid function corresponding to the initial phase of the visually-guided pursuit

steady_state : float

eye velocity reached in the late phase of visually-guided pursuit (in degrees/sec)

fit_baseline : bool

if ``True``, fits a flat regression corresponding to the baseline velocity

this parameter is intended to be used when there were problems with the eye-tracker calibration,

such that eye velocity is not centered in zero during the fixation phase

Returns

-------

velocity : list

model of the eye velocity in deg/sec

'''

if not fit_baseline: baseline = 0

anti_sign = np.sign(anti_slope)

anti_slope = abs(anti_slope)/1000 # to switch from deg/sec to deg/ms

pursuit_slope = target_dir*pursuit_slope/1000

steady_state = steady_state*target_dir

anti_onset = np.floor(anti_onset)

anti_offset = np.floor(anti_offset)

time = x

idx_baseline = np.where(time < anti_onset)[0]

idx_anticipation = np.where((time >= anti_onset)&(time < anti_offset))[0]

idx_pursuit = np.where(time >= anti_offset)[0]

x_anticipation = np.arange(0,len(idx_anticipation),1)

x_pursuit = np.arange(0,len(idx_pursuit),1)

velocity = np.zeros(len(time))

velocity[idx_baseline] = baseline * np.ones(len(idx_baseline))

velocity[idx_anticipation] = baseline + anti_sign*(np.exp(anti_slope*x_anticipation)-1)

max_anticipation = anti_sign*max(abs(velocity))

sigmoid_pursuit = (steady_state-max_anticipation)/(1 + np.exp(-pursuit_slope*(x_pursuit-horizontal_shift)))

velocity[idx_pursuit] = max_anticipation + sigmoid_pursuit - sigmoid_pursuit[0]

return velocity

the parameters are set somewhere else in the library, but they are basically this:

param_fit=[{'name':'steady_state', 'value':value_steady_state, 'min':3., 'max':40., 'vary':True },

{'name':'dir_target', 'value':dir_target, 'min':None, 'max':None, 'vary':False },

{'name':'anti_slope_tmp', 'value':value_anti, 'min':-40., 'max':40., 'vary':vary_anti,},

{'name':'anti_slope', 'expr':'anti_slope_tmp if abs(anti_slope_tmp) >= .5 else 0'}, # arbitrary threshold for valid acceleration

{'name':'anti_offset', 'value':TargetOn+90, 'min':TargetOn+30, 'max':TargetOn+120, 'vary':True },

{'name':'anti_onset_tmp', 'value':TargetOn-100, 'min':t_0, 'max':TargetOn, 'vary':vary_start_anti},

{'name':'anti_onset', 'expr':'anti_onset_tmp if a_anti != 0 else anti_offset-1'},

{'name':'pursuit_slope', 'value':80, 'min':3., 'max':500., 'vary':'vary'},

{'name':'baseline', 'value':0, 'min':-1, 'max':1, 'vary':True},

{'name':'horizontal_shift', 'value':100, 'min':0, 'max':110, 'vary':True},

]

The library does the fit in two steps, and for parameters where 'vary'='vary', vary=False in the first step and vary=True in the second step:

out = model.fit(data_trial, params, nan_policy='omit', **inde_vars)

# make the other parameters vary now

for num_par in range(len(param_fit)) :

if 'vary' in param_fit[num_par].keys() :

if param_fit[num_par]['vary'] == 'vary' :

out.params[param_fit[num_par]['name']].set(vary=True)

result_deg = model.fit(data_trial, out.params, method='powell', nan_policy='omit', **inde_vars)

Thanks in advance,

Vanessa

Apr 3, 2024, 10:28:40 PMApr 3

to lmfi...@googlegroups.com

Hi Vanessa,

Including a complete, minimal example is always recommended.

If you have data with NaN values meant to signal "missing data", then the best thing to do is simply remove those data points.

Using `nan_policy='omit'` is meant to do that: remove NaN before doing the fit. If there are NaN, then a fit with

`nan_policy='propagate'` will definitely fail.

It looks to me like that is what you are doing, and that it is working. That is, your third plot is plotting with the x-axis being just the array index, and values with y=NaN were removed (omitted) from the array. So, that looks OK to me (but, again, example code would never hurt)

You can certainly evaluate a model at any value of the independent variable, including those values where the y data was NaN.

That should then "fill in" the missing data points.

--

You received this message because you are subscribed to the Google Groups "lmfit-py" group.

To unsubscribe from this group and stop receiving emails from it, send an email to lmfit-py+u...@googlegroups.com.

To view this discussion on the web visit https://groups.google.com/d/msgid/lmfit-py/9ebc4c59-e15b-47a6-a67c-75df8f9f4426n%40googlegroups.com.

Apr 5, 2024, 10:05:32 AMApr 5

to lmfit-py

Hi Matt,

thanks again! I was in the process of writing/sending you the reply below when I had a click: I could actually change the model function, so that it would take into consideration that the data can possibly have NaN values... and it works like a charm

I'm sending also the new version of the notebook (example2.ipynb), in case someone else runs into the same problem and needs a solution :)

best

-------------------------------

Hi Matt,thank you very much for your quick reply! I'm sorry that I didn't include a minimal example in my first message, but it was already too late and I had to sleep in order to be a functional human being today :)

In any case, I attached a python notebook (example.ipynb) with two examples: one with NaNs, one without...

Yes, I understand that the *nan_policy=omit* removes NaNs from both the data and independent variable... what doesn't make sense to me is that I have the impression that the fit process is ignoring my independent variable (or, more precisely, the gaps in it)

Reply all

Reply to author

Forward

0 new messages

Search

Clear search

Close search

Google apps

Main menu