Feature request: add a SineModel to models.py

32 views
Skip to first unread message

Leonhard Neuhaus

unread,
Nov 7, 2019, 4:55:31 AM11/7/19
to lmfit-py
I hope this is the right place to post this feature request:

I have the impression a simple sinusoidal fit model like ``f(x) = a * sin(b * x + c) + d``  is missing from the collection of standard models in models.py. I believe it makes sense to try to add such a model here, as it is in general non-trivial to find good guesses. For example, while ``a``and ``d`` can be reasonably well guessed from the min/max of the data, and ``c`` might be inferred from an argmax or argmin of the data, I find that a good guess of ``b`` requires taking the argmax of the Fourier transformed data. I believe the place to develop a common solution for this problem is lmfit. I would be happy to contribute a pull-request that implements this, just wanted to make sure
- that there is no such solution available already, 
- that the Fourier transform approach is the best one to get the frequency/period,
- and that my pull-request would be appreciated.

Thanks!

Matt Newville

unread,
Nov 7, 2019, 1:33:22 PM11/7/19
to lmfit-py
Hi Leonhard, 



You're right that there is not any sort of periodic function, and there probably should be.  I'd even suggest maybe having a damped sine wave would be a good idea.  A PR for this would be great. 

I would suggest that giving more explicit names to the variables would be good, perhaps
    f(x) = amplitude * sin(frequency * x + phase_shift) + offset

(and maybe even leave "offset" out as building a composite model of SineModel() + ConstantModel() would be easy).

The `guess` method of a builtin models could be considered "nice to have" but optional.  Still, I would agree that having a `guess` method would be doable and preferable here.  Like you say, `amplitude` and `offset` (if retained) would be pretty easy (perhaps `amplitude=(data.max()-data.min())/2` and `offset=data.mean()`) while `frequency` and `phase_shift` could be more challenging.   IMHO, using a FT for that would be fine -- we know that `scipy` is required, so using anything from `scipy` has to be OK.   You might also be able to find the distance between a local max (with argmax) and the closest minimum to get the frequency, but I'm not sure how error-prone that would be for noisy signals. 

Anyway, yes a PR for this would be welcome!

--Matt Newville 

Joseph Fox-Rabinovitz

unread,
Nov 7, 2019, 3:21:45 PM11/7/19
to lmfi...@googlegroups.com
Hi,

Estimating the parameters of sine waves without resorting to fft is something I've been working with for a while. Here is a paper that proposes a simple, and in my opinion rather clever, method for estimating all the parameters: https://www.scribd.com/doc/14674814/Regressions-et-equations-integrales. I started translating the paper and implementing the methods it describes in a scikit-like format a while back, but had to take a long break. The aborted translation is here: https://scikit-guess.readthedocs.io/en/latest/appendices/reei/translation.html, and the code is here: https://github.com/madphysicist/scikit-guess. It's still very incomplete, but I'm hoping to finish it by mid next year. In the meantime, perhaps you will find it useful.

At the same time, I'm working on a unique non-linear algorithm for estimating the frequency and phase of the wave. I will make it available to you as soon as I've completed the paper. The algorithm does not fit the amplitude of the wave at all. As a consequence, I've found that one of the better estimators for amplitude is std(y) * sqrt(2). It's much more robust against outliers than using min and max. Perhaps you could work out a similar metric from the IQR, which would probably be even more robust.

Apologies for the shameless (and premature) plugs of my ongoing projects.

Could you please tag me on GitHub as @madphysicist when you submit the PR? I would be very interested to take a look at it.

Regards,

- Joe



--
You received this message because you are subscribed to the Google Groups "lmfit-py" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lmfit-py+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lmfit-py/CA%2B7ESbrK9m2B2mh713y1n0LjC5sJhY_v3opJw_FpXKS_YoEqiQ%40mail.gmail.com.

Matt Newville

unread,
Nov 7, 2019, 4:01:09 PM11/7/19
to lmfit-py
On Thu, Nov 7, 2019 at 2:21 PM Joseph Fox-Rabinovitz <jfoxrab...@gmail.com> wrote:
Hi,

Estimating the parameters of sine waves without resorting to fft is something I've been working with for a while. Here is a paper that proposes a simple, and in my opinion rather clever, method for estimating all the parameters: https://www.scribd.com/doc/14674814/Regressions-et-equations-integrales. I started translating the paper and implementing the methods it describes in a scikit-like format a while back, but had to take a long break. The aborted translation is here: https://scikit-guess.readthedocs.io/en/latest/appendices/reei/translation.html, and the code is here: https://github.com/madphysicist/scikit-guess. It's still very incomplete, but I'm hoping to finish it by mid next year. In the meantime, perhaps you will find it useful.

At the same time, I'm working on a unique non-linear algorithm for estimating the frequency and phase of the wave. I will make it available to you as soon as I've completed the paper. The algorithm does not fit the amplitude of the wave at all. As a consequence, I've found that one of the better estimators for amplitude is std(y) * sqrt(2). It's much more robust against outliers than using min and max. Perhaps you could work out a similar metric from the IQR, which would probably be even more robust.

Apologies for the shameless (and premature) plugs of my ongoing projects.

Could you please tag me on GitHub as @madphysicist when you submit the PR? I would be very interested to take a look at it.


That all seems fine to me, and glad to see interest and work on this sort of stuff.   I would guess (sorry ;)) that getting initial values for amplitude and frequency that were even "in the ballpark" would be sufficient., and that off by a factor of 5 would usually be OK.  That is, peak-to-peak or root-mean-square is probably a fine starting estimate of amplitude for a fit.   For frequency, I wouldn't be surprised if `5.0/abs(x.max()-x.min()))` wasn't good enough most of the time when a single frequency dominates.  That said, having initial values that are "almost right" and only need refining are worth the effort when mixing frequencies or when there are other components (decays, etc).   In those cases, using an FT seems like a more robust approach.

But, again: cool! Let's add the best stuff you come up with!



Regards,

- Joe



On Thu, Nov 7, 2019, 1:33 PM Matt Newville <newv...@cars.uchicago.edu> wrote:
Hi Leonhard, 



On Thu, Nov 7, 2019 at 3:55 AM Leonhard Neuhaus <pyrpl.read...@gmail.com> wrote:
I hope this is the right place to post this feature request:

I have the impression a simple sinusoidal fit model like ``f(x) = a * sin(b * x + c) + d``  is missing from the collection of standard models in models.py. I believe it makes sense to try to add such a model here, as it is in general non-trivial to find good guesses. For example, while ``a``and ``d`` can be reasonably well guessed from the min/max of the data, and ``c`` might be inferred from an argmax or argmin of the data, I find that a good guess of ``b`` requires taking the argmax of the Fourier transformed data. I believe the place to develop a common solution for this problem is lmfit. I would be happy to contribute a pull-request that implements this, just wanted to make sure
- that there is no such solution available already, 
- that the Fourier transform approach is the best one to get the frequency/period,
- and that my pull-request would be appreciated.

You're right that there is not any sort of periodic function, and there probably should be.  I'd even suggest maybe having a damped sine wave would be a good idea.  A PR for this would be great. 

I would suggest that giving more explicit names to the variables would be good, perhaps
    f(x) = amplitude * sin(frequency * x + phase_shift) + offset

(and maybe even leave "offset" out as building a composite model of SineModel() + ConstantModel() would be easy).

The `guess` method of a builtin models could be considered "nice to have" but optional.  Still, I would agree that having a `guess` method would be doable and preferable here.  Like you say, `amplitude` and `offset` (if retained) would be pretty easy (perhaps `amplitude=(data.max()-data.min())/2` and `offset=data.mean()`) while `frequency` and `phase_shift` could be more challenging.   IMHO, using a FT for that would be fine -- we know that `scipy` is required, so using anything from `scipy` has to be OK.   You might also be able to find the distance between a local max (with argmax) and the closest minimum to get the frequency, but I'm not sure how error-prone that would be for noisy signals. 

Anyway, yes a PR for this would be welcome!

--Matt Newville 

--
You received this message because you are subscribed to the Google Groups "lmfit-py" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lmfit-py+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lmfit-py/CA%2B7ESbrK9m2B2mh713y1n0LjC5sJhY_v3opJw_FpXKS_YoEqiQ%40mail.gmail.com.

--
You received this message because you are subscribed to the Google Groups "lmfit-py" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lmfit-py+u...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages