2D Gaussian fitting to image


Hadrien Mary

Apr 10, 2016, 6:13:45 PM
to lmfi...@googlegroups.com
Hi,

First, thanks for your lmfit package, which is a very nice library that makes fitting in Python much easier!

I am trying to fit a model (here a 2D Gaussian, but it could be something else) to images in order to retrieve the positions of filaments (very short ones that look like a Gaussian, or very long ones that need another model I won't discuss here).

While trying to use lmfit, I have two questions.

I don't understand why, when I try to fit a simulated 2D Gaussian, the fit does not converge unless my initial values are very close to the solution... My parameters are bounded inside the image, so the algorithm could easily visit the whole parameter space (at least for x0 and y0).

I also tried different fitting methods, without success (some methods do not work because they need a Jacobian... but I don't know how to provide one to the model).

Next, when I try to fit my model to some real data, the fit just never converges, and I really don't understand why...

Because code will probably explain my issue better, see this notebook: https://github.com/hadim/public_notebooks/blob/master/2d_gaussian_fit/notebook.ipynb

Thanks for the help.

--
HadiM

HadiM

Apr 10, 2016, 6:17:38 PM
to lmfit-py
Note that I also tried flattening the data with ravel(), but I don't see any difference...

Matt Newville

Apr 10, 2016, 8:25:14 PM
to Hadrien Mary, lmfit-py
Hi HadiM,


On Sun, Apr 10, 2016 at 5:13 PM, Hadrien Mary <mare...@gmail.com> wrote:
> Hi,
>
> First, thanks for your lmfit package, which is a very nice library that makes fitting in Python much easier!
>
> I am trying to fit a model (here a 2D Gaussian, but it could be something else) to images in order to retrieve the positions of filaments (very short ones that look like a Gaussian, or very long ones that need another model I won't discuss here).
>
> While trying to use lmfit, I have two questions.
>
> I don't understand why, when I try to fit a simulated 2D Gaussian, the fit does not converge unless my initial values are very close to the solution...

Well, least-squares is not a peak-finding algorithm....
 
> My parameters are bounded inside the image, so the algorithm could easily visit the whole parameter space (at least for x0 and y0).


Perhaps the algorithm *could* visit the whole parameter space, but it does not do this.  At least, least-squares does not do this.  And, actually, it could *not* do this "easily", because there is no easy way of knowing what step size to take.   As far as the fitting algorithm is concerned, x0 and y0 are 64-bit floating-point numbers.   There are roughly 10^19 available values for these, and you don't want the algorithm to try them all ;).

But you might want to try walking the initial guess values over a 10x10 grid and taking the best result.   That would definitely work, but it's probably less efficient than it needs to be.

For the test data, you could simply guess starting points based on the x, y values of the maximum intensity.   Of course, finding peaks in more complicated images or datasets can be somewhat more challenging, but there's a lot of literature on this.
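For example (a sketch assuming a single dominant peak):

```python
import numpy as np

# Simulated 50x50 image with a Gaussian peak at column 34, row 12
yy, xx = np.mgrid[0:50, 0:50]
image = 200 * np.exp(-((xx - 34) ** 2 + (yy - 12) ** 2) / (2 * 3.0 ** 2))

# Use the brightest pixel as the starting guess for (x0, y0)
iy, ix = np.unravel_index(np.argmax(image), image.shape)
print(ix, iy)  # -> 34 12
```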


> I also tried different fitting methods, without success (some methods do not work because they need a Jacobian... but I don't know how to provide one to the model).


I'll try to make a few suggestions:   First, even a crude peak-finding method would help.  Second, a larger initial value for sigma will allow the fit to "feel" where the actual intensity is.   Even if you move the true peak into one of the corners (say, centered at (5, 5)), an initial guess for x0, y0, sigma of (25, 25, 25) (that is, centered in the image and very broad) should find the right solution.
 
Hope that helps,

--Matt

Hadrien Mary

Apr 10, 2016, 9:10:08 PM
to Matt Newville, lmfit-py
Hi Matt and thank you for your answer,

Sorry, I should have explained what I want to do more precisely.

I know peak-finding algorithms exist, but that's not what I want to do here. The Gaussian model was an example; the idea is to detect more complex objects such as filaments in images. To do that I have some models describing Gaussian walls, stretched peaks, filament tips, etc. I am trying to implement a method from this paper http://www.cell.com/biophysj/fulltext/S0006-3495(11)00467-X in Python (the original implementation is in Matlab). In the paper they use a fitting method (leastsq).

With the simulated data, the initial guess at (25, 25, 25) works nicely. But when I apply it to the real data, it fails. Note that x0 and y0 do move in the right direction, but sigma always stays at a very high value.

Coming back to the simulated data with (25, 25, 25), I found that the fit is not robust to noise (even small noise); for example, adding `image += np.random.poisson(50, image.shape)` prevents the fit from working. Same remark as with the real data: x0 and y0 move in the right direction but the sigma value never decreases. So I guess that simulated data + noise behaves the same as my real data.

What criteria make the method stop iterating? Each time, the fit ends pretty close to the correct value, and I would hope that a few more iterations would allow it to find the correct x0 and y0 and give it time to decrease the sigma value.

As you have probably noticed, I have more of a broad feel for the fitting algorithms than a really good understanding of what is going on :-)

Thank you again for your help.


--
HadiM

Hadrien Mary

Apr 10, 2016, 9:27:06 PM
to Matt Newville, lmfit-py
Looking at the fit reports, it appears that when the fit fails, x0 and sigma (and y0 and sigma) are highly correlated, positively or negatively (0.8-0.9), while that's not the case when the fit works. What does that mean?

Poor fit:

[[Model]]
    Model(func)
[[Fit Statistics]]
    # function evals   = 106
    # data points      = 2500
    # variables        = 4
    chi-square         = 1194887.925
    reduced chi-square = 478.721
    Akaike info crit   = 15435.796
    Bayesian info crit = 15459.092
[[Variables]]
    x0:      1.3086e-08 +/- 3.108836 (23756449250.30%) (init= 25)
    y0:      9.1276e-05 +/- 4.532151 (4965345.44%) (init= 25)
    sigma:   85.3817857 +/- 7.256131 (8.50%) (init= 25)
    H:       116.364443 +/- 1.454531 (1.25%) (init= 195.8755)
[[Correlations]] (unreported correlations are <  0.100)
    C(y0, sigma)                 = -0.883 
    C(x0, sigma)                 = -0.870 
    C(x0, y0)                    =  0.766 
    C(y0, H)                     = -0.757 
    C(x0, H)                     = -0.751 
    C(sigma, H)                  =  0.571 

Good fit:

[[Model]]
    Model(func)
[[Fit Statistics]]
    # function evals   = 85
    # data points      = 2500
    # variables        = 4
    chi-square         = 70446.751
    reduced chi-square = 28.224
    Akaike info crit   = 8358.419
    Bayesian info crit = 8381.715
[[Variables]]
    x0:      5.00456894 +/- 0.021736 (0.43%) (init= 25)
    y0:      5.00448600 +/- 0.021736 (0.43%) (init= 25)
    sigma:   3.12933197 +/- 0.016463 (0.53%) (init= 25)
    H:       201.230519 +/- 1.395864 (0.69%) (init= 196.9911)
[[Correlations]] (unreported correlations are <  0.100)
    C(sigma, H)                  = -0.723 


--
HadiM

Matt Newville

Apr 10, 2016, 10:55:58 PM
to Hadrien Mary, lmfit-py
Well,

On Sun, Apr 10, 2016 at 8:09 PM, Hadrien Mary <mare...@gmail.com> wrote:
> Hi Matt and thank you for your answer,
>
> Sorry, I should have explained what I want to do more precisely.
>
> I know peak-finding algorithms exist, but that's not what I want to do here. The Gaussian model was an example; the idea is to detect more complex objects such as filaments in images. To do that I have some models describing Gaussian walls, stretched peaks, filament tips, etc. I am trying to implement a method from this paper http://www.cell.com/biophysj/fulltext/S0006-3495(11)00467-X in Python (the original implementation is in Matlab). In the paper they use a fitting method (leastsq).

Perhaps you need a different model that considers these effects?

 
> With the simulated data, the initial guess at (25, 25, 25) works nicely. But when I apply it to the real data, it fails. Note that x0 and y0 do move in the right direction, but sigma always stays at a very high value.


Well, again, I'd say that approximately finding the peaks or features is important.

 
> Coming back to the simulated data with (25, 25, 25), I found that the fit is not robust to noise (even small noise); for example, adding `image += np.random.poisson(50, image.shape)` prevents the fit from working. Same remark as with the real data: x0 and y0 move in the right direction but the sigma value never decreases. So I guess that simulated data + noise behaves the same as my real data.


I think I don't know enough about your problem to know why that would be.
 
> What criteria make the method stop iterating? Each time, the fit ends pretty close to the correct value, and I would hope that a few more iterations would allow it to find the correct x0 and y0 and give it time to decrease the sigma value.

The fit runs, approximately, until the tolerances for the fit are met, or until small changes in the Parameter values stop making the fit residual better.

--Matt