Fitting a 2D 'Model'


Moe

unread,
Oct 22, 2014, 8:04:54 AM10/22/14
to lmfi...@googlegroups.com

First: Sorry to Matt for posting this one as an issue. I just didn't see the mailing list. ;)

I'm trying to use lmfit to fit (mostly Gaussians) to greyscale image data.
Therefore I created a 'Model' with 2 independent variables x1, x2.
However, telling lmfit to treat it as 2D data fails: scipy raises "minpack.error: Result from function call is not a proper array of floats." in minimizer.py::minimize().

Could you please explain how to do this correctly? Maybe an example in the documentation would be useful for others, too.

Here are the basics of my approach:

def _function(self, x1, x2, amplitude, sigma, center1, center2):
    # symmetric 2D Gaussian centered at (center1, center2)
    val = (amplitude/(np.sqrt(2*np.pi)*sigma)) * np.exp(-((x1-center1)**2+(x2-center2)**2)/(2*sigma**2))
    return val

data = _loadRawImage('path')   # loaded as numpy array
x1 = np.arange(len(data[0, :]))
x2 = np.arange(len(data[:, 0]))

model = lmfit.Model(self._function, independent_vars=["x1", "x2"],
                    param_names=["amplitude", "sigma", "center1", "center2"])
model.fit(data, x1=x1, x2=x2, params=params)    # params made elsewhere, get loaded into the Model correctly

Matt Newville

unread,
Oct 22, 2014, 9:53:40 AM10/22/14
to lmfit-py
On Wed, Oct 22, 2014 at 7:04 AM, Moe <timo...@googlemail.com> wrote:

First: Sorry to Matt for posting this one as an issue. I just didn't see the mailing list. ;)


No problem -- the list is new.   I'd like to see it used instead of Github Issues partly because I think it's more easily searched.
 



Basically, you need the function and data to be flattened to 1-d arrays, as with

    data = data.flatten()

and then

    def _function(self, x1, x2, amplitude, sigma, center1, center2):
        val = ( (amplitude/(np.sqrt(2*np.pi)*sigma)) *
                 np.exp(-((x1-center1)**2+(x2-center2)**2)/(2*sigma**2)) )
        return val.flatten()

That is, the minimization is done on a 1-D array, and it doesn't matter (for the minimization) how your data is arranged into a 2d array.

Admittedly, that's sort of a work-around to make your data look like lmfit expects it to look.  It's reasonable to say that lmfit should be able to do this flattening for you automatically.   That would allow you to use the same function to generate the 2-D array that best matched your data, which would probably be preferred. 

We haven't done much explicitly for 2- or 3-d datasets, but such "auto-flattening" would probably go a long way toward better support for 2- and 3-d models.
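To make the flattening approach concrete, here is a minimal self-contained sketch. It uses scipy.optimize.curve_fit rather than lmfit so it stands alone, but the mechanism is the same one described above: the model function returns a 1-D array, and the 2-D data is raveled to match. The grid sizes, parameter values, and starting guesses are made up for illustration:

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss2d(xy, amplitude, sigma, center1, center2):
    # xy holds the flattened coordinate arrays; the return value is 1-D,
    # which is exactly the shape a 1-D least-squares driver expects
    x1, x2 = xy
    return (amplitude / (np.sqrt(2 * np.pi) * sigma)) * np.exp(
        -((x1 - center1) ** 2 + (x2 - center2) ** 2) / (2 * sigma ** 2))

# synthetic noiseless "image" on a 40x50 grid
x1, x2 = np.meshgrid(np.arange(50), np.arange(40))
xy = np.vstack([x1.ravel(), x2.ravel()])
data = gauss2d(xy, 100.0, 3.0, 25.0, 20.0).reshape(40, 50)

# flatten the data in the same (C) order as the coordinate grids
popt, _ = curve_fit(gauss2d, xy, data.ravel(), p0=[80.0, 2.0, 24.0, 19.0])
```

Because the data and the coordinate grids are raveled in the same order, the arrangement into a 2-d image never enters the minimization, as noted above.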


--Matt

Yuxiang Wang

unread,
Oct 22, 2014, 12:55:16 PM10/22/14
to lmfi...@googlegroups.com, newv...@cars.uchicago.edu
1) Aside from ndarray.flatten(), I think ndarray.ravel() is another option... It doesn't always make a copy, so it will be faster (but be careful if you are using C/Fortran code).

2) Matt - being curious, what could be the potential pitfalls if we simply check the ndarray.ndim, and do a flattening if ndim > 1? Or, what if we flatten every ndarray (returned by the function, and the fitting target array)?

-Shawn

Yuxiang Wang

unread,
Oct 22, 2014, 9:56:46 PM10/22/14
to Matt Newville, lmfi...@googlegroups.com
Matt,

Thanks for your thoughts on this! It sounds like a really interesting project to work on. Looking forward to playing with it a bit when I get a chance :)

-Shawn

On Wed, Oct 22, 2014 at 2:26 PM, Matt Newville <newv...@cars.uchicago.edu> wrote:
Hi Shawn,

On Wed, Oct 22, 2014 at 11:55 AM, Yuxiang Wang <wangyux...@gmail.com> wrote:
1) Aside from ndarray.flatten(), I think ndarray.ravel() is another option... It doesn't always make a copy, so it will be faster (but be careful if you are using C/Fortran code).


You're right.  I think ndarray.ravel() would be preferred: the order doesn't matter as long as it is consistent between calls, and any performance gain is appreciated.


2) Matt - being curious, what could be the potential pitfalls if we simply check the ndarray.ndim, and do a flattening if ndim > 1? Or, what if we flatten every ndarray (returned by the function, and the fitting target array)?


Lmfit is already "hijacking" the objective function for scipy.optimize.leastsq, and handling the output of the user's callback function.  So it might be as simple as changing Minimizer.__residual() to return "out.ravel()" if the dimension is not 1.

The main potential pitfall I see is a user-supplied Jacobian function. The user would supply a function that gave an (N+1)-dimensional output, and this would need to be reshaped into a 2-d array with the order matching the results of ndarray.ravel(). I think that's solvable but needs to be done with care.    I think the confidence interval code would need no change.

A Pull Request with a test case would be great!

--Matt Newville



--
Yuxiang "Shawn" Wang
Gerling Research Lab
University of Virginia
yw...@virginia.edu
+1 (434) 284-0836
https://sites.google.com/a/virginia.edu/yw5aj/

Matt Newville

unread,
Oct 22, 2014, 10:01:06 PM10/22/14
to lmfit-py
Hi Shawn,

Sorry, I originally replied only to Shawn... I need to understand why Gmail doesn't auto-respond to the Google group!

On Wed, Oct 22, 2014 at 11:55 AM, Yuxiang Wang <wangyux...@gmail.com> wrote:
1) Aside from ndarray.flatten(), I think ndarray.ravel() is another option... It doesn't always make a copy, so it will be faster (but be careful if you are using C/Fortran code).


You're right.  I think ndarray.ravel() would be preferred: the order doesn't matter as long as it is consistent between calls, and any performance gain is appreciated.

2) Matt - being curious, what could be the potential pitfalls if we simply check the ndarray.ndim, and do a flattening if ndim > 1? Or, what if we flatten every ndarray (returned by the function, and the fitting target array)?


Lmfit is already "hijacking" the objective function for scipy.optimize.leastsq, and handling the output of the user's callback function.  So it might be as simple as changing Minimizer.__residual() to return "out.ravel()" if the dimension is not 1.

The main potential pitfall I see is a user-supplied Jacobian function. The user would supply a function that gave an (N+1)-dimensional output, and this would need to be reshaped into a 2-d array with the order matching the results of ndarray.ravel(). I think that's solvable but needs to be done with care.    I think the confidence interval code would need no change.
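As a sketch of the reshaping involved (shapes are hypothetical, for illustration only): a Jacobian evaluated on the 2-D grid comes back with one extra trailing axis for the parameters, and flattening only the grid axes in C order keeps row m of the 2-d Jacobian aligned with element m of the raveled residual:

```python
import numpy as np

ny, nx, npar = 4, 5, 3

# hypothetical user Jacobian evaluated on the 2-D grid:
# jac[i, j, k] = d(residual[i, j]) / d(param[k])
jac = np.arange(ny * nx * npar, dtype=float).reshape(ny, nx, npar)

# collapse the grid axes only, in C order (the same order ndarray.ravel
# uses), leaving the parameter axis intact
jac2d = jac.reshape(-1, npar)

# row m of jac2d now corresponds to element m of residual.ravel():
# grid point (1, 2) lands at flat index 1*nx + 2 == 7
row_for_grid_point = jac2d[1 * nx + 2]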

A Pull Request with a test case would be great!
--Matt Newville

Yuxiang Wang

unread,
Oct 22, 2014, 10:14:26 PM10/22/14
to lmfi...@googlegroups.com, newv...@cars.uchicago.edu
Matt,

Haha I noticed - that's why I replied to the whole group with yours attached.

PS: Thank you for working on such a nice library and promoting it on the SciPy-Users mailing list! When I first saw your post I felt like, "Man, why hadn't I heard of this earlier!"

-Shawn

Moe

unread,
Oct 23, 2014, 4:25:31 AM10/23/14
to lmfi...@googlegroups.com
Hey Matt, Hi Shawn,

thanks for the answer. Flattening it to 1D probably wouldn't have come to my mind. ;)

Just to complete it:
Here's how I flatten the 2D x data. Results from the fits look good, but don't hesitate to say so if you notice a possible problem with it!
    x1n = []
    x2n = []
    for i in x1:
        for j in x2:
            x1n.append(i)
            x2n.append(j)
    x1n = np.asarray(x1n).flatten()
    x2n = np.asarray(x2n).flatten()
    data = data.flatten()
    return model.fit(data, x1=x1n, x2=x2n, params=params)

You might notice that I'm not really the guy to improve performance in a 'pythonic' way. ;)
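For what it's worth, the nested loops can be replaced by a single np.meshgrid call: with indexing='ij', the raveled grids come out in the same order as the loops (x1 varying slowest, x2 fastest). One thing worth double-checking either way is that this order matches how data.flatten() orders the pixels, since NumPy's C order varies the *last* axis fastest. A small sketch with made-up sizes:

```python
import numpy as np

x1 = np.arange(4)
x2 = np.arange(3)

# vectorized equivalent of the nested loops:
# outer loop over x1 (slow), inner loop over x2 (fast)
X1, X2 = np.meshgrid(x1, x2, indexing='ij')
x1n = X1.ravel()
x2n = X2.ravel()

# the loop version, for comparison
x1_loop = np.asarray([i for i in x1 for j in x2])
x2_loop = np.asarray([j for i in x1 for j in x2])
```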