> 1. I'd like to find the best fit parameters for func1 using an iterative approach (e.g. a simplex algorithm that changes p1). At each iteration, I want to compute the optimum p2 by linear least squares, in the interest of speed and robustness.
> 2. I'd also like the ability to hold certain parameters fixed in the optimization without redefining my objective function each time.
you can still do this with any regular optimizer like optimize.fmin,
just calculate the linear solution inside the outer function that is
optimized by fmin.
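For illustration, a minimal sketch of that concentrated least-squares idea (the nonlinear function, design matrix, and fake data here are made up, not from the thread):

import numpy as np
from scipy import optimize

def nonlin_part(x1, p1):
    # stand-in for func1; replace with the real nonlinear model
    return np.exp(-p1[0] * x1)

def concentrated_ssr(p1, x1, x2, y):
    # for the current p1, the optimal linear coefficients p2 follow
    # from ordinary least squares on the remaining residual
    yres = y - nonlin_part(x1, p1)
    p2, _, _, _ = np.linalg.lstsq(x2, yres)
    resid = yres - x2.dot(p2)
    return (resid ** 2).sum()

# fake data so the sketch runs
x1 = np.linspace(0.0, 5.0, 100)
x2 = np.column_stack([np.ones(100), x1])   # design matrix for the linear part
y = np.exp(-0.7 * x1) + 1.0 + 0.5 * x1 + 0.01 * np.random.randn(100)

p1_hat = optimize.fmin(concentrated_ssr, [1.0], args=(x1, x2, y))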
I haven't seen any Python package yet that would estimate partially
linear models.
If you find a solution, then I would be interested in it for statsmodels.
Josef
>
> 2. I'd also like the ability to hold certain parameters fixed in the
> optimization without redefining my objective function each time.
>
>
> Is there another module you would recommend? I've found openopt, but I
> wanted to get some guidance before I dive in to that.
>
> Erik
>
>
Are p1 and p2 coupled somehow? They must be, or computing p2 at each
iteration wouldn't be relevant.
This seems like a modification that could be made to the source code
without too much trouble, if I recall correctly. Alternatively, you could
possibly use the callback option in fmin.
> 2. I'd also like the ability to hold certain parameters fixed in the
> optimization without redefining my objective function each time.
>
>
Could you pass in a lambda function with those parameters fixed?
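Something along these lines, as a small sketch (the objective and parameter layout are made up just to show the idea):

import numpy as np
from scipy import optimize

def objective(params, x, y):
    # full objective over all three parameters a, b, c
    a, b, c = params
    return ((a * x ** 2 + b * x + c - y) ** 2).sum()

x = np.linspace(-1.0, 1.0, 50)
y = 2.0 * x ** 2 - 1.0 * x + 0.5

# hold c fixed at 0.5 and let fmin vary only a and b
c_fixed = 0.5
wrapped = lambda free: objective([free[0], free[1], c_fixed], x, y)
ab_hat = optimize.fmin(wrapped, [1.0, 0.0])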
-Chris JS
> Is there another module you would recommend? I've found openopt, but I
> wanted to get some guidance before I dive in to that.
>
> Erik
>
>
assuming he meant y = f(x1, p1) + x2*p2 + error,
and minimizing, for example, the sum of squared errors,
then the estimation of p1 and p2 cannot be separated.
a quickly written draft of how I would do it (which might be added to
statsmodels after cleanup, adding results statistics and testing).
It should be possible to subclass and override the nonlinear
function/method predict_nonlin:
https://gist.github.com/1911544
no fixed parameters yet
Josef
On Sat, Feb 25, 2012 at 2:17 PM, Erik Petigura <ept...@gmail.com> wrote:
> Dear Scipy,
>
> Up until now, I've found the optimize module very useful. Now, I'm finding
> that I need finer control. I am fitting a model to data that is of the
> following form:
>
> model = func1(p1) + func2(p2)
>
> func1 is nonlinear in its parameters and func2 is linear in its parameters.
>
>
> There are two things I am struggling with:
>
> 1. I'd like to find the best fit parameters for func1 using an iterative
> approach (e.g. a simplex algorithm that changes p1). At each iteration, I
> want to compute the optimum p2 by linear least squares in the interest of
> speed and robustness.
just calculate the linear solution inside the outer function that is
optimized by fmin.
> 2. I'd also like the ability to hold certain parameters fixed in the
> optimization without redefining my objective function each time.
Took me a while to work my way through it, especially since you have a
linear coefficient in front of the nonlinear part.
The idea of calculating the nonlinear part by setting the linear
coefficients to "neutral" values is nice.
Is (((p-pNL0)/dpNL0)**2).sum() a penalization term? Since your
objective function is quadratic, optimize.leastsq might work better
than fmin.
The next numpy version will have a vander function to build Legendre
polynomials. (Or maybe it already has one; I'm on 1.5.)
The next thing will be to get the covariance matrix for the parameter
estimates :)
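(For reference, the usual recipe with optimize.leastsq is roughly the sketch below; the model and data are made up, and cov_x from leastsq still has to be scaled by the residual variance.)

import numpy as np
from scipy import optimize

def resid(p, x, y):
    # leastsq wants the residual vector, not the summed square
    return y - p[0] * np.exp(-p[1] * x)

x = np.linspace(0.0, 4.0, 50)
y = 2.0 * np.exp(-1.3 * x) + 0.05 * np.random.randn(50)

p_hat, cov_x, infodict, mesg, ier = optimize.leastsq(
    resid, [1.0, 1.0], args=(x, y), full_output=True)

# scale cov_x by the estimated residual variance to get the
# covariance matrix of the parameter estimates
dof = len(y) - len(p_hat)
s_sq = (resid(p_hat, x, y) ** 2).sum() / dof
pcov = cov_x * s_sq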
>
> There is a lot of unpacking and repacking of the parameter array as it gets
> passed around between functions. One option that might work would be to define
> functions based on a "parameter object". This parameter object could have
> attributes like float/fix, linear/non-linear. I found a more object
> oriented optimization module here:
>
> http://newville.github.com/lmfit-py/
The easiest is to just write some helper functions to stack or unstack
the parameters, or to fix some of them. In statsmodels we use this in
some cases (as methods, since our models are classes), also to
transform parameters.
Since this often affects groups of parameters, I don't know if the
lmfit approach would help in this case.
(Personally, I like numpy arrays with masks or fancy indexing, which
is easy to understand. AST manipulation scares me.)
Josef
>
> However, it doesn't allow for linear fitting.
>
> Erik
>
>
Thanks for getting back to me! I'd like to minimize over p1 and p2 together.
Let me try to describe my problem a little better: I'm trying to fit an
exoplanet transit light curve. My model is a box plus a polynomial trend.
The polynomial coefficients and the depth of the box are linear parameters,
so I want to fit them using linear least squares. The center and width of
the transit are non-linear, so I need to fit them with an iterative approach
like optimize.fmin. Here's how I implemented it.
1. I'd like to find the best fit parameters for func1 using an iterative approach (e.g. a simplex algorithm that changes p1). At each iteration, I want to compute the optimum p2 by linear least squares, in the interest of speed and robustness.
2. I'd also like the ability to hold certain parameters fixed in the optimization without redefining my objective function each time.
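(For reference, a minimal sketch of that kind of setup, not the actual implementation: the box model, the Legendre trend, and all names below are illustrative.)

import numpy as np
from numpy.polynomial import legendre
from scipy import optimize

def box(t, center, width):
    # unit-depth box: -1 in transit, 0 out of transit
    in_transit = (t > center - width / 2.0) & (t < center + width / 2.0)
    return -in_transit.astype(float)

def solve_linear(p_nl, t, flux, deg=3):
    # for a given (center, width), the transit depth and the polynomial
    # trend coefficients enter linearly: solve them by least squares
    center, width = p_nl
    x = 2.0 * (t - t.min()) / np.ptp(t) - 1.0   # rescale time to [-1, 1]
    design = np.column_stack([box(t, center, width),
                              legendre.legvander(x, deg)])
    p_lin, _, _, _ = np.linalg.lstsq(design, flux)
    return p_lin, design

def chi2(p_nl, t, flux):
    p_lin, design = solve_linear(p_nl, t, flux)
    return ((flux - design.dot(p_lin)) ** 2).sum()

# toy data and a starting guess for (center, width)
t = np.linspace(0.0, 1.0, 200)
flux = 1.0 + 0.01 * t + 0.02 * box(t, 0.5, 0.1) + 1e-4 * np.random.randn(200)
best_nl = optimize.fmin(chi2, [0.45, 0.08], args=(t, flux))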
yes, I've read about this mostly in the context of semiparametric
versions, where the nonlinear function does not have a parametric form.
http://en.wikipedia.org/wiki/Semiparametric_regression#Partially_linear_models
>
> for sets of coefficients p1 and p2, each set having a few fitting variables,
> some of which may be related. Is there an instability that prevents you
> from just treating this as a single non-linear model?
I think p1 and p2 shouldn't have any cross restrictions or it will get
a bit more complicated.
The main reason for splitting it up is computational, I think. It's
quite common in econometrics to "concentrate out" parameters that have
an explicit solution, so we need the nonlinear optimization only for a
smaller parameter space.
>
> Another option might be to have the residual function for
> scipy.optimize.leastsq (or lmfit) call numpy.linalg.lstsq at each
> iteration. I would think that would more fully explore the parameter space
> than first fitting nonlinear_function with scipy.optimize.fmin() and then
> passing those best-fit parameters to numpy.linalg.lstsq(), but perhaps I'm
> not fully understanding the nature of the problem.
That's what both of us did; my version is at https://gist.github.com/1911544
---- <detour warning>
a few examples of how we currently handle parameter restrictions in statsmodels
Skipper's example: transform parameters in the loglikelihood to force
the parameter estimates of an ARMA process to produce a stationary
(stable) solution - transforms groups of parameters at once
https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/arima_model.py#L230
https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/arima_model.py#L436
not a clean example: fitting distributions with some frozen parameters
https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/sandbox/distributions/sppatch.py#L267
select parameters that are not frozen to use with fmin
x0 = np.array(x0)[np.isnan(frmask)]
expand the parameters again to include the frozen parameters inside
the loglikelihood function
theta = frmask.copy()
theta[np.isnan(frmask)] = thetash
It's not as user-friendly as the version that got into
scipy.stats.distributions (but it's developer-friendly because I don't
have to stare at it for hours to spot a bug).
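(As a self-contained toy version of that frozen-parameter mask pattern; the objective here is made up.)

import numpy as np
from scipy import optimize

# full parameter vector has three entries; nan marks the free ones,
# the middle entry is frozen at 5.0
frmask = np.array([np.nan, 5.0, np.nan])
x0_full = np.array([1.0, 5.0, 2.0])

# hand only the free parameters to the optimizer
x0 = x0_full[np.isnan(frmask)]

def objective(thetash):
    # expand back to the full vector, re-inserting the frozen values
    theta = frmask.copy()
    theta[np.isnan(frmask)] = thetash
    return ((theta - np.array([2.0, 5.0, 3.0])) ** 2).sum()   # toy objective

theta_free = optimize.fmin(objective, x0)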
structural Vector Autoregression uses a similar mask pattern
https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/tsa/vector_ar/svar_model.py#L89
In another sandbox model I build a nested dictionary to map the model
parameters to the reduced list that can be fed to
scipy.optimize.fmin_xxx
But we don't have user-defined nonlinear constraints yet.
---------
>
> FWIW, lmfit uses python's ast module only for algebraic constraints between
> parameters. That is,
> from lmfit import Parameter
> Parameter(name='a', value=10, vary=True)
> Parameter(name='b', expr='sqrt(a) + 1')
>
> will compile 'sqrt(a)+1' into its AST representation and evaluate that for
> the value of 'b' when needed. So lmfit doesn't so much manipulate the AST
> as interpret it. What is manipulated is the namespace, so that 'a' is
> interpreted as "look up the current value of Parameter 'a'" when the AST is
> evaluated. Again, this applies only for algebraic constraints on
> parameters.
It's a bit similar to a formula framework for specifying a statistical
model that is to be estimated (with lengthy discussion on the
statsmodels list).
I see the advantages, but I haven't spent the weeks of time to figure
out what's behind the machinery that is required (especially given all
the other statistics and econometrics that are missing, and where I
only have to worry about how numpy, scipy and statsmodels behave. And
I like Zen of Python #2).
>
> Having written fitting programs that support user-supplied algebraic
> constraints between parameters in Fortran77, I find interpreting python's
> AST to be remarkably simple and robust. I'm scared much more by statistical
> modeling of economic data ;)
different tastes and applications.
I'd rather think about the next 20 statistical tests I want to code
than about the AST or how sympy translates into numpy.
Cheers,
Josef
>
> Cheers,
>
> --Matt Newville
So you use the 'ast' module to convert Python source into a syntax
tree, and then you wrote an interpreter for that syntax tree?
...Wouldn't it be easier to use Python's interpreter instead of
writing your own, i.e., just call eval() on the source code? Or are
you just using the AST to figure out which variables are referenced?
(I have some code to do just that without going through the ast
module, on the theory that it's nice to be compatible with python 2.5,
but I'm not sure it's really worth it.)
-- Nathaniel
On Mon, Feb 27, 2012 at 10:33 AM, Nathaniel Smith <n...@pobox.com> wrote:
> On Sun, Feb 26, 2012 at 3:14 PM, Matthew Newville
> <matt.n...@gmail.com> wrote:
>> from lmfit import Parameter
>> Parameter(name='a', value=10, vary=True)
>> Parameter(name='b', expr='sqrt(a) + 1')
>>
>> will compile 'sqrt(a)+1' into its AST representation and evaluate that for
>> the value of 'b' when needed. So lmfit doesn't so much manipulate the AST
>> as interpret it. What is manipulated is the namespace, so that 'a' is
>> interpreted as "look up the current value of Parameter 'a'" when the AST is
>> evaluated. Again, this applies only for algebraic constraints on
>> parameters.
>>
>> Having written fitting programs that support user-supplied algebraic
>> constraints between parameters in Fortran77, I find interpreting python's
>> AST to be remarkably simple and robust. I'm scared much more by statistical
>> modeling of economic data ;)
>
> So you use the 'ast' module to convert Python source into a syntax
> tree, and then you wrote an interpreter for that syntax tree?
Yes (though just for the expressions for the constraint). I accept
that this can be viewed as anywhere on the continuum between highly
useful and stark raving mad. It's such a fine line between stupid
and, uh, ... clever.
> ...Wouldn't it be easier to use Python's interpreter instead of
> writing your own, i.e., just call eval() on the source code?
Probably, though if evaluating an AST tree scares some people, then I
think that eval() would scare the rest even more. So the goal was
sort of a safe-ish, mathematically-oriented eval().
> Or are you just using the AST to figure out which variables are referenced?
> (I have some code to do just that without going through the ast
> module, on the theory that it's nice to be compatible with python 2.5,
> but I'm not sure it's really worth it.)
I do figure out which variables are referenced, but do more than that.
I run ast.parse(expression) prior to the fit, and then evaluate by
walking through the resulting tree at each iteration. When reaching
an ast.Name node, I look up the name in a pre-reserved dictionary.
Many python and numpy symbols (sqrt, pi, etc) are automatically
included in this 'namespace'. For each fit, the names of all the
Parameters are also included prior to the fit. That gives a
restricted language and namespace for writing constraints. For a
simple example,
from lmfit import Parameter
Parameter(name='a', value=10, vary=True)
Parameter(name='b', expr='sqrt(a) + 1')
Parameter(name='c', expr='a + b')
the expression for 'b' is parsed more or less (you can do
ast.dump(ast.parse('sqrt(a) + 1')) for a more complete result) to:
BinOp(op=Add(),
left=Call(func=Name('sqrt'), args=[Name('a')]),
right=Num(1)
)
that tree is simple to walk, with each node evaluated appropriately.
Evaluation of such a tree is so easy that, although it is not highly
useful for constraint expressions in a fitting problem, the
lmfit.asteval code includes support for while and for loops,
if-then-else, and try-except, as well as full slicing and attribute
lookups for both evaluation and assignment. The ast module does the
parsing and gives a walkable tree -- amazing and beautiful, and
standard python (for python 2.6+).
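(A stripped-down sketch of that idea, not lmfit's actual asteval code: walk the parsed tree for 'sqrt(a) + 1' against a restricted namespace.)

import ast
import math

# restricted namespace: only these names may appear in expressions
namespace = {'sqrt': math.sqrt, 'pi': math.pi, 'a': 10.0}

def asteval(node):
    if isinstance(node, ast.Expression):
        return asteval(node.body)
    if isinstance(node, ast.Num):        # numeric literal
        return node.n
    if isinstance(node, ast.Name):       # symbol lookup
        return namespace[node.id]
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return asteval(node.left) + asteval(node.right)
    if isinstance(node, ast.Call):       # e.g. sqrt(a)
        func = asteval(node.func)
        return func(*[asteval(arg) for arg in node.args])
    raise ValueError('unsupported expression node: %r' % node)

tree = ast.parse('sqrt(a) + 1', mode='eval')
print(asteval(tree))                     # sqrt(10) + 1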
As mentioned above, the evaluation of these constraints happens at
each iteration of the python function to be minimized by
scipy.optimize.leastsq, or others. Dependencies (i.e., that 'c'
depends on 'a' and 'b' and that 'b' depends on 'a') are recorded so
that 'b' above is evaluated once per fitting loop. Admittedly,
that's probably a minor optimization, but it is in
lmfit/minimizer.py if anyone is looking.
This approach gives a user a lot of flexibility in setting up fitting
models, and can allow the fitting function to remain relatively static
and written in terms of the "physical" model. Of course, it is not
going to be as fast as numexpr for array calculations, but it is more
general, and faster ufuncs are not the issue at hand.
Cheers,
--Matt Newville
Right, that does seem to be the preferred solution here, though it
wasn't completely clear to me that Erik meant that the parameters in
p1 and p2 were always decoupled. I may not have understood the model
(and there were a lot of objects named 'p'!). Allowing coupling
between some of the elements of p1 and p2 would seem potentially
useful to me.
Ah, thanks -- I think I understand what you're doing, at least in some
of the examples. You're treating certain ranges of an array of parameter
values differently, possibly with some of them able to be fixed
or bounded.
As you no doubt understand, the approach in lmfit is completely
different, and much more flexible.
Instead of the user writing a function for leastsq() that takes as its
first argument an array of parameter values, they write a function
that takes as its first argument an array of Parameters. At each
iteration, the Parameters will have up-to-date values, after applying
the bounds and constraint expressions set for each parameter. The point
is that someone can write the function once, in terms of named,
physical parameters, and then a user can change whether any of the
parameters are varied/fixed, have bounds, or are constrained to a
mathematical expression in terms of other variables, without changing
the function that calculates the residual.
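(In code, that usage is roughly the sketch below, based on lmfit's documented Parameters/minimize interface; the Gaussian model and data are made up.)

import numpy as np
from lmfit import Parameters, minimize

def residual(params, x, data):
    # written once, in terms of named, physical parameters
    amp = params['amp'].value
    cen = params['cen'].value
    wid = params['wid'].value
    model = amp * np.exp(-(x - cen) ** 2 / (2.0 * wid ** 2))
    return model - data

x = np.linspace(0.0, 10.0, 101)
data = 8.0 * np.exp(-(x - 4.5) ** 2 / 2.0) + 0.1 * np.random.randn(101)

params = Parameters()
params.add('amp', value=10.0, vary=True)
params.add('cen', value=5.0, min=0.0, max=10.0)      # bounded
params.add('wid', value=1.0, expr='sqrt(cen) / 3')   # algebraic constraint

# vary/fix, bounds, or constraints can change without touching residual()
out = minimize(residual, params, args=(x, data))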
> ---------
>>
>> FWIW, lmfit uses python's ast module only for algebraic constraints between
>> parameters. That is,
>> from lmfit import Parameter
>> Parameter(name='a', value=10, vary=True)
>> Parameter(name='b', expr='sqrt(a) + 1')
>>
>> will compile 'sqrt(a)+1' into its AST representation and evaluate that for
>> the value of 'b' when needed. So lmfit doesn't so much manipulate the AST
>> as interpret it. What is manipulated is the namespace, so that 'a' is
>> interpreted as "look up the current value of Parameter 'a'" when the AST is
>> evaluated. Again, this applies only for algebraic constraints on
>> parameters.
>
> It's a bit similar to a formula framework for specifying a statistical
> model that is to be estimated (with lengthy discussion on the
> statsmodels list).
Sorry, I don't follow that discussion list (a little outside my
field), and wasn't aware of a formula framework in statsmodels. How
does that compare?
> I see the advantages but I haven't spent the weeks of time to figure
> out what's behind the machinery that is required (especially given all
> the other statistics and econometrics that is missing, and where I
> only have to worry about how numpy, scipy and statsmodels behaves. And
> I like Zen of Python #2)
Fair enough.
The code in lmfit/asteval.py and lmfit/astutils.py is < 1000 lines, is
BSD-licensed, and imports only from:
ast, math, (numpy), os, re, sys, __future__
That is, the import of numpy is tried, and symbols from numpy will be
used if available. Python 2.6+ is required, as the ast module
changed quite a bit between 2.5 and 2.6. The imports from __future__
are for division and print_function, for Python 3 compatibility.
>> Having written fitting programs that support user-supplied algebraic
>> constraints between parameters in Fortran77, I find interpreting python's
>> AST to be remarkably simple and robust. I'm scared much more by statistical
>> modeling of economic data ;)
>
> different tastes and applications.
> I'd rather think about the next 20 statistical tests I want to code,
> than about the AST or how sympy translates into numpy.
OK. That's all completely fair. I'm just saying it's not that hard,
and it also exists if you're interested.
Cheers,
--Matt Newville