Error if independent variable isn't first in model function


Ben W-S

Oct 8, 2017, 1:33:01 AM
to lmfit-py

I was doing a fit for some of my data and kept getting the error
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I finally figured out that it was because my independent variable wasn't the first variable in the defining function.
Using slope-intercept as the example function, this works:
def works(x, m, b):
    return m*x+b
And this doesn't:
def breaks(m, x, b):
    return m*x+b

I've attached a file that demonstrates this.

There's nothing inherently wrong with requiring the independent variable come first, but it's extremely user-unfriendly to have this be completely undocumented. Nothing in the error message gives a hint this is the reason for the error (at least for a novice like me), and the documentation on this page actually says "The model function will normally take an independent variable (generally, the first argument)" which implies the variable order in the function doesn't matter. It was blind luck that I changed the order of my variables when I was retyping my model function and then figured out that was the reason it worked.

Now that I know, it's not a problem for me. But hopefully something can be done to save someone else the same headache.
I'm using Python 2.7 and the Spyder IDE.
linefit.py

Matt Newville

Oct 8, 2017, 8:49:04 AM
to lmfit-py
Hi Ben,

On Sun, Oct 8, 2017 at 12:33 AM, Ben W-S <benja...@gmail.com> wrote:

I was doing a fit for some of my data and kept getting the error
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I finally figured out that it was because my independent variable wasn't the first variable in the defining function.
Using slope-intercept as the example function, this works:
def works(x, m, b):
    return m*x+b
And this doesn't:
def breaks(m, x, b):
    return m*x+b

I've attached a file that demonstrates this.

There's nothing inherently wrong with requiring the independent variable come first, but it's extremely user-unfriendly to have this be completely undocumented.


It is documented.  See below.
 
Nothing in the error message gives a hint this is the reason for the error (at least for a novice like me),


The error message from your "broken" example (where `m` is taken as the independent variable, and `x` as a fitting parameter) is a bit hard to interpret.  We should consider adding checks to Model.fit that the dependent variable passed in (your `y`) and the independent variable (your `m`) are the same length, and that none of the values of the fitting variables (including your `x`) are arrays. An array-valued fitting variable is what generated the confusing error message you got.
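As a rough illustration of where this kind of message comes from: NumPy raises it whenever an array with more than one element is used where a single True/False is expected. The sketch below does not reproduce lmfit's internal code (the thread does not show which line raised the error); `upper_bound` is a made-up stand-in for the kind of scalar comparison a parameter value might go through.

```python
import numpy as np

# An array ends up where a single number is expected, as happened
# when `x` was treated as a fitting parameter.
x = np.linspace(0, 10, 5)
upper_bound = np.inf  # hypothetical stand-in for a scalar bound check

try:
    # The comparison is elementwise and yields a boolean array;
    # `if` then needs one truth value and cannot pick one.
    if x > upper_bound:
        pass
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

The message says nothing about which argument was the array, which is why it is hard to trace back to the argument order.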
  
and the documentation on this page actually says "The model function will normally take an independent variable (generally, the first argument)" which implies the variable order in the function doesn't matter.


Hm, I don't see how that implies that order does not matter.  It says that a model normally takes an independent variable, and that this will normally be the first argument.  That implies to me that a model may not need to have an independent variable and that the independent variable may not always be the first argument.

Depending on how you count, within about 5 lines of that sentence in the doc and docstring, it also lists `independent_vars` as an argument to `Model`, describing this as "Arguments to func that are independent variables (default is None)".

It may be that you were assuming that the argument name `x` implies that it should be an independent variable.  Though a large majority of the examples do use `x` as the only independent variable, it is only position that sets the default independent variable, not name.

 
It was blind luck that I changed the order of my variables when I was retyping my model function and then figured out that was the reason it worked.


It's possible that you missed this in the documentation. There are two sections on independent variables in the Model chapter of the documentation. These show up in the table of contents in the column on the right of the web page at http://lmfit.github.io/lmfit-py/model.html.  See:


which does say that the first argument of the model function will be taken as the only independent variable by default.   And  see the section immediately following that:


which gives an example of explicitly setting the second argument of a function as the independent variable.   
For your example,  you could replace
    breaks_linear = lm.Model(breaks)

with
    fixed_linear = lm.Model(breaks, independent_vars=['x'])

This will give the same results as your working version.

Now that I know, it's not a problem for me. But hopefully something can be done to save someone else the same headache.
I'm using python 2.7 and the spyder IDE

 
Well, it's always a good idea to consult the documentation when you run into trouble.  Suggestions for improving those docs are always welcome.

Cheers,

--Matt

Message has been deleted

Matt Newville

Oct 10, 2017, 8:34:25 AM
to lmfit-py
Hi Ben,


On Sun, Oct 8, 2017 at 7:28 PM, Ben W-S <benja...@gmail.com> wrote:
Hm, I don't see how that implies that order does not matter.  It says that a model normally takes an independent variable, and that this will normally be the first argument.  That implies to me that a model may not need to have an independent variable and that the independent variable may not always be the first argument.

Depending on how you count, within about 5 lines of that sentence in the doc and docstring, it also lists `independent_vars` as an argument to `Model`, describing this as "Arguments to func that are independent variables (default is None)".

It may be that you were assuming that the argument name `x` implies that it should be an independent variable.  Though a large majority of the examples do use `x` as the only independent variable, it is only position that sets the default independent variable, not name.

I didn't think much about how Model determines the independent variable. I guess I thought it assumed that whichever variable was given multiple values was the independent one, i.e. with m=3, b=2, x=[0,1,2,3], x is the independent variable. I definitely didn't think it was taking x to be the independent variable by name (in my actual function angular frequency is the independent variable, so its symbol is w).

Yeah, the truth is that for lmfit "independent variable"  really means "not a parameter, and will be passed in by the user".  There can be more than one, and none of them actually need to be an array or even a sequence with a length, though a single array the same length as the data to be fit is by far the most common thing to do.  But you could go crazy and pass in some custom, complicated object.    So, I think we cannot really call it an error if the independent variables are not the same length as the data to be fit.

But we should definitely clean up the error messages for parameter values that cannot be coerced into floats.
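One way a user could guard against this in their own script, until the library messages improve, is a small pre-check on the starting values. `check_parameter_values` below is a hypothetical helper written for this example; it is not part of lmfit and does not describe how lmfit validates values.

```python
import numpy as np

def check_parameter_values(**values):
    """Hypothetical helper (not part of lmfit): ensure each value is a single float."""
    for name, value in values.items():
        # Lists and arrays have one or more dimensions; scalars have zero.
        if np.ndim(value) != 0:
            raise TypeError(
                "parameter %r must be a single number, got a sequence or array; "
                "is it meant to be an independent variable?" % name)
        try:
            float(value)
        except (TypeError, ValueError):
            raise TypeError("parameter %r cannot be converted to float" % name)

check_parameter_values(m=3, b=2.0)  # passes silently

try:
    check_parameter_values(x=np.arange(4), b=2.0)
    error_text = None
except TypeError as err:
    error_text = str(err)

print(error_text)
```

The point of the sketch is only that the message names the offending argument, which the NumPy error in the original report does not.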


I see the other places in the documentation, and that makes it much clearer now that I know what I'm looking for. I'm not great at parsing and understanding documentation because I'm not a programmer, just a chemist who likes using Python and LMFIT to analyze and graph his data :)

I'm not a trained programmer either, just a scientist who's been doing this awhile.   We are all scientists writing code, docs, and examples that we think other scientists might find useful.

So for me the reading of
(generally, the first argument)
implied "usually the first argument is independent, but it doesn't need to be", so I didn't look at argument order as the problem with my function. That meant the independent_vars parameter didn't instantly jump out at me as "oh, I need to specify the independent parameter if it's not first in the function". Something like
(the first argument, unless otherwise specified)
instantly tells me that the order matters and I messed up, while also telling me that it doesn't have to be first and that I can look through the documentation for how to change it. As I said, I'm not a programmer, but I appreciate the LMFIT module very much and wanted to do anything I could to help, even if that's just addressing a single slightly ambiguous line in the documentation.

I think those are all fair comments.  It might be hard to make sure no one can pick out one summarizing sentence from the docs and mistakenly assume things that are explained in more detail elsewhere, but we'll try to make the docs clearer on these points.   Suggestions for how to do that are most welcome.

Cheers,

--Matt
