Re: [pystatsmodels] Problem generating OLS regression as per example in 'Interactions and ANOVA' page

josef...@gmail.com

Oct 6, 2012, 3:06:53 PM
to pystat...@googlegroups.com
On Sat, Oct 6, 2012 at 2:05 PM, fullofquestions
<fullofq...@gmail.com> wrote:
> I am fairly new to statsmodels and, to a lesser extent, to Python. I'm
> following the example(s) provided on the following page:
> http://statsmodels.sourceforge.net/devel/examples/generated/example_interactions.html
>
> and I'm getting stuck in the third section, at 'In [23]' to be exact. I
> have numpy, scipy and statsmodels working, and the call to 'ols(formula,
> salary_table)' first complained about ols() not being callable. So I dug in
> a bit and it looked as if we should be calling ols.OLS(formula, salary_table).
> Unfortunately it then complains that
> "AttributeError: 'builtin_function_or_method' object has no attribute
> 'equals'"
>
> I'm not sure how to define these string formulas so that they work with
> the statsmodels regression functions, and I don't see documentation on it.
> I can tell you that the following piece of code works:
> # imports the snippet relies on
> import numpy as np
> import statsmodels.api as sm
>
> # get data
> nsample = 100
> x = np.linspace(0, 10, nsample)
> X = sm.add_constant(np.column_stack((x, x**2)))
> beta = np.array([1, 0.1, 10])
> y = np.dot(X, beta) + np.random.normal(size=nsample)
> results = sm.OLS(y, X).fit()
>
> So I tweaked the formula a bunch of ways and, for example, the below call
> did return results although they were all off
> ols.OLS(salary_table['S'] , salary_table['E'] + salary_table['M'] +
> salary_table['X']).fit()
>
> I realize that I'm not very proficient with these modules just yet and I'm
> open to suggestions. Thank you for your help.
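
A likely reason the last call above returned results that were "all off": `+` between pandas columns adds them element by element, so the model receives one summed regressor instead of three separate ones (and the array interface does not add an intercept on its own). A minimal sketch, assuming `salary_table` is a pandas DataFrame; the numbers below are invented for illustration and only the column names S, E, M, X come from the thread:

```python
import pandas as pd

# Hypothetical stand-in for salary_table; the values are made up.
salary_table = pd.DataFrame({
    "S": [100, 120, 150, 110],
    "X": [1, 1, 1, 2],
    "E": [1, 3, 3, 2],
    "M": [1, 0, 1, 0],
})

# `+` on columns is element-wise addition: the result is ONE column.
summed = salary_table["E"] + salary_table["M"] + salary_table["X"]
print(summed.tolist())   # [3, 4, 5, 4]

# Three separate regressors need a 2-D design matrix instead.
design = salary_table[["E", "M", "X"]].to_numpy()
print(design.shape)      # (4, 3)
```

Inside a formula string, by contrast, `+` means "add another term to the model" rather than arithmetic addition.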

For now you have to use the lower-case ``ols`` or OLS.from_formula.
Using formulas is not (yet) an option for the regular OLS constructor or
for the other model classes when they are called directly.

Note the import in In [7] of the example:
In [7]: from statsmodels.formula.api import ols

>>> import statsmodels.formula.api as smf
>>> smf.ols
<bound method type.from_formula of <class
'statsmodels.regression.linear_model.OLS'>>
>>> dir(smf)
['GLM', 'GLS', 'GLSAR', 'Logit', 'MNLogit', 'OLS', 'Poisson',
'Probit', 'RLM', 'WLS', '__builtins__', '__doc__', '__file__',
'__name__', '__package__', 'glm', 'gls', 'glsar', 'logit', 'mnlogit',
'ols', 'poisson', 'probit', 'rlm', 'wls']
>>>


Josef

>
> By the way, I'm using Eclipse Indigo and Windows 7. The result is the same
> if I use the Python console, so I don't think that it is Eclipse related.
> Thanks!
>

fullofquestions

Oct 6, 2012, 7:02:39 PM
to pystat...@googlegroups.com
Thank you very much. That did it. Funny how I tried pretty much everything other than what you recommended. To be perfectly clear, for anyone who happens to run into the same problem: change 'In [23]' to read as follows:

lm = ols.ols(formula, salary_table).fit()

Done! And thanks again.


