lm = smf.ols(formula='sales_n ~ own_facings + dist_products + oos_products', data=df[['sales_n','own_facings','dist_products','oos_products']]).fit()
All the columns in the dataframe df are of type float64, and there are no nulls.
I'm getting the error:
ValueError: For numerical factors, num_columns must be an int.
It doesn't make sense that the types should be integer.
Can someone please advise?
Thanks,
Keren
import pandas as pd
data = pd.read_csv('http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv', index_col=0)
data.head()
Out[8]:
       TV  Radio  Newspaper  Sales
1   230.1   37.8       69.2   22.1
2    44.5   39.3       45.1   10.4
3    17.2   45.9       69.3    9.3
4   151.5   41.3       58.5   18.5
5   180.8   10.8       58.4   12.9
import statsmodels.formula.api as smf
lm = smf.ols(formula='Sales ~ TV', data=data).fit()
The last line gives the error:
File "C:\Python27\lib\site-packages\statsmodels\base\model.py", line 147, in from_formula
missing=missing)
File "C:\Python27\lib\site-packages\statsmodels\formula\formulatools.py", line 65, in handle_formula_data
NA_action=na_action)
File "C:\Python27\lib\site-packages\patsy\highlevel.py", line 297, in dmatrices
NA_action, return_type)
File "C:\Python27\lib\site-packages\patsy\highlevel.py", line 152, in _do_highlevel_design
NA_action)
File "C:\Python27\lib\site-packages\patsy\highlevel.py", line 57, in _try_incr_builders
NA_action)
File "C:\Python27\lib\site-packages\patsy\build.py", line 706, in design_matrix_builders
categories=None)
File "C:\Python27\lib\site-packages\patsy\design_info.py", line 88, in __init__
raise ValueError("For numerical factors, num_columns "
ValueError: For numerical factors, num_columns must be an int
Hi,
I'm using:
INSTALLED VERSIONS
------------------
Python: 2.7.6.final.0
Statsmodels
===========
Installed: 0.6.1 (C:\Python27\lib\site-packages\statsmodels)
Required Dependencies
=====================
cython: Not installed
numpy: 1.9.2 (C:\Python27\lib\site-packages\numpy)
scipy: 0.15.1 (C:\Python27\lib\site-packages\scipy)
pandas: 0.16.0 (C:\Python27\lib\site-packages\pandas)
dateutil: 2.4.2 (C:\Python27\lib\site-packages\dateutil)
patsy: 0.4.0 (C:\Python27\lib\site-packages\patsy)
On Aug 3, 2015 9:00 AM, "Keren Kapach" <ker...@gmail.com> wrote:
>
> I'm getting the error:
>
> ValueError: For numerical factors, num_columns must be an int.
Here's a shot in the dark, but:
1) what platform and word size are you using? In particular, is this a 64 bit python build on windows?
2) what does this code return?
type(np.zeros(10).shape[0])
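If it helps, here is a minimal sketch for collecting the answers to both questions in one go. The helper name `describe_build` is made up for illustration, and the script only reports what it finds; it is written to run on either Python 2.7 or Python 3 and skips the NumPy check if NumPy is not importable.

```python
import platform
import struct
import sys


def describe_build():
    """Gather the platform details relevant to the two questions above.

    `describe_build` is a hypothetical helper name, not part of any library.
    """
    info = {
        "system": platform.system(),               # e.g. 'Windows' or 'Linux'
        "python": platform.python_version(),
        "pointer_bits": struct.calcsize("P") * 8,  # 64 on a 64-bit build
    }
    try:
        import numpy as np
    except ImportError:
        # NumPy is not available, so the second question cannot be answered.
        info["shape_item_type"] = None
    else:
        # Same expression as in question 2, reported by type name.
        info["shape_item_type"] = type(np.zeros(10).shape[0]).__name__
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_build().items()):
        print("%s = %s" % (key, value))
```

Pasting the printed lines into a reply should be enough to answer both points.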
-n
On Aug 8, 2015 12:06 AM, "Sameer Lalwani" <sameer...@gmail.com> wrote:
>
> I'm getting this same error.
> I have a 64-bit Python build for Windows, Anaconda distribution.
>
> I have
> pandas 0.16.2,
> patsy 0.4.0
> statsmodels 0.6.1
>
>
> type(np.zeros(10).shape[0]) returns
> long
Yep, that's what I suspected!
Can you try
pip install https://github.com/pydata/patsy/archive/master.zip
and report whether that fixes your problem?
-n
On Nov 10, 2015 06:46, "Mattia Ferrini" <mattia.c...@gmail.com> wrote:
>
> Hi,
>
> I am on Win7-64.
> I have just upgraded to patsy 0.4.1 but I still get `ValueError: For numerical factors, num_columns must be an int`
>
> Not sure how to proceed.
> What was the error caused by in the first place?
>
Can you paste the full traceback?
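Alongside the traceback, it may be worth confirming which copy of each package the interpreter actually imports, since an upgrade can land in a different environment than the one being run. The following is only a sketch: `describe_module` is a made-up helper name, and it assumes each package exposes the conventional `__version__` and `__file__` attributes (it reports None where one is missing).

```python
import importlib


def describe_module(name):
    """Return the version and file path of the module that gets imported.

    Hypothetical helper for illustration. Returns None if the module
    cannot be imported at all.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return {
        # Not every module defines __version__, so fall back to None.
        "version": getattr(module, "__version__", None),
        "file": getattr(module, "__file__", None),
    }


if __name__ == "__main__":
    for name in ("patsy", "statsmodels", "pandas", "numpy"):
        print("%s: %s" % (name, describe_module(name)))
```

If the reported patsy version or path is not the one you expect, the upgrade probably went to a different Python installation.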
-n
ValueError Traceback (most recent call last)
<ipython-input-38-7811978107c6> in <module>()
1 # Define the model
----> 2 lm = smf.ols('Y ~ X1 + X2 + X3 + X4 + X5 + X6', data=df)
3 # Fit the model
4 fit1 = lm.fit()
5 # Print summary statistics of the model's performance
C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\statsmodels\base\model.pyc in from_formula(cls, formula, data, subset, *args, **kwargs)
145 (endog, exog), missing_idx = handle_formula_data(data, None, formula,
146 depth=eval_env,
--> 147 missing=missing)
148 kwargs.update({'missing_idx': missing_idx,
149 'missing': missing})
C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\statsmodels\formula\formulatools.pyc in handle_formula_data(Y, X, formula, depth, missing)
63 if data_util._is_using_pandas(Y, None):
64 result = dmatrices(formula, Y, depth, return_type='dataframe',
---> 65 NA_action=na_action)
66 else:
67 result = dmatrices(formula, Y, depth, return_type='dataframe',
C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\highlevel.pyc in dmatrices(formula_like, data, eval_env, NA_action, return_type)
295 return rhs
296
--> 297 def dmatrices(formula_like, data={}, eval_env=0,
298 NA_action="drop", return_type="matrix"):
299 """Construct two design matrices given a formula_like and data.
C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\highlevel.pyc in _do_highlevel_design(formula_like, data, eval_env, NA_action, return_type)
150 # ModelDesc(...)
151 # DesignInfo
--> 152 # (DesignInfo, DesignInfo)
153 # any object with a special method __patsy_get_model_desc__
154 def _do_highlevel_design(formula_like, data, eval_env,
C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\highlevel.pyc in _try_incr_builders(formula_like, data_iter_maker, eval_env, NA_action)
55 raise PatsyError(
56 "On Python 2, formula strings must be either 'str' objects, "
---> 57 "or else 'unicode' objects containing only ascii "
58 "characters. You passed a unicode string with non-ascii "
59 "characters. I'm afraid you'll have to either switch to "
C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\build.pyc in design_matrix_builders(termlists, data_iter_maker, eval_env, NA_action)
704 factor_states[factor],
705 num_columns=num_column_counts[factor],
--> 706 categories=None)
707 else:
708 assert factor in cat_levels_contrasts
C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\design_info.pyc in __init__(self, factor, type, state, num_columns, categories)
86 if self.type == "numerical":
87 if not isinstance(num_columns, six.integer_types):
---> 88 raise ValueError("For numerical factors, num_columns "
89 "must be an integer")
90 if categories is not None:
ValueError: For numerical factors, num_columns must be an int
Hi Dielia,
It's a bit hard to tell from your post what you're doing or what went wrong -- could you paste the actual code you are running, and the complete output you get?
-n