num_columns must be an int


Keren Kapach

3 Aug 2015, 12:00:30
to pystatsmodels
Hi,

I am trying to build a linear regression model using this code:

lm = smf.ols(formula='sales_n ~ own_facings + dist_products + oos_products', data=df[['sales_n','own_facings','dist_products','oos_products']]).fit()

All the columns in the DataFrame df are of type float64, and there are no nulls.

I'm getting the error:
ValueError: For numerical factors, num_columns must be an int.


It doesn't make sense that the types should be integer.
Can someone please advise?


Thanks,
Keren

Skipper Seabold

3 Aug 2015, 12:06:42
to pystat...@googlegroups.com
Indeed, that's odd, and I can't reproduce it after trying a few things. Can you
post the data somewhere, or code that otherwise reproduces it?

What version of statsmodels and patsy are you using?

You might also drop in to the debugger and see what's going on or
otherwise omit columns until you've figured out which one is giving
you the error.

Skipper
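
Skipper's suggestion of omitting columns until the failing one is found could be sketched roughly as follows. This is only a minimal sketch: `find_failing_terms` and the `try_fit` callable are hypothetical names introduced here, not part of statsmodels or patsy. In practice `try_fit` would wrap something like `smf.ols(formula=f, data=df).fit()`.

```python
def find_failing_terms(response, predictors, try_fit):
    """Fit one single-predictor formula per column and collect the failures.

    `try_fit` is a hypothetical callable that takes a formula string and
    raises an exception when the fit fails, for example
    ``lambda f: smf.ols(formula=f, data=df).fit()``.
    """
    failures = {}
    for name in predictors:
        formula = "{} ~ {}".format(response, name)
        try:
            try_fit(formula)
        except Exception as exc:  # keep going so every column gets tried
            failures[name] = "{}: {}".format(type(exc).__name__, exc)
    return failures


if __name__ == "__main__":
    # Stand-in fitter for demonstration only: it pretends one column is bad.
    def fake_fit(formula):
        if "oos_products" in formula:
            raise ValueError("num_columns must be an int")

    print(find_failing_terms(
        "sales_n", ["own_facings", "dist_products", "oos_products"], fake_fit))
```

If every term fails in the same way, the cause is more likely the environment than any single column.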

Keren Kapach

4 Aug 2015, 02:15:29
to pystatsmodels
Hi Skipper,

Thanks for your reply.
I'm using statsmodels version 0.6.1, which I installed using 'pip install statsmodels'.
I've tried version 0.6.0 and the problem recurs.
I did not install patsy just yet.

As for the data and the code,
I'm using the data and code taken from here:

data = pd.read_csv('http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv', index_col=0)
data.head()
Out[8]: 
      TV  Radio  Newspaper  Sales
1  230.1   37.8       69.2   22.1
2   44.5   39.3       45.1   10.4
3   17.2   45.9       69.3    9.3
4  151.5   41.3       58.5   18.5
5  180.8   10.8       58.4   12.9

import statsmodels.formula.api as smf
lm = smf.ols(formula='Sales ~ TV', data=data).fit()

The last line gives the error:
File "C:\Python27\lib\site-packages\statsmodels\base\model.py", line 147, in from_formula
    missing=missing)
  File "C:\Python27\lib\site-packages\statsmodels\formula\formulatools.py", line 65, in handle_formula_data
    NA_action=na_action)
  File "C:\Python27\lib\site-packages\patsy\highlevel.py", line 297, in dmatrices
    NA_action, return_type)
  File "C:\Python27\lib\site-packages\patsy\highlevel.py", line 152, in _do_highlevel_design
    NA_action)
  File "C:\Python27\lib\site-packages\patsy\highlevel.py", line 57, in _try_incr_builders
    NA_action)
  File "C:\Python27\lib\site-packages\patsy\build.py", line 706, in design_matrix_builders
    categories=None)
  File "C:\Python27\lib\site-packages\patsy\design_info.py", line 88, in __init__
    raise ValueError("For numerical factors, num_columns "
ValueError: For numerical factors, num_columns must be an int


Can you please advise?

Thanks,
Keren

josef...@gmail.com

4 Aug 2015, 03:45:05
to pystatsmodels
I don't get an error with statsmodels master and old versions of the dependencies, so this must be specific to certain versions.

Please check
>>> import statsmodels.api as sm
>>> sm.show_versions()

to see what versions of pandas and patsy are used


>>> data.dtypes
TV           float64
Radio        float64
Newspaper    float64
Sales        float64
dtype: object


Josef

Keren Kapach

4 Aug 2015, 03:59:18
to pystatsmodels
Hi,

I'm using:

INSTALLED VERSIONS
------------------
Python: 2.7.6.final.0

Statsmodels
===========

Installed: 0.6.1 (C:\Python27\lib\site-packages\statsmodels)

Required Dependencies
=====================

cython: Not installed
numpy: 1.9.2 (C:\Python27\lib\site-packages\numpy)
scipy: 0.15.1 (C:\Python27\lib\site-packages\scipy)
pandas: 0.16.0 (C:\Python27\lib\site-packages\pandas)
    dateutil: 2.4.2 (C:\Python27\lib\site-packages\dateutil)
patsy: 0.4.0 (C:\Python27\lib\site-packages\patsy)

Optional Dependencies
=====================

matplotlib: 1.4.3 (C:\Python27\lib\site-packages\matplotlib)
cvxopt: Not installed

Developer Tools
================

IPython: 3.1.0 (C:\Python27\lib\site-packages\IPython)
    jinja2: 2.7.3 (C:\Python27\lib\site-packages\jinja2)
sphinx: Not installed
    pygments: 2.0.2 (C:\Python27\lib\site-packages\pygments)
nose: Not installed
virtualenv: Not installed

josef...@gmail.com

4 Aug 2015, 05:14:06
to pystatsmodels
On Tue, Aug 4, 2015 at 3:59 AM, Keren Kapach <ker...@gmail.com> wrote:
> Hi,
>
> I'm using:
>
> INSTALLED VERSIONS
> ------------------
> Python: 2.7.6.final.0
>
> Statsmodels
> ===========
>
> Installed: 0.6.1 (C:\Python27\lib\site-packages\statsmodels)
>
> Required Dependencies
> =====================
>
> cython: Not installed
> numpy: 1.9.2 (C:\Python27\lib\site-packages\numpy)
> scipy: 0.15.1 (C:\Python27\lib\site-packages\scipy)
> pandas: 0.16.0 (C:\Python27\lib\site-packages\pandas)
>     dateutil: 2.4.2 (C:\Python27\lib\site-packages\dateutil)
> patsy: 0.4.0 (C:\Python27\lib\site-packages\patsy)


I still cannot replicate this.
I installed patsy 0.4 and pandas 0.16.0, but I only had Python 3.4 available to try it out.
I'm also using statsmodels 0.6.1,
on 64-bit Windows (WinPython).


As Skipper said, using the debugger would be the easiest way to track this down.
In case you are not familiar with pdb, you could try to narrow down further where something might be going wrong.

>>> import numpy as np
>>> np.asarray(data).dtype
dtype('float64')

>>> import patsy
>>> y, x = patsy.dmatrices('Sales ~ TV', data=data)

>>> np.asarray(x).dtype

Josef

Nathaniel Smith

4 Aug 2015, 16:43:19
to pystatsmodels

On Aug 3, 2015 9:00 AM, "Keren Kapach" <ker...@gmail.com> wrote:
>
> I'm getting the error:
>
> ValueError: For numerical factors, num_columns must be an int.

Here's a shot in the dark, but:

1) what platform and word size are you using? In particular, is this a 64 bit python build on windows?

2) what does this code return?

type(np.zeros(10).shape[0])

-n

Nathaniel Smith

7 Aug 2015, 21:50:08
to pystatsmodels
Hi Keren,

I might have fixed this in the latest (still unreleased) version of
patsy -- try doing

pip install https://github.com/pydata/patsy/archive/master.zip

and then see if that fixes things?

-n
--
Nathaniel J. Smith -- http://vorpus.org

Sameer Lalwani

8 Aug 2015, 03:06:04
to pystatsmodels
I am getting this same error.
I have a 64-bit Python build for Windows, Anaconda distribution.

I have 
pandas 0.16.2,
patsy 0.4.0
statsmodels 0.6.1


type(np.zeros(10).shape[0]) returns 
long

Nathaniel Smith

8 Aug 2015, 04:26:00
to pystatsmodels

On Aug 8, 2015 12:06 AM, "Sameer Lalwani" <sameer...@gmail.com> wrote:
>
> I am getting this same error.
> I have a 64-bit Python build for Windows, Anaconda distribution.
>
> I have 
> pandas 0.16.2,
> patsy 0.4.0
> statsmodels 0.6.1
>
>
> type(np.zeros(10).shape[0]) returns 
> long

Yep, that's what I suspected!

Can you try
  
  pip install https://github.com/pydata/patsy/archive/master.zip

and report whether that fixes your problem?

-n
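
The `long` result is the clue: on these Python 2 builds the array shape comes back as a `long`, so a check that accepts only `int` would reject a perfectly valid column count. Below is a minimal Python 3 sketch of the difference between a strict type check and an integer-like check. `FakeLong` and both `check_*` functions are hypothetical stand-ins made up for illustration. They are not patsy's code, and Python 3 has no separate `long` type.

```python
import operator


class FakeLong:
    """Hypothetical integer-like type that is not an `int` subclass."""

    def __init__(self, value):
        self._value = value

    def __index__(self):
        # Lets Python treat the object as an exact integer.
        return self._value


def check_strict(num_columns):
    """Accept only real `int` instances (the fragile variant)."""
    if not isinstance(num_columns, int):
        raise ValueError("num_columns must be an int")
    return num_columns


def check_integer_like(num_columns):
    """Accept anything that converts losslessly to an integer."""
    try:
        return operator.index(num_columns)
    except TypeError:
        raise ValueError("num_columns must be an integer")
```

With these definitions, `check_strict(FakeLong(3))` raises `ValueError`, while `check_integer_like(FakeLong(3))` returns the plain integer. Both reject a float such as `2.5`.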

Keren Kapach

9 Aug 2015, 02:49:00
to pystatsmodels
Thanks!
The new, still unreleased version fixed my problem, and it works now.

Btw, type(np.zeros(10).shape[0]) returns long :)

On Saturday, 8 August 2015 at 11:26:00 UTC+3, Nathaniel Smith wrote:

Nathaniel Smith

9 Aug 2015, 02:54:33
to pystatsmodels
Okay, great! I'll release 0.4.1 shortly with this fix... though I'll
wait until tomorrow at least in case someone comes up with a
workaround for https://github.com/scipy/scipy/issues/5127 ...

Luca Nicoli

29 Oct 2015, 08:08:35
to pystatsmodels
I have got the same problem.

With this fix it works.

Thanks.

James Millington

8 Nov 2015, 11:47:06
to pystatsmodels
I had this problem too. The pip install https://github.com/pydata/patsy/archive/master.zip fix works great, thanks.

When will patsy 0.4.1 be available directly via pip? The latest version shown by pip search patsy is still 0.4.0.

I need to teach a class on Thursday, and it would be nice to just do a regular pip upgrade!

cheers

Mattia Ferrini

10 Nov 2015, 09:46:37
to pystatsmodels
Hi, 

I am on Win7-64.
I have just upgraded to patsy 0.4.1, but I still get `ValueError: For numerical factors, num_columns must be an int`.

I'm not sure how to proceed.
What caused the error in the first place?

Thanks.
Mattia

Nathaniel Smith

10 Nov 2015, 11:09:13
to pystatsmodels

On Nov 10, 2015 06:46, "Mattia Ferrini" <mattia.c...@gmail.com> wrote:
>
> Hi, 
>
> I am on Win7-64.
> I have just upgraded to patsy 0.4.1, but I still get `ValueError: For numerical factors, num_columns must be an int`.
>
> I'm not sure how to proceed.
> What caused the error in the first place?
>

Can you paste the full traceback?

-n

David Kronenberg

10 Nov 2015, 16:02:00
to pystatsmodels
I'm getting the same error as well after installing the master.zip. Running in a Jupyter notebook. 




ValueError                                Traceback (most recent call last)
<ipython-input-38-7811978107c6> in <module>()
      1 # Define the model
----> 2 lm = smf.ols('Y ~ X1 + X2 + X3 + X4 + X5 + X6', data=df)
      3 # Fit the model
      4 fit1 = lm.fit()
      5 # Print summary statistics of the model's performance

C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\statsmodels\base\model.pyc in from_formula(cls, formula, data, subset, *args, **kwargs)
    145         (endog, exog), missing_idx = handle_formula_data(data, None, formula,
    146                                                          depth=eval_env,
--> 147                                                          missing=missing)
    148         kwargs.update({'missing_idx': missing_idx,
    149                        'missing': missing})

C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\statsmodels\formula\formulatools.pyc in handle_formula_data(Y, X, formula, depth, missing)
     63         if data_util._is_using_pandas(Y, None):
     64             result = dmatrices(formula, Y, depth, return_type='dataframe',
---> 65                                NA_action=na_action)
     66         else:
     67             result = dmatrices(formula, Y, depth, return_type='dataframe',

C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\highlevel.pyc in dmatrices(formula_like, data, eval_env, NA_action, return_type)
    295     return rhs
    296
--> 297 def dmatrices(formula_like, data={}, eval_env=0,
    298               NA_action="drop", return_type="matrix"):
    299     """Construct two design matrices given a formula_like and data.

C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\highlevel.pyc in _do_highlevel_design(formula_like, data, eval_env, NA_action, return_type)
    150 #   ModelDesc(...)
    151 #   DesignInfo
--> 152 #   (DesignInfo, DesignInfo)
    153 #   any object with a special method __patsy_get_model_desc__
    154 def _do_highlevel_design(formula_like, data, eval_env,

C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\highlevel.pyc in _try_incr_builders(formula_like, data_iter_maker, eval_env, NA_action)
     55             raise PatsyError(
     56                 "On Python 2, formula strings must be either 'str' objects, "
---> 57                 "or else 'unicode' objects containing only ascii "
     58                 "characters. You passed a unicode string with non-ascii "
     59                 "characters. I'm afraid you'll have to either switch to "

C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\build.pyc in design_matrix_builders(termlists, data_iter_maker, eval_env, NA_action)
    704                             factor_states[factor],
    705                             num_columns=num_column_counts[factor],
--> 706                             categories=None)
    707         else:
    708             assert factor in cat_levels_contrasts

C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\design_info.pyc in __init__(self, factor, type, state, num_columns, categories)
     86         if self.type == "numerical":
     87             if not isinstance(num_columns, six.integer_types):
---> 88                 raise ValueError("For numerical factors, num_columns "
     89                                  "must be an integer")
     90             if categories is not None:


ValueError: For numerical factors, num_columns must be an int

Nathaniel Smith

10 Nov 2015, 16:45:05
to pystatsmodels
On Tue, Nov 10, 2015 at 12:57 PM, David Kronenberg
<dm.kro...@gmail.com> wrote:
> I'm getting the same error as well after installing the master.zip. Running
> in a Jupyter notebook.

[...]

> C:\Users\Dm.kronenberg\Anaconda2\lib\site-packages\patsy\highlevel.pyc in
> _do_highlevel_design(formula_like, data, eval_env, NA_action, return_type)
> 150 # ModelDesc(...)
> 151 # DesignInfo
> --> 152 # (DesignInfo, DesignInfo)
> 153 # any object with a special method __patsy_get_model_desc__
> 154 def _do_highlevel_design(formula_like, data, eval_env,

So apparently the bad code is here in this commented text... which
makes no sense at all :-).

What this means is that your .py and .pyc files have gotten out of
sync with each other: you've installed the new .py files, but Python
for some reason hasn't noticed and is loading the old .pyc files
instead of regenerating them like it ought to. Basically your
installation has gotten corrupted. I'm not sure why -- possibly some
argument between pip and conda, or some issue with clock skew
(normally Python is supposed to notice that the timestamps on the .py
files are newer than the timestamps on the .pyc files, and
ignore/regenerate the .pyc files), or something like that.

Deleting all the .pyc files in your patsy folder should fix things. Or
just recreating your environment from scratch...

This is pretty puzzling; it'd be nice to know what went wrong here,
but I got nothin'.
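
Deleting the stale bytecode can be done by hand or with a few lines of Python. The sketch below is written for Python 3 rather than the Python 2.7 setup in this thread. `remove_pyc_files` is a hypothetical helper, and the package directory is something you would look up yourself, for instance from `patsy.__file__`.

```python
from pathlib import Path


def remove_pyc_files(package_dir):
    """Delete every .pyc file under `package_dir` and return the removed paths."""
    removed = []
    # sorted() collects all matches first, so nothing is deleted mid-walk.
    for pyc in sorted(Path(package_dir).rglob("*.pyc")):
        pyc.unlink()
        removed.append(pyc)
    return removed
```

A call might look like `remove_pyc_files(Path(patsy.__file__).parent)`. Python then regenerates the bytecode from the `.py` sources on the next import.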

David Kronenberg

10 Nov 2015, 17:07:59
to pystatsmodels
That worked for me, thanks for the quick response. 

Ludwid Reyes

23 Dec 2015, 11:08:39
to pystatsmodels
I'm new to this, and thank you so much for your help. Simply upgrading patsy to 0.4.1 from the official pip repository fixed my issue.

Maneesh Janyavula

27 Apr 2016, 22:47:45
to pystatsmodels
Upgrading the patsy version to 0.4.1 worked for me.
Do a pip install -U patsy

Dielia Ba

24 May 2016, 19:13:55
to pystatsmodels
Hi everyone, I know that it's been a long time since the first post, but I have the same error message.
I am creating a simulation model that uses elasticities, which I obtained from a quantile regression (using R). Now I have to run the model under IPython using a predefined microsimulation module.
I installed patsy and followed the instructions, but it's not working, which is totally normal since patsy does not have the elasticity calculator.
Do you know any other way of fixing this problem?
Thank you in advance. 

Nathaniel Smith

24 May 2016, 19:18:38
to pystatsmodels

Hi Dielia,

It's a bit hard to tell from your post what you're doing or what went wrong -- could you paste the actual code you are running, and the complete output you get?

-n

josef...@gmail.com

24 May 2016, 19:22:13
to pystatsmodels
On Tue, May 24, 2016 at 7:18 PM, Nathaniel Smith <n...@vorpus.org> wrote:

> Hi Dielia,
>
> It's a bit hard to tell from your post what you're doing or what went wrong -- could you paste the actual code you are running, and the complete output you get?


And make sure you actually have patsy >= 0.4.1 installed in the Python installation that IPython uses, just in case you have several Python installations.

Josef
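
Josef's point about multiple installations can be checked from inside the interpreter that IPython actually runs. The following is a sketch assuming Python 3.8 or later (the thread itself used Python 2.7). `installed_version` and `at_least` are hypothetical helper names, and the comparison only handles plain numeric versions such as `0.4.1`.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def at_least(version, minimum):
    """Compare dotted numeric versions, e.g. at_least("0.4.1", "0.4.1")."""
    def parse(text):
        return tuple(int(part) for part in text.split("."))
    return parse(version) >= parse(minimum)


if __name__ == "__main__":
    # Shows which interpreter is running, then what it can import.
    print("interpreter:", sys.executable)
    found = installed_version("patsy")
    if found is None:
        print("patsy: not installed for this interpreter")
    else:
        print("patsy:", found)
```

Running this in the same IPython session that raises the error shows whether that interpreter sees the upgraded patsy or an older copy.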

Dielia Ba

1 Jun 2016, 11:08:52
to pystatsmodels
Hi Josef,
Thank you for your message.
The code is very long before it gets to the calculator command for the elasticities, and it's not my code: my colleague created a module instead of directly running the quantile regression. So I am not sure I am allowed to share it in an open discussion :/
Dielia

Sebastain Gould

25 May 2017, 08:06:58
to pystatsmodels
This: pip install -U patsy
is still helping solve this issue in 2017. 

bump 