Logit regression with categorical variables

Robert Garrison II

Dec 10, 2015, 8:50:26 PM
to pystatsmodels
Good Evening,

I am attempting to determine whether a mortgage should be classified as prime or subprime based upon certain characteristics. Is it possible to do a Logit regression with categorical variables? I am attempting to implement the following model:

prime_logit= smf.Logit(mortgage2005_df['prime_flag'], mortgage2005_df[['initial_interest_rate','DateCat','disposition','altpmi2','Mortgage_FICO_Bins','Mortgage_LTV_Bins','alt_loantype','conform2005','alt_lien','doctype','alt_occ','units']])

where all variables besides 'initial_interest_rate' are categorical variables. When attempting to run this code, I get the following:

prime_logit= smf.Logit(mortgage2005_df['prime_flag'], mortgage2005_df[['initial_interest_rate','DateCat','disposition','altpmi2','Mortgage_FICO_Bins','Mortgage_LTV_Bins','alt_loantype','conform2005','alt_lien','doctype','alt_occ','units']])
Traceback (most recent call last):

  File "<ipython-input-265-0f22804725c6>", line 1, in <module>
    prime_logit= smf.Logit(mortgage2005_df['prime_flag'], mortgage2005_df[['initial_interest_rate','DateCat','disposition','altpmi2','Mortgage_FICO_Bins','Mortgage_LTV_Bins','alt_loantype','conform2005','alt_lien','doctype','alt_occ','units']])

  File "/data/unixhome/rgarrison/anaconda3/lib/python3.4/site-packages/statsmodels/discrete/discrete_model.py", line 401, in __init__
    super(BinaryModel, self).__init__(endog, exog, **kwargs)

  File "/data/unixhome/rgarrison/anaconda3/lib/python3.4/site-packages/statsmodels/discrete/discrete_model.py", line 154, in __init__
    super(DiscreteModel, self).__init__(endog, exog, **kwargs)

  File "/data/unixhome/rgarrison/anaconda3/lib/python3.4/site-packages/statsmodels/base/model.py", line 186, in __init__
    super(LikelihoodModel, self).__init__(endog, exog, **kwargs)

  File "/data/unixhome/rgarrison/anaconda3/lib/python3.4/site-packages/statsmodels/base/model.py", line 60, in __init__
    **kwargs)

  File "/data/unixhome/rgarrison/anaconda3/lib/python3.4/site-packages/statsmodels/base/model.py", line 84, in _handle_data
    data = handle_data(endog, exog, missing, hasconst, **kwargs)

  File "/data/unixhome/rgarrison/anaconda3/lib/python3.4/site-packages/statsmodels/base/data.py", line 566, in handle_data
    **kwargs)

  File "/data/unixhome/rgarrison/anaconda3/lib/python3.4/site-packages/statsmodels/base/data.py", line 72, in __init__
    self.endog, self.exog = self._convert_endog_exog(endog, exog)

  File "/data/unixhome/rgarrison/anaconda3/lib/python3.4/site-packages/statsmodels/base/data.py", line 428, in _convert_endog_exog
    raise ValueError("Pandas data cast to numpy dtype of object. "

ValueError: Pandas data cast to numpy dtype of object. Check input data with np.asarray(data).


josef...@gmail.com

Dec 10, 2015, 9:06:11 PM
to pystatsmodels
Statsmodels currently does not do any automatic conversion of object
dtypes (patsy handles some cases in the formula interface).
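
A minimal sketch of the formula route, assuming made-up toy data in place of mortgage2005_df (the column values, sample size, and coefficients used to simulate the data are all hypothetical). Wrapping a string column in C(...) asks patsy to dummy-code it, so the object dtype never reaches the model directly:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data standing in for mortgage2005_df.
rng = np.random.default_rng(1)
n = 400
df = pd.DataFrame({
    "initial_interest_rate": rng.uniform(4.0, 10.0, n),
    "doctype": rng.choice(["full", "low", "none"], n),  # string-valued category
})

# Simulate a binary outcome so the example has something to fit.
lin = 4.0 - 0.6 * df["initial_interest_rate"] + 0.8 * (df["doctype"] == "full")
prob = (1.0 / (1.0 + np.exp(-lin))).to_numpy()
df["prime_flag"] = rng.binomial(1, prob)

# Lowercase smf.logit takes a formula; C(doctype) is expanded by patsy
# into treatment-coded dummy columns (first level is the reference).
result = smf.logit(
    "prime_flag ~ initial_interest_rate + C(doctype)", data=df
).fit(disp=0)
print(result.params)
```

With treatment coding, one level of each categorical variable is absorbed into the intercept, and the remaining levels each get their own coefficient.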

If you check the dtypes of your data

mortgage2005_df['prime_flag'].dtype
mortgage2005_df[['initial_interest_rate','DateCat','disposition','altpmi2','Mortgage_FICO_Bins','Mortgage_LTV_Bins','alt_loantype','conform2005','alt_lien','doctype','alt_occ','units']].dtypes

there should be at least one that is not numeric.

As a check:
if you do np.asarray(your_data_frame).dtype then it should be numeric
and not object,
where your_data_frame contains either endog, exog or both.
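
The check can be sketched like this, assuming a small made-up frame in place of the real data (the values are hypothetical; only the column names come from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the exog columns: one numeric, one string-valued.
df = pd.DataFrame({
    "initial_interest_rate": [5.5, 6.0, 7.25, 8.0],
    "doctype": ["full", "low", "full", "none"],
})

# Per-column dtypes: the string column is not numeric.
print(df.dtypes)

# Roughly what the model sees once the frame is converted to a NumPy array.
# Mixing floats and strings forces the whole array to dtype object.
as_array = np.asarray(df)
print(as_array.dtype)

# List the columns that still need encoding before they can be used as exog.
non_numeric = df.select_dtypes(exclude=[np.number]).columns.tolist()
print(non_numeric)
```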

pandas uses object arrays quite frequently now, e.g. for strings and in
some cases when there is a missing value, AFAIK

Josef

josef...@gmail.com

Dec 10, 2015, 9:11:57 PM
to pystatsmodels
More, IIUC how pandas works:

Booleans with missing values are stored as object arrays.
pandas has convenience functions that should help: pd.to_numeric and dropna.
If you have missing values in both endog and exog, then statsmodels
can remove them if `missing='drop'`, but I think it still requires
numeric data.

Josef
