How do I run Nadaraya-Watson kernel regression?

David Montgomery

Mar 7, 2014, 5:48:46 AM

to pystat...@googlegroups.com

Hi,

I am using version 5.

How do I run Nadaraya-Watson kernel regression?

Here is xo, the point I am at: array([ 1.66172, 1.66167, 1.66179, 1.66167, 1.66176])

Here are my K nearest neighbors to xo.

        x_0      x_1      x_2      x_3      x_4
0   1.66070  1.66076  1.66134  1.66133  1.66175
1   1.66170  1.66123  1.66115  1.66152  1.66175
2   1.66185  1.66196  1.66171  1.66145  1.66178
3   1.66152  1.66175  1.66188  1.66186  1.66173
4   1.66209  1.66181  1.66172  1.66167  1.66179
5   1.66189  1.66193  1.66209  1.66181  1.66172
6   1.66214  1.66208  1.66185  1.66191  1.66180
7   1.66178  1.66189  1.66193  1.66209  1.66181
8   1.66142  1.66150  1.66185  1.66196  1.66171
9   1.66133  1.66175  1.66118  1.66112  1.66170
10  1.66208  1.66185  1.66191  1.66180  1.66170
11  1.66185  1.66191  1.66180  1.66170  1.66183
12  1.66193  1.66209  1.66181  1.66172  1.66167
13  1.66181  1.66172  1.66167  1.66179  1.66167
14  1.66095  1.66116  1.66142  1.66150  1.66185
15  1.66164  1.66192  1.66214  1.66208  1.66185
16  1.66115  1.66152  1.66175  1.66188  1.66186
17  1.66175  1.66188  1.66186  1.66173  1.66165
18  1.66188  1.66186  1.66173  1.66165  1.66164
19  1.66123  1.66115  1.66152  1.66175  1.66188

kernel_regression = statsmodels.nonparametric.kernel_regression
kr = kernel_regression.KernelReg([embedding[-1]],X,'c')

Or... is X a list of arrays? E.g., [[1.66070 1.66076 1.66134 1.66133 1.66175],[1.66070 1.66076 1.66134 1.66133 1.66175]]
I tried that too. It doesn't work.

Traceback (most recent call last):
  File "/home/ubuntu/workspace/chaos/forecast.py", line 168, in <module>
    forecast = get_forecast(local_df,X_cols)
  File "/home/ubuntu/workspace/chaos/forecast.py", line 132, in get_forecast
    kr = kernel_regression.KernelReg([embedding[-1]],X,'c')
  File "/usr/local/lib/python2.7/dist-packages/statsmodels/nonparametric/kernel_regression.py", line 100, in __init__
    self.exog = _adjust_shape(exog, self.k_vars)
  File "/usr/local/lib/python2.7/dist-packages/statsmodels/nonparametric/_kernel_base.py", line 443, in _adjust_shape
    dat = np.reshape(dat, (nobs, k_vars))
  File "/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 172, in reshape
    return reshape(newshape, order=order)
ValueError: total size of new array must be unchanged

The only thing that worked was the following.

kr = kernel_regression.KernelReg([1.66172, 1.66167, 1.66179, 1.66167, 1.66176], [1.66070, 1.66076, 1.66134, 1.66133, 1.66175], 'c')
KernelReg instance
Number of variables: k_vars = 1
Number of samples:   N = 5
Variable types:      c
BW selection method: cv_ls
Estimator type: ll

josef...@gmail.com

Mar 7, 2014, 7:33:38 AM

to pystatsmodels

Kernel regression also needs an array of y and an array of x, like
the other models: y and x must have the same shape[0], and y must be 1-dimensional.

The model is y = f(x) + u, where f is the unknown, nonparametrically
estimated function.
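
For reference, the Nadaraya-Watson (local constant) estimate of f at a point x_0 is a kernel-weighted average of the observed y values. The kernel K and bandwidth h below are generic symbols introduced for illustration, not taken from this thread:

```latex
\hat{f}(x_0) = \frac{\sum_{i=1}^{n} K\!\left(\frac{x_0 - x_i}{h}\right) y_i}
                    {\sum_{i=1}^{n} K\!\left(\frac{x_0 - x_i}{h}\right)}
```

Each observation x_i therefore needs its own response y_i, which is why y and x must have the same number of rows.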

It seems to me that you are using one x point as the dependent variable y.
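
A minimal sketch of the call shape described above, on synthetic data (the data, the prediction point x0, and the fixed bandwidths are assumptions made for illustration). Note that the summary printed in the question shows "Estimator type: ll", the local linear default; passing reg_type="lc" selects the local constant estimator, which is Nadaraya-Watson.

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

rng = np.random.default_rng(0)

# Synthetic stand-in data: 20 observations (rows) of 5 regressors (columns).
n_obs, n_vars = 20, 5
X = rng.normal(size=(n_obs, n_vars))                    # exog, shape (20, 5)
y = X.sum(axis=1) + rng.normal(scale=0.1, size=n_obs)   # endog, shape (20,)

# The point at which to predict: one row with the same 5 columns as X.
x0 = np.zeros((1, n_vars))

kr = KernelReg(
    endog=y,
    exog=X,
    var_type="c" * n_vars,   # one type character per column of X
    reg_type="lc",           # local constant, i.e. Nadaraya-Watson
    bw=[1.0] * n_vars,       # assumed fixed bandwidths; omit to use the cv_ls default
)

# fit() returns the conditional mean and the marginal effects at x0.
mean, _mfx = kr.fit(x0)
print(mean.shape)
```

Fixed bandwidths are used here only to keep the sketch fast; with the default bw="cv_ls" the bandwidths are chosen by least-squares cross-validation instead.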

There are some examples inside the statsmodels source tree that I used
to try this out, for example:
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/examples/ex_kernel_regression_dgp.py
https://github.com/statsmodels/statsmodels/blob/master/statsmodels/examples/ex_kernel_regression2.py

Josef

Anastasia Sokolova

Dec 8, 2019, 4:50:25 PM

to pystatsmodels

Hi, I know I'm about 5 years late with this answer, but maybe it will be useful for others. I faced the same problem as you: Python failed with the error "cannot reshape 9805 to size (1961,1)" when I tried to pass 5 variables with the following code:

KernelReg(X_train_new_1['goal1'], X_train_new_1[['field1','field12','field14','field16','field25']], 'c')

where y = X_train_new_1['goal1']

X = X_train_new_1[['field1','field12','field14','field16','field25']]

y.shape = 1961 (observations)

X.shape = (1961, 5) (5 variables of 1961 obs)

I was surprised to find that the problem was in the third positional argument, var_type. Because I wanted to pass 5 variables, I had to give var_type the type of each variable, one entry per variable, for example:

KernelReg(X_train_new_1['goal1'], X_train_new_1[['field1','field12','field14','field16','field25']], var_type = ['c', 'c', 'c', 'c', 'c'])

or

KernelReg(X_train_new_1['goal1'], X_train_new_1[['field1','field12','field14','field16','field25']], var_type = ['c', 'u', 'c', 'o', 'c'])
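
The arithmetic behind that error can be sketched with NumPy alone. This assumes, as the `_adjust_shape(exog, self.k_vars)` and `np.reshape(dat, (nobs, k_vars))` lines in the traceback earlier in the thread suggest, that the number of variables is taken from the length of var_type; the helper name `fits` is hypothetical.

```python
import numpy as np

n_obs, n_cols = 1961, 5
X = np.zeros((n_obs, n_cols))   # stand-in for the 5-column exog


def fits(exog, var_type):
    """Return True if exog can be reshaped to (n_obs, len(var_type))."""
    try:
        # Assumption: k_vars is len(var_type), as the traceback suggests.
        np.reshape(exog, (exog.shape[0], len(var_type)))
        return True
    except ValueError:
        return False


print(X.size)             # 9805 values in total (1961 * 5)
print(fits(X, "c"))       # False: 9805 values cannot fill a (1961, 1) array
print(fits(X, "ccccc"))   # True: five type codes match five columns
```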

I think the main problem with using statsmodels is its poor documentation.

On Friday, March 7, 2014 at 14:48:46 UTC+4, David Montgomery wrote:

kalyan dasgupta

May 5, 2020, 9:14:18 AM

to pystatsmodels

Hi,

Thanks a lot for the reply. It did help. A slight change is required. You have to convert the list to a string. For example,

var = ['c', 'c', 'c', 'c', 'c']
var1 = ''
for ii in var:
    var1 += ii  # builds the string 'ccccc'

KernelReg(Y, X, var_type = var1)
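
The same conversion can be written with str.join. The sketch below wraps it in a hypothetical helper, make_var_type, which is not part of statsmodels; it also checks that there is one code per column and that each code is one of 'c' (continuous), 'u' (unordered) or 'o' (ordered).

```python
def make_var_type(types, n_columns):
    """Hypothetical helper: join per-column type codes into one string."""
    allowed = {"c", "u", "o"}
    if len(types) != n_columns:
        raise ValueError(
            f"expected {n_columns} type codes, got {len(types)}"
        )
    bad = [t for t in types if t not in allowed]
    if bad:
        raise ValueError(f"unknown type codes: {bad}")
    return "".join(types)


var1 = make_var_type(['c', 'c', 'c', 'c', 'c'], n_columns=5)
print(var1)   # ccccc
```

The resulting string can then be passed as var_type in the KernelReg call above.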

Thanks a lot. I was struggling with this. You made it happen. :)
