[R] Error in stepwise regression: "number of rows in use has changed: remove missing values?"

Kum-Hoe Hwang

Feb 16, 2010, 3:24:01 AM
to r-h...@r-project.org
Howdy, R Gurus

I have enjoyed R, but there is one problem I cannot solve easily. Please help.
When I ran my R script, I got the following error. I believe the error
results from the input data file, which was exported from Excel spreadsheet
software.

Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
as.numeric(nation.grant) +  :
  number of rows in use has changed: remove missing values?

Could you point me to a way to resolve this error?
Thanks in advance,


> ############### outputs from R console ###############
> pop <- step(
+             lm(pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant)
+                + as.numeric(do.grant) + as.numeric(city.grant) + as.numeric(DMZ.dist)
+                + as.numeric(Seoul.dist),
+                data=borderI.data, na.action = na.omit)
+             )
Start:  AIC=494.27
pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +
    as.numeric(do.grant) + as.numeric(city.grant) + as.numeric(DMZ.dist) +
    as.numeric(Seoul.dist)
                           Df Sum of Sq    RSS    AIC
- as.numeric(do.grant)      1      0.71 6622.9 492.28
- as.factor(policy)         1      1.21 6623.4 492.29
- as.numeric(DMZ.dist)      1      1.91 6624.1 492.30
- as.numeric(city.grant)    1      5.07 6627.3 492.36
- as.numeric(nation.grant)  1     11.51 6633.7 492.47
- as.numeric(year)          1     29.58 6651.8 492.80
<none>                                  6622.2 494.27
- as.numeric(Seoul.dist)    1    673.22 7295.4 503.79
Step:  AIC=492.28
pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +
    as.numeric(city.grant) + as.numeric(DMZ.dist) + as.numeric(Seoul.dist)
                           Df Sum of Sq    RSS    AIC
- as.factor(policy)         1      1.99 6624.9 490.32
- as.numeric(DMZ.dist)      1      2.09 6625.0 490.32
- as.numeric(city.grant)    1      7.18 6630.1 490.41
- as.numeric(nation.grant)  1     20.08 6643.0 490.64
- as.numeric(year)          1     28.89 6651.8 490.80
<none>                                  6622.9 492.28
- as.numeric(Seoul.dist)    1    697.46 7320.4 502.20
Step:  AIC=490.32
pop.rate ~ as.numeric(year) + as.numeric(nation.grant) + as.numeric(city.grant) +
    as.numeric(DMZ.dist) + as.numeric(Seoul.dist)
                           Df Sum of Sq    RSS    AIC
- as.numeric(DMZ.dist)      1      2.08 6627.0 488.35
- as.numeric(city.grant)    1     10.65 6635.6 488.51
- as.numeric(nation.grant)  1     31.30 6656.2 488.88
- as.numeric(year)          1     31.44 6656.4 488.88
<none>                                  6624.9 490.32
- as.numeric(Seoul.dist)    1    732.88 7357.8 500.80
Step:  AIC=488.35
pop.rate ~ as.numeric(year) + as.numeric(nation.grant) + as.numeric(city.grant) +
    as.numeric(Seoul.dist)
                           Df Sum of Sq    RSS    AIC
- as.numeric(city.grant)    1      9.86 6636.9 486.53
- as.numeric(year)          1     31.42 6658.4 486.92
- as.numeric(nation.grant)  1     33.33 6660.3 486.95
<none>                                  6627.0 488.35
- as.numeric(Seoul.dist)    1    754.40 7381.4 499.18

Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +  :
  number of rows in use has changed: remove missing values?


--
Kum-Hoe Hwang, Ph.D.

Phone : 82-31-250-3516
Email : phdh...@gmail.com

______________________________________________
R-h...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Mohamed Lajnef

Feb 16, 2010, 5:48:33 AM
to Kum-Hoe Hwang, r-h...@r-project.org

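Hi Kum,

If you look at the code of the step function (by typing step in the R
console), you will see that it tests the condition
if (length(fit$residuals) != n) each time it refits the model. Every
refitted model must have the same number of residuals as the starting
model; here that requirement is not fulfilled, so the test is true and
step stops. This explains the error!
I hope this helps.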

Regards
M


Kum-Hoe Hwang wrote:


--


Mohamed Lajnef,IE
INSERM U955 eq 15
Pôle de Psychiatrie
Hôpital CHENEVIER
40, rue Mesly
94010 CRETEIL Cedex FRANCE
Mohamed...@inserm.fr
tel : 01 49 81 31 31 (poste 18470)
Sec : 01 49 81 32 90
fax : 01 49 81 30 99

Peter Ehlers

Feb 16, 2010, 6:09:37 AM
to Kum-Hoe Hwang, r-h...@r-project.org
On 2010-02-16 1:24, Kum-Hoe Hwang wrote:
> Howdy, R Gurus
>
> I have enjoyed R, but there is one problem I cannot solve easily. Please help.
> When I ran my R script, I got the following error. I believe the error
> results from the input data file, which was exported from Excel spreadsheet
> software.
>
> Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
> as.numeric(nation.grant) + :
> number of rows in use has changed: remove missing values?
>
> Could you point me to a way to resolve this error?
> Thanks in advance,

This is a common situation when you use step() on data where
the predictors have missing values.

A case (row) is included in the model only if all the
predictors for that model are non-missing for the case.

As you vary which predictors are to be in the model, the
included cases will vary, resulting in models based on
different data. (Think of your cases as subjects; you want
all your models to be based on the same set of subjects.)

Finally: (Re-)read the help page and note the 'warning'.
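
To make the point above concrete, here is a minimal sketch in R. The data
frame d and its columns x1, x2 and y are made up for illustration; they are
not the poster's borderI.data. x2 is missing in four rows and adds almost
nothing to the fit, so step() should decide to drop it. The refit without
x2 then uses 20 rows instead of 16, which is the situation step() complains
about.

```r
# Hypothetical toy data: 20 rows, x2 missing in the first 4 rows.
i <- 1:20
d <- data.frame(
  x1 = i,
  x2 = rep(c(1, 1, -1, -1), 5),    # nearly unrelated to y once x1 is known
  y  = 2 * i + rep(c(1, -1), 10)   # y depends on x1 only
)
d$x2[1:4] <- NA

full    <- lm(y ~ x1 + x2, data = d)  # default na.action drops the 4 NA rows
reduced <- lm(y ~ x1, data = d)       # x1 and y are complete, all rows used

n_full    <- nobs(full)      # rows used by the model that includes x2
n_reduced <- nobs(reduced)   # rows used once x2 is removed

# step() refits after dropping a term and stops if the row count changed.
msg <- tryCatch(
  {
    step(full, trace = 0)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(c(full = n_full, reduced = n_reduced))
print(msg)
```

Note that passing na.action = na.omit to lm(), as in the original call, does
not avoid this: the omission is applied separately to each model, using only
the variables that model contains.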

-Peter Ehlers

--
Peter Ehlers
University of Calgary

Kum-Hoe Hwang

Feb 17, 2010, 3:41:04 AM
to r-h...@r-project.org
I thank those who helped to solve an error in stepwise regression with
missing values.


Kum

A good solution that I have tried was Andreas's advice.

=====================================================================

Try

data<-na.omit(original database) before you run step() or stepAIC()

--
Kum-Hoe Hwang, Ph.D.

Phone : 82-31-250-3516
Email : phdh...@gmail.com


Kum-Hoe Hwang

Feb 17, 2010, 3:43:47 AM
to r-h...@r-project.org
Sorry for my faulty email; here is the corrected version.

I thank those who helped to solve an error in stepwise regression with
missing values.

A good solution that I have tried was Andreas's advice.

=====================================================================

Try

data<-na.omit(original database) before you run step() or stepAIC()
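
In runnable form, the advice could look like the sketch below. The data
frame d and its columns are invented for illustration; with real data you
would substitute your own data frame and formula. Restricting na.omit() to
the variables named in the formula (via all.vars()) is an assumption added
here, so that rows are not thrown away because of gaps in unrelated columns.

```r
# Hypothetical toy data: 20 rows, x2 missing in the first 4 rows.
i <- 1:20
d <- data.frame(
  x1 = i,
  x2 = rep(c(1, 1, -1, -1), 5),
  y  = 2 * i + rep(c(1, -1), 10)
)
d$x2[1:4] <- NA

form <- y ~ x1 + x2

# Keep only rows that are complete for every variable in the full model,
# so that all candidate models are fitted to the same cases.
d_cc <- na.omit(d[, all.vars(form)])

# step() now runs to completion because the row count cannot change.
sel <- step(lm(form, data = d_cc), trace = 0)

print(nrow(d_cc))
print(formula(sel))
```

MASS::stepAIC() can be called in the same way on a model fitted to d_cc.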


Kum

On Tue, Feb 16, 2010 at 8:09 PM, Peter Ehlers <ehl...@ucalgary.ca> wrote:
>

Greg Snow

Feb 19, 2010, 3:57:29 PM
to Kum-Hoe Hwang, r-h...@r-project.org
Have you considered the implications of that solution?

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg...@imail.org
801.408.8111


Kum-Hoe Hwang

Feb 22, 2010, 3:22:05 AM
to Greg Snow, r-h...@r-project.org
The solution "data<-na.omit(original database) before you run step() or
stepAIC()" has some limitations, I think. It reduced the number of data
rows, and it increased the R-squared value.
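
One way to see how much data this costs is to count the rows that survive.
The sketch below uses a small invented data frame (the column names,
including the column unused, are hypothetical) and compares na.omit() on
the whole data frame with na.omit() on only the variables in the model
formula. It is a rough check under these assumptions, not a remedy for the
missing values themselves.

```r
# Hypothetical data: x2 is a predictor with gaps; 'unused' is a column
# that the model never touches but that also contains NAs.
d <- data.frame(
  y      = c(1.2, 2.3, 2.9, 4.1, 5.2, 5.8, 7.1, 8.0),
  x1     = 1:8,
  x2     = c(NA, 0.5, 0.1, 0.9, 0.3, NA, 0.7, 0.2),
  unused = c(1, NA, NA, 4, 5, 6, NA, 8)
)
form <- y ~ x1 + x2

all_cols   <- na.omit(d)                     # drops a row for ANY missing value
model_cols <- na.omit(d[, all.vars(form)])   # drops rows only for model variables

print(colSums(is.na(d)))   # missing values per column
print(c(original   = nrow(d),
        all_cols   = nrow(all_cols),
        model_cols = nrow(model_cols)))
```

A model fitted to the smaller set of rows describes only the complete
cases, so its R-squared is not directly comparable with one computed on
more rows.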

If you have any tips or advice on another solution, I would welcome them.

Kum

Urban and Regional Planning, GRI
