[R] problem with svyglm


Pablo Menese

Nov 23, 2012, 3:08:06 PM
to r-h...@r-project.org
I have this problem.

test <- svydesign(id=~1,weights=~peso)

logit <- svyglm(bach ~ job2 + mujer + egp4 + programa + delay + mdeo + str
+ evprivate, family=binomial,design=test)

Then this appears:

Error in svyglm.survey.design(bach ~ job2 + mujer + egp4 + programa + :
all variables must be in design= argument

I don't know what this means.
Please help.


Pablo.

[[alternative HTML version deleted]]

______________________________________________
R-h...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius

Nov 23, 2012, 3:56:32 PM
to Pablo Menese, r-h...@r-project.org

On Nov 23, 2012, at 12:08 PM, Pablo Menese wrote:

> I have this problem.
>
> test <- svydesign(id=~1,weights=~peso)
>
> logit <- svyglm(bach ~ job2 + mujer + egp4 + programa + delay + mdeo + str
> + evprivate, family=binomial,design=test)
>
> Then this appears:
>
> Error in svyglm.survey.design(bach ~ job2 + mujer + egp4 + programa + :
> all variables must be in design= argument
>
> I don't know what this means.

I suspect you have attach()-ed your dataset and are expecting
regression functions to be "aware" of your column names. That
expectation is not always fulfilled, since the authors of
regression packages expect data frame arguments to be supplied.
You may want to detach the dataset and use the data= argument in
svydesign().

You should forget that you ever heard about the function attach().
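
As a rough sketch of that suggestion, the design object can be built from a data frame through data=, so that svyglm() finds every model variable inside the design. The data frame `dat` below is simulated and purely hypothetical; only the column names bach, mujer, egp4 and peso are borrowed from the question. Using quasibinomial() instead of binomial is an assumption made here to avoid warnings about non-integer successes once weights are applied.

```r
library(survey)

set.seed(1)

# Hypothetical stand-in for the poster's data (simulated, not the real survey)
dat <- data.frame(
  bach  = rbinom(200, 1, 0.5),                      # binary outcome
  mujer = rbinom(200, 1, 0.5),                      # binary covariate
  egp4  = factor(sample(1:4, 200, replace = TRUE)), # four-level class variable
  peso  = runif(200, 5, 50)                         # sampling weights
)

# The design carries the data, so no attach() is needed
des <- svydesign(id = ~1, weights = ~peso, data = dat)

# svyglm() looks up bach, mujer and egp4 inside the design object
fit <- svyglm(bach ~ mujer + egp4, design = des, family = quasibinomial())
print(summary(fit))
```

Note that svyglm() takes its variables from design=; any column not present in the data frame given to svydesign() would presumably trigger the "all variables must be in design= argument" error again.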

--

David Winsemius, MD
Alameda, CA, USA

Pablo Menese

Nov 27, 2012, 4:27:46 PM
to David Winsemius, r-h...@r-project.org
I could not, even without attaching the dataset.

The thing is, when I use this






Pablo Menese

Nov 27, 2012, 4:31:30 PM
to David Winsemius, r-h...@r-project.org
Sorry, that message got sent before I finished it.

When I use this:

logit <- glm(bach ~ egp4 + programa, weight=wst7,
family=quasibinomial(link"logit"))

I get the same betas as in Stata, but the hypothesis tests, the t values,
and the standard errors are different.

I think the solution can't be far from this...

David Winsemius

Nov 27, 2012, 9:49:36 PM
to Pablo Menese, r-h...@r-project.org

On Nov 27, 2012, at 2:31 PM, Pablo Menese wrote:

> Sorry, that message got sent before I finished it.
>
> When I use this:
>
> logit <- glm(bach ~ egp4 + programa, weight=wst7,
> family=quasibinomial(link"logit"))
>
> I get the same betas as in Stata, but the hypothesis tests, the t
> values, and the standard errors are different.

As might be expected if one (Stata) were running a weighted analysis and
the other (R) were using a different interpretation of "weights".

> I think the solution can't be far from this...

If so, then you will be the one to achieve it. You have offered no
data, you omitted the context of the original question, and the code
in this posting is obviously incorrect. Furthermore, you started with
a `svyglm` question and this code only attempts to use `glm`.
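
The remark about differing interpretations of "weights" can be illustrated with a small sketch. It assumes that glm() treats `weights` as prior (precision-type) weights with model-based standard errors, while svyglm() treats them as sampling weights with design-based standard errors. The data, and the names `dat`, `x`, `y` and `w`, are simulated and hypothetical. The point estimates should agree, but the standard errors generally will not.

```r
library(survey)

set.seed(123)
n <- 300

# Simulated, hypothetical data: one covariate, a binary outcome, unequal weights
dat <- data.frame(x = rnorm(n), w = runif(n, 1, 50))
dat$y <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$x))

# glm(): weights are treated as prior weights; standard errors are model-based
fit_glm <- glm(y ~ x, data = dat, weights = w,
               family = quasibinomial(link = "logit"))

# svyglm(): weights are sampling weights; standard errors are design-based
des <- svydesign(id = ~1, weights = ~w, data = dat)
fit_svy <- svyglm(y ~ x, design = des,
                  family = quasibinomial(link = "logit"))

# Coefficients side by side, then standard errors side by side
print(cbind(glm = coef(fit_glm), svyglm = coef(fit_svy)))
print(cbind(glm = sqrt(diag(vcov(fit_glm))),
            svyglm = sqrt(diag(vcov(fit_svy)))))
```

This would be consistent with the report of matching betas but different t values and standard errors, although without the original data that remains a guess.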

--

David Winsemius, MD
Alameda, CA, USA

Pablo Menese

Nov 28, 2012, 1:08:14 PM
to David Winsemius, R help
I achieved something different: I replicated the t values, the standard
errors, and the hypothesis tests, but got different betas.
But you are right. The thing is, I detached the dataset, and even so I
couldn't get it to work.

I am going to describe everything, because perhaps I omitted something
important. I have this vector for the weights, "wst7". My dataset is a panel
survey with 103 observations missing. "wst7" is the weight combined with the
non-response adjustment factor, with data for only 248 observations.

> class(wst7)
[1] "numeric"
> summary(wst7)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   0.00   10.26   13.52   21.33   25.85  146.00     103

If I use "wst7" to create the svydesign, this appears:

> test <- svydesign(id=~1,weights=~wst7)
Error in function (object, ...) : missing values in `weights'

So I created a vector without NAs (now with the dataset detached).

peso<-na.omit(matrix(data$wst7))

Then

test <- svydesign(id=~fullid,weights=~peso)

(fullid is the identifier for each observation; I also tried "1", or
whatever you want there)

Then

logit <- svyglm(bach ~ job2 + mujer + egp4 + programa + delay + mdeo + str +
                evprivate, family=binomial(link="logit"), design=test,
                data=data)

This appears:
Error in svyglm.survey.design(bach ~ job2 + mujer + egp4 + programa + :
all variables must be in design= argument

I also get problems with other survey functions, such as svymean:

svymean(data$mujer, design=test)
mean SE
[1,] 0.78843 0.0479
Warning messages:
1: In x * pweights :
longer object length is not a multiple of shorter object length
2: In x * pweights :
longer object length is not a multiple of shorter object length

whereas the mean for "mujer" in Stata is:

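. svy: mean mujer
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       1        Number of obs    =     248
Number of PSUs   =     248        Population size  = 5290.16
                                  Design df        =     247

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       mujer |   .5551581   .0410122      .4743798    .6359363
--------------------------------------------------------------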

So I think the problem is in the survey design...
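
One possible reading of the warnings above is that "peso" holds 248 values while the data still has 351 rows (248 + 103), so the weights get recycled against the wrong observations. A minimal sketch of an alternative, assuming the intent is to analyse only the rows with a non-missing weight, is to subset the data frame first and then pass it through data=. The data frame `dat` below is simulated and hypothetical; only the names fullid, mujer and wst7 come from this thread.

```r
library(survey)

set.seed(42)
n <- 351

# Simulated, hypothetical panel: 351 rows, 103 of them with a missing weight
dat <- data.frame(
  fullid = seq_len(n),
  mujer  = rbinom(n, 1, 0.55),
  wst7   = runif(n, 1, 100)
)
dat$wst7[sample(n, 103)] <- NA

# Drop whole rows with a missing weight, so weights and variables stay aligned
dat_w <- subset(dat, !is.na(wst7))

# Build the design from the subsetted data frame
des <- svydesign(id = ~fullid, weights = ~wst7, data = dat_w)

# Survey functions take a one-sided formula that refers to columns in the design
m <- svymean(~mujer, design = des)
print(m)
```

With the design built this way, a call such as svyglm(bach ~ ..., design = des) needs no separate data= argument, provided every model variable is a column of the data frame given to svydesign().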


