covariates that make my model (and I) crazy

166 views
Skip to first unread message

Emmanuel W

unread,
Oct 26, 2018, 8:53:55 AM10/26/18
to lavaan
Hello again,

I have a problem with the covariates in my model but I don't know if it's a question of understanding the method or the package. In both cases, thank you in advance for your help.

I have a model with a second-order latent variable (depression) and four outcomes. I have three adjustment variables, namely age, sex and education in 3 categories (so I created 2 binary variables educ2 and educ3). I adjust both when creating my second-order latent variable and when regressing:

depression=~latent1 + latent2 +age + sex + educ2 + educ3

outcome1~depression + age + educ2 + educ3 + sex
outcome2~depression + age + educ2 + educ3 + sex
outcome3~depression + age + educ2 + educ3 + sex
outcome4~depression + age + educ2 + educ3 + sex

The age-adjusted model (withoout sex and education) works as well as the all-adjusted model. On the other hand, the model adjusted only for age and education (without sex) does not work:

Warning messages:
1: In lav_data_full(data = data, group = group, cluster = cluster, :
  lavaan WARNING: some observed variances are (at least) a factor 1000 times larger than others; use varTable(fit) to investigate
2: In sqrt(A1[[g]]) : NaN production
3: In lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats, :
  lavaan WARNING:
    Could not compute standard errors! The information matrix could
    not be inverted. This may be a symptom that the model is not
    identified.
4: In sqrt(A1[[g]]) : NaN production
5: In lav_test_satorra_bentler(lavobject = NULL, lavsamplestats = lavsamplestats, :

  lavaan WARNING: could not invert information matrix

A little by chance, I tried to add to the syntax "ordered=c(..., "educ2", "educ3")" and it works again. But if I compare the all-adjusted model with or without this parameter, the results are different...

1) What is the difference between these two models?
2) It seemed to me that when the variables were exogenous, it was not necessary to specify among the "ordered" variables. Was I wrong?
3) Moreover, why is the addition of sex in the model enough to solve this problem?

Thank you very much

Emmanuel

ddu...@g.uky.edu

unread,
Oct 26, 2018, 11:00:55 AM10/26/18
to lavaan
Hi Emmanuel,

I wonder if your problem is not all wrapped up in this line of your model:

depression=~latent1 + latent2 +age + sex + educ2 + educ3

Do you really want age, sex, and education to be indicators of depression?

-David

Emmanuel W

unread,
Oct 26, 2018, 11:47:06 AM10/26/18
to lavaan
You must be right David!

I want adjust for these variables as the authors do here:
 

Maybe with

depression=~latent1 + latent2
depression~age + sexe + educ2 + educ3 ?

Emmanuel

ddu...@g.uky.edu

unread,
Oct 26, 2018, 1:20:17 PM10/26/18
to lavaan
Hi Emmanuel,

That looks better to me. But, I wonder: Why are you using your covariates as predictors of your exogenous variable? Usually, we only predict endogenous variables with covariates. That way, the interpretation of the effect of the predictor on the outcome is simpler: "After controlling for covariates x,y,z, the effect of depression on outcome1 is ..."

When you also have the covariates predicting depression, the interpretation is a bit less intuitive: "After removing variability explained by covariates x, y, z from both depression and outcome1, the effect of depression on outcome1 is ..." While there may be some instances when this is the desired interpretation, I would have a serious look at your research questions to make sure your model will help you answer them appropriately.

-David

Emmanuel W

unread,
Oct 29, 2018, 10:20:36 AM10/29/18
to lavaan
Hi David,

the research question would be to determine whether latent1 or latent2 predicts my outcomes above and beyond the association attributable to the depression.

the new model runs but I can't have my modification indices (I had them when my covariates were indicators of depression) :

lavTestScore(fit, add=test.mi)

Error in tmp[cbind(REP$row[idx], REP$col[idx])] <- lavpartable$free[idx] :
  NAs are not allowed in subscripted assignments


Emmanuel

Emmanuel W

unread,
Oct 29, 2018, 12:47:29 PM10/29/18
to lavaan
After multiple attempts, I realize that the only way to insert adjustment variables (for the outcome or for my depression variable) is first to create a latent variable corresponding to this variable. So model 1 does not give the modification indices:

Model1 :
depression=~dep+pos+som+som+int
depression~age

outcome1~depression + age
outcome2~depression + age
outcome3~depression + age
outcome4~depression + age

but model 2 does not return errors for the indices:

Model 2:
depression=~dep+pos+som+som+int
latent_age=~age
depression~latent_age

outcome1~depression + latent_age
outcome2~depression + latent_age
outcome3~depression + latent_age
outcome4~depression + latent_age


I don't know how to fix it because conceptually age is not a latent variable...

ddu...@g.uky.edu

unread,
Oct 29, 2018, 12:56:00 PM10/29/18
to lavaan
Hi Emmanuel, 

I am not very familiar with modification indices in lavaan. Hopefully someone who is can give you better advice than this:

From the error, it sounds like the problem is missingness on the age variable. The ML estimator will deal with this missingness properly when age is not exogenous (i.e., it is in the likelihood function), which is why it gives you MIs when you make age latent. 

I don't know if it would work in lavaan, but in Mplus, I can pull exogenous variables into the likelihood function just by mentioning their correlation with other exogenous variables (e.g. "age ~~ sex"). You could try something like this to see if it works.

I personally have no problem with your latent_age trick, so long as all the model parameter estimates are still the same.

-David

Emmanuel W

unread,
Oct 29, 2018, 1:29:24 PM10/29/18
to lavaan
Thank you very much David for your ideas. But I think the explanation for the error is different because I don't have any missing data for age.

Emmanuel W

unread,
Oct 30, 2018, 8:20:11 AM10/30/18
to lavaan
Hello,
Sorry to flood you with messages but I have tested several models again. With my three covariates, I can obtain the modification indices for the regressions between my first order latent variables and my outcomes. The error appears when I want to see the indices for the regressions between the items constituting the latent variables and the outcomes...

Emmanuel W

unread,
Nov 8, 2018, 1:48:52 PM11/8/18
to lavaan
Hello,

Before comparing each model one by one (20 items with 4 outcomes and 2 sexes...), I would like to send you my entire script if anyone has an idea...

model <-
'dep=~CESD3 + CESD6 + CESD9 + CESD10 + CESD14 + CESD17 + CESD18
pos=~CESD4 + CESD8 + CESD12 + CESD16
som=~CESD1 + CESD2 + CESD5 + CESD7 + CESD11 + CESD13 + CESD20
int=~CESD15 + CESD19
depression=~dep+pos+som+int
depression~age + sex + educ2 + educ3

outcome1~depression + age + sex + educ2 + educ3
outcome2~depression + age + sex + educ2 + educ3
outcome3~depression + age + sex + educ2 + educ3
outcome4~depression + age + sex + educ2 + educ3

outcome1~~outcome2
outcome1~~outcome3
outcome1~~outcome4
outcome2~~outcome3
outcome2~~outcome4
outcome3~~outcome4
'

fit <- cfa(model, data = bdd, estimator = "WLSMV",
ordered=c("CESD1","CESD2","CESD3","CESD4","CESD5","CESD6","CESD7","CESD8",
"CESD9","CESD10","CESD11","CESD12","CESD13","CESD14","CESD15","CESD16",
"CESD17","CESD18","CESD19","CESD20","outcome1","outcome2",
"outcome3","outcome4"))

summary(fit, fit.measures=TRUE, standardized=TRUE)

test.mi<-
'outcome1~dep'

lavTestScore(fit, add=test.mi)
runs

but not test.mi<-'outcome1~CESD1' :

Error in tmp[cbind(REP$row[idx], REP$col[idx])] <- lavpartable$free[idx] :
  NAs are not allowed in subscripted assignments

Thank you!

Emmanuel

Terrence Jorgensen

unread,
Nov 9, 2018, 6:27:52 AM11/9/18
to lavaan
lavaan uses (a subset of) the LISREL representation, which uses the lambda matrix for arrows that point from a latent predictor to an observed outcome.  I know you aren't interpreting outcome1~dep as a factor loading, but statistically there is no distinction: loadings are regression slopes.  So try adding the parameter dep=~outcome1 instead.

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Emmanuel W

unread,
Nov 9, 2018, 7:36:10 AM11/9/18
to lavaan
Thank you Terence for your suggestion! but it's still a problem...

dep=~outcome1 works and (give me the same result that outcome1~dep).
 
But with CESD1=~outcome1, the error is:

> lavTestScore(fit, add=test.mi)
Error in lav_model(lavpartable = lavpartable, lavoptions = lavoptions,  :
lavaan ERROR: parameter is not defined: CESD1 | t1

I have a result with CESD1~outcome1 but I think that it's not the relationship I'm interested in.

Emmanuel

Terrence Jorgensen

unread,
Nov 12, 2018, 7:56:33 AM11/12/18
to lavaan
dep=~outcome1 works and (give me the same result that outcome1~dep). 

Woops, I was confused before.  I thought this was the test that wouldn't run.

But with CESD1=~outcome1, the error is:

> lavTestScore(fit, add=test.mi)
Error in lav_model(lavpartable = lavpartable, lavoptions = lavoptions,  :
lavaan ERROR: parameter is not defined: CESD1 | t1

Right, that's not possible because CESD1 is observed, not latent.  And your original outcome1~CESD1 did not work because CESD1 is an ordered factor.  Cagetorical predictors are only allowed when they are represented with numeric codes:


Only the reverse would be possible (CESD1~outcome1), which is how differential item functioning can be tested using a MIMIC model.  But that is clearly not the hypothesized direction of effects.  

A solution that might work would be to define a single-indicator construct for CESD1, then use that "construct" as the indicator of som, and it could also predict outcome1.  The single-indicator construct is essentially equivalent to what the model is doing anyway: assuming there is a normally distributed latent response underlying the categorical observed response.  Defining a single-indicator construct just brings that latent item-response explicitly into the model to use as a predictor.

Emmanuel W

unread,
Nov 12, 2018, 8:21:16 AM11/12/18
to lavaan
Thank you Terence, I'm going to try that. However, I don't understand why outcome1~CESD1 works when I remove all my covariates (age + sex + educ2 + educ3).

Emmanuel

Emmanuel W

unread,
Nov 13, 2018, 8:01:07 AM11/13/18
to lavaan
Hello, 

Here are my results that leave me more perplexed than ever:

1) With "CESD1b=~CESD1", CESD1b as indicator of som and my covariates:

Warning messages:
1: In lav_model_vcov(lavmodel = lavmodel, lavsamplestats = lavsamplestats,  :

  lavaan WARNING:
    Could not compute standard errors! The information matrix could
    not be inverted. This may be a symptom that the model is not
    identified.
2: In lav_test_satorra_bentler(lavobject = NULL, lavsamplestats = lavsamplestats,  :

  lavaan WARNING: could not invert information matrix

However, if I try :

test.mi<-'outcome1~CESD1b
lavTestScore(fit, add=test.mi)

I find:                

lhs op rhs    X2 df p.value
1 'outcome1~CESD1b ==   0 3.956  1   0.047

2) With "CESD1b=~CESD1", CESD1b as indicator of som and no covariates, I have the same warnings and:

                 lhs op rhs     X2 df p.value
1 fum_act_tot~CESD1b ==   0 10.001  1   0.002

3) Finally, when I try without the single-indicator construct for CESD1 (and without covariates necessarily, otherwise the model does not run):

            lhs op rhs    X2 df p.value
1 fum_act_tot~CESD1 ==   0 0.504  1   0.478



I can understand the difference due to the adjustment (even if it is a little surprising because the covariates have almost no effect for the four latent variables, i.e., som, pos etc...) but why such a difference between 2 and 3? 

Emmanuel

Emmanuel W

unread,
Nov 22, 2018, 12:34:11 PM11/22/18
to lavaan
Hello,

Finally, the solution (given by Dr. Rosseel) is to add "conditional.x = FALSE" as a parameter.

However, I can't explain it, but it only works if I use the 32-bit version instead of the 64-bit version of R...

Emmanuel
Reply all
Reply to author
Forward
0 new messages