Cross-lagged panel SEM with observed categorical data that have NAs in different time points

Blain Waan

Jan 10, 2018, 9:01:02 AM

to lavaan

I have about 3000 observations (but only 265 complete cases) for a cross-lagged panel SEM. The analysis is roughly as shown in the attached diagram, where A = AGE (continuous), B = sex (binary), X = emp (binary), Y = SA (binary), and Z = SE (4-category). All variables are observed. In words, we want to test our hypothesized cross-lagged relationships among emp, SA, and SE, with AGE and sex as confounders of these relationships. The observed variables have missing values (NA) at different time points.
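
For reference, the gap between the total and the complete-case count can be checked in base R along the following lines. This is only a sketch on a made-up toy data frame: the column names follow the ones used in this post, but the values are invented.

```r
# Toy stand-in for the real data; the values are made up for illustration.
toy <- data.frame(
  emp1 = c(1, 0, NA, 1, NA),
  emp2 = c(1, NA, NA, 0, NA),
  SA1  = c(0, 1, NA, 1, NA),
  SA2  = c(0, 1, NA, NA, NA)
)

# Rows with no NA in any column (the rows listwise deletion would keep)
n_complete <- sum(complete.cases(toy))

# Number of NAs per variable, to see which waves lose the most data
na_per_var <- colSums(is.na(toy))

print(n_complete)
print(na_per_var)
```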

1) I first tried the following code to fit a simple model using only emp and SA, without any confounders, declaring the categorical endogenous observed variables as ordered, as suggested in http://lavaan.ugent.be/tutorial/cat.html.

d <- read.csv('newdata.csv', header = TRUE)

# Convert the variables to factors
d$sex <- factor(d$sex)
d$emp1 <- ordered(d$emp1)
d$emp2 <- ordered(d$emp2)
d$emp5 <- ordered(d$emp5)
d$SA1 <- ordered(d$SA1)
d$SA2 <- ordered(d$SA2)
d$SA5 <- ordered(d$SA5)
d$SE1 <- ordered(d$SE1)
d$SE2 <- ordered(d$SE2)
d$SE5 <- ordered(d$SE5) 

library(lavaan)

clpm0 <- '
# synchronous covariances
SA1 ~~ emp1
SA2 ~~ emp2
SA5 ~~ emp5

# autoregressive + cross-lagged paths
SA2 ~ SA1 + emp1 
emp2 ~ emp1 + SA1 
SA5 ~ SA2 + emp2 
emp5 ~ emp2 + SA2 
'

fit0 <- sem(clpm0, data=d)

But this produces the following error message:

Error in tmp[cbind(REP$row[idx], REP$col[idx])] <- lavpartable$free[idx] : 
  NAs are not allowed in subscripted assignments

I also tried the following:

fit0 <- sem(clpm0, data=d, missing='listwise')

But got the same error message:

Error in tmp[cbind(REP$row[idx], REP$col[idx])] <- lavpartable$free[idx] : 
  NAs are not allowed in subscripted assignments

I'm pretty sure I'm making a mistake somewhere, but as a new lavaan user I cannot figure out what it is. Could you please tell me what I am doing wrong? Also, what does missing='pairwise' do, theoretically, in a case like this? When I use it, I get the following:

Error in tmp[cbind(REP$row[idx], REP$col[idx])] <- lavpartable$free[idx] : 
  NAs are not allowed in subscripted assignments
In addition: Warning message:
In lav_data_full(data = data, group = group, cluster = cluster,  :
  lavaan WARNING: some cases are empty and will be ignored:
  1 2 10 14 16 18 19 20 21 31 33 58 60 62 98 105 188 209 253 293 303 306 308 324 328 341 346 392 425 435 436 498 500 511 517 518 540 542 548 555 557 574 581 583 585 595 596 597
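
My reading of the warning above is that an "empty" case is a row in which every variable used in the model is NA; that interpretation is an assumption on my part. Under it, such rows could be listed with base R roughly as follows (toy data with made-up values; `model_vars` is just an illustrative name).

```r
# Toy stand-in for the real data; the values are made up for illustration.
toy <- data.frame(
  emp1 = c(1, 0, NA, 1, NA),
  emp2 = c(1, NA, NA, 0, NA),
  SA1  = c(0, 1, NA, 1, NA),
  SA2  = c(0, 1, NA, NA, NA),
  AGE  = c(30, 41, 52, 28, 39)  # not in the simple model, so ignored below
)

# Variables that appear in the model being fitted
model_vars <- c("emp1", "emp2", "SA1", "SA2")

# A row counts as empty when none of the model variables is observed
n_observed <- rowSums(!is.na(toy[, model_vars]))
empty_rows <- which(n_observed == 0)

print(empty_rows)
```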

2) Second, I tried to incorporate the confounders AGE and sex. Even if the problem in the first part is solved, I'm not sure whether the code should look like this:

clpm1 <- '
# synchronous covariances
SA1 ~~ emp1
SA2 ~~ emp2
SA5 ~~ emp5

# autoregressive + cross-lagged paths
SA2 ~ SA1 + emp1 + AGE + sex
emp2 ~ emp1 + SA1 + AGE + sex
SA5 ~ SA2 + emp2 + AGE + sex 
emp5 ~ emp2 + SA2 + AGE + sex
'

fit1 <- sem(clpm1, data=d, missing='listwise')

Then it shows the error:

Error in vnames(FLAT, type = "ov.x", ov.x.fatal = TRUE) : 
  lavaan ERROR: model syntax contains variance/covariance/intercept formulas
  involving (an) exogenous variable(s): [SA1 emp1];
  Please remove them and try again.
In addition: Warning message:
In lavaan::lavaan(model = clpm1, data = d, missing = "listwise",  :
  lavaan WARNING: syntax contains parameters involving exogenous covariates; switching to fixed.x = FALSE
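
Possibly relevant here: the lavaan categorical-data tutorial linked above says that a binary exogenous covariate should be recoded as a numeric 0/1 dummy variable rather than declared as a factor, whereas I converted sex with factor(). A minimal sketch of such a recoding, assuming sex is stored as "F"/"M" (the labels and the name `sex_num` are hypothetical):

```r
# Toy stand-in; the "F"/"M" labels are an assumption about how sex is coded.
toy <- data.frame(sex = c("F", "M", NA, "M", "F"))

# Numeric dummy: 1 for "M", 0 for "F"; NA stays NA
toy$sex_num <- as.integer(toy$sex == "M")

print(toy)
```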

3) The final question is: if I want to run the complete analysis for the model shown in the attached diagram, what changes does the following code need?

clpm2 <- '
# synchronous covariances
SA1 ~~ emp1
SA2 ~~ emp2
SA5 ~~ emp5

SE1 ~~ emp1
SE2 ~~ emp2
SE5 ~~ emp5

SE1 ~~ SA1
SE2 ~~ SA2
SE5 ~~ SA5

# autoregressive + cross-lagged paths
SE1 ~ SA1 + emp1 + AGE + sex

emp2 ~ emp1 + SA1 + AGE + sex 
SA2 ~ SA1 + emp1 + AGE + sex 
SE2 ~ SE1 + SA1 + emp1 + SA2 + emp2 + AGE + sex 

emp5 ~ emp2 + SA2 + AGE + sex 
SA5 ~ SA2 + emp2 + AGE + sex 
SE5 ~ SE2 + SA2 + emp2 + SA5 + emp5 + AGE + sex 
'

# fit the model 
fit2 <- sem(clpm2, data=d, missing='listwise')

Any suggestions would be highly appreciated. Thanks a lot for your help. Kind regards.