CFA with ordinal indicators: cfa() vs. lavaan()

Jeffrey Henry

unread,

May 12, 2014, 11:51:51 AM5/12/14

to lav...@googlegroups.com

Dear all,

I am trying to fit a multi-group CFA model with ordinal indicators. I would like to know if I got all lavaan specifications right for this kind of model. Therefore, I first tried my hand on the HS data. I fitted the basic multi-group model (ML estimator), then the multi-group CFA model with ordinal indicators, and then the multi-group LAVAAN model with ordinal indicators. When I fit the last two models (cfa and lavaan for ordinal indicators), I get exactly the same output (df, statistics, indexes, number of estimated/non-estimated parameters, values of the parameters, etc.). However, as a lavaan newbie, I do not want to take any chances and thus thought it would be wise to get confirmation (especially about potential redundancies in the syntax). Here are the syntaxes:

MULTI-GROUP MODEL (ML)
HS.model <-

'visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9'
fit <- cfa(HS.model, data = HolzingerSwineford1939, group = "school")

MULTIGROUP-MODEL (WLSMV) - CFA

#create 4 categories for each continuous variable
DF=HolzingerSwineford1939 #copy data frame into new object
#recode in categorical variables
vec=c("x1","x2","x3","x4","x5","x6","x7","x8","x9")#vector of variables names
MinusSD=function(x){mean(x)-sd(x)}#functions to simplify recoding
PlusSD=function(x){mean(x)+sd(x)}
#create 4 categories
library(car)
for (i in vec)
{ DF[,i]=recode(DF[,i],"lo:MinusSD(DF[,i])=0;MinusSD(DF[,i]):mean(DF[,i])=1;
mean(DF[,i]):PlusSD(DF[,i])=2;PlusSD(DF[,i]):hi=3")}
for (i in vec) {DF[,i]<-ordered(DF[,i])}#order
table(DF$x1);class(DF$x1)

HS.model <-

'visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9'
fit <- cfa(HS.model, data = DF, group = "school",estimator="WLSMV")
summary(fit)

fit <- cfa(HS.model, data = DF, group = "school",estimator="WLSMV",group.equal="thresholds")

MULTIGROUP-MODEL (WLSMV) - LAVAAN
HS.model <-
'visual =~ 1*x1 + x2 + x3
textual =~ 1*x4 + x5 + x6
speed =~ 1*x7 + x8 + x9
#cov.lv.
visual ~~ textual
visual ~~ speed
textual ~~ speed
#var.lv.
visual ~~ visual
speed ~~ speed
textual ~~ textual
#scales y
x1 ~*~ x1
x2 ~*~ x2
x3 ~*~ x3
x4 ~*~ x4
x5 ~*~ x5
x6 ~*~ x6
x7 ~*~ x7
x8 ~*~ x8
x9 ~*~ x9'
fit <- lavaan(HS.model,auto.th=TRUE,data=DF,group="school",estimator="WLSMV",group.equal="thresholds")

Thanks in advance,

Jeffrey Henry

yrosseel

unread,

May 16, 2014, 3:37:02 AM5/16/14

to lav...@googlegroups.com

On 05/12/2014 05:51 PM, Jeffrey Henry wrote:
> parameters, etc.). However, as a lavaan newbie, I do not want to take
> any chances and thus thought it would be wise to get confirmation

I think you did an excellent job. You can always check your model by
looking at the parameter table:

parTable(fit)

If two parameter tables are identical, the models are identical.

Yves.

Jeffrey Henry

unread,

May 18, 2014, 7:39:20 PM5/18/14

to lav...@googlegroups.com

Thanks for the great support! I was asking this question because I am running into an odd problem with my own data. I am testing a general-specific (or bifactor) model with one general factor + 3 specific factors, on a twin sample (grouping=MZ,DZ). Each item is assessed twice, for each twin (twin 1, twin 2) and is rated on a 4-point scale. When I treat the data as continuous (MLR), the model runs smoothly and provides results coherent with the literature and what is expected from twin models. However, when I treat the data as ordinal (estimator="WLSMV"), inter-factor covariances (i.e., twin1-twin2 covariance for each factor) are > 1. This is unexpected as I set the mean to 0 and variance to 1 for these factors so that, to my understanding, they should behave as correlations (which they do in the MLR model and other models I have tested). Also, the estimates are superior to one and the std.all are equal to the estimates. I was not sure whether there was a problem with my own data or the way I specified the bifactor model. Therefore, I took to the HS data again to sort it out. Successively, I

1) considered (x1-x9) as the variables for twin 1 , and created a second set of variables for twin 2 by adding random noise (x1b-x9b);

2) recoded all observed variables as categorical, and;

3) tested the bifactor model with all relevant specifications for interchangeable dyads (i.e., equal loadings, thresholds and scaling according to school group (taken as an equivalent for MZ:DZ), but also to member of each dyad (twin 1 - twin 2)) - see attached syntax. Note that the structure of correlations in the two school groups is not like MZ:DZ (where correlations twin1 twin 2 in MZ > DZ). But because here I fit only the phenotypic CFA without the genetic structure, I think that is ok for the example. I also obtain inter-factor covariance > 1 in the HS data. Is there something wrong with the specifications (see below)? Or, is it sth normal in threshold models? Or identification issue or estimation problem?

Thanks in advance,

Jeffrey Henry

SyntaxeBifactor.R

yrosseel

unread,

May 26, 2014, 5:34:51 AM5/26/14

to lav...@googlegroups.com

On 05/19/2014 01:39 AM, Jeffrey Henry wrote:
> Is there something wrong with the specifications (see below)?

I don't think so. At least not from a syntax perspective. I am not
familiar with the MZ/DZ stuff, but what I see here are latent variables
(eg visual and visual1) that are too similar to be distinguished. Hence
the correlation larger than one, which is typical symptom of this.

I would suggest asking help from those that are more familiar with the
genetic context. For example the authors of the NlsyLinks package.

Yves.

Reply all

Reply to author

Forward