CFA - clustered standard errors

Aaron Charlton

Nov 3, 2015, 6:56:06 PM

to lavaan

I am doing a scale development project with lavaan. I ran a CFA, but the problem is that I had 43 individuals answer the same questions four times, once for each set of stimuli. Is there a way I can account for this in lavaan? I was told that I need to "cluster the standard errors", clustering together each individual's responses. Does anyone know how to do that? Also, should I use the group argument of the cfa function? Thanks. -Aaron

## confirmatory factor analysis
library(lavaan)

model1 <- ' history    =~ HIS1 + HIS2 + HIS3 + HIS4
            motivation =~ MOT1 + MOT2 + MOT3 + MOT4
            emotion    =~ EMO1 + EMO2 + EMO3 + EMO4
            fit        =~ FIT1 + FIT2 + FIT3 + FIT4 + FIT5 '

fit2 <- cfa(model1, data=study1)

Stas Kolenikov

Nov 3, 2015, 10:25:32 PM

to lav...@googlegroups.com

Probably something like

library(survey)        # thanks to Thomas Lumley
library(lavaan.survey) # thanks to Daniel Oberski

# one cluster per person; probs=~1 gives every row the same weight
person.as.cluster <- svydesign(ids=~person, probs=~1, data=study1)
fit2.clustered <- lavaan.survey(fit2, person.as.cluster, estimator="MLM")

assuming that your data are in semi-long format with each row representing one (out of the four) observations.
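
For readers on a newer lavaan (0.6 or later), a similar adjustment may be possible without lavaan.survey, through the `cluster` argument of `cfa()`. The sketch below is not from this thread: it runs on simulated data, and the names `dat`, `ID`, `X1` to `X4` and the one-factor model are made up for illustration. It assumes long format, with one row per person-by-stimulus observation.

```r
library(lavaan)  # assumes lavaan >= 0.6, which added the `cluster` argument

set.seed(2015)
n_person <- 43   # respondents (hypothetical, mirrors the question)
n_occ    <- 4    # stimuli rated by each respondent

# Long format: one row per person-by-stimulus observation.
id  <- rep(seq_len(n_person), each = n_occ)
# A person-level effect makes rows from the same person correlated.
eta <- rnorm(n_person)[id] + rnorm(length(id))

dat <- data.frame(
  ID = id,
  X1 = eta + rnorm(length(id)),
  X2 = 0.9 * eta + rnorm(length(id)),
  X3 = 0.8 * eta + rnorm(length(id)),
  X4 = 0.7 * eta + rnorm(length(id))
)

model <- ' f =~ X1 + X2 + X3 + X4 '

# Naive fit: treats all rows as independent.
fit_naive <- cfa(model, data = dat, meanstructure = TRUE)

# Cluster-robust fit: same estimator, standard errors adjusted for ID.
fit_cluster <- cfa(model, data = dat, meanstructure = TRUE, cluster = "ID")

pe_naive   <- parameterEstimates(fit_naive)
pe_cluster <- parameterEstimates(fit_cluster)

# Loadings: same point estimates, different standard errors.
keep <- pe_naive$op == "=~"
comparison <- data.frame(
  rhs        = pe_naive$rhs[keep],
  est        = pe_naive$est[keep],
  se_naive   = pe_naive$se[keep],
  se_cluster = pe_cluster$se[keep]
)
print(comparison)
```

The `est` column should agree between the two fits, while `se_cluster` reflects the dependence within each person. Whether this route or lavaan.survey suits a given study is a judgment call; this only shows the mechanics.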

Alternatively, you could consider recasting your model as MTMM with the multiple methods being the multiple occasions if you think that the individuals learn in about the same way.

-- Stas Kolenikov, PhD, PStat (ASA, SSC)
-- Principal Survey Scientist, Abt SRBI
-- Education Officer, Survey Research Methods Section of the American Statistical Association
-- Opinions stated in this email are mine only, and do not reflect the position of my employer
-- http://stas.kolenikov.name

Aaron Charlton

Nov 4, 2015, 4:47:18 PM

to lavaan

Stas,

Thank you so much for the response. I have actually been trying to use this package, but every time I run the lavaan.survey function, I get this error message:

Warning message:

'lavaan::duplicationMatrix' is deprecated.

Use 'lav_matrix_duplication' instead.

See help("Deprecated") and help("lavaan-deprecated").

Following the error message, it fails to do any clustering. Any ideas why this might be happening? Thanks.

Stas Kolenikov

Nov 4, 2015, 5:17:20 PM

to lav...@googlegroups.com

That would be a question to Daniel Oberski, and something for him to fix. (Duplication matrices are used to convert different vectorizations of a symmetric matrix, and that's something you have to process when dealing with the moment matrices.)
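
To make the remark about duplication matrices concrete, here is a small base-R sketch. It builds the duplication matrix by hand rather than calling lavaan, so the helper `duplication_matrix()` and the example matrix `S` are illustrative names, not part of any package. The point is only that the matrix maps the half-vectorization vech(S) (lower triangle, column by column) to the full vectorization vec(S).

```r
# Build D_n such that D_n %*% vech(S) equals vec(S)
# for any symmetric n x n matrix S. Hypothetical helper, base R only.
duplication_matrix <- function(n) {
  # idx[i, j] holds the position of element (i, j) inside vech(S)
  idx <- matrix(0L, n, n)
  idx[lower.tri(idx, diag = TRUE)] <- seq_len(n * (n + 1) / 2)
  idx[upper.tri(idx)] <- t(idx)[upper.tri(idx)]
  # one row per element of vec(S), one column per element of vech(S)
  D <- matrix(0, n * n, n * (n + 1) / 2)
  D[cbind(seq_len(n * n), as.vector(idx))] <- 1
  D
}

# A small symmetric matrix standing in for a covariance matrix
S <- matrix(c(4, 1, 2,
              1, 5, 3,
              2, 3, 6), nrow = 3, byrow = TRUE)

vech_S <- S[lower.tri(S, diag = TRUE)]  # 6 unique elements
vec_S  <- as.vector(S)                  # all 9 elements, column-major

D3 <- duplication_matrix(3)
reconstructed <- as.vector(D3 %*% vech_S)
print(all.equal(reconstructed, vec_S))
```

In lavaan itself, the function named in the warning, `lav_matrix_duplication`, serves this purpose.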

yrosseel

Nov 5, 2015, 3:04:20 AM

to lav...@googlegroups.com

On 11/04/2015 10:47 PM, Aaron Charlton wrote:
> Warning message:
> 'lavaan::duplicationMatrix' is deprecated.
> Use 'lav_matrix_duplication' instead.
> See help("Deprecated") and help("lavaan-deprecated").

This is a (completely harmless) warning message, not an error!

> Following the error message, it fails to do any clustering.

What happens? Do you get another error message? Does the model fail to
converge? Please show us the complete R script and the output, and we
may be able to help you better.

Yves.

Aaron Charlton

Nov 5, 2015, 2:06:03 PM

to lavaan

I seem to get exactly the same model with and without clustering.

Here is my code:

> str(study1)
'data.frame': 176 obs. of 21 variables:
 $ ID      : chr "951138372" "951138372" "951138372" "951138372" ...
 $ stimulus: Factor w/ 4 levels "adidas_wc","mcD_olymp",..: 1 2 4 3 1 2 4 3 1 2 ...
 $ HIS1    : int 7 5 5 6 7 3 2 2 7 6 ...
 $ HIS2    : int 6 3 3 5 7 3 2 2 7 6 ...
 $ HIS3    : int 6 5 5 5 7 3 2 2 7 7 ...
 $ HIS4    : int 6 3 2 5 7 3 2 2 7 4 ...
 $ MOT1    : int 6 5 6 6 7 3 3 6 7 4 ...
 $ MOT2    : int 7 7 6 7 7 3 3 5 7 7 ...
 $ MOT3    : int 5 5 5 5 1 5 3 3 4 4 ...
 $ MOT4    : int 6 6 6 6 7 3 5 5 7 5 ...
 $ EMO1    : int 6 5 5 5 7 2 3 3 7 2 ...
 $ EMO2    : int 6 5 5 5 7 2 4 3 5 4 ...
 $ EMO3    : int 5 6 6 5 7 4 3 3 5 6 ...
 $ EMO4    : int 6 5 5 5 7 3 3 4 7 6 ...
 $ FIT1    : int 7 2 2 6 7 2 3 3 7 1 ...
 $ FIT2    : int 7 2 2 6 7 2 3 3 7 1 ...
 $ FIT3    : int 7 2 2 7 7 2 3 4 7 3 ...
 $ FIT4    : int 6 3 3 5 7 2 3 3 7 1 ...
 $ FIT5    : int 7 2 2 6 7 2 3 4 7 1 ...
 $ OUT1    : int 6 3 3 6 7 2 3 5 7 4 ...
 $ OUT2    : int 7 7 2 5 5 3 3 4 7 7 ...

> ## confirmatory factor analysis

> library(lavaan)

> model1 <- ' history =~ HIS1 + HIS2 + HIS3 + HIS4

+ motivation =~ MOT1 + MOT2 + MOT3 + MOT4

+ emotion =~ EMO1 + EMO2 + EMO3 + EMO4'

> # fit =~ FIT1 + FIT2 + FIT3 + FIT4 + FIT5'

> fit2 <- cfa(model1, data=study1)

> library(survey)

> library(lavaan.survey)

> person.as.cluster <- svydesign(ids=~ID, probs=~1, data=study1)

> fit2.clustered <- lavaan.survey(fit2, person.as.cluster, estimator="MLM")

Warning message:

'lavaan::duplicationMatrix' is deprecated.

Use 'lav_matrix_duplication' instead.

See help("Deprecated") and help("lavaan-deprecated").

> summary(fit2)

lavaan (0.5-19) converged normally after 41 iterations

Number of observations 176

Estimator ML

Minimum Function Test Statistic 212.844

Degrees of freedom 51

P-value (Chi-square) 0.000

Parameter Estimates:

Information Expected

Standard Errors Standard

Latent Variables:

Estimate Std.Err Z-value P(>|z|)

history =~

HIS1 1.000

HIS2 0.952 0.034 28.164 0.000

HIS3 0.962 0.034 28.420 0.000

HIS4 0.902 0.039 22.969 0.000

motivation =~

MOT1 1.000

MOT2 0.603 0.059 10.215 0.000

MOT3 -0.346 0.093 -3.722 0.000

MOT4 0.697 0.054 12.815 0.000

emotion =~

EMO1 1.000

EMO2 1.054 0.066 15.970 0.000

EMO3 -0.410 0.078 -5.230 0.000

EMO4 0.771 0.071 10.887 0.000

Covariances:

Estimate Std.Err Z-value P(>|z|)

history ~~

motivation 1.382 0.205 6.755 0.000

emotion 1.313 0.200 6.553 0.000

motivation ~~

emotion 1.575 0.209 7.525 0.000

Variances:

Estimate Std.Err Z-value P(>|z|)

HIS1 0.121 0.029 4.122 0.000

HIS2 0.338 0.045 7.548 0.000

HIS3 0.337 0.045 7.492 0.000

HIS4 0.520 0.062 8.369 0.000

MOT1 0.674 0.101 6.701 0.000

MOT2 0.650 0.077 8.432 0.000

MOT3 2.218 0.239 9.292 0.000

MOT4 0.404 0.056 7.255 0.000

EMO1 0.625 0.089 6.995 0.000

EMO2 0.377 0.074 5.078 0.000

EMO3 1.614 0.175 9.238 0.000

EMO4 1.002 0.117 8.547 0.000

history 2.409 0.271 8.900 0.000

motivation 1.681 0.251 6.708 0.000

emotion 1.737 0.250 6.948 0.000

> summary(fit2.clustered)

lavaan (0.5-19) converged normally after 41 iterations

Number of observations 176

Estimator ML Robust

Minimum Function Test Statistic 212.844 138.145

Degrees of freedom 51 51

P-value (Chi-square) 0.000 0.000

Scaling correction factor 1.541

for the Satorra-Bentler correction

Parameter Estimates:

Information Expected

Standard Errors Robust.sem

Latent Variables:

Estimate Std.Err Z-value P(>|z|)

history =~

HIS1 1.000

HIS2 0.952 0.024 39.953 0.000

HIS3 0.962 0.027 35.710 0.000

HIS4 0.902 0.033 27.354 0.000

motivation =~

MOT1 1.000

MOT2 0.603 0.078 7.781 0.000

MOT3 -0.346 0.169 -2.044 0.041

MOT4 0.697 0.053 13.116 0.000

emotion =~

EMO1 1.000

EMO2 1.054 0.054 19.382 0.000

EMO3 -0.410 0.139 -2.956 0.003

EMO4 0.771 0.078 9.941 0.000

Covariances:

Estimate Std.Err Z-value P(>|z|)

history ~~

motivation 1.382 0.246 5.614 0.000

emotion 1.313 0.199 6.586 0.000

motivation ~~

emotion 1.575 0.213 7.395 0.000

Intercepts:

Estimate Std.Err Z-value P(>|z|)

HIS1 4.438 0.132 33.497 0.000

HIS2 4.278 0.142 30.192 0.000

HIS3 4.517 0.128 35.302 0.000

HIS4 4.205 0.128 32.868 0.000

MOT1 4.642 0.126 36.806 0.000

MOT2 5.608 0.112 50.256 0.000

MOT3 4.085 0.185 22.085 0.000

MOT4 5.188 0.103 50.450 0.000

EMO1 4.614 0.104 44.270 0.000

EMO2 4.710 0.134 35.235 0.000

EMO3 4.438 0.116 38.386 0.000

EMO4 4.506 0.113 39.945 0.000

history 0.000

motivation 0.000

emotion 0.000

Variances:

Estimate Std.Err Z-value P(>|z|)

HIS1 0.121 0.039 3.099 0.002

HIS2 0.338 0.123 2.735 0.006

HIS3 0.337 0.060 5.636 0.000

HIS4 0.520 0.099 5.238 0.000

MOT1 0.674 0.128 5.271 0.000

MOT2 0.650 0.093 6.999 0.000

MOT3 2.218 0.359 6.184 0.000

MOT4 0.404 0.067 6.074 0.000

EMO1 0.625 0.150 4.155 0.000

EMO2 0.377 0.094 4.007 0.000

EMO3 1.614 0.235 6.855 0.000

EMO4 1.002 0.223 4.499 0.000

history 2.409 0.246 9.785 0.000

motivation 1.681 0.273 6.159 0.000

emotion 1.737 0.266 6.527 0.000

> library(semTools)
> reliability(fit2)
         history motivation   emotion     total
alpha  0.9634943  0.3767812 0.4649107 0.8679822
omega  0.9638765  0.6193784 0.7369433 0.9261453
omega2 0.9638765  0.6193784 0.7369433 0.9261453
omega3 0.9637533  0.6348940 0.7667986 0.8763399
avevar 0.8697854  0.4561635 0.5797538 0.6580224

> reliability(fit2.clustered)
         history motivation   emotion     total
alpha  0.9634943  0.3767812 0.4649107 0.8679822
omega  0.9638765  0.6193784 0.7369433 0.9261453
omega2 0.9638765  0.6193784 0.7369433 0.9261453
omega3 0.9637533  0.6348940 0.7667986 0.8763399
avevar 0.8697854  0.4561635 0.5797538 0.6580224
>

Stas Kolenikov

Nov 5, 2015, 3:44:35 PM

to lav...@googlegroups.com

So are HIS1, HIS2, HIS3, HIS4 your four occasions? If they are, and each individual is just one line of the data, then you don't have anything to cluster for. If you had scales administered four times, and you wanted to build a model where the latent construct is your latent variable, and items are the observed variables, then the idea of clustering could have been entertained to model the dependencies between the four occasions over time. As it stands, however, you have a CFA with no structure imposed; a very reasonable question to ask is whether you have measurement invariance between the occasions -- but it looks like motivation and emotion will likely fail that due to something odd happening on the third occasion.

Aaron Charlton

Nov 5, 2015, 4:12:44 PM

to lavaan

Hi, Stas.

Not exactly. HIS1-HIS4 are four different scale items that hopefully measure the latent variable 'history'. My occasions are contained in the 'stimulus' variable, a factor with four levels. I have the data in long form, so there are 4 observations for each person (ID) -- one for each of the four stimuli. Thanks. -Aaron
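
For anyone wanting to confirm that their data really have this long layout before clustering on the person identifier, a quick base-R check could look like the following. The data frame `demo_long` and its values are invented for illustration; only the column names `ID` and `stimulus` follow the description above.

```r
# Hypothetical long-format data: 3 people, 4 stimuli each
demo_long <- data.frame(
  ID       = rep(c("p1", "p2", "p3"), each = 4),
  stimulus = factor(rep(c("s1", "s2", "s3", "s4"), times = 3))
)

# Rows per person: every entry should be 4 in a complete design
rows_per_id <- table(demo_long$ID)
print(rows_per_id)

# Cross-tabulation: each person should see each stimulus exactly once
design <- table(demo_long$ID, demo_long$stimulus)
complete_design <- all(design == 1)
print(complete_design)
```

If some person has fewer or more than four rows, the cross-tabulation shows which stimulus is missing or duplicated.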

Stas Kolenikov

Nov 5, 2015, 4:16:40 PM

to lav...@googlegroups.com

Oh, I see. Well, your models aren't exactly the same with and without clustering. With clustering, the standard errors are different, as they should be; the point estimates should be the same. Also, with clustering, you should forget about the plain ML fit test statistic reported (the 212 number), and only look at the "robust" statistic reported (the 138 number).
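
These points can be checked against the two summaries posted earlier in the thread. The sketch below only redoes arithmetic on printed values copied from that output (three loadings picked as examples); the variable names are made up for illustration, and nothing here refits the model.

```r
# Test statistics as printed in summary(fit2.clustered)
chisq_ml     <- 212.844  # column "ML"
chisq_robust <- 138.145  # column "Robust"

# The robust statistic is the ML statistic divided by the scaling
# correction factor, so the ratio should recover that factor.
scaling <- chisq_ml / chisq_robust
print(round(scaling, 3))

# Standard errors of three loadings, copied from the two summaries
se_naive     <- c(HIS2 = 0.034, MOT3 = 0.093, EMO3 = 0.078)
se_clustered <- c(HIS2 = 0.024, MOT3 = 0.169, EMO3 = 0.139)

# Ratio > 1: clustering widened the standard error; < 1: narrowed it
se_ratio <- se_clustered / se_naive
print(round(se_ratio, 2))
```

The ratio of the two test statistics comes out near the 1.541 scaling factor printed in the output, and the standard errors for MOT3 and EMO3 grow by roughly 80% under clustering while the one for HIS2 shrinks.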

Aaron Charlton

Nov 5, 2015, 4:26:52 PM

to lavaan

Stas,

Thanks! Is there a resource you would recommend on interpreting the clustered output? -Aaron

Stas Kolenikov

Nov 5, 2015, 5:57:32 PM

to lav...@googlegroups.com

Maybe http://www-personal.umich.edu/~jdinardo/clusterextended.pdf
