Using lavPredict to compare group means

Erik O'Donnell

Jun 3, 2021, 11:57:03 AM

to lavaan

Hi all :-)

I have a CFA measurement model that I have tested for measurement invariance (MI) between two groups.

Configural, loading, and intercept invariance hold, but the latent means are not equal across groups.

To compare means, I figure I can use lavPredict to get the factor scores, and then use the grouping variable to e.g. do a t-test or calculate Cohen's d. Basically, I would treat the scores like any old continuous variable that comes with a categorical grouping variable.
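A minimal sketch of the workflow described above, not the poster's actual model. It assumes a single factor from lavaan's built-in HolzingerSwineford1939 data with "school" as the grouping variable; the factor name, indicators, and data set are stand-ins chosen only for illustration.

```r
library(lavaan)

# Assumption: one factor from the built-in HolzingerSwineford1939 data,
# grouped by "school" (stand-ins for the real seven-factor model).
model <- ' visual =~ x1 + x2 + x3 '

# Multigroup CFA with loadings and intercepts constrained equal across groups
fit <- cfa(model, data = HolzingerSwineford1939, group = "school",
           group.equal = c("loadings", "intercepts"))

# For a multigroup model, lavPredict() returns one matrix of scores per group
fs     <- lavPredict(fit)
labels <- lavInspect(fit, "group.label")

# Stack the per-group scores into one data frame with a grouping column
scores <- do.call(rbind, lapply(seq_along(fs), function(g) {
  data.frame(school = labels[g], visual = as.numeric(fs[[g]][, 1]))
}))

# Treat the scores as an ordinary continuous variable
tt <- t.test(visual ~ school, data = scores)

# Cohen's d with a pooled standard deviation
# (tapply orders groups alphabetically, so this is group 2 minus group 1
#  in that alphabetical order)
m  <- tapply(scores$visual, scores$school, mean)
v  <- tapply(scores$visual, scores$school, var)
n  <- tapply(scores$visual, scores$school, length)
sp <- sqrt(sum((n - 1) * v) / (sum(n) - 2))
d  <- unname((m[2] - m[1]) / sp)

print(tt)
print(d)
```

Note that the t-test here treats the predicted scores as if they were observed without error.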

My thinking is that this would *not* be meaningful if I had not first demonstrated configural, loadings and intercepts MI. But since these are invariant, I can go ahead.

Is there any reason I can't or should not compare means in this way?

Best regards,

Erik O'Donnell

car...@web.de

Jun 3, 2021, 12:21:06 PM

to lav...@googlegroups.com

Well, you could just turn the question around. Why would this approach be better than comparing the mean differences using multigroup CFA?


Erik O'Donnell

Jun 4, 2021, 8:50:24 AM

to lavaan

"Well, you could just turn the question around. Why would this approach be better than comparing the mean differences using multigroup CFA?"

For this project, it saves me a bit of time. I have seven latent factors and several groups to test for MI.

When I use lavPredict to get the scores, I can e.g. easily create plots to compare means of all seven groups, split up the means into subgroups and compare all seven latent factors, etc.

It's generally very easy to plug predicted scores into other R-packages for analysis :-)

However, e.g. Putnick and Bornstein (2016, p. 77) write "One common way to do this is to set the latent factor mean to 0 in one group and allow it to vary in the second group. The estimated mean parameter in the second group represents the difference in latent means across groups. For example, if the latent factor variance is set to 1.0 and the standardized mean of the parental control latent factor is estimated at 1.00, p < .05, in the United States, then control in the United States is one standard deviation higher than control in China."

This seems a bit "hacky" to me, to be honest, like something you would do if you couldn't easily get the predicted scores, which IIRC was the case with older software.

But maybe setting the latent factor mean to 0 in one group and allowing it to vary in the second group encodes some different assumptions or produces different means?

It's definitely more work for me to do this, which I would like to avoid unless there's a compelling reason :'-)

Mauricio Garnier-Villarreal

Jun 4, 2021, 5:52:00 PM

to lavaan

The mean of the factor is always arbitrary and depends on the identification method you use. If you use the marker-variable method and have a "free mean", it is not more meaningful. I actually prefer the fixed-variance method, as it puts the factor on a standardized metric.

You can do multiple comparisons in the SEM framework, and that is the recommended method, because when you export one set of predicted factor scores, their means will differ to some degree from the means estimated in the SEM. This is because of factor indeterminacy. Some ways around it are to use plausible values (with the semTools package) or the Croon correction (or something similar), but I don't know of an easy way to apply the corrections.

So, in short, if you use exported factor scores, you will be assuming they are less variable than they really are, and will report results with some degree of difference from the estimated means in the SEM.

Erik O'Donnell

Oct 22, 2021, 1:43:40 PM

to lav...@googlegroups.com

I've come to agree that using lavPredict is not optimal :-)

Do any of you know of any tutorial that explicitly shows the steps of how to calculate mean differences across several groups in lavaan?

For example, in another thread (https://groups.google.com/g/lavaan/c/Tk9wBUrc0II/m/_tDIYs00BgAJ) Terrence Jorgensen writes:

"...If latent variances are fixed to 1 in both groups, then differences between estimated latent means are already standardized mean differences. Assuming the latent mean is fixed to 0 in the first group, any other group's estimated latent mean is the standardized mean difference of that group from Group 1 (because any mean minus zero is the same number)."

To do this, do I need to add "MyLatentVariable ~~ 1*MyLatentVariable" and "MyLatentVariable ~ 0*1" to the model, or is this something measEq.syntax can do for me?

Would this give me a different answer than if I were to recode my grouping variable as dummy variables, and fit a SEM model with the dummy variables pointing to the latent variable? ("MyLatentVariable ~ DummyGroupA + DummyGroupB + ...")
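A hedged sketch of the dummy-variable idea in the question above, again using lavaan's built-in HolzingerSwineford1939 data as a stand-in. The dummy name `grantwhite` is hypothetical. Because this fits a single-group model, it implicitly assumes that loadings, intercepts, and residual variances are the same in both groups, which is a stronger assumption than the multigroup invariance tests establish.

```r
library(lavaan)

# Assumption: built-in example data; "grantwhite" is a hypothetical 0/1 dummy
dat <- HolzingerSwineford1939
dat$grantwhite <- as.numeric(dat$school == "Grant-White")

model <- '
  visual =~ x1 + x2 + x3
  visual ~ grantwhite      # group dummy predicts the latent variable
'

# std.lv = TRUE fixes the (residual) variance of the factor to 1, so the
# slope is the group difference in units of the within-group latent SD
fit <- sem(model, data = dat, std.lv = TRUE)

pe    <- parameterEstimates(fit)
slope <- pe[pe$lhs == "visual" & pe$op == "~" & pe$rhs == "grantwhite", ]
print(slope[, c("lhs", "op", "rhs", "est", "se", "pvalue")])
```

With more than two groups, one dummy per non-reference group would be added on the right-hand side of the regression.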

I would love to compare results from the three approaches 1) lavPredict, 2) SEM with dummy variables and 3) fixing variances and means, but it's not clear to me how to do 3) in practice.
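One way approach 3) might be written by hand, as a sketch under stated assumptions: a single factor from the built-in HolzingerSwineford1939 data, two groups, and the `c()` modifier in lavaan model syntax to give each group its own fixed or free value. Fixing the latent variance to 1 in every group adds an equal-variance assumption on top of loading and intercept invariance, so it is worth testing before relying on the result.

```r
library(lavaan)

# Assumption: stand-in model on the built-in example data, two groups
model <- '
  visual =~ NA*x1 + x2 + x3     # free the first loading (no marker variable)
  visual ~~ c(1, 1)*visual      # latent variance fixed to 1 in both groups
  visual ~  c(0, NA)*1          # latent mean: 0 in group 1, free in group 2
'

fit <- cfa(model, data = HolzingerSwineford1939, group = "school",
           group.equal = c("loadings", "intercepts"))

# The estimated latent mean in group 2 is its difference from group 1,
# expressed in latent standard deviation units
pe        <- parameterEstimates(fit)
lat_means <- pe[pe$lhs == "visual" & pe$op == "~1", ]
print(lat_means[, c("lhs", "op", "group", "est", "se", "pvalue")])
```

The group numbering follows the order in which the groups appear in the data; `lavInspect(fit, "group.label")` shows which label belongs to which number.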

Mega thanks to anyone who can clarify this!

Best regards,

Erik


Terrence Jorgensen

Oct 28, 2021, 5:54:43 AM

to lavaan

"Do any of you know of any tutorial that explicitly shows the steps of how to calculate mean differences across several groups in lavaan?"

Are you familiar with the emmeans package? Mattan Ben-Shachar recently contributed lavaan methods to the semTools package. See examples on the help page:

?lavaan2emmeans

Terrence D. Jorgensen

Assistant Professor, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

http://www.uva.nl/profile/t.d.jorgensen
