same latent variable but different indicators for boys and girls

Ilhong Yun

Mar 10, 2018, 5:22:05 PM

to lavaan

Hi,

This may appear a foolish question, but I desperately need some guidance.

My model is to predict delinquency from pubertal development status.

Different indicators are used to measure pubertal development for boys and girls as shown below.

Is the code for my structural model correct if I fit a single SEM to the total sample, including both boys and girls?

#measurement model

puberty =~ armhair + facehair + voice   # boys' pubertal status

puberty =~ breast + body + menarche + selfreport   # girls' pubertal status

delinquency =~ delinquency1 + delinquency2 + delinquency3

#structural model

delinquency ~ puberty

Thanks,

Ilhong

Christopher Bratt

Mar 11, 2018, 5:13:02 AM

You can develop two separate models/analyses, and that is the approach I would recommend.

Now you have one latent variable with both boys' and girls' items as indicators. If you combine this into one analysis, you'll have a lot of missing data in these items (about 50% for each, presumably). One might try experimenting with such a single model (adding gender as a covariate in the model), but I would not recommend that you use this approach.
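
The missing-data pattern described above can be sketched in base R. This is only an illustration: the sample size and values are made up, and `make_group` is a hypothetical helper, not something from the thread.

```r
set.seed(1)
n <- 4  # hypothetical: 4 boys and 4 girls, made-up values

boy_items  <- c("armhair", "facehair", "voice")
girl_items <- c("breast", "body", "menarche", "selfreport")
all_items  <- c(boy_items, girl_items)

# Build one group's rows: observed items get values, all other items are NA
make_group <- function(sex, observed) {
  d <- data.frame(sex = rep(sex, n))
  for (item in all_items) {
    d[[item]] <- if (item %in% observed) rnorm(n) else rep(NA_real_, n)
  }
  d
}

stacked <- rbind(make_group("boy", boy_items), make_group("girl", girl_items))

# Proportion missing per puberty item: 0.5 for every item with equal group sizes
miss <- colMeans(is.na(stacked[all_items]))
print(miss)

# No respondent has both a boys' item and a girls' item observed
has_boy  <- rowSums(!is.na(stacked[boy_items]))  > 0
has_girl <- rowSums(!is.na(stacked[girl_items])) > 0
joint <- any(has_boy & has_girl)
print(joint)  # FALSE
```

Beyond the 50% missingness per item, the last check shows that boys' and girls' items are never observed together, so the data carry no information about how the two item sets covary.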

A group-based model makes less sense (one model, two groups).

Ilhong Yun

Mar 11, 2018, 7:46:18 AM

Thanks for the reply, Christopher.

You are suggesting that I divide the total sample into two subsamples (boys and girls) and run separate SEM models. In fact, that is what I have been doing, and I end up with two different regression coefficients.
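
For reference, a minimal sketch of the separate-samples approach, assuming lavaan is installed. The data are simulated stand-ins (the helper `sim_group`, the sample sizes, and the true slope of 0.4 are all hypothetical); only the variable names come from the thread.

```r
library(lavaan)

# Simulated stand-in data (hypothetical; replace with your own data frames)
set.seed(123)
sim_group <- function(n, items, slope) {
  puberty <- rnorm(n)
  delinq  <- slope * puberty + rnorm(n)
  out <- data.frame(
    delinquency1 = delinq + rnorm(n),
    delinquency2 = delinq + rnorm(n),
    delinquency3 = delinq + rnorm(n)
  )
  for (item in items) out[[item]] <- puberty + rnorm(n)
  out
}
boys  <- sim_group(500, c("armhair", "facehair", "voice"), slope = 0.4)
girls <- sim_group(500, c("breast", "body", "menarche", "selfreport"), slope = 0.4)

# One model per sex, each with its own puberty indicators
mod.boys <- '
  puberty     =~ armhair + facehair + voice
  delinquency =~ delinquency1 + delinquency2 + delinquency3
  delinquency ~ puberty
'
mod.girls <- '
  puberty     =~ breast + body + menarche + selfreport
  delinquency =~ delinquency1 + delinquency2 + delinquency3
  delinquency ~ puberty
'
fit.boys  <- sem(mod.boys,  data = boys,  std.lv = TRUE)
fit.girls <- sem(mod.girls, data = girls, std.lv = TRUE)

# One slope estimate per sex
slopes <- c(boys  = unname(coef(fit.boys)["delinquency~puberty"]),
            girls = unname(coef(fit.girls)["delinquency~puberty"]))
print(slopes)
```

Two separate fits give two slopes but no formal test of whether they differ, and the delinquency factor is scaled independently in each sample.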

However, the reason I asked the original question was that I keep encountering published articles that used the full sample, generating a single regression coefficient (delinquency ~ puberty) for the entire sample (without dividing into boys and girls).

I experimented with a single model, as you mentioned, and the coefficient is not very different from those from the separate models.

Thanks for your thoughts,

Ilhong

Terrence Jorgensen

Mar 11, 2018, 9:48:39 AM

> However, the reason I asked the original question was that I keep encountering published articles that used the full sample, generating a single regression coefficient (delinquency ~ puberty) for the entire sample (without dividing into boys and girls).

In multiple-group models, you can use different variables in different groups:

https://groups.google.com/d/msg/lavaan/GdOq3ymaE70/bAyR1RxQDwAJ

## unconstrained, configural invariance
## (no apostrophes inside the model: the string is delimited by single quotes)
mod.config <- '
group: boy # or group: 1, use whatever your variable labels are

# measurement model
puberty.b =~ armhair + facehair + voice   # pubertal status (boys)
delinquency =~ delinquency1 + delinquency2 + delinquency3
# structural model
delinquency ~ boy.slope*puberty.b

group: girl # or group: 2, use whatever your variable labels are

# measurement model
puberty.g =~ breast + body + menarche + selfreport   # pubertal status (girls)
delinquency =~ delinquency1 + delinquency2 + delinquency3
# structural model
delinquency ~ girl.slope*puberty.g
'

## add metric-invariance constraints (same labels L1-L3 in both groups)
## (no apostrophes inside the model: the string is delimited by single quotes)
mod.metric <- '
group: boy # or group: 1, use whatever your variable labels are

# measurement model
puberty.b =~ armhair + facehair + voice   # pubertal status (boys)
delinquency =~ L1*delinquency1 + L2*delinquency2 + L3*delinquency3
# free the (residual) variance in this group: with equal loadings and the
# variance fixed to 1 in the other group, fixing it here is no longer
# needed for identification
delinquency ~~ NA*delinquency
# structural model
delinquency ~ boy.slope*puberty.b

group: girl # or group: 2, use whatever your variable labels are

# measurement model
puberty.g =~ breast + body + menarche + selfreport   # pubertal status (girls)
delinquency =~ L1*delinquency1 + L2*delinquency2 + L3*delinquency3
# structural model
delinquency ~ girl.slope*puberty.g
'

## specify equality constraint to test 
sameSlope <- ' girl.slope == boy.slope '

## fit models
fit.config <- cfa(mod.config, data = myData, std.lv = TRUE, group = "sex")
fit.metric <- cfa(mod.metric, data = myData, std.lv = TRUE, group = "sex")

## does H0 of metric invariance hold?
anova(fit.config, fit.metric)

## If so, you can test the H0 that the slopes are equal
fit.sameSlope <- update(fit.metric, constraints = sameSlope)
anova(fit.metric, fit.sameSlope)

> I experimented with a single model, as you mentioned, and the coefficient is not very different from those from the separate models.

If you used missing = "FIML" to estimate a single-group model, that assumes the slope is equal across sexes without being able to test that assumption because lavaan cannot estimate a latent interaction between puberty and sex. Although you can use product indicators to do so, it would get complicated, so I would recommend the multi-group approach.
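
To make the equal-slope assumption concrete, here is a sketch in notation that does not appear in the thread: $\beta$ is the structural slope, $\zeta$ the residual, and the subscripts $b$ and $g$ stand for boys and girls. Intercepts are omitted for simplicity.

```latex
\begin{align*}
\text{single group:} \quad
  & \text{delinquency}_i = \beta\,\text{puberty}_i + \zeta_i \\
\text{two groups:} \quad
  & \text{delinquency}_i = \beta_b\,\text{puberty}_i + \zeta_i \quad (\text{boys}), \\
  & \text{delinquency}_i = \beta_g\,\text{puberty}_i + \zeta_i \quad (\text{girls}) \\
H_0: \quad
  & \beta_b = \beta_g
\end{align*}
```

The single-group model imposes $\beta_b = \beta_g = \beta$ from the start. The multi-group model estimates both slopes, and comparing `fit.metric` with `fit.sameSlope` tests $H_0$ with a chi-square difference on 1 degree of freedom, because `sameSlope` adds exactly one constraint.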

Terrence D. Jorgensen

Postdoctoral Researcher, Methods and Statistics

Research Institute for Child Development and Education, the University of Amsterdam

UvA web page: http://www.uva.nl/profile/t.d.jorgensen

Ilhong Yun

Mar 12, 2018, 7:28:02 AM

This is wonderful!

This is exactly what I wanted.

I am so grateful to you, Dr. Jorgensen.

Ilhong
