same latent variable but different indicators for boys and girls


Ilhong Yun

Mar 10, 2018, 5:22:05 PM
to lavaan
Hi, 

This may seem like a foolish question, but I badly need some guidance.
My model predicts delinquency from pubertal development status.
Different indicators are used to measure pubertal development for boys and girls, as shown below.

Is the code for my structural model correct when I run an SEM on the total sample, including both boys and girls?


  
# measurement model
puberty =~ armhair + facehair + voice               # pubertal status, boys
puberty =~ breast + body + menarche + selfreport    # pubertal status, girls

delinquency =~ delinquency1 + delinquency2 + delinquency3

#structural model 
delinquency ~ puberty 

Thanks, 

Ilhong 

Christopher Bratt

Mar 11, 2018, 5:13:02 AM
to lavaan
You can develop two separate models/analyses, and that is the approach I would recommend.

Now you have one latent variable with both boys' and girls' items as indicators. If you combine this into one analysis, you'll have a lot of missing data in these items (about 50% for each, presumably). One might try experimenting with such a single model (adding gender as a covariate in the model), but I would not recommend that you use this approach.

A group-based model makes less sense (one model, two groups).
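
The missing-data pattern described above (each item observed for only one sex) can be illustrated with a minimal base-R sketch. The data frame `toy` and its values are made up for illustration; only the column names `armhair` and `menarche` come from the thread.

```r
# Hypothetical toy data: 4 boys and 4 girls, one puberty item per sex.
# The values are invented; they only mimic the structure described above.
toy <- data.frame(
  sex      = rep(c("boy", "girl"), each = 4),
  armhair  = c(1, 2, 3, 2, NA, NA, NA, NA),  # asked of boys only
  menarche = c(NA, NA, NA, NA, 0, 1, 1, 0)   # asked of girls only
)
items <- c("armhair", "menarche")

# Proportion of missing values per item in the stacked (single-group) sample
miss_overall <- colMeans(is.na(toy[, items]))

# The same proportions computed separately within each sex
miss_by_sex <- sapply(split(toy[, items], toy$sex),
                      function(d) colMeans(is.na(d)))

print(miss_overall)
print(miss_by_sex)
```

In this toy data each item is missing for half of the stacked sample, and within each sex an item is either fully observed or entirely absent. A single-group model therefore has to handle that missingness by design.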

Ilhong Yun

Mar 11, 2018, 7:46:18 AM
to lavaan
Thanks for the reply, Christopher. 

You are suggesting that I divide the total sample into two sub-samples (boys and girls) and run separate SEM models. In fact, that is what I have been doing. By doing so, I ended up with two different regression coefficients.

However, the reason I asked the original question was that I keep encountering published articles that used the full sample, generating a single regression coefficient (delinquency ~ puberty) for the entire sample (without dividing into boys and girls). 


I experimented with a single model, as you mentioned, and the coefficient is not very different from those from the separate models.

Thanks for your thoughts,

Ilhong 

Terrence Jorgensen

Mar 11, 2018, 9:48:39 AM
to lavaan
> However, the reason I asked the original question was that I keep encountering published articles that used the full sample, generating a single regression coefficient (delinquency ~ puberty) for the entire sample (without dividing into boys and girls).

In multiple-group models, you can use different variables in different groups:


## unconstrained model (configural invariance)
mod.config <- '
group: boy # or group: 1; use whatever labels your grouping variable has

# measurement model
puberty.b =~ armhair + facehair + voice   # pubertal status, boys
delinquency =~ delinquency1 + delinquency2 + delinquency3
# structural model
delinquency ~ boy.slope*puberty.b

group: girl # or group: 2; use whatever labels your grouping variable has

# measurement model
puberty.g =~ breast + body + menarche + selfreport   # pubertal status, girls
delinquency =~ delinquency1 + delinquency2 + delinquency3
# structural model
delinquency ~ girl.slope*puberty.g
'

## add metric-invariance constraints (same loading labels in both groups)
mod.metric <- '
group: boy # or group: 1; use whatever labels your grouping variable has

# measurement model
puberty.b =~ armhair + facehair + voice   # pubertal status, boys
delinquency =~ L1*delinquency1 + L2*delinquency2 + L3*delinquency3
delinquency ~~ NA*delinquency # fixing to 1 is no longer needed for identification
# structural model
delinquency ~ boy.slope*puberty.b

group: girl # or group: 2; use whatever labels your grouping variable has

# measurement model
puberty.g =~ breast + body + menarche + selfreport   # pubertal status, girls
delinquency =~ L1*delinquency1 + L2*delinquency2 + L3*delinquency3
# structural model
delinquency ~ girl.slope*puberty.g
'

## specify equality constraint to test
sameSlope <- '
girl.slope == boy.slope '

## fit models
fit.config <- cfa(mod.config, data = myData, std.lv = TRUE, group = "sex")
fit.metric <- cfa(mod.metric, data = myData, std.lv = TRUE, group = "sex")

## does H0 of metric invariance hold?
anova(fit.config, fit.metric)

## If so, you can test the H0 that the slopes are equal
fit.sameSlope <- update(fit.metric, constraints = sameSlope)
anova(fit.metric, fit.sameSlope)


> I experimented with a single model, as you mentioned, and the coefficient is not very different from those from the separate models.

If you used missing = "FIML" to estimate a single-group model, that model assumes the slope is equal across sexes, and you cannot test that assumption because lavaan cannot estimate a latent interaction between puberty and sex. You could do so with product indicators, but that gets complicated, so I would recommend the multi-group approach.

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Ilhong Yun

Mar 12, 2018, 7:28:02 AM
to lavaan
This is wonderful! 
This is exactly what I wanted. 
I am so grateful to you, Dr. Jorgensen. 

Ilhong 
