Simultaneously fitting a regression model for a factor score within an SEM

Curtis Atkisson

Sep 5, 2024, 7:13:20 PM

to blavaan

Hi, all!

I am trying to do something, and I don't know if it will work exclusively within the blavaan framework (i.e., I may need to modify Stan code and run it outside blavaan). I have a set of manifest variables (individual characteristics) from which I would like to estimate factor scores, and then predict some normally distributed outcomes (health) with both the factor scores and the manifest variables (only X1 here). Pretty bog-standard SEM there. But I would also like to predict the factor scores using a different set of manifest variables (infrastructural) that do not connect to the rest of the SEM.

When I run this model:

'eta1 =~ X1 + X2 + X3
eta1 ~ Y1 + Y2 + Y3
Z1 ~ eta1 + X1
Z2 ~ eta1 + X2'

The effects of eta1, X1, and X2 on Z1 and Z2 (respectively) are substantially different than if I run this model:

'eta1 =~ X1 + X2 + X3
Z1 ~ eta1 + X1
Z2 ~ eta1 + X2'

Specifically, the effects are larger and have less overlap with 0 in the model with a regression model predicting eta1. The resulting path diagram of the model including the regression for eta1 is:

[Path diagram of the model including the regression for eta1; image not preserved.]

I've been parsing the resultant Stan code and am partially on the way to understanding what is happening but not quite there yet. I was thinking someone here might know immediately. I am wondering exactly what is happening under the hood here. And maybe, what exactly is predicting z1 and z2 from eta1? Is it something like the residual of the factor score once accounting for the regression model for eta1?

This diagram seems to imply to me that y1, y2, and y3 are now part of the pathway to predicting z1 and z2, but I don't want them to be. What I want to do is simultaneously estimate the effect of the y_i's on the factor score (so that I'm integrating over all the uncertainty in the factor score) without them being on the path to predicting the z_i's. Is that what is happening here? What would I need to do to make that happen? Do I need to take this outside of the blavaan framework to do this?
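Editorial sketch, not part of the original post: under the usual lavaan-style reading of the syntax above, `eta1 ~ Y1` defines eta1 = g*Y1 + zeta, and `Z1 ~ eta1 + X1` regresses Z1 on the whole of eta1, not on the residual zeta. The Y's are therefore upstream of the Z's through eta1, and the joint model implies a nonzero Y-Z covariance. The Python below checks this path-tracing result against a simulation for a reduced version of the model (one Y, one indicator, one outcome). All parameter names and values (`g`, `b`, `c`, `lam1`, the variances) are hypothetical and chosen only for illustration.

```python
import random


def implied_cov_y_z(g, b, c, lam1, var_y):
    """Model-implied cov(Y, Z1) for the reduced model
         eta = g*Y + zeta
         X1  = lam1*eta + d1
         Z1  = b*eta + c*X1 + e
    with zeta, d1, e mutually independent and independent of Y."""
    cov_y_eta = g * var_y            # Y reaches eta through the regression path g
    cov_y_x1 = lam1 * cov_y_eta      # ...and X1 through the loading lam1
    return b * cov_y_eta + c * cov_y_x1


def simulated_cov_y_z(g, b, c, lam1, var_y, psi, theta1, var_e,
                      n=100_000, seed=1):
    """Sample cov(Y, Z1) from data generated by the same reduced model."""
    rng = random.Random(seed)
    ys, zs = [], []
    for _ in range(n):
        y = rng.gauss(0.0, var_y ** 0.5)
        eta = g * y + rng.gauss(0.0, psi ** 0.5)          # full latent variable
        x1 = lam1 * eta + rng.gauss(0.0, theta1 ** 0.5)   # indicator
        z1 = b * eta + c * x1 + rng.gauss(0.0, var_e ** 0.5)
        ys.append(y)
        zs.append(z1)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    return sum((y - mean_y) * (z - mean_z) for y, z in zip(ys, zs)) / (n - 1)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    paths = dict(g=0.5, b=0.4, c=0.2, lam1=1.0, var_y=1.0)
    print("implied  :", implied_cov_y_z(**paths))
    print("simulated:", simulated_cov_y_z(psi=1.0, theta1=0.5, var_e=1.0, **paths))
```

With these values the implied covariance is (b + c*lam1)*g*var_y = 0.3, and it drops to zero only when g = 0. Adding the regression for eta1 thus asks the model to reproduce the Y-X and Y-Z covariances as well, which is one way the eta1 and X coefficients can move between the two specifications.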

Thanks!

Ed Merkle

Sep 5, 2024, 7:21:53 PM

to blavaan

This sounds like the issue of "interpretational confounding", where the joint estimation of the full model can lead to unexpected estimates. A remedy is a two-step approach where you first estimate factor scores and then estimate the rest of the structural model. Roy Levy recently wrote about it and proposed a MUPPET approach that partly uses blavaan:

https://www.tandfonline.com/doi/abs/10.1080/10705511.2022.2154214

And I am working on some blavaan procedures to make this two-step approach easier to adapt to other Bayesian models. It is still preliminary, and I can give you further details off list if you are interested.

Ed

Curtis Atkisson

Sep 5, 2024, 7:27:03 PM

to Ed Merkle, blavaan

That's exactly what it is! Thank you for giving me the right language.

I'm highly motivated to avoid the two-stage approach, as I want to properly propagate uncertainty. Levy's approach looks intriguing, and I'll read through it tomorrow. I'd love to see what you're working on in this area. Is there a dev branch or github issue or something you'd want to point me towards? Happy to proceed off list in whatever way is best for you.

Curtis Atkisson


Roy Levy

Sep 6, 2024, 2:44:03 PM

to blavaan

Hi Curtis, Ed, and the group,

I’ll echo Ed in saying this looks to me to be an instance of interpretational confounding, which can loosely be described as the situation where the fitted results for the parameters for one portion of the model differ depending on whether that portion of the model is fitted by itself or embedded in a larger model. This is a phenomenon that researchers don’t want to have happen!
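Editorial sketch, not part of the original post: one mechanism behind this kind of shift can be shown with covariance algebra alone. Assume the first loading is fixed to 1 (the marker-variable convention). In a one-factor, three-indicator measurement model fitted alone, the second loading is then determined by cov(X2, X3) / cov(X1, X3). Once a predictor Y of the factor is added, the same loading is also implied by cov(X2, Y) / cov(X1, Y). If the data contain something the fitted model omits, the two ratios disagree and a jointly estimated loading has to compromise between them. The Python below computes both ratios for a hypothetical population in which Y has a small direct effect `k` on X2; every name and value is an assumption made for illustration, not a description of the data discussed in this thread.

```python
def loading_ratios(g, lam2, lam3, k, var_y, psi):
    """Two covariance ratios that both equal lam2 when the fitted model is correct.

    Hypothetical data-generating model (the loading of X1 is fixed to 1):
        eta = g*Y + zeta,  var(zeta) = psi
        X1  = eta + d1
        X2  = lam2*eta + k*Y + d2    # k is a direct effect the fitted model omits
        X3  = lam3*eta + d3
    """
    var_eta = g * g * var_y + psi
    cov_y_eta = g * var_y
    cov_x2_eta = lam2 * var_eta + k * cov_y_eta

    # Information available to the measurement model fitted on its own.
    cov_x1_x3 = lam3 * var_eta
    cov_x2_x3 = lam3 * cov_x2_eta
    from_indicators = cov_x2_x3 / cov_x1_x3

    # Extra information that enters once eta is regressed on Y.
    cov_x1_y = cov_y_eta
    cov_x2_y = lam2 * cov_y_eta + k * var_y
    from_predictor = cov_x2_y / cov_x1_y

    return from_indicators, from_predictor


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    base = dict(g=0.5, lam2=0.8, lam3=1.2, var_y=1.0, psi=1.0)
    print("no omitted path  :", [round(r, 3) for r in loading_ratios(k=0.0, **base)])
    print("omitted path k=.1:", [round(r, 3) for r in loading_ratios(k=0.1, **base)])
```

With `k = 0` both ratios equal the true loading of 0.8. With `k = 0.1` they come out near 0.84 and 1.0, so the measurement part fitted alone and the same part embedded in the larger model point to different loadings, and with them different meanings for the factor.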

I’ve spent some time working on this, and have developed an approach that prevents this unwanted phenomenon. It involves multi-stage estimation, all the while propagating uncertainty properly. Ed mentioned a paper that came out in 2023:

https://www.tandfonline.com/doi/abs/10.1080/10705511.2022.2154214

that outlines the ideas. The code I wrote for that work was more of a proof of concept, and was somewhat limited in what kinds of models it could handle. I’ve worked to extend its capabilities, some of which is documented in a recent paper that just came out on the approach:

https://journals.sagepub.com/doi/10.3102/10769986241254348

I’ve continued to extend the software’s capabilities, with the hope that it could handle more and more different models and situations, and only require the user to specify blavaan expressions for the portions of the models (just as you wrote in the initial message). If you’re interested, I’d be happy to share the current version of the software with you, and work with you to get it to work for your example. I haven’t tested all the features that I see in your model. But if you’re interested, I’d be interested in coordinating with you to improve the software so that it could handle your situation.

Best wishes,
Roy Levy
Roy....@asu.edu
