Dear lavaan group,
I am new to SEM, so please pardon any confusion in terminology. I am attempting to estimate a model with:
- Three exogenous variables
- One mediator (endogenous)
- One endogenous outcome variable
One of my exogenous variables, Knowledge, was assessed using ten statements (K1–K10) with Yes/No/Don’t Know options to measure objective knowledge. I scored all correct responses as 1 and incorrect responses as 0. All other variables are measured on a 5-point Likert scale.
I am considering a two-step approach: first, create a composite score for Knowledge from K1–K10 to obtain a single observed variable; second, include that variable in the SEM alongside the latent variables. I have seen this approach in recent papers, but I have also read that it may not be best practice.
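To make the scoring step concrete, here is a minimal sketch of how the composite would be built. It is written in Python purely for illustration (the model itself would be fitted in lavaan). The answer key and the example respondent are made up, and treating "Don't Know" as incorrect (0) is an assumption of this sketch.

```python
# Hypothetical answer key for the ten knowledge statements K1..K10
# (odd-numbered items keyed "Yes", even-numbered items keyed "No").
ANSWER_KEY = {f"K{i}": ("Yes" if i % 2 else "No") for i in range(1, 11)}


def score_item(response, correct):
    """Return 1 for a correct response and 0 otherwise.

    Assumption: a "Don't Know" response is scored as incorrect (0).
    """
    return 1 if response == correct else 0


def knowledge_composite(responses, key=ANSWER_KEY):
    """Sum the ten 0/1 item scores into a composite ranging from 0 to 10."""
    return sum(score_item(responses[item], correct) for item, correct in key.items())


# Made-up respondent: answers "Yes" to everything except K2 and K3.
respondent = {f"K{i}": "Yes" for i in range(1, 11)}
respondent["K2"] = "No"
respondent["K3"] = "Don't Know"

print(knowledge_composite(respondent))
```

The resulting sum is the single observed Knowledge variable that would enter the structural part of the model in place of the ten binary items.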
I would greatly appreciate guidance on the following:
- Feasibility: Is it acceptable to use a composite score for Knowledge in SEM alongside latent variables, or could this introduce bias?
- Measurement model: If I use Knowledge as a composite observed variable, should it be included in the measurement model, or only in the structural model?
- Estimation: Given that Knowledge would be a composite observed variable (aggregated binary items) and that the other variables are ordinal, which estimation approach would be recommended in lavaan?
- Alternatives or improvements: If using a composite is not ideal, what would you suggest as a practical alternative for a beginner that keeps the model manageable?
- References or examples: Any examples of lavaan models that mix a composite observed variable with latent variables would be very helpful.
Thank you for your time and guidance. Any advice for someone new to SEM would be greatly appreciated.