Hi Marcus,
Thank you for your thoughtful suggestions.
In my analysis I use year as the grouping variable, and the first model covers 5 time points. My first approach was to fit a measurement model with a single endogenous latent variable and 6 indicators. Because of multicollinearity I had to remove one indicator, so I then fitted a multigroup model with 5 indicators and one latent variable, but the RMSEA of that model was too high. An EFA then suggested two factors, so I changed the model to two endogenous latent variables, which brought the RMSEA below 0.08. The model is as follows.
basic_model <- 'attitude1 =~ imsmetn + impcntr
attitude2 =~ imbgeco + imueclt + imwbcnt
attitude1 ~~ attitude2'
I was able to confirm that the model holds measurement invariance. I then compared the latent means and identified the two time points with the most noticeable change. My second modeling approach is to assess how the latent means change between those two time points once demographic and socioeconomic factors are added to the model. That second model is the one now producing the warnings.
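For reference, the invariance steps described above can be sketched roughly as follows. This is a minimal sketch rather than my exact script: demo_df is simulated stand-in data with two hypothetical rounds (an assumption made only so the example runs on its own), and the names fit_configural, fit_metric and fit_scalar are illustrative. With the real data, demo_df would be replaced by swedish_df_age.

```r
library(lavaan)

# Measurement model as given above
basic_model <- '
  attitude1 =~ imsmetn + impcntr
  attitude2 =~ imbgeco + imueclt + imwbcnt
  attitude1 ~~ attitude2
'

# Stand-in data (assumption): simulated continuous indicators, NOT the ESS data
set.seed(123)
pop_model <- '
  attitude1 =~ 0.8*imsmetn + 0.8*impcntr
  attitude2 =~ 0.7*imbgeco + 0.7*imueclt + 0.7*imwbcnt
  attitude1 ~~ 0.5*attitude2
'
demo_df <- simulateData(pop_model, sample.nobs = 1000)
demo_df$essround <- rep(c(6, 10), each = 500)  # two hypothetical rounds

# Configural: same structure in every group, all parameters free
fit_configural <- cfa(basic_model, data = demo_df, group = "essround")

# Metric: loadings constrained equal across groups
fit_metric <- cfa(basic_model, data = demo_df, group = "essround",
                  group.equal = "loadings")

# Scalar: loadings and intercepts constrained equal across groups
fit_scalar <- cfa(basic_model, data = demo_df, group = "essround",
                  group.equal = c("loadings", "intercepts"))

# Nested model comparison and fit indices
print(lavTestLRT(fit_configural, fit_metric, fit_scalar))
print(fitMeasures(fit_scalar, c("rmsea", "cfi")))
```

In the scalar model the latent means of the first group are fixed to zero and those of the other groups are estimated, so the latent mean differences can be read from the intercept rows for attitude1 and attitude2 in summary(fit_scalar).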
I have indeed attempted a stepwise approach, starting without the grouping variable and advancing to a model with a single latent variable. Despite this, the warning messages persist. My initial exploratory factor analysis suggested two factors, which led to the current two-latent-variable model that satisfies measurement invariance with an RMSEA below 0.08.
Regarding your inquiries, the sample size is 7005, and I've ensured the variables are distributed appropriately with no zero-inflation issues. My dataset structure is as follows:
> str(swedish_df_age)
'data.frame': 7005 obs. of 12 variables:
$ essround: int 6 6 6 6 6 6 6 6 6 6 ...
$ imsmetn : int 3 4 4 3 3 2 4 4 3 3 ...
$ imdfetn : int 3 4 4 3 3 2 3 4 2 2 ...
$ impcntr : int 4 4 4 2 3 2 3 4 2 2 ...
$ imbgeco : int 3 5 4 2 4 4 2 4 2 1 ...
$ imueclt : int 4 4 4 4 4 4 4 4 2 4 ...
$ imwbcnt : int 4 3 4 2 4 3 4 4 2 2 ...
$ gndr : int 2 1 1 1 1 1 2 2 2 2 ...
$ agea : int 4 4 4 4 4 4 4 4 4 4 ...
$ domicil : int 2 5 3 3 4 1 1 4 4 3 ...
$ eisced : int 1 1 1 4 3 1 1 1 1 1 ...
$ hinctnta: int 1 1 1 2 2 2 1 1 1 1 ...
> summary(swedish_df_age)
   essround        imsmetn        imdfetn        impcntr        imbgeco        imueclt
Min.   : 6.000  Min.   :1.000  Min.   :1.000  Min.   :1.00   Min.   :1.000  Min.   :1.000
1st Qu.: 7.000  1st Qu.:3.000  1st Qu.:3.000  1st Qu.:3.00   1st Qu.:2.000  1st Qu.:3.000
Median : 8.000  Median :3.000  Median :3.000  Median :3.00   Median :3.000  Median :4.000
Mean   : 8.112  Mean   :3.247  Mean   :3.169  Mean   :3.09   Mean   :3.221  Mean   :3.783
3rd Qu.:10.000  3rd Qu.:4.000  3rd Qu.:4.000  3rd Qu.:4.00   3rd Qu.:4.000  3rd Qu.:4.000
Max.   :10.000  Max.   :4.000  Max.   :4.000  Max.   :4.00   Max.   :5.000  Max.   :5.000
   imwbcnt         gndr           agea          domicil        eisced        hinctnta
Min.   :1.000  Min.   :1.000  Min.   :1.000  Min.   :1.00   Min.   :1.00   Min.   :1.000
1st Qu.:3.000  1st Qu.:1.000  1st Qu.:2.000  1st Qu.:2.00   1st Qu.:3.00   1st Qu.:2.000
Median :4.000  Median :1.000  Median :2.000  Median :3.00   Median :4.00   Median :2.000
Mean   :3.508  Mean   :1.489  Mean   :2.335  Mean   :2.87   Mean   :3.52   Mean   :2.208
3rd Qu.:4.000  3rd Qu.:2.000  3rd Qu.:3.000  3rd Qu.:4.00   3rd Qu.:5.00   3rd Qu.:3.000
Max.   :5.000  Max.   :2.000  Max.   :4.000  Max.   :5.00   Max.   :5.00   Max.   :3.000
I have also attempted fitting single-construct CFAs. However, the standard errors could not be computed because the information matrix could not be inverted (the same warnings appear).
Regarding the correlation between demographic and socioeconomic factors, I inserted the recommended restriction, but it did not resolve the issue.
As for the syntax
demographic <~ agea + gndr + domicil
the <~ operator defines a formative construct. Since the demographic and socioeconomic factors are formative in my model, I searched online and found that this is the operator for specifying formative relationships.
I appreciate your assistance and am open to any further insights you may have.
Warm regards,
Dinesha Dissanayake