Hello,
I'm trying to use the simulateData() function to generate Likert-esque data. This is fairly straightforward when standardized factor loadings are used; however, when unstandardized factor loadings (i.e. some loadings > 1) are used, it runs into difficulty.
The problem seems to arise because, when categorical data are generated, the variances of the y* underlying each y (i.e. the underlying continuous variable formulation) are forced to 1, and the resulting Sigma is not positive definite. For example, suppose we use the following simple categorical indicator model:
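library(lavaan)

# One-factor population model with four binary indicators (one threshold each)
simple.pop <- '
f1 =~ 0.55*y1 + 1.4*y2 + 1.1*y3 + 0.8*y4
y1 | -2*t1
y2 | -1*t1
y3 | 1*t1
y4 | 2*t1
'
group1 <- simulateData(simple.pop, sample.nobs = 1000, debug = TRUE)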
Error in MASS::mvrnorm(n = sample.nobs[g], mu = Mu.hat[[g]], Sigma = COV, :
'Sigma' is not positive definite
Sure enough the model-implied moments from the debug call are:
      [,1] [,2]  [,3] [,4]
[1,] 1.000 0.77 0.605 0.44
[2,] 0.770 1.00 1.540 1.12
[3,] 0.605 1.54 1.000 0.88
[4,] 0.440 1.12 0.880 1.00
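The matrix above can be reproduced by hand, which makes the failure easy to inspect. The following is only a minimal base-R sketch, not what simulateData() does internally: it assumes the factor variance is 1 (which matches the off-diagonal values in the debug output), and the names lambda, Sigma and ev are just illustrative.

```r
# Loadings from the model above; the factor variance is assumed to be 1
lambda <- c(y1 = 0.55, y2 = 1.4, y3 = 1.1, y4 = 0.8)

# Common-factor part of the covariance matrix: lambda %*% t(lambda)
Sigma <- tcrossprod(lambda)

# Force unit variances for the y*, as in the debug output
diag(Sigma) <- 1

# A positive definite matrix has strictly positive eigenvalues
ev <- eigen(Sigma, symmetric = TRUE, only.values = TRUE)$values

print(round(Sigma, 3))
print(ev)
```

The 2x2 block for y2 and y3 alone already has an off-diagonal of 1.54 with unit variances, i.e. an implied "correlation" above 1, so at least one eigenvalue has to be negative.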
(As a sanity check, I also ran it with all loadings < 1, and it ran fine.)
I checked the paper the parameterisation is based on:
(p 552, equation 4)
As expected, the parameterisation does require that diag(Sigma) is composed of ones. It imposes this even at the expense of negative error variances for the y* underlying the observed variables (which is what the example above would imply).
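To make the negative error variances concrete, here is a small base-R sketch. It assumes a factor variance of 1, so that fixing each diagonal element of Sigma to 1 implies a residual variance of 1 - lambda^2 for each y*; the names lambda and theta are illustrative only.

```r
# Loadings from the example model; the factor variance is assumed to be 1
lambda <- c(y1 = 0.55, y2 = 1.4, y3 = 1.1, y4 = 0.8)

# With diag(Sigma) fixed to 1, the residual variance of each y* is
# whatever remains after the common part: 1 - lambda^2
theta <- 1 - lambda^2

print(theta)

# Indicators whose implied residual variance is negative
print(names(theta)[theta < 0])
```

Under that assumption, any loading above 1 in absolute value (here y2 and y3) yields a negative implied residual variance.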
So my question is, what would be a reasonable way to work around this without forcing my factor loadings to be below 1?
Many thanks in advance.