# Using marginal maximum likelihood for categorical estimators

73 views

### Damiano D'Urso

Feb 7, 2019, 7:51:07 AM2/7/19
to lavaan
Hello,

I am currently try to use the marginal maximum likelihood estimation techniques to fit a categorical factor analysis model but I constantly run into this error:

Error in GLIST[[mm]] : subscript out of bounds
1: In if (n < 1L) return(integer(0L)) :
the condition has length > 1 and only the first element will be used
2: In if (n == 1L) return(1L) :
the condition has length > 1 and only the first element will be used
3: In n:2L : numerical expression has 2 elements: only the first used
4: In 1:nvar : numerical expression has 2 elements: only the first used

Here you can find an example of the code I am using both to generate the data and estimate the model:

###########################################################################################################################################
sampleSize <- 1000 #subjects per group
Groups <- 2

#setting aribtrary thresholds
thresh1 <- c(-2,0,2)
thresh2 <- c(-1,0.5,1.6)
thresh3 <- c(-1.5,0.9,1.8)
thresh4 <- c(-1.7,0.3,1.4)
thresh5 <- c(-2.1,0.8,1.7)
thresh6 <- c(-1.2,0.4,1.2)

#use a factor structure as data generating mechanism
F <- matrix(c(.9,.8, .7 , .6, .5, .8), ncol = 1) #the model
rownames(F) <- paste('V', seq(1:6), sep = '')#add labels
colnames(F) <- c('F1')

R <- F%*%t(F) # create the correlation matrix
diag(R) <- 1 # adjust the diagonal to the matrix
R

#Create two groups using the same starting values and add a grouping variable

Data <- data.frame(NA)
Data2facTot <- list(NA)
DataOrdered2fac <- list()

for (i in 1:Groups){

Data2facTot[[i]]<- as.data.frame(rmvnorm(sampleSize, sigma = R))

# Make categorical:
Data2facTot[[i]][,1] <- as.numeric(cut(Data2facTot[[i]][,1],breaks = c(-Inf,thresh1,Inf)))
Data2facTot[[i]][,2] <- as.numeric(cut(Data2facTot[[i]][,2],breaks = c(-Inf,thresh2,Inf)))
Data2facTot[[i]][,3] <- as.numeric(cut(Data2facTot[[i]][,3],breaks = c(-Inf,thresh3,Inf)))
Data2facTot[[i]][,4] <- as.numeric(cut(Data2facTot[[i]][,4],breaks = c(-Inf,thresh4,Inf)))
Data2facTot[[i]][,5] <- as.numeric(cut(Data2facTot[[i]][,5],breaks = c(-Inf,thresh5,Inf)))
Data2facTot[[i]][,6] <- as.numeric(cut(Data2facTot[[i]][,6],breaks = c(-Inf,thresh6,Inf)))

#Now calculate polychoric correlation
DataOrdered2fac[[i]] <- Data2facTot[[i]]
DataOrdered2fac[[i]][,1] <- ordered(Data2facTot[[i]][,1])
DataOrdered2fac[[i]][,2] <- ordered(Data2facTot[[i]][,2])
DataOrdered2fac[[i]][,3] <- ordered(Data2facTot[[i]][,3])
DataOrdered2fac[[i]][,4] <- ordered(Data2facTot[[i]][,4])
DataOrdered2fac[[i]][,5] <- ordered(Data2facTot[[i]][,5])
DataOrdered2fac[[i]][,6] <- ordered(Data2facTot[[i]][,6])

names(DataOrdered2fac[[i]]) <- c("a","b",'c','d','e','f')
}

Groupingvar <- c(rep('Tomatoes',sampleSize), rep('Potatoes', sampleSize))
AggregatedData <- Reduce(rbind, DataOrdered2fac)
AggregatedData\$Groupingvar <- Groupingvar

AggregatedDataNum <- Reduce(rbind, Data2facTot)
AggregatedDataNum\$Groupingvar <- Groupingvar

lavCor(DataOrdered2fac)

#compare lavcor with original cor mat generated from 2 factor model
lavCor(DataOrdered2fac)
R

#################################################################################################
#FIT THE MODEL CAN BE DONE USING DIFFERENT SETS OF CONSTRAINTS, THIS IS LAVAAN DEFAULT

#MODEL
Model2fac <- '
#Latent variable definition
F1 =~ a + b + c + d + e + f

#thresholds
a | t1 + t2 + t3
b | t1 + t2 + t3
c | t1 + t2 + t3
d | t1 + t2 + t3
e | t1 + t2 + t3
f | t1 + t2 + t3

'

#FIT DEFAULT LAVAAN MODEL
fit <- cfa(Model2fac,
AggregatedData,
group = 'Groupingvar',
#group.equal = 'thresholds',
estimator = 'MML')

And then I get the error!
Is there any problem in this version or am I doing something wrong ?

### Terrence Jorgensen

Feb 8, 2019, 6:32:18 AM2/8/19
to lavaan
I am currently try to use the marginal maximum likelihood estimation techniques to fit a categorical factor analysis model but I constantly run into this error:

Your example doesn't work because AggregatedData is a list containing only the vector Groupingvar.

Is there any problem in this version or am I doing something wrong ?

Do you have the latest development version installed?

Does this example work for you?

`myData <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.16.dat")names(myData) <- c("u1","u2","u3","u4","u5","u6","x1","x2","x3","g")myData\$u1 <- ordered(myData\$u1)myData\$u2 <- ordered(myData\$u2)myData\$u3 <- ordered(myData\$u3)myData\$u4 <- ordered(myData\$u4)myData\$u5 <- ordered(myData\$u5)myData\$u6 <- ordered(myData\$u6)model <- 'f1 =~ u1 + u2 + u3f2 =~ u4 + u5 + u6'fit <- cfa(model, data = myData, estimator = 'MML')`

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

### Damiano D'Urso

Feb 11, 2019, 5:30:30 AM2/11/19
to lavaan
Dear Dr. Jorgensen,

In the original message I forgot to specify that my goal was to use MML estimation to test for measurement invariance.

The code you sent works in the case of one group but once I try to estimate the model specifying that I have multiple groups the function breaks again and I run into the same error.

Here's how I did it using your example:

`library(lavaan)`
`myData <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.16.dat")names(myData) <- c("u1","u2","u3","u4","u5","u6","x1","x2","x3","g")myData\$u1 <- ordered(myData\$u1)myData\$u2 <- ordered(myData\$u2)myData\$u3 <- ordered(myData\$u3)myData\$u4 <- ordered(myData\$u4)myData\$u5 <- ordered(myData\$u5)myData\$u6 <- ordered(myData\$u6)`
`myData\$g[1050:2200] = 2`
`model <- 'f1 =~ u1 + u2 + u3f2 =~ u4 + u5 + u6'`
`fit <- cfa(model, data = myData, group = 'g',estimator = 'MML')summary(fit)`

Is it a bug or for multiple groups the MML estimation has not been implemented yet?

p.s. I might have sent this privately but I couldn't find the private message and that's why I am writing it again to make it public.