Using marginal maximum likelihood for categorical estimators

84 views
Skip to first unread message

Damiano D'Urso

unread,
Feb 7, 2019, 7:51:07 AM2/7/19
to lavaan
Hello,

I am currently try to use the marginal maximum likelihood estimation techniques to fit a categorical factor analysis model but I constantly run into this error: 

Error in GLIST[[mm]] : subscript out of bounds
In addition: Warning messages:
1: In if (n < 1L) return(integer(0L)) :
  the condition has length > 1 and only the first element will be used
2: In if (n == 1L) return(1L) :
  the condition has length > 1 and only the first element will be used
3: In n:2L : numerical expression has 2 elements: only the first used
4: In 1:nvar : numerical expression has 2 elements: only the first used

Here you can find an example of the code I am using both to generate the data and estimate the model:



###########################################################################################################################################
sampleSize <- 1000 #subjects per group
Groups <- 2

#setting aribtrary thresholds
thresh1 <- c(-2,0,2)
thresh2 <- c(-1,0.5,1.6)
thresh3 <- c(-1.5,0.9,1.8)
thresh4 <- c(-1.7,0.3,1.4)
thresh5 <- c(-2.1,0.8,1.7)
thresh6 <- c(-1.2,0.4,1.2)

#use a factor structure as data generating mechanism
F <- matrix(c(.9,.8, .7 , .6, .5, .8), ncol = 1) #the model 
rownames(F) <- paste('V', seq(1:6), sep = '')#add labels
colnames(F) <- c('F1')

R <- F%*%t(F) # create the correlation matrix
diag(R) <- 1 # adjust the diagonal to the matrix
R

#Create two groups using the same starting values and add a grouping variable

Data <- data.frame(NA)
Data2facTot <- list(NA)
DataOrdered2fac <- list()

for (i in 1:Groups){ 
  
Data2facTot[[i]]<- as.data.frame(rmvnorm(sampleSize, sigma = R))

# Make categorical:
Data2facTot[[i]][,1] <- as.numeric(cut(Data2facTot[[i]][,1],breaks = c(-Inf,thresh1,Inf)))
Data2facTot[[i]][,2] <- as.numeric(cut(Data2facTot[[i]][,2],breaks = c(-Inf,thresh2,Inf)))
Data2facTot[[i]][,3] <- as.numeric(cut(Data2facTot[[i]][,3],breaks = c(-Inf,thresh3,Inf)))
Data2facTot[[i]][,4] <- as.numeric(cut(Data2facTot[[i]][,4],breaks = c(-Inf,thresh4,Inf)))
Data2facTot[[i]][,5] <- as.numeric(cut(Data2facTot[[i]][,5],breaks = c(-Inf,thresh5,Inf)))
Data2facTot[[i]][,6] <- as.numeric(cut(Data2facTot[[i]][,6],breaks = c(-Inf,thresh6,Inf)))

#Now calculate polychoric correlation
DataOrdered2fac[[i]] <- Data2facTot[[i]]
DataOrdered2fac[[i]][,1] <- ordered(Data2facTot[[i]][,1])
DataOrdered2fac[[i]][,2] <- ordered(Data2facTot[[i]][,2])
DataOrdered2fac[[i]][,3] <- ordered(Data2facTot[[i]][,3])
DataOrdered2fac[[i]][,4] <- ordered(Data2facTot[[i]][,4])
DataOrdered2fac[[i]][,5] <- ordered(Data2facTot[[i]][,5])
DataOrdered2fac[[i]][,6] <- ordered(Data2facTot[[i]][,6])

names(DataOrdered2fac[[i]]) <- c("a","b",'c','d','e','f')
}

Groupingvar <- c(rep('Tomatoes',sampleSize), rep('Potatoes', sampleSize))
AggregatedData <- Reduce(rbind, DataOrdered2fac)
AggregatedData$Groupingvar <- Groupingvar

AggregatedDataNum <- Reduce(rbind, Data2facTot)
AggregatedDataNum$Groupingvar <- Groupingvar

lavCor(DataOrdered2fac)

#compare lavcor with original cor mat generated from 2 factor model
lavCor(DataOrdered2fac)
R

#################################################################################################
#FIT THE MODEL CAN BE DONE USING DIFFERENT SETS OF CONSTRAINTS, THIS IS LAVAAN DEFAULT

#MODEL
Model2fac <- '
#Latent variable definition
F1 =~ a + b + c + d + e + f

#thresholds
a | t1 + t2 + t3
b | t1 + t2 + t3
c | t1 + t2 + t3
d | t1 + t2 + t3
e | t1 + t2 + t3
f | t1 + t2 + t3

'

#FIT DEFAULT LAVAAN MODEL
fit <- cfa(Model2fac, 
           AggregatedData, 
           group = 'Groupingvar', 
           #group.equal = 'thresholds',
           estimator = 'MML')


And then I get the error!
Is there any problem in this version or am I doing something wrong ?

Thanks for any help you can provide me 

Terrence Jorgensen

unread,
Feb 8, 2019, 6:32:18 AM2/8/19
to lavaan
I am currently try to use the marginal maximum likelihood estimation techniques to fit a categorical factor analysis model but I constantly run into this error: 

Your example doesn't work because AggregatedData is a list containing only the vector Groupingvar.

Is there any problem in this version or am I doing something wrong ?

Do you have the latest development version installed?


Does this example work for you?

myData <- read.table("http://www.statmodel.com/usersguide/chap5/ex5.16.dat")
names
(myData) <- c("u1","u2","u3","u4","u5","u6","x1","x2","x3","g")
myData$u1
<- ordered(myData$u1)
myData$u2
<- ordered(myData$u2)
myData$u3
<- ordered(myData$u3)
myData$u4
<- ordered(myData$u4)
myData$u5
<- ordered(myData$u5)
myData$u6
<- ordered(myData$u6)

model
<- '
f1 =~ u1 + u2 + u3
f2 =~ u4 + u5 + u6
'

fit
<- cfa(model, data = myData, estimator = 'MML')


Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Damiano D'Urso

unread,
Feb 11, 2019, 5:30:30 AM2/11/19
to lavaan
Dear Dr. Jorgensen,

Thank you for your answer!

In the original message I forgot to specify that my goal was to use MML estimation to test for measurement invariance.

The code you sent works in the case of one group but once I try to estimate the model specifying that I have multiple groups the function breaks again and I run into the same error.

Here's how I did it using your example:

library(lavaan)



myData
<- read.table("http://www.statmodel.com/usersguide/chap5/ex5.16.dat")
names
(myData) <- c("u1","u2","u3","u4","u5","u6","x1","x2","x3","g")
myData$u1
<- ordered(myData$u1)
myData$u2
<- ordered(myData$u2)
myData$u3
<- ordered(myData$u3)
myData$u4
<- ordered(myData$u4)
myData$u5
<- ordered(myData$u5)
myData$u6
<- ordered(myData$u6)



myData$g
[1050:2200] = 2



model
<- '
f1 =~ u1 + u2 + u3
f2 =~ u4 + u5 + u6
'

fit
<- cfa(model, data = myData, group = 'g',estimator = 'MML')




summary
(fit)




Is it a bug or for multiple groups the MML estimation has not been implemented yet?

Thanks again for your help.

p.s. I might have sent this privately but I couldn't find the private message and that's why I am writing it again to make it public.

Terrence Jorgensen

unread,
Feb 19, 2019, 2:51:48 PM2/19/19
to lavaan
The code you sent works in the case of one group but once I try to estimate the model specifying that I have multiple groups the function breaks again and I run into the same error.

For me too.  Yves might not have implemented this for multiple groups yet.  You could try it with a simpler multigroup model first to see if the error still occurs (e.g., 'u1 ~ u2')
Reply all
Reply to author
Forward
0 new messages