Can I obtain the intraclass correlation (ICC) from mixedmirt() output?


Seongho Bae

unread,
Mar 13, 2017, 5:32:15 AM3/13/17
to mirt-package
Hi, Phil.

Can I obtain the intraclass correlation (ICC) from mixedmirt() output? If so, how can I estimate it?

Seongho

Phil Chalmers

unread,
Mar 14, 2017, 4:39:39 PM3/14/17
to Seongho Bae, mirt-package
It's just the ratio of the relevant nested variance terms. So if you have a group variance term and an individual observation variance term, then ICC = var_g / (var_g + var_i) with respect to the group cluster. HTH.
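As a minimal sketch of that ratio (the variance values below are made up for illustration, not output from any fitted model):

```r
# Hypothetical variance components, as they might be read off a
# summary() of a fitted model -- these are placeholder values
var_g <- 1.08    # group-level (cluster) variance
var_i <- 0.05    # individual-level (Theta) variance

ICC <- var_g / (var_g + var_i)
ICC  # proportion of total variance attributable to group membership
```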

Phil

--
You received this message because you are subscribed to the Google Groups "mirt-package" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mirt-package+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Elia Emmers

unread,
Sep 28, 2017, 3:32:05 AM9/28/17
to mirt-package
Hi. I'm bringing back this thread just to keep all the intraclass correlation questions within the same thread. I've been looking for a quick example of how to do this but didn't really find anything.

For instance, suppose I take the example from the R documentation for the mixedmirt() function (https://www.rdocumentation.org/packages/mirt/versions/1.25/topics/mixedmirt) and do:

library(mirt)

set.seed(1234)
N <- 750
a <- matrix(rlnorm(10,.3,1),10,1)
d <- matrix(rnorm(10), 10)
Theta <- matrix(sort(rnorm(N)))
pseudoIQ <- Theta * 5 + 100 + rnorm(N, 0, 5)
pseudoIQ <- (pseudoIQ - mean(pseudoIQ))/10  #rescale variable for numerical stability
group <- factor(rep(c('G1','G2','G3'), each = N/3))
data <- simdata(a, d, N, itemtype = rep('2PL',10), Theta=Theta)
covdata <- data.frame(group, pseudoIQ)
model <- 'Theta = 1-10'  #IRT model specification (from the same documentation example)

#use parallel computing
mirtCluster()

mod1 <- mixedmirt(data, covdata, model, fixed = ~ 0 + group + items)

When I look at the summary function for 'mod1' I find:

--------------
RANDOM EFFECT COVARIANCE(S):
Correlations on upper diagonal

$Theta
       Theta
Theta 0.0719

So is Theta = 0.0719 what Phil referred to as "var_g"? And how do I obtain var_i? Is var_i something like:

aa <- randef(mod1)
var(aa$Theta)
            Theta
Theta 0.008653158

So that 0.0719/(0.0719 + 0.008653158) is the intraclass correlation?

Phil Chalmers

unread,
Sep 28, 2017, 9:40:40 AM9/28/17
to Elia Emmers, mirt-package
Not in this case, no (though the general idea is correct). This model only has one random-effect term: the variance of the Thetas (or, in the vernacular of MLMs, a level-1 random effect). You would need a second random effect, such as a grouping variable (e.g., individuals nested within schools), to obtain an ICC. What's being done here is just a summation of the model-implied variance and the estimated variance from the same model obtained through plausible value imputations (so, basically redundant information). HTH.

Phil

Elia Emmers

unread,
Sep 28, 2017, 12:54:20 PM9/28/17
to mirt-package
Once I saw my own code, I realized as well that I'm basically taking ratios of the same variance of Theta. You are so right, thank you for pointing it out. No more coding after midnight ;)

In any case, elaborating from the mixedmirt() R documentation example:

....

covdata$group <- factor(rep(paste0('G',1:50), each = N/50))


rmod1 <- mixedmirt(data, covdata, 1, fixed = ~ 0 + items, random = ~ 1|group)
summary(rmod1)

Call:
mixedmirt(data = data, covdata = covdata, model = 1, fixed = ~0 + 
    items, random = ~1 | group)


--------------
RANDOM EFFECT COVARIANCE(S):
Correlations on upper diagonal

$Theta
       F1
F1 0.0498

$group
          COV_group
COV_group      1.08

What I should be doing to get an estimate of the intraclass correlation in this case should be:

> 1.108 / (1.108+0.0498)
[1] 0.9569874


This should be the correct intraclass correlation, right?

In a related query, I saw your article on "ordinal alpha". Thank-you very much for writing it. It was about time someone did. One thing I was hoping to see were recommendations for applied researchers like myself about which other estimates of reliability are out there when we have ordinal data, particularly binary and 3-point scales. I am used to CFA-based reliability estimates where I fit 1-Factor models to the polychoric or tetrachoric correlation matrix and take the reliability as some ratio of the factor loadings to error variances (like McDonald's omega). Would this approach be more sensible than 'ordinal alpha'? Or is there a better-informed, IRT approach that you would recommend?

Phil Chalmers

unread,
Sep 28, 2017, 1:09:44 PM9/28/17
to Elia Emmers, mirt-package
On Thu, Sep 28, 2017 at 12:54 PM, Elia Emmers <eliaemmers@gmail.com> wrote:
> Once I saw my own code, I realized as well that I'm basically taking ratios of the same variance of Theta. You are so right, thank you for pointing it out. No more coding after midnight ;)

Coding after midnight? Probably fine. Posting questions publicly after midnight? Probably not (we've all been there before, so no worries).
 

> [quoted code and output snipped]
> This should be the correct intraclass correlation, right?

Correct. Basically, this ratio says that "knowing the group membership alone will predict about 95.7% of the variability in the ability parameters", so knowing the grouping tells you nearly everything about the separation between individuals.

> In a related query, I saw your article on "ordinal alpha". Thank-you very much for writing it. It was about time someone did. One thing I was hoping to see were recommendations for applied researchers like myself about which other estimates of reliability are out there when we have ordinal data, particularly binary and 3-point scales. I am used to CFA-based reliability estimates where I fit 1-Factor models to the polychoric or tetrachoric correlation matrix and take the reliability as some ratio of the factor loadings to error variances (like McDonald's omega). Would this approach be more sensible than 'ordinal alpha'? Or is there a better-informed, IRT approach that you would recommend?

Thanks for the kind words. Omega is still fine, just remember to use the correlation based on *observed* data. While polychoric/tetrachoric correlations are nice to obtain sufficient statistics for estimating other model-implied parameters, when it comes to describing the behaviour of observed variables (such as weighted/unweighted total scores) you need to pick suitable information. In that sense, estimates like coefficient alpha are still fine for ordinal data, and will correlate quite highly with anything latent trait models like IRT will provide (see the empirical_rxx() function in mirt, and compare this to alpha; this works for other models though, not just ordinal, so it's more general). For your CFA example just fit your model to the observed covariance matrix and compute omega or whatever other reliability estimate from that, because that will be considerably more appropriate. HTH.
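To make that comparison concrete, here is a rough sketch (my own simulated toy example, not code from the article) contrasting the IRT-based empirical reliability from empirical_rxx() with coefficient alpha computed directly on the observed responses:

```r
library(mirt)

# Simulate dichotomous responses from a 2PL (assumed toy setup)
set.seed(42)
a <- matrix(rlnorm(10, .3, .5))
d <- matrix(rnorm(10))
dat <- simdata(a, d, 1000, itemtype = '2PL')

mod <- mirt(dat, 1)

# Empirical reliability of the factor scores (SEs are required)
fs <- fscores(mod, full.scores.SE = TRUE)
empirical_rxx(fs)

# Coefficient alpha computed by hand from the raw item responses
k <- ncol(dat)
alpha <- (k / (k - 1)) * (1 - sum(apply(dat, 2, var)) / var(rowSums(dat)))
alpha
```

In practice the two numbers tend to be close for well-behaved unidimensional scales, which is the point being made above.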

Phil

Elia Emmers

unread,
Sep 29, 2017, 5:03:02 AM9/29/17
to mirt-package
You touch on an important point there that I’d appreciate if you could help me understand a little bit more. Throughout most of what I’ve learnt (mostly in the context of SEM, slowly progressing towards IRT) when I deal with categorical data the process is usually something like observed ordinal data --> polychoric correlation matrix ---> CFA fitted to the polychoric correlation matrix. Even the classic Muthen (1984) page #120 says:

 “… consider situations with y's being ordered categorical or continuous. Here, polychoric, polyserial, and ordinary Pearson product-moment correlations may be analyzed in Case A. The first two estimation stages estimate polychoric and polyserial correlations in the same way as discussed in Olsson (1979a) and Olsson et al. (1982).” 

So the preponderance of the analysis is on the polychoric/tetrachoric matrix. CFA measurement models estimate their loadings and error variances from this matrix and, as in my case, using McDonald's omega usually means you're fitting a CFA to ordinal data using Muthen's approach of diagonally-weighted least squares. What you suggest, however, is to ignore the ordinal nature of the data and fit the CFA models to the observed covariance matrix if I want to obtain omega? Why would it be the case that if I want to estimate and interpret a regular measurement model I should use the polychoric correlation matrix, but if I want McDonald's omega I should use the observed covariance matrix? Does it pertain to whether or not I'm making inferences about the observed scores VS the hypothetical, continuous latent y* (using Muthen's notation for y-star)? I'm a little confused now :(

Phil Chalmers

unread,
Sep 29, 2017, 10:19:41 AM9/29/17
to Elia Emmers, mirt-package
Using polychoric correlations is only a means to an end to obtain model parameters as it is *implied* by the ogive/probit model. So they are fine for parameter estimation and inferences, but if you were to compute expected values using the usual

    Y* = δ + λ·f + e

where f is the factor score, then this assumes a continuity which is not observed. Hence, a transformation function for Y* must be used to change the continuous score to categorical data (Y). If you use the polychoric for reliability, which is about observable composite variables, then you should be trying to make statements about Y rather than Y*.   
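As a toy illustration of that transformation (my own sketch, with a made-up loading and thresholds):

```r
# Continuous latent responses Y* generated from the factor model,
# then discretized at thresholds to give the observed ordinal Y
set.seed(1)
lambda <- 0.8                        # hypothetical loading
f      <- rnorm(1000)                # factor scores
ystar  <- lambda * f + rnorm(1000, sd = sqrt(1 - lambda^2))

tau <- c(-0.5, 0.5)                  # hypothetical category thresholds
y   <- cut(ystar, breaks = c(-Inf, tau, Inf), labels = 0:2)
table(y)  # only the categorized Y is ever observed, never ystar
```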

The analogue in the IRT context would be someone who (a) estimates IRT models, (b) substitutes the estimated parameters into a linear model instead of the IRT model, and (c) draws inferences about the reliability of the observed scores based on the linear composite. That would clearly be a problem, and obviously rather silly; however, this is the same flavour of issue one has to be careful with in the SEM/CFA context. HTH.

Phil


Elia Emmers

unread,
Sep 30, 2017, 4:12:07 AM9/30/17
to mirt-package
Thank you for taking the time to explain this to me. I suspected it had something to do with jumping from Y* to Y, but you've helped me see it goes deeper: it comes down to deciding whether you're making claims about Y or Y*. If you want to say something about Y, keep the analysis at the Y level; if you want to make claims about Y*, then switch approaches.

This was super insightful. Once again, thank you. 