multilevel multidimensional Rasch PCM model


Diah Wihardini

Mar 2, 2016, 1:34:30 PM
to mirt-package
Hi Phil,

I hope this email finds you well. 

I am excited to have finally found "mirt", which apparently can fit generalized linear mixed models, and have just started learning about it. My aim is to run a multilevel (3 levels: items nested in students, and students nested in schools) and multidimensional Rasch model with partial-credit scoring for the polytomous items. Please kindly advise whether the function "mixedmirt" would actually be the best option for running my proposed model.

Looking forward to hearing from you.

many thanks,

Diah

Phil Chalmers

Mar 3, 2016, 11:28:03 AM
to Diah Wihardini, mirt-package
Hi Diah,

Yes, mixedmirt() should be able to fit multilevel Rasch models with partial-credit scoring. Random item effects, however, are currently not possible (this would require setting up rating-scale models for each item, which I haven't gotten around to yet and have little inspiration to do), but perhaps one day I'll add this feature. Cheers.
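A minimal call for that kind of model might look like the sketch below (hypothetical object names: resp stands for the person-by-item response matrix and covdata for a data frame holding a school factor; this is an illustration, not a tested recipe):

```r
library(mirt)

# Sketch: Rasch/partial-credit model with a random school intercept.
# 'resp' and 'covdata$school' are placeholders for the user's own data;
# itemtype = 'Rasch' yields the partial credit model for polytomous items.
mod <- mixedmirt(resp, covdata, model = 1, fixed = ~ 0,
                 random = ~ 1 | school, itemtype = 'Rasch')
summary(mod)                 # random-effect (co)variances
coef(mod, simplify = TRUE)   # item parameter estimates
```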

Phil

--
You received this message because you are subscribed to the Google Groups "mirt-package" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mirt-package...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Diah Wihardini

Mar 3, 2016, 3:46:42 PM
to Phil Chalmers, mirt-package
Many thanks for the confirmation! I really appreciate it. Keep the good work!

best,

Diah

Diah Wihardini

Mar 15, 2016, 12:37:45 AM
to mirt-package, rphilip....@gmail.com
Hi Phil,

Could you please advise whether my code below is correct for running an unconditional, unidimensional, 3-level PCM with school ID as the grouping indicator?


group <- mydata$SCHOOLID

## To prepare the dataframe to include covariates later
lr_eqn <- as.data.frame(group)
rmod1b <- mixedmirt(dat, covdata = lr_eqn, model = 1, fixed = ~ 0, random = ~ 1 | group)


With a dataset of about 5000 students taking 70 items and quite a number of missing items, it took me about 4 hours to run 2000 iterations and gave out:

Stage 3 = 2000, LL = -303992.9, AR(0.50; 2.45) = [0.32; 0.04], gam = 0.0006, Max-Change = 0.0513
MHRM terminated after 2000 iterations.

and


RANDOM EFFECT COVARIANCE(S):
Correlations on upper diagonal

$Theta
    F1
F1 248

$group
          COV_group
COV_group    0.0902


The level-2 variance is definitely too big; could this be due to the missingness, so that the model calibration couldn't converge?

Looking forward to hearing from you.

many thanks in advance,

Diah


Phil Chalmers

Mar 15, 2016, 9:48:48 AM
to Diah Wihardini, mirt-package
On Tue, Mar 15, 2016 at 12:37 AM, Diah Wihardini <diah.wi...@gmail.com> wrote:
Hi Phil,

Could you please advise whether my code below is correct for running an unconditional, unidimensional, 3-level PCM with school ID as the grouping indicator?


group <- mydata$SCHOOLID

## To prepare the dataframe to include covariates later
lr_eqn <- as.data.frame(group)
rmod1b <- mixedmirt(dat, covdata = lr_eqn, model = 1, fixed = ~ 0, random = ~ 1 | group)


Does group appear as a variable in lr_eqn? It's hard to tell where group is coming from here due to the scoping. Otherwise it looks fine for the most part.

That being said, if you have a lot of groups then this should run faster:

rmod1b <- mixedmirt(dat, covdata = lr_eqn, model = 1, fixed = ~ 0, lr.random = ~ 1 | group)

The lr.random terms are really for multilevel models that decompose the ability terms, so this is asymptotically equivalent to the above but may be faster computationally.




With a dataset of about 5000 students taking 70 items and quite a number of missing items, it took me about 4 hours to run 2000 iterations and gave out:

Stage 3 = 2000, LL = -303992.9, AR(0.50; 2.45) = [0.32; 0.04], gam = 0.0006, Max-Change = 0.0513
MHRM terminated after 2000 iterations.

and


RANDOM EFFECT COVARIANCE(S):
Correlations on upper diagonal

$Theta
    F1
F1 248

$group
          COV_group
COV_group    0.0902


The level-2 variance is definitely too big; could this be due to the missingness, so that the model calibration couldn't converge?

Hmmm, that's a bit too large for my liking. The model is either not empirically identified, or something else is up coding-wise. Try running the other syntax above first, and if the problem persists I'll take a closer look.

Phil
 


Diah Wihardini

Mar 15, 2016, 11:13:17 AM
to Phil Chalmers, mirt-package
Many thanks for the quick reply, Phil. 

Yes, group (209 schools) appears as a variable in lr_eqn, with values ranging from 1 to 209. It's PISA data, so about 30% of the data is missing here and there. I've tried to use lr.random, but an error message appears: "Error: lr.random input not yet supported". I'm using mirt_1.15.

really appreciate your response,

Diah

Phil Chalmers

Mar 15, 2016, 11:14:36 AM
to Diah Wihardini, mirt-package
Update to 1.16, it should be available now. Cheers.

Phil

Diah Wihardini

Mar 15, 2016, 12:56:19 PM
to Phil Chalmers, mirt-package
Hi Phil,

Great; I just updated the package and ran my model. It did get to Stage 3 a lot faster, but the LL kept fluctuating between about -302500 and -302800 with a max-change around 0.05. It's been running for about 1 hour now.

> rmod1b <- mixedmirt(dat,covdata = lr_eqn, model = 1, fixed = ~0, lr.random = ~ 1 | group)
Stage 3 = 1016, LL = -302726.2, AR(0.50; 0.00) = [0.32; 0.26], gam = 0.0010, Max-Change = 0.0569

The data in "dat" is a data frame with 5500-ish rows and 70 columns; missing values are stored as NA.

Please kindly advise what the problem might be. I've run the model with only 2 levels (items nested in students) in mirt, and the results agree very closely with those of ConQuest.

many thanks,

Diah

Phil Chalmers

Mar 15, 2016, 6:54:35 PM
to Diah Wihardini, mirt-package
The likelihood is probably not well defined, so there's no strong unique solution but several equally good ones. I've played around with the PISA datasets before and I know they are pretty messy in terms of fitting models (and seem to be pretty riddled with multidimensionality). 

As far as I can tell everything looks to be in order for PCMs when the data has well defined components:

set.seed(1)
N <- 5000
nitems <- 50
a <- matrix(rep(1,nitems),nitems,1)
d <- cbind(0, matrix(rnorm(nitems*3), nitems))
cluster <- 200
random_intercept <- rnorm(cluster, 0, .5)
Theta <- numeric()
for (i in 1:cluster)
    Theta <- c(Theta, rnorm(N/cluster, 0, 1) + random_intercept[i])

group <- factor(rep(paste0('G', 1:cluster), each = N/cluster))
covdata <- data.frame(group)
dat <- simdata(a, d, N, itemtype = rep('gpcm', nitems), Theta = matrix(Theta))
dat[sample(1:length(dat), floor(length(dat) * .3))] <- NA  # knock out ~30% of responses at random

# null model
# mirtCluster() # to compute LL faster, if your RAM can manage it
mod1 <- mixedmirt(dat, covdata, 1, random = ~ 1|group, SE=FALSE, draws = 1)
summary(mod1)
coef(mod1, simplify=TRUE)

Phil

Phil Chalmers

Mar 15, 2016, 7:25:00 PM
to Diah Wihardini, mirt-package
Sorry, change random = ~ 1|group to lr.random = ~ 1|group to save yourself some time in the previous example. Cheers.

Phil

Diah Wihardini

Mar 15, 2016, 9:25:18 PM
to Phil Chalmers, mirt-package
Hi Phil,

Thanks so much for your reply. My comments are written after ***

On Tue, Mar 15, 2016 at 3:53 PM, Phil Chalmers <rphilip....@gmail.com> wrote:
The likelihood is probably not well defined, so there's no strong unique solution but several equally good ones. I've played around with the PISA datasets before and I know they are pretty messy in terms of fitting models (and seem to be pretty riddled with multidimensionality). 

*** Yes, I just read your 2015 paper mentioning that the school-level residual variance for the PISA 2003 math was only 0.051 for the unidimensional EMEIRT, which is pretty low. My dataset consists of Likert-scale non-cognitive items, which are multidimensional by design. Maybe I should go directly to the multilevel multidimensional model, then, to see whether it is better identified?


As far as I can tell everything looks to be in order for PCMs when the data has well defined components:

set.seed(1)
N <- 5000
nitems <- 50
a <- matrix(rep(1,nitems),nitems,1)
d <- cbind(0, matrix(rnorm(nitems*3), nitems))
cluster <- 200
random_intercept <- rnorm(cluster, 0, .5)
Theta <- numeric()
for (i in 1:cluster)
    Theta <- c(Theta, rnorm(N/cluster, 0, 1) + random_intercept[i])

group <- factor(rep(paste0('G', 1:cluster), each = N/cluster))
covdata <- data.frame(group)
dat <- simdata(a, d, N, itemtype = rep('gpcm', nitems), Theta = matrix(Theta))
dat[sample(1:length(dat), floor(length(dat) * .3))] <- NA  # knock out ~30% of responses at random

*** I don't quite understand the purpose of the above code. Please kindly advise what to do with it.

# null model
# mirtCluster() # to compute LL faster, if your RAM can manage it
mod1 <- mixedmirt(dat, covdata, 1, random = ~ 1|group, SE=FALSE, draws = 1)
summary(mod1)
coef(mod1, simplify=TRUE)

*** Yes, I have included mirtCluster() as I'm running it on my Yosemite-based MacBook Air. So,
(1) SE=FALSE is to suppress the time-consuming calculation of the SE information matrix, right?
(2) why do we only need to take 1 Monte Carlo draw with "draws = 1" when the default is 5000?
(3) After about 3.5 hours this morning (the information matrix took about half of the time), I got:

$Theta
    F1
F1 207

--------------
RANDOM EFFECT COVARIANCE(S):
Correlations on upper diagonal

$group
          COV_group
COV_group      0.44

or,

$GroupPars
        MEAN_1  COV_11
par          0 207.408
CI_2.5      NA 206.166
CI_97.5     NA 208.651

$group
        COV_group_group
par                0.44
CI_2.5              NaN
CI_97.5             NaN

(4) I'm still running the revised code as you gave it, with SE = FALSE and draws = 1. I'll let you know how it turns out.


Many thanks,

Diah

Diah Wihardini

Mar 16, 2016, 4:41:00 AM
to Phil Chalmers, mirt-package
Hi Phil,

I see now what you meant by those simulation codes: they represent what my dataset might look like. I ran them, and they gave a reasonable result with the lr.random command in a not-so-long running time. I guess it's my dataset's complex structure, then, that has made mixedmirt not work as expected.

thanks,

Diah



Diah Wihardini

Mar 16, 2016, 4:45:57 PM
to Phil Chalmers, mirt-package
Hi Phil,

I finally tried to do a 3-level and 3-dimensional model on my dataset, and did the following:

> dat <- mydata[1:70]
> group <- mydata$SCHOOLID
> lr_eqn <- as.data.frame(group)
> model <- mirt.model('
+                     F1 = 1-30
+                     F2 = 36-48
+                     F3 = 31-35, 49-70
+                     COV = F1*F2*F3,F1*F1, F2*F2, F3*F3')
> rmod3b <- mixedmirt(dat,covdata = lr_eqn, model, fixed = ~0, lr.random = ~ 1 |group)

But then it spat out an error message like this:

Stage 1 = 99, LL = -290900.7, AR(0.40) = [0.16], Max-Change = 0.0045

Error in draw.thetas(theta0 = gtheta0[[1L]], pars = pars[[1L]], fulldata = Data$fulldata[[1L]],  : 
  MH sampler failed. Model is likely unstable or may need better starting valuesFALSE


Should I then just try to feed in initial parameters obtained from the 2-level model to make it work? Please advise.

Many thanks,

Diah

Phil Chalmers

Mar 19, 2016, 9:26:42 AM
to mirt-package, rphilip....@gmail.com
Could you provide a reproducible case? I still think this dataset is too unstable to estimate these IRT models, but perhaps I could try to make the algorithms more robust.
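In the meantime, one way to feed in starting values from a simpler fit is sketched below (untested on this dataset; mod2values() extracts a fitted model's parameter table, and mixedmirt() accepts an edited table through its pars argument; 'mod2lvl' is a hypothetical 2-level fit):

```r
# Sketch: seed a 3-level run with item parameters from a 2-level fit.
# 'mod2lvl' is a hypothetical 2-level model fitted earlier with mirt();
# 'dat', 'lr_eqn', 'model', and 'group' are as defined earlier in the thread.
sv <- mod2values(mod2lvl)   # parameter table of the 2-level fit

# Pull the default starting-value table for the 3-level model ...
start <- mixedmirt(dat, covdata = lr_eqn, model, fixed = ~ 0,
                   lr.random = ~ 1 | group, pars = 'values')
# ... copy matching item-parameter estimates from sv into start$value,
# then refit with the edited table:
rmod3b <- mixedmirt(dat, covdata = lr_eqn, model, fixed = ~ 0,
                    lr.random = ~ 1 | group, pars = start)
```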


Phil


Diah Wihardini

Mar 19, 2016, 11:40:34 AM
to Phil Chalmers, mirt-package
Hi Phil,

Many thanks for your response.

Yes, I have a cleaner dataset with no missing values, obtained from Stata (as attached). I have compared the mirt results for the 2-level PCM for both the dichotomous cognitive (q1-q8) and polytomous attitudinal (att1-att5) items with those of Stata (using gsem), and the parameters are very comparable. But the results differed between mixedmirt and gsem for the 3-level PCM for the cognitive items.

The 3-level PCM for the attitudinal items also did not converge in mixedmirt, giving the same magnitude for the level-2 variance as with my PISA dataset, i.e. Cov_11 = 200-ish. However, when I used the graded-response type for the attitudinal items, the 3-level model converged pretty quickly. I didn't compare those results with Stata, as I don't quite know how to fit a GRM with gsem.

Then, I used 'graded' for my polytomous PISA data and it worked well, converging quickly.
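For the record, that switch is just the itemtype argument; below is a sketch of such a call, assuming the dat, lr_eqn, model, and group objects defined earlier in the thread (illustrative only, not her exact code):

```r
# Sketch: the same 3-level, 3-dimensional setup, but with graded-response
# items instead of partial credit; a single itemtype value is recycled
# across all items.
rmod_grm <- mixedmirt(dat, covdata = lr_eqn, model, fixed = ~ 0,
                      lr.random = ~ 1 | group, itemtype = 'graded')
coef(rmod_grm, simplify = TRUE)
```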

Looking forward to hearing from you.

many thanks,

Diah





gsem_cfa_trial.R
gsem_dataprep.do
gsem_cfa.dta

Diah Wihardini

May 25, 2016, 8:25:08 PM
to Phil Chalmers, mirt-package
Hi Phil,

Hope all is well.

FYI, my 3-level unidimensional model did finally converge and output reasonable results after I put in a particular set of starting values (obtained from fitting a 2-level model with the groups assigned as fixed effects in ConQuest). Yay!

Now, I'm trying to fit in a 3-level 3-dimensional model using initial parameter values obtained from fitting a 2-level 3-dimensional model in mirt. However, I got an error message "Error in draw.thetas(theta0 = gtheta0[[1L]], pars = pars[[1L]], fulldata = Data$fulldata[[1L]],  : 
  MH sampler failed. Model is likely unstable or may need better starting valuesFALSE".

Could you please advise what this means? Maybe I need to use a similar approach to get an acceptable set of starting values for mixedmirt to work.

thanks in advance,

Diah

Phil Chalmers

May 26, 2016, 12:12:24 PM
to Diah Wihardini, mirt-package
Hi Diah,

Glad to hear that you've had some success; starting values can be a real pain when estimating complex models.

Could you provide a script that reproduces the problem using the previously attached data? This problem has come up before and generally relates to instability in the covariance matrix when drawing values stochastically. I'm still not sure of the best way to deal with this without experimenting a bit, but at least I can make the error message a bit nicer. Cheers.

Phil