Full information maximum likelihood

Guido

Sep 11, 2020, 8:33:51 AM
to lavaan
Hello all,

I need some help with the following:

I have performed a CFA with Lavaan using the following code:
fit_extended_ML <- cfa(CFA_model_P_extended, data=P_items_extended, orthogonal=TRUE, estimator="MLMV")

If I am correct, Lavaan uses the correlation matrix for this CFA. But I would like to use the item responses with the full information maximum likelihood. How can I do that?
 
Could anyone point me in the right direction?

Thanks in advance.

Cheers,

Guido

Patrick (Malone Quantitative)

Sep 11, 2020, 11:02:23 AM
to lav...@googlegroups.com
Guido,

If you direct lavaan to a dataframe, it uses the raw data and constructs the (missing-adjusted) variance/covariance (not correlation) matrix internally.

Pat

--
You received this message because you are subscribed to the Google Groups "lavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lavaan+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/lavaan/cfb84949-d957-43dd-a1e6-76de76ddbe7bn%40googlegroups.com.


--
Patrick S. Malone, Ph.D., Malone Quantitative
NEW Service Models: http://malonequantitative.com

He/Him/His

Yves Rosseel

Sep 12, 2020, 3:41:03 AM
to lav...@googlegroups.com
Are you looking for the missing = "ml" option? This will switch to
'fiml', in case you have missing data (and you are willing to assume MAR.)
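
For readers who do have missing values, here is a minimal sketch of the missing = "ml" option. The data set, the randomly deleted cells, and the two-factor model are assumptions chosen only for illustration; none of them come from the original question.

```r
library(lavaan)

# Illustrative data (assumption): the built-in Holzinger-Swineford items,
# with some values deleted completely at random.
set.seed(1)
HS <- HolzingerSwineford1939[, paste0("x", 1:6)]
HS$x1[sample(nrow(HS), 30)] <- NA
HS$x5[sample(nrow(HS), 30)] <- NA

model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
'

# Default handling: listwise deletion, so incomplete rows are dropped.
fit_listwise <- cfa(model, data = HS)

# missing = "ml" (also accepted as "fiml"): every row contributes
# whatever values it has to the casewise likelihood.
fit_fiml <- cfa(model, data = HS, missing = "ml")

# Number of cases actually used by each fit.
n_listwise <- lavInspect(fit_listwise, "nobs")
n_fiml     <- lavInspect(fit_fiml, "nobs")
```

Comparing n_listwise with n_fiml shows how many cases listwise deletion discards and FIML retains.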

When data is complete, the mean vector and sample variance/covariance
matrix are sufficient statistics when the estimator is from the ML
family. In that sense, ML is always 'full information'.
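
The sufficiency argument can be checked numerically in base R. The sketch below is only an illustration with simulated data and arbitrary candidate parameters (all assumptions), not lavaan's internal code: it evaluates the multivariate normal log-likelihood once case by case and once from the mean vector and the covariance matrix alone.

```r
set.seed(123)
n <- 200
p <- 3

# Simulated complete data (assumption for illustration only).
R <- matrix(c(1, .5, .3,
              .5, 1, .4,
              .3, .4, 1), p, p)
X <- matrix(rnorm(n * p), n, p) %*% chol(R)

# Arbitrary candidate parameter values at which to evaluate the likelihood.
mu    <- c(0.1, -0.2, 0)
Sigma <- matrix(c(1.2, .4, .2,
                  .4, .9, .3,
                  .2, .3, 1.1), p, p)

Sinv   <- solve(Sigma)
logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)

# (a) Casewise: sum the log-density over the raw rows.
ll_casewise <- sum(apply(X, 1, function(x) {
  d <- x - mu
  -0.5 * (p * log(2 * pi) + logdet + drop(t(d) %*% Sinv %*% d))
}))

# (b) Summary statistics only: sample mean and covariance with divisor n.
xbar <- colMeans(X)
S    <- cov(X) * (n - 1) / n
d    <- xbar - mu
ll_summary <- -0.5 * n * (p * log(2 * pi) + logdet +
                          sum(diag(Sinv %*% S)) +
                          drop(t(d) %*% Sinv %*% d))

isTRUE(all.equal(ll_casewise, ll_summary))
```

Because the two quantities agree for any candidate mu and Sigma, maximizing either one yields the same estimates.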

Yves.

Guido

Sep 13, 2020, 8:50:39 AM
to lavaan
So, Pat, does this mean that because I have specified my dataframe (data=P_items_extended), lavaan automatically uses the item responses with full information maximum likelihood? In that case I wrongly assumed it used the correlation matrix as input?

G. 

Guido

Sep 13, 2020, 8:53:51 AM
to lavaan
Yves, no, I am not looking for the missing = "ml" option. My data is complete and I want to make sure it uses the item responses as input and not the correlation matrix of the dataframe. And if I use another estimator like DWLS? Is it still full information?

Guido

Patrick (Malone Quantitative)

Sep 13, 2020, 9:44:33 AM
to lav...@googlegroups.com
If your data has no missing, as you said to Yves, then the variance/covariance matrix lavaan derives from the raw data will be identical to one you create yourself.

I'm no longer sure what you're asking, since DWLS isn't ML, so I'll leave the rest to Yves.
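
The equivalence between raw-data input and a user-supplied covariance matrix can be checked with a small sketch. It uses lavaan's built-in HolzingerSwineford1939 data and the standard three-factor tutorial model as stand-ins (assumptions; the original poster's data and model are not available here).

```r
library(lavaan)

HS <- HolzingerSwineford1939[, paste0("x", 1:9)]

model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

# Fit 1: hand lavaan the raw, complete data.
fit_raw <- cfa(model, data = HS)

# Fit 2: hand lavaan only the covariance matrix and the sample size.
# cov() divides by n - 1; to my understanding lavaan rescales a supplied
# matrix by (n - 1)/n by default under normal-theory ML, so both fits
# should work from the same matrix.
fit_cov <- cfa(model, sample.cov = cov(HS), sample.nobs = nrow(HS))

# Largest absolute difference between the two sets of estimates.
max_diff <- max(abs(coef(fit_raw) - coef(fit_cov)))
max_diff
```

Robust variants such as MLMV still need the raw data for their corrections, as far as I understand, but the point estimates come from the same ML fit.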

Yves Rosseel

Sep 13, 2020, 9:51:04 AM
to lav...@googlegroups.com
On 9/13/20 2:53 PM, Guido wrote:
> Yves, no, I am not looking for the missing = "ml" option. My data is
> complete and I want to make sure it uses the item responses as input and
> not the correlation matrix of the dataframe.

If you use ML, and the data is complete, then lavaan will use the sample
covariance matrix of your data.frame. However, if one would use casewise
estimation instead (using the raw data), the results would be identical.

> And if I use another
> estimator like DWLS? Is it still full information?

No. DWLS is a limited-information estimator. Only useful for categorical
data.
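
To make "limited information" concrete, here is a hedged sketch with dichotomized Holzinger-Swineford items (an illustrative assumption, not the poster's data). DWLS works from thresholds and tetrachoric/polychoric correlations, that is, from univariate and bivariate summaries rather than from the full response patterns.

```r
library(lavaan)

# Dichotomize three items into two equal-width intervals (codes 1 and 2).
HS3 <- HolzingerSwineford1939[, c("x1", "x2", "x3")]
HSbinary <- as.data.frame(lapply(HS3, cut, 2, labels = FALSE))

# The bivariate summaries for ordered items: tetrachoric correlations.
R_tetra <- lavCor(HSbinary, ordered = names(HSbinary))

# One-factor model fitted with the DWLS estimator.
fit_dwls <- cfa(' visual =~ x1 + x2 + x3 ', data = HSbinary,
                ordered = names(HSbinary), estimator = "DWLS")

R_tetra
coef(fit_dwls)
```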

Yves.

Terrence Jorgensen

Sep 14, 2020, 9:46:21 AM
to lavaan
> I want to make sure it uses the item responses as input

You mean like Mplus offers?  That is marginal ML estimation, and you can request it in lavaan with estimator = "MML"

library(lavaan)
HS9 <- HolzingerSwineford1939[, c("x1","x2","x3","x4","x5",
                                  "x6","x7","x8","x9")]
# dichotomize each item into two equal-width intervals (codes 1 and 2)
HSbinary <- as.data.frame( lapply(HS9, cut, 2, labels=FALSE) )
HS.model <- ' visual  =~ x1 + x2 + x3 '
fit <- cfa(HS.model, data=HSbinary, ordered=names(HSbinary), estimator = "MML")

But it is an experimental feature, and you will see that it runs veeeeeery sloooooowly (numerical integration) and might not even converge for models with more than 1 or 2 constructs (dimensions across which to integrate).

If you want an ML estimator for ordinal data, consider the much more efficient pairwise ML estimator, estimator = "PML"

https://doi.org/10.1016/j.csda.2012.04.010

As the name suggests, it is still limited information, but each pairwise likelihood is calculated using the full information available for that pair of variables.  Read the paper for details.
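
A small sketch of the PML request, again on three dichotomized Holzinger-Swineford items (an assumption chosen so that it runs quickly; behavior on large item sets may differ considerably):

```r
library(lavaan)

# Dichotomize three items into two equal-width intervals (codes 1 and 2).
HS3 <- HolzingerSwineford1939[, c("x1", "x2", "x3")]
HSbinary <- as.data.frame(lapply(HS3, cut, 2, labels = FALSE))

# Pairwise maximum likelihood: the objective sums the log-likelihoods
# of all pairs of ordered items.
fit_pml <- cfa(' visual =~ x1 + x2 + x3 ', data = HSbinary,
               ordered = names(HSbinary), estimator = "PML")

pml_converged <- lavInspect(fit_pml, "converged")
pml_estimates <- coef(fit_pml)
pml_estimates
```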

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Guido

Sep 17, 2020, 3:18:44 AM
to lavaan
Thank you soo much for this. I will look into it.

Yves Rosseel

Sep 17, 2020, 3:44:41 AM
to lav...@googlegroups.com
Another alternative (for CFA type models), is to use the mirt package,
which provides a fairly efficient implementation of marginal ML.
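
A minimal sketch of the mirt route, using simulated binary responses (the item parameters, sample size, and item names below are made up for illustration). For ordinal items, itemtype = "graded" would be the analogous choice.

```r
library(mirt)

# Simulate responses from a unidimensional 2PL model (illustrative values).
set.seed(42)
n_persons <- 500
a <- c(1.2, 0.8, 1.5, 1.0, 0.9, 1.3)    # slopes
d <- c(0.0, -0.5, 0.5, 1.0, -1.0, 0.2)  # intercepts
theta <- rnorm(n_persons)

# Response probabilities: plogis(a_j * theta_i + d_j) for person i, item j.
prob <- plogis(outer(theta, a) +
               matrix(d, n_persons, length(d), byrow = TRUE))
resp <- matrix(rbinom(length(prob), 1, prob), n_persons, length(a))
colnames(resp) <- paste0("item", seq_along(a))

# One-factor model estimated by marginal ML on the full response patterns.
mod <- mirt(resp, model = 1, itemtype = "2PL", verbose = FALSE)

# Slope (a1) and intercept (d) estimates per item.
est <- coef(mod, simplify = TRUE)$items
est
```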

Yves.

Guido

Sep 17, 2020, 7:01:03 AM
to lavaan
Slow is an understatement. Normally a CFA with other estimators takes only about 20-30 sec. Now it is still computing after 30 minutes! Fans blowing like crazy and laptop heating up to very high levels (with i7 quad core and 16GB RAM). I had to shut it off because I don't want my laptop overheating... The problem arises with both MML and PML.
I used this code for a one-dimensional model with 52 ordinal items as variables:

#MML
fit_extended_ML <- cfa(CFA_model_P_extended, data=P_items_extended, ordered=names(P_items_extended), estimator="MML")
fitMeasures_extended_ML <- fitMeasures(fit_extended_ML, fit.measures = c("cfi","tli","rmsea","srmr"))
fitMeasures_extended_ML

#PML
fit_extended_ML <- cfa(CFA_model_P_extended, data=P_items_extended, ordered=names(P_items_extended), estimator="PML")
fitMeasures_extended_ML <- fitMeasures(fit_extended_ML, fit.measures = c("cfi","tli","rmsea","srmr"))
fitMeasures_extended_ML


Guido

Sep 17, 2020, 7:07:33 AM
to lavaan
After 40 minutes:

<0 x 0 matrix>
<0 x 0 matrix>
<0 x 0 matrix>
<0 x 0 matrix>
<0 x 0 matrix>

Warning messages:
1: In lav_model_estimate(lavmodel = lavmodel, lavpartable = lavpartable,  :
  lavaan WARNING: the optimizer warns that a solution has NOT been found!
2: In lav_model_lik_mml(lavmodel = lavmodel, THETA = THETA, TH = TH,  :
  lavaan WARNING: --- VETAx not positive definite
3: In lav_model_gradient_mml(lavmodel = lavmodel, GLIST = GLIST, THETA = THETA[[g]],  :
  lavaan WARNING: --- VETAx not positive definite
4: In lav_model_lik_mml(lavmodel = lavmodel, THETA = THETA, TH = TH,  :
  lavaan WARNING: --- VETAx not positive definite
5: In lav_model_lik_mml(lavmodel = lavmodel, THETA = THETA, TH = TH,  :
  lavaan WARNING: --- VETAx not positive definite
6: In lav_model_gradient_mml(lavmodel = lavmodel, GLIST = GLIST, THETA = THETA[[g]],  :
  lavaan WARNING: --- VETAx not positive definite

> fit_extended_ML
lavaan 0.6-6 did NOT end normally after 353 iterations
** WARNING ** Estimates below are most likely unreliable

  Estimator                                        MML
  Optimization method                           NLMINB
  Number of free parameters                        260

  Number of observations                          1022

Error in lav_fit_measures(object = object, fit.measures = fit.measures,  :
  lavaan ERROR: fit measures not available if model did not converge


Mauricio Garnier-Villarreal

Sep 17, 2020, 10:13:31 AM
to lavaan
Guido

Can I ask, what is your overall objective with this? Depending on that, we could point you to a better option. As you can see, the options in lavaan for this are limited. I would second Yves in recommending mirt instead; coming from the IRT side, it does use full information and provides model and item fit indices.
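
As a rough sketch of what those fit indices look like in mirt, the example below fits a one-factor 2PL model to simulated binary data (all values are illustrative assumptions) and then requests model-level and item-level fit.

```r
library(mirt)

# Simulated unidimensional binary data (illustrative values only).
set.seed(7)
n_persons <- 600
a <- c(1.1, 0.9, 1.4, 1.0, 0.8, 1.2)    # slopes
d <- c(0.3, -0.4, 0.6, 0.9, -0.8, 0.1)  # intercepts
theta <- rnorm(n_persons)
prob <- plogis(outer(theta, a) +
               matrix(d, n_persons, length(d), byrow = TRUE))
resp <- matrix(rbinom(length(prob), 1, prob), n_persons, length(a))
colnames(resp) <- paste0("item", seq_along(a))

mod <- mirt(resp, model = 1, itemtype = "2PL", verbose = FALSE)

# Model-level fit: the M2 statistic with RMSEA, SRMSR, TLI and CFI.
model_fit <- M2(mod)

# Item-level fit: S-X2 statistics by default.
item_fit <- itemfit(mod)

model_fit
item_fit
```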