Purposefully under-identified model (inconsistent results from lavaan vs. Mplus)


Yi F.

Jul 5, 2018, 10:54:48 PM
to lavaan
Hello,

I was trying to fit a model that is known, on theoretical grounds, to be empirically under-identified. I fitted the same model to the same data in both lavaan and Mplus. Mplus gave me the error message "THE STANDARD ERRORS OF THE MODEL PARAMETER ESTIMATES COULD NOT BE COMPUTED. THE MODEL MAY NOT BE IDENTIFIED. CHECK YOUR MODEL", which is consistent with my expectation, since the model should be empirically under-identified. In lavaan, however, the model converged quickly and gave me standard error estimates and everything.

I understand that the optimizer is not aware of identification issues, so I coded up the fitting function myself and ran it with nlminb. nlminb also converged, but the Hessian at the solution was not positive definite, so standard errors could not be computed from the observed information.

Now I am really confused about the way lavaan works out this model... Could you please help me understand why this model is estimable in lavaan, without any warning messages?

Thanks so much!

Best,
Yi


mod.mg1 <- '
group: 1

int =~ 1*V1 + 1*V2
slp =~ 0*V1 + 1*V2
int ~~ start(0.292)*int + a*int   # variance
slp ~~ start(0.5)*slp + b*slp
int ~~ z*slp                      # covariance
V1 ~~ r1*V1
V2 ~~ r2*V2                      # residual variance at t2
int ~ c*1                         # intercepts
slp ~ d*1
V1 ~ 0*1
V2 ~ 0*1

group: 2

int =~ 1*V1 + 1*V2
slp =~ 0*V1 + 2*V2
int ~~ start(0.292)*int + a*int   # variance
slp ~~ start(0.5)*slp + b*slp
int ~~ z*slp                      # covariance
V1 ~~ r1*V1
V2 ~~ r3*V2                      # residual variance at t3
int ~ c*1                         # intercepts
slp ~ d*1
V1 ~ 0*1
V2 ~ 0*1
'

fit.mg1 <- lavaan(mod.mg1, cdata.2, group = "g", estimator = "ML",
                  information = "observed",
                  control = list(trace = 0, init_nelder_mead = TRUE),
                  mimic = "Mplus", verbose = TRUE)


Terrence Jorgensen

Jul 9, 2018, 9:42:09 AM
to lavaan
Can you explain why this model is not identified? And could you provide a reproducible example, either by uploading the data you used or by using simulated data, along with the Mplus script you say is equivalent? You can investigate the equivalence of the estimated parameters by comparing the Mplus TECH1 output to the output of lavInspect(fit.mg1, "free").


Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Yi F.

Jul 9, 2018, 2:02:32 PM
to lavaan
Hi Dr. Jorgensen,

Thank you so much for your reply! This model is under-identified because of an indeterminacy among the parameters labeled "b", "r2", and "r3": empirically (or locally), the model has to solve for these 3 parameters with only 2 available pieces of information.
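
To make the indeterminacy concrete: a, z, and r1 are pinned down through V1, so the two group-specific variances of V2 are the only information left for b, r2, and r3. The following base-R sketch (with made-up parameter values, not estimates from the fitted model) shows two different (b, r2, r3) triples that imply identical observed variances:

```r
# Model-implied variance of V2 from the growth model above:
#   group 1 (slope loading 1): Var(V2) = a + 1^2 * b + 2 * 1 * z + r2
#   group 2 (slope loading 2): Var(V2) = a + 2^2 * b + 2 * 2 * z + r3
# Values for a and z are made up here; in the model they are identified via V1.
a <- 0.292
z <- 0.10

implied_var_v2 <- function(b, resid, loading) {
  a + loading^2 * b + 2 * loading * z + resid
}

# Two different parameter triples (b, r2, r3) ...
sol1 <- list(b = 0.50, r2 = 0.30, r3 = 0.20)
sol2 <- list(b = 0.40, r2 = 0.40, r3 = 0.60)

# ... reproduce exactly the same model-implied variances in both groups:
v2_g1 <- c(implied_var_v2(sol1$b, sol1$r2, 1), implied_var_v2(sol2$b, sol2$r2, 1))
v2_g2 <- c(implied_var_v2(sol1$b, sol1$r3, 2), implied_var_v2(sol2$b, sol2$r3, 2))
v2_g1  # 1.292 1.292
v2_g2  # 2.892 2.892
```

Because any shift in b can be absorbed by r2 and r3, the likelihood is flat along that direction, which is presumably why lavaan and Mplus can land on different "solutions" with identical fit.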

I have compared the Mplus TECH1 output with the lavInspect output as you suggested, and they appear to be equivalent. I am attaching my simulated data as well as the Mplus syntax below.

Best,
Yi

TITLE: MG approach
DATA: FILE IS data2.dat;
VARIABLE: NAMES ARE t1 t2 g;
GROUPING IS g (1=miss3 2=miss2);

MODEL:
I by t1@1
t2@1;
S by t1@0
t2@1;
I (vi);
S (vs);
I WITH S (c);
[I] (i);
[S] (s);
[t1-t2@0];
t1 (r1);
t2;

MODEL miss3:

I by t1@1
t2@1;
S by t1@0
t2@1;
I (vi);
S (vs);
I WITH S (c);
[I] (i);
[S] (s);
[t1-t2@0];
t1 (r1);
t2;

MODEL miss2:

I by t1@1
t2@1;
S by t1@0
t2@2;
I WITH S (c);
I (vi);
S (vs);
[I] (i);
[S] (s);
[t1-t2@0];
t1 (r1);
t2;

OUTPUT:
SAMPSTAT TECH1;



[Attachment: data2.dat]

Yves Rosseel

Jul 12, 2018, 5:44:26 AM
to lav...@googlegroups.com, yife...@umd.edu
> Thank you so much for your reply! This model is under-identified as
> there is indeterminacy issue with the parameters named "b", "r2", and
> "r3" in the model. Empirically (or locally) the model needs to find a
> solution for these 3 parameters with only 2 available pieces of information.

Agreed. Typically, the optimizer will find 'a' solution, but there are
in fact many possible solutions; note that Mplus found another solution,
with different values for those unidentified parameters. The optimizer,
as you correctly noted, has no idea and will not complain, as long as
it believes that a (local) minimum has been found.

The standard errors should (ideally) reveal the non-identification. If
there were no equality constraints in your model, it is very likely that
you would get a non-positive-definite information matrix (if the model
is not identified), and lavaan would not compute standard errors (and
would give a warning). If you have equality constraints, things are a
bit more complicated. Let me first show how standard errors are computed
in lavaan in your case:

fit.mg1 <- lavaan(mod.mg1, cdata.2, group = "g", estimator = "ML",
                  information = "observed",
                  control = list(trace = 0, init_nelder_mead = TRUE),
                  mimic = "Mplus", verbose = TRUE)

information <- lavTech(fit.mg1, "information.observed")
lavmodel <- fit.mg1@Model
H <- lavmodel@con.jac
lambda <- lavmodel@con.lambda
H0  <- matrix(0, nrow(H), nrow(H))
H10 <- matrix(0, ncol(information), nrow(H))
DL  <- 2 * diag(lambda, nrow(H), nrow(H))
E3 <- rbind(cbind(information, H10, t(H)),
            cbind(t(H10), DL, H0),
            cbind(H, H0, H0))
information <- E3

information.auginv <- MASS::ginv(information,
                                 tol = .Machine$double.eps^(3/4))
npar <- lavmodel@nx.free
information.inv <- information.auginv[1:npar, 1:npar, drop = FALSE]

# SEs
round(sqrt(diag(information.inv / nobs(fit.mg1))), 3)

The chunk of code above is what lavaan uses to compute the 'inverse' of
an 'augmented' information matrix. The augmentation has to do with the
equality constraints. Since equality constraints are often needed to
ensure identification, it is quite natural for the augmented information
matrix to be non-positive definite. That is why we use a generalized
inverse (the MASS::ginv() call), and that is why you got standard errors
in lavaan.
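
The role of the generalized inverse is easy to see on a toy rank-deficient matrix (purely illustrative, not the actual augmented information matrix from the model):

```r
library(MASS)

# A singular "information-like" matrix: row 3 = row 1 + row 2
A <- rbind(c(2, 1, 3),
           c(1, 2, 3),
           c(3, 3, 6))
qr(A)$rank                       # 2, not 3: rank deficient

# An ordinary inverse does not exist:
inv_try <- try(solve(A), silent = TRUE)
inherits(inv_try, "try-error")   # TRUE

# The Moore-Penrose generalized inverse always exists, and
# satisfies A %*% Aplus %*% A = A:
Aplus <- MASS::ginv(A)
max(abs(A %*% Aplus %*% A - A))  # effectively 0
```

So even when the augmented matrix is singular, ginv() returns finite values, and taking square roots of the diagonal produces "standard errors" without any error being raised along the way.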

I have not found a bullet-proof way to detect model under-identification
when the model contains equality constraints. But this needs my
attention anyway. Could you open an issue about this on GitHub? Then I
will return to it when I have the time.

Yves.

Yi Feng

Jul 12, 2018, 11:47:54 AM
to lavaan
Hi Dr. Rosseel,

Thank you so much for your kind reply and detailed explanation! This is super helpful. I really appreciate it. 

I'll make sure to open an issue about this on GitHub as well.

Best,
Yi

corrie....@griffithuni.edu.au

Sep 25, 2018, 10:55:09 PM
to lavaan
Hi, just jumping in here with a related question. I am getting a repeated warning that a local solution cannot be found while bootstrapping a mediation model. I do get a solution in time, I have SEs, and the model fit is OK.

I get this message: "lavaan WARNING: only 4982 bootstrap draws were successful". Given I specified 5000, I don't know that it's too much of an issue if the model is otherwise OK? If this is not the right place to ask, could you please direct me to where I may get an answer?

Terrence Jorgensen

Sep 28, 2018, 5:43:54 AM
to lavaan
> I get this message: "lavaan WARNING: only 4982 bootstrap draws were successful". Given I specified 5000, I don't know that it's too much of an issue if the model is otherwise OK?

That sounds reasonable.  But feel free to ask on SEMNET or CrossValidated, too.
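
For intuition about why ~18 lost draws out of 5000 is usually negligible, here is a toy percentile bootstrap in base R (of a simple mean, not a mediation model), where 18 draws are discarded at random to mimic the failed draws:

```r
set.seed(2018)
x <- rnorm(300, mean = 1, sd = 2)

# 5000 bootstrap replicates of the mean
boot_means <- replicate(5000, mean(sample(x, replace = TRUE)))

# Pretend 18 draws failed to converge and were dropped
failed  <- sample(5000, 18)
boot_ok <- boot_means[-failed]

ci_full <- quantile(boot_means, c(0.025, 0.975))
ci_ok   <- quantile(boot_ok,   c(0.025, 0.975))

# The two percentile intervals are virtually identical
max(abs(ci_full - ci_ok))
```

The dropped draws would only be worrying if they failed systematically, e.g. if nonconvergence were concentrated in resamples with extreme parameter estimates.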

Terrence D. Jorgensen
Postdoctoral Researcher, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam