logLik in sem output is NA


Galla Placidia

Jan 23, 2016, 11:23:30 AM
to lavaan
I am running a series of path analyses and SEM models with lavaan (sem) and I keep getting NAs for the log-likelihood (logLik), AIC and BIC. However, the model has positive degrees of freedom, the optimization claims to have converged, and I do get non-zero and non-NA estimates for the chi-square statistic, CFI, RMSEA and other fit measures. I selected ML as the method of estimation.

My model has a measurement component for the responses and manifest independent variables, so it's something like this:

# Measurement model
f1 =~ Q1 + Q2 + Q3
f2 =~ Q4 + Q5 + Q6
f3 =~ Q7 + Q8 + Q9

# Regression model
f1 ~ x + y + z
f2 ~ x + y + z
f3 ~ x + y + z



There are actually more manifests in the real model, but this is the gist of it. 
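
For anyone who wants to try this setup, here is a minimal sketch of the model above fitted to simulated data. The data-generating values (loadings, slopes, sample size) and the object names `myData`, `make_factor` and `make_items` are made up for illustration and are not from my real data; the fit-measure names passed to `fitMeasures()` are the ones lavaan uses for the log-likelihood, AIC and BIC.

```r
library(lavaan)

set.seed(123)
n <- 500

# Hypothetical manifest predictors, all numeric
x <- rnorm(n); y <- rnorm(n); z <- rnorm(n)

# Hypothetical latent scores: linear in the predictors plus noise
make_factor <- function() 0.4 * x + 0.3 * y + 0.2 * z + rnorm(n)
f1 <- make_factor(); f2 <- make_factor(); f3 <- make_factor()

# Three noisy indicators per factor
make_items <- function(f) sapply(1:3, function(i) 0.8 * f + rnorm(n, sd = 0.6))
items <- cbind(make_items(f1), make_items(f2), make_items(f3))
colnames(items) <- paste0("Q", 1:9)
myData <- data.frame(items, x, y, z)

model <- '
  # Measurement model
  f1 =~ Q1 + Q2 + Q3
  f2 =~ Q4 + Q5 + Q6
  f3 =~ Q7 + Q8 + Q9

  # Regression model
  f1 ~ x + y + z
  f2 ~ x + y + z
  f3 ~ x + y + z
'

fit <- sem(model, data = myData, estimator = "ML")

# The measures in question: chi-square, df, log-likelihood, AIC, BIC
fm <- unclass(fitMeasures(fit, c("chisq", "df", "logl", "aic", "bic")))
print(fm)
```

With all-numeric input like this, every one of these measures should come back finite, which makes it a useful baseline to compare against.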


I have a couple of problems with the output:

1) I thought that the chi-square test statistic was basically a function of log likelihood at the maximum, so I don't understand why I got chisq but not logLik.
2) In general, I don't understand why fmin (from fit.measures) is different from logLik, since I assumed that logLik was the log likelihood at the maximum.
3) When the regression piece of a sem model consists solely of manifest variables, how does the model get estimated? Is it different from sem models with latent independent variables?


Terrence Jorgensen

Jan 25, 2016, 4:56:47 AM
to lavaan
> 1) I thought that the chi-square test statistic was basically a function of log likelihood at the maximum, so I don't understand why I got chisq but not logLik.

That is unexpected, since you are using MLE.  Without a logLik, AIC and BIC can't be calculated, but if chi-squared is provided, then any fit index based on chi-squared can also be calculated.  Can you provide enough data (and script) to reproduce the problem?

> 2) In general, I don't understand why fmin (from fit.measures) is different from logLik, since I assumed that logLik was the log likelihood at the maximum.

In lavaan, 2 * N * fmin = the chi-squared value (i.e., the likelihood-ratio test statistic; lavaan reports fmin as half the ML discrepancy), which is -2 times the difference between 2 log-likelihoods: the log-likelihood of the data under the target model (H0) and the log-likelihood of the data under the saturated model (H1).  fmin is not itself a log-likelihood or a likelihood ratio, just a shortcut that software uses to calculate the LRT statistic from complete-data summary statistics, rather than calculating and summing the N casewise log-likelihoods for both the H0 and H1 models and then taking the difference of the sums.  I don't know why you aren't getting a logLik for converged models, but fmin is a function of the observed and model-implied summary statistics, so it can be calculated using model parameters, with no need for likelihoods (unless you have missing data).
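
A rough check of these relations, using the HolzingerSwineford1939 example data that ships with lavaan (not your data). As far as I can tell, current lavaan reports fmin as half the ML discrepancy, so the sketch multiplies by 2 * N; the variable names `chisq_from_fmin` and `chisq_from_logl` are just for illustration.

```r
library(lavaan)

model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
fit <- cfa(model, data = HolzingerSwineford1939)

# H0 log-likelihood ("logl"), saturated H1 log-likelihood ("unrestricted.logl"),
# the minimized fit function, the reported chi-square, and the sample size
fm <- unclass(fitMeasures(fit, c("fmin", "chisq", "logl",
                                 "unrestricted.logl", "ntotal")))

# Chi-square rebuilt from the fit function value
chisq_from_fmin <- 2 * fm[["ntotal"]] * fm[["fmin"]]

# Chi-square rebuilt as a likelihood-ratio statistic
chisq_from_logl <- -2 * (fm[["logl"]] - fm[["unrestricted.logl"]])

print(c(reported  = fm[["chisq"]],
        from_fmin = chisq_from_fmin,
        from_logl = chisq_from_logl))
```

All three numbers should agree up to rounding, which is why chisq can be reported even when something goes wrong in the separate logLik calculation.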

> 3) When the regression piece of a sem model consists solely of manifest variables, how does the model get estimated? Is it different from sem models with latent independent variables?

The default is fixed.x = TRUE, so the (co)variances of the exogenous predictors are not estimated; they are fixed to their sample values.  In your example, the multinormal likelihood (given the model parameters) of the 9 endogenous variables would be calculated conditional on fixed values of the 3 exogenous variables (like residual likelihood?).   You can choose to treat the predictors as random variables by setting fixed.x = FALSE, in which case the multinormal likelihood (given the model parameters) of all 12 variables is calculated.  Is logLik missing for either or both of these settings?  Again, a reproducible example is the only way to get to the bottom of it.  And check that you are using the latest version of lavaan (0.5-20):
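
To see what the fixed.x argument changes, something like the following sketch sets it explicitly both ways and compares the fits. It uses the HolzingerSwineford1939 example data with ageyr and sex as stand-in manifest predictors (my choice for illustration, not your variables); the object names are hypothetical.

```r
library(lavaan)

model <- '
  visual =~ x1 + x2 + x3
  visual ~ ageyr + sex
'

# Predictors treated as fixed: their (co)variances are not free parameters
fit_fixed  <- sem(model, data = HolzingerSwineford1939, fixed.x = TRUE)

# Predictors treated as random: their (co)variances are estimated
fit_random <- sem(model, data = HolzingerSwineford1939, fixed.x = FALSE)

measures <- c("npar", "chisq", "df", "logl")
cmp <- rbind(
  fixed  = unclass(fitMeasures(fit_fixed,  measures)),
  random = unclass(fitMeasures(fit_random, measures))
)
print(cmp)
```

The random-x fit should show more free parameters (the predictor variances and covariance), and if logl is NA in one row but not the other, that narrows down where the problem is.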

sessionInfo()
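
If you only want the lavaan version rather than the full session report, a short sketch like this should also work (packageVersion() is in base R's utils package; the name `installed` is just illustrative):

```r
# Look up the installed lavaan version and compare it to 0.5-20
installed <- packageVersion("lavaan")
print(installed)

if (installed < "0.5-20") {
  message("lavaan is older than 0.5-20; consider updating")
}
```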

Terry

Galla Placidia

Jan 25, 2016, 9:13:34 AM
to lavaan
 
Turns out, I was running 0.5-18. When I upgraded to 0.5-20, I got finite log likelihoods.  The other statistics and estimates remained the same. So that's good.
 
 
I played around with the model and the data a bit before doing the upgrade. The problem concerned one of my independent variables, Gender. It is the only categorical variable in the model, so maybe that's the problem.  I have no missing data (incomplete cases were dealt with before running lavaan), and the Gender variable splits fairly evenly between girls and boys. All other variables had been standardized to mean 0.
 
In any case, whatever the problem, it seems to have been resolved between versions 18 and 20. Thanks for the help.
 

yrosseel

Jan 25, 2016, 1:46:53 PM
to lav...@googlegroups.com
On 01/25/2016 03:13 PM, Galla Placidia wrote:
> Turns out, I was running 0.5-18. When I upgraded to 0.5-20, I got finite
> log likelihoods. The other statistics and estimates remained the same.

If I remember correctly, the reason for getting 'NA' was that some
elements (variables) in the data.frame were not of type 'numeric'; you
can see this by typing

varTable(myData)
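
A base-R way to run the same kind of check, sketched on a made-up data.frame (the values and the 0/1 recoding of Gender are assumptions for illustration; choose whatever coding suits your model):

```r
# Toy data: two numeric columns and one factor
myData <- data.frame(
  Q1     = c(0.2, -1.1, 0.7, 0.3),
  x      = c(1.5, 0.4, -0.8, 0.1),
  Gender = factor(c("girl", "boy", "girl", "boy"))
)

# Which columns are numeric? Gender is flagged FALSE here
is_num_before <- sapply(myData, is.numeric)
print(is_num_before)

# Recode the factor as a 0/1 numeric dummy (1 = girl)
myData$Gender <- as.numeric(myData$Gender == "girl")

is_num_after <- sapply(myData, is.numeric)
print(is_num_after)
```

After the recoding every column is numeric, so the data.frame can be passed to sem() without relying on how a given lavaan version handles factors.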

Yves.

Galla Placidia

Jan 25, 2016, 2:04:43 PM
to lav...@googlegroups.com
That would explain it. Only now, it seems to work. Do you numerize factors in the current version?




Annika Otto

Feb 12, 2017, 1:36:21 PM
to lavaan
Thanks, Yves! That revealed the source of my 'NA' as well! :-)