Definition of FIML as implemented by lavaan

Robert Dodier

Mar 24, 2026, 5:53:52 PM
to lavaan
I am looking at different ways to handle missing data in parameter estimation and I have encountered FIML as one of the possibilities. Before we go on, I'll mention that I have experience with Bayesian inference and also maximum likelihood estimation and related topics, so I am looking ideally for a simple statement about FIML in terms of likelihood, conditional and marginal distributions, etc.

How exactly is FIML defined, as implemented in lavaan or any other package? I haven't been able to find a technical statement of that. The closest I have seen is a video (https://www.youtube.com/watch?v=6CQ526G8rOk) which says that to calculate the FIML one simply omits the terms for the missing variables in each case. Unfortunately that doesn't make a lot of sense to me.

It appears that FIML is associated with SEM but if there is a statement of FIML in more general terms, I would be interested to hear it.

For what it's worth, from a Bayesian point of view, the likelihood function taking cases with missing variables into account would be something like L(parameters) = \int p(Y | present(X), missing(X), parameters) p(missing(X) | present(X)) d(missing(X)) where p(missing(X) | present(X)) is a model of the relationship among all the X variables, such as a joint Gaussian from which a conditional Gaussian for missing(X) given present(X) is derived. I don't suppose FIML is anywhere in the neighborhood of that?

Thank you for your help, I appreciate it very much.

Robert Dodier

Terrence Jorgensen

Mar 24, 2026, 6:19:34 PM
to lavaan
The same multivariate-normal (log-)likelihood function is applied to every case, but for each case the input data are only that case's observed variables, and the input parameters are only the means and (co)variances of those observed variables.  The marginal likelihood is discussed here:


Terrence D. Jorgensen    (he, him, his)
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam
http://www.uva.nl/profile/t.d.jorgensen



Robert Dodier

Mar 24, 2026, 7:33:29 PM
to lavaan
On Tuesday, March 24, 2026 at 3:19:34 PM UTC-7 Terrence Jorgensen wrote:
The same multivariate-normal (log-)likelihood function is applied to every case, but for each case the input data are only that case's observed variables, and the input parameters are only the means and (co)variances of those observed variables.  The marginal likelihood is discussed here:



Thanks, that helps a lot. I see that in fact one really does just exclude the missing variables from calculations; it makes sense now.
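
For anyone who wants to see the casewise calculation spelled out, here is a minimal sketch in Python with NumPy. It is not lavaan's code; the function name `fiml_loglik` and the convention of marking missing values as NaN are assumptions made for illustration. For each case it picks out the observed entries, subsets the mean vector and covariance matrix to match, and adds that case's multivariate-normal log-density. Under joint normality, integrating the missing variables out of the full density leaves exactly this marginal normal density on the observed ones, which is why dropping the missing variables is legitimate.

```python
import numpy as np

def fiml_loglik(data, mu, sigma):
    """Casewise multivariate-normal log-likelihood using observed entries only.

    Hypothetical helper for illustration: missing values are coded as NaN,
    mu is the mean vector and sigma the covariance matrix of all variables.
    """
    data = np.asarray(data, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    total = 0.0
    for row in data:
        obs = ~np.isnan(row)              # which variables this case observed
        k = int(obs.sum())
        if k == 0:
            continue                      # a fully missing case contributes nothing
        resid = row[obs] - mu[obs]        # deviations from the observed-variable means
        sig_oo = sigma[np.ix_(obs, obs)]  # covariance block of the observed variables
        _, logdet = np.linalg.slogdet(sig_oo)
        quad = resid @ np.linalg.solve(sig_oo, resid)
        total += -0.5 * (k * np.log(2 * np.pi) + logdet + quad)
    return total
```

In an SEM setting, `mu` and `sigma` would be the model-implied means and covariances, and the estimates would be the parameter values that maximize this sum.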

Robert Dodier
 