Missing Data in blavaan using Stan

hal...@gmail.com

Mar 30, 2022, 11:57:37 AM

to blavaan

Hi,

I'm trying to wrap my head around how missing data is accounted for in blavaan when using it with Stan. Merkle et al. (2021) (https://www.jstatsoft.org/article/view/v100i06) state blavaan is doing something similar to FIML with lavaan when handling missing values, but I just want to clarify a couple things for myself since I don't think I totally understand how this works.

- Given what Merkle et al. discuss, the posterior estimates for parameters in a blavaan model with missing data (on endogenous variables) are "accounting" for the missingness in a somewhat similar way to what lavaan does with FIML (rather than listwise deleting the data). So, for example, in a blavaan growth model, as long as a person has at least one data point, they are included in the model estimation?

- When I use blavPredict() to predict the missing values, the estimates I obtain for that reflect the posterior distribution of values for only the cells in the data frame with a value of "NA." Is that correct?

- I use brms quite a bit as well, and I'm just curious if anyone could elaborate on how the mi() function works compared to what blavaan is doing with its "full information" approach. Are these the same thing on Stan's backend, since the blavPredict function essentially produces imputations of missing values?

Thanks so much for the input and for all your work on this package!

Garret

Ed Merkle

Mar 30, 2022, 3:02:11 PM

to hal...@gmail.com, blavaan

Garret,

See below.

On Wed, 2022-03-30 at 08:57 -0700, hal...@gmail.com wrote:

> Hi,
>
> - Given what Merkle et al. discuss, the posterior estimates for parameters in a blavaan model with missing data (on endogenous variables) are "accounting" for the missingness in a somewhat similar way to what lavaan does with FIML (rather than listwise deleting the data). So, for example, in a blavaan growth model, as long as a person has at least one data point, they are included in the model estimation?

I think the easiest way to think about it is that blavaan skips over missing observations and uses everything that is observed. So, in general, yes, a person with one data point will be included.
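
To make "skips over missing observations" concrete, here is a minimal base-R sketch of an observed-data (full-information) log-likelihood for a multivariate normal model. It only illustrates the idea and is not blavaan's actual Stan code; the helper names `dmvn_log` and `obs_loglik` and the toy data are hypothetical.

```r
# Log-density of a multivariate normal, written in base R.
dmvn_log <- function(y, mu, Sigma) {
  k <- length(y)
  dev <- y - mu
  logdet <- as.numeric(determinant(Sigma, logarithm = TRUE)$modulus)
  quad <- as.numeric(t(dev) %*% solve(Sigma, dev))
  -0.5 * (k * log(2 * pi) + logdet + quad)
}

# Observed-data log-likelihood: each row contributes the density of
# whatever it has observed; rows with nothing observed contribute nothing.
obs_loglik <- function(Y, mu, Sigma) {
  total <- 0
  for (i in seq_len(nrow(Y))) {
    obs <- !is.na(Y[i, ])
    if (!any(obs)) next
    total <- total + dmvn_log(Y[i, obs], mu[obs], Sigma[obs, obs, drop = FALSE])
  }
  total
}

# Toy data: four time points; person 2 has a single observation.
Y <- rbind(c(1.0, 1.5, 2.1, 2.4),
           c(0.8,  NA,  NA,  NA),
           c(1.2, 1.9,  NA, 3.0))
mu <- c(1, 1.5, 2, 2.5)
Sigma <- diag(4) * 0.5 + 0.5   # variances 1, covariances 0.5

ll_all <- obs_loglik(Y, mu, Sigma)
ll_without_2 <- obs_loglik(Y[-2, ], mu, Sigma)

# Person 2 still contributes: a univariate normal density for the one observed value.
ll_all - ll_without_2
```

In this sketch, the person with one data point is not dropped; they simply contribute less information than a complete case.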

But, if you use the setting fixed.x=TRUE, blavaan will delete the whole row of data if an observed exogenous variable is missing. This is the same behavior as lavaan with missing="ml".
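
The lavaan side of this comparison can be checked with a small simulation. The sketch below uses made-up data and a hypothetical one-predictor model; it fits the same model with `fixed.x=TRUE` and `fixed.x=FALSE` under `missing="ml"` and compares how many rows are used. It shows lavaan only, on the assumption (per the reply above) that blavaan behaves the same way.

```r
library(lavaan)

# Simulated data (hypothetical): x is an observed exogenous predictor of y.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)
dat <- data.frame(x = x, y = y)
dat$x[1:20]  <- NA   # missing values on the exogenous predictor
dat$y[21:40] <- NA   # missing values on the outcome

model <- "y ~ x"

# fixed.x = TRUE: x is treated as fixed, so rows with missing x are deleted
# (lavaan warns about the deleted cases).
fit_fixed <- sem(model, data = dat, missing = "ml", fixed.x = TRUE)

# fixed.x = FALSE: x is part of the model, so its missing values are skipped over.
fit_free <- sem(model, data = dat, missing = "ml", fixed.x = FALSE)

# Number of rows actually used by each fit.
lavInspect(fit_fixed, "nobs")
lavInspect(fit_free, "nobs")
```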

> - When I use blavPredict() to predict the missing values, the estimates I obtain for that reflect the posterior distribution of values for only the cells in the data frame with a value of "NA." Is that correct?

I believe you mean that you are using blavPredict() with type="ymis". If so, then yes, each column of the resulting matrix will correspond to a specific cell of data that has NA.
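
A rough sketch of that call, using lavaan's `Demo.growth` data with some outcome values deleted for illustration. Several things here are assumptions rather than anything stated in this thread: that blavaan's default missing-data handling is in effect, that `save.lvs = TRUE` is useful or needed for this prediction type, and the short chain settings, which are chosen only to keep the example quick.

```r
library(blavaan)

# Toy data: the lavaan growth demo data with a few outcome values removed.
data(Demo.growth, package = "lavaan")
dat <- Demo.growth[, c("t1", "t2", "t3", "t4")]
set.seed(123)
dat$t3[sample(nrow(dat), 15)] <- NA
dat$t4[sample(nrow(dat), 15)] <- NA

# Linear growth model with intercept (i) and slope (s) factors.
model <- "
  i =~ 1*t1 + 1*t2 + 1*t3 + 1*t4
  s =~ 0*t1 + 1*t2 + 2*t3 + 3*t4
"

# save.lvs = TRUE is an assumption here; short chains are for illustration only.
fit <- bgrowth(model, data = dat, target = "stan", save.lvs = TRUE,
               n.chains = 2, burnin = 500, sample = 500)

# Posterior draws for the missing cells only.
ymis <- blavPredict(fit, type = "ymis")

# Compare the number of columns with the number of NA cells in the data.
dim(ymis)
sum(is.na(dat))
```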

> - I use brms quite a bit as well, and I'm just curious if anyone could elaborate on how the mi() function works compared to what blavaan is doing with its "full information" approach. Are these the same thing on Stan's backend, since the blavPredict function essentially produces imputations of missing values?

I think blavaan is different from brms here. I think that mi() allows you to specify an extra regression equation to say how missing predictors should be modeled. I think this is especially needed in regression models because, traditionally, the predictors (x's) are considered fixed and unmodeled. This means that missing predictors would have to be deleted if we did not do something extra.
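
For reference, the brms pattern being described looks roughly like the sketch below, which follows the structure of the brms missing-values vignette. The variable names `y`, `x`, and `z` are hypothetical, and the fitting call is left commented out because it compiles and runs a Stan model.

```r
library(brms)

# Hypothetical variables: outcome y, predictor x with missing values,
# and a fully observed auxiliary variable z.
bform <- bf(y ~ mi(x)) +     # y regressed on x, using imputed x where it is missing
  bf(x | mi() ~ z) +         # the extra regression equation that models missing x
  set_rescor(FALSE)          # no residual correlation between the two equations

# Fitting would look like this, given a data frame `dat` with columns y, x, z:
# fit <- brm(bform, data = dat)
```

The second `bf()` line is the "extra regression equation": without it, x would be an unmodeled predictor and rows with missing x would have to be dropped.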

I think you could do something similar to mi() using the blavaan model specification. But, for fixed.x=FALSE, it is often not needed because the predictor variables are already included in the model and can be skipped over like any other variable.

And one other thing that may be relevant: blavPredict() does its extra computations after model estimation, so it does not influence what happens during model estimation.

Ed

Garret Hall

Mar 31, 2022, 7:07:37 PM

to Ed Merkle, blavaan

Thank you for these responses. This is extremely helpful!

Garret
