When using the default target="stan", I am able to fit the model, but I don't believe I can monitor and then get results for the missing values. Is there a way to do that?
Alternatively, I would think that using draws for these values from the posterior predictive distribution would work, as that is essentially the same thing. I know that I can conduct posterior predictive checks through the ppmc function, but I don't see how to get the posterior predicted datasets themselves. Is there a way to do that?
I have also tried to run the model in JAGS through target="jags" and was able to monitor the variables and get posteriors for them. However, I noticed some unexpected behavior for the regression parameters in the chains when running my model. I'm working on that now and see it occurring even with complete data, so I will likely post a separate thread about it if I can't figure it out on my own.
We can use the lavPredict() function from lavaan.
--
You received this message because you are subscribed to the Google Groups "blavaan" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blavaan+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/blavaan/9b9cb06a-6f3c-4031-bfe4-03ed87db31f5n%40googlegroups.com.
Hi everyone,
There are several situations that have been mentioned here that we can distinguish. (Sorry for having conflated a few things and shifted the focus a few times.) I figured I’d lay them out as I see them.
1. Imputations for missing data under a fitted model. This was what I asked about initially in the situation of regression with missingness on a predictor as well as missingness on the outcome. What we’d want here is, for each case with missingness, the posterior predictive distribution for the missing values for that case, conditional on that case’s observed data. I believe I can get this using blavaan, by using blavaan to run the model in JAGS.
2. The posterior predictive distribution for the data, as typically constructed in posterior predictive model checking. (Set aside any issue of missing data and suppose we have complete data.) My understanding of Mauricio’s comment was that this can be accomplished using the lavPredict(MOD, type="ov") approach in the ppmc() function. My understanding from ?lavPredict is that this function would be useful for latent variable models, but in models with no latent variables it would not work. So in a measured variable regression model (with complete data), it wouldn’t return posterior predictive data for the predictor and outcome. Rather it would return the observed values for the predictor and outcome.
3. The prior predictive distribution. It would seem that any functions that work for the posterior predictive distribution just mentioned in (2) would work here with minimal adjustment. The idea would be to use draws from the prior distribution for the parameters (e.g., through prisamp = TRUE) rather than draws from the posterior distribution.
4. The posterior predictive distribution for some of the variables, given others. Again, let’s suppose we have complete data. We declare some variables to be of predictive interest, and the rest are declared to be those used to make such predictions. Let’s stick with the typical regression setup, where the variable of predictive interest is the outcome, and the rest are the predictors. What we’d want here is, for each case, the posterior predictive distribution for the outcome, conditional on that case’s values for the predictors.
I have used JAGS code to obtain each of these, but would certainly like to use blavaan to obtain them.
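For concreteness, here is a minimal base-R sketch of what scenarios 3 and 4 amount to in a simple regression with one predictor. It does not use any blavaan function. The data, the `draws` data frames, the priors, and the helper `predictive_draws()` are all hypothetical stand-ins for illustration; in practice the parameter draws would come from the fitted model's MCMC output.

```r
set.seed(1)

# Hypothetical complete data: one predictor x (y is not needed to draw predictions)
n <- 50
x <- rnorm(n)

# Hypothetical helper: given S parameter draws (columns b0, b1, sigma) and
# predictor values x, return an S x n matrix of predictive draws for y,
# where row s uses draw s and column i conditions on x[i].
predictive_draws <- function(draws, x) {
  S <- nrow(draws)
  n <- length(x)
  mu <- outer(draws$b0, rep(1, n)) + outer(draws$b1, x)  # S x n conditional means
  # Add residual noise; sigma is recycled down the rows, so row s uses sigma[s]
  mu + matrix(rnorm(S * n), S, n) * draws$sigma
}

S <- 2000

# Scenario 4: stand-in for posterior draws (assumed values, not real output)
post <- data.frame(
  b0    = rnorm(S, 1, 0.2),
  b1    = rnorm(S, 2, 0.2),
  sigma = abs(rnorm(S, 1.5, 0.15))
)
yrep_post <- predictive_draws(post, x)

# Scenario 3: same machinery, but with parameters drawn from (assumed) priors
prior <- data.frame(
  b0    = rnorm(S, 0, 10),
  b1    = rnorm(S, 0, 10),
  sigma = rgamma(S, shape = 1, rate = 0.5)
)
yrep_prior <- predictive_draws(prior, x)

dim(yrep_post)  # one row per parameter draw, one column per case
```

Each column of `yrep_post` is then a set of draws from the predictive distribution of the outcome for one case, given that case's predictor value.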
Thanks,
Roy
Thanks, Mauricio. Based on your comment and after looking into lavPredict a bit more, I now believe I was mistaken in saying that the use of lavPredict(MOD, type="ov") inside the ppmc() function would give me the posterior predictive distribution as it is typically constructed in posterior predictive model checking (what I laid out in scenario 2, with implications for the prior predictive distribution in scenario 3).
I understand that this would yield *predicted* values for the indicators. I haven’t looked into it enough to know if these are conditional on the values of the latent variables for each case or marginalizing over them, but either way that’s not what I am looking to obtain. I’m looking to obtain draws from the predictive distribution.
To make this concrete, suppose we have two draws from MCMC for which the values of all the unknown parameters are exactly the same. In that case, I believe using lavPredict(MOD, type="ov") would yield the same predicted values for the observables. What I would like is a draw from the predictive distribution, which would vary even when the values of all the unknown parameters are exactly the same.
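The distinction can be sketched in a few lines of base R. This is only an illustration under assumed values: `theta` is a hypothetical parameter draw for a simple regression, and nothing here calls lavPredict() or blavaan.

```r
set.seed(42)

x <- c(-1, 0, 1, 2)

# Two MCMC "draws" with exactly the same (hypothetical) parameter values
theta <- list(b0 = 0.5, b1 = 1.2, sigma = 0.8)

# Predicted (expected) values: a deterministic function of parameters and x
fitted_1 <- theta$b0 + theta$b1 * x
fitted_2 <- theta$b0 + theta$b1 * x

# Predictive draws: the same means, plus freshly sampled residual noise
ypred_1 <- rnorm(length(x), mean = fitted_1, sd = theta$sigma)
ypred_2 <- rnorm(length(x), mean = fitted_2, sd = theta$sigma)

same_fitted <- identical(fitted_1, fitted_2)  # predicted values coincide
same_ypred  <- identical(ypred_1, ypred_2)    # predictive draws do not
```

The predicted values are identical across the two draws, while the predictive draws differ because each one adds its own residual noise.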
Per Mauricio’s comments, I know I can do all of these things in R using output from blavaan, in addition to doing this in JAGS directly. I was just asking if there was a way to request these sorts of things from the blavaan object.
Thanks,
Roy
remotes::install_github("ecmerkle/blavaan", INSTALL_opts = "--no-multiarch")
## (fit your model with missing data, say the object is called fit)
yimp <- blavaan:::blavPredict(fit, type="ymis")
So on this point, JAGS and Stan via blavaan agree. However, they don’t yield fully comparable results for the imputations of the missing outcome values (i.e., for the first 10 cases). The posterior means are comparable, but the posterior standard deviations from Stan via blavaan with the blavPredict function are between 13.3 and 13.8, whereas those from JAGS are between 4 and 5. This suggests that the variability of the imputations from blavPredict is considerably larger than that from JAGS, roughly three times as large. I hope this might suggest a simple fix.
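In case it helps anyone reproduce the comparison, here is a rough sketch of how the per-case summaries can be put side by side. The matrices `imp_a` and `imp_b` are simulated placeholders (draws in rows, cases with missing outcomes in columns), not my actual output, and `summarize_imputations()` is a hypothetical helper. How the draws are extracted from each fit is not shown, since that depends on the structure of the returned objects.

```r
set.seed(7)

S <- 1000      # number of MCMC draws (placeholder)
n_mis <- 10    # number of cases with a missing outcome

# Placeholder imputation draws; replace with the real S x n_mis matrices
imp_a <- matrix(rnorm(S * n_mis, mean = 50, sd = 13.5), S, n_mis)
imp_b <- matrix(rnorm(S * n_mis, mean = 50, sd = 4.5),  S, n_mis)

# Hypothetical helper: posterior mean and SD of the imputations for each case
summarize_imputations <- function(imp) {
  data.frame(
    case      = seq_len(ncol(imp)),
    post_mean = colMeans(imp),
    post_sd   = apply(imp, 2, sd)
  )
}

# Side-by-side comparison, with the ratio of posterior SDs per case
cmp <- merge(summarize_imputations(imp_a), summarize_imputations(imp_b),
             by = "case", suffixes = c("_a", "_b"))
cmp$sd_ratio <- cmp$post_sd_a / cmp$post_sd_b
cmp
```

If the posterior means line up but `sd_ratio` is well above 1 for every case, the two sets of imputations are centered in the same place and differ only in spread.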