interpret simsem post-hoc power analysis


David Martinez

Dec 10, 2018, 6:55:27 PM12/10/18
to lavaan
I'm hoping someone can help me interpret results from a post-hoc power analysis I did using simsem and this example.

Here's what I get when I use the pValue function:

chisq = 0.5606407; aic = 0.9176201; bic = 0.9828375; rmsea = 0.5606407; cfi = 0.5526316; tli = 0.5526316; srmr = 0.8878719; andRule = 0.4691076; orRule = 0.9988558.

And here is what I get from the getCutoff function (with alpha = .05):

chisq = 492.45; aic = -6610.11; bic = -6260.27; rmsea = 0.0647; cfi = .934; tli = .921

For the pValue function, am I to interpret the values next to the fit indexes as indicating that my model was not significantly different from the simulation result? In this case, is the simulation result some kind of composite?

For the andRule and orRule values: in this case, these values represent proportions when comparing my model to all of the simulated models, right? So, from the andRule value, compared to all simulated models, my model did better on all fit indexes 46.9% of the time?

For the getCutoff function, am I again comparing to some composite model? Then, compared to this composite, my model would fail to be rejected/be retained if its fit is at least as good as these values?

Finally, if I'm interpreting the andRule and orRule values, it seems to me like these are the values that are most related to the "power" of my model, right? Am I supposed to aim for 80% on one of these? 

Any help would be greatly appreciated! 

Thank you, 
David




Alex Schoemann

Dec 12, 2018, 9:16:52 AM12/12/18
to lavaan
Hi David,

Usually those functions are used for model evaluation instead of power analysis. Without seeing more code I'm not exactly sure what the results represent (the pValue function can return different things depending on the options). Can you provide example code?

Thanks,

Alex

David Martinez

Dec 12, 2018, 5:22:41 PM12/12/18
to lavaan
Hi Alex, 

Yeah, I don't think I saw anything about a post-hoc power analysis anywhere, but this seemed appropriate. I'd be grateful if you could direct me to other resources!

I've posted what I did below. 

Thanks!

-----------------------------

library(lavaan)  # for cfa()
library(simsem)

NSFdata <- read.csv("NSFdata.csv")

hs.model <- ' SL =~ ASL + PSL + DPSL + TTSL
              WL =~ TWL + PWL + DPWL + TTWL
              spokenPSTM =~ letSpan + NWRec + Nwspan
              signedPSTM =~ NSPT + proSign + signCon
              WMC =~ Ospan + SymSpan + RoSpan + numSeries + letSets + ravens + slat
              Gc =~ Info + Vocab + Gram + Reading
              Gf =~ numSeries + letSets + ravens + slat

              Gf ~~ 0*WMC
              TTWL ~~ TTSL
              letSets ~~ numSeries '
fit <- cfa(hs.model, data = NSFdata)

# Data-generating model with no misspecification, built from the fitted estimates
datamodel.nomis <- model.lavaan(fit, std = FALSE)
output.nomis <- sim(1000, n = nrow(NSFdata), datamodel.nomis)

# Pattern matrix marking which loadings are freely estimated (NA) vs fixed to zero (0)
loading <- matrix(0, 25, 7)
loading[1:4, 1] <- NA
loading[5:8, 2] <- NA
loading[9:11, 3] <- NA
loading[12:14, 4] <- NA
loading[15:17, 5] <- NA
loading[18:21, 6] <- NA
loading[22:25, 7] <- NA

# Minor misspecification: small random cross-loadings on the fixed-to-zero elements only
loading.mis <- matrix("runif(1, -0.2, 0.2)", 25, 7) # will need to account for split loadings
loading.mis[is.na(loading)] <- 0
datamodel.mis <- model.lavaan(fit, std = FALSE, LY = loading.mis)

output.mis <- sim(1000, n = nrow(NSFdata), datamodel.mis)
plotCutoff(output.mis, 0.05)
getCutoff(output.mis, 0.05)   # critical values of the fit measures at alpha = .05
pValue(fit, output.mis)       # compare the real-data fit to the empirical distribution

Terrence Jorgensen

Dec 13, 2018, 8:53:39 AM12/13/18
to lavaan
I'm hoping someone can help me interpret results from a post-hoc power analysis I did using simsem and this example.

That example has nothing to do with determining power.  It is only about evaluating the fit of your model using a less restrictive H0 (i.e., that your hypothesized model approximately, rather than exactly, matches the true data-generating model).  This example is about power analysis:
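
In rough outline, that approach varies the sample size rather than fixing it, then checks how often each parameter is detected. A sketch only, reusing the data-generating model from the code above; the exact arguments to getPower() and findPower() are assumptions on my part, so check their help pages:

# Sketch: power for the model's parameters across a range of sample sizes
Output.power <- sim(NULL, model = datamodel.nomis, n = 50:500)  # n varies across replications
Cpow <- getPower(Output.power)           # proportion of significant estimates per parameter
findPower(Cpow, iv = "N", power = 0.80)  # sample size needed for 80% power (argument names assumed)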


Here's what I get when I use the pValue function:

chisq = 0.5606407; aic = 0.9176201; bic = 0.9828375; rmsea = 0.5606407; cfi = 0.5526316; tli = 0.5526316; srmr = 0.8878719; andRule = 0.4691076; orRule = 0.9988558.

And here is what I get from the getCutoff function (with alpha = .05):

chisq = 492.45; aic = -6610.11; bic = -6260.27; rmsea = 0.0647; cfi = .934; tli = .921

For the pValue function, am I to interpret the values next to the fit indexes as indicating that my model was not significantly different from the simulation result?

Yes.  

In this case, is the simulation result some kind of composite?

I'm not sure what you mean by composite. The p value for the chi-squared statistic in your summary(fit) output is calculated as the proportion of a central (i.e., H0 is true) chi-squared distribution with those df that exceeds your observed statistic. Same idea with RMSEA's p value. The pValue() function instead calculates the proportion of the empirical distribution from your output.mis result that exceeds your observed statistic (not only for chi-squared and RMSEA, but for any other index you request, since the simulation yields an empirical distribution of all of them), assuming your output.mis model represents the true data-generating process.
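
To make that concrete, here is a minimal sketch of the proportion being computed (made-up numbers, not simsem's internal code):

sim.chisq <- rchisq(1000, df = 50)   # stands in for the 1000 simulated chi-squared values in output.mis
obs.chisq <- 55                      # stands in for the observed statistic, e.g. fitMeasures(fit, "chisq")
mean(sim.chisq > obs.chisq)          # Monte Carlo p value: proportion of simulated values exceeding the observed one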

For the andRule and orRule values: in this case, these values represent proportions when comparing my model to all of the simulated models, right?

No, the andRule p value is the proportion of the 1000 simulated data sets for which ALL fit measures indicated worse fit than your SET of observed fit measures. The orRule p value is the proportion of the 1000 simulated data sets for which ANY fit measure indicated worse fit than your SET of observed fit measures. You should only pay attention to either of these if you don't trust the result from a single fit measure, so you want a stricter (andRule) or more lenient (orRule) criterion for rejecting your model.

So, from the andRule value, compared to all simulated models, my model did better on all fit indexes 46.9% of the time? 

It is the same model fitted to your actual data set and to each simulated data set. The model fit your data better (according to all fit measures) than it fit 46.9% of the simulated data sets.
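
As a rough illustration of where those two proportions come from (made-up numbers and object names, not simsem code):

set.seed(1)
simFit <- data.frame(chisq = rchisq(1000, df = 50),  # fit measures from 1000 simulated data sets
                     rmsea = runif(1000, 0, 0.10),
                     cfi   = runif(1000, 0.90, 1))
obsFit <- c(chisq = 55, rmsea = 0.05, cfi = 0.95)    # fit measures from the real data
worse <- cbind(simFit$chisq > obsFit["chisq"],       # higher chi-squared = worse fit
               simFit$rmsea > obsFit["rmsea"],       # higher RMSEA = worse fit
               simFit$cfi   < obsFit["cfi"])         # lower CFI = worse fit
mean(apply(worse, 1, all))  # andRule: ALL measures worse than observed
mean(apply(worse, 1, any))  # orRule: at least one measure worse than observed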

For the getCutoff function, am I again comparing to some composite model? Then, compared to this composite, my model would fail to be rejected/be retained if its fit is at least as good as these values?

There is no composite model, only your hypothesized model and the data-generating model with minor cross-loadings.  Like the p value, the critical value is obtained by finding the 100*(1 - alpha)-th (e.g., 95th) percentile of the empirical distribution (so the 50th highest chi-squared from fitting the model to 1000 simulated data sets, or the 50th lowest CFI).
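
The same idea in a couple of lines (placeholder numbers, not simsem internals):

sim.chisq <- rchisq(1000, df = 50)   # simulated chi-squared values (badness of fit: reject above the cutoff)
quantile(sim.chisq, probs = 0.95)    # critical value, roughly the 50th highest value out of 1000
sim.cfi <- runif(1000, 0.90, 1)      # for a goodness-of-fit index, use the lower tail instead
quantile(sim.cfi, probs = 0.05)      # roughly the 50th lowest CFI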

Finally, if I'm interpreting the andRule and orRule values, it seems to me like these are the values that are most related to the "power" of my model, right? Am I supposed to aim for 80% on one of these? 

No, see my first comment.

Terrence D. Jorgensen
Assistant Professor, Methods and Statistics
Research Institute for Child Development and Education, the University of Amsterdam

Alex Schoemann

Dec 13, 2018, 10:28:14 AM12/13/18
to lavaan
Yup, I'll agree with everything Terry said, and I'll add that post-hoc power analyses are generally not very useful. See this link for a nice explanation: http://daniellakens.blogspot.com/2014/12/observed-power-and-what-to-do-if-your.html

David Martinez

Dec 13, 2018, 6:10:38 PM12/13/18
to lavaan
Thanks Terrence! 

By composite I meant a model with parameters derived from the 1000 simulations. 



David Martinez

Dec 13, 2018, 6:13:54 PM12/13/18
to lavaan
Thanks Alex! 