Exploratory Factor Analysis (EFA) with the psych package's fa() function provides a row called Proportion Var in its output. You can typically get the variance explained by the whole model by summing Proportion Var across factors. For example:
# Make EFA model
library(psych)
fit <- fa(mydata, nfactors = 4, rotate = "oblimin", scores = "Bartlett", fm = "minres")
# Get variance explained for each factor (row 2 of Vaccounted is "Proportion Var")
fit$Vaccounted[2,]
# Get total variance explained
sum(fit$Vaccounted[2,])
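For reference, the arithmetic behind Proportion Var can be sketched with a hypothetical loadings matrix (the numbers below are made up, not from mydata): each factor's proportion is its sum of squared loadings divided by the number of observed variables.

```r
# Hypothetical standardized loadings: 6 items, 2 factors
lambda <- matrix(c(0.8, 0.7, 0.6, 0.0, 0.0, 0.0,
                   0.0, 0.0, 0.0, 0.9, 0.8, 0.7), ncol = 2)
# Proportion Var per factor = SS loadings / number of items
prop.var <- colSums(lambda^2) / nrow(lambda)
prop.var       # 0.2483333 0.3233333
sum(prop.var)  # 0.5716667
```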
My coauthor and I would like to compare this number for EFA with its equivalent for Confirmatory Factor Analysis (CFA), but lavaan's cfa() function does not seem to provide Proportion Var by default.
So first, is there a statistical reason why the cfa() output does not include Proportion Var or an equivalent? Should we avoid attempting this comparison, or avoid computing explained variance for CFA at all?
Second, I have attempted to calculate this value myself with a custom function (drawing on source 1 and source 2):
library(lavaan)
mycfa <- function(data, indices) {
  d <- data[indices, ]
  fit <- cfa(model, data = d, estimator = "MLR")
  # Standardized loadings matrix (lambda)
  loadings <- inspect(fit, what = "std")$lambda
  # Proportion of variance per factor: sum of squared standardized
  # loadings divided by the number of observed variables
  prop.var <- colSums(loadings^2) / nrow(loadings)
  # Total proportion of variance across the factors
  sum(prop.var)
}
The idea would be to compare two distributions/histograms of 10,000 bootstrapped samples of the explained variances for the EFA and CFA, respectively, to see whether and to what extent they overlap. Would that approach make sense?
So to bootstrap the 10,000 explained variances for CFA, I simply use the custom function above along with the boot package:
library(boot)
(vars.boot <- boot(data = mydata, statistic = mycfa, R = 10000))
The output, however, includes 275 impossible values out of 10,000, i.e., values greater than 1 (explained variance should not exceed 1). What could explain this? It is a small fraction, but it still seems like a problem. Is the fault in the function, in the bootstrapping, or in the combination of the two?
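One arithmetic way a proportion above 1 can arise, sketched with made-up numbers: if a standardized loading exceeds 1 (which goes hand in hand with a negative implied residual variance for that item, a so-called Heywood case), the per-factor proportion can be pushed past 1. This is only an illustration of the arithmetic, not a diagnosis of my model.

```r
# Hypothetical case: one standardized loading above 1
lambda <- matrix(c(1.20, 1.00, 0.95), ncol = 1)
# Implied standardized residual variance for the first item is negative
1 - lambda[1, 1]^2                 # -0.44
# The proportion of variance for this factor then exceeds 1
colSums(lambda^2) / nrow(lambda)   # 1.114167
```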
While running my model, I do get this warning:
In lav_object_post_check(object) :
lavaan WARNING: some estimated ov variances are negative
But I am not sure what to do with this warning or what its implications are. Could it be related to the impossible bootstrapped values? Thank you very much.
Related/relevant conversations: