Hello,
I am a researcher from South Korea currently using your Genomic SEM tool. Thank you very much for developing such a powerful and innovative resource for the field.
I would like to ask a question regarding the calculation of the Q_SNP statistic.
According to the GitHub code, the section
#new Q_SNP calculation
if(Q_SNP){
#number of factors
lv<-colnames(lavInspect(Model1_Results,"cor.lv"))
#number of factors with estimated SNP effects
#split model code by line
lines_SNP <- strsplit(model, "\n")[[1]]
#remove any spacing to ensure matching
lines_SNP <- str_replace_all(lines_SNP, fixed(" "), "")
# Use grep to find lines containing "SNP"
if(TWAS){
lines_SNP <- lines_SNP[grepl("Gene", lines_SNP)]
}else{lines_SNP <- lines_SNP[grepl("SNP", lines_SNP)]}
#subset to factors with estimated SNP effects
lv <- lv[lv %in% gsub(" ~.*|~.*", "", lines_SNP)]
#name the columns and rows of V_SNP according to the genetic covariance matrix for subsetting in loop below
colnames(V_SNP)<-colnames(S_LD)
rownames(V_SNP)<-colnames(V_SNP)
Q_SNP_result<-vector()
Q_SNP_df<-vector()
#loop calculating Q_SNP for each factor
for(b in 1:length(lv)){
#determine the indicators for the factor
indicators<-subset(Model_Output$rhs,Model_Output$lhs == lv[b] & Model_Output$op == "=~")
#check that the factor indicators are observed variables
#for which SNP-phenotype covariances will be in the S matrix
#otherwise put Q_SNP as NA for second-order factors
if(indicators[1] %in% colnames(V_SNP)){
#subset V_SNP to the indicators for that factor
V_SNP_i<-V_SNP[indicators,indicators]
#calculate eigen values V_SNP matrix
Eig<-as.matrix(eigen(V_SNP_i)$values)
#create empty matrix
Eig2<-diag(length(indicators))
#put eigen values in diagonal of the matrix
diag(Eig2)<-Eig
#Pull P1 (the eigen vectors of V_SNP)
P1<-eigen(V_SNP_i)$vectors
#eta: the residual covariance matrix for the SNP effects in column 1 for just the indicators
eta<-as.vector(implied2[1,indicators])
#calculate model chi-square for only the SNP portion of the model for those factor indicators
Q_SNP_result[b]<-t(eta)%*%P1%*%solve(Eig2)%*%t(P1)%*%eta
Q_SNP_df[b]<-length(indicators)-1
}else{
Q_SNP_result[b]<-NA
Q_SNP_df[b]<-length(indicators)-1
}
}
}
appears to be responsible for calculating the Q_SNP statistic. From my understanding, the code identifies latent factors directly regressed on the SNP, and then calculates Q_SNP based on the observed indicators of those latent factors.
However, in a hierarchical model such as:
model <- 'F1 =~ O1 + O2 + O3
F2 =~ O4 + O5 + O6
F3 =~ O7 + O8 + O9
HF =~ F1 + F2 + F3
HF ~ SNP
'
the code would recognize that the SNP is connected to the higher-order factor (HF). But since the indicators of HF (F1, F2, F3) are themselves latent, these do not match the observed variables in V_SNP, and the function seems to return NA.
My question is: is there currently a way to compute Q_SNP for hierarchical (higher-order) factors? For example, is it possible to extend the calculation by mapping the higher-order factor loadings down to the observed indicators?
Best regards,
Hyun Seok Do
--
You received this message because you are subscribed to the Google Groups "Genomic SEM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to genomic-sem-us...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/genomic-sem-users/c1154b79-a9fc-4d6d-b8a1-30005167943bn%40googlegroups.com.