Estimating Q_SNP in hierarchical model

38 views
Skip to first unread message

Hyun Seok Do

unread,
Oct 2, 2025, 8:32:56 AMOct 2
to Genomic SEM Users

Hello,
I am a researcher from South Korea currently using your Genomic SEM tool. Thank you very much for developing such a powerful and innovative resource for the field.

I would like to ask a question regarding the calculation of the Q_SNP statistic.

 According to the GitHub code, the section  

 #new Q_SNP calculation
    if(Q_SNP){
      #number of factors
      lv<-colnames(lavInspect(Model1_Results,"cor.lv"))
     
      #number of factors with estimated SNP effects
      #split model code by line
      lines_SNP <- strsplit(model, "\n")[[1]]
     
      #remove any spacing to ensure matching
      lines_SNP <- str_replace_all(lines_SNP, fixed(" "), "")
     
      # Use grep to find lines containing "SNP"
      if(TWAS){
        lines_SNP <- lines_SNP[grepl("Gene", lines_SNP)]
      }else{lines_SNP <- lines_SNP[grepl("SNP", lines_SNP)]}
     
      #subset to factors with estimated SNP effects
      lv <- lv[lv %in% gsub(" ~.*|~.*", "", lines_SNP)]
     
      #name the columns and rows of V_SNP according to the genetic covariance matrix for subsetting in loop below
      colnames(V_SNP)<-colnames(S_LD)
      rownames(V_SNP)<-colnames(V_SNP)
     
      Q_SNP_result<-vector()
      Q_SNP_df<-vector()
     
      #loop calculating Q_SNP for each factor
      for(b in 1:length(lv)){
       
        #determine the indicators for the factor
        indicators<-subset(Model_Output$rhs,Model_Output$lhs == lv[b] & Model_Output$op == "=~")
       
        #check that the factor indicators are observed variables
        #for which SNP-phenotype covariances will be in the S matrix
        #otherwise put Q_SNP as NA for second-order factors
        if(indicators[1] %in% colnames(V_SNP)){
          #subset V_SNP to the indicators for that factor
          V_SNP_i<-V_SNP[indicators,indicators]
         
          #calculate eigen values V_SNP matrix
          Eig<-as.matrix(eigen(V_SNP_i)$values)
         
          #create empty matrix
          Eig2<-diag(length(indicators))
         
          #put eigen values in diagonal of the matrix
          diag(Eig2)<-Eig
         
          #Pull P1 (the eigen vectors of V_SNP)
          P1<-eigen(V_SNP_i)$vectors
         
          #eta: the residual covariance matrix for the SNP effects in column 1 for just the indicators
          eta<-as.vector(implied2[1,indicators])
         
          #calculate model chi-square for only the SNP portion of the model for those factor indicators
          Q_SNP_result[b]<-t(eta)%*%P1%*%solve(Eig2)%*%t(P1)%*%eta
          Q_SNP_df[b]<-length(indicators)-1
        }else{
          Q_SNP_result[b]<-NA
          Q_SNP_df[b]<-length(indicators)-1
        }
      }
     
    }


  appears to be responsible for calculating the Q_SNP statistic. From my understanding, the code identifies latent factors directly regressed on the SNP, and then calculates Q_SNP based on the observed indicators of those latent factors.  

  However, in a hierarchical model such as:


model <- 'F1 =~ O1 + O2 + O3
F2 =~ O4 + O5 + O6
F3 =~ O7 + O8 + O9

HF =~ F1 + F2 + F3

HF ~ SNP
'  

the code would recognize that the SNP is connected to the higher-order factor (HF). But since the indicators of HF (F1, F2, F3) are themselves latent, these do not match the observed variables in V_SNP, and the function seems to return NA.

My question is: is there currently a way to compute Q_SNP for hierarchical (higher-order) factors? For example, is it possible to extend the calculation by mapping the higher-order factor loadings down to the observed indicators?

Best regards,
Hyun Seok Do


Elliot Tucker-Drob

unread,
Oct 2, 2025, 10:04:46 AMOct 2
to Hyun Seok Do, Genomic SEM Users
The code you are looking at is a computationally efficient implementation for estimating Q for factor models with the GWAS phenotypes as the indicators. For hierarchical models in which the indicators of the higher-order factor(s) are themselves factors, you can instead fit an independent pathways model (SNP effects on all indicators of the higher order factor, but not on the higher-order factor) and compare the fit to a common pathways model (SNP effects on just the higher order factor) using a nested chisq test. Note that we currently recommend that you fix the unstandardized loadings for the measurement model to the estimates from the model without SNPs before adding SNPs.

See Figure 2 of:

Grotzinger, A. D., …Tucker-Drob, E. M., & Nivard, M. G. (2022). Genetic architecture of 11 major psychiatric disorders at biobehavioral, functional genomic, and molecular genetic levels of analysis. Nature Genetics [Tucker-Drob & Nivard jointly directed this work] Link

You do not need to adapt the internal code to do this. You just fit the two models to each SNP using userGWAS.



--
You received this message because you are subscribed to the Google Groups "Genomic SEM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to genomic-sem-us...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/genomic-sem-users/c1154b79-a9fc-4d6d-b8a1-30005167943bn%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages