Covariance Matrix for Interaction metrics?

49 views
Skip to first unread message

Luke Cassar

unread,
May 14, 2026, 10:28:55 AMMay 14
to plink2-users
Hi there, I'm currently trying to carry out studies on gene-environmental interaction. I'm currently trying to use epidemiological stratification and plink's logistic regression with interaction terms separately to compare both methods. My problem is that I want to extract the covariance matrix for each variant so that I can calculate the confidence intervals of RERI (relative excess risk due to interaction) so that I can measure interactions on an additive scale for this method. For an interaction model, it's usually enough to extract the point estimates and transform them. But since I require the confidence interval for RERI, I need the covariance estimates as well, but this output isn't present by default in plink2's --glm output to my knowledge.

Is there a way to extract this somehow without having to make my own logistic/firth regression model in R and running it on each variant?

I also attached a paper here in case the question isn't fully clear regarding what I'm trying to do with transforming the model's coefficients so I can calculate RERI from the interaction model.

Thanks for your time and patience. Any guidance would be greatly appreciated on this.
InteractionTutorial_EM-1 (1).pdf

Chris Chang

unread,
May 16, 2026, 12:08:03 PMMay 16
to Luke Cassar, plink2-users
In the short term, you'd need to run this in something like R, sorry.

In the longer term, an option could be added to --glm to generate an extra output file containing these covariance matrices.  (It would be a poor fit for the usual --glm output file, which can have more than one line per regression.)  How would you want such a file to be formatted?

--
You received this message because you are subscribed to the Google Groups "plink2-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to plink2-users...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/plink2-users/aa6c8501-b3a2-40cd-9eef-cb1402d7e0a8n%40googlegroups.com.

Luke Cassar

unread,
May 18, 2026, 4:20:51 AMMay 18
to plink2-users
That's ok, for now I can make an individual model in R work, but it would be nice to add to plink2 in the future. 

If it does get added, the individual betas of the model (main effect of the SNP, effect of the environmental variable/covariate, effect of the product term, etc.) being outputted as columns along with variant ID would work very well. RERI can be calculated from these betas using: RERI = exp (B_G + B_E + B_GxE) - exp (B_G) - exp (B_E) + 1.  Currently I'm only keeping the product term rows in plink's output, but I assume the betas can just be achieved by including the rows for each variable and avoiding hide-covar in --glm.

The main issue is I also need to compute the confidence interval for RERI. Below, I've attached some Obsidian notes and some R code for what I'm currently doing to get what I need, for the sake of not posting a huge script I'm going to skip a lot of steps and try to simplify the variable names.

First, I would need to calculate RERI's gradient: 
```
∂RERI/∂γ₁ = exp(γ₁+γ₂+γ₃) − exp(γ₁) = OR₁₁ − OR₁₀
∂RERI/∂γ₂ = exp(γ₁+γ₂+γ₃) − exp(γ₂) = OR₁₁ − OR₀₁
∂RERI/∂γ₃ = exp(γ₁+γ₂+γ₃) = OR₁₁
```

```R
g_reri <- c(rr$OR_11 - rr$OR_10, 
                   rr$OR_11 - rr$OR_01, 
                   rr$OR_11)
```
Then I need the full covariance matrix of the model:

```R
get_fit_vcov <- function(fit) { 
  out <- tryCatch(stats::vcov(fit), error = function(e) NULL) 
  if (is.null(out) && inherits(fit, "logistf") && !is.null(fit$var)) out <- fit$var out 
}
```

```R
extract_V3 <- function(fit) { 
  V <- get_fit_vcov(fit) 
  if (is.null(V)) return(NULL) 
  need <- c(COEF_G, COEF_E, COEF_GE) 
  if (!all(need %in% rownames(V)) || !all(need %in% colnames(V))) return(NULL) 
  V[need, need, drop = FALSE] }
```

I'm only looking for the main interaction between the SNP and the environmental variable so I'm not including any coefficients from other covariates that could be in the model.

Next, I'd use this fit to get the variance of RERI:

```R
var_reri <- drop(t(g_reri) %*% V3 %*% g_reri)
```

And with the variance calculated:

```R
out$RERI_SE <- sqrt(var_reri) 
out$RERI_L95 <- rr$RERI - Z_975 * out$RERI_SE 
out$RERI_U95 <- rr$RERI + Z_975 * out$RERI_SE
```
It would be ideal for me if the output file also had other metrics from model itself, such as AIC and Log-Likelihood.

I'm still getting the hang of this so I'm sorry if the explanation wasn't fully clear on my end but this should be everything I need to calculate RERI and other metrics like AP and S. I'm doing a similar procedure without the beta transformation for stratification analysis (coding strata as dummy variables in a single model) so I can calculate RERI CIs with different methods like the standard delta method I show here, the MOVER method, and bootstrapping. 

So to recap, I mainly need the full covariance matrix of the model, which in it's most basic form is a 3x3 matrix like so:2026-05-18_10-12.png

I get this in R using: stats::vcov(fit)

From which I would extract the following:

COV_G_E — Cov(B_G, B_E)
COV_G_GxE — Cov(B_G, B_GxE)
COV_E_GxE — Cov(B_E, B_GxE)

And if it's possible to get the betas for every coefficient in the model, including the intercept, does plink already have a way to do this? Correct me if I'm missing it. 

For bootstrapping and profile likelihood CIs, I don't think there's a way to really do this downstream, as they require the option to refit the model for each variant multiple times, and it requires the full coefficient vector to work. 

As for the file format, maybe just having each of the coefficients as columns, and the results of the matrix as columns as well, would be enough so long as the columns are labelled clearly. I can handle the rest downstream as is with R.

Regardless, thank you for taking the time to read this. Plink2 has been indispensable to my work, and having functionality like this for GxE interactions is something I haven't seen available anywhere.
Reply all
Reply to author
Forward
0 new messages