PLINK2: covar-variance-standardize option

OliviaL

Oct 1, 2019, 5:51:08 AM
to plink2-users

Hi,

 

I have been using the newest version of PLINK2 (https://www.cog-genomics.org/plink/2.0/) to perform a sex-SNP interaction analysis with a binary outcome.  

 

I ran the following command:

 

plink2 --bgen [FILE].bgen --sample [FILE].sample --pheno-name [pheno] --covar [FILE].phen --covar-name sex PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 --logistic interaction --parameters 1-13  --out [New FILE]

 

and got the following warning:

 

Warning: Skipping --glm regression on phenotype 'pheno', since
genotype/covariate scales vary too widely for numerical stability of the
current implementation. Try rescaling your covariates with e.g.
--covar-variance-standardize.

 

So I added --covar-variance-standardize to the command:

 

plink2 --bgen [FILE].bgen --sample [FILE].sample --pheno-name [pheno] --covar [FILE].phen --covar-name sex PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10 --logistic interaction --parameters 1-13 --covar-variance-standardize --out [New FILE]

 

 

and the analysis ran.

 

 

I found that PC1 and PC2 were triggering the warning (I added each covariate to the model separately to see which one was responsible), though I couldn’t work out why. I then standardised the PCs myself, but not sex, and ran the analysis without --covar-variance-standardize. It ran fine, but the output differed from the run with --covar-variance-standardize.

 

 

It looks like --covar-variance-standardize standardises all covariates in the model (sex and the PCs). Is that correct?

Is this why the output differs between using --covar-variance-standardize and standardising only the PCs myself?

 

Many thanks,

Olivia

 

 

Christopher Chang

Oct 1, 2019, 12:42:04 PM
to plink2-users
1. Yes, --covar-variance-standardize linearly transforms the covariates to have mean 0, variance 1, so the raw --glm output will be different as a consequence.

2. However, have you looked at *how* the output differs?  All values for the genotype, including BETA/SE/P, should *not* be different.  P-values for the covariates also shouldn't be different!  Only BETA/SE for the covariates should be changed by the rescaling, and this is irrelevant to most analyses.
Note that if you care about the BETA/SE values for sex, but not the PCs, you can choose which covariates to standardize ("--covar-variance-standardize PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10").

3. As for why PC1/PC2 might be causing the scale error, these usually reflect geographic ancestry, so if your .bgen file only contains samples with e.g. predominantly UK ancestry, all the PC1/PC2 values might be clumped very, very close together.
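
A rough illustration of points 1 to 3 in Python, not PLINK2's actual code. The PC1, dosage and outcome values are made up, a simple linear regression stands in for the logistic model that --glm fits, and the sample (n - 1) standard deviation is an assumption of this sketch. It shows that clumped PC values have a far smaller spread than genotype dosages, and that standardising a covariate rescales its BETA and SE by the same factor while leaving BETA/SE (and so the p-value) unchanged.

```python
import math
import statistics


def variance_standardize(values):
    """Linearly rescale values to mean 0 and sample variance 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # n - 1 denominator: an assumption of this sketch
    return [(v - mean) / sd for v in values]


def simple_ols(x, y):
    """Slope, standard error and t-statistic for y ~ intercept + x."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    intercept = my - beta * mx
    rss = sum((yi - intercept - beta * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, se, beta / se


# Hypothetical data: PC1 values clumped very close together,
# genotype dosages on the usual 0..2 scale, and a made-up outcome.
pc1 = [0.00121, 0.00118, 0.00125, 0.00119, 0.00123, 0.00120, 0.00124, 0.00117]
dosage = [0.0, 1.0, 2.0, 1.0, 0.0, 2.0, 1.0, 0.0]
outcome = [1.4, 0.6, 2.1, 1.0, 1.5, 1.2, 1.6, 0.9]

sd_pc1 = statistics.stdev(pc1)
pc1_std = variance_standardize(pc1)

beta_raw, se_raw, t_raw = simple_ols(pc1, outcome)
beta_std, se_std, t_std = simple_ols(pc1_std, outcome)

print("SD of raw PC1:", sd_pc1, "SD of dosage:", statistics.stdev(dosage))
print("raw PC1:          BETA", beta_raw, "SE", se_raw, "BETA/SE", t_raw)
print("standardised PC1: BETA", beta_std, "SE", se_std, "BETA/SE", t_std)
```

The two BETA/SE values agree up to floating-point error, while BETA and SE themselves differ by a factor equal to the standard deviation of the raw PC1 values.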

OliviaL

Oct 3, 2019, 6:10:26 AM
to plink2-users
Many thanks for your reply!

Ah, ok I didn't realise you could state which covariates to standardise.

Re. answer 3 - Great, I did wonder if that was the case. 


Many thanks,
Olivia 