I am running logistic regression on binary traits with PLINK2, and it is much slower than REGENIE. Using the same PGEN file, PLINK2 took almost two days to finish three binary traits, whereas REGENIE completed seven binary traits in one day, even with double the sample size. The PLINK2 log is below. Is there anything in my setup that would make PLINK2 run this slowly, or is it expected that REGENIE is faster than PLINK2?
PLINK v2.0.0-a.7LM AVX2 Intel (1 Sep 2025)
Options in effect:
--covar ../../GWAS_INPUT/20250719_covar_both.txt
--covar-name BIRTH_YEAR PC1 PC2 PC3 PC4 PC5 PC6 PC7 PC8 PC9 PC10
--covar-variance-standardize
--extract ../../GWAS_INPUT/wb_qced_snps.snplist
--glm no-x-sex hide-covar
--keep ../../GWAS_INPUT/20250703_both_WB_geno_qced_eids.txt
--keep-males
--memory 256000
--out ../../PLINK_LM_OUTPUT/all_13_traits/20250828_male_add
--pfile ../../../ukb_pfile/wb_qced_maf0.001
--pheno ../../GWAS_INPUT/20250827_phenos.txt
--pheno-name G43 F20 E050
Hostname: c314-112.ls6.tacc.utexas.edu
Working directory: /scratch/09059/xliaoyi/harpak_lab/ukb/CODE/PLINK_LM
Start time: Sat Sep 20 16:48:28 2025
Random number seed: 1758404908
257263 MiB RAM detected, ~244060 available; reserving 243996 MiB for main
workspace.
Allocated 182997 MiB successfully, after larger attempt(s) failed.
Using up to 128 threads (change this with --threads).
402040 samples (217326 females, 184714 males; 402040 founders) loaded from
../../../ukb_pfile/wb_qced_maf0.001.psam.
13868376 variants loaded from ../../../ukb_pfile/wb_qced_maf0.001.pvar.
3 binary phenotypes loaded.
--extract: 13868376 variants remaining.
--keep: 402040 samples remaining.
217326 samples removed due to sex filter(s).
Warning: Phenotype/covariate value in [-8, -9) or (-9, -10] present, when -9
is treated as missing. Use --no-input-missing-phenotype to treat -9 as a
numeric value (missing values can be indicated by 'NA'), or
--neg9-pheno-really-missing to suppress this warning.
11 covariates loaded from ../../GWAS_INPUT/20250719_covar_both.txt.
184714 samples (0 females, 184714 males; 184714 founders) remaining after main
filters.
--covar-variance-standardize: 11 covariates transformed.
Calculating allele frequencies... done.
13868376 variants remaining after main filters.
--glm logistic-Firth hybrid regression on phenotype 'F20': done.
Results written to ../../PLINK_LM_OUTPUT/all_13_traits/20250828_male_add.F20.glm.logistic.hybrid .
--glm logistic-Firth hybrid regression on phenotype 'G43': done.
Results written to ../../PLINK_LM_OUTPUT/all_13_traits/20250828_male_add.G43.glm.logistic.hybrid .
--glm logistic-Firth hybrid regression on phenotype 'E050': done.
Results written to ../../PLINK_LM_OUTPUT/all_13_traits/20250828_male_add.E050.glm.logistic.hybrid .
End time: Mon Sep 22 14:35:48 2025
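
To put a number on "almost two days", here is a small Python sketch that works out the wall-clock time from the start and end timestamps in the log above. The variant and phenotype counts are copied from the log. The per-phenotype average assumes the three phenotypes took roughly equal time, which the log does not confirm, so treat it as a rough figure only.

```python
from datetime import datetime

# Timestamps copied from the "Start time" and "End time" lines of the log.
LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"
start = datetime.strptime("Sat Sep 20 16:48:28 2025", LOG_TIME_FORMAT)
end = datetime.strptime("Mon Sep 22 14:35:48 2025", LOG_TIME_FORMAT)

elapsed = end - start
total_seconds = elapsed.total_seconds()
hours = total_seconds / 3600

# Counts copied from the log: variants after main filters, phenotypes tested.
n_variants = 13_868_376
n_phenotypes = 3

# Assumption: the three phenotypes took about the same time each.
hours_per_phenotype = hours / n_phenotypes
tests_per_second = n_variants * n_phenotypes / total_seconds

print(f"Elapsed: {elapsed} ({hours:.1f} hours)")
print(f"Average per phenotype: {hours_per_phenotype:.1f} hours")
print(f"Variant-phenotype tests per second: {tests_per_second:.0f}")
```

This comes to about 45.8 hours in total, or roughly 15 hours per phenotype on 184714 samples and 128 threads.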