How to tell "--glm" to use dosage instead of hardcalls


Nick Hirschmueller

Jul 7, 2021, 5:55:14 AM
to plink2-users
Hello,

I just upgraded from plink1.9 to plink2, because I read that its support for "dosage data" is much better.
I imputed my variants with minimac4 and successfully converted my VCF to the pfile format using this command:

```bash
Options in effect:
  --maf 0.05
  --make-pgen
  --out test
  --pheno  phenotype.txt
  --update-sex gender.txt
  --vcf all.vcf.gz dosage=HDS
```
In the documentation, I read that plink2 automatically makes hardcalls from the dosage data that I imported. 

I want to use "--glm" to predict a quantitative phenotype, and I was wondering whether I have to tell plink2 explicitly to use the dosages rather than the hardcalls for my alleles.

On a similar note, is it also possible to tell "--glm" to use the hardcalls, even if I imported with the dosage setting?
Any help is much appreciated! Thanks 





Christopher Chang

Jul 7, 2021, 1:15:44 PM
to plink2-users
0. A heads-up: there are still some relevant functions implemented in plink 1.9 which don't yet have an analogue in plink 2.0; be prepared to use --make-bed followed by plink 1.9 --bfile to invoke them.
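
A rough sketch of that round trip, assuming the plink 1.9 binary is installed as `plink` and using `--homozyg` purely as an illustration of a 1.9 function (check the current plink 2.0 documentation to see whether it has since been ported). The output prefixes are made up; `test` is the pfile prefix from the first post. Keep in mind that the .bed format stores hardcalls only, so dosages do not survive this step.

```bash
# Hypothetical prefixes; "test" is the pfile prefix used earlier in the thread.
in_prefix=test
bed_prefix=test_bed

if command -v plink2 >/dev/null 2>&1 && command -v plink >/dev/null 2>&1; then
  # Step 1: write a plink 1 binary fileset (.bed/.bim/.fam); hardcalls only.
  plink2 --pfile "$in_prefix" --make-bed --out "$bed_prefix"
  # Step 2: run a plink 1.9 function on it (--homozyg is just an example).
  plink --bfile "$bed_prefix" --homozyg --out roh
else
  echo "plink2 and/or plink (1.9) not found on PATH" >&2
fi
```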

1. --glm always uses dosages when they're available.  If you want to base it on hardcalls instead, you can use "--make-pgen erase-dosage" to generate a dataset with only hardcalls, and then run --glm on that.  I'll add a comment in the --glm documentation that spells this out.
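
A minimal sketch of that two-step workflow, assuming the pfile prefix `test` from the first post and that the phenotype was already stored in the .psam during import. The output prefixes are hypothetical. The `allow-no-covars` modifier is included because recent plink 2.0 builds refuse to run --glm without covariates unless it is given; drop it if you pass --covar.

```bash
# Hypothetical prefixes; "test" is the pfile prefix from the import step.
in_prefix=test
hard_prefix=test_hardcalls

if command -v plink2 >/dev/null 2>&1; then
  # Default behaviour: --glm regresses on dosages when the pfile has them.
  plink2 --pfile "$in_prefix" --glm allow-no-covars --out glm_dosage

  # Hardcall-based run: first strip the dosages, then run --glm on the copy.
  plink2 --pfile "$in_prefix" --make-pgen erase-dosage --out "$hard_prefix"
  plink2 --pfile "$hard_prefix" --glm allow-no-covars --out glm_hardcalls
else
  echo "plink2 not found on PATH" >&2
fi
```

Comparing the two result files side by side is one way to see how much the dosages change the association statistics.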

Nick Hirschmueller

Jul 8, 2021, 6:22:07 AM
to plink2-users
Thank you very much for the quick answer. If possible, maybe also add a line in the `*.log` which says that dosages were used (or hardcalls respectively).

I was also wondering whether it is possible to convert the *.pgen file to a human-readable format for inspection (I want to check how the dosage information is stored in the file).
I tried `plink2 --pfile test --recode --out human_readable`, which gives me an error ("Only VCF, oxford, bgen-1.x, haps, hapslegend, A, AD, A-transpose, and ind-major-bed output have been implemented so far").
I also tried converting the pfile to a bed file first, and then using plink1 to convert the bed file to *.map and *.ped files, but it seems the dosage information is lost along the way.

Christopher Chang

Jul 8, 2021, 11:33:52 AM
to plink2-users
--export/--recode now requires you to specify the file format (and ped has been deliberately excluded from plink 2.0 so far due to its inefficiency).  VCF is the standard human-readable choice today.  Note that, by default, the exported VCF does not include dosage information; adding "vcf-dosage=DS" to the flag changes this.

So you probably want to replace "--recode" with "--export vcf vcf-dosage=DS" in your first command line.
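
Put together, the corrected command would look roughly like this. It is a sketch that reuses the `test` and `human_readable` prefixes from the earlier post and assumes the default uncompressed `.vcf` output; the `grep`/`head` line is just one convenient way to peek at the first few records.

```bash
# Prefixes taken from the earlier post in this thread.
in_prefix=test
out_prefix=human_readable

if command -v plink2 >/dev/null 2>&1; then
  # Export to VCF and include a DS (dosage) field for each genotype.
  plink2 --pfile "$in_prefix" --export vcf vcf-dosage=DS --out "$out_prefix"
  # Skip the "##" meta lines and show the column header plus a few variants.
  grep -v '^##' "$out_prefix.vcf" | head -n 5
else
  echo "plink2 not found on PATH" >&2
fi
```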