--export or --recode?

jie huang

Mar 3, 2025, 9:37:35 PM
to plink2-users

Dear Chris:

Please see the screenshot below, taken from the PLINK2 tutorial. Both --export vcf and --recode vcf seem to work fine. Is there actually any difference between these two options?

[Attached screenshot: 屏幕截图 2025-03-02 231616.jpg]

Similarly, the UK Biobank imputed data is distributed in BGEN format, while PLINK2 uses PGEN format. I found that the file sizes are very similar between the two. Just to confirm with you: is there any key information (such as phasing) that is stored in one format but not in the other?

Finally, as pangenome references gain popularity, I think the idea and practice of Reference Allele vs. Alternative Allele will gradually become outdated. Does PLINK2 have a way to address this?

Best regards,
Jie

Christopher Chang

Mar 4, 2025, 2:24:20 AM
to plink2-users
1. Oops, both should read --export in the documentation. Thanks for reporting this anomaly; it has been fixed.
Anyway, yes, "--recode vcf" still works since it's just translated to "--export vcf". What you can no longer do with plink 2.0 is "--recode" with no modifier.

2. While PGEN can store dosages, it cannot store per-genotype probabilities.  If you are performing an analysis where dosages aren't good enough and you actually need per-genotype probabilities, you shouldn't use PLINK2.

3. PLINK2 does not require true REF alleles to exist.  Ordinary .bed+.bim+.fam filesets don't have reliable information about REF/ALT, so PLINK2 marks REF alleles imported from them as provisional (until something like --ref-allele or --ref-from-fa is applied).
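
To make point 2 concrete, here is a minimal sketch of why a dosage carries less information than a full set of genotype probabilities. It assumes an unphased biallelic variant and the usual definition of ALT dosage as the expected ALT allele count; the function name `alt_dosage` and the example probability triples are hypothetical illustrations, not part of PLINK2 or any file-format API.

```python
from typing import Tuple

# (P(hom REF), P(het), P(hom ALT)) for one sample at one biallelic variant.
GenotypeProbs = Tuple[float, float, float]


def alt_dosage(probs: GenotypeProbs) -> float:
    """Expected ALT allele count: P(het) + 2 * P(hom ALT)."""
    p_hom_ref, p_het, p_hom_alt = probs
    if abs(p_hom_ref + p_het + p_hom_alt - 1.0) > 1e-6:
        raise ValueError("genotype probabilities must sum to 1")
    return p_het + 2.0 * p_hom_alt


# Three very different probability triples (made-up values).
confident_het = (0.0, 1.0, 0.0)   # certainly heterozygous
uncertain = (0.25, 0.5, 0.25)     # spread across all three genotypes
split_homs = (0.5, 0.0, 0.5)      # either homozygote, never heterozygous

dosages = [alt_dosage(p) for p in (confident_het, uncertain, split_homs)]
print(dosages)  # [1.0, 1.0, 1.0]
```

All three triples collapse to the same dosage of 1.0, so once only the dosage is kept, the original probabilities cannot be recovered. An analysis that depends on the full triple therefore needs a format that stores it.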

jie huang

Mar 4, 2025, 6:29:24 AM
to plink2-users

Dear Chris:

As always, thank you so much!

Best regards,
Jie
