Dosage rounding in --make-bed

Dan Gealow

May 6, 2024, 2:43:39 PM
to plink2-users
Since plink1 .bed files can only represent hardcalls, --make-bed on a dataset with non-integer dosages must be rounding and/or thresholding the dosages. How it does so is not currently documented under https://www.cog-genomics.org/plink/2.0/data#make_bed , so I'm not sure whether all dosages will be rounded to the nearest integer, or whether some low-confidence calls will be set to NA (as plink1 did when reading in e.g. bgen files).

I think from reading https://www.cog-genomics.org/plink/2.0/input#dosage_import_settings that it's the latter by default, but it would be helpful for this to be clarified under the --make-bed documentation as well.

Relatedly, it seems like fill-missing-from-dosage is not an option to --make-bed -- is there any way to guarantee that *all* dosages will be rounded to hardcalls (and not NA'd) when converting a fileset to .bed?

-Dan

Christopher Chang

May 6, 2024, 3:46:45 PM
to plink2-users
This isn't documented under --make-bed because the hardcalls exported by --make-bed are all already in the .pgen (see the first paragraph under the #dosage_import_settings link you posted); there is no additional rounding step that occurs during --make-bed.

Thus, to export a .bed with no missing calls, first create a .pgen with no missing calls, with e.g. "--make-pgen fill-missing-from-dosage".
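
As a rough sketch of that two-step approach (the fileset prefixes mydata, mydata_filled and mydata_bed are placeholders, not names from this thread, and the script only prints the commands unless plink2 and the input .pgen are actually present):

```shell
#!/bin/sh
# Placeholder fileset prefixes (assumptions); substitute your own.
in_prefix="mydata"              # expects mydata.pgen/.pvar/.psam
filled_prefix="mydata_filled"   # intermediate .pgen with filled hardcalls
bed_prefix="mydata_bed"         # final .bed/.bim/.fam

# Step 1: write a new .pgen whose missing hardcalls are filled in from dosages.
step1="plink2 --pfile $in_prefix --make-pgen fill-missing-from-dosage --out $filled_prefix"

# Step 2: export to .bed. No rounding happens here; --make-bed just writes
# out the hardcalls already stored in the .pgen produced by step 1.
step2="plink2 --pfile $filled_prefix --make-bed --out $bed_prefix"

if command -v plink2 >/dev/null 2>&1 && [ -f "$in_prefix.pgen" ]; then
  # Run both steps, stopping if the first one fails.
  $step1 && $step2
else
  # Dry run: show what would be executed.
  printf '%s\n' "$step1" "$step2"
fi
```

Keeping the fill step separate means the intermediate .pgen can be inspected for remaining missing calls before anything is written to .bed.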

Dan Gealow

May 6, 2024, 4:06:28 PM
to plink2-users
Ah, thanks--might be good to include that clarification under --make-bed then. ("This uses the closest hardcall data included in the .pgen file, see ..." or something like that.) Actually, I'm still a bit unclear how this will behave if the input data isn't a pgen fileset; e.g. if the command starts with --bgen .

Christopher Chang

May 6, 2024, 4:22:30 PM
to plink2-users