fst hudson's method plink 2.0


Davide Piffer

Jul 25, 2022, 7:08:42 AM
to plink2-users
I have successfully computed Fst using PLINK 1.9, which was very simple, but the PLINK 2.0 syntax is puzzling me.
I tried:
./plink2 --bfile 1kgtot_filt --within ~/genetics_2020/within_plink_1KG.tab --fst hudson --double-id --extract ~/genetics_2020/SNPs.txt --out global_1kg_superpop

Output was:
--within: 4 non-null categories present.
1 categorical phenotype loaded (2000 values).
--extract: 3257 variants remaining.
3257 variants remaining after main filters.
Error: --fst phenotype 'hudson' not loaded.

I don't know what I am supposed to write after --fst to specify the "phenotype", as the data has no phenotype.
Also, I don't understand the correct syntax for specifying the Hudson method.



Christopher Chang

Jul 25, 2022, 10:13:40 AM
to plink2-users

1. "--within constructs a PLINK 2 categorical phenotype out of a PLINK 1.x 'cluster' file."
"If no phenotype name is given, it defaults to 'CATPHENO'."

So the phenotype name to provide to --fst is "CATPHENO".

2. The --fst documentation has the following flag usage summary on top:

--fst <categorical or binary phenotype name> ['method='<method name>]
      ['blocksize='<jackknife block size>] ['cols='<column set descriptor>]
      ['report-variants'] ['zs'] ['vcols='<column set descriptor>]
      ['base='<pop. ID> | 'ids='<pop. ID> | 'file='<pop.-ID-pair file>]
      [other population ID(s) for base=/ids=...]


The method part says "['method='<method name>]".

In the "Interpreting our flag usage summaries" section of the General usage page, the relevant text is as follows:

  * ['quoted_text='<description of value>] denotes an optional modifier that must begin with the quoted text, and be followed by a value with no whitespace in between. '|' may also be used here to indicate mutually exclusive options. E.g. "--glm perm" and "--glm mperm=10000" are both valid, and "--glm perm mperm=10000" invalid, given the summary

  --glm ['perm' | 'mperm='<value>] ...
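
Putting the two points together, a minimal sketch of the resulting --fst arguments might look like the following. It assumes the file names from the original post; the output prefix fst_sketch is hypothetical, and the command only runs if ./plink2 and the dataset are actually present.

```shell
# Phenotype name comes first; 'method=' and its value form a single token
# with no whitespace in between, as the usage summary requires.
pheno=CATPHENO                      # default name assigned by --within
fst_args="$pheno method=hudson"

# Show the arguments that would be passed to --fst.
echo "--fst $fst_args"

# Run only when the binary and dataset exist (paths from the original post).
if [ -x ./plink2 ] && [ -e 1kgtot_filt.bed ]; then
  # $fst_args is left unquoted on purpose so it splits into two arguments.
  ./plink2 --bfile 1kgtot_filt \
           --within "$HOME/genetics_2020/within_plink_1KG.tab" \
           --fst $fst_args \
           --out fst_sketch         # hypothetical output prefix
fi
```

Writing "method= hudson" or "method hudson" would not match the summary, since the value has to follow the quoted text directly.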

Davide Piffer

Jul 25, 2022, 11:48:40 AM
to plink2-users
Thanks, it works now. I used this:
./plink2 --bfile 1kgtot_filt --within ~/genetics_2020/within_plink_1KG.tab --fst CATPHENO method=hudson --double-id --extract ~/genetics_2020/SNPs.txt --out pairwise_Ea3_1kg_superpop_hudson

The default pairwise calculation is great, and something I had been looking forward to.
I also need the global Fst estimate. The documentation says:
  • 'ids=' specifies an all-vs.-all comparison within the given set of populations.
However, if I specify all the IDs, the output is a set of pairwise values rather than a single global Fst value as in PLINK 1.9.
Is there an option to obtain global Fst like in PLINK 1.9?
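
For reference, the 'ids=' form described above can be sketched as follows, going by the flag usage summary quoted earlier in the thread. The population labels AFR, EAS and EUR are hypothetical placeholders for the category names in the --within file, and the output prefix is made up. As noted, this still reports one value per population pair, not a single global estimate.

```shell
# Hypothetical population labels; replace with the categories in your --within file.
n=3
# First ID is attached to 'ids=' with no whitespace; the remaining IDs follow as separate tokens.
fst_args="CATPHENO method=hudson ids=AFR EAS EUR"

# An all-vs.-all comparison among n populations covers n*(n-1)/2 pairs.
pairs=$(( n * (n - 1) / 2 ))
echo "--fst $fst_args"
echo "population pairs compared: $pairs"

# Run only when the binary and dataset exist (paths from the original post).
if [ -x ./plink2 ] && [ -e 1kgtot_filt.bed ]; then
  # $fst_args is left unquoted on purpose so it splits into separate arguments.
  ./plink2 --bfile 1kgtot_filt \
           --within "$HOME/genetics_2020/within_plink_1KG.tab" \
           --fst $fst_args \
           --out fst_ids_sketch     # hypothetical output prefix
fi
```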

Christopher Chang

Jul 25, 2022, 12:19:54 PM
to plink2-users
Not for now.  This will probably be added back at some point, but it's a lower priority than missing PLINK 2.0 functionality that *isn't* covered by PLINK 1.9.