best practice for IBS distance matrix generation


Matteo Ungaro

Feb 24, 2025, 11:37:22 AM
to plink2-users
Hi there,

I'm planning to plot a tree with ggtree, using a PLINK 2.0 IBS distance matrix as input.

To generate the matrix I'm doing the following:
~/./plink2 --vcf lupine_panel_no-mis.vcf.gz --make-pgen --vcf-half-call m --max-alleles 2 --allow-extra-chr --out lupine_panel_ibs
~/./plink2 --vcf lupine_panel_no-mis.vcf.gz --freq --out lupine_panel_ibs_maf
~/./plink2 --pfile lupine_panel_ibs --make-rel square --read-freq lupine_panel_ibs_maf.afreq --out phylo_tree
Then, I've been using the .rel and .rel.id files as input to R, but it appears the format is not compatible... is there anything wrong with what I'm doing, or else should I do something differently?

The .rel file looks like this
1.53578 -0.105267 -0.0945413 -0.083604 -0.209443 -0.139285 -0.208677 -0.199941 -0.209932 -0.130887 -0.082297 -0.0719018
-0.105267 1.60124 -0.142996 -0.023115 -0.243876 -0.0435319 -0.243711 -0.256835 -0.255688 -0.132416 -0.102566 -0.0512363
-0.0945413 -0.142996 1.70865 -0.00168153 -0.229154 -0.0920099 -0.201994 -0.278662 -0.210596 -0.190707 -0.164444 -0.101866
-0.083604 -0.023115 -0.00168153 1.36444 -0.176534 -0.0313074 -0.178905 -0.256911 -0.220043 -0.199516 -0.16628 -0.0265423
-0.209443 -0.243876 -0.229154 -0.176534 1.66881 -0.223521 0.672806 -0.346079 -0.25505 -0.243328 -0.219959 -0.194674
-0.139285 -0.0435319 -0.0920099 -0.0313074 -0.223521 1.6054 -0.209163 -0.304027 -0.216554 -0.225491 -0.186448 0.0659375
-0.208677 -0.243711 -0.201994 -0.178905 0.672806 -0.209163 1.63191 -0.329851 -0.252808 -0.252563 -0.233733 -0.193315
-0.199941 -0.256835 -0.278662 -0.256911 -0.346079 -0.304027 -0.329851 2.72659 -0.253858 -0.0673777 -0.158547 -0.274507
-0.209932 -0.255688 -0.210596 -0.220043 -0.25505 -0.216554 -0.252808 -0.253858 2.45426 -0.233602 -0.159556 -0.186578
-0.130887 -0.132416 -0.190707 -0.199516 -0.243328 -0.225491 -0.252563 -0.0673777 -0.233602 1.91195 -0.0165982 -0.219464
-0.082297 -0.102566 -0.164444 -0.16628 -0.219959 -0.186448 -0.233733 -0.158547 -0.159556 -0.0165982 1.64659 -0.156163
-0.0719018 -0.0512363 -0.101866 -0.0265423 -0.194674 0.0659375 -0.193315 -0.274507 -0.186578 -0.219464 -0.156163 1.41031
while the .rel.id has this structure
#IID
INLUP00165
INLUP00169
INLUP00208
INLUP00214
INLUP00228
INLUP00233
INLUP00245
INLUP00325
INLUP00332
INLUP00393
INLUP00418
INLUP00496
I looked into it a bit, and it seems that ggtree needs a Newick file. Is this something that I can get from PLINK 2.0, or should I produce it some other way? In particular, I did the following to combine the matrix and the individual IDs in a single file:
grep -v '#' phylo_tree.rel.id | paste -d '\t' - phylo_tree.rel | awk 'BEGIN {print "'$(head -1 phylo_tree.rel.id)'"} {print $0}' | sed 's/\t/ /g' > phylo_tree.phy
which returns the following
#IID
INLUP00165 1.53578 -0.105267 -0.0945413 -0.083604 -0.209443 -0.139285 -0.208677 -0.199941 -0.209932 -0.130887 -0.082297 -0.0719018
INLUP00169 -0.105267 1.60124 -0.142996 -0.023115 -0.243876 -0.0435319 -0.243711 -0.256835 -0.255688 -0.132416 -0.102566 -0.0512363
INLUP00208 -0.0945413 -0.142996 1.70865 -0.00168153 -0.229154 -0.0920099 -0.201994 -0.278662 -0.210596 -0.190707 -0.164444 -0.101866
INLUP00214 -0.083604 -0.023115 -0.00168153 1.36444 -0.176534 -0.0313074 -0.178905 -0.256911 -0.220043 -0.199516 -0.16628 -0.0265423
INLUP00228 -0.209443 -0.243876 -0.229154 -0.176534 1.66881 -0.223521 0.672806 -0.346079 -0.25505 -0.243328 -0.219959 -0.194674
INLUP00233 -0.139285 -0.0435319 -0.0920099 -0.0313074 -0.223521 1.6054 -0.209163 -0.304027 -0.216554 -0.225491 -0.186448 0.0659375
INLUP00245 -0.208677 -0.243711 -0.201994 -0.178905 0.672806 -0.209163 1.63191 -0.329851 -0.252808 -0.252563 -0.233733 -0.193315
INLUP00325 -0.199941 -0.256835 -0.278662 -0.256911 -0.346079 -0.304027 -0.329851 2.72659 -0.253858 -0.0673777 -0.158547 -0.274507
INLUP00332 -0.209932 -0.255688 -0.210596 -0.220043 -0.25505 -0.216554 -0.252808 -0.253858 2.45426 -0.233602 -0.159556 -0.186578
INLUP00393 -0.130887 -0.132416 -0.190707 -0.199516 -0.243328 -0.225491 -0.252563 -0.0673777 -0.233602 1.91195 -0.0165982 -0.219464
INLUP00418 -0.082297 -0.102566 -0.164444 -0.16628 -0.219959 -0.186448 -0.233733 -0.158547 -0.159556 -0.0165982 1.64659 -0.156163
INLUP00496 -0.0719018 -0.0512363 -0.101866 -0.0265423 -0.194674 0.0659375 -0.193315 -0.274507 -0.186578 -0.219464 -0.156163 1.41031
Still, I'm unable to load the matrix in R... any help is much appreciated. Thanks in advance!
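For the loading step itself, one option is to merge the two files into a single labelled table with a header row of sample IDs. The following is a minimal Python sketch, assuming the .rel.id file has one sample per line under a '#'-prefixed header that names an IID column, and the .rel file has one whitespace-separated row per sample. The function names (read_ids, read_square_matrix, write_labeled) are made up for illustration, and the demo uses only the top-left 3x3 block of the matrix shown above.

```python
import io


def read_ids(handle):
    """Return sample IDs from a .rel.id-style file.

    Assumption: a '#'-prefixed header line names an IID column.
    """
    iid_col = 0  # fall back to the first column if no header is found
    ids = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        if fields[0].startswith("#"):
            header = [f.lstrip("#") for f in fields]
            iid_col = header.index("IID")
            continue
        ids.append(fields[iid_col])
    return ids


def read_square_matrix(handle, n):
    """Parse an n x n whitespace-separated matrix, one row per line."""
    rows = [[float(x) for x in line.split()] for line in handle if line.strip()]
    if len(rows) != n or any(len(r) != n for r in rows):
        raise ValueError("expected a {0} x {0} matrix".format(n))
    return rows


def write_labeled(ids, rows, handle):
    """Write a tab-separated table: header row of IDs, then one labelled row per sample."""
    handle.write("\t".join(["IID"] + ids) + "\n")
    for sample, row in zip(ids, rows):
        handle.write("\t".join([sample] + [repr(v) for v in row]) + "\n")


# Demo on in-memory copies of the first three samples from the thread.
id_text = "#IID\nINLUP00165\nINLUP00169\nINLUP00208\n"
rel_text = (
    "1.53578\t-0.105267\t-0.0945413\n"
    "-0.105267\t1.60124\t-0.142996\n"
    "-0.0945413\t-0.142996\t1.70865\n"
)
ids = read_ids(io.StringIO(id_text))
matrix = read_square_matrix(io.StringIO(rel_text), len(ids))
out = io.StringIO()
write_labeled(ids, matrix, out)
print(out.getvalue(), end="")
```

A table in this shape can be read in R with read.table(..., header = TRUE, row.names = 1). The result is still a square matrix rather than a tree, so a separate tree-building step is needed before ggtree can plot it.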

Christopher Chang

Feb 25, 2025, 1:57:27 PM
to plink2-users
plink2 doesn't directly generate phylogenetic trees.  You will need to use another program (or R package) to generate a tree from a distance or relationship matrix.
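As one illustration of that second step, here is a minimal pure-Python sketch that turns a symmetric distance matrix into a Newick string using UPGMA (average-linkage clustering). UPGMA is used here only because it is short to write; it is not a method recommended anywhere in this thread, and neighbor joining or an existing R package may suit real data better. The function name upgma_newick and the toy distances are hypothetical, and the input is assumed to be a true distance matrix (zero diagonal, non-negative, symmetric).

```python
def upgma_newick(ids, dist):
    """Cluster a symmetric distance matrix with UPGMA and return a Newick string.

    Sample names are written as-is, so they must not contain Newick
    metacharacters such as parentheses, commas, colons, or semicolons.
    """
    # Active clusters: cluster id -> (newick text, number of leaves, node height)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(ids)}
    # Pairwise distances between active clusters, keyed by (smaller id, larger id)
    d = {(i, j): dist[i][j]
         for i in range(len(ids)) for j in range(i + 1, len(ids))}
    next_id = len(ids)

    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        (a, b), dab = min(d.items(), key=lambda kv: kv[1])
        ta, na, ha = clusters.pop(a)
        tb, nb, hb = clusters.pop(b)
        height = dab / 2.0
        text = "({}:{:.6g},{}:{:.6g})".format(ta, height - ha, tb, height - hb)

        # Average-linkage update: size-weighted mean of the two old distances.
        updated = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            updated[c] = (na * dac + nb * dbc) / (na + nb)

        # Drop every distance that involved a or b, then add the new cluster.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        for c, v in updated.items():
            d[(c, next_id)] = v  # existing ids are always smaller than next_id
        clusters[next_id] = (text, na + nb, height)
        next_id += 1

    return next(iter(clusters.values()))[0] + ";"


# Toy example with made-up distances: A and B are close, C is far from both.
toy_ids = ["A", "B", "C"]
toy_dist = [
    [0.0, 2.0, 6.0],
    [2.0, 0.0, 6.0],
    [6.0, 6.0, 0.0],
]
newick = upgma_newick(toy_ids, toy_dist)
print(newick)
```

The resulting string can be saved to a file and read in R with ape::read.tree(), and the tree object it returns can be passed to ggtree().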

Chris Chang

Mar 2, 2025, 12:22:39 PM
to plink2-users, Matteo Ungaro
I should have added that the relationship statistic computed by --make-rel is not IBS; for now, use plink 1.9 --ibs-matrix for that.
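A quick sanity check makes the difference visible: an IBS similarity matrix has entries between 0 and 1 with ones on the diagonal, whereas the --make-rel output earlier in the thread has diagonal values such as 1.53578 and negative off-diagonal values. The sketch below checks for that and converts an IBS similarity matrix into a 1 - IBS distance matrix for tree building. The function names looks_like_ibs and ibs_to_distance are hypothetical, the IBS values in the demo are made up, and the matrix is assumed to be already loaded as a list of rows.

```python
def looks_like_ibs(matrix, tol=1e-6):
    """Return True if every entry lies in [0, 1] and the diagonal is 1."""
    n = len(matrix)
    for i in range(n):
        if abs(matrix[i][i] - 1.0) > tol:
            return False
        for j in range(n):
            if not (-tol <= matrix[i][j] <= 1.0 + tol):
                return False
    return True


def ibs_to_distance(sim):
    """Convert an IBS similarity matrix to a 1 - IBS distance matrix."""
    if not looks_like_ibs(sim):
        raise ValueError("matrix does not look like an IBS similarity matrix")
    n = len(sim)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Average the two triangles in case of tiny asymmetries,
                # and clamp so rounding never yields a negative distance.
                s = (sim[i][j] + sim[j][i]) / 2.0
                dist[i][j] = max(0.0, 1.0 - s)
    return dist


# Top-left 2x2 block of the --make-rel output posted in the thread.
rel_block = [[1.53578, -0.105267], [-0.105267, 1.60124]]
print(looks_like_ibs(rel_block))

# Made-up IBS similarities, for illustration only.
toy_ibs = [
    [1.0, 0.75, 0.5],
    [0.75, 1.0, 0.625],
    [0.5, 0.625, 1.0],
]
toy_distance = ibs_to_distance(toy_ibs)
print(toy_distance)
```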



Matteo Ungaro

Mar 2, 2025, 4:07:29 PM
to plink2-users
Hi Chris,

    thanks for following up, and sorry for not having replied on the main thread in my last email. I see; this was also part of my question, and what I gather is that PLINK 2.0 currently has no equivalent function to calculate IBS, so I'll switch back to v1.9. Thanks!

P.S. In the meantime I managed to sort out my R code to import, wrangle, and plot a tree-based representation of a matrix.

Chris Chang

Mar 2, 2025, 5:06:49 PM
to Matteo Ungaro, plink2-users
Incidentally, this is easy to add to plink 2.0 -- as is, you can get an IBS matrix by postprocessing the output of "--make-king-table cols=+ibs1".  It has just been low-priority since the plink 1.9 implementation has been good enough for the use cases I've seen.
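A rough sketch of what that postprocessing could look like is given below. It rests on several assumptions that should be checked against the header of your own output file: that the table is tab-separated with a '#'-prefixed header, that sample IDs are in columns named IID1 and IID2, that the IBS0 column and the two IBS1 columns (assumed here to be HET1_HOM2 and HET2_HOM1) hold proportions rather than counts, and that IBS similarity is computed as 1 - IBS0 - 0.5 * IBS1. If the table holds counts, divide by the number of variants used for the pair first. The function name king_table_to_ibs and all numbers in the demo are made up.

```python
import csv
import io


def king_table_to_ibs(handle, id1="IID1", id2="IID2", ibs0="IBS0",
                      ibs1=("HET1_HOM2", "HET2_HOM1")):
    """Build a square IBS similarity matrix from a pairwise table.

    All column names are assumptions passed as parameters; adjust them to
    match the header of the actual file.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = [name.lstrip("#") for name in next(reader)]
    col = {name: i for i, name in enumerate(header)}

    index = {}  # sample ID -> matrix position, in order of first appearance
    pairs = []
    for row in reader:
        if not row:
            continue
        a, b = row[col[id1]], row[col[id2]]
        index.setdefault(a, len(index))
        index.setdefault(b, len(index))
        p0 = float(row[col[ibs0]])
        p1 = sum(float(row[col[name]]) for name in ibs1)
        # Share of alleles identical by state: IBS2 + 0.5 * IBS1
        pairs.append((a, b, 1.0 - p0 - 0.5 * p1))

    n = len(index)
    # Diagonal is 1 (a sample compared with itself); pairs missing from the
    # table stay NaN so that gaps are visible instead of silently zero.
    sim = [[1.0 if i == j else float("nan") for j in range(n)] for i in range(n)]
    for a, b, s in pairs:
        sim[index[a]][index[b]] = s
        sim[index[b]][index[a]] = s
    return list(index), sim


# Made-up table in the assumed layout, for illustration only.
toy_table = (
    "#IID1\tIID2\tNSNP\tHETHET\tIBS0\tHET1_HOM2\tHET2_HOM1\tKINSHIP\n"
    "S2\tS1\t1000\t0.125\t0.25\t0.125\t0.125\t0.05\n"
    "S3\tS1\t1000\t0.125\t0.25\t0.25\t0.25\t0.01\n"
    "S3\tS2\t1000\t0.25\t0.0\t0.125\t0.125\t0.2\n"
)
sample_ids, ibs = king_table_to_ibs(io.StringIO(toy_table))
print(sample_ids)
for row in ibs:
    print(row)
```

Passing the column names as parameters keeps the sketch usable if the real header differs from the assumed one.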
