LD search for multi-allelic variants


Ethan Kreuzer

Jul 24, 2023, 11:06:13 AM
to plink2-users
I have the following command I am using to calculate LD for variants I want to find proxies for:

plink --bfile "${PLINK_REF_PANEL}/${chrom}" \
        --r2 --ld-window 1000000 \
        --ld-window-r2 ${PLINK_MIN_R2} \
        --keep "${PLINK_REF_PANEL_KEEP}" \
        --out "${out_prefix}" \
        --ld-snp-list "${SNAPTMP}/SNAP.input.proxy"


I have noticed that when a variant is not biallelic, plink does not generate a list of proxies and the search result is empty. Is plink unable to calculate LD when the input variant is multiallelic? Is this a limitation of plink, or is it not possible to calculate LD for a multiallelic variant at all, so that no other tool can do this either?

Chris Chang

Jul 24, 2023, 11:16:26 AM
to Ethan Kreuzer, plink2-users
This question does not make sense as stated, because the .bed file format can’t directly store multiallelic variants.  Please be MUCH more precise, including a reproducible example of a case where a result is expected but you don’t see it.


Ethan Kreuzer

Jul 24, 2023, 11:44:57 AM
to plink2-users
Of course. Here is an example:
"${SNAPTMP}/SNAP.input.proxy" contains these variants:
"
rs1557550
rs111368459
rs1632969
rs17206070
rs281861394
"

All of these variants are present in my .bed file and are multi-allelic. To make my .bed files, whenever a multi-allelic variant had multiple bi-allelic entries, I retained only the first entry:


[-----.-------@hydra1 data]$ grep "rs1557550" 6.bim
6 rs1557550 0 32484425 G C
[ethan.kreuzer@hydra1 data]$ grep "rs111368459" 6.bim
6 rs111368459 0 29697462 G C
[ethan.kreuzer@hydra1 data]$ grep "rs1632969" 6.bim
6 rs1632969 0 29811858 A T
[ethan.kreuzer@hydra1 data]$ grep "rs17206070" 6.bim
6 rs17206070 0 32684584 A T
[ethan.kreuzer@hydra1 data]$ grep "rs281861394" 6.bim
6 rs281861394 0 32665831 A C


This way I could retain multi-allelic variants in my .bed files AND not have duplicate entries.
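For readers following along, here is a minimal sketch of the "keep only the first record per ID" rule described above, applied to a toy .bim-style listing. The file names and the IDs `rsX` and `rsY` are hypothetical stand-ins, not taken from the real reference panel, and the thread does not say how the original filtering was actually done.

```shell
# Toy .bim-style listing (hypothetical rows): one multi-allelic site split
# into two bi-allelic records that share an ID, plus one ordinary variant.
tmpdir=$(mktemp -d)
printf '%s\n' \
  '6 rsX 0 100 G C' \
  '6 rsX 0 100 G T' \
  '6 rsY 0 200 A T' > "$tmpdir/toy.bim"

# Keep only the first row seen for each variant ID (column 2).
awk '!seen[$2]++' "$tmpdir/toy.bim" > "$tmpdir/toy.first.bim"
cat "$tmpdir/toy.first.bim"
```

This only illustrates the selection rule on a text listing. In a real fileset the .bim rows and the .bed genotype records must stay in sync, so editing the .bim alone would not be a valid way to drop records. Note also that the retained row describes just one of the alternate alleles, which is why the resulting entries look bi-allelic.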


The output file of my plink command was this:
"
[-----.-------@hydra1 SNAPTMP]$ cat SNAP.6.proxy.ld
CHR_A BP_A SNP_A CHR_B BP_B SNP_B R2
"


Why did this not yield any results???

 

Chris Chang

Jul 24, 2023, 11:49:44 AM
to Ethan Kreuzer, plink2-users
I asked for a fully reproducible example, so I could make a precise determination as to whether there SHOULD be results or not.  You have not yet provided that.

Chris Chang

Jul 24, 2023, 12:14:15 PM
to Ethan Kreuzer, plink2-users
You don't need to send your entire original .bed file, but you must be able to identify at least ONE result that appears to be mistaken, and post enough information to reproduce that.  If you can't identify even one specific missing result, you're probably just wrong.
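One way to narrow things down to a single specific case is sketched below: for each queried ID, check whether it appears in the .bim (column 2) and how many rows of the .ld output list it as SNP_A (column 3). All file names, IDs, and values here are hypothetical toy data, not the real panel; the column positions follow the .bim rows and the .ld header shown earlier in the thread.

```shell
tmpdir2=$(mktemp -d)

# Hypothetical stand-ins for the query list, the .bim, and the .ld output.
printf '%s\n' rsA rsB rsC > "$tmpdir2/query.txt"
printf '%s\n' \
  '6 rsA 0 100 G C' \
  '6 rsB 0 200 A T' \
  '6 rsD 0 300 A C' > "$tmpdir2/toy.bim"
printf '%s\n' \
  'CHR_A BP_A SNP_A CHR_B BP_B SNP_B R2' \
  '6 100 rsA 6 300 rsD 0.9' > "$tmpdir2/toy.ld"

# For each queried ID, count matching .bim rows and .ld rows (header skipped).
report=$(while read -r id; do
  in_bim=$(awk -v id="$id" '$2 == id {n++} END {print n+0}' "$tmpdir2/toy.bim")
  in_ld=$(awk -v id="$id" 'NR > 1 && $3 == id {n++} END {print n+0}' "$tmpdir2/toy.ld")
  echo "$id bim=$in_bim ld=$in_ld"
done < "$tmpdir2/query.txt")
echo "$report"
```

An ID reported with `bim=1 ld=0` is a candidate for a concrete case to post, although a missing row can also be legitimate, for example when no pair passes the r2 threshold or falls inside the window.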

On Mon, Jul 24, 2023 at 9:11 AM Ethan Kreuzer <ethan....@gmail.com> wrote:
I would need to send my reference genome .bed files to give a fully reproducible example, which I’m not too sure is feasible. What more information do you need?


