Recode A

yujin kim

Dec 29, 2021, 6:17:24 AM
to plink2-users
Hello everyone,

I'd like to predict the response for disease A from each individual's genotype, using the effect alleles of associated variants.

Now, I have two files:

1. A tab-delimited file of variant summary statistics (rsid, position, effect allele) related to disease A susceptibility.

2. A plink-format fileset (bed, bim, fam) from a different cohort than file 1.

I thought I could use --recode A on my bfile (bed, bim, fam) to count alleles based on my first file.

To be specific, if A is the protective allele at rs12345 (negative beta value), then an individual with the AA genotype would be coded as 2 (AC or AT would be coded as 1).

But when I looked through the plink tutorial, I found no explanation of how to use summary statistics to specify the effect allele.

Is there any plink option I can use to do what I said above?

Just in case: I'd like to do allele counting, not a PRS computed with the --score option. (But I would also like to know whether I can do allele counting with --score, because I think that option includes allele counts.)

Thank you in advance.

Best regards,

YJ kim.



Christopher Chang

Dec 29, 2021, 11:28:27 AM
to plink2-users
To see every variant listed separately:
  a. From your summary statistics file, generate a file that has variant IDs in the first column, and effect alleles in the second.
  b. "plink --bfile ... --recode A --recode-allele two_col_sum_stats.tsv"

To see just the allele-count sums:
  a. From your summary statistics file, generate a file that has variant IDs in the first column, effect alleles in the second, and "1" in the third.
  b. "plink --bfile ... --score three_col_sum_stats.tsv sum"
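
Step (a) in both recipes can be sketched in Python as below. This is only a rough sketch under assumptions not stated in the thread: that the summary statistics file has a header line, that the variant ID is in the first column and the effect allele in the third (matching the "rsid, position, effect allele" order described above), and that the function names are hypothetical helpers, not part of plink.

```python
import csv


def build_allele_tables(rows, id_col=0, allele_col=2, has_header=True):
    """Return (two_col, three_col) rows built from summary-statistics rows.

    Column positions and the header line are assumptions about the input;
    adjust id_col / allele_col / has_header to match the real file.
    """
    two_col, three_col = [], []
    for i, row in enumerate(rows):
        if has_header and i == 0:
            continue  # skip the assumed header line
        if len(row) <= max(id_col, allele_col):
            continue  # skip blank or truncated lines
        variant_id = row[id_col].strip()
        allele = row[allele_col].strip()
        two_col.append([variant_id, allele])         # for --recode-allele
        three_col.append([variant_id, allele, "1"])  # for --score ... sum
    return two_col, three_col


def write_tsv(path, rows):
    """Write rows as a tab-delimited file without a header."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)


def convert(sumstats_path, two_col_path, three_col_path, **kwargs):
    """Read a tab-delimited summary statistics file and write both outputs."""
    with open(sumstats_path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    two_col, three_col = build_allele_tables(rows, **kwargs)
    write_tsv(two_col_path, two_col)
    write_tsv(three_col_path, three_col)
```

For example, convert("sum_stats.tsv", "two_col_sum_stats.tsv", "three_col_sum_stats.tsv") would produce the two files named in the commands above; "sum_stats.tsv" is a placeholder for the real input file.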

yujin kim

Jan 3, 2022, 12:19:19 AM
to plink2-users
Thank you for your answer!

It is really helpful for me.

I have one more question.

I'd like to code genotypes by a risk allele, which has a positive beta value, in the reverse of the way I code genotypes by a protective allele.

Each copy of a protective allele increases an individual's score, while each copy of a risk allele decreases it.

To be specific, if C is the risk allele at rs12345, then an individual with the CC genotype would be coded as 0 (CA would be coded as 1).

Is there any option that lets me apply this logic to my target data using the base data (summary statistics)?

Thank you very much again.

Happy New Year!

Best Regards,

YJ kim.


On Thursday, December 30, 2021 at 1:28:27 AM UTC+9, chrch...@gmail.com wrote:

Christopher Chang

Jan 3, 2022, 12:32:06 PM
to plink2-users
Just change the allele in the second column?
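
One way to read this suggestion: for each risk-allele variant, put the other allele in the second column, so that an individual homozygous for the risk allele gets a count of 0. A rough Python sketch follows, assuming biallelic variants and taking the other allele from the .bim file (variant ID in column 2, the two alleles in columns 5 and 6). The function name is hypothetical, and variants whose risk allele matches neither .bim allele (for example because of a strand difference) are only reported, not fixed.

```python
def flip_to_other_allele(bim_rows, risk_alleles):
    """Map each risk allele to the other allele listed in the .bim file.

    bim_rows: iterable of already-split .bim lines
        (chromosome, variant ID, cM position, bp position, allele 1, allele 2).
    risk_alleles: dict of variant ID -> risk allele from the summary statistics.
    Returns (flipped, skipped): flipped is a list of [variant ID, other allele]
    rows for the two-column file; skipped lists variant IDs whose risk allele
    matches neither .bim allele.
    """
    flipped, skipped = [], []
    for fields in bim_rows:
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        variant_id, allele1, allele2 = fields[1], fields[4], fields[5]
        risk = risk_alleles.get(variant_id)
        if risk is None:
            continue  # variant is not in the summary statistics
        if risk == allele1:
            flipped.append([variant_id, allele2])
        elif risk == allele2:
            flipped.append([variant_id, allele1])
        else:
            skipped.append(variant_id)  # needs manual checking
    return flipped, skipped
```

For a biallelic variant with a non-missing genotype, counting the other allele gives the same value as 2 minus the risk-allele count, so CC is coded as 0 and CA as 1, as in the question.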